Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective
arxiv(2023)
摘要
Large Language Models (LLMs) inherently encode a wealth of knowledge within
their parameters through pre-training on extensive corpora. While prior
research has delved into operations on these parameters to manipulate the
underlying implicit knowledge (encompassing detection, editing, and merging),
there remains an ambiguous understanding regarding their transferability across
models with varying scales. In this paper, we seek to empirically investigate
knowledge transfer from larger to smaller models through a parametric
perspective. To achieve this, we employ sensitivity-based techniques to extract
and align knowledge-specific parameters between different LLMs. Moreover, the
LoRA module is used as the intermediary mechanism for injecting the extracted
knowledge into smaller models. Evaluations across four benchmarks validate the
efficacy of our proposed method. Our findings highlight the critical factors
contributing to the process of parametric knowledge transfer, underscoring the
transferability of model parameters across LLMs of different scales. Project
website: https://maszhongming.github.io/ParaKnowTransfer.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要