Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain
CoRR (2024)
Abstract
Recent advancements in artificial intelligence have sparked interest in the
parallels between large language models (LLMs) and human neural processing,
particularly in language comprehension. While prior research has established
similarities between the representations of LLMs and those of the brain, the
underlying computational principles that drive this convergence, especially as
LLMs continue to evolve, remain elusive. Here, we examined a diverse selection of
high-performance LLMs with similar parameter sizes to investigate the factors
contributing to their alignment with the brain's language processing
mechanisms. We find that as LLMs achieve higher performance on benchmark tasks,
they not only become more brain-like, as measured by better prediction of
neural responses from LLM embeddings, but their hierarchical feature
extraction pathways also map more closely onto the brain's while using fewer
layers to do the same encoding. We also compare the feature extraction pathways
of the LLMs to each other and identify new ways in which high-performing models
have converged toward similar hierarchical processing mechanisms. Finally, we
show the importance of contextual information in improving model performance
and brain similarity. Our findings reveal the converging aspects of language
processing in the brain and LLMs and offer new directions for developing models
that align more closely with human cognitive processing.
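
To make "predicting neural responses from LLM embeddings" concrete, here is a minimal sketch of a per-layer encoding model. The abstract does not say which regression method or score the authors used, so everything below is an assumption for illustration: a closed-form ridge regression, a single train/test split, and the mean Pearson correlation across recording sites as the "brain score". The data are synthetic, and the names `layer_brain_score`, `informative_layer`, and `uninformative_layer` are hypothetical.

```python
import numpy as np


def ridge_fit(X, Y, alpha):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def pearson_per_column(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / denom


def layer_brain_score(embeddings, responses, alpha=1.0, train_frac=0.8):
    """Fit on the first part of the data, score on the held-out remainder.

    embeddings: (n_words, n_dims) activations from one model layer.
    responses:  (n_words, n_sites) neural responses aligned to the same words.
    Returns the mean held-out correlation across recording sites.
    """
    split = int(embeddings.shape[0] * train_frac)
    X_train, X_test = embeddings[:split], embeddings[split:]
    Y_train, Y_test = responses[:split], responses[split:]

    # Center with training statistics only, so no test information leaks in.
    x_mean = X_train.mean(axis=0)
    y_mean = Y_train.mean(axis=0)
    W = ridge_fit(X_train - x_mean, Y_train - y_mean, alpha)

    predicted = (X_test - x_mean) @ W + y_mean
    return float(pearson_per_column(predicted, Y_test).mean())


# Synthetic stand-ins (assumption): one "layer" that linearly drives the
# responses, and one that is unrelated noise.
rng = np.random.default_rng(0)
n_words, n_dims, n_sites = 500, 16, 8
informative = rng.standard_normal((n_words, n_dims))
true_map = rng.standard_normal((n_dims, n_sites))
responses = informative @ true_map + 0.5 * rng.standard_normal((n_words, n_sites))
uninformative = rng.standard_normal((n_words, n_dims))

scores = {
    "informative_layer": layer_brain_score(informative, responses),
    "uninformative_layer": layer_brain_score(uninformative, responses),
}
print(scores)
```

Computing such a score for every layer of a model gives a per-layer profile. Comparing where that profile peaks across brain regions is one plausible way to relate a model's layer hierarchy to the brain's, though the procedure the authors actually used may differ.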