Have AI-Generated Texts from LLM Infiltrated the Realm of Scientific Writing? A Large-Scale Analysis of Preprint Platforms

Huzi Cheng, Bin Sheng, Aaron Lee, Varun Chaudary, Atanas G. Atanasov, Nan Liu, Yue Qiu, Tien Yin Wong, Yih-Chung Tham, Yingfeng Zheng

bioRxiv (2024)

Abstract
Since the release of ChatGPT in 2022, AI-generated texts have inevitably permeated various types of writing, sparking debates about the quality and quantity of content produced by large language models (LLMs). This study investigates a critical question: have AI-generated texts from LLMs infiltrated the realm of scientific writing, and if so, to what extent and in what settings? By analyzing a dataset of preprint manuscripts uploaded to arXiv, bioRxiv, and medRxiv over the past two years, we confirmed and quantified the widespread influence of AI-generated texts in scientific publications using the latest LLM-text detection technique, the Binoculars LLM-detector. Further analyses with this tool reveal that: (1) the AI influence correlates with the trend of ChatGPT web searches; (2) it is widespread across many scientific domains but affects them to different degrees (highest: computer science, engineering sciences); (3) the influence varies with authors' language backgrounds and geographic regions, as determined by the location of their affiliations (Italy, China, etc.); (4) AI-generated texts are used in various content types within manuscripts (most significant: hypothesis formulation, conclusion summarization); (5) AI usage has a positive influence on a paper's impact, as measured by its citation count. Based on these findings, suggestions about the advantages and regulation of AI-augmented scientific writing are discussed.

Competing Interest Statement
The authors have declared no competing interest.