Statistical Measure of the Effectiveness of the Open Editing Model of Wikipedia

msra(2010)

Abstract
Wikipedia is commonly viewed as the main online encyclopedia. Its content quality, however, has often been questioned due to the open nature of its editing model. A high-quality contribution by an expert may be followed by a low-quality contribution made by an amateur or vandal; the quality of each article may therefore fluctuate over time as it goes through iterations of edits by different users. In this study, we model the evolution of content quality in Wikipedia articles in order to estimate the fraction of time during which articles retain high-quality status. The results show that articles tend to have high-quality content for 74% of their lifetime and that the average article quality increases as articles go through edits. To further analyze the open editing model of Wikipedia, we compare the behaviour of anonymous and registered users and show that there is a positive correlation between registration and the quality of the contributed content. In addition, we compare the evolution of the content in Wikipedia's known high-quality articles (a.k.a. featured articles) with that of the rest of the articles in order to extract features affecting quality. The results show that the high turnover of content caused by the open editing model of Wikipedia results in rapid elimination of low-quality content. These results not only suggest that the process underlying Wikipedia can be used for producing high-quality content, but also call into question the viability of collaborative knowledge repositories that impose high barriers to user participation for the purpose of filtering poor-quality contributions from the outset.

overall quality in a definitive way, two studies have tried to assess it manually by comparing Wikipedia articles to their parallel articles in other reputable sources (Giles 2005; Chesney 2006).
Nature magazine's comparative analysis of forty-two science articles in both Wikipedia and the Encyclopedia Britannica showed a surprisingly small difference; Britannica disputed this finding, saying that the errors in Wikipedia were more serious than the Britannica errors and that the source documents for the study included the junior versions of the encyclopedia as well as the Britannica yearbooks. The questions surrounding Wikipedia's open editing model have triggered a new generation of wikis such as Citizendium and Scholarpedia. These online encyclopedias follow a much more traditional editing model, in which a small number of experts produce most of the content through a peer-review process. However, there is very little evidence that these traditional editing models are better than Wikipedia's model for the purpose of creating encyclopedic knowledge. To further address these issues, one must develop methods for automatically assessing Wikipedia's quality and the parameters that affect it.

Since Wikipedia is a highly dynamic system, its articles change very frequently. Therefore, the quality of an article is a time-dependent function, and a single article may contain high- and low-quality content in different spans of its lifetime. The goal of our study is to analyze the evolution of content in Wikipedia articles over time and to estimate the fraction of time that articles are in a high-quality state.

This paper offers two main contributions to the state of the art. First, we develop an automated measure to estimate the quality of article revisions throughout the entire English Wikipedia. Using this measure, we follow the evolution of content quality and show that the fraction of time that articles are in a high-quality state has an increasing trend over time. Then, we present an empirical study of Wikipedia statistics that may explain the results obtained in our study.
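To make the target quantity concrete, the fraction of an article's lifetime spent in a high-quality state can be sketched as a time-weighted average over its revisions. The sketch below is only an illustration under stated assumptions: it presumes each revision already carries a binary quality label from some quality measure, and the names `Revision` and `high_quality_fraction` are hypothetical. It is not the automated measure developed in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Revision:
    timestamp: float    # time at which the revision was saved (any consistent unit)
    high_quality: bool  # assumed label produced by some external quality measure


def high_quality_fraction(revisions: List[Revision], end_time: float) -> float:
    """Return the fraction of the article lifetime spent in a high-quality state.

    Each revision is assumed to stay live until the next revision is saved;
    the last revision stays live until end_time.
    """
    if not revisions:
        raise ValueError("need at least one revision")
    revs = sorted(revisions, key=lambda r: r.timestamp)
    lifetime = end_time - revs[0].timestamp
    if lifetime <= 0:
        raise ValueError("end_time must be later than the first revision")

    # Time at which each revision stops being the live version.
    next_times = [r.timestamp for r in revs[1:]] + [end_time]

    good_time = 0.0
    for rev, next_time in zip(revs, next_times):
        if rev.high_quality:
            good_time += next_time - rev.timestamp
    return good_time / lifetime


# Example: high quality for 10 units, low for 5, high again for 5.
history = [Revision(0, True), Revision(10, False), Revision(15, True)]
print(high_quality_fraction(history, end_time=20))  # 0.75
```

Weighting by how long each revision stays live, rather than counting revisions, means that a low-quality edit which is reverted quickly contributes little to the total.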
We analyze the contributions of registered and anonymous users and show that there is a positive correla-