Wikipedia bi-linear link (WBLM) model: A new approach for measuring semantic similarity and relatedness between linguistic concepts using Wikipedia link structure

Information Processing & Management(2023)

Abstract
Wikipedia connects its articles through manually defined semantic relations, known as the Wikipedia hyperlink (link) structure. Existing Wikipedia link-based models for computing semantic similarity (SS) and semantic relatedness (SR), such as the Wikipedia one-way link (WOLM) model and the Wikipedia two-way link (WTLM) model, do not assess the strength of the relationship between a candidate concept and its links (out-links or in-links). These models treat all links as equally important, even though some links are semantically more influential than others and should carry more weight. This uniform treatment reduces the accuracy of these models. This paper presents the Wikipedia bi-linear link (WBLM) model, which extends the previously proposed WOLM and WTLM models. The WBLM model explores the Wikipedia link structure as a semantic graph and identifies the strongly connected links (bi-linear links) and weakly connected links (out-links or in-links) of a candidate concept. It improves the link-based vector representations of concepts by assigning weights to their connected links according to the strength of their semantic associations. The experimental results demonstrate that the proposed WBLM model significantly improves the SS and SR computation accuracy of the WOLM model (by 6.9%, 8%, 24%, 17.3%, 31.2%, 30.6%, 26.5%, and 35.4%) and of the WTLM model (by 1.2%, 3.9%, 7.1%, 9.9%, 11%, 6.3%, 12.7%, and 13%), measured as linear correlation with human judgments on the gold standard benchmarks MC30, RG65, WS203, SimLex, 353All, MTurk287, MTurk771, and MEN3000, respectively. Moreover, this research offers a deeper insight into the Wikipedia link structure and provides a sound basis for understanding it as a semantic graph.
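The weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual weighting scheme, so everything below is an assumption made for illustration: a "bi-linear" link is taken to mean a neighbor that is both an out-link and an in-link of the concept, the weights `strong_weight` and `weak_weight` are hypothetical constants, cosine similarity is used as a stand-in comparison measure, and the toy link graph is invented.

```python
import math


def invert(out_links):
    """Derive the in-link map from an out-link map."""
    in_links = {}
    for source, targets in out_links.items():
        for target in targets:
            in_links.setdefault(target, set()).add(source)
    return in_links


def weighted_link_vector(concept, out_links, in_links,
                         strong_weight=2.0, weak_weight=1.0):
    """Build a sparse link vector for a concept.

    Assumption: a neighbor linked in both directions (bi-linear) gets
    strong_weight; a neighbor linked in one direction only gets
    weak_weight. Both weights are hypothetical, not the paper's values.
    """
    outs = out_links.get(concept, set())
    ins = in_links.get(concept, set())
    vector = {}
    for neighbor in outs | ins:
        is_bilinear = neighbor in outs and neighbor in ins
        vector[neighbor] = strong_weight if is_bilinear else weak_weight
    return vector


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented toy link graph: article title -> titles it links to.
out_links = {
    "Car": {"Vehicle", "Engine", "Road"},
    "Automobile": {"Vehicle", "Engine"},
    "Vehicle": {"Car", "Automobile"},
    "Engine": {"Car"},
    "Banana": {"Fruit"},
    "Fruit": {"Banana"},
}
in_links = invert(out_links)

car = weighted_link_vector("Car", out_links, in_links)
automobile = weighted_link_vector("Automobile", out_links, in_links)
banana = weighted_link_vector("Banana", out_links, in_links)

print(cosine(car, automobile))  # concepts sharing weighted links score high
print(cosine(car, banana))      # concepts with no shared links score zero
```

Setting `strong_weight` equal to `weak_weight` reduces the sketch to an unweighted link vector, which is roughly the uniform treatment the abstract criticizes in the earlier models.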
Keywords
Graph theory, Information content, Semantic similarity, Semantic relatedness, Vector space