Towards a Gold Standard for Evaluating Danish Word Embeddings.

LREC (2020)

Abstract
This paper presents the process of compiling a model-agnostic similarity gold standard for evaluating Danish word embeddings, based on human judgements made by 42 native speakers of Danish. Word embeddings resemble semantic similarity solely by distribution (meaning that word vectors do not distinguish relatedness from similarity), and we argue that this generalisation poses a problem in most intrinsic evaluation scenarios. To make evaluation possible on both dimensions, our human-generated dataset is designed to reflect the distinction between relatedness and similarity. The gold standard is applied to evaluate the "goodness" of six existing word embedding models for Danish. We discuss how the relatively low correlation can be explained by the fact that semantic similarity is substantially more challenging to model than relatedness, and we conclude that future human judgements may need to measure similarity in full context and along more than a single spectrum.
Keywords
word embeddings, semantic similarity, Danish
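
The intrinsic evaluation described in the abstract compares a model's similarity scores for word pairs against human judgements and reports a correlation. The following is a minimal sketch of that kind of comparison, not the authors' procedure: it assumes cosine similarity as the model score and Spearman rank correlation as the coefficient (a common choice, though the abstract does not name one). The word pairs, vectors, and human scores are invented toy values, and the names `TOY_EMBEDDINGS`, `TOY_GOLD`, and `evaluate` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def evaluate(embeddings, gold):
    """Correlate model similarities with human scores, skipping unknown words."""
    model_scores, human_scores = [], []
    for (w1, w2), score in gold.items():
        if w1 in embeddings and w2 in embeddings:
            model_scores.append(cosine(embeddings[w1], embeddings[w2]))
            human_scores.append(score)
    return spearman(model_scores, human_scores)


# Toy data, invented for illustration only (not taken from the paper).
TOY_EMBEDDINGS = {
    "hund": [1.0, 0.0],
    "kat": [0.9, 0.1],
    "bil": [0.0, 1.0],
    "vej": [0.2, 0.9],
}
TOY_GOLD = {
    ("hund", "kat"): 4.0,
    ("bil", "vej"): 2.0,
    ("hund", "bil"): 1.0,
}

if __name__ == "__main__":
    print(evaluate(TOY_EMBEDDINGS, TOY_GOLD))
```

Because the dataset separates relatedness from similarity, the same `evaluate` call could be run once per set of human scores, which would give one correlation for each dimension.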