‘We are Not Groupies… We are Band Aids’: Assessment Reliability in the AI Song Contest

John Ashley Burgoyne, Hendrik Vincent Koops

Transactions of the International Society for Music Information Retrieval (2021)

Abstract

In 2020, inspired by the expectation that Rotterdam would host the Eurovision Song Contest, the Dutch public broadcaster VPRO sponsored an international AI Song Contest. The winner was determined by combining an online public vote, which attracted 3800 voters across 70 countries, with the ratings of three professional judges. In this paper, we analyse the voters’ and judges’ ratings to assess the reliability of the contest results and to make recommendations for evaluating the contest in the future. We focus on Rasch-type models because of their strong measurement characteristics, but also consider a mixture variant to inflate counts for the 46 percent of voters who exhibited ‘groupie’-like behaviour: voting for one team only and giving their team a perfect score. We find that the overall reliability of the AI Song Contest evaluation was excellent (ρ = .90) but that the large number of one-time voters distorted the results. These findings pose a dilemma for organising such a contest in the future: to what extent is a popularity contest desirable and even expected from a broader voting public, and to what extent should such a contest strive for an objective measurement of the quality of AI-composed music?

Keywords

AI Song Contest, reliability, music competitions, measurement, Rasch models
