When in Doubt Ask the Crowd: Employing Crowdsourcing for Active Learning

WIMS 2014

Abstract
Crowdsourcing has become ubiquitous in machine learning as a cost-effective method to gather training labels. In this paper we examine the challenges that arise when employing crowdsourcing for active learning, in an integrated environment where an automatic method and human labelers work together to improve their performance at a given task. By using active learning techniques on crowd-labeled data, we optimize the accuracy of the automatic method while keeping costs low by gathering data on demand. To verify our proposed methods, we apply them to the task of deduplicating publications in a digital library by examining metadata. We investigate the problems created by the noisy labels produced by the crowd and explore methods to aggregate them. We analyze how different automatic methods are affected by the quantity and quality of the allocated resources, as well as by the instance selection strategies for each active learning round, aiming to strike a balance between cost and performance.
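As a rough illustration of the pipeline the abstract describes (not the paper's actual implementation), the sketch below runs an active learning loop in which each queried instance is labeled by several simulated noisy crowd workers, their votes are aggregated by majority, and the next batch is chosen by uncertainty sampling. The synthetic pair features, worker error rate, batch size, and classifier choice are all assumptions made for illustration.

```python
# Illustrative sketch only: active learning with majority-vote aggregation of
# noisy crowd labels. All data and parameters below are synthetic assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-in for metadata-based duplicate/non-duplicate pair features.
n, d = 2000, 10
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y_true = (X @ true_w > 0).astype(int)

def crowd_label(idx, n_workers=5, error_rate=0.2):
    """Ask several noisy simulated workers for a label; return the majority vote."""
    votes = [y_true[idx] if rng.random() > error_rate else 1 - y_true[idx]
             for _ in range(n_workers)]
    return int(np.round(np.mean(votes)))

# Small crowd-labeled seed set; the rest is the unlabeled pool.
labeled = list(rng.choice(n, size=20, replace=False))
pool = [i for i in range(n) if i not in set(labeled)]
labels = {i: crowd_label(i) for i in labeled}

model = LogisticRegression()
for round_ in range(10):
    model.fit(X[labeled], [labels[i] for i in labeled])
    # Uncertainty sampling: query the pool instances whose predicted
    # probability is closest to 0.5 and send them to the crowd.
    proba = model.predict_proba(X[pool])[:, 1]
    uncertainty = np.abs(proba - 0.5)
    query = [pool[j] for j in np.argsort(uncertainty)[:10]]
    for i in query:
        labels[i] = crowd_label(i)
        labeled.append(i)
        pool.remove(i)
    acc = (model.predict(X) == y_true).mean()
    print(f"round {round_}: labeled={len(labeled)} accuracy={acc:.3f}")
```

The same loop structure accommodates other aggregation schemes (e.g. weighting workers by estimated reliability) or other selection strategies by swapping out `crowd_label` and the uncertainty criterion.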
Keywords
human computation, crowdsourcing, active learning, machine learning, learning