Towards a Universal Text Classifier: Transfer Learning Using Encyclopedic Knowledge

Miami, FL (2009)

Abstract
Document classification is a key task for many text mining applications. However, traditional text classification requires labeled data to construct reliable and accurate classifiers, and labeled data are seldom available. In this work, we propose a universal text classifier that does not require any labeled documents. Our approach simulates the ability of people to classify documents based on background knowledge. As such, we build a classifier that can effectively group documents based on their content, guided by a few words describing the classes of interest. Background knowledge is modeled using encyclopedic knowledge, namely Wikipedia. The universal text classifier can also be used to perform document retrieval. In experiments with real data, we test the feasibility of our approach for both the classification and retrieval tasks.
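The idea of classifying documents without labeled examples, guided only by a few class-describing words and some background knowledge, can be illustrated with a minimal sketch. This is not the method of the paper, whose details are not given in the abstract: it assumes a plain bag-of-words cosine similarity, and the `background` dictionary is a hypothetical toy stand-in for knowledge that would come from Wikipedia. All function names are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def expand(description, background):
    """Build a term vector for a class from its describing words plus
    any related terms found in the (hypothetical) background knowledge."""
    tokens = tokenize(description)
    expanded = list(tokens)
    for token in tokens:
        expanded.extend(tokenize(background.get(token, "")))
    return Counter(expanded)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(document, classes, background):
    """Assign the document to the class whose expanded description is
    most similar to it. No labeled documents are used."""
    doc_vec = Counter(tokenize(document))
    scores = {
        label: cosine(doc_vec, expand(description, background))
        for label, description in classes.items()
    }
    return max(scores, key=scores.get), scores


def retrieve(documents, query, background):
    """Rank documents by similarity to the expanded query, best first."""
    query_vec = expand(query, background)
    return sorted(
        documents,
        key=lambda d: cosine(Counter(tokenize(d)), query_vec),
        reverse=True,
    )


# Toy background knowledge (an assumption; the paper uses Wikipedia).
background = {
    "sports": "football match team player goal",
    "finance": "bank stock market money",
}
classes = {"sports": "sports", "finance": "finance"}

docs = [
    "The team scored a goal in the final match",
    "The bank raised money on the stock market",
]
for doc in docs:
    label, _ = classify(doc, classes, background)
    print(label, "<-", doc)
```

In this toy setup neither document contains the class word itself, so the assignment depends entirely on the background terms. The same similarity score drives `retrieve`, which mirrors the abstract's point that one model can serve both classification and retrieval.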
Keywords
wikipedia, document retrieval, group document, information retrieval, learning transfer, traditional text classification, universal text classifier, document classification, transfer learning, accurate classifier, data mining, text analysis, text mining application, text mining, encyclopedic knowledge, kernel, cryptography, supervised learning, encyclopedias, electronic publishing, internet