Deep Learning for Character-Based Information Extraction.

ECIR (2014)

Abstract
Table 1 summarizes statistics of the datasets used in our experiments. (1) The CTB data used for word segmentation (WS) and part-of-speech tagging (POS) is from Chinese Treebank 6.0 (LDC2007T36), released in 2007. It comprises 2,036 text files containing 28,295 sentences, 781,351 words, and about 1.3M Chinese characters. (2) The CITYU named entity recognition (NER) data is from SIGHAN3 [6] and includes around 1.8M NE-labeled Chinese characters. (3) The CB513 data for the secondary structure (SS) task consists of 513 unrelated proteins with known 3D structure. In total, CB513 includes about 84k amino acid characters labeled with SS target tags [8].
Keywords
Information Extraction, Deep Learning, Conditional Random Field, Output Label, Deep Neural Network