A Sparse Plus Low-Rank Exponential Language Model for Limited Resource Scenarios

IEEE/ACM Transactions on Audio, Speech & Language Processing (2015)

Cited by 16 | Views 100
Abstract
This paper describes a new exponential language model that decomposes the model parameters into one or more low-rank matrices that learn regularities in the training data and one or more sparse matrices that learn exceptions (e.g., keywords). The low-rank matrices induce continuous-space representations of words and histories. The sparse matrices learn multi-word lexical items and topic/domain idiosyncrasies. This model generalizes the standard ℓ1-regularized exponential language model, and has an efficient accelerated first-order training algorithm. Language modeling experiments show that the approach is useful in scenarios with limited training data, including low resource languages and domain adaptation.
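The abstract describes training with an accelerated first-order method under a sparse-plus-low-rank decomposition. As a rough illustration of that idea only (not the authors' actual algorithm or objective), here is a minimal unaccelerated proximal-gradient sketch in Python, assuming a parameter matrix Θ = L + S with a nuclear-norm penalty on L, an ℓ1 penalty on S, and a toy quadratic loss; all function names and hyperparameters here are illustrative assumptions.

import numpy as np

def soft_threshold(X, tau):
    """Proximal operator of tau * ||X||_1: entrywise soft-thresholding."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Proximal operator of tau * ||X||_*: singular value soft-thresholding."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def proximal_step(L, S, grad_fn, step, lam_nuc, lam_l1):
    """One (unaccelerated) proximal gradient step on a smooth loss f(L + S),
    with a nuclear-norm penalty on L and an l1 penalty on S."""
    G = grad_fn(L + S)  # gradient of the smooth loss at Theta = L + S
    L = svd_threshold(L - step * G, step * lam_nuc)  # low-rank part: regularities
    S = soft_threshold(S - step * G, step * lam_l1)  # sparse part: exceptions
    return L, S

# Toy usage (illustrative stand-in, not the paper's loss):
# f(Theta) = 0.5 * ||Theta - T||_F^2 toward a random target T.
rng = np.random.default_rng(0)
T = rng.standard_normal((50, 40))
L = np.zeros_like(T)
S = np.zeros_like(T)
for _ in range(100):
    L, S = proximal_step(L, S, lambda Th: Th - T, step=0.5, lam_nuc=1.0, lam_l1=0.1)
print("rank(L) =", np.linalg.matrix_rank(L), " nnz(S) =", int((S != 0).sum()))

The soft-thresholding step drives most entries of S to exact zero (the exceptions it keeps are the analogue of keywords), while singular value thresholding keeps L low rank, yielding the continuous-space representations the abstract mentions. An accelerated variant, as in the paper, would add a momentum step between iterations.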
Keywords
computational linguistics, matrix algebra, natural language processing, optimisation, ℓ1-regularized exponential language model, limited resource scenario, low-rank matrix, model parameter decomposition, rank-penalized optimization, sparse matrices, language model, exponential model, log-bilinear model, data models, history, matrix decomposition