DSGPT: Domain-Specific Generative Pre-Training of Transformers for Text Generation in E-commerce Title and Review Summarization

ACM SIGIR Conference on Research and Development in Information Retrieval (2021)

Abstract
We propose a novel domain-specific generative pre-training (DSGPT) method for text generation and apply it to the product title and review summarization problems on E-commerce mobile display. First, we adopt a decoder-only transformer architecture, which fits fine-tuning tasks well by combining input and output all together. Second, we demonstrate that utilizing only a small amount of pre-training data in related domains is powerful. Pre-training a language model from a general corpus such as Wikipedia or the Common Crawl requires a tremendous commitment of time and resources, and can be wasteful if the downstream tasks are limited in variety. Our DSGPT is pre-trained on a limited dataset, the Chinese short text summarization dataset (LCSTS). Third, our model does not require product-related human-labeled data. For the title summarization task, the state of the art explicitly uses additional background knowledge in the training and prediction stages. In contrast, our model implicitly captures this knowledge and achieves significant improvement over other methods after fine-tuning on the public Taobao.com dataset. For the review summarization task, we utilize a JD.com in-house dataset and observe similar improvement over standard machine translation methods, which lack the flexibility of fine-tuning. Our proposed work can be easily extended to other domains for a wide range of text generation tasks.
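To illustrate the "combining input and output all together" setup described in the abstract, the following minimal Python sketch shows one common way to prepare training examples for a decoder-only language model: source and target are concatenated into a single sequence and the loss is masked on the source segment. This is not the authors' released code; the separator/end-of-sequence ids, the toy token ids, and the helper name `build_example` are hypothetical placeholders.

```python
IGNORE_INDEX = -100  # common convention: positions with this label are excluded from the LM loss

def build_example(source_ids, target_ids, sep_id, eos_id):
    """Combine the input text and the target summary into one decoder-only sequence."""
    input_ids = source_ids + [sep_id] + target_ids + [eos_id]
    # Mask the source segment (and the separator) so the loss is computed only on summary tokens.
    labels = [IGNORE_INDEX] * (len(source_ids) + 1) + target_ids + [eos_id]
    return input_ids, labels

# Toy usage with hypothetical token ids
src = [11, 23, 42, 7]   # tokens of a product title / review
tgt = [55, 9, 31]       # tokens of the reference summary
input_ids, labels = build_example(src, tgt, sep_id=2, eos_id=3)
print(input_ids)  # [11, 23, 42, 7, 2, 55, 9, 31, 3]
print(labels)     # [-100, -100, -100, -100, -100, 55, 9, 31, 3]
```

Under this formulation, pre-training on LCSTS and fine-tuning on the Taobao.com or JD.com data would use the same next-token objective; only the source/target pairs change.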
Keywords
text generation, generative pre-training, abstractive summarization, decoder transformer