Progressive modality-complement aggregative multitransformer for domain multi-modal neural machine translation

Pattern Recognition (2024)

Abstract
Domain-specific Multi-modal Neural Machine Translation (DMNMT) aims to translate domain-specific sentences from a source language to a target language by incorporating text-related visual information. Domain-specific text and image data often complement each other and can jointly enhance the representation of domain-specific information. However, there is a considerable modality gap between image and text in both data format and semantic expression, which poses distinctive challenges for domain-text translation tasks. Narrowing the modality gap and improving domain-aware representation are therefore two critical challenges in DMNMT. To this end, this paper proposes a progressive modality-complement aggregative MultiTransformer, which aims to simultaneously narrow the modality gap and capture domain-specific multi-modal representations. We first adopt a bidirectional progressive cross-modal interactive strategy that narrows the text-to-text, text-to-visual, and visual-to-text semantic gaps in the multi-modal representation space by integrating visual and text information layer by layer. We then introduce a modality-complement MultiTransformer, built on this progressive cross-modal interaction, to extract domain-related multi-modal representations and thereby improve machine translation performance. Experiments on the Fashion-MMT and Multi-30k datasets show that the proposed approach outperforms the compared state-of-the-art (SOTA) methods on the En-Zh task in the e-commerce domain and on the En-De, En-Fr, and En-Cs tasks of Multi-30k in the general domain. In-depth analysis confirms the validity of the proposed modality-complement MultiTransformer and the bidirectional progressive cross-modal interactive strategy for DMNMT.
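
The layer-by-layer bidirectional interaction described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the function names (`cross_attend`, `progressive_interaction`), the single-head attention without learned projections, the residual updates, the feature sizes, and the layer count are all assumptions chosen only to illustrate text and image features repeatedly attending to each other.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, context):
    # Single-head scaled dot-product attention: each query row becomes
    # a weighted average of the context rows. No learned projections
    # are used here (an assumption made to keep the sketch short).
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ context

def progressive_interaction(text, image, num_layers=3):
    # Hypothetical progressive exchange: at every layer the text tokens
    # attend to themselves (text-to-text) and to the image regions
    # (text-to-visual), while the image regions attend to the text
    # (visual-to-text). Residual additions carry information forward.
    for _ in range(num_layers):
        text_new = text + cross_attend(text, text) + cross_attend(text, image)
        image_new = image + cross_attend(image, text)
        text, image = text_new, image_new
    return text, image

# Toy inputs: 5 text tokens and 4 image regions, both with 8 features.
rng = np.random.default_rng(0)
text_feats = rng.normal(size=(5, 8))
image_feats = rng.normal(size=(4, 8))

fused_text, fused_image = progressive_interaction(text_feats, image_feats)
print(fused_text.shape, fused_image.shape)
```

Both modalities are updated from the previous layer's states before being overwritten, so the exchange is symmetric within each layer. A real model would add learned query, key, and value projections, multiple heads, normalization, and the modality-specific masking named in the keywords.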
Keywords
Domain multi-modal neural machine translation, Multi-modal transformer, Progressive modality-complement, Modality-specific mask