
Automating Translation of Scientific Literature for Different Languages

Rishabh Gupta, Harsh Hemani, N. Sakthivel, U.D. Malshe

Crossref (2024)

Abstract
The increasing volume of scientific research necessitates efficient and accurate methods for processing and disseminating knowledge across languages. This paper presents a pipeline for the comprehensive translation of scientific documents, addressing crucial gaps in existing open-source solutions. The pipeline comprises four stages: PDF-to-image conversion for enhanced document layout analysis, robust text recognition, language-specific translation tailored to scientific terminology, and document reconstruction. It enables one-click translation of scientific documents from other languages into English, addressing a critical need in the scientific community. Text recognition is enhanced through image processing techniques, and translation is tailored to scientific terminology using domain-specific translation models. Experiments show a significant improvement in translation accuracy over other open-source models for Russian, reaching a BLEU score of 45 on scientific documents. The gains come from significantly reducing errors in scientific terminology and from predicting an optimized document structure.
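
The four stages named in the abstract can be pictured as a simple chain of functions. The sketch below is only an illustration of that structure, not the authors' implementation: the abstract does not specify any API, so every name here (`TranslationPipeline`, `TextBlock`, and the `fake_*` stage functions) is a hypothetical placeholder. The stages are passed in as plain callables so that real PDF rendering, OCR, and translation components could be substituted later.

```python
from dataclasses import dataclass
from typing import Callable, List

# All names in this sketch are hypothetical; the abstract does not define an API.


@dataclass
class TextBlock:
    page: int   # zero-based page index the text came from
    text: str   # recognized (or, later, translated) text


@dataclass
class TranslationPipeline:
    pdf_to_images: Callable[[str], List[bytes]]               # stage 1: PDF -> page images
    recognize_text: Callable[[bytes, int], List[TextBlock]]   # stage 2: image -> text blocks
    translate: Callable[[str], str]                           # stage 3: source text -> English
    reconstruct: Callable[[List[TextBlock]], str]             # stage 4: blocks -> document

    def run(self, pdf_path: str) -> str:
        images = self.pdf_to_images(pdf_path)
        blocks: List[TextBlock] = []
        for page, image in enumerate(images):
            blocks.extend(self.recognize_text(image, page))
        translated = [TextBlock(b.page, self.translate(b.text)) for b in blocks]
        return self.reconstruct(translated)


# Toy stand-ins so the sketch runs without any OCR or translation library.
def fake_pdf_to_images(path: str) -> List[bytes]:
    return [b"page-0", b"page-1"]


def fake_ocr(image: bytes, page: int) -> List[TextBlock]:
    return [TextBlock(page, image.decode())]


def fake_translate(text: str) -> str:
    return text.upper()  # placeholder for a domain-specific translation model


def fake_reconstruct(blocks: List[TextBlock]) -> str:
    return "\n".join(f"[{b.page}] {b.text}" for b in blocks)


pipeline = TranslationPipeline(fake_pdf_to_images, fake_ocr, fake_translate, fake_reconstruct)
result = pipeline.run("paper.pdf")
print(result)
```

Keeping each stage behind a callable is one possible design choice: it lets the text recognition or translation step be swapped per language without touching the rest of the chain.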