Nemotron-4 15B Technical Report

Jupinder Parmar, Shrimai Prabhumoye, Joseph Jennings, Mostofa Patwary, Sandeep Subramanian, Dan Su, Chen Zhu, Deepak Narayanan, Aastha Jhunjhunwala, Ayush Dattagupta, Vibhu Jawa, Jiwei Liu, Ameya Mahabaleshwarkar, Osvald Nitski, Annika Brundyn, James Maki, Miguel Martinez, Jiaxuan You, John Kamalu, Patrick LeGresley, Denys Fridman, Jared Casper, Ashwath Aithal, Oleksii Kuchaiev, Mohammad Shoeybi, Jonathan Cohen, Bryan Catanzaro

CoRR (2024)

Abstract
We introduce Nemotron-4 15B, a 15-billion-parameter large multilingual language model trained on 8 trillion text tokens. Nemotron-4 15B demonstrates strong performance when assessed on English, multilingual, and coding tasks: it outperforms all existing similarly-sized open models on 4 out of 7 downstream evaluation areas and achieves performance competitive with the leading open models in the remaining ones. Specifically, Nemotron-4 15B exhibits the best multilingual capabilities of all similarly-sized models, even outperforming models over four times larger and those explicitly specialized for multilingual tasks.