GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, Zhifeng Chen
Semantic Scholar (2021)