TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization with Estimated Weights

Aiwei Liu, Haoping Bai, Zhiyun Lu, Yanchao Sun, Xiang Kong, Simon Wang, Jiulong Shan, Albin Madappally Jose, Xiaojiang Liu, Lijie Wen, Philip S. Yu, Meng Cao

arXiv (2024)
