σ-GPTs: A New Approach to Autoregressive Models
arXiv (2024)

Abstract
Autoregressive models, such as the GPT family, generate sequences in a fixed order, usually left-to-right. However, this is not a necessity. In this paper, we challenge this assumption and show that, simply by adding a positional encoding for the output, the generation order can be modulated on the fly, per sample, which offers key advantageous properties. It allows sampling of, and conditioning on, arbitrary subsets of tokens, and it allows sampling multiple tokens in one shot, dynamically, according to a rejection strategy, leading to a sub-linear number of model evaluations. We evaluate our method across various domains, including language modeling, path-solving, and aircraft vertical rate prediction, decreasing the number of steps required for generation by an order of magnitude.
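To make the core idea concrete, here is a minimal sketch of how training inputs might be arranged when the generation order is a per-sample permutation. It assumes (hypothetically; the paper's actual architecture may differ) that each input token carries two positional signals: the position the token occupies in the original sequence, and the position of the next token to predict — the extra "output" positional encoding the abstract refers to. Tuples stand in for what a real model would realize as summed learned embeddings; the function name `sigma_gpt_training_pairs` is our own.

```python
def sigma_gpt_training_pairs(tokens, order):
    """Build (input, target) pairs for one permuted pass over `tokens`.

    Each input is a tuple (token, input_position, output_position):
      - token: the token the model has just consumed,
      - input_position: where that token sits in the original sequence,
      - output_position: where the NEXT token to predict sits (the
        additional output positional encoding).
    A real model would sum embeddings of these three signals instead
    of concatenating them into a tuple.
    """
    shuffled = [tokens[i] for i in order]
    inputs = [
        (shuffled[t], order[t], order[t + 1])
        for t in range(len(tokens) - 1)
    ]
    targets = shuffled[1:]  # next token in the permuted order
    return inputs, targets

tokens = ["A", "B", "C", "D"]
order = [2, 0, 3, 1]  # generate in the order C, A, D, B
inputs, targets = sigma_gpt_training_pairs(tokens, order)
# inputs[0] == ("C", 2, 0): having consumed token C (position 2),
# the model is asked to predict the token at position 0.
```

Because the output position is an explicit input rather than implied by left-to-right order, the same trained model can be queried for any unfilled position, which is what enables conditioning on arbitrary token subsets and the rejection-based burst sampling described above.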