OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking
CoRR (2023)
Abstract
Large language models (LLMs) have revolutionized the landscape of Natural
Language Processing systems, but are computationally expensive. To reduce the
cost without sacrificing performance, previous studies have explored various
approaches to harness the potential of Small Language Models (SLMs) as
cost-effective alternatives to their larger counterparts. Driven by findings
that SLMs and LLMs exhibit complementary strengths in a structured knowledge
extraction task, this work presents a novel SLM/LLM routing framework designed
to improve computational efficiency and enhance task performance. First,
exemplar pools are created to represent the types of contexts where each LM
provides a more reliable answer, leveraging a sentence embedding fine-tuned so
that context similarity is close to dialogue state similarity. Then, during
inference, the k-nearest exemplars to the testing instance are retrieved, and
the instance is routed according to majority vote. In dialogue state tracking
tasks, the proposed routing framework enhances performance substantially
compared to relying solely on LLMs, while reducing the computational costs by
over 50%.
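
The retrieval-and-vote routing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cosine`, `route`), the `exemplar_pools` dictionary layout, the use of cosine similarity, and the toy 2-D vectors are all assumptions made for illustration. In the paper, the embeddings would come from the fine-tuned sentence embedding model rather than being hand-written.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def route(query_vec, exemplar_pools, k=3):
    """Pick a model for one test instance by k-nearest-exemplar majority vote.

    exemplar_pools is a hypothetical structure: a dict mapping a model name
    (e.g. "slm", "llm") to the embeddings of contexts that model handled
    reliably.
    """
    # Score every exemplar in every pool against the query embedding.
    scored = [
        (cosine(query_vec, vec), model)
        for model, vecs in exemplar_pools.items()
        for vec in vecs
    ]
    # Keep the k most similar exemplars.
    top_k = sorted(scored, key=lambda s: s[0], reverse=True)[:k]
    # Majority vote over the pool labels of the retrieved exemplars.
    votes = Counter(model for _, model in top_k)
    return votes.most_common(1)[0][0]


# Toy exemplar pools with hand-written 2-D "embeddings" (illustrative only).
pools = {
    "slm": [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3]],
    "llm": [[0.0, 1.0], [0.1, 0.9], [0.2, 0.8]],
}

print(route([1.0, 0.1], pools, k=3))  # slm
print(route([0.1, 1.0], pools, k=3))  # llm
```

With two pools and an odd `k`, the vote cannot tie. For other settings, this sketch resolves ties in favor of the model whose exemplar ranks highest among the retrieved neighbors, which is a design choice of the sketch and not something the abstract specifies.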