Text2MDT: Extracting Medical Decision Trees from Medical Texts
CoRR(2024)
摘要
Knowledge of the medical decision process, which can be modeled as medical
decision trees (MDTs), is critical to build clinical decision support systems.
However, the current MDT construction methods rely heavily on time-consuming
and laborious manual annotation. In this work, we propose a novel task,
Text2MDT, to explore the automatic extraction of MDTs from medical texts such
as medical guidelines and textbooks. We normalize the form of the MDT and
create an annotated Text-to-MDT dataset in Chinese with the participation of
medical experts. We investigate two different methods for the Text2MDT tasks:
(a) an end-to-end framework which only relies on a GPT style large language
models (LLM) instruction tuning to generate all the node information and tree
structures. (b) The pipeline framework which decomposes the Text2MDT task to
three subtasks. Experiments on our Text2MDT dataset demonstrate that: (a) the
end-to-end method basd on LLMs (7B parameters or larger) show promising
results, and successfully outperform the pipeline methods. (b) The
chain-of-thought (COT) prompting method can improve the
performance of the fine-tuned LLMs on the Text2MDT test set. (c) the
lightweight pipelined method based on encoder-based pretrained models can
perform comparably with LLMs with model complexity two magnititudes smaller.
Our Text2MDT dataset is open-sourced at
, and the source codes are
open-sourced at .
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要