Accuracy of a Popular Online Chat-Based Artificial Intelligence Model in Providing Recommendations on Colorectal Cancer Screening and Polyps: A Patient vs Physician-Centered Approach

The American Journal of Gastroenterology (2023)

Abstract
Introduction: The rise in online chat-based artificial intelligence (AI) interactions raises the question of whether AI can assist in patient education and provide guideline-based recommendations for physician use. We investigated whether ChatGPT 4.0, a popular AI chatbot, could accurately answer patient queries and provide recommendations to physicians on colonoscopy and colorectal cancer (CRC) screening.

Methods: We created 2 sets of 15 questions each, one from a patient’s (PA) and one from a physician’s (PH) perspective. PA questions were formulated after querying “Google Trends” for the most searched terms related to colorectal polyps and colonoscopy. PH questions were formulated by expert gastroenterologists. Questions were entered into ChatGPT on 5/10/2023, and the model was asked to provide references for its answers. Three gastroenterologists reviewed the answers and references and graded them for accuracy. An answer was considered accurate when it was appropriate and contained complete information, and inaccurate if it contained inaccurate information or was missing essential information. References were graded as suitable, unsuitable (existent but unrelated to the answer), or nonexistent (created by ChatGPT). If an answer had more than 1 reference, the overall grade reflected the incorrect reference. Free-marginal Fleiss kappa coefficients (κ) were generated to determine the level of agreement among reviewers. Answers with disagreement were discussed by all 3 gastroenterologists, who could modify their grades after discussion.

Results: For the accuracy of the generated answers, κ among reviewers was 0.83 (95% confidence interval [CI] 0.35, 0.95) for PH answers and 0.91 (95% CI 0.58, 1.00) for PA answers, indicating almost perfect agreement (APA). The discrepancies in grading were identified and discussed.
Subsequently, 100% of PA answers were graded as accurate (κ = 0.95 [95% CI 0.74, 1.00], APA) and 47% of PH answers were graded as accurate (κ = 0.87 [95% CI 0.45, 1.00], APA) (P < 0.00001 by Fisher’s exact test comparing PA vs PH accuracy) (Table 1). Of the PA references, 53% were suitable, 27% unsuitable, and 20% nonexistent. Of the PH answers, 93% had at least one nonexistent reference and one had an unsuitable reference.

Conclusion: ChatGPT provided largely accurate answers to PA questions on colonoscopy and CRC screening, indicating it could be a useful tool in patient education on CRC screening. The accuracy of answers to PH questions was suboptimal, and these answers should not be used by physicians for clinical guidance. Caution should be employed with references provided by the AI model.

Table 1. Assessment of AI-Generated Answers by Experienced Gastroenterologists

| Patient-Centered Questions | Assessment |
| --- | --- |
| Are there screening tests for colon cancer? Provide references. | Accurate |
| What is a colonoscopy? Provide references. | Accurate |
| When should I have my first colonoscopy? Provide references. | Accurate |
| How often should I have a colonoscopy? Provide references. | Accurate |
| How should I prepare for a colonoscopy? Provide references. | Accurate |
| Are there any risks or complications associated with a colonoscopy? Provide references. | Accurate |
| What causes colon polyps? Provide references. | Accurate |
| What are the symptoms of colon polyps? Provide references. | Accurate |
| Can colon polyps be prevented? Provide references. | Accurate |
| Are there different types of colon polyps? Provide references. | Accurate |
| What is the risk of colon polyps turning into cancer? Provide references. | Accurate |
| What should I do if I have a family history of colon polyps or colon cancer? Provide references. | Accurate |
| Is a colonoscopy procedure painful? Provide references. | Accurate |
| How are colon polyps removed, and is the process painful? Provide references. | Accurate |
| What is Lynch Syndrome? Provide references. | Accurate |

| Physician-Centered Questions | Assessment |
| --- | --- |
| A patient has 1 tubular adenoma on colonoscopy, when should they be back for surveillance? Provide references. | Accurate |
| A patient has one 8mm tubular adenoma on colonoscopy, when should they be back for surveillance? Provide references. | Inaccurate |
| A patient has one 10mm and one 2mm tubular adenomas on colonoscopy, when should they be back for surveillance? Provide references. | Accurate |
| A patient has one 10mm hyperplastic polyp on colonoscopy, when should they be back for surveillance? Provide references. | Inaccurate |
| A patient has 8 tubular adenomas on colonoscopy, when should they be back for surveillance? Provide references. | Accurate |
| A patient has one 5mm sessile serrated polyp on pathology, when should they be back for surveillance? Provide references. | Accurate |
| A patient has one 8mm traditional serrated adenoma on pathology, when should they be back? Provide references. | Inaccurate |
| A patient has two 15mm sessile serrated polyps in the ascending colon, three 6 to 8mm sessile serrated polyps in the sigmoid colon on colonoscopy, when should they be back for surveillance? Provide references. | Inaccurate |
| A patient has no polyps on screening colonoscopy, bowel preparation is graded per the Boston bowel preparation scale as 5, when should they be back for surveillance? Provide references. | Inaccurate |
| A patient has one 5mm sessile serrated polyp, one 10mm hyperplastic polyp in the sigmoid colon, and one 3mm tubular adenoma in the rectum on screening colonoscopy. When should they be back for surveillance? Provide references. | Inaccurate |
| A patient has no polyps on colonoscopy. A colonoscopy 3 years ago showed one tubulovillous adenoma. When should they be back for surveillance? Provide references. | Accurate |
| A patient had a screening colonoscopy, but the cecum could not be intubated due to technical difficulties, when should they be back for surveillance? Provide references. | Accurate |
| A patient had a screening colonoscopy and was found to have 11 adenomas, what should be done next? When should they have their next colonoscopy? Provide references. | Inaccurate |
| A patient had a screening colonoscopy and was found to have a 3mm polyp that was resected. Pathology comes back as lymphoid aggregate. When should they be back for surveillance? Provide references. | Accurate |
| A patient with ulcerative colitis had a colonoscopy and was found to have a 3mm hyperplastic polyp, when should they be back for surveillance? Provide references. | Inaccurate |
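
The Methods describe reviewer agreement in terms of free-marginal Fleiss kappa coefficients. The abstract does not state the software or the exact formula the authors used, so the following is only a minimal sketch, assuming the commonly cited free-marginal multirater form κ = (P_o - 1/k) / (1 - 1/k), where P_o is the mean proportion of agreeing rater pairs per item and k is the number of grading categories. The function name `free_marginal_kappa` and the ratings below are hypothetical illustrations, not the study's data.

```python
from collections import Counter


def free_marginal_kappa(ratings, n_categories):
    """Free-marginal multirater kappa (assumed form, see lead-in).

    ratings: list of items; each item is a list with one label per rater.
    n_categories: number of grading categories available to the raters (k).
    """
    if n_categories < 2:
        raise ValueError("need at least 2 categories")
    n_raters = len(ratings[0])
    if n_raters < 2:
        raise ValueError("need at least 2 raters")

    pairs_per_item = n_raters * (n_raters - 1)
    agreement_sum = 0.0
    for item in ratings:
        if len(item) != n_raters:
            raise ValueError("every item must be graded by the same number of raters")
        counts = Counter(item)
        # Proportion of ordered rater pairs that gave this item the same label.
        agreement_sum += sum(c * (c - 1) for c in counts.values()) / pairs_per_item

    p_observed = agreement_sum / len(ratings)
    p_chance = 1.0 / n_categories  # free-marginal: chance agreement is fixed at 1/k
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical example: 15 answers, 3 reviewers, 2 categories.
# 14 answers are graded unanimously; 1 answer has a 2-vs-1 split.
ratings = [["accurate"] * 3 for _ in range(14)]
ratings.append(["accurate", "accurate", "inaccurate"])

kappa = free_marginal_kappa(ratings, n_categories=2)
print(round(kappa, 2))  # prints 0.91
```

In this formulation chance agreement depends only on the number of categories, not on how often each grade was actually assigned, which is what distinguishes a free-marginal coefficient from the fixed-marginal Fleiss kappa. Confidence intervals such as those reported in the Results would require an additional step (for example, resampling) that is not shown here.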
Keywords
colorectal cancer screening, artificial intelligence model, colorectal cancer, artificial intelligence, chat-based, physician-centered