Natural Language Processing Accurately Identifies Colorectal Dysplasia in a National Cohort of Veterans with Inflammatory Bowel Disease

semanticscholar(2019)

Abstract
Background: Although practice guidelines recommend colorectal cancer surveillance for inflammatory bowel disease (IBD) patients, the natural history of patients with dysplasia is poorly described. Assembling large cohorts of IBD patients with dysplasia is difficult because administrative codes are lacking. The aim of this study was to use natural language processing (NLP) in a large electronic health record (EHR) system to identify IBD patients with colonic dysplasia. Methods: We conducted a retrospective cohort study using administrative data from the national Veterans Health Administration (VHA) Corporate Data Warehouse for patients with IBD. Full-text histopathology reports from patients who underwent colonoscopy in the VHA were obtained, and a validation cohort was created using a random sample of 2000 reports. An NLP algorithm to identify the presence and grade of dysplasia was developed, and its performance was tested in the validation cohort. The final NLP algorithm was applied to the entire IBD cohort to identify all cases of colonic dysplasia. Results: We identified a total of 44,099 Veterans with IBD, with 22,431 colonoscopy-related histopathology reports. NLP had an accuracy of 97.1% for detection of low-grade dysplasia, with a precision of 87%, recall of 96.6%, and F-measure of 91.5%. When applied to the entire cohort, a total of 1,762 cases of colonic dysplasia were identified. Conclusions: NLP accurately identifies colonic low-grade dysplasia in IBD patients from a national EHR. NLP can be used to identify large cohorts of IBD patients with dysplasia to further study the natural history and outcomes of colonic dysplasia in patients with IBD.
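The reported performance figures are related in a way that can be checked. The sketch below is a minimal illustration, not the authors' evaluation code: it assumes the F-measure in the abstract is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function names `f_measure` and `metrics_from_counts` are hypothetical helpers introduced here for illustration.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is F1; other weightings exist.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive the four metrics named in the abstract from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives from comparing algorithm output with manual chart review.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure(precision, recall),
    }


if __name__ == "__main__":
    # Precision and recall as reported in the abstract (87% and 96.6%).
    f1 = f_measure(0.87, 0.966)
    print(f"F-measure: {f1 * 100:.1f}%")
```

Under the F1 assumption, the reported precision and recall reproduce the reported F-measure of 91.5%, so the three figures are internally consistent. The gap between precision (87%) and recall (96.6%) suggests the algorithm misses few true cases of low-grade dysplasia while flagging some reports that do not describe it.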