Mammographic breast density classification using a deep neural network: assessment on the basis of inter-observer variability.

Proceedings of SPIE (2019)

Abstract
Mammographic breast density is an important risk marker in breast cancer screening. The ACR BI-RADS guidelines (5th ed.) define four breast density categories, which can be dichotomized into the two super-classes "dense" and "not dense". Because the categories are described qualitatively, density assessment by radiologists shows high inter-observer variability. To quantify this variability, we compute the overall percentage agreement (OPA) and Cohen's kappa of 32 radiologists with respect to the panel majority vote (PMV), based on the two super-classes. Further, we analyze the OPA between individual radiologists and compare their performance to an automated assessment by a convolutional neural network (CNN). The evaluation data comprise 600 breast cancer screening examinations with four views each. The CNN was designed to take all views of an examination as input and was trained on a dataset of 7186 cases to output one of the two super-classes. The highest agreement with the PMV achieved by a single radiologist is 99%, the lowest is 71%, and the mean is 89%. The OPA between two individual radiologists ranges from a maximum of 97.5% to a minimum of 50.5%, with a mean of 83%. Cohen's kappa values of radiologists with respect to the PMV range from 0.97 to 0.47, with a mean of 0.77. The presented algorithm reaches an OPA of 88% and a kappa of 0.75 with respect to all 32 radiologists. Our results show that inter-observer variability in breast density assessment is high even when the problem is reduced to two categories, and that our convolutional neural network can provide labelling comparable to that of an average radiologist. We also discuss how to deal with automated classification methods for subjective tasks.
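The agreement measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0/1 label encoding (1 for "dense", 0 for "not dense"), the tie-breaking rule in the majority vote, and the toy ratings are all assumptions made for illustration. In particular, the abstract does not say whether a reader's own vote is included in the PMV they are compared against; here it is.

```python
from collections import Counter


def overall_percentage_agreement(a, b):
    """Fraction of cases on which two label sequences agree (OPA)."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equally long")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    p_o = overall_percentage_agreement(a, b)
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the two raters' marginal class frequencies.
    p_e = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical class throughout
    return (p_o - p_e) / (1.0 - p_e)


def panel_majority_vote(ratings):
    """Most frequent label per case across readers.

    Assumption: ties (possible with an even number of readers) resolve
    to whichever label Counter.most_common returns first.
    """
    return [Counter(case).most_common(1)[0][0] for case in zip(*ratings)]


# Hypothetical toy data: 3 readers, 8 examinations, 1 = dense, 0 = not dense.
readers = [
    [1, 1, 0, 0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 0, 0, 0],
]
pmv = panel_majority_vote(readers)

for i, r in enumerate(readers):
    opa = overall_percentage_agreement(r, pmv)
    kappa = cohens_kappa(r, pmv)
    print(f"reader {i}: OPA = {opa:.2f}, kappa = {kappa:.2f}")
```

Passing two readers' label lists to `overall_percentage_agreement` gives the pairwise OPA between individual radiologists, and passing a model's predictions in place of a reader gives the corresponding comparison for an automated classifier.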
Keywords
Mammography, breast density, deep learning, inter-observer variability