The Limits of Fair Medical Imaging AI In The Wild
CoRR (2023)
Abstract
As artificial intelligence (AI) rapidly approaches human-level performance in
medical imaging, it is crucial that it does not exacerbate or propagate
healthcare disparities. Prior research has established AI's capacity to infer
demographic data from chest X-rays, leading to a key concern: do models that rely on
demographic shortcuts make unfair predictions across subpopulations? In this
study, we conduct a thorough investigation into the extent to which medical AI
utilizes demographic encodings, focusing on potential fairness discrepancies
within both in-distribution training sets and external test sets. Our analysis
covers three key medical imaging disciplines: radiology, dermatology, and
ophthalmology, and incorporates data from six global chest X-ray datasets. We
confirm that medical imaging AI leverages demographic shortcuts in disease
classification. Although algorithmically correcting these shortcuts effectively
closes fairness gaps and yields "locally optimal" models within the original
data distribution, this optimality does not hold in new test settings.
Surprisingly, we find that models that encode less demographic information
are often the most "globally optimal", exhibiting better fairness when
evaluated in new test environments. Our work establishes best practices for
medical imaging models that maintain their performance and fairness in
deployments beyond their initial training contexts, underscoring critical
considerations for AI clinical deployments across populations and sites.
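
The abstract does not define how a "fairness gap" is measured. As an illustration only, one common convention (an assumption here, not necessarily the paper's metric) is the difference in false negative rate between the worst-off and best-off demographic subgroups. The function names and the toy data below are hypothetical.

```python
from collections import defaultdict


def false_negative_rates(y_true, y_pred, groups):
    """Return {group: false negative rate}, computed over each group's positive cases."""
    positives = defaultdict(int)
    misses = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 0:
                misses[group] += 1
    # Groups with no positive cases are omitted, since their rate is undefined.
    return {g: misses[g] / positives[g] for g in positives}


def fairness_gap(y_true, y_pred, groups):
    """Assumed metric: largest minus smallest subgroup false negative rate."""
    rates = false_negative_rates(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Toy labels, predictions, and subgroup memberships (made up for illustration).
y_true = [1, 1, 1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B"]

print(false_negative_rates(y_true, y_pred, groups))
print(fairness_gap(y_true, y_pred, groups))
```

Under this convention, a "locally optimal" model would show a small gap on held-out data from the training distribution, while a "globally optimal" one would keep the gap small when the same computation is repeated on an external dataset.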