Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling
CVPR 2024(2024)
摘要
Machine learning holds tremendous promise for transforming the fundamental
practice of scientific discovery by virtue of its data-driven nature. With the
ever-increasing stream of research data collection, it would be appealing to
autonomously explore patterns and insights from observational data for
discovering novel classes of phenotypes and concepts. However, in the
biomedical domain, there are several challenges inherently presented in the
cumulated data which hamper the progress of novel class discovery. The
non-i.i.d. data distribution accompanied by the severe imbalance among
different groups of classes essentially leads to ambiguous and biased semantic
representations. In this work, we present a geometry-constrained probabilistic
modeling treatment to resolve the identified issues. First, we propose to
parameterize the approximated posterior of instance embedding as a marginal von
MisesFisher distribution to account for the interference of distributional
latent bias. Then, we incorporate a suite of critical geometric properties to
impose proper constraints on the layout of constructed embedding space, which
in turn minimizes the uncontrollable risk for unknown class learning and
structuring. Furthermore, a spectral graph-theoretic method is devised to
estimate the number of potential novel classes. It inherits two intriguing
merits compared to existent approaches, namely high computational efficiency
and flexibility for taxonomy-adaptive estimation. Extensive experiments across
various biomedical scenarios substantiate the effectiveness and general
applicability of our method.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要