Variational Classification

Shehzaad Dhuliawala, Mrinmaya Sachan, Carl E. Allen

arXiv (Cornell University) (2023)

Abstract

We present a latent variable model for classification that provides a novel probabilistic interpretation of neural network softmax classifiers. We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders, that generalises the cross-entropy loss used to train classification models. Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency between their anticipated distribution, required for accurate label predictions to be output, and the empirical distribution found in practice. We augment the variational objective to mitigate such inconsistency and encourage a chosen latent distribution, instead of the implicit assumption in off-the-shelf softmax classifiers. Overall, we provide new theoretical insight into the inner workings of widely-used softmax classification. Empirical evaluation on image and text classification datasets demonstrates that our proposed approach, variational classification, maintains classification accuracy while the reshaped latent space improves other desirable properties of a classifier, such as calibration, adversarial robustness, robustness to distribution shift and sample efficiency useful in low data settings.
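To make concrete the idea that a softmax layer implicitly assumes a distribution over its inputs, here is a minimal sketch based on a standard textbook observation, not necessarily the exact construction used in the paper. It assumes, purely for illustration, that the latent variable z given class y is an isotropic Gaussian with a shared variance; under that assumption, Bayes' rule yields exactly a softmax over logits that are linear in z. All names and values (`means`, `priors`, `var`, the helper functions) are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_log_density(z, mean, var):
    # log N(z; mean, var * I) for an isotropic Gaussian (illustrative assumption).
    d = len(z)
    sq_dist = sum((zi - mi) ** 2 for zi, mi in zip(z, mean))
    return -0.5 * d * math.log(2 * math.pi * var) - 0.5 * sq_dist / var

def posterior_via_bayes(z, means, priors, var):
    # p(y|z) is proportional to p(z|y) p(y); normalise in log space.
    log_joint = [gaussian_log_density(z, m, var) + math.log(p)
                 for m, p in zip(means, priors)]
    return softmax(log_joint)

def posterior_via_linear_softmax(z, means, priors, var):
    # The same posterior written as softmax(w_y . z + b_y):
    # the |z|^2 term and the Gaussian normaliser are shared by all
    # classes and cancel, leaving logits that are linear in z.
    logits = []
    for m, p in zip(means, priors):
        w = [mi / var for mi in m]
        b = -0.5 * sum(mi * mi for mi in m) / var + math.log(p)
        logits.append(sum(wi * zi for wi, zi in zip(w, z)) + b)
    return softmax(logits)

def cross_entropy(z, label, means, priors, var):
    # Cross-entropy loss for one (z, label) pair under the softmax form.
    return -math.log(posterior_via_linear_softmax(z, means, priors, var)[label])

# Hypothetical class means, priors, shared variance and latent point.
means = [[1.0, 0.0], [-1.0, 0.5], [0.0, -2.0]]
priors = [0.5, 0.3, 0.2]
var = 0.7
z = [0.3, -0.4]

p_bayes = posterior_via_bayes(z, means, priors, var)
p_linear = posterior_via_linear_softmax(z, means, priors, var)
loss = cross_entropy(z, 0, means, priors, var)
```

The two posteriors agree to floating-point precision, which illustrates why the softmax output is only a faithful label posterior if the latent inputs actually follow the assumed class-conditional distribution.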

Keywords

classification
