
Deep Anytime-Valid Hypothesis Testing

International Conference on Artificial Intelligence and Statistics, Vol. 238 (2024)

Univ Amsterdam | Carnegie Mellon Univ

Abstract
We propose a general framework for constructing powerful, sequential hypothesis tests for a large class of nonparametric testing problems. The null hypothesis for these problems is defined in an abstract form using the action of two known operators on the data distribution. This abstraction allows for a unified treatment of several classical tasks, such as two-sample testing, independence testing, and conditional-independence testing, as well as modern problems, such as testing for adversarial robustness of machine learning (ML) models. Our proposed framework has the following advantages over classical batch tests: 1) it continuously monitors online data streams and efficiently aggregates evidence against the null, 2) it provides tight control over the type I error without the need for multiple testing correction, 3) it adapts the sample size requirement to the unknown hardness of the problem. We develop a principled approach of leveraging the representation capability of ML models within the testing-by-betting framework, a game-theoretic approach for designing sequential tests. Empirical results on synthetic and real-world datasets demonstrate that tests instantiated using our general framework are competitive against specialized baselines on several tasks.
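The testing-by-betting idea behind the framework can be illustrated with a minimal sequential two-sample test. The sketch below is an assumption-laden simplification: it uses a fixed bounded witness function and a fixed bet fraction, whereas the paper learns the witness with an ML model. Under the null (both streams share a distribution), the payoff has conditional mean zero, so the wealth process is a nonnegative martingale and Ville's inequality bounds the type-I error by alpha at every stopping time.

```python
import numpy as np

def sequential_two_sample_test(xs, ys, alpha=0.05, bet=0.5):
    """Minimal testing-by-betting two-sample test (illustrative sketch).

    Under H0, E[payoff] = 0, so wealth is a nonnegative martingale;
    Ville's inequality gives P(wealth ever >= 1/alpha) <= alpha,
    i.e. anytime-valid type-I error control with no correction.
    """
    wealth = 1.0
    for x, y in zip(xs, ys):
        # Fixed bounded witness function (assumption for this sketch);
        # the framework would learn this from data with an ML model.
        payoff = (np.tanh(x) - np.tanh(y)) / 2.0  # lies in [-1, 1]
        wealth *= 1.0 + bet * payoff              # place the bet
        if wealth >= 1.0 / alpha:
            return True, wealth                   # reject H0, stop early
    return False, wealth                          # evidence insufficient
```

Because the threshold is valid at any data-dependent stopping time, the test can monitor a stream continuously and stop as soon as the accumulated evidence crosses 1/alpha, using fewer samples on easier problems.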
Key words
Robust Learning

[Key points]: This paper proposes a general framework for constructing powerful sequential hypothesis tests for a broad class of nonparametric testing problems. The resulting tests continuously monitor data streams, tightly control the type-I error, and adapt the sample-size requirement to the unknown hardness of the problem, while combining the representation capability of machine learning models with the game-theoretic testing-by-betting framework.

[Method]: The null hypothesis is defined abstractly through the action of two known operators on the data distribution. This abstraction gives a unified treatment of several classical tasks and modern problems, such as two-sample testing, independence testing, conditional-independence testing, and testing the adversarial robustness of ML models.

[Experiments]: Results on synthetic and real-world datasets show that tests instantiated from the framework are competitive with specialized baselines on several tasks. Specific dataset names are not mentioned in this summary.
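A key ingredient of such sequential tests is choosing the bet size from past data only, so that the wealth process remains a nonnegative (super)martingale under the null. The sketch below uses plain projected gradient ascent on the log-wealth as the betting strategy; this update rule is an assumption for illustration (the testing-by-betting literature often uses more refined schemes such as Online Newton Step), and `payoffs` stands for any stream of values in [-1, 1] with mean zero under H0.

```python
import numpy as np

def adaptive_betting_test(payoffs, alpha=0.05, lr=0.1):
    """Sketch: learn the bet fraction online while testing.

    Each bet lam is fixed before the next payoff is revealed, so under
    H0 the wealth is a nonnegative martingale and Ville's inequality
    still bounds the type-I error by alpha at any stopping time.
    """
    wealth, lam = 1.0, 0.0
    for f in payoffs:
        wealth *= 1.0 + lam * f          # settle the current bet
        if wealth >= 1.0 / alpha:
            return True, wealth          # reject H0
        # Gradient ascent on log-wealth: d/dlam log(1+lam*f) = f/(1+lam*f)
        lam += lr * f / (1.0 + lam * f)
        lam = float(np.clip(lam, -0.5, 0.5))  # keep 1+lam*f > 0
    return False, wealth
```

The clipping step guarantees the wealth stays strictly positive; when the payoffs have a positive mean (the alternative holds), the learned bet drifts toward its bound and the wealth grows geometrically, so the test rejects quickly.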