Model Misspecification and Robust Analysis for Outcome‐dependent Sampling Designs under Generalized Linear Models

Jacob M. Maronge, Jonathan S. Schildcrout, Paul J. Rathouz

Statistics in Medicine (2023)

Abstract

Outcome‐dependent sampling (ODS) is a commonly used class of sampling designs to increase estimation efficiency in settings where response information (and possibly adjuster covariates) is available, but the exposure is expensive and/or cumbersome to collect. We focus on ODS within the context of a two‐phase study, where in Phase One the response and adjuster covariate information is collected on a large cohort that is representative of the target population, but the expensive exposure variable is not yet measured. In Phase Two, using response information from Phase One, we selectively oversample a subset of informative subjects in whom we collect expensive exposure information. Importantly, the Phase Two sample is no longer representative, and we must use ascertainment‐correcting analysis procedures for valid inferences. In this paper, we focus on likelihood‐based analysis procedures, particularly a conditional‐likelihood approach and a full‐likelihood approach. Whereas the full‐likelihood retains incomplete Phase One data for subjects not selected into Phase Two, the conditional‐likelihood explicitly conditions on Phase Two sample selection (ie, it is a “complete case” analysis procedure). These designs and analysis procedures are typically implemented assuming a known, parametric model for the response distribution. However, in this paper, we approach analyses implementing a novel semi‐parametric extension to generalized linear models (SPGLM) to develop likelihood‐based procedures with improved robustness to misspecification of distributional assumptions. We specifically focus on the common setting where standard GLM distributional assumptions are not satisfied (eg, misspecified mean/variance relationship). We aim to provide practical design guidance and flexible tools for practitioners in these settings.
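
As a sketch of the ascertainment correction behind the conditional-likelihood approach described above, suppose (an assumption made here for illustration, not stated in the abstract) that Phase Two selection depends only on the response, through a known probability \pi(y) = P(S = 1 \mid Y = y), and write f(y \mid x; \theta) for the assumed response model given exposure and adjuster covariates x. The symbols S, \pi, f, and \theta are illustrative notation and may differ from the authors' own.

```latex
% Contribution of subject i, selected into Phase Two (S_i = 1),
% assuming selection depends on the response only.
\[
  f(y_i \mid x_i, S_i = 1; \theta)
  = \frac{\pi(y_i)\, f(y_i \mid x_i; \theta)}
         {\int \pi(u)\, f(u \mid x_i; \theta)\, du}
\]
% Conditional log-likelihood over the Phase Two sample.
% \log \pi(y_i) does not involve \theta and is absorbed into the constant.
\[
  \ell_C(\theta)
  = \sum_{i \,:\, S_i = 1}
    \left[ \log f(y_i \mid x_i; \theta)
         - \log \int \pi(u)\, f(u \mid x_i; \theta)\, du \right]
    + \text{const}
\]
```

For a discrete response the integral becomes a sum. If \pi is constant (simple random sampling), the second term no longer depends on \theta and the expression reduces to the ordinary log-likelihood, which is why the correction matters only when sampling depends on the outcome.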

Keywords

efficiency, generalized linear models, outcome-dependent sampling, semi-parametric models, two-phase studies
