Bayesian models for sparse regression analysis of high dimensional data

msra(2011)

Abstract
This paper considers the task of building efficient regression models for sparse multivariate analysis of high dimensional data sets. In particular, it focuses on cases where the number q of responses Y = (y_k, 1 ≤ k ≤ q) and the number p of predictors X = (x_j, 1 ≤ j ≤ p) to be analysed jointly are both large with respect to the sample size n, a challenging bi-directional task. The analysis of such data sets arises commonly in genetical genomics, with X linked to the DNA characteristics and Y corresponding to measurements of fundamental biological processes such as transcription, protein or metabolite production. Building on the Bayesian variable selection set-up for the linear model and the associated efficient MCMC algorithms developed for single responses, we discuss the generic framework of hierarchical related sparse regressions, where parallel regressions of y_k on the set of covariates X are linked in a hierarchical fashion, in particular through the prior model of the variable selection indicators γ_kj, which indicate which of the covariates x_j are associated with the response y_k in each multivariate regression. Structures for the joint model of the γ_kj, which correspond to different compromises between controlling sparsity and enhancing the detection of predictors that are associated with many responses ('hot spots'), will be discussed, and a new multiplicative model for the probability structure of the γ_kj will be presented. To perform inference for these models in high dimensional set-ups, novel adaptive MCMC algorithms are needed. As sparsity is paramount and most of the associations are expected to be zero, new algorithms that progressively focus on the part of the space where the most interesting associations occur are of great interest. We shall discuss their formulation and theoretical properties, and demonstrate their use on simulated and real data from genomics.
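As a rough illustration of what a multiplicative probability structure for the γ_kj could look like, the sketch below assumes that the prior inclusion probability factorises into a response-level term and a predictor-level term, capped at 1. The abstract does not state the exact form, so the names `omega_k` and `rho_j`, the capping rule, and the numerical values are all assumptions made for illustration only.

```python
import numpy as np

def selection_probabilities(omega_k, rho_j):
    """Assumed multiplicative prior: P(gamma_kj = 1) = min(1, omega_k * rho_j).

    omega_k: length-q array, baseline sparsity level of each response (assumption).
    rho_j:   length-p array, relative propensity of each predictor to be a
             'hot spot' (assumption); values above 1 inflate the probability.
    """
    omega_k = np.asarray(omega_k, dtype=float)
    rho_j = np.asarray(rho_j, dtype=float)
    # Outer product gives a q x p matrix; cap at 1 so entries stay probabilities.
    return np.minimum(1.0, np.outer(omega_k, rho_j))

def sample_indicators(probs, rng):
    """Draw independent Bernoulli indicators gamma_kj given their probabilities."""
    return (rng.random(probs.shape) < probs).astype(int)

rng = np.random.default_rng(0)
q, p = 4, 6                      # toy numbers of responses and predictors
omega_k = np.full(q, 0.05)       # every response is sparse a priori
rho_j = np.ones(p)
rho_j[2] = 10.0                  # predictor 2 plays the role of a hot spot

probs = selection_probabilities(omega_k, rho_j)
gamma = sample_indicators(probs, rng)
```

Under these assumptions, the column of `probs` for the hot-spot predictor is raised for every response at once, while the row-level term keeps each regression sparse overall.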
Keywords
eQTL, variable selection, adaptive MCMC scanning, hierarchically related regressions, genomics