Omics feature selection with the extended SIS R package: identification of a body mass index epigenetic multi-marker in the Strong Heart Study.

American Journal of Epidemiology (2024)

Abstract
The statistical analysis of omics data poses a great computational challenge given its ultra-high-dimensional nature and frequent between-feature correlation. In this work, we extended the Iterative Sure Independence Screening (ISIS) algorithm by pairing ISIS with elastic-net (Enet) and two versions of adaptive Enet (AEnet and MSAEnet) to efficiently improve feature selection and effect estimation in omics research. We subsequently used genome-wide human blood DNA methylation data from American Indians of the Strong Heart Study (N=2,235 participants), measured in 1989-1991, to compare the performance (predictive accuracy, coefficient estimation, and computational efficiency) of SIS-paired regularization methods with that of Bayesian shrinkage and traditional linear regression in identifying an epigenomic multi-marker of body mass index (BMI). ISIS-AEnet outperformed the other methods in prediction. In biological pathway enrichment analysis of genes annotated to BMI-related differentially methylated positions, ISIS-AEnet captured most of the enriched pathways shared by at least two of the evaluated methods. ISIS-AEnet can favor biological discovery because it identifies the most robust biological pathways while achieving an optimal balance between bias and efficient feature selection. In the extended SIS R package, we also implemented ISIS paired with Cox and logistic regression for time-to-event and binary endpoints, respectively, as well as bootstrap confidence intervals for the estimated regression coefficients.
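To make the screening idea behind SIS concrete, the sketch below ranks features by the absolute value of their marginal correlation with the outcome and keeps the top d. It is a minimal, single-pass illustration in plain Python, not the SIS R package's interface and not the iterative (ISIS) procedure or the elastic-net stage described above. The function names (`pearson`, `sis_screen`), the toy data, and the screening size d = n / log(n) (a choice often cited in the SIS literature) are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature carries no marginal signal
    return sxy / math.sqrt(sxx * syy)


def sis_screen(columns, y, d):
    """Return sorted indices of the d features with the largest
    absolute marginal correlation with y (hypothetical helper)."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


# Hypothetical toy data: 200 samples, 500 features stored column-wise;
# only features 3 and 7 influence the outcome.
rng = random.Random(0)
n, p = 200, 500
columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [2.0 * columns[3][i] - 1.5 * columns[7][i] + rng.gauss(0, 0.5)
     for i in range(n)]

d = int(n / math.log(n))  # assumed screening size, n / log(n)
selected = sis_screen(columns, y, d)
print(len(selected), 3 in selected, 7 in selected)
```

In a two-stage workflow of the kind the abstract describes, a penalized regression such as elastic net would then be fitted only on the `selected` features; that stage is omitted here.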