On data normalization and batch-effect correction for tumor subtyping with microRNA data

NAR GENOMICS AND BIOINFORMATICS(2023)

引用 0|浏览7
暂无评分
摘要
The discovery of new tumor subtypes has been aided by transcriptomics profiling. However, some new subtypes can be irreproducible due to data artifacts that arise from disparate experimental handling. To deal with these artifacts, methods for data normalization and batch-effect correction have been utilized before performing sample clustering for disease subtyping, despite that these methods were primarily developed for group comparison. It remains to be elucidated whether they are effective for sample clustering. We examined this issue with a re-sampling-based simulation study that leverages a pair of microRNA microarray data sets. Our study showed that (i) normalization generally benefited the discovery of sample clusters and quantile normalization tended to be the best performer, (ii) batch-effect correction was harmful when data artifacts confounded with biological signals, and (iii) their performance can be influenced by the choice of clustering method with the Prediction Around Medoidmethod based on Pearson correlation being consistently a best performer. Our study provides important insights on the use of data normalization and batch-effect correction in connection with the design of array-to-sample assignment and the choice of clustering method for facilitating accurate and reproducible discovery of tumor subtypes with microRNAs.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要