A Clustering System for Gene Expression Data Based upon Genetic Programming and the HS-Model

CSO '10 Proceedings of the 2010 Third International Joint Conference on Computational Science and Optimization - Volume 01 (2010)

Abstract
Cluster analysis is a major method for studying gene function and gene regulation, because prior knowledge about gene data is usually lacking. Many existing clustering methods require manual operations or pre-determined parameters, which are difficult to supply for gene data. In addition, gene data have their own characteristics, such as large scale, high dimensionality, and noise. A systematic clustering algorithm is therefore needed to deal with gene data effectively. This paper proposes a novel genetic programming (GP) clustering system for gene data based on the hierarchical statistical model (HS-model), together with an appropriate fitness function. The system largely eliminates the influence of data scale and dimension. The proposed GP clustering system is applied to cluster the complete yeast gene data set without dimensionality reduction. The experimental results indicate that the algorithm is highly efficient and can deal effectively with missing values in gene data sets.
Keywords
clustering system, genetic programming, gene dataset, data scale, systematic clustering algorithm, whole intact yeast gene, gene function, gene data, clustering method, gene regulation information, proposed gp clustering system, gene expression data, information analysis, statistical model, clustering algorithms, gene expression, gene regulation, bioinformatics, cluster analysis, missing values, genetic algorithms, statistical analysis, fitness function, computer science