A statistical model for locating regulatory regions in genomic DNA

Evelyn M Crowley,Kathryn Roeder,Minou Bina

Journal of Molecular Biology（1997）

Cited 71|Views4

No score

Abstract

In addition to genes, chromosomal DNA contains sequences that serve as signals for turning on and off gene expression. These signals are thought to be distributed as clusters in the regulatory regions of genes. We develop a Bayesian model that views locating regulatory regions in genomic DNA as a change-point problem, with the beginning of regulatory and non-regulatory regions corresponding to the change points. The model is based on a hidden Markov chain. The data consist of nucleotide positions of protein-binding elements in a genomic DNA sequence. These positions are identified using a reference catalogue containing elements that interact with transcription factors implicated in controlling the expression of protein-encoding genes. Among the protein-binding elements in a genomic DNA sequence, the statistical model automatically selects those that tend to predict regulatory regions. We test the model using viral sequences that include known regulatory regions and provide the results obtained for human genomic DNA corresponding to the β globin locus on chromosome 11.

Translated text

Key words

hidden Markov chains,Bayesian statistics,HIV-1 regulatory regions,adenovirus regulatory regions,LCR

AI Read Science

Must-Reading Tree

Example

Generate MRT to find the research sequence of this paper

Chat Paper

Summary is being generated by the instructions you defined