Pseudogenes limit the identification of novel common transcripts generated by their parent genes

bioRxiv (2022)

Abstract
Genomic sequences with high sequence similarity, such as parent-pseudogene pairs, cause short sequencing reads to align to multiple locations, thus complicating genomic analyses (1). However, their impact on transcriptomic analyses, including the estimation of gene expression and transcript annotation, has been less studied. Here, we investigated the impact of pseudogenes on transcriptomic analyses by focusing on the disease-relevant example of GBA1 and its expressed pseudogene GBAP1. Using short-read RNA-sequencing data from human brain samples (2), we found that only 42% of all reads mapping to GBA1 did so uniquely, with the remaining reads mapping primarily to GBAP1. This resulted in a significant misestimation of the relative expression of GBA1 to GBAP1. Using targeted long-read RNA-sequencing of 12 human brain regions, we identified 18 GBA1 transcripts that had a novel open reading frame (ORF) and 7 GBAP1 transcripts predicted to encode a protein, despite GBAP1 being classified as a pseudogene. Furthermore, we demonstrated the ability of these transcripts to generate stable protein that lacked GBA's important function as a lysosomal glucocerebrosidase (GCase). However, we found that these transcripts were surprisingly common, collectively accounting for 32% of transcription from the GBA1 locus in the caudate nucleus, and their usage showed cell type selectivity in human brain. Finally, we used annotation-independent analyses of both long- and short-read RNA-sequencing data sets to show that parent genes were more likely to have evidence of incomplete annotation. Given that 734 (17%) of the genes causing Mendelian disease have at least one pseudogene, these findings significantly impact our understanding of human disease and highlight the need for long-read RNA-sequencing analyses at many loci.

### Competing Interest Statement

S.S., Y.G., J.E., H.S. and C.F.B. are employed by Astex Pharmaceuticals. The other authors declare no competing interests.