Optimization of the de novo assembly of the transcriptome of the venom gland of Pamphobeteus verdolaga, prospecting novel bioactive peptides

crossref(2020)

Abstract

Background: Spiders are among the most venomous animals in nature. Their venom is a source of novel peptides and proteins of medicinal and biotechnological interest. However, their potential as antimicrobial, anticancer and antihypertensive agents, and even as modulators of nociception, is understudied, mainly because handling the venom is technically challenging and next-generation sequencing (NGS) data are scarce. Given the growing evidence that relying on a single transcriptome assembler underestimates the number of genes, we re-assembled and optimized the de novo transcriptome of the venom gland of the recently described Colombian spider P. verdolaga using three freely available assemblers: Trinity, SOAPdenovo and SPAdes. All assemblies were evaluated using summary statistics (e.g. number of contigs, GC%, maximum and minimum length, and N50) and by BUSCO term retrieval against the arthropod data set, to determine the best assembly for each software.

Results: Our analyses show that while approximately 54% of all the assembled and structurally annotated sequences were recovered by all three assemblers, around 23% were unique to Trinity and 21% were unique to SPAdes. The non-redundant merge of the output of all three assemblies permitted the annotation of 8640 sequences: at least 15% more than with each software separately, and 20% more than in a previous P. verdolaga assembly. Analysis of the annotated genes allowed the identification of unreported lectins, kinins and over 200 peptides and proteins with potential antimicrobial and protease inhibition activities. Furthermore, a homology search against the Arachnoserver database and the EROP knowledgebase allowed the identification of 135 novel theraphotoxins of biotechnological interest.

Conclusion: Transcriptomic data are of utmost importance for spiders, as they offer one of the more feasible and scalable ways to characterize these organisms. However, the use of a single de novo assembler implies an underrepresentation of the expressed sequences, as shown here. In the generation of data for non-model organisms, as well as in the search for novel peptides and proteins of biotechnological interest, it is highly recommended that at least two different assemblers be employed.
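
For readers unfamiliar with N50, one of the assembly statistics named in the abstract, the following is a minimal sketch of how it is commonly computed: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembled length. The function name `n50` and the example lengths are illustrative assumptions and do not come from the paper or from any of the assemblers it mentions.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Made-up contig lengths, for illustration only (not data from the study).
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))  # 500 + 400 = 900 >= 750, so this prints 400
```

A higher N50 suggests a more contiguous assembly, but it says nothing about completeness, which is presumably why the abstract pairs it with a BUSCO-based evaluation.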