M3S-ALG: Improved and Robust Prediction of Allergenicity of Chemical Compounds by Using a Novel Multi-Step Stacking Strategy

Phasit Charoenkwan, Nalini Schaduangrat, Le Thi Phan, Balachandran Manavalan, Watshara Shoombuatong

Future Generation Computer Systems (2024)

Abstract
A wide variety of chemicals cannot be introduced to the marketplace because of their high allergenicity. It is therefore crucial to assess the allergenic potential of chemicals before introducing them into clinical therapeutics. However, assessing the allergenicity of chemical compounds experimentally is time-consuming and costly. To tackle this challenge, we propose M3S-ALG, which uses a novel multi-step stacking strategy (M3S) to identify the allergenicity of chemical compounds rapidly and accurately from their SMILES notation alone. The proposed M3S method involves three steps. First, ten different balanced datasets were constructed using an under-sampling approach. Second, for each balanced dataset, 144 base-classifiers were trained and optimized; their prediction scores for allergenic chemical compounds were then treated as new probabilistic features. Third, we selected the important probabilistic features and used them to construct the final stacked model (M3S-ALG). Experimental results show that M3S-ALG outperforms conventional ensemble strategies and its constituent base-classifiers on both the training and independent test datasets, which indicates the effectiveness and robustness of the proposed strategy in identifying the allergenicity of chemical compounds. In addition, M3S-ALG compared favorably with existing methods on the independent test dataset, achieving a balanced accuracy of 0.877, an MCC of 0.712, and an AUC of 0.931. Finally, we developed a user-friendly online web server at https://pmlabqsar.pythonanywhere.com/M3SALG. This new approach is anticipated to help the drug discovery and development community identify, at large scale, chemical compounds with no allergenic properties.
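The three steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: every function name, the synthetic data, and the parameter values (`top_k`, `epochs`, `lr`) are hypothetical. A one-feature centroid scorer stands in for the 144 optimized base-classifiers, a small logistic regression stands in for the final stacked model, balanced accuracy at a 0.5 threshold stands in for the unspecified feature-selection criterion, and plain numeric columns stand in for descriptors derived from SMILES. The sketch also scores the training rows directly, whereas a careful stacking setup would normally use cross-validated predictions for the probabilistic features.

```python
import math
import random


def undersample(X, y, rng):
    """Step 1: keep every minority-class row plus an equal-sized random draw of majority rows."""
    pos = [i for i, t in enumerate(y) if t == 1]
    neg = [i for i, t in enumerate(y) if t == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = minority + rng.sample(majority, len(minority))
    return [X[i] for i in keep], [y[i] for i in keep]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_centroid_scorer(X, y, col):
    """Step 2 (toy base-classifier): score one column by its distance to the two class centroids."""
    c_pos = sum(x[col] for x, t in zip(X, y) if t == 1) / sum(y)
    c_neg = sum(x[col] for x, t in zip(X, y) if t == 0) / (len(y) - sum(y))
    return lambda x: sigmoid(abs(x[col] - c_neg) - abs(x[col] - c_pos))


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(y_true)
    return 0.5 * (tp / n_pos + tn / (len(y_true) - n_pos))


def fit_logistic(F, y, epochs=300, lr=0.5):
    """Toy meta-learner: logistic regression trained by batch gradient descent."""
    w, b, n = [0.0] * len(F[0]), 0.0, len(F)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for f, t in zip(F, y):
            err = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) - t
            gw = [g + err * fi for g, fi in zip(gw, f)]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return lambda f: sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)


def train_m3s_sketch(X, y, n_subsets=10, top_k=5, seed=0):
    rng = random.Random(seed)
    # Steps 1 and 2: one toy base-classifier per (balanced subset, feature column).
    scorers = []
    for _ in range(n_subsets):
        Xb, yb = undersample(X, y, rng)
        scorers += [fit_centroid_scorer(Xb, yb, c) for c in range(len(X[0]))]
    # Probabilistic feature matrix: one column per base-classifier.
    P = [[s(x) for s in scorers] for x in X]

    # Step 3: keep the top_k columns (assumed criterion), then fit the meta-learner on them.
    def quality(j):
        return balanced_accuracy(y, [int(row[j] >= 0.5) for row in P])

    selected = sorted(range(len(scorers)), key=quality, reverse=True)[:top_k]
    meta = fit_logistic([[row[j] for j in selected] for row in P], y)
    return lambda x: meta([scorers[j](x) for j in selected])


def make_toy(n_pos, n_neg, rng):
    """Hypothetical data: column 0 separates the classes, columns 1 and 2 are noise."""
    X = [[rng.gauss(4.0, 0.5), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n_pos)]
    X += [[rng.gauss(0.0, 0.5), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n_neg)]
    return X, [1] * n_pos + [0] * n_neg


rng = random.Random(42)
X_train, y_train = make_toy(30, 120, rng)  # imbalanced, like many allergenicity datasets
X_test, y_test = make_toy(10, 40, rng)
model = train_m3s_sketch(X_train, y_train)
bacc = balanced_accuracy(y_test, [int(model(x) >= 0.5) for x in X_test])
print(f"toy balanced accuracy: {bacc:.3f}")
```

Because the toy data are deliberately easy to separate, the printed score says nothing about real allergenicity prediction; the point is only the data flow from balanced subsets, to base-classifier scores, to selected probabilistic features, to a stacked meta-model.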
Keywords
Chemical allergens, Allergy, Cheminformatics, Feature selection, Machine learning, Stacking strategy