A Stacking Based Ensemble Learning Approach for Accurate Identification of Tumor Homing Peptides in Precision Cancer Therapeutics

Authors

DOI:

https://doi.org/10.4108/airo.10265

Keywords:

Tumor-Homing Peptides, Stacking Ensemble Learning, Feature Extraction, Cancer Therapies, Precision Medicine

Abstract

The identification of tumor-homing peptides (THPs) plays a pivotal role in the development of targeted cancer therapies and precision medicine. Current THP identification methods still suffer from limited feature representation, moderate predictive performance, and insufficient generalization, highlighting the need for more robust ensemble frameworks. In this study, we propose STHPP, a stacking-based ensemble machine learning approach designed to improve the accuracy and reliability of THP discovery. Two benchmark datasets, referred to as the "main" and "small" datasets of Shoombuatong, were collected, merged, and pre-processed to create one larger dataset, which was then split into training and test sets. The STHPP model uses a two-layer ensemble architecture: the first layer aggregates three heterogeneous base classifiers, Random Forest (RF), Light Gradient Boosting Machine (LightGBM), and Extreme Gradient Boosting (XGBoost), and the second layer applies CatBoost as a meta-classifier that combines the predictions of the base models. This architecture exploits model diversity to improve generalization. STHPP achieved an accuracy of 0.98, precision of 0.97, sensitivity of 0.99, specificity of 0.97, and a Matthews Correlation Coefficient (MCC) of 0.98. These results exceed those reported for current state-of-the-art approaches, illustrating the effectiveness of stacking in complex peptide classification problems. The findings showcase the potential of STHPP as a strong and scalable computational platform for peptide-based drug discovery research and targeted oncology.

References

[1] Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011 Mar 4;144(5):646-74.

[2] Global cancer burden growing, amidst mounting need for services (1 February 2024) World Health Organization. Available at: https://www.iarc.who.int/news-events/global-cancer-burden-growing-amidst-mounting-need-for-services (Accessed: 07 December 2025)

[3] Ferlay J, Colombet M, Soerjomataram I, Parkin DM, Piñeros M, Znaor A, Bray F. Cancer statistics for the year 2020: An overview. International journal of cancer. 2021 Aug 15;149(4):778-89.

[4] Mäe M, Myrberg H, El-Andaloussi S, Langel Ü. Design of a tumor homing cell-penetrating peptide for drug delivery. International Journal of peptide research and therapeutics. 2009 Mar;15(1):11-5.

[5] Svensen N, Walton JG, Bradley M. Peptides for cell-selective drug delivery. Trends in pharmacological sciences. 2012 Apr 1;33(4):186-92.

[6] Khongorzul P, Ling CJ, Khan FU, Ihsan AU, Zhang J. Antibody–drug conjugates: a comprehensive review. Molecular Cancer Research. 2020 Jan 1;18(1):3-19.

[7] Kondo E, Iioka H, Saito K. Tumor‐homing peptide and its utility for advanced cancer medicine. Cancer science. 2021 Jun;112(6):2118-25.

[8] Guan J, Yao L, Chung CR, Chiang YC, Lee TY. Stackthpred: identifying tumor-homing peptides through gbdt-based feature selection with stacking ensemble architecture. International Journal of Molecular Sciences. 2023 Jun 19;24(12):10348.

[9] Shoombuatong W, Schaduangrat N, Pratiwi R, Nantasenamat C. THPep: A machine learning-based approach for predicting tumor homing peptides. Computational biology and chemistry. 2019 Jun 1;80:441-51.

[10] Charoenkwan P, Chiangjong W, Nantasenamat C, Moni MA, Lio’ P, Manavalan B, Shoombuatong W. SCMTHP: A new approach for identifying and characterizing of tumor-homing peptides using estimated propensity scores of amino acids. Pharmaceutics. 2022 Jan 4;14(1):122.

[11] Sharma A, Kapoor P, Gautam A, Chaudhary K, Kumar R, Chauhan JS, Tyagi A, Raghava GP. Computational approach for designing tumor homing peptides. Scientific reports. 2013 Apr 5;3(1):1607.

[12] Zou H, Yang F, Yin Z. Identification of tumor homing peptides by utilizing hybrid feature representation. Journal of Biomolecular Structure and Dynamics. 2023 May 24;41(8):3405-12.

[13] Charoenkwan P, Schaduangrat N, Moni MA, Manavalan B, Shoombuatong W. NEPTUNE: a novel computational approach for accurate and large-scale identification of tumor homing peptides. Computers in Biology and Medicine. 2022 Sep 1;148:105700.

[14] Guyon I, Elisseeff A. An introduction to variable and feature selection. Journal of machine learning research. 2003;3(Mar):1157-82.

[15] Wei L, Zhou C, Chen H, Song J, Su R. ACPred-FL: a sequence-based predictor using effective feature representation to improve the prediction of anti-cancer peptides. Bioinformatics. 2018 Dec 1;34(23):4007-16.

[16] Bhasin M, Raghava GP. Classification of nuclear receptors based on amino acid composition and dipeptide composition. Journal of Biological Chemistry. 2004 May 1;279(22):23262-6.

[17] Saravanan V, Gautham N. BCIgEPRED—a dual-layer approach for predicting linear IgE epitopes. Molecular Biology. 2018 Mar;52(2):285-93.

[18] Chou KC. Prediction of protein cellular attributes using pseudo‐amino acid composition. Proteins: Structure, Function, and Bioinformatics. 2001 May 15;43(3):246-55.

[19] Tuj Jannat F, Biplob KB, Bitto AK. Predicting Bangladesh life expectancy using multiple depend features and regression models. In International Conference on Machine Intelligence and Signal Processing 2022 Mar 12 (pp. 47-58). Singapore: Springer Nature Singapore.

[20] Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu TY. Lightgbm: A highly efficient gradient boosting decision tree. Advances in neural information processing systems. 2017;30.

[21] Chen T, Guestrin C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining 2016 Aug 13 (pp. 785-794).

[22] Prokhorenkova L, Gusev G, Vorobev A, Dorogush AV, Gulin A. CatBoost: unbiased boosting with categorical features. Advances in neural information processing systems. 2018;31.

[23] Sanni RR, Guruprasad HS. Analysis of performance metrics of heart failured patients using Python and machine learning algorithms. Global transitions proceedings. 2021 Nov 1;2(2):233-7.

[24] Chicco D, Tötsch N, Jurman G. The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation. BioData mining. 2021 Feb 4;14(1):13.

[25] Chicco D, Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC genomics. 2020 Jan 2;21(1):6.

[26] Abu Kowshir Bitto, Rezwana Karim, M. H. Begum, M. F. I. K. Khan, Dr. Md. Maruf Hassan, and Prof. Dr. Abdul kadar Muhammad Masum, “Explainable AI Based Deep Ensemble Convolutional Learning for Multi-Categorical Ocular Disease Prediction”, EAI Endorsed Transactions on AI and Robotics, vol. 4, Jul. 2025.

[27] Prof. Dr. Abdul kadar Muhammad Masum, A. K. Bitto, S. I. Talukder, M. F. I. Khan, M. S. Alam, and K. M. Mohi Uddin, “An Explainable AI Based Deep Ensemble Transformer Framework for Gastrointestinal Disease Prediction from Endoscopic Images”, EAI Endorsed Transactions on AI and Robotics, vol. 4, Aug. 2025.

[28] K. Kalaivani, P. Deepan, G. Ganesh, J. Ravichandran, and S. Dhiravidaselvi, “Hyperband-Optimized Convolutional Neural Network Model for Efficient Brain Tumor Classification and Prediction”, EAI Endorsed Transactions on AI and Robotics, vol. 4, Sep. 2025.

[29] Badruzzaman Biplob KB, Sammak MH, Bitto AK, Mahmud I. COVID-19 and Suicide Tendency: Prediction and Risk Factor Analysis Using Machine Learning and Explainable AI. EAI Endorsed Transactions on Pervasive Health & Technology. 2024 Jan 1;10(1).

[30] Mia R, Hasan T, Bitto AK, Mahadi Hassan M, Shamsul Alam M, Abdul Kadar Muhammad Masum. Enhancing the Prediction of IL-4 Inducing Peptides Using Stacking Ensemble Model. EAI Endorsed Trans AI Robotics [Internet]. 2025 Oct. 27 [cited 2026 Feb. 9];4. Available from: https://publications.eai.eu/index.php/airo/article/view/9867

Published

11-02-2026

How to Cite

1.
Jahid Hassan Akash, Mia R, Bitto AK, Masum AKM, Ahmed J, Khan FI. A Stacking Based Ensemble Learning Approach for Accurate Identification of Tumor Homing Peptides in Precision Cancer Therapeutics. EAI Endorsed Trans AI Robotics [Internet]. 2026 Feb. 11 [cited 2026 Feb. 13];5. Available from: https://publications.eai.eu/index.php/airo/article/view/10265
