Enhancing the Prediction of IL-4 Inducing Peptides Using Stacking Ensemble Model

DOI:

https://doi.org/10.4108/airo.9867

Keywords:

Immunoinformatics, Peptides, Interleukin-4, Artificial Intelligence, Machine Learning, Stacking Ensemble

Abstract

Interleukin-4 (IL-4) plays a critical role in immune regulation and inflammation suppression, so accurate prediction of IL-4-inducing peptides is important for immunotherapy and vaccine design. In this work, we present a stacking ensemble-based predictive model for IL-4-inducing peptide discovery. The method uses several feature extraction techniques, namely Amino Acid Composition (AAC), Amphiphilic Pseudo Amino Acid Composition (APAAC), and their combination, and prunes the resulting features with SHAP (SHapley Additive exPlanations) so that only the most relevant features are retained. To address the class imbalance inherent in the peptide data, the ADASYN (Adaptive Synthetic Sampling) algorithm was applied for synthetic oversampling. We applied eight machine learning classifiers: Logistic Regression, Random Forest, Support Vector Classifier, Decision Tree, K-Nearest Neighbors, XGBoost, LightGBM, and a stacking ensemble model, and evaluated them on both the imbalanced and the balanced datasets. Our evaluation shows that the stacking model outperforms the individual classifiers on both datasets. Notably, with the combined features, the stacking model achieved an accuracy of 89.97% and a Matthews Correlation Coefficient (MCC) of 0.79 on the independent test set. Detailed performance comparisons over the AAC and APAAC feature spaces indicate that the stacking model performs better than the other classifiers in all instances, and more so under balanced scenarios, which underscores the need for data rebalancing. This research not only highlights the precision of stacking ensembles in peptide classification tasks but also encourages the integration of interpretable feature selection and data balancing in future immunoinformatics pipelines.
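
Two quantities named in the abstract, the AAC descriptor and the MCC metric, can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions, not the authors' implementation; the function names `aac` and `mcc`, the residue ordering, and the example peptide and confusion-matrix counts are assumptions made here for illustration only.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids; this alphabetical ordering is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(peptide):
    """Amino Acid Composition: fraction of each standard residue in the peptide.

    Non-standard characters count toward the length but have no entry of
    their own, so the vector sums to less than 1 when they are present.
    """
    seq = peptide.upper()
    if not seq:
        raise ValueError("peptide must not be empty")
    counts = Counter(seq)
    return [counts[residue] / len(seq) for residue in AMINO_ACIDS]


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient computed from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    features = aac("AAKK")       # hypothetical 4-residue peptide
    print(len(features))         # number of AAC features
    print(mcc(45, 45, 5, 5))     # hypothetical counts, not the paper's results
```

MCC uses all four confusion-matrix cells, which is why it is often reported alongside accuracy when the positive and negative classes differ in size.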

Published

27-10-2025

How to Cite

R. Mia, T. Hasan, A. K. Bitto, M. Mahadi Hassan, M. Shamsul Alam, and A. K. M. Masum, “Enhancing the Prediction of IL-4 Inducing Peptides Using Stacking Ensemble Model”, EAI Endorsed Trans AI Robotics, vol. 4, Oct. 2025.