An Explainable AI Based Deep Ensemble Transformer Framework for Gastrointestinal Disease Prediction from Endoscopic Images
DOI:
https://doi.org/10.4108/airo.9795
Keywords:
Gastrointestinal Disease, Medical Image Processing, Transformer Models, Ensemble Model, Explainable AI
Abstract
Gastrointestinal diseases such as gastroesophageal reflux disease (GERD) and polyps remain prevalent and are difficult to diagnose accurately because of overlapping visual features and inconsistent endoscopic image quality. In this study, we investigate three transformer-based deep learning models (Vision Transformer (ViT), Swin Transformer, and a novel Ensemble Transformer model) for classifying endoscopic images into four categories: GERD, GERD Normal, Polyp, and Polyp Normal. The dataset was collected and curated in collaboration with Zainul Haque Sikder Women's Medical College & Hospital, ensuring high-quality clinical annotations. All models were evaluated using precision, recall, F1 score, and overall classification accuracy. Our proposed Ensemble Transformer model, which fuses the outputs of ViT and Swin Transformer, achieved superior performance, with an overall accuracy of 87%, well-balanced F1 scores across all classes, fewer misclassifications, and improved robustness. Furthermore, we incorporated explainable AI (XAI) techniques such as Grad-CAM and Grad-CAM++ to generate visual explanations of the model's predictions, enhancing interpretability for clinical validation. This work demonstrates the potential of integrating global and local attention mechanisms with XAI to build reliable, real-time, AI-assisted diagnostic support systems for gastrointestinal disorders, particularly in resource-limited healthcare settings.
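The abstract does not state how the ViT and Swin Transformer outputs are fused, so the following is only a minimal sketch under one common assumption: soft voting, that is, averaging the two models' softmax probabilities. It also shows how the reported metrics (per-class precision, recall, F1, and overall accuracy) are computed. The function names, the weight parameter, and the logits in the usage lines are hypothetical illustrations, not values or code from the study.

```python
import math

# The four categories named in the abstract.
CLASSES = ["GERD", "GERD Normal", "Polyp", "Polyp Normal"]


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(logits_a, logits_b, weight_a=0.5):
    """Soft voting: weighted average of two models' class probabilities.

    ASSUMPTION: the paper's actual fusion rule is not given in the abstract.
    """
    pa, pb = softmax(logits_a), softmax(logits_b)
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pa, pb)]


def predict(logits_a, logits_b, weight_a=0.5):
    """Index of the class with the highest fused probability."""
    probs = fuse(logits_a, logits_b, weight_a)
    return max(range(len(probs)), key=probs.__getitem__)


def per_class_metrics(y_true, y_pred, n_classes):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    metrics = []
    for c in range(n_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical logits for a single image from each model (made-up numbers).
vit_logits = [2.0, 1.0, 0.1, 0.1]
swin_logits = [0.5, 2.5, 0.1, 0.1]
print(CLASSES[predict(vit_logits, swin_logits)])
```

In this sketch the two models disagree on the made-up image, and the fused prediction follows whichever model is more confident. A weight other than 0.5 would favor one backbone over the other. Whether the study uses equal weights, learned weights, or feature-level fusion is not specified in the abstract.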
License
Copyright (c) 2025 Prof. Dr. Abdul kadar Muhammad Masum, Abu Kowshir Bitto, Shafiqul Islam Talukder, Md Fokrul Islam Khan, Mohammed Shamsul Alam, Khandaker Mohammad Mohi Uddin

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0 license, which permits copying, redistributing, remixing, transforming, and building upon the material in any medium, provided the original work is properly cited.