Explainable Transformer Models for Early Prediction of Chronic Diseases Using Longitudinal Electronic Health Records (EHRs)
DOI:
https://doi.org/10.4108/airo.11517
Keywords:
Chronic Disease Prediction, Electronic Health Records, Explainable Artificial Intelligence, Longitudinal Data, Transformer Models
Abstract
Early prediction of chronic diseases using longitudinal electronic health records (EHRs) is critical for enabling timely interventions and improving patient outcomes. However, existing deep learning approaches often function as black-box models, limiting their clinical adoption due to a lack of transparency and interpretability. This study proposes an explainable transformer-based framework for early chronic disease prediction that effectively models temporal dependencies in longitudinal EHR data while providing clinically meaningful explanations. The proposed approach integrates a time-aware transformer architecture with attention-based interpretability mechanisms to capture complex patient trajectories across heterogeneous clinical events, including diagnoses, laboratory results, medications, and demographic attributes. To enhance explainability, we incorporate feature-level and temporal attention visualization, enabling identification of influential clinical factors and critical time windows contributing to disease onset predictions. Extensive experiments conducted on large-scale longitudinal EHR datasets demonstrate that the proposed model consistently outperforms state-of-the-art machine learning and deep learning baselines in terms of predictive accuracy, recall, and early risk detection capability. Furthermore, qualitative evaluation with clinician-oriented explanation analyses confirms that the generated explanations align with established medical knowledge, enhancing trust and clinical usability. This work advances the integration of explainable artificial intelligence in healthcare by offering a robust and interpretable transformer-based solution for early chronic disease prediction, supporting data-driven decision-making in real-world clinical settings.
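As a rough illustration of what temporal attention over a patient's visit history can look like, the sketch below computes attention weights for a toy visit sequence and reads them as per-visit importance scores. It is not the model described in this abstract: the function name `time_aware_attention`, the linear time-decay penalty, the `decay` parameter, and all embedding values are assumptions made up for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def time_aware_attention(query, keys, days_before_prediction, decay=0.01):
    """Return attention weights of one query over a sequence of visits.

    Each score is a scaled dot product minus a linear penalty on elapsed
    time. The penalty is an assumed form of time-awareness chosen for
    illustration, not a mechanism taken from the paper.
    """
    scale = math.sqrt(len(query))
    scores = []
    for key, gap in zip(keys, days_before_prediction):
        dot = sum(q * k for q, k in zip(query, key))
        scores.append(dot / scale - decay * gap)
    return softmax(scores)


# Toy patient: three visits, each a 2-dimensional embedding (made-up values).
visits = [[0.2, 0.1], [0.9, 0.4], [0.5, 0.8]]
gaps = [365, 90, 7]  # days between each visit and the prediction time
query = [1.0, 1.0]

weights = time_aware_attention(query, visits, gaps)

# Index of the visit with the largest weight, i.e. the time window this
# toy model attends to most.
most_influential = max(range(len(weights)), key=weights.__getitem__)
```

Plotting `weights` against visit dates gives the kind of temporal attention view the abstract refers to. Attention weights are only one possible proxy for importance, so a view like this is best treated as a starting point for clinical review rather than as a causal explanation.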
License
Copyright (c) 2025 Hewa Majeed Zangana, Maryam A. Sulaiman

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0 license, which permits copying, redistributing, remixing, transforming, and building upon the material in any medium so long as the original work is properly cited.