Harnessing the Power of Ensemble Machine Learning for the Heart Stroke Classification
DOI:
https://doi.org/10.4108/eetpht.9.4617
Keywords:
Ensemble Machine Learning, Machine Learning, Performance Metrics, Stroke Prediction
Abstract
A heart stroke, also known as a myocardial infarction or heart attack, is a critical medical condition that arises when an obstruction forms in the coronary arteries that supply blood to the heart muscle. This blockage reduces the flow of blood and oxygen to a specific area of the heart. The abrupt interruption initiates a gradual sequence of heart muscle damage, which can lead to varying degrees of functional impairment. The severity of these impairments is determined primarily by the precise location of the affected heart muscle. It is therefore of utmost importance to identify the warning signs and symptoms of a stroke as early as possible. The objective of this paper is to support such early recognition, since early recognition and prompt action can significantly improve the chances of a healthy and fulfilling life following a stroke. In this research work, the Stroke dataset is pre-processed, and machine learning and ensemble machine learning techniques are applied to the pre-processed dataset to develop and assess several models, with the aim of creating a stable framework for predicting long-term stroke risk. Various metrics, including accuracy, F1 score, ROC, precision, and recall, are calculated. Among all models, the AdaBoost model demonstrated exceptional performance, validated through multiple metrics including precision, AUC, recall, accuracy, and F1-measure. The results underscore the superiority of the AdaBoost classification method, which achieved an accuracy of 99%. The AdaBoost model may serve as a stable framework for predicting long-term stroke risk, which points to its potential utility in clinical settings for identifying individuals at higher risk of experiencing a stroke.
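As an illustration of the evaluation metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, and F1 score from binary labels. It is not the authors' code: the function name `binary_metrics` and the toy labels are hypothetical, chosen only for this example, and the labels are not drawn from the Stroke dataset.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = stroke, 0 = no stroke)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels for illustration only (not from the paper's dataset).
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these toy labels the confusion matrix has 2 true positives, 4 true negatives, 1 false positive, and 1 false negative, so accuracy is 0.75 while precision, recall, and F1 are each 2/3. ROC and AUC need predicted scores rather than hard labels and are left out of this sketch.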
License
Copyright (c) 2023 Purnima Pal, Manju Nandal, Srishti Dikshit, Aarushi Thusu, Harsh Vikram Singh
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.