Predicting Breast Cancer with Ensemble Methods on Cloud

Authors

Pham A, Tran T, Tran P, Huynh H

DOI:

https://doi.org/10.4108/eetcasa.v8i2.2788

Keywords:

Bagging, Boosting, Stacking, Random Forest, Ensemble methods

Abstract

Many dangerous diseases with high mortality rates, including breast cancer, affect women. If the disease is detected early, diagnosed correctly, and treated at the right time, the likelihood of illness and death is reduced. Previous disease prediction work has mainly focused on building individual models, but these models do not yet achieve high accuracy or generalize well. In this paper, we combine individual models into a single combined model that generalizes better than any of its components. Three ensemble techniques are used in the experiments: Bagging, Boosting, and Stacking (the Stacking model includes three models: Gradient Boost, Random Forest, and Logistic Regression). Each is deployed and applied to the breast cancer prediction problem. Experimental results on the Breast Cancer Wisconsin dataset show that the combined model built with these ensemble methods has higher predictive performance than the commonly used individual prediction models.
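The three ensemble techniques named in the abstract can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the hyperparameters, the train/test split, and the function names `build_models` and `evaluate_models` are hypothetical. The abstract also does not say how the three stacked models are arranged, so the sketch assumes Gradient Boost and Random Forest as base learners and Logistic Regression as the meta-learner. The dataset is scikit-learn's bundled copy of the Breast Cancer Wisconsin (Diagnostic) data, which may differ from the version used in the paper.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def build_models(seed=0):
    """Build one classifier per ensemble technique (all settings are assumptions)."""
    # Bagging: many decision trees, each fit on a bootstrap sample, then voted.
    bagging = BaggingClassifier(
        DecisionTreeClassifier(random_state=seed),
        n_estimators=50,
        random_state=seed,
    )
    # Boosting: trees added sequentially, each correcting the previous errors.
    boosting = GradientBoostingClassifier(random_state=seed)
    # Stacking: base learners' cross-validated predictions feed a meta-learner.
    # Assumed arrangement: GB and RF as base learners, LR as meta-learner.
    stacking = StackingClassifier(
        estimators=[
            ("gb", GradientBoostingClassifier(random_state=seed)),
            ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    return {"bagging": bagging, "boosting": boosting, "stacking": stacking}


def evaluate_models(seed=0, test_size=0.3):
    """Fit each ensemble on a stratified split and return held-out accuracy."""
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    scores = {}
    for name, model in build_models(seed).items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)  # mean accuracy on test split
    return scores


if __name__ == "__main__":
    for name, accuracy in evaluate_models().items():
        print(f"{name}: {accuracy:.3f}")
```

A single hold-out split is used here only for brevity. Comparing ensembles against individual models, as the paper does, would more reliably use repeated cross-validation and metrics beyond accuracy.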

Published

29-03-2023

How to Cite

Pham A, Tran T, Tran P, Huynh H. Predicting Breast Cancer with Ensemble Methods on Cloud. EAI Endorsed Trans Context Aware Syst App [Internet]. 2023 Mar. 29 [cited 2024 Apr. 25];9. Available from: https://publications.eai.eu/index.php/casa/article/view/2788