A Comprehensive Feature Engineering Approach for Breast Cancer Dataset
DOI:
https://doi.org/10.4108/eetpht.10.5327
Keywords:
Breast Cancer, Univariate Analysis, Bivariate Analysis, Heat Map, Correlation
Abstract
Breast cancer remains a significant challenge in healthcare and is the leading cause of cancer-related deaths in women worldwide. This study investigates the relationship between breast cancer, statistical analysis, and feature engineering. Through extensive analysis of a comprehensive dataset using statistical methods, it aims to uncover insights that can add to the medical community's existing knowledge. Rigorous feature selection and extraction are applied to improve the understanding of breast cancer. The study also demonstrates the use of univariate and bivariate analysis to enhance the accuracy of diagnostic procedures. Together, these disciplines show considerable promise for breast cancer detection and prediction, and they support cooperative efforts to address this widespread malignancy.
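As a rough illustration of the univariate and bivariate analysis named in the abstract, the following minimal Python sketch summarises one feature at a time and then computes pairwise Pearson correlations, which are the values a correlation heat map would display. It is not the authors' implementation: the feature names (`radius`, `area`, `smoothness`), the helper functions, and the six toy rows are all hypothetical and are not taken from the study's dataset.

```python
from math import sqrt
from statistics import mean, median, stdev

# Toy, made-up measurements for illustration only (NOT from the study's dataset).
# Feature names are hypothetical placeholders.
samples = [
    {"radius": 18.0, "area": 1000.0, "smoothness": 0.118},
    {"radius": 20.6, "area": 1326.0, "smoothness": 0.085},
    {"radius": 19.7, "area": 1203.0, "smoothness": 0.110},
    {"radius": 11.4, "area": 386.0, "smoothness": 0.142},
    {"radius": 12.5, "area": 477.0, "smoothness": 0.128},
    {"radius": 13.5, "area": 566.0, "smoothness": 0.098},
]


def univariate_summary(values):
    """Univariate analysis: describe a single feature on its own."""
    return {
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Bivariate analysis: Pearson correlation between two features."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_matrix(rows, columns):
    """Pairwise correlations; these are the cells a heat map would colour."""
    cols = {c: [row[c] for row in rows] for c in columns}
    return {a: {b: pearson(cols[a], cols[b]) for b in columns} for a in columns}


if __name__ == "__main__":
    features = ["radius", "area", "smoothness"]
    for name in features:
        print(name, univariate_summary([row[name] for row in samples]))
    matrix = correlation_matrix(samples, features)
    for a in features:
        print(a, [round(matrix[a][b], 2) for b in features])
```

In practice, strongly correlated feature pairs in such a matrix are common candidates for feature selection, since one of the pair may carry little additional information.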
License
Copyright (c) 2024 Shambhvi Sharma, Monica Sahni
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.