Breast cancer early detection in TP53 SNP protein sequences based on a new Convolutional Neural Network model
DOI:
https://doi.org/10.4108/eetpht.9.3218
Keywords:
CNN classification, breast cancer, scalogram, ORB, SNP, tumor suppressor genes, TP53 gene
Abstract
INTRODUCTION: Breast cancer (BC) is the most commonly occurring cancer and the second leading cause of disease-related death among women. BC cases are associated with genetic mutations that are either inherited from previous generations or acquired over time. If the disease is diagnosed at the first stage, the side effects of certain treatments can be limited, costs can be reduced, and diagnostic time can be minimized. Early diagnosis can also help specialists select the best treatment and increase cure rates. Nevertheless, detecting the disease in patients is very challenging because symptoms are silent and routine screening is not recommended for women under 40 years old.
OBJECTIVES: Several efforts have aimed at early BC detection using machine learning and deep learning systems. The proposed algorithms use different data types to distinguish between cancerous and non-cancerous cases, such as mammography, ultrasound, and MRI (magnetic resonance imaging) images. Different learning tools are then applied to these data for the classification task. Although the classification rates exceed 90%, the major drawback of all these methods is that they are applicable only after cancerous tumors have appeared, which reduces cure rates.
METHODS: We propose a new technique for early breast cancer screening. For the data, we focus on cancerous and non-cancerous SNP (Single Nucleotide Polymorphism) protein sequences of the TP53 gene on chromosome 17. This gene has been shown to be linked to different single amino acid mutations, which we examine here. The proposed method transforms SNP textual sequences into numerical vectors via coding. RGB scalogram images are then generated using the continuous wavelet transform. The color coefficients of the scalograms are preprocessed to create four different databases. Finally, a CNN deep learning network is used for the binary classification of cancerous and non-cancerous images.
RESULTS: During the validation process, we reached good performance, with a specificity of 97.84%, a sensitivity of 96.45%, an overall accuracy of 95.29%, and a run time of 12 minutes 3 seconds. These values indicate the effectiveness of our method. To further improve these results, we used the ORB feature detection technique. Consequently, the classification accuracy improved to 95.9%.
CONCLUSION: Our method will save significant time and lives by detecting the disease in patients whose genetic mutations are beginning to appear.
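The first two steps described under METHODS (coding a protein sequence as a numerical vector, then computing a scalogram with the continuous wavelet transform) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the coding scheme or the wavelet, so the integer mapping in `CODE`, the complex Morlet wavelet, the parameter `w0`, and the function names `encode_sequence`, `morlet`, and `scalogram` are all assumptions made for this example. Only NumPy is used.

```python
import numpy as np

# ASSUMPTION: the 20 standard amino acids, in alphabetical order of their
# one-letter codes, are mapped to the integers 1..20. The paper's actual
# coding scheme may differ.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
CODE = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode_sequence(seq):
    """Turn a protein string into a numeric vector (unknown symbols map to 0)."""
    return np.array([CODE.get(aa, 0) for aa in seq.upper()], dtype=float)


def morlet(t, scale, w0=6.0):
    """Complex Morlet wavelet sampled at positions t and dilated by `scale`.

    ASSUMPTION: the wavelet family and w0 are illustrative choices.
    """
    x = t / scale
    return np.exp(1j * w0 * x) * np.exp(-0.5 * x**2) / np.sqrt(scale)


def scalogram(signal, scales):
    """Return the CWT magnitude as an array of shape (len(scales), len(signal))."""
    n = len(signal)
    centered = signal - signal.mean()      # remove the constant offset
    t = np.arange(n) - (n - 1) / 2.0       # wavelet support centered on zero
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Correlating with the wavelet equals convolving with its
        # conjugated, reversed copy.
        kernel = np.conj(morlet(t, s))[::-1]
        out[i] = np.abs(np.convolve(centered, kernel, mode="same"))
    return out


# Usage on an arbitrary short example string (not taken from the dataset).
vec = encode_sequence("MEEPQSDPSV")
img = scalogram(vec, scales=np.arange(1, 17))
```

Each row of `img` holds the wavelet response at one scale and each column corresponds to one sequence position. Turning this magnitude array into an RGB image requires an additional color-mapping step, which is not shown here.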
License
Copyright (c) 2023 Saifeddine Ben Nasr, Imen Messaoudi, Afef Elloumi Oueslati, Zied Lachiri
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.