A Hybrid Optimization Approach for Pulmonary Nodules Segmentation and Classification using Deep CNN

Lung cancer has a lower survival rate than most other cancers, which makes it one of the deadliest. Early detection of lung cancer tends to increase the survival rate. Although numerous lung cancer detection techniques exist, they struggle to detect cancer accurately because of variations in the intensity of CT scan images. To segment CT images more accurately, the proposed Elephant-Based Bald Eagle Optimization (EBEO) algorithm is used. This research concentrates on developing a lung nodule detection technique based on deep learning. To obtain an effective result, the segmentation process will be carried out using the proposed algorithm. Further, the proposed algorithm will be used to tune the hyperparameters of the deep learning classifier to increase detection accuracy. The proposed method is expected to exceed conventional methods in detection accuracy owing to the effectiveness of the proposed algorithm. This survey will give healthcare research communities sufficient knowledge to understand the concepts of the EBEO algorithm and the Deep Convolutional Neural Network for improving the overall human healthcare system.


Introduction
Lung cancer is one of the deadliest cancers and among the most catastrophic to human lives. Studies for 2019 indicate that this disease is responsible for a high number of deaths all over the world [1]. The most effective known treatments for lung cancer are radiation therapy and chemotherapy. Even with these proven therapies, the survival rate of lung cancer patients is only 16%. With the discoveries and research made in this field, the proportion of patients who recover and live longer has risen, helped by improvements in detection. Yet, compared to other kinds of cancer, the survival rate for lung cancer is still very low. This rate depends heavily on the stage of the disease and on how early it is detected, since early detection allows treatment to start sooner [2]. CT scans and associated imaging techniques are therefore used to find and identify nodules in the lungs, which aids in recognizing lung cancer. However, cancerous growth may still be missed for two important reasons: improper assessment of the physiological structure and the variety seen in the images generated by a CT scan [3]. As computer-aided measures continue to improve, they prove to be reliable tools for helping radiologists and doctors identify the presence of cancer in bodily cells. Despite these advancements, accuracy and completeness in the identification of lung cancer remain a major concern. For this purpose, image processing techniques are the most effective [4]. What matters most in detecting lung cancer with computed tomography scanning is the accuracy of segmentation. When images are obtained with CT scans, the primary task is delineating the boundaries of lung nodules, which aids their quantitative analysis in later stages [5]. Segmentation of nodules in the lung is therefore an important step in the use of CAD.
Traditional methods of performing this task are difficult and time-consuming because the internal parts of the lungs look different in different planes. The organs affected by lung cancer need to be segmented in an efficient and productive manner [6]. A data-driven segmentation model helps reduce the variation caused by observers processing the findings manually. Lately, a number of researchers have proposed segmentation algorithms for the detection of nodules in the lungs. Achieving high accuracy is still a hurdle because pulmonary nodules vary in dimensions and appearance. In the recent past, researchers have proposed intensity-based methods and region growing [7]. Techniques based on Level Set and Graph Cut, which are energy-optimization methods, have also been studied. Even with these in use, accurate and robust nodule detection remains a challenge. The difficulty arises when juxtapleural and pulmonary nodules show similar traits and appearance. The same holds for morphological operations, where the size of the structuring element is difficult to set. Recent research aims to produce quick, effective, and robust segmentation results from CT imaging. Thresholding, morphological methods, deformable models, clustering, neural networks, graph cut, active contours, watershed, fuzzy logic-based segmentation, histograms, region growing, and Markov random fields are some of the methods for segmenting pulmonary nodules [9]. Thresholding is a long-established segmentation method. Deep learning is a form of representation learning that learns a hierarchy of features. It can represent features at a much higher level than the raw image data. Its ability to compute with great accuracy has driven the rapid growth of deep learning in the segmentation of medical images [10].
The paper is organized as follows: Section II details the related works and the challenges; Section III presents the problem statement; Section IV elucidates the proposed methodology; and Sections V and VI detail the lung cancer dataset and the experimental results. Finally, Section VII concludes the work.

Related Works
A thorough literature survey has been carried out through various sources, and a review of the literature is presented below.
Sunyi Zheng et al. [2] used a CNN classifier to detect nodules in the lungs. Identification of pulmonary nodules in CT scans can be improved significantly when MIP (Maximum Intensity Projection) images are used, as is done in radiological evaluation. Small pulmonary nodules ranging from 3mm to 10mm can be detected when thick MIP images are used, and the results contain fewer false positives. Combining clinical screening practice with a CNN thus shows promise for improving the identification of nodules in the lungs. Non-nodules and comparatively smaller nodules were excluded because of their limited clinical relevance [1].
Supiksha Jain et al. [5] aimed to refine the segmentation process and its precision. They proposed SSSOA-GAN (Salp Shuffled Shepherd Optimization Algorithm-Based Generative Adversarial Network). The algorithm combines the Salp Swarm Algorithm (SSA) and the SSOA. Unwanted elements in the CT image are removed in the pre-processing step with the help of a Gaussian filter. The pre-processed image is then passed to lung lobe segmentation, which is performed with deep joint segmentation to separate the correct regions. Lung nodules are identified with the GAN, which is tuned using SSSOA so that the nodules present in the lobe images are segmented efficiently.
Amitava Halder et al. [11] proposed AMST. The adaptive morphology-based segmentation technique can identify various isolated, deformed, and very small nodules in the lungs. At its core, an adaptive morphological filter was developed to optimize the segmentation of the lung nodule region. This filter identifies the region using an adaptive structuring element (ASE). The success rate is also raised by reducing the false positives (FPs) from CT scans. The nodule region is then processed to extract the required features. However, the technique falls short in segmenting and identifying JV and very small nodules, as a result of their complicated structure and low-intensity variation.
Ganesh Singadkar et al. [12] used DDRN. The Deep Deconvolutional Residual Network relies on two primary ideas. First, the network is trained end to end to capture the diverse variation of nodules from two-dimensional CT images. Second, summation-based long skip connections from the convolutional to the deconvolutional part of the network preserve the spatial information lost during the pooling operation and retain full-resolution features. The proposed tool improved on manual detection of pulmonary nodules and reduced the analysis time. However, lower precision is still a downside of the system.
Zhitong Wu et al. [13] recommended a hyperparameter optimization technique for lung nodule segmentation. The suggested image enhancement technique improves network learning, and a dual-branch neural network is designed to learn multi-view knowledge in depth. Unfortunately, because non-solitary nodules are rare in the data, it is difficult to train the model well on hard cases.
Haichao Cao et al. [14] used DB-ResNet. The Dual-Branch Residual Network was proposed to make lung nodule identification more effective. The technique integrates two new schemes, both of which improve the generalization capability of the model: 1) the model simultaneously captures multi-view and multi-scale features of the various nodules present in the CT images; 2) intensity features and CNN features are integrated for better results. To avoid overfitting during training, EST (Early Stopping Training) is used. Despite this, the results for very small nodules are poor.
Quan Chen et al. [15] proposed an FMGA network for nodule identification in the lungs. Fast Multi-crop Guided Attention takes multi-crop nodule patches as input to aggregate contextual knowledge, including 2D information from the current slice and 3D information from adjacent axial slices. The method then uses a global convolutional layer to relate the pixels of a nodule to similar pixels. The model makes its final decision by taking into account the information provided by neighbouring points while categorizing the hole points. In addition, the method is efficient and fast. For further enhancement of nodule segmentation, improvements to the dataset are vital.

Veronica et al. [16] utilized an ANN for lung nodule identification. A model based on an Artificial Neural Network achieved notable accuracy and took the least time to run. The parameters used to test the performance of the algorithm and classifier were dataset-based and included, but were not limited to, sensitivity, specificity, and precision. The final results show that the ANN with OALO (oppositional-based Ant Lion Optimization) attains higher precision and a comparatively low execution time compared with other algorithms. Yet two drawbacks remain with respect to avoiding over-training: the lack of graphical representation and the lack of network enhancement.
Prasad Dutande et al. [17] proposed the SquExUNet segmentation and 3D-NodNet classification models for identifying nodules in the lungs. The process is simplified by binary class segmentation. Two- and three-dimensional CNNs were combined for nodule identification, producing nodules that are precisely segmented and classified. Comparisons with other nodule detection algorithms show that the technique was very successful and effective at detecting segmented nodules in the lung. However, while larger areas such as lung fields are simple to segment with the suggested architecture, the representation of nodules was unsuccessful.
Yu Gu et al. [18] proposed a CAD architecture for identifying lung nodules with a three-dimensional deep CNN combined with a multi-scale prediction strategy. The objective was to assist radiologists by giving a second opinion on the precise detection of nodules in the lung, a vital step in the early diagnosis of lung cancer. After the lungs were segmented from the CT scan, the 3D CNN with the multi-scale prediction strategy was used to identify nodules. A comprehensive method was used to assess the results. A 3D CNN outperforms a 2D CNN at exploiting richer spatial context information. It also generates more discriminative features when trained with samples taken from 3D scans, which portray the nodules in the lungs more completely. However, its false-positive (FP) rate was the drawback that kept the system from being usable.
Haichao Cao et al. [19] used the TSCNN framework for nodule detection inside the lungs. A Two-Stage Convolutional Neural Network is an advanced CNN. In its first stage, the framework used an improved UNet segmentation network for the initial identification of lung nodules. To achieve a high recall rate without introducing more false positive nodules, a novel sampling technique was introduced and combined with offline hard mining for training, with prediction following the cascade prediction technique. In the second stage, 3D classification networks were used to reduce the false positive rate. Because training the network needs a large and reliable amount of data, data augmentation based on random masking was applied. The generalization capability of the system was improved by using ensemble learning with the false positive reduction model.
... four different fusion techniques for classification. The proposed method was precise, robust, and likely to be put into practice in real clinics. The method was applied to a large CXR database gathered from many hospitals and machines for further training of the network, a step toward improving the performance and robustness of the network. The CNN architecture was to be enhanced for databases used in large-scale sectors of health and society. Thereafter, nodules outside the larger areas of the lungs are to be considered.
Ying Su et al. [21] reported that Faster R-CNN is feasible for identifying lung nodules when the training set is designed to support the technique. In their work, parameter optimization significantly refined the network structure as well as the precision of identification. The entire procedure ran on a GPU. The notable properties were detection speed and end-to-end identification, which ensured the accuracy of the final test results. The major drawback was interference, which reduced the consistency of detection. A 3D CNN can therefore be used for such tests to avoid the interference possible in 2D images, which in turn considerably improves detection.
Zhitong Wu et al. [22] suggested U-Net based segmentation of the lesion region, in which better performance was obtained through efficient training together with an artefact removal technique. Here, taking multi-view imaging information into account provides more robust performance. The lack of training data causes sparsity issues and degrades the performance of the model.
Xiaoyu Zhu et al. [23] conducted experiments on the LUNA16 dataset and achieved low false positives and better performance in sensitivity. However, the sensitivity at the high FP level drops slightly. At the FP level, the detection rate needs further improvement.

Problem Statement
Some of the issues identified in the existing research are enumerated below:
a) Although CAD systems have shown high efficiency and benefits in lung nodule detection, only a few studies have developed approaches that take the routine workflow of radiologists into consideration.
b) Most of the models developed for the detection of lung cancer are not suitable for datasets of larger dimensions. Hence, it is necessary to develop an automatic strategy that can handle even datasets of larger dimensions.
c) When certain features of the image are used, the texture of the image may be affected, leading to poor performance in the detection of lung cancer. Hence, only the significant features need to be extracted to preserve the texture of the image.

Proposed Method
The ultimate aim of the proposed method is to design and develop a deep learning-based segmentation model for lung nodule detection using CT images. Initially, the input lung CT image is pre-processed to remove unwanted artefacts and thereby enhance the quality of the image. Then the segmentation process is carried out using the proposed EBEO algorithm, which inherits the herding characteristics of the elephant [24] and the food hunting characteristics of the bald eagle [25]. After segmentation, a step named nodule candidate detection is carried out to make the image suitable for the detection process.

Elephant Herding Optimization (EHO)
To solve optimization problems, the herding behavior of elephants can be taken as a model. The following rules simplify this behavior.
i) The elephant population is composed of clans, and each clan has a fixed number of elephants.
ii) A fixed number of male elephants leave their clan and live alone, away from the group, at each generation.
iii) Within each clan, all the elephants live together under the leadership of a matriarch.
EHO was developed from this behavior of elephants staying together in herds. It is a nature-inspired algorithm.
In EHO, a clan updating operator updates the position of each elephant in a clan relative to the position of the matriarch.

i) Clan Updating Operator
Following the natural habits of elephants, each clan is led by a matriarch. For elephant $j$ in clan $ci$, the position is updated as:

$$x_{new,ci,j} = x_{ci,j} + \alpha \times (x_{best,ci} - x_{ci,j}) \times r \tag{1}$$

where $x_{new,ci,j}$ and $x_{ci,j}$ denote the new and previous positions of elephant $j$ in clan $ci$, respectively. The scale parameter $\alpha \in [0,1]$ determines the influence of the matriarch of clan $ci$ on $x_{ci,j}$; $x_{best,ci}$ represents the matriarch, which is the fittest elephant individual in clan $ci$; and $r \in [0,1]$ is drawn from a uniform distribution. The fittest elephant in each clan cannot be updated by Eq. (1), because for it $x_{ci,j} = x_{best,ci}$.

For the fittest elephant, the update is:

$$x_{new,ci,j} = \beta \times x_{center,ci} \tag{2}$$

where $\beta \in [0,1]$ is a factor that determines the influence of $x_{center,ci}$ on $x_{new,ci,j}$. The new individual $x_{new,ci,j}$ in Eq. (2) is generated from the information obtained by every elephant in clan $ci$. $x_{center,ci}$ is the centre of clan $ci$, and for the $d$-th dimension it is computed as:

$$x_{center,ci,d} = \frac{1}{n_{ci}} \times \sum_{j=1}^{n_{ci}} x_{ci,j,d} \tag{3}$$

where $1 \le d \le D$ indicates the $d$-th dimension and $D$ is the total dimension. $n_{ci}$ represents the number of elephants in clan $ci$, and $x_{ci,j,d}$ is the $d$-th dimension of the elephant individual $x_{ci,j}$. The centre of clan $ci$, $x_{center,ci}$, can be computed through $D$ calculations by Eq. (3).

Based on the statements and equations above, the pseudo code of the clan updating operator is as follows:

for ci = 1 to nClan (for all clans in the elephant population) do
    for j = 1 to n_ci (for all elephants in clan ci) do
        Update x_ci,j and generate x_new,ci,j by Eq. (1).
        if x_ci,j = x_best,ci then
            Update x_ci,j and generate x_new,ci,j by Eq. (2).
        end if
    end for j
end for ci

Here x_best,ci is the matriarch of clan ci, which represents the best elephant in the clan.
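As an illustrative sketch only (not the authors' implementation), the clan updating operator of Eqs. (1) to (3) can be written with NumPy as follows. The function name `clan_update`, the use of one random number `r` per elephant, and the convention that lower fitness is better are assumptions made for this example.

```python
import numpy as np

def clan_update(clan, fitness, alpha=0.5, beta=0.1, rng=None):
    """Apply the clan updating operator to one clan.

    clan    : (n_ci, D) float array, one row per elephant position.
    fitness : callable mapping a position to a score (assumed: lower is better).
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.array([fitness(x) for x in clan])
    best = int(np.argmin(scores))                  # index of the matriarch
    center = clan.mean(axis=0)                     # Eq. (3): per-dimension clan centre
    r = rng.uniform(0.0, 1.0, size=(clan.shape[0], 1))  # assumed: one r per elephant
    new_clan = clan + alpha * (clan[best] - clan) * r   # Eq. (1)
    new_clan[best] = beta * center                 # Eq. (2): matriarch moves toward the centre
    return new_clan

# Usage with a toy sphere objective (hypothetical, for illustration only)
sphere = lambda x: float(np.sum(x ** 2))
clan = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
updated = clan_update(clan, sphere, rng=np.random.default_rng(0))
```

Computing the centre before any position is overwritten keeps Eq. (2) consistent with the pre-update clan, which is one reasonable reading of the pseudo code.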

ii) Separating operator
When elephants reach adolescence, they tend to leave their clan and live independently, possibly starting a new clan. This separation can be modelled with a separating operator when solving optimization problems. To improve the search ability of EHO, it is assumed that the elephant individual with the worst fitness in each clan applies the separating operator at each generation, as shown in Eq. (4):

$$x_{worst,ci} = x_{min} + (x_{max} - x_{min} + 1) \times rand \tag{4}$$

where $x_{max}$ and $x_{min}$ are respectively the upper and lower bounds of the position of an elephant individual, $x_{worst,ci}$ is the worst elephant individual in clan $ci$, and $rand \in [0,1]$ is drawn from a uniform distribution over [0, 1]. The pseudo code of the separating operator is as follows:

for ci = 1 to nClan (for all clans in the elephant population) do
    Replace the worst elephant in clan ci by Eq. (4).
end for ci

Ultimately, EHO is built on the clan updating operator and the separating operator [24].
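The separating operator of Eq. (4) can be sketched in the same style. This is a hedged example, not the authors' code: `separate_worst` is a hypothetical name, lower fitness is assumed to be better, and the final clipping to the bounds is an added safeguard that the source does not state (the `+ 1` term in Eq. (4) can otherwise push a position slightly above `x_max`).

```python
import numpy as np

def separate_worst(clan, fitness, x_min, x_max, rng=None):
    """Replace the worst elephant of one clan by a random position, Eq. (4)."""
    rng = np.random.default_rng() if rng is None else rng
    scores = np.array([fitness(x) for x in clan])
    worst = int(np.argmax(scores))                 # assumed: highest score is worst
    rand = rng.uniform(0.0, 1.0, size=clan.shape[1])
    new_clan = clan.copy()
    new_pos = x_min + (x_max - x_min + 1) * rand   # Eq. (4) as written in the text
    new_clan[worst] = np.clip(new_pos, x_min, x_max)  # assumption: keep within bounds
    return new_clan, worst

# Usage with a toy sphere objective (hypothetical, for illustration only)
sphere = lambda x: float(np.sum(x ** 2))
clan = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
new_clan, worst_index = separate_worst(clan, sphere, x_min=-5.0, x_max=5.0,
                                       rng=np.random.default_rng(1))
```

One EHO generation would then apply the clan updating operator to every clan, followed by this operator.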

Bald Eagle Optimization (BEO)
The bald eagle is an apex predator. When searching for prey over a spot surrounded by water, it takes flight in a particular direction and selects a specific area as the main region for its search. The search for a prey area relies on self-searching and on following the trail of other birds. The bald eagle also takes into account the size of the fish population (alive or dead) in an area. The BEO algorithm imitates these hunting actions of the bald eagle to model the sequence of hunting stages.
Based upon this, the BEO algorithm is divided into three stages: selecting the search space, searching within the selected space, and swooping.

i) Select stage
This is the initial stage, in which the bald eagle identifies the best area (in terms of food quantity) within the selected search space, where it can hunt down a target. This behavior is expressed mathematically by the selection equation, in which $a$ is the parameter that controls changes in position, taking values from 1.5 to 2, and $r$ is a random number between 0 and 1. In the selection stage, bald eagles choose areas on the basis of information available from the previous stages. If the current area proves insufficient, a different search area is chosen at random, but its position is not far from the previous search area. $P_{best}$ denotes the search space currently used by the bald eagles, chosen on the basis of the best position identified during the previous search. Many points are available near the previous area, and the bald eagles select among them randomly. Meanwhile, $P_{mean}$ indicates that the eagles have used all of the information available from the previous points. The current movement of the bald eagles is determined by multiplying the randomly searched prior information by $a$. Thus, all of the search points change in a random manner.
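The selection equation itself does not appear in this text. As a hedged sketch, the example below assumes the update commonly given for the select stage of bald eagle search, P_new,i = P_best + a × r × (P_mean − P_i), where P_i is the current position of eagle i. The function name `select_stage` and the convention that lower fitness is better are likewise assumptions, so this should be read as an illustration rather than the authors' method.

```python
import numpy as np

def select_stage(positions, fitness, a=2.0, rng=None):
    """Select-stage update under the assumed form
    P_new_i = P_best + a * r * (P_mean - P_i)."""
    rng = np.random.default_rng() if rng is None else rng
    scores = np.array([fitness(p) for p in positions])
    p_best = positions[int(np.argmin(scores))]     # assumed: lower score is better
    p_mean = positions.mean(axis=0)                # information from all previous points
    r = rng.uniform(0.0, 1.0, size=(positions.shape[0], 1))
    return p_best + a * r * (p_mean - positions)

# Usage with a toy sphere objective (hypothetical, for illustration only)
sphere = lambda p: float(np.sum(p ** 2))
eagles = np.array([[0.0, 0.0], [2.0, 2.0]])
moved = select_stage(eagles, sphere, a=1.5, rng=np.random.default_rng(0))
```

With `a` between 1.5 and 2, as stated in the text, the new points stay in the neighbourhood of the best position while still being perturbed at random.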

ii) Search stage
The search for a target takes place at this stage, within the selected area of the search space.

a) Convolutional Layer: The feature tensor is gathered by the convolutional layer with the help of kernel filters. Moving across the input with an integer stride, the kernel filters combine the input features to produce the output. After the convolution and striding process, the dimension of the feature is reduced, and hence zero padding is included to match the dimension of the input feature. Thus, the feature map with the low-level features is acquired in the convolutional layer.
b) Pooling Layer: To reduce dimensionality, a down-sampling operation is performed in the pooling layer. Pooling is performed with max pooling, min pooling, or average pooling, depending on the requirement. In the proposed lung nodule detection and grading system, min pooling is utilized for the reduction of dimensionality.
c) Fully Connected Layer: The decision-making for the detection of the lung nodule and its grading is performed in this layer. The output of the fully connected layer is the grade of the lung nodule. In the proposed method, three different grades, namely grade-0, grade-1 and grade-2 lung nodules, are identified [26].
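To make the layer arithmetic concrete, the following sketch shows how stride and zero padding determine the output size of a convolution and how min pooling reduces dimensionality. It uses plain NumPy; the helper names `conv_output_size` and `min_pool2d` and the 2 x 2 pooling window are assumptions for illustration, not details taken from the proposed network.

```python
import numpy as np

def conv_output_size(n, k, stride=1, padding=0):
    """Spatial output size of a convolution over an n-wide input with a k-wide kernel."""
    return (n + 2 * padding - k) // stride + 1

def min_pool2d(x, size=2):
    """Non-overlapping min pooling over size x size windows of a 2D array."""
    h, w = x.shape
    h2, w2 = h // size, w // size
    # Group pixels into (h2, size, w2, size) blocks, then take the minimum per block
    blocks = x[:h2 * size, :w2 * size].reshape(h2, size, w2, size)
    return blocks.min(axis=(1, 3))

# A 3x3 kernel with stride 1 shrinks a 512-wide input unless one pixel of zero padding is added
no_pad = conv_output_size(512, 3, stride=1, padding=0)
with_pad = conv_output_size(512, 3, stride=1, padding=1)

pooled = min_pool2d(np.arange(16).reshape(4, 4))
```

Here `no_pad` is 510 and `with_pad` is 512, which is the sense in which zero padding restores the input dimension, and `pooled` halves each side of the 4 x 4 input.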

LIDC/IDRI Dataset of Lung Cancer CT Images
The LIDC/IDRI lung cancer dataset will be used in this research to train the Deep CNN classifier. The Lung Image Database Consortium and Image Database Resource Initiative (LIDC/IDRI) is a database of thoracic CT scans. It was created through the collaboration of three organizations: the NCI (National Cancer Institute), the FNIH (Foundation for the National Institutes of Health), and the FDA (Food and Drug Administration). The dataset comprises 1,018 cases in total. In practice, however, the number stands at only 1,010, because 8 cases were unintentionally duplicated during the collection of the CT scans. All of the image data is stored in DICOM format with a fixed image size of 512 x 512. Slice thickness ranges from 0.5 to 5mm, and the most frequent thicknesses are 1mm, 1.25mm, and 2.5mm. Notably, 50% or more of contemporary lung cancer studies have adopted the LIDC/IDRI dataset. The dataset contains many different images, numbering in the thousands. An XML file is included as well, which holds the description of the detected lung lesions. The diameter of each lesion detected in the lung was measured using an electronic calliper. On this basis, 3 main categories were formed: nodules with a diameter of 3mm to 30mm, non-nodules with a diameter >= 3mm, and micro-nodules with a diameter < 3mm [27].
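The three lesion categories above can be expressed as a small rule on the measured diameter. The sketch below is illustrative only: the function name `lesion_category` and the `is_nodule` flag are hypothetical, and lesions that fall outside the three stated categories return `None` rather than a guessed label.

```python
def lesion_category(diameter_mm, is_nodule=True):
    """Map a lesion to one of the three categories described in the text."""
    if is_nodule:
        if diameter_mm < 3.0:
            return "micro-nodule"      # diameter < 3mm
        if diameter_mm <= 30.0:
            return "nodule"            # 3mm to 30mm
        return None                    # outside the stated ranges
    if diameter_mm >= 3.0:
        return "non-nodule"            # non-nodule with diameter >= 3mm
    return None                        # not covered by the stated categories

# Example measurements (made-up values, for illustration only)
labels = [lesion_category(2.0), lesion_category(12.5), lesion_category(6.0, is_nodule=False)]
```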

Experimental results
The experimental results of the proposed EBEO-based deep convolutional neural network for lung nodule detection and grading are portrayed in Figure 5. Here, the lung slices are chosen randomly, and the figure shows the segmentation produced by the proposed EBEO-based deep convolutional neural network, the nodule candidate detection, and the lung nodule detection output.
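The conclusion names specificity, sensitivity, and accuracy as the evaluation criteria. As a minimal sketch, these follow from the counts of a binary confusion matrix (nodule versus non-nodule). The function name and the counts in the usage lines are made-up illustrative values, not results from this work.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                   # true positive rate
    specificity = tn / (tn + fp)                   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)     # overall fraction correct
    return sensitivity, specificity, accuracy

# Illustrative counts only: 8 true positives, 1 false positive,
# 9 true negatives, 2 false negatives
sens, spec, acc = confusion_metrics(tp=8, fp=1, tn=9, fn=2)
```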

Conclusion
The proposed system is helpful for expert radiologists and pulmonologists in the process of lung cancer detection. In this survey and proposed work, a deep learning-based model has been introduced for pulmonary nodule segmentation and classification from CT images using a deep CNN. The segmentation of the CT images is done using the proposed Elephant-based Bald Eagle Optimization (EBEO) algorithm, which inherits the prey-catching characteristics of the bald eagle search agents and the herding behavior of elephants. The nodule is finally classified using the Deep CNN classifier, the weights of which are optimally tuned using the proposed EBEO algorithm. The proposed system not only identifies the presence of nodules but also summarizes the possible shapes of the detected nodules and determines, through classification, whether they are benign or malignant. It has been trained and evaluated using the LIDC dataset and the clinical dataset. The proposed method is evaluated based on specificity, sensitivity and accuracy. The accuracy needs to be raised further for real-time application of lung nodule detection. Hence, a novel method based on a hybrid deep learning architecture will be devised in the future.

Xuechen Li et al. [20] proposed a lung nodule detection technique based on deep learning. A Multi-Resolution CNN was also used to learn the features. It utilized ...

EAI Endorsed Transactions on Pervasive Health and Technology | Volume 10 | 2024 |

The significant features, such as statistical features, shape-based features, intensity-based features, and texture features like the Local Directional Pattern (LDP) and the Local Optimal Oriented Pattern (LOOP), will be extracted. The extracted features act as the input to the proposed EBEO-based Deep convolutional neural network (Deep CNN) [26], which classifies the input image as nodule or non-nodule. In the proposed classifier, the weights of the deep CNN classifier are optimally tuned using the proposed EBEO algorithm, which enhances the detection performance of the proposed model in lung nodule detection. The flow diagram and architecture of the proposed model are depicted in Figures 1 and 2, respectively.
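The text does not list which statistical features are extracted. As a hedged illustration, the sketch below computes four common first-order statistics (mean, standard deviation, skewness, kurtosis) over the pixel intensities of a candidate region. The choice of these four and the function name `statistical_features` are assumptions for this example.

```python
import numpy as np

def statistical_features(region):
    """First-order intensity statistics of a 2D region (assumed feature set)."""
    x = np.asarray(region, dtype=float).ravel()
    mean = float(x.mean())
    std = float(x.std())
    if std == 0.0:
        # A constant region has no spread, so higher moments are set to zero
        skewness, kurtosis = 0.0, 0.0
    else:
        z = (x - mean) / std
        skewness = float(np.mean(z ** 3))   # asymmetry of the intensity distribution
        kurtosis = float(np.mean(z ** 4))   # tail weight (not excess kurtosis)
    return {"mean": mean, "std": std, "skewness": skewness, "kurtosis": kurtosis}

# Usage on a tiny made-up patch (illustrative values only)
features = statistical_features([[1, 2], [3, 4]])
```

A vector built from such values, together with the shape, intensity, and texture features, would form the input to the classifier described above.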

Figure 1. Flow diagram of proposed methodology

Figure 2: Architecture of proposed EBEO Based Deep CNN

Figure 4: Structure of Deep CNN for lung nodule detection

Figure 5: Experimental analysis of the proposed EBEO-based Deep Convolutional Neural Network