Diabetic Retinopathy Classification Using Deep Learning

Diabetic retinopathy (DR) is a frequent complication of diabetes and one of the main causes of adult blindness. To prevent vision loss, DR must be identified and graded promptly. In this article, we propose an automated method for DR detection and classification that applies deep learning to fundus images and uses transfer learning for classification. We trained and validated our model on a dataset of 3,662 fundus images with real-world DR severity labels. The proposed method detected and classified DR with an overall accuracy of 78.14%. Our model fared better than other recent state-of-the-art techniques, which illustrates the promise of deep learning-based strategies for DR detection and management. Our results indicate that the proposed method could be used as a screening tool for DR in a clinical setting, enabling early diagnosis and prompt treatment.


Introduction
Diabetic retinopathy (DR) is a primary cause of blindness in working-age adults and affects about 93 million individuals globally [1]. To avoid vision loss and improve patient outcomes, DR must be identified early and treated promptly. Current DR screening relies on manual grading of fundus images, which is time-consuming, expensive, and vulnerable to inter- and intra-observer variability [2]. There is therefore a need for automated screening techniques that can detect and classify DR quickly and affordably. Diabetic retinopathy is a chronic complication of diabetes mellitus that damages the blood vessels of the retina. It is brought on by persistently high blood sugar levels, which damage the fragile blood vessels in the eye and cause bleeding, fluid buildup, and the growth of new blood vessels [4]. If unchecked, these changes can result in visual impairment and blindness. Diabetic retinopathy is a serious public health issue, particularly in low- and middle-income nations where access to screening and care is restricted.
Currently, dilated eye exams performed by ophthalmologists or optometrists are the main method of DR screening. However, this approach is time-consuming, costly, and labor-intensive, and it calls for highly qualified personnel [5,6]. Automated screening techniques, such as those based on deep learning, can overcome these limitations and widen access to DR screening in settings with limited resources. Thanks to recent developments in deep learning and computer vision, automated DR identification and classification has shown promising results [7]. Deep learning-based techniques employ convolutional neural networks (CNNs) to extract relevant features from fundus images and classify them into DR severity categories. These techniques have shown sensitivity and accuracy on par with or better than those of human experts.
In this article, we propose an automated, deep learning-based methodology for DR detection and classification applied to fundus images. We trained and validated our model on a set of 3,662 fundus photographs labelled with real-world DR severity, and we assessed its performance using a variety of criteria, including overall accuracy, precision, recall, and AUC-ROC. In our experiments, the ResNet152 model outperformed the other deep learning models we compared. Deep learning-based automated DR screening has the potential to transform DR management by expanding access to screening, lowering costs, and improving patient outcomes. Our work contributes to this field by putting forward a strategy for DR detection and classification that achieves good accuracy. The objective of this research is to assess the effectiveness of the proposed technique. The proposed method may be applied as a DR screening tool in a clinical context, enabling early disease identification and prompt treatment.

Literature Survey
In [3], N. S. Firke used the Asia Pacific Tele-Ophthalmology Society 2019 Blindness Detection (APTOS 2019 BD) database. The data were pre-processed through several steps, including resizing, rescaling, and label encoding, with the images downsized to 64 × 64 pixels. 80% of the data was used for training and the remaining 20% for testing. Images were classified using a CNN architecture. A confusion matrix was used to track correct classifications and misclassifications. On the training set, 94% accuracy was attained after about 30 epochs, after which accuracy held steady at around 96%.
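The rescaling, label encoding, and 80/20 split described above can be sketched as follows. This is a minimal illustration and not the code used in [3]: the function names, the integer codes assigned to the severity grades, and the fixed random seed are assumptions made for the example.

```python
import random

# Severity grades as class names; the integer codes are an assumption.
CLASSES = ["No DR", "Mild", "Moderate", "Severe", "Proliferative DR"]


def encode_labels(names):
    """Map class names to integer codes (label encoding)."""
    index = {name: i for i, name in enumerate(CLASSES)}
    return [index[name] for name in names]


def rescale(pixels):
    """Rescale 8-bit pixel intensities from [0, 255] to [0, 1]."""
    return [p / 255.0 for p in pixels]


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and hold out `test_fraction` of them for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train, test = train_test_split(range(100))
print(len(train), len(test))  # 80 20
```

Fixing the seed keeps the split reproducible, so the same images land in the test set on every run.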

Creation of an Improved Deep Learning Model
The creation of an enhanced deep learning model for DR classification is the first part of the proposed effort. Even though the model in this study attained good accuracy, precision, recall, and AUC-ROC values, there is still room for improvement. Deep learning architectures such as convolutional neural networks (CNNs) may be investigated in the future to further improve classification performance [8]. The model's generalizability and suitability for clinical use may both be improved by using larger and more varied datasets.

Model's Explainability
A drawback of deep learning models is that they are difficult to interpret. In clinical settings, it can be problematic if it is unclear how the model reached its conclusion. Future research might investigate explainable AI approaches, such as attention mechanisms, to improve the model's interpretability and give doctors a better understanding of the model's decision-making process, as shown in Fig. 1.

Dataset
We obtained a Kaggle dataset of 3,662 fundus photographs from patients who had been diagnosed with DR. The photographs were taken with a non-mydriatic retinal camera, and two ophthalmologists then graded all of them for DR severity using the Early Treatment Diabetic Retinopathy Study (ETDRS) classification system. The dataset contained images with various degrees of DR severity: no DR, mild, moderate, severe, and proliferative DR.
Assembling datasets for diabetic retinopathy involves finding data sources, assessing data quality, and annotating the data for use in deep learning models. Possible data sources include electronic health records and publicly accessible datasets [10,13]. When assessing data quality, researchers must make sure that the images are clear enough to diagnose diabetic retinopathy accurately [14]. Each image must be graded according to the severity of the disease, which is usually done by qualified professionals using established grading schemes.

Data preprocessing
We preprocessed the images by cropping them to the region of interest (ROI) using a circular mask centered on the fovea. The images were then converted to grayscale and resized to 180 × 180 pixels.
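The three preprocessing steps can be illustrated with a small, dependency-free sketch. It makes several assumptions that the text does not specify: the fovea coordinates and mask radius are taken as given inputs, grayscale conversion uses the ITU-R BT.601 luminance weights, and resizing uses nearest-neighbour sampling. The function names and the synthetic test image are hypothetical.

```python
def apply_circular_mask(image, center, radius, fill=(0, 0, 0)):
    """Crop to the square bounding the circle and blank every pixel outside it."""
    cy, cx = center
    top, left = max(cy - radius, 0), max(cx - radius, 0)
    bottom = min(cy + radius + 1, len(image))
    right = min(cx + radius + 1, len(image[0]))
    cropped = []
    for y in range(top, bottom):
        row = []
        for x in range(left, right):
            inside = (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2
            row.append(image[y][x] if inside else fill)
        cropped.append(row)
    return cropped


def to_grayscale(image):
    """Convert rows of (R, G, B) tuples to luminance (BT.601 weights, assumed)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in image]


def resize_nearest(image, size=180):
    """Resize a 2-D image to size x size with nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    return [[image[y * h // size][x * w // size] for x in range(size)] for y in range(size)]


def preprocess(image, fovea, radius, size=180):
    """Mask to the ROI, convert to grayscale, then resize."""
    roi = apply_circular_mask(image, fovea, radius)
    return resize_nearest(to_grayscale(roi), size)


# Synthetic 400 x 400 image of a single colour; fovea assumed at the centre.
image = [[(200, 100, 50)] * 400 for _ in range(400)]
out = preprocess(image, fovea=(200, 200), radius=150)
print(len(out), len(out[0]))  # 180 180
```

In practice an image library would perform these steps far more efficiently; the nested lists are used here only to keep the logic visible.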

Model construction
For feature extraction, we employed a convolutional neural network (CNN), ResNet152, with pre-trained weights. The fully connected layers of the ResNet152 network were removed and replaced with one fully connected layer of 1024 units, followed by a final softmax layer for classification. We trained the model for 30 epochs using the Adam optimizer, with a batch size of 32 and a learning rate of 0.001.
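To give a sense of the size of the classification head and of what the softmax layer computes, the following sketch counts the head's trainable parameters and converts a vector of logits into class probabilities. It assumes a 2048-dimensional pooled feature vector from the ResNet152 backbone and five output classes; the logits are hypothetical values chosen for illustration.

```python
import math


def dense_params(n_in, n_out):
    """Number of trainable parameters (weights plus biases) in a dense layer."""
    return n_in * n_out + n_out


def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


FEATURES = 2048  # pooled ResNet152 feature size (assumed backbone output)
HIDDEN = 1024    # units in the fully connected layer, as stated in the text
CLASSES = 5      # no DR, mild, moderate, severe, proliferative DR

head_params = dense_params(FEATURES, HIDDEN) + dense_params(HIDDEN, CLASSES)
print(head_params)  # 2103301

probs = softmax([2.0, 1.0, 0.5, 0.1, -1.0])  # hypothetical logits
print(probs.index(max(probs)))  # 0
```

The predicted severity grade is the index of the largest probability, and the probabilities always sum to one.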

Evaluation of the model
We evaluated the performance of our model using a range of performance metrics, including accuracy, precision, recall, and area under the receiver operating characteristic curve (AUC-ROC).
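AUC-ROC can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it that way and extends it to several classes by one-vs-rest macro averaging. The averaging scheme is an assumption, since the text does not say how the multi-class AUC was obtained, and the function names are hypothetical.

```python
def auc_roc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc_ovr(y_true, prob_rows, n_classes):
    """Average the one-vs-rest AUC over all classes (assumed averaging scheme)."""
    aucs = []
    for c in range(n_classes):
        labels = [1 if y == c else 0 for y in y_true]
        scores = [row[c] for row in prob_rows]
        aucs.append(auc_roc(labels, scores))
    return sum(aucs) / n_classes


print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form is quadratic in the number of samples, which is acceptable for a dataset of a few thousand images.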

Proposed classification methods and techniques
In transfer learning, a previously trained deep learning model serves as the basis for a new classification task. In the proposed study, the weights of a pre-trained model were used to initialise the weights of the deep learning model. This technique can improve both the accuracy of the final model and the efficiency of the training process.
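The weight initialisation step can be pictured with a toy example in which each layer is a flat list of numbers. Layers whose name and size match the pre-trained model are copied, and the remaining layers (typically the new classifier) are initialised randomly. The layer names, sizes, and the Gaussian initialisation are hypothetical and serve only to illustrate the idea.

```python
import random


def init_from_pretrained(layer_sizes, pretrained, seed=0):
    """Copy pre-trained layers whose size matches; initialise the rest randomly.

    Returns the weight dict and the names of the newly initialised layers.
    """
    rng = random.Random(seed)
    weights, fresh = {}, []
    for name, size in layer_sizes.items():
        source = pretrained.get(name)
        if source is not None and len(source) == size:
            weights[name] = list(source)  # reuse learned features
        else:
            weights[name] = [rng.gauss(0.0, 0.01) for _ in range(size)]
            fresh.append(name)
    return weights, fresh


# Hypothetical layers, flattened to 1-D for brevity.
pretrained = {"conv1": [0.2, -0.1, 0.4], "conv2": [0.3, 0.3], "classifier": [0.0] * 1000}
target = {"conv1": 3, "conv2": 2, "classifier": 5}  # new 5-class head

weights, fresh = init_from_pretrained(target, pretrained)
print(fresh)  # ['classifier']
```

Only the layers listed in `fresh` start from scratch, which is why fine-tuning usually needs less data and fewer epochs than training the whole network from random weights.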

(i) VGG16
The VGG16 architecture has been extensively employed as a pre-trained model for transfer learning in computer vision applications, including the classification of diabetic retinopathy. Transfer learning involves fine-tuning the pre-trained VGG16 model on a smaller dataset specific to the target application, such as a dataset of retinal images for the classification of diabetic retinopathy. The pre-trained VGG16 model's weights serve as the starting point for the training procedure. The model is then trained further on the target dataset to adapt it to the properties of the new data [19]. This strategy can help to increase the accuracy of the final model and the efficiency of the training process.

(ii) VGG19
VGG19 is similar in structure to VGG16 but includes more convolutional layers, which can enable better feature extraction and higher classification accuracy.
The greater receptive field made possible by the deeper network can help capture more intricate features in the input images. However, the added depth also increases the network's computational cost and makes it harder to train.
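The effect of the extra layers on the receptive field can be worked out directly. The sketch below applies the usual recurrence for stacked layers to the standard VGG16 and VGG19 configurations (3 × 3 convolutions with stride 1, and a 2 × 2 max-pool with stride 2 after each block). The function names are hypothetical, and the layer counts come from the published VGG configurations rather than from this article.

```python
def receptive_field(layers):
    """Receptive field of the last layer for a sequence of (kernel, stride) layers."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) input steps
        jump *= stride             # striding spreads later layers further apart
    return rf


def vgg_layers(convs_per_block):
    """3x3 stride-1 convolutions followed by a 2x2 stride-2 max-pool in each block."""
    layers = []
    for n_convs in convs_per_block:
        layers += [(3, 1)] * n_convs + [(2, 2)]
    return layers


VGG16 = vgg_layers([2, 2, 3, 3, 3])  # 13 convolutional layers
VGG19 = vgg_layers([2, 2, 4, 4, 4])  # 16 convolutional layers

print(receptive_field(VGG16))  # 212
print(receptive_field(VGG19))  # 268
```

The three additional convolutions sit in the later blocks, where each one widens the field the most, so VGG19 sees a noticeably larger input region per output unit.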
Like VGG16, the pre-trained VGG19 model may be fine-tuned on a smaller dataset specific to the intended application, such as a dataset of retinal images for the classification of diabetic retinopathy. This enables the model to adapt to the characteristics of the new dataset and can help to increase the final model's accuracy.

(iii) ResNet50, ResNet101, and ResNet152
ResNet-50 has 50 layers and uses residual connections. Deeper variants of ResNet, ResNet-101 and ResNet-152, have 101 and 152 layers, respectively, and even more residual connections. As shown in Fig. 2, these architectures were created to deal with the vanishing gradient problem, which may arise in extremely deep networks and make them challenging to train.
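A scalar toy model shows why residual connections help with vanishing gradients. A plain block y = f(x) contributes the factor f'(x) to the backpropagated gradient, whereas a residual block y = x + f(x) contributes 1 + f'(x). The per-block derivative of 0.01 and the depth of 50 are hypothetical values chosen to make the effect visible.

```python
def gradient_through_stack(local_grads, residual):
    """Product of per-block derivatives dy/dx along a stack of scalar blocks.

    A plain block y = f(x) contributes f'(x); a residual block y = x + f(x)
    contributes 1 + f'(x), so the identity path keeps the product from collapsing.
    """
    total = 1.0
    for g in local_grads:
        total *= (1.0 + g) if residual else g
    return total


local = [0.01] * 50  # hypothetical small derivatives in a 50-block stack
plain = gradient_through_stack(local, residual=False)
skip = gradient_through_stack(local, residual=True)
print(plain < 1e-40, skip >= 1.0)  # True True
```

Without the skip path the gradient reaching the first block is effectively zero, while with it the gradient stays at a usable magnitude.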
By enabling gradients to pass straight through the network and lessening the effects of the vanishing gradient problem [15,16], the residual connections in the ResNet design make very deep networks easier to train. The ResNet architecture performs well across a range of computer vision applications, which benefits the classification of diabetic retinopathy.
Deep CNN designs such as DenseNet-121, DenseNet-169, and DenseNet-201 use dense connections between layers to improve gradient flow and information flow. In contrast to other architectures, DenseNet connects every layer to every other layer in a feed-forward fashion, enabling better feature reuse and higher accuracy. These models have proven successful for a variety of computer vision applications, including the classification of diabetic retinopathy, and they may be fine-tuned on particular retinal image datasets for increased accuracy.
We can create effective and precise deep learning models for DR classification by using pre-trained DenseNet models and fine-tuning them on particular datasets [17]. These models may enhance the detection and management of diabetic retinopathy, a leading cause of blindness worldwide.
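Dense connectivity has a simple arithmetic consequence: every layer in a dense block appends a fixed number of feature maps (the growth rate) to everything produced before it. The sketch below tracks the channel count through the three DenseNet variants. The block configurations, the growth rate of 32, the 64 initial channels, and the 0.5 compression in transition layers are the standard published DenseNet settings rather than values taken from this article, and the function names are hypothetical.

```python
def densenet_channels(block_config, growth_rate=32, init_channels=64, compression=0.5):
    """Feature-map channels at the end of the last dense block."""
    channels = init_channels
    for i, n_layers in enumerate(block_config):
        # Each layer appends `growth_rate` new maps to the concatenated input.
        channels += n_layers * growth_rate
        if i < len(block_config) - 1:
            channels = int(channels * compression)  # transition layer
    return channels


def dense_connections(n_layers):
    """Direct connections in a dense block of n layers: n * (n + 1) / 2."""
    return n_layers * (n_layers + 1) // 2


CONFIGS = {
    "DenseNet-121": (6, 12, 24, 16),
    "DenseNet-169": (6, 12, 32, 32),
    "DenseNet-201": (6, 12, 48, 32),
}
for name, config in CONFIGS.items():
    print(name, densenet_channels(config))
# DenseNet-121 1024
# DenseNet-169 1664
# DenseNet-201 1920
```

The final channel count is the size of the feature vector that a new classification head would receive when one of these models is fine-tuned.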

Performance Evaluation
Performance measures for deep learning models for diabetic retinopathy include precision, recall, accuracy, F1 score, and the area under the ROC curve (AUC-ROC). These metrics capture different aspects of the model's performance, including how reliably its positive predictions are correct, its ability to correctly identify positive and negative cases, and its overall performance across decision thresholds.
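For a multi-class problem such as DR grading, precision, recall, and F1 score are usually computed per class from the confusion matrix, treating each class in turn as the positive class. The following sketch shows one way to do this; the function names and the toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_metrics(matrix):
    """Precision, recall, and F1 for each class, treating it as the positive class."""
    n = len(matrix)
    results = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp
        fn = sum(matrix[c]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        results.append((precision, recall, f1))
    return results


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


cm = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], n_classes=3)
metrics = per_class_metrics(cm)
print(cm)  # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
```

Per-class values matter for DR because the severity grades are typically imbalanced, so a high overall accuracy can hide poor recall on the rarer, more severe grades.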
Researchers often use a validation set, which is distinct from the training set, to evaluate a model during development. The training set is used to fit the model, while the validation set is used to assess it and to tune hyperparameters such as the learning rate and the number of layers. Researchers may also employ methods like cross-validation, which divides the dataset into several folds and, in turn, holds out each fold for testing while training on the rest, to guard against overfitting [18]. This makes it more likely that the model will generalize to new data rather than being overfit to the training set.
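The fold construction used in cross-validation can be sketched in a few lines. The example assumes five folds and a fixed random seed, neither of which is specified in the text, and applies the split to a dataset of 3,662 samples; the function name is hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


splits = list(k_fold_indices(3662, k=5))
print([len(val) for _, val in splits])  # [733, 733, 732, 732, 732]
```

Every sample is held out exactly once, so the averaged validation score uses the whole dataset without ever scoring a model on data it was trained on.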
Finally, a separate test set is used to give a fair assessment of the final model's performance. The test set is kept entirely separate from the training and validation sets and is only used to evaluate the final model, as reported in Table 2.

Conclusion and Future remarks
In conclusion, deep learning has shown considerable promise for the accurate diagnosis and treatment of diabetic retinopathy. Deep learning models like Xception, VGG, ResNet, and DenseNet have the potential to revolutionize the diagnosis and treatment of diabetic retinopathy owing to their high accuracy rates, as shown in Fig. 3. Despite these encouraging outcomes, there is still room for improvement in the construction and optimization of deep learning models for diabetic retinopathy. To enhance the performance and usefulness of these models, researchers must keep exploring new strategies, such as transfer learning and attention mechanisms. Additionally, datasets and evaluation metrics need to be standardized so that studies can be compared more easily. This would help identify the most effective models and so facilitate the creation of more precise and trustworthy ones. Overall, the creation of deep learning models for diabetic retinopathy has great promise for improving patient outcomes and easing the strain on healthcare systems. To realize these advantages fully, more research and development in this field are necessary.

Fig. 3. Metrics comparison of different deep learning models

Table 1. Comparative analysis with state-of-the-art models

Table 2. Performance metrics of the models

A. S. Sathwik, EAI Endorsed Transactions on Pervasive Health and Technology, 2023 | Volume 9