Using Deep Learning and Machine Learning: Real-Time Discernment and Diagnostics of Rice-Leaf Diseases in Bangladesh

Bangladesh is heavily reliant on rice production, but a staggering annual decline of 37% in rice output, attributed to insufficient knowledge in recognizing and managing rice plant diseases, has raised concerns. As a result, there is a pressing need for a system that can accurately identify and control rice plant diseases automatically. CNNs have demonstrated their effectiveness in detecting plant diseases, thanks to their exceptional image classification capabilities. Nevertheless, research on rice plant disease identification remains scarce. This study offers a comprehensive overview of rice plant ailments and explores DL techniques used for their detection. By evaluating the advantages and disadvantages of various systems found in the literature, the study aims to identify the most accurate means of detecting and controlling rice plant diseases using DL techniques. We present a real-time detection and diagnostic system for rice leaf diseases that utilizes ML methods. This system is designed to identify three prevalent rice plant diseases, specifically leaf smut, bacterial leaf blight, and brown spot. Clear images of affected rice leaves against a white background serve as input data for the system. Several ML algorithms were trained on the dataset, including KNN, Naive Bayes, J48, and Logistic Regression. Following the pre-processing stage, the decision tree algorithm demonstrated an accuracy of over 97% when applied to the test dataset. In conclusion, implementing an automated system that leverages ML techniques is vital for reducing the time and labor required for detecting and managing rice plant diseases. Such a system would contribute significantly to ensuring the healthy growth of rice plants in Bangladesh, ultimately boosting the nation’s rice production.


Introduction
Rice holds a central position in Bangladesh's agricultural landscape, serving as the primary source of sustenance and a crucial component of the national economy [1]. Nearly 75% of the country's agricultural land is utilized for rice cultivation, which plays an essential role in ensuring food security and offering employment opportunities, particularly in rural areas. Over the past few years, Bangladesh has seen a notable increase in rice production, becoming the world's fourth-largest rice producer [2]. The nation produced 34.5 million tonnes of rice in 2019, which rose to 35.7 million tonnes in 2020 despite COVID-19-related challenges.
In 2021, Bangladesh reached a record production of 56.9 million tonnes of rice, a significant improvement over previous years. The production target for 2022 is 60 million tonnes, with an ambitious goal of 65 million tonnes by 2023. However, the country still grapples with challenges in maintaining disease-free rice cultivation, which is essential for continuous economic growth and meeting the rising demand for rice. In the context of the ongoing fourth industrial revolution, Bangladesh must adopt innovative technologies to enhance the agricultural sector's efficiency and productivity.
This paper examines the development of an automated system for real-time detection and diagnosis of rice leaf diseases in Bangladesh using ML and DL methods. The objective is to combat prevalent rice diseases, such as bacterial blight, brown spot, and leaf smut, which can cause considerable losses in rice production. Traditional crop disease management techniques involve manual detection, expert classification, and the suggestion of suitable treatments. These tasks can be labor-intensive and time-consuming, especially for large-scale farming operations. By employing ML and DL methods, the proposed system aims to automatically identify and classify diseases from rice leaf images, significantly reducing the time and effort required for disease management.
This paper provides a comparative survey of various ML classification algorithms and deep learning models, focusing on their effectiveness and accuracy in detecting and diagnosing rice leaf diseases. It also emphasizes the importance of early detection and monitoring of these diseases to minimize losses and ensure stable rice production. By implementing a sophisticated, automated system for rice disease detection, Bangladesh can maintain its current growth trajectory in rice production and strengthen its position in the global rice market. In conclusion, the successful integration of ML and DL techniques in rice disease detection and diagnosis has the potential to bring about a transformative impact on Bangladesh's agricultural sector. This could contribute to food security, rural employment, and overall economic development.

Related Works
In recent years, there has been a surge in the application of ML and DL models in the agricultural sector, particularly in diagnosing diseases in rice plants. The following is a synopsis of recent studies in this area, beginning with Bari and his team (2021) [1]. In a recent study, Rathore and colleagues (2023) [10] utilized the LW17 DL model to detect rice plant diseases using RGB and grayscale images, underscoring the efficiency of these techniques irrespective of the type of image used. Further, Agarwal and his team (2022) [11] introduced a mobile solution for the detection and classification of rice plant diseases, using DL and transfer learning, emphasizing the potential for real-time, on-the-go solutions.

Proposed Methodology
In this research, we have implemented and compared three Convolutional Neural Network (CNN) models and three traditional ML algorithms (SVM, RandomForestClassifier, and kNN) for the detection and diagnosis of rice leaf diseases. The three CNN models were designed with varying architectures, including different numbers of convolutional layers, pooling layers, and fully connected layers. The models were trained and evaluated on the pre-processed rice leaf disease image dataset. The obtained accuracies for the three CNN models were 0.5417, 0.7667, and 0.9333, respectively. In addition to the deep learning models, we employed the three traditional ML algorithms to classify the features extracted from the images. According to the results, the third CNN model outperformed the other two CNN models as well as the traditional machine learning algorithms, achieving the highest accuracy of 0.9333. This suggests that, on the given dataset, the third CNN model is best suited for detecting and diagnosing rice leaf diseases.
The proposed methodology demonstrates the effectiveness of DL models, particularly CNNs, in the automatic identification and classification of rice leaf diseases. Future work could involve refining the CNN architecture, incorporating additional data augmentation techniques, and exploring other DL models to enhance classification performance.

Description of the Dataset
The dataset is made up of 240 JPEG images of diseased rice leaves collected from the field. These images are divided into three classes based on the type of disease, with 80 images in each class. The three classes are leaf smut, brown spot, and bacterial leaf blight.

Preprocessing of the Dataset
Preprocessing plays a vital role in the creation of ML and DL models, as it prepares the dataset for training and assessment.
For the rice leaf disease image dataset, the preprocessing stages may encompass the following actions:
❖ Import Images: Load the dataset's images and their corresponding labels, making sure that each image is matched to its correct class.
❖ Transform Color Space: Convert the images to an appropriate color space (e.g., grayscale or HSV) if the model requires it or to decrease computational complexity.
❖ Convert Labels: Transform the categorical labels into a numerical format, such as one-hot encoding, which can be processed by the ML or DL model.
❖ Randomize the Data: Shuffle the order of the dataset to ensure that the model does not learn any unintended patterns related to the data's sequencing.
Upon completing the preprocessing actions, the dataset is ready for training and evaluating ML and DL models for detecting and diagnosing rice leaf diseases.
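The label conversion and randomization steps listed above can be illustrated with a minimal sketch. It uses only the Python standard library; the class ordering, the file names, and the helper names `one_hot` and `shuffle_together` are assumptions made for illustration and are not taken from the study's code.

```python
import random

# Class names follow the dataset description; their ordering is an assumption.
CLASSES = ["Bacterial leaf blight", "Brown spot", "Leaf smut"]


def one_hot(label, classes=CLASSES):
    """Return a one-hot vector for a categorical label."""
    vec = [0] * len(classes)
    vec[classes.index(label)] = 1
    return vec


def shuffle_together(samples, labels, seed=0):
    """Shuffle samples and labels with the same permutation so pairs stay aligned."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    return [samples[i] for i in order], [labels[i] for i in order]


# Hypothetical file names standing in for the real images.
files = ["blight_01.jpg", "spot_01.jpg", "smut_01.jpg"]
labels = ["Bacterial leaf blight", "Brown spot", "Leaf smut"]

files, labels = shuffle_together(files, labels)
encoded = [one_hot(lab) for lab in labels]
```

Shuffling a single index permutation, rather than shuffling images and labels separately, keeps every image paired with its own label.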

Data Cleaning
Cleaning data is a critical stage in the ML and DL workflow, ensuring that the dataset is of high quality and appropriate for training models. For the rice leaf disease image dataset, the data cleaning process may encompass several actions, such as eliminating duplicate images, standardizing image dimensions, and normalizing brightness and contrast. Once the data cleaning procedure is complete, the dataset is ready for use in training and evaluating ML and DL models for detecting and diagnosing rice leaf diseases.

Confusion Matrix
A confusion matrix is a performance evaluation tool used in classification problems such as the real-time detection and diagnosis of rice leaf diseases using ML and DL techniques. It aids in visualising classification performance by displaying the number of correct predictions (TP and TN) and incorrect predictions (FP and FN) for each class. Assume there are three diseases (Disease A, Disease B, and Disease C) and a "Healthy" category for leaves with no disease in the context of rice leaf disease detection. For this multi-class classification problem, a confusion matrix would show the correct predictions (true positives) for each class along the main diagonal and the incorrect predictions (false positives and false negatives) in the other cells.
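As an illustration of the four-class example described above, the following sketch builds a confusion matrix from lists of true and predicted labels. The six sample predictions are invented for demonstration only, and the function name `confusion_matrix` is our own; rows are taken to be true classes and columns predicted classes, which is a common convention but an assumption here.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


classes = ["Disease A", "Disease B", "Disease C", "Healthy"]

# Invented labels, used only to show how the cells are filled.
y_true = ["Disease A", "Disease A", "Disease B", "Disease C", "Healthy", "Healthy"]
y_pred = ["Disease A", "Disease B", "Disease B", "Disease C", "Healthy", "Disease C"]

cm = confusion_matrix(y_true, y_pred, classes)

# Correct predictions lie on the main diagonal.
correct = sum(cm[i][i] for i in range(len(classes)))

for name, row in zip(classes, cm):
    print(f"{name:10s} {row}")
```

Every off-diagonal entry is simultaneously a false negative for its row class and a false positive for its column class.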

Precision and Recall
Precision and Recall are crucial metrics used to assess a classification model's performance. Precision is the ratio of true positives (instances correctly identified as belonging to a class) to all predicted positives (instances predicted to belong to that class). Recall, on the other hand, is the ratio of true positives to all actual positives (instances that genuinely belong to the class). For the rice leaf disease dataset, the Precision and Recall values differed for each disease class, depending on the effectiveness of the specific algorithm employed.
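The two definitions above, together with the F1-score (the harmonic mean of precision and recall), can be computed directly from a confusion matrix. The sketch below is illustrative only: the per-cell counts are hypothetical, chosen so that the resulting values round to the SVM figures reported in this paper, since the actual confusion matrix is not given in the text.

```python
def per_class_metrics(cm):
    """Return (precision, recall, f1) per class; rows = true, columns = predicted."""
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, actually another class
        fn = sum(cm[k]) - tp                       # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


# Hypothetical counts (assumption): bacterial leaf blight, brown spot, leaf smut.
cm = [
    [6, 0, 2],
    [1, 7, 2],
    [2, 3, 1],
]

metrics = per_class_metrics(cm)
rounded = [tuple(round(v, 2) for v in m) for m in metrics]

# Overall accuracy is the diagonal sum divided by the total number of samples.
accuracy = sum(cm[i][i] for i in range(3)) / sum(sum(row) for row in cm)
```

With these assumed counts, the low leaf smut scores come from a single correct prediction in that row, which shows how one weak class can pull down overall accuracy.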
Using the SVM algorithm, the Precision and Recall values for each disease class were as follows:

CNN
CNNs have emerged as a dominant category of DNN, particularly excelling in image recognition tasks. They are employed across various fields, including agriculture. CNNs use a combination of convolutional layers, pooling layers, and fully connected layers to learn spatial feature hierarchies automatically and adaptively through backpropagation [14]. A primary aim of CNNs is to construct deeper networks while keeping the number of parameters small. Similar to a conventional neural network, a CNN comprises neurons organized in layers, beginning with an input layer and ending with an output layer, interconnected through learned weights and biases. Hidden layers placed between the input and output layers transform the input's feature space to align with the output. A CNN must include at least one convolutional layer among its hidden layers. Unlike other techniques that call for manual feature extraction, CNNs are able to learn these features autonomously.
The convolutional layer is vital to a CNN's functionality, using learnable kernels that extend through the full depth of the input. This layer performs a convolution operation on its input and passes the result to the following layer, combined with a nonlinear function such as ReLU (Rectified Linear Unit). In addition, the pooling layer reduces the dimensions of the convolved features, decreasing the computational power required during data processing. Despite the reduction in spatial dimensions, training accuracy is largely preserved, and pooling helps to reduce overfitting. The fully connected (FC) layer produces the class scores used in the classification phase.
Prior to the training process, all hyperparameters of the CNN must be fixed, while the kernel weights are learned during training. An efficient activation function contributes to faster learning and a lower loss. Weights are adjusted using optimisation algorithms such as gradient descent or its variants, with updates derived from the loss function. In addition, expanding the dataset size and applying regularization (for example, randomly dropping some activations) reduce the potential for overfitting.
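To make the convolution, ReLU, and pooling operations described above concrete, the following sketch applies them to a toy 5x5 single-channel image in plain Python. The image values and the 2x2 kernel are arbitrary illustrations; in a trained CNN the kernel weights would be learned, and a deep learning library would be used instead of hand-written loops.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the operation DL libraries call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Keep the maximum of each non-overlapping 2x2 window."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


# Toy input and a hand-picked vertical-edge kernel (both are assumptions).
image = [
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 0],
    [1, 0, 1, 2, 3],
    [2, 1, 0, 1, 2],
    [3, 2, 1, 0, 1],
]
kernel = [[1, -1], [1, -1]]

features = conv2d(image, kernel)   # 4x4 feature map
activated = relu(features)         # negatives clipped to zero
pooled = max_pool2x2(activated)    # 2x2 map after pooling
```

The 5x5 input shrinks to 4x4 after the 2x2 convolution and to 2x2 after pooling, which is the reduction in spatial dimensions that lowers the computational cost of later layers.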

SVM
We applied the SVM technique to our dataset to identify and categorise rice leaf diseases. The model's accuracy was 58%. The classification metrics for each disease class are as follows. For bacterial leaf blight, the model achieved a precision of 0.67, a recall of 0.75, and an F1-score of 0.71. For brown spot, the model's precision, recall, and F1-score were all 0.70. For leaf smut, the model achieved a precision of 0.20, a recall of 0.17, and an F1-score of 0.18. These findings indicate that the SVM classified bacterial leaf blight and brown spot moderately well but struggled with leaf smut.

K-Nearest Neighbours (KNN)
KNN was used to classify the rice leaf disease dataset. The model's overall accuracy was 0.50. For bacterial leaf blight, the model achieved a precision of 0.60, a recall of 0.38, and an F1-score of 0.46. Precision, recall, and F1-score for brown spot were all 0.50. Finally, for leaf smut the model achieved a precision of 0.43, a recall of 0.50, and an F1-score of 0.46. The macro-average precision, recall, and F1-score were 0.51; the weighted-average precision, recall, and F1-score were 0.52.

Random Forest Classifier
For detection and classification, the Random Forest Classifier algorithm was applied to the rice leaf disease dataset. The model's overall accuracy was 0.71. The model attained a precision of 1.00, a recall of 0.67, and an F1-score of 0.80 for bacterial leaf blight. The precision, recall, and F1-score for brown spot were all 0.80. Finally, the model had a precision of 0.62, a recall of 0.50, and an F1-score of 0.56 for leaf smut. The macro-average metrics were 0.76, 0.77, and 0.71, while the weighted-average metrics were 0.71, 0.70, and 0.70, respectively. Among the traditional machine learning algorithms, the Random Forest Classifier achieved the highest overall accuracy of 0.71, followed by the SVM algorithm with an accuracy of 0.58 and the kNN algorithm with an accuracy of 0.50. These results suggest that the DL models are more effective at classifying rice leaf diseases than the traditional ML algorithms.

Challenges and Future Direction for Rice Disease Detection in Bangladesh
In Bangladesh, rice is a staple food for millions of people, and the country is highly dependent on rice production. However, production is threatened by common rice diseases such as bacterial leaf blight, brown spot, and leaf smut, which can cause significant yield losses. To detect these diseases, various techniques have been proposed, such as edge detection, watershed segmentation, clustering, active contours, and thresholding, among others. However, the detection process remains challenging due to the complex and dynamic nature of rice diseases.
One of the most important challenges in rice disease detection is the scarcity of data. Although efforts have been made to broaden and diversify the dataset, it is still not sufficient to build accurate models. To overcome this challenge, data improvement methods, such as adding more images to the dataset and adjusting the parameters of the ML model, can be used to create good classifiers. Another challenge is the selection of appropriate features for disease classification. Image properties such as texture, shape, color, and motion-related properties can be used as features for classification. However, selecting the most relevant features that can effectively differentiate between healthy and infected plants remains a research challenge.
Moreover, the classification accuracy of machine learning models is also affected by the pre-processing of the images. Preprocessing techniques such as segmentation and background removal can improve accuracy by removing noise and enhancing image quality. However, selecting pre-processing techniques that enhance the relevant features without losing important information is a challenging task. Furthermore, the development of a comprehensive tool for rice disease diagnosis is another challenge. Although various methods have been proposed, there is a need for a comprehensive system that can integrate different techniques and provide exact and systematic diagnosis of rice diseases.
In summary, the detection of rice diseases in Bangladesh requires the integration of various techniques and the overcoming of several challenges, namely limited data availability, feature selection, pre-processing, and the development of a comprehensive diagnosis system.

Result
The study explored the effectiveness of both DL and traditional ML algorithms for detecting and classifying rice leaf diseases. Three different CNN models were designed and evaluated on a pre-processed rice leaf disease image dataset, with accuracies ranging from 0.54 to 0.93. In addition, three traditional ML algorithms (SVM, RandomForestClassifier, and kNN) were employed and achieved accuracies ranging from 0.50 to 0.71. Specifically, the SVM algorithm achieved an accuracy of 0.58, the kNN model achieved an accuracy of 0.50, and the Random Forest Classifier achieved an overall accuracy of 0.71.
In terms of individual disease classes, the Random Forest Classifier achieved a precision of 1.00 and a recall of 0.67 for bacterial leaf blight. The SVM algorithm achieved a precision of 0.70 and a recall of 0.70 for brown spot. Lastly, the kNN algorithm achieved a precision of 0.43 and a recall of 0.50 for leaf smut.
Overall, the study demonstrates that both DL and traditional ML algorithms can be used to detect and classify rice leaf diseases. These techniques can potentially be applied in rice-producing countries like Bangladesh to improve the efficiency and productivity of the agricultural industry. Further research and development can lead to more advanced models with higher accuracy in detecting and diagnosing a wider range of rice diseases.

Conclusion
In conclusion, the application of DL and ML methods has shown promising outcomes in the automated identification and diagnosis of diseases affecting rice plants. These techniques can identify and classify various types of rice diseases, ultimately improving the efficiency and productivity of the agricultural industry. This is especially significant for Bangladesh, where rice is a staple food and a major contributor to the economy. By utilizing these techniques, significant improvements in rice production and quality can be achieved. Further research and development can result in more advanced products and more accurate models capable of diagnosing a wider range of rice diseases. This has the potential to make the agricultural sector in Bangladesh and other rice-producing countries around the world more sustainable and profitable.
The leaves of the plants are the main location where diseases are visibly apparent, and different diseases have different effects on the leaves. Rice plants play a key role in global food security because they provide food for over half of the global population. Rice plant diseases have a notable impact on the quality and quantity of rice produced, causing an estimated 20-40% production loss annually. Manually detecting these diseases requires extensive work and disease knowledge from farmers, making early diagnosis a difficult and expensive task. Automated methods such as ML and DL can perform early detection at a lower cost, ultimately benefiting both farmers and consumers.
Bari and his team (2021) [1] explored a real-time diagnosis method for rice leaf diseases using Faster R-CNN, a deep learning model, showcasing the promising capabilities of these technologies in efficient plant disease detection. Next, Latif et al. (2022) [2] investigated the use of CNNs with enhancements in the detection of diseases in rice plants, bringing to light the increased significance of these methods in the agricultural domain. Sharma and colleagues (2022) [3] utilized a variety of ML methods to detect rice plant diseases in their early stages, further emphasizing the versatility and potential of such techniques in this field. Ibrahim and Atya (2022) [4] combined traditional ML and DL approaches for detecting diseases in rice leaves, providing a new perspective on the use of hybrid methods. In a different approach, N and his team (2021) [5] employed transfer learning techniques alongside deep neural networks to predict rice leaf diseases, emphasizing the value of transfer learning in boosting the performance of DL models. Bhattacharjee et al. (2020) [6] highlighted the effectiveness of DL for rice plant disease detection and classification, adding a new dimension to the applicability of DL models. Meanwhile, Aggarwal and colleagues (2022) [7] implemented AI and ML methodologies to detect rice diseases with the aim of enhancing agro-business, illustrating the potential economic impacts of these advancements. Daniya & Vigneshwari (2022) [8] emphasized the significance of image-based features for disease detection in rice plants through a deep neural network model that uses texture and deep features. Latif and his team (2022) [9] reinforced the importance of CNNs by introducing an enhanced version for the detection of rice plant diseases.

❖ SVM: It aims to find the optimal hyperplane that separates different classes in the feature space. The SVM model achieved an accuracy of 0.58.
❖ Random Forest Classifier: This ensemble method consists of multiple decision trees, combining their outputs to improve the overall classification performance. The RandomForestClassifier model obtained an accuracy of 0.71.
❖ K-Nearest Neighbors (kNN): This non-parametric method classifies instances in the feature space based on the majority class of their k nearest neighbours. The kNN model had an accuracy of 0.50.
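Of the three algorithms listed above, kNN is the simplest to show in full. The following sketch implements the majority vote among the k nearest neighbours using Euclidean distance. The two-dimensional feature vectors are invented placeholders and do not correspond to the features extracted in the study; `knn_predict` is a name introduced here for illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Invented 2-D feature vectors (assumption), e.g. two colour statistics per image.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.5, 0.1)]
train_y = ["Brown spot", "Brown spot", "Leaf smut", "Leaf smut",
           "Bacterial leaf blight"]

prediction_a = knn_predict(train_x, train_y, (0.15, 0.15))
prediction_b = knn_predict(train_x, train_y, (0.85, 0.85))
```

Because kNN stores the training set and defers all computation to prediction time, its accuracy depends heavily on how well the chosen features separate the classes.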

1. Bacterial Leaf Blight: This bacterial disease, caused by Xanthomonas oryzae pv. oryzae, results in water-soaked lesions on the leaf edges, which can spread rapidly, causing severe damage to the rice plant. [Fig. 1]
2. Brown Spot: This fungal disease, caused by Bipolaris oryzae, is characterized by small, brown lesions on the leaves, which can expand and coalesce, leading to significant yield loss in rice plants. [Fig. 2]
3. Leaf Smut: This disease is caused by the fungus Entyloma oryzae, which leads to elongated, dark brown to black lesions on the rice leaves, negatively impacting the overall health of the plant. [Fig. 3]
This dataset serves as a valuable resource for training and developing ML and DL models to detect and diagnose rice leaf diseases effectively.

Fig 1. Bacterial Leaf Blight
❖ Eliminate Duplicates: Examine the dataset for any duplicate images and remove them to avert overfitting or model bias.
❖ Standardize Image Dimensions: Make sure all images have the same size by adjusting their dimensions, ensuring uniformity in the input to the model.
❖ Alter Image Aspect Ratios: If image aspect ratios differ, images may need to be cropped or padded to maintain consistency across the dataset.
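The duplicate-removal and aspect-ratio steps above can be sketched with the Python standard library. This is a simplified illustration under stated assumptions: duplicates are detected only when files are byte-identical (resized or re-encoded copies would not be caught), images are represented as 2-D lists of grayscale pixel values, and padding uses white (255) to match the white background of the input images. The function names are our own.

```python
import hashlib


def remove_duplicates(images):
    """Keep the first occurrence of each byte-identical image.

    `images` is a list of (name, raw_bytes) pairs.
    """
    seen = set()
    unique = []
    for name, data in images:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((name, data))
    return unique


def pad_to_square(pixels, fill=255):
    """Centre a 2-D grid of grayscale pixels on a square canvas filled with `fill`."""
    h, w = len(pixels), len(pixels[0])
    size = max(h, w)
    top, left = (size - h) // 2, (size - w) // 2
    out = [[fill] * size for _ in range(size)]
    for r in range(h):
        for c in range(w):
            out[top + r][left + c] = pixels[r][c]
    return out


# Placeholder byte strings standing in for real JPEG contents.
images = [("a.jpg", b"\x01\x02"), ("b.jpg", b"\x01\x02"), ("c.jpg", b"\x03")]
unique = remove_duplicates(images)

# A 1x4 strip becomes a 4x4 square with white rows above and below.
padded = pad_to_square([[1, 2, 3, 4]])
```

Padding rather than cropping keeps the whole leaf in view, at the cost of adding uninformative background pixels.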
We now report the performance of the ML and DL models used (Logistic Regression, Gaussian Naive Bayes, Bernoulli Naive Bayes, SVM, XGBoost, Decision Tree Classifier, Grid Search CV, Random Forest Classifier, AdaBoost Classifier, Gradient Boosting Classifier, and CatBoost Classifier) in terms of accuracy, precision, recall, F1-score, and support. The table shows the testing accuracy of the different classifiers on the dataset. The classifiers are evaluated based on how well they can predict the target variable. The Bernoulli Naive Bayes classifier has the highest testing accuracy of 0.99, while KNN has the lowest testing accuracy of 0.77. The other classifiers have testing accuracies ranging from 0.82 to 0.92. The classifiers with higher testing accuracy are better at predicting the target variable and are more suitable for the given dataset.

❖ Bacterial Leaf Blight: Precision of 0.67 and Recall of 0.75
❖ Brown Spot: Precision of 0.70 and Recall of 0.70
❖ Leaf Smut: Precision of 0.20 and Recall of 0.17
Meanwhile, using the KNN algorithm, the Precision and Recall values for each disease class were:
❖ Bacterial Leaf Blight: Precision of 0.60 and Recall of 0.38
❖ Brown Spot: Precision of 0.50 and Recall of 0.60
❖ Leaf Smut: Precision of 0.43 and Recall of 0.50
Finally, using the Random Forest Classifier algorithm, the Precision and Recall values for each disease class were:
❖ Bacterial Leaf Blight: Precision of 1.00 and Recall of 0.67
❖ Brown Spot: Precision of 0.80 and Recall of 0.83
❖ Leaf Smut: Precision of 0.62 and Recall of 0.50
These Precision and Recall values can be used to compare the performance of different algorithms on the same dataset, and to evaluate the effectiveness of each algorithm in identifying specific disease classes.
During the training of the three CNN models, we observed a steady decrease in the training loss and an increase in the training accuracy. The validation loss and accuracy were monitored to ensure that the models were not overfitting to the training data. The validation loss and accuracy followed a similar trend to the training loss and accuracy, indicating that the models generalised well to new data. [Fig. 4] The first CNN model had a final training accuracy of 0.8905 and a validation accuracy of 0.5417. The second CNN model had a higher training accuracy of 0.9585 and a validation accuracy of 0.7667. The third CNN model had the highest training accuracy of 0.9916 and the highest validation accuracy of 0.9333. These results demonstrate the effectiveness of the deeper and more complex architectures in improving the performance of the models. [Fig. 4]
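One simple way to read the reported figures is the gap between training and validation accuracy for each model; a larger gap is commonly taken as a sign of overfitting. The short sketch below computes this gap from the accuracies stated above. The dictionary layout and the model labels are our own.

```python
# (training accuracy, validation accuracy) as reported for the three CNN models.
reported = {
    "CNN 1": (0.8905, 0.5417),
    "CNN 2": (0.9585, 0.7667),
    "CNN 3": (0.9916, 0.9333),
}

# Difference between training and validation accuracy, rounded to 4 decimals.
gaps = {name: round(train - val, 4) for name, (train, val) in reported.items()}

for name, gap in gaps.items():
    print(f"{name}: train-validation gap = {gap:.4f}")
```

On these numbers the gap narrows from the first model to the third, in line with the third model achieving the best validation accuracy.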

EAI Endorsed Transactions on Internet of Things | Volume 10 | 2024
By addressing these challenges and improving the existing techniques, it is possible to achieve accurate and efficient detection of rice diseases and thereby improve rice production in Bangladesh.

❖ Normalize Images: Standardize pixel values in the images by scaling them to a range of 0 to 1 or by applying mean subtraction and dividing by the standard deviation. This process aids the model's convergence during training.
❖ Partition the Dataset: Divide the dataset into subsets for training, validation, and testing, maintaining a balanced distribution of the three disease classes. This division allows for efficient model evaluation and helps to detect overfitting.
❖ Normalize Brightness and Contrast: Adjust brightness and contrast levels in images to minimize variations not related to disease identification.
❖ Partition Data: Separate the dataset into training, validation, and testing subsets, ensuring that each subset represents the three disease classes evenly. This will facilitate an effective and fair assessment of the model's performance.
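The pixel normalization and class-balanced partitioning steps above can be sketched as follows. The 80/10/10 split ratio is an assumption chosen for illustration, since the source does not state the proportions used; the class sizes (80 images each) follow the dataset description, and the function names are our own.

```python
import random
from collections import defaultdict


def normalize(pixels):
    """Scale 8-bit pixel values to the range 0 to 1."""
    return [[v / 255.0 for v in row] for row in pixels]


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split sample indices so each class keeps the same train/val/test ratio."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# 240 labels, 80 per class, as in the dataset description.
labels = (["Bacterial leaf blight"] * 80
          + ["Brown spot"] * 80
          + ["Leaf smut"] * 80)

train, val, test = stratified_split(labels)
scaled = normalize([[0, 255]])
```

Splitting within each class, rather than over the whole shuffled dataset, guarantees that a small validation or test set still contains every disease class in equal proportion.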