An IoT Integrated Smart Prediction of Wild Animal Intrusion in Residential Areas Using Hybrid Deep Learning with Computer Vision



Introduction
Human population growth increases the exploitation of forest areas and related resources for residential and other settlement purposes. As forest areas are destroyed, wild animals are forced to move toward the settlements [1]. Wild animals may invade a neighborhood, injure or even kill residents, and damage crops. A system is therefore needed to detect the entry of such animals into human habitats and to issue signals and alarms whenever they do.
The Internet of Things (IoT) and deep learning are two rapidly emerging technologies that can be combined to develop novel applications and capabilities [2]. Deep learning models can analyze IoT sensor data to anticipate equipment failures and maintenance needs [3]. By identifying patterns and trends in sensor readings, predictive maintenance models provide valuable insights into the timing and type of maintenance or replacement required, thereby optimizing maintenance schedules and minimizing downtime. Deep learning and artificial intelligence contribute to technological advancements, and IoT opens up a wide range of applications. The proposed system comprises three modules: the deep learning module, the IoT module, and the cloud module. The deep learning module employs several algorithms, including DenseNet 201, ResNet50, and YOLO, for image detection and prediction. The prediction output is used to anticipate the entry of wild animals. The IoT and cloud modules address additional requirements such as storage and signaling.
Deep learning models are trained on large datasets, and cloud computing provides the facilities and resources needed to handle them. Cloud storage services offer a practical and scalable way to keep and manage large files. In addition to storing large amounts of data, cloud storage systems provide high reliability, redundancy, and data versioning, which are essential for deep learning operations. Teams working on deep learning projects can collaborate through cloud-based platforms. Tools such as version control, code collaboration, and model sharing help researchers and practitioners work together more effectively and iterate on their models. On cloud platforms, deep learning models can also be deployed as APIs or services, allowing seamless integration into applications and services.
The IoT module comprises the camera and signaling systems. The camera provides the visual input: it captures images and passes them to the deep learning module, whose output is returned to the IoT module. The IoT module plays a vital role because it is both the only source of input to the system and the place where the intrusion signal or alarm is raised. The cloud module is the primary module for storage: it stores the dataset used for prediction and the trained model, so that both can be retrieved whenever needed. The cloud module is used purely for storage and access.

Related work
Sheela et al. [4] proposed a system that integrates a Raspberry Pi module with a PIR sensor, a thermal imaging camera, a GSM module, and a hologram. The recorded animal image is verified using a modified CNN technique, and the user is then informed. The system aims to secure crops against animal intrusion so that farmers do not suffer significant losses. The farmer receives a warning via an IoT application. The efficiency of the suggested system was examined with respect to the captured photographs of the intruder and the notification alert. The model can efficiently identify any kind of incursion around the field.
Meenakshi et al. [5] describe a system that combines YOLOv4 and LoRa to identify the presence of wild animals and notify the appropriate wildlife authority. The intrusion of wild animals is predicted using a hybrid model. The accuracies of the algorithms used are compared, and the most efficient one is selected.
Patil et al. [6] aim to protect people and livestock at the periphery of forest areas and fields by establishing a computerized device that recognizes the invasion of wild animals and lures them back into the forest without harming them, thereby minimizing the risk of injury caused by human-wildlife conflict. Triggering situations and their environmental consequences, such as a wild animal entering the area of interest, are simple to simulate.
Nikhil et al. [7] suggested an approach that combines the Internet of Things and machine learning methods to make irrigation smarter. The approach uses machine learning techniques to help farmers grow suitable crops based on soil parameters. Crop forecasting spares farmers from continuous surveillance of the field, and the system also helps prevent intruders such as wild animals from entering the field. Additionally, it promotes water conservation by automatically supplying plants and fields with the minimum quantity of water they need.
Panda et al. [8] employ ultrasonic sensors at the corners of the field to detect intrusions. A camera installed on an electric vehicle, equipped with a Node microchip microcontroller, then captures photographs of the intruder to aid in field surveillance. The farmer is alerted through an IoT application. The effectiveness of the proposed system was evaluated based on the acquired photographs of the intruder and the notification alert. The model enables efficient identification of any type of encroachment around the field.
Giordano et al. [9] addressed animal intrusion in agricultural lands. In the agriculture industry, the adoption of the Internet of Things has enabled smart farming and precision agriculture, among other applications. The article describes how Internet of Things software has been developed for crop security to prevent animals from entering agricultural fields. Repelling and monitoring equipment can protect agricultural systems from possible damage caused by wild animal attacks and weather conditions.

Proposed system
The proposed system comprises three modules: the deep learning module, the IoT module, and the cloud module. The deep learning model predicts the entry of wild animals using image classification algorithms. The hybrid model uses more than one algorithm for prediction: DenseNet 201, ResNet50, and YOLO. The IoT module supplies the input image to the deep learning model. The cloud module is used for storing and retrieving data throughout the process.

Figure 1. Proposed System Architecture Diagram for Prediction of Wild Animals
The flow of the system, depicted in Figure 1, involves three modules. Images of wild animals intruding into the residential area are fed as input to the deep learning module. This module comprises image preprocessing and model training steps involving several deep learning algorithms [10]. The deep learning model returns its output to the IoT devices, which sound the alarm and signal the intrusion to the residents. The cloud module handles data exchange with the deep learning module [11].

Internet of Things (IoT)
The proposed system uses the Internet of Things to monitor and prevent animal intrusion into residential areas. Sensor data is analyzed over the internet, and feedback is delivered to the user's console [12], [13]. The system uses motion-activated trail cameras to detect wild animal movements.
These cameras are equipped with passive infrared (PIR) and motion sensors that detect animal movement within the camera's field of view. Once the sensors detect movement, the camera starts capturing photos or videos [14]. The interval between captures can be programmed, or burst mode can be used to take multiple shots in a short span of time. The resolution of the photos and videos depends on the quality of the lens, the megapixel count, and the type of trail camera used [15]. Usually, a high-resolution trail camera is used for detecting and processing the data. To detect motion and capture images in darkness, these trail cameras use infrared LEDs [16]. The light from these LEDs is not visible to human or animal eyes but can be captured by the cameras. This allows objects to be captured in the dark without disturbing animals with a bright flash [17]. As Figure 2 shows, the computer vision cameras are equipped with a built-in microcontroller to capture the wild animal images. The microcontroller analyzes the initial output from the camera and sends the data to the cloud, where the captured images are analyzed with deep learning algorithms and an alert notification is sent to the forest department officials concerned [18]. The microcontroller handles camera control in this IoT system, including the critical decision of when to trigger the camera to capture images or videos [19]. It also adjusts camera settings such as exposure, shutter speed, and white balance according to the surrounding lighting conditions. It receives data from the PIR sensor and signals the camera to start capturing when movement is detected. The microcontroller also compresses the camera data to a smaller size so that it can be sent to the cloud server efficiently. It can be equipped with Wi-Fi or cellular modules to transfer data from the camera to the cloud server. It ensures that the communication protocol is followed and converts the data into the format required for transmission [20]. The cloud server analyzes the transmitted data using the deep learning algorithms described below and sends an alert notification about the intrusion of the wild animal into the residential area to the forest officials, who then take the required action [21], [22].
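The trigger logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the deployed firmware: `read_pir` and `capture_frame` are hypothetical stand-ins for the sensor and camera drivers, the burst size of three is an assumed value, and zlib stands in for whatever compression the microcontroller actually applies.

```python
import zlib
from typing import Callable, List


def capture_on_motion(
    read_pir: Callable[[], bool],
    capture_frame: Callable[[], bytes],
    polls: int,
    burst_size: int = 3,
) -> List[bytes]:
    """Poll the PIR sensor; on motion, capture a burst and compress each frame."""
    payloads = []
    for _ in range(polls):
        if read_pir():  # motion detected in the camera's field of view
            for _ in range(burst_size):
                frame = capture_frame()
                # Compress before upload to reduce the amount of data sent.
                payloads.append(zlib.compress(frame, 9))
    return payloads


# Simulated hardware (hypothetical): motion is reported on the second poll only.
pir_readings = iter([False, True, False])
fake_frame = bytes(1000)  # placeholder for raw image bytes

payloads = capture_on_motion(lambda: next(pir_readings), lambda: fake_frame, polls=3)
print(len(payloads), "compressed frames,", len(payloads[0]), "bytes each")
```

On real hardware the two callables would wrap the actual PIR input and camera interface, and the loop would run continuously rather than for a fixed number of polls.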

Cloud
LoRaWAN (Long Range Wide Area Network) uses low power and transmits data over long ranges. Because the system is set up in remote places and residential areas, it must transmit data over long distances to communicate with the cloud and the forest department's console. LoRaWAN enables such long-range communication, carrying data over several kilometers. It uses sub-GHz frequencies to transmit data over long distances. Using this communication protocol, the microcontroller communicates with the cloud storage. The cloud converts the received data back into the original images or videos and then processes it. The data is processed using the deep learning algorithm, which identifies the kind of wild animal that has intruded into the residential area. On detecting a wild animal in the residential zone, the cloud generates an alert notification and sends it as a message to the mobile number of the forest official responsible. A dedicated mobile/web application receives the alert notification when a wild animal intrudes. The web application also shows previous intrusion data for the particular residential zone. Based on this past data, the forest department can take preventive measures in zones where intrusions recur [23], [24].
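LoRaWAN frames carry only small payloads, so a compressed image has to be split on the device and rebuilt in the cloud before it can be analyzed. The sketch below illustrates one way to do this. The 200-byte fragment size and the 4-byte header (fragment index and fragment count) are assumptions made for illustration; they are not part of the LoRaWAN specification or of the system described here.

```python
import struct
from typing import List

HEADER = struct.Struct(">HH")  # (fragment index, fragment count), big-endian


def fragment(data: bytes, max_payload: int = 200) -> List[bytes]:
    """Split data into fragments no larger than max_payload, each with a header."""
    chunk = max_payload - HEADER.size
    pieces = [data[i:i + chunk] for i in range(0, len(data), chunk)] or [b""]
    return [HEADER.pack(i, len(pieces)) + p for i, p in enumerate(pieces)]


def reassemble(fragments: List[bytes]) -> bytes:
    """Rebuild the original data from fragments received in any order."""
    parts = {}
    total = None
    for frag in fragments:
        index, count = HEADER.unpack(frag[:HEADER.size])
        total = count
        parts[index] = frag[HEADER.size:]
    if total is None or len(parts) != total:
        raise ValueError("missing fragments")
    return b"".join(parts[i] for i in range(total))


image_bytes = bytes(range(256)) * 4  # 1024 placeholder bytes standing in for an image
frames = fragment(image_bytes)
restored = reassemble(list(reversed(frames)))  # simulate out-of-order arrival
print(len(frames), "fragments, restored OK:", restored == image_bytes)
```

The index in each header lets the cloud side tolerate out-of-order delivery and detect lost fragments before passing the image to the model.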

Dataset Creation
The objective of the project is to detect wild animals that intrude into residential areas. Animals such as lions, tigers, cheetahs, elephants, bears, jaguars, and leopards cause panic and pose a threat to local residents. The dataset is therefore created by collecting images of these animals through web scraping. The scraping script uses Beautiful Soup, a Python package for parsing web pages, together with API (application programming interface) calls that act as a gateway through which data flows from the internet to the device; the script connects to Google to fetch the images. It searches for the different animals and collects high-quality images, with a minimum of 50 and a maximum of 450 images collected for each animal. This dataset can then be used for model development.
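The parsing step of such a scraper might look like the sketch below. It only shows how Beautiful Soup can pull image links out of a page that has already been downloaded. The function name `extract_image_urls`, the sample HTML, and the `example.org` address are hypothetical, and the actual search queries and download logic used for the dataset are not shown.

```python
from typing import List
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_image_urls(html: str, base_url: str, limit: int = 450) -> List[str]:
    """Return absolute URLs of <img> tags in the page, up to `limit` entries."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for img in soup.find_all("img"):
        src = img.get("src")
        if src and not src.startswith("data:"):  # skip inline placeholder images
            urls.append(urljoin(base_url, src))
        if len(urls) >= limit:
            break
    return urls


# Hypothetical page content standing in for a downloaded search-result page.
sample_html = """
<html><body>
  <img src="/images/tiger_01.jpg">
  <img src="https://example.org/images/tiger_02.jpg">
  <img alt="no source">
</body></html>
"""
urls = extract_image_urls(sample_html, "https://example.org")
print(urls)
```

The default limit of 450 mirrors the maximum number of images collected per animal.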

Feature Engineering
The dataset was gathered from the internet and organized. It contains many images that are poor in quality or size, and some images may be stored under the wrong class, so the images have to be explored, analyzed, and corrected to make the dataset adequate for building a deep learning model. Feature engineering is used for this purpose, applying several image processing techniques. The dataset must be balanced to create a model that classifies accurately. Once the dataset is balanced, it can be inspected using various visualization techniques. Matplotlib is a Python library for visualization and plotting, but it sometimes takes considerable processing time, so a lightweight library built on Matplotlib is used for better visualization with less processing time. The first preprocessing step is image sharpening and restoration. The image is enhanced by adjusting brightness, contrast, resolution, and saturation factors that affect quality, using OpenCV, a Python library for handling image data. Image restoration is then performed, improving the overall quality of the data by bringing the sharpness and appearance of each image to the required level.
Gaussian blur is then applied to the image, which smooths the image and its background so that the details useful to the deep learning models stand out. Finally, color correction is performed: the RGB image is converted to grayscale so that the classification algorithm is not distracted by details that are not essential.
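The last two preprocessing steps, Gaussian blur followed by grayscale conversion, can be illustrated with a NumPy-only sketch. In an OpenCV pipeline these steps correspond to `cv2.GaussianBlur` and `cv2.cvtColor`; the implementation below is a stand-in written out by hand to show what those operations compute. The 5x5 kernel, sigma of 1.0, and the random test image are assumed values.

```python
import numpy as np


def gaussian_kernel(size: int = 5, sigma: float = 1.0) -> np.ndarray:
    """Normalized 2-D Gaussian kernel of odd side length `size`."""
    ax = np.arange(size) - (size - 1) / 2.0
    kernel_1d = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel_2d = np.outer(kernel_1d, kernel_1d)
    return kernel_2d / kernel_2d.sum()


def gaussian_blur(image: np.ndarray, size: int = 5, sigma: float = 1.0) -> np.ndarray:
    """Blur an (H, W, C) image by convolving each channel with a Gaussian kernel."""
    kernel = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(image.astype(np.float64), ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    height, width = image.shape[:2]
    out = np.zeros(image.shape, dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            out += kernel[dy, dx] * padded[dy:dy + height, dx:dx + width, :]
    return out


def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Weighted sum of the R, G, B channels (0.299, 0.587, 0.114)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb.astype(np.float64) @ weights


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # placeholder RGB image
blurred = gaussian_blur(image)
gray = to_grayscale(blurred)
print(blurred.shape, gray.shape)
```

Blurring before the grayscale conversion follows the order given in the text; because both operations are linear, swapping them would give the same result.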

Experiments
For model creation, three different algorithms with transfer learning are used to produce an accurate classification model that performs reliably in a real-time environment, irrespective of the climate, lighting, and other circumstances that affect the monitoring of wild animal intrusion.

DenseNet 201
The first algorithm used for model training is DenseNet 201, a deep learning algorithm based on the Convolutional Neural Network (CNN) that addresses several issues where a plain CNN does not perform well. DenseNet is chosen because, in a plain CNN, information passing through the connections fades as the number of hidden layers increases, resulting in information loss in deeper networks. DenseNet handles this issue by giving every layer direct access to the gradient and ensuring that information flows to all the layers, which makes it possible to build networks with 100 to 1000 layers.
In a conventional network with L layers, there are L connections, one between each layer and the next. In a DenseNet, each layer is also connected to all subsequent layers, giving L(L+1)/2 direct connections. Because a DenseNet needs fewer parameters than comparable models, a model with more than 100 layers can be trained easily using this method.
In a conventional convolutional neural network, data flows from one layer to the next, passing through the layer's computation and batch normalization before the activation function is finally applied.
Equation (2) shows the data used from the previous layer, where H is the transformation learned by the new layer and x is the information from the previous layer. DenseNet works in a similar way, but instead of summing the outputs of the layers it concatenates them:
$$x_l = H_l([x_0, x_1, x_2, x_3, \ldots, x_{l-1}])$$
The above equation shows that DenseNet concatenates the inputs from all preceding layers, that is, $x_0, x_1, x_2, x_3, \ldots, x_{l-1}$, and $H_l$ is the transformation learned by layer $l$. Since the number of feature maps increases with each layer, the growth is calculated using this formula.
This equation is used for calculating the growth rate, where k is a hyperparameter. A transition layer is then developed, which reduces the spatial dimensions, helping to capture more details of the images while keeping the feature-map count manageable for efficient computation. It is placed between the dense blocks and applies batch normalization and pooling.
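The connection count and the growth of the feature maps can be checked with a few lines of arithmetic. The sketch assumes the standard DenseNet bookkeeping, in which layer l of a dense block receives k_0 + k(l - 1) feature maps, where k_0 is the number of input channels and k is the growth rate. The values k_0 = 64 and k = 32 are illustrative assumptions, not settings reported for this model.

```python
def dense_connections(num_layers: int) -> int:
    """Direct connections in a densely connected block of num_layers layers."""
    return num_layers * (num_layers + 1) // 2


def input_feature_maps(layer_index: int, k0: int, growth_rate: int) -> int:
    """Feature maps entering layer `layer_index` (1-based): k0 + k * (l - 1)."""
    return k0 + growth_rate * (layer_index - 1)


for num_layers in (5, 100):
    print(num_layers, "layers:", num_layers, "sequential links vs",
          dense_connections(num_layers), "dense links")

k0, k = 64, 32  # assumed illustrative values
growth = [input_feature_maps(layer, k0, k) for layer in range(1, 6)]
print(growth)
```

The linear growth in input width is the reason transition layers are placed between dense blocks to keep the feature-map count in check.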
The algorithm is loaded with the data and trained for 40 epochs with 128 layers, using ReLU activation for the hidden layers and the SoftMax activation function for the output layer. The model produced an accuracy of 82% and a loss of 85%, which is not adequate for real-time classification.

ResNet 50
The next algorithm used is ResNet 50, one of the better-performing members of the residual neural network family. It addresses a well-known problem in which adding more layers after a particular accuracy has been reached causes overfitting or underfitting, and the accuracy curve flattens. ResNet overcomes this by creating shortcut connections between layers, which mitigates the vanishing gradient and reduces model loss. The algorithm learns the details of the dataset in five high-level steps.
In the first two stages, the algorithm is given the dataset with the respective labels, and this is passed to the convolutional layers, where low-level features of the images are extracted. These layers pass the data from one layer to the next, and the model learns new features at each layer, which improves its accuracy.
Residual blocks are then added to the convolutional layers, making it possible to connect the input of a block directly to its output and so create residual mappings. Next, the feature relevance of the data is determined and the spatial dimensions are reduced using global average pooling.
Finally, the fully connected layers compute the overall output from the previous layer, and the SoftMax activation function in the output layer provides the final classification. Here, the model is trained with a general dataset of animals so that it can distinguish wild animals from domestic animals.
The dataset for domestic animals is therefore collected from Kaggle, and its data is converted to the class format used for the wild animals so that it can be used for this model development. The algorithm is then fitted to the data for 100 epochs, and the model produced a good accuracy of 96%.
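The shortcut connection, global average pooling, and SoftMax output described above can be illustrated with a toy forward pass. This is a simplified sketch, not ResNet 50 itself: dense matrix products stand in for the convolutions, the weights are random, and the two-class output (wild versus domestic) and all array sizes are assumptions.

```python
import numpy as np


def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)


def residual_block(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """y = ReLU(F(x) + x): the shortcut adds the block input to its output."""
    fx = relu(x @ w1) @ w2
    return relu(fx + x)


def global_average_pool(feature_maps: np.ndarray) -> np.ndarray:
    """Average each channel over its spatial dimensions: (H, W, C) -> (C,)."""
    return feature_maps.mean(axis=(0, 1))


def softmax(logits: np.ndarray) -> np.ndarray:
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return exp / exp.sum()


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4, 8))               # toy feature map: 4x4 spatial, 8 channels
w1 = rng.normal(scale=0.1, size=(8, 8))
w2 = rng.normal(scale=0.1, size=(8, 8))
w_fc = rng.normal(scale=0.1, size=(8, 2))    # fully connected layer, 2 assumed classes

y = residual_block(x, w1, w2)
pooled = global_average_pool(y)
probs = softmax(pooled @ w_fc)
print(pooled.shape, probs)
```

When the block's weights are zero, the output reduces to ReLU(x), which shows why the shortcut keeps information and gradients flowing even if a block learns nothing.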

Figure 8. Confusion Matrix for ResNet50 Model
From Figure 8, it is inferred that the model produced accurate predictions on the test data. The model is then tested on sample data, randomly taken from the internet, to see how it classifies real-time images.

YOLOv3
The final algorithm used for model creation is YOLOv3, which is extremely fast and accurate. It is one of the popular pre-trained models for real-time recognition, especially with live feeds as input. YOLO is widely used in many sectors. It was chosen for its efficiency, performance, accuracy, and recognition speed. YOLO evolved to overcome some flaws of algorithms of the deformable parts model kind. Those algorithms use different models for the required functions, such as feature extraction, classification of the regions, and finally creation of the bounding boxes for the prediction, and therefore cost more CPU utilization and time and reduce prediction speed. They also classify the image by dividing it into many frames, which results in more memory utilization. YOLO overcame these issues by dividing the whole image into a grid and finding the details in the image in a single pass. Each grid cell is treated as a feature and yields a confidence score. A threshold value is set, the confidence score of each grid cell is calculated, and the cells with the maximum values above the threshold determine the prediction for the overall image, given the training data.
To develop a YOLOv3 model, the dataset must be labelled with the exact locations and details of the objects in each image. The dataset must therefore be annotated, which means the features have to be marked in the image. Various tools for annotating images are available online. In this case, an image annotator is used that loads all the files in a specific folder and allows the user to draw a rectangular box around the object to be detected and label it with the class name. This process must be done manually for the images in each class, and it creates a unique XML file for each image that describes the image in detail. Finally, this dataset is loaded into the YOLO model along with the XML files and data from ImageNet, an openly available dataset of general images, which makes classifying common objects easier and reduces training and prediction time. The model is trained for 100 epochs and produces an accuracy of 98% with minimum loss for the given wild animals, making it well suited to real-time footage analysis and detection of wild animal intrusion.
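Reading such an annotation file back for training can be sketched as follows. The XML layout shown here (an `object` element with a `name` and a `bndbox`) is an assumed, commonly used structure and may differ from the files produced by the annotator used in this work. The conversion to a normalized center/width/height box reflects a convention often used for YOLO training labels and is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for one image; file name and coordinates are made up.
SAMPLE_XML = """
<annotation>
  <filename>tiger_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>tiger</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>420</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>
"""


def parse_annotation(xml_text):
    """Return image width, height, and a list of (label, (xmin, ymin, xmax, ymax))."""
    root = ET.fromstring(xml_text.strip())
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    objects = []
    for obj in root.findall("object"):
        box = tuple(int(obj.findtext(f"bndbox/{tag}"))
                    for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.findtext("name"), box))
    return width, height, objects


def to_yolo_box(box, width, height):
    """Convert corner coordinates to normalized (center_x, center_y, w, h)."""
    xmin, ymin, xmax, ymax = box
    return ((xmin + xmax) / 2 / width, (ymin + ymax) / 2 / height,
            (xmax - xmin) / width, (ymax - ymin) / height)


width, height, objects = parse_annotation(SAMPLE_XML)
label, box = objects[0]
yolo_box = to_yolo_box(box, width, height)
print(label, box, yolo_box)
```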

Figure 12. Detection of Sample Data by YOLO Model
Figure 12 shows the model detecting a tiger intrusion in the feed collected from the IoT device placed in the respective area. Within a minute of the detection, the model sends an alert notification to the notification console in the forest department and also sends an SMS to the field officer who is on duty at the time of detection.
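The step from a detection to an alert can be sketched as a simple filter over the detector's output. The detection format (label, confidence), the 0.5 threshold, the zone identifier, and the alert fields are all hypothetical; delivery to the console or by SMS is outside the scope of the sketch.

```python
from datetime import datetime, timezone

# Species named in the dataset description.
WILD_CLASSES = {"lion", "tiger", "cheetah", "elephant", "bear", "jaguar", "leopard"}


def build_alerts(detections, zone, threshold=0.5):
    """Keep wild-animal detections at or above the threshold and format alerts."""
    alerts = []
    for label, confidence in detections:
        if label in WILD_CLASSES and confidence >= threshold:
            alerts.append({
                "zone": zone,
                "animal": label,
                "confidence": round(confidence, 2),
                "time": datetime.now(timezone.utc).isoformat(),
            })
    return alerts


# Hypothetical detector output for one frame.
detections = [("tiger", 0.93), ("dog", 0.88), ("leopard", 0.31)]
alerts = build_alerts(detections, zone="zone-A")
print(alerts)
```

Storing the zone and timestamp with each alert is what would allow the web application to show past intrusions per residential zone.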

Result and discussion
The suggested approach uses a hybrid deep learning model made up of three algorithms: DenseNet 201, YOLOv3, and ResNet 50. These algorithms produce a range of accuracy levels, which are evaluated to determine the most efficient method. DenseNet 201 achieves an accuracy of 82%. As indicated in Table 2, YOLO achieves an accuracy of 98%, whereas ResNet50 achieves 92%. Comparing the accuracy rates, YOLO, with 98%, is the most efficient. Its output is then transmitted to the IoT module to alert users when wild animals enter the residential area. The cloud module is used for data deployment and maintenance.

Conclusion
In today's fast-paced world, life is hard for living beings.
To sustain their own well-being, people first converted cultivated land into real estate. Forest areas, the habitats of wild animals, came next and were turned into human habitats. The problem therefore arises from the migration of wild animals, left without habitats, into residential areas. It is addressed by developing a system that alerts the inhabitants of the residential areas. The proposed system uses IoT to alert people through signals and alarms. The animals are detected by a deep learning model that comprises three algorithms: DenseNet 201, ResNet50, and YOLOv3. The hybrid model produced different levels of accuracy, which were assessed to determine the efficiency of each algorithm. The accuracy of DenseNet 201 is 82%, that of YOLO is 98%, and that of ResNet50 is 92%. As a result, YOLO is regarded as the most efficient, since it has the highest accuracy of the algorithms compared. The IoT device uses the output of the model to transmit the signal to the occupants. Although the suggested method addresses the problem of intruding wild animals, it is our responsibility as humans to protect and maintain these creatures' natural habitats without exploiting them for personal gain.

Future work
The proposed system comprises three modules, with deep learning as the primary module that predicts intrusion using the trained dataset. The intrusion of wild animals must be predicted with a well-trained dataset, which is fed from the cloud module. The efficiency of the deep learning module is improved by using a hybrid model comprising three algorithms. The efficiency of the model can be improved further by enlarging the dataset, that is, by adding more animal images and training the system on them. In addition, reengineering the system can yield more accurate results and better prediction strategies.

Figure 2. Data Flow Diagram for Vision Images Using IoT Device

Figure 3. Images of Different Animal Species in the Dataset
Figure 3 depicts images of the different wild animal species collected in the dataset. It shows that some images are not clear, so the images must undergo preprocessing [25].

Figure 4. A Sample Data After Image Processing
Figure 4 illustrates a data sample after image processing using OpenCV. It undergoes image reshaping, resizing, and Gaussian blur, and finally it is converted to grayscale to fit the data to the algorithm.

Figure 5 exhibits the DenseNet model with 5 layers combining the Batch Normalization and Rectified Linear Unit (BN-ReLU) convolution network. The dense blocks and transition blocks are then created; each dense block contains many closely connected layers. The output of each layer is fed forward to all subsequent layers. The important use of this process is to improve gradient flow and the reuse of parameters. This reduces the vanishing gradient problem.

Figure 7. DenseNet 201 Model Accuracy and Loss with Increase in Epoch
Figure 7 shows that the model accuracy increases with the epochs, finally reaching 82%. Similarly, the loss decreases with the epochs, ending at 85%, which is poor for real-time use.

Figure 9. ResNet 50 Model Prediction
Figure 9 shows the ResNet 50 model prediction with sample data. It correctly classifies the wild animals and the normal animals that are fed as test images. But this kind of prediction will be good for image data and general

Figure 10. Sample Probability Class Map of YOLO

Figure 11. XML File Created After Annotation

Table 1. Count of Images for Each Animal Species

Table 1 enumerates the image count for each animal species. From Table 1, it is inferred that the number of images varies considerably between species, which shows that the dataset is imbalanced. The image data generator (ImageDataGenerator) is a utility in TensorFlow that generates new images from an existing dataset. It is mostly used to balance a dataset by creating new images for classes where collecting data is hard. In this case, the species Bear, Leopard, and Cheetah have fewer data samples than the others. Here, the image data generator is used to create relevant images, which are added to the existing data for model creation.
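How many images need to be generated for each species can be worked out before running the generator. The counts below are hypothetical placeholders chosen within the stated range of 50 to 450 images per animal; they are not the values of Table 1. Balancing every class up to the size of the largest class is also an assumption about the target.

```python
import math

# Hypothetical per-species image counts (not the values from Table 1).
counts = {
    "lion": 450, "tiger": 420, "elephant": 400, "jaguar": 300,
    "bear": 90, "leopard": 80, "cheetah": 50,
}


def augmentation_plan(counts):
    """For each class, compute how many new images bring it up to the largest class."""
    target = max(counts.values())
    plan = {}
    for species, n in counts.items():
        deficit = target - n
        plan[species] = {
            "new_images": deficit,
            # Augmented variants needed per original image, rounded up.
            "per_original": math.ceil(deficit / n),
        }
    return plan


plan = augmentation_plan(counts)
for species, entry in plan.items():
    print(f"{species:9s} +{entry['new_images']:3d} images "
          f"({entry['per_original']} per original)")
```

Classes that need many variants per original image, such as the smallest class here, are the ones most likely to benefit from collecting additional real images as well.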