Gait Data-Driven Analysis of Parkinson’s Disease Using Machine Learning

INTRODUCTION: Parkinson's disease is a progressive and complex neurological condition that mostly affects coordination and motor control. It is most commonly associated with its motor symptoms, which include tremors, bradykinesia (slowness of movement), rigidity, and postural instability. OBJECTIVES: To detect minor alterations in walking patterns that could be early signs of Parkinson's disease, and to track the course of the disease over time using gait data. METHODS: In this study, we applied three types of VGRF datasets (Dual Tasking, RAS, and Treadmill Walking) and developed an ML-based model using six different classifier methods. The datasets were recorded using 16 sensors, 8 under each foot, together with the total pressure of the left and right foot. The datasets came from studies of these three distinct gait patterns in movement disorders. The gait signal datasets were supplemented with participant demographic data. RESULTS: We applied the model and measured its performance through a cross-validation operator to check the accuracy and decision-making of five algorithms: i) Deep Learning, ii) Neural Networks, iii) Support Vector Machine (SVM), iv) Gradient Boost Tree (GBT), and v) Random Forest. The findings compare the effectiveness of these algorithms in detecting PD. CONCLUSION: The different ML classifier algorithms demonstrated good detection capability with varying accuracy. Our proposed ensemble model compares favourably with the existing models, achieving better accuracy than the other classifier models: the highest accuracy among the other classifiers is 92.08%, whereas our ensemble model reached 92.31%. This indicates that the proposed ensemble model is robust.


Introduction
Parkinson's disease is a progressive and complex neurological condition that mostly affects coordination and motor control. The medical profession has given the disease a great deal of attention since the English physician James Parkinson first described it in 1817 [1]. Parkinson's disease, one of the most common neurological illnesses, is estimated to afflict millions of individuals worldwide. A feature of Parkinson's disease is the loss of dopamine-producing cells in the substantia nigra. Dopamine is a neurotransmitter that helps brain cells communicate with one another, which is necessary for coordinating smooth and deliberate movement [2]. When dopamine-producing cells are compromised, patients' quality of life is significantly impacted by both motor and non-motor symptoms. Parkinson's disease is most commonly associated with its motor symptoms, which include tremors, bradykinesia (slowness of movement), rigidity, and postural instability. However, it is now understood that Parkinson's disease is a complex condition that can cause a wide range of additional symptoms, including issues with cognition, emotion, sleep, and the autonomic nervous system [3] [4]. The interaction of these symptoms adds to the complexity of managing the condition and its treatment. Parkinson's disease currently has no known cure, although there are treatments that can help control symptoms and offer comfort to those who have the condition. Medication, physical therapy, lifestyle modifications, and occasionally surgery are used to manage symptoms and slow the disease's course.
This study aims to identify patients with Parkinson's disease (PD) by analysing gait data. The analysis was conducted using data from PhysioNet's Gait Analysis Database. The VGRF records capture the forces (in newtons) under the feet as a function of time while participants walked for around two minutes. Each subject has a total of sixteen sensors, eight placed on the bottom of each foot, to measure the pressure of the left and right foot over time (in seconds).

Related Work
Following a review of the pertinent literature, we focus on the use of gait analysis to differentiate between individuals with Parkinson's disease and healthy individuals. As the primary regulator in humans, the brain is crucial, and damage to even one component can have far-reaching effects. One such instance is Parkinson's disease (PD) [5], a neurological disorder that mostly affects people over 50 [6]. Its onset is gradual, and its symptoms progress slowly. PD symptoms fall into two primary categories, motor and nonmotor [7]. Nonmotor symptoms include mood problems, anxiety, depression, and cognitive dysfunction; motor symptoms include movement disorders, shaking, walking difficulty, stiffness, and postural instability.
Multiple technologies have been applied to distinguish PD patients from healthy controls. In one study, the feet of PD patients and healthy individuals were fitted with wireless inertial sensor systems (AHRS) in order to identify aberrant gait characteristics. To classify PD patients with or without gait impairment and healthy individuals, the authors examined the physical kinematic aspects of pitch, roll, and yaw rotations of the foot during walking [8]. Barth et al. showed that a single mobile inertial sensor placed over the shoes has high sensitivity (88%) and specificity (86%) in differentiating between healthy individuals and patients with early Parkinson's disease [9]. Another work suggested using a video infrared camera system to collect data for a Bayesian gait detection method. The collected data were used to create a picture frame matrix, skeleton numbering, a depth frame matrix, and a depth frame contour, and MATLAB was then employed for further analysis [10]. Alam et al. proposed a mathematical method to evaluate gait in people with PD and healthy controls, in which eight sensors were placed under each foot and features were extracted from vertical ground reaction force (VGRF) data acquired while individuals walked. The gait patterns of healthy people and PD patients were evaluated using different machine learning classifiers, with an average accuracy of 85.21% for k-NN and 95.7% for SVM, a sensitivity of 94.4%, and a specificity of 96.6% [11]. Automatic diagnosis of bradykinesia, a cardinal symptom of PD, was the primary focus of the study by Samà et al. They suggested a mathematical approach based on an SVM classifier and obtained an accuracy of more than 90%, a sensitivity of 92.52%, and a specificity of 89.07% [12]. Using the triaxial accelerometer found in the LG Optimus S smartphone, Arora et al. researched the viability and accuracy of a mobile app to objectively assess PD patients and differentiate them from healthy participants. After observing PD patients and controls for a month and performing gait tests using the accelerometer, they extracted 23 features in the frequency and time domains and used a random forest method to differentiate between PD patients and controls. The system achieved a combined accuracy of 98%, with a sensitivity of 98.5% and a specificity of 97.6% [13].
Dopaminergic-depleted PD patients walked more slowly, with shorter stride lengths, the same cadence, and longer double support durations compared to healthy control subjects [14]. Dopamine loss in the substantia nigra can be identified in MRI images with the application of ML algorithms. However, due to the high price of MRI-based PD diagnosis, gait-based PD analysis has become increasingly popular as an alternative. Voice analysis [15], foot pressure systems [16], RGB-D cameras [17], optoelectronic motion analysis systems [18], and wearable sensors such as accelerometers or inertial measurement units have all been studied using ML techniques in the medical field.

Methodology
In this study, we applied three types of VGRF datasets (Dual Tasking, RAS, and Treadmill Walking) and developed an ML-based model using six different classifier methods. The datasets were recorded using 16 sensors, 8 under each foot, together with the total pressure of the left and right foot. The subjects were directed to walk at their normal walking speed for two to five minutes over distances of twenty-five to seventy-seven metres, under conditions that included dual tasking, a motorised treadmill, a wheeled walker, and Rhythmic Auditory Stimulation (RAS). It is important to note that only people with Parkinson's disease and healthy controls were included in the three studies; people with other walking difficulties were not included [19]. As per our proposed model in Figure 3, we collected a total of 32,467 data records from the PhysioNet website. We then prepared the data with selected attributes and used the Ensemble model with Deep Learning, Neural Networks, SVM, Decision Tree, and Random Forest to classify the VGRF data of PD patients. Finally, we measured and evaluated the accuracy, kappa, and specificity of the proposed model in order to quantify the efficacy of PD detection through gait data analysis.
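As a rough illustration of the record layout described above (time, eight left-foot sensors, eight right-foot sensors, and the two foot totals), the following sketch parses VGRF rows into named columns. The column names, the tab delimiter, and the sample values are assumptions made for illustration; they are not taken from the study's files.

```python
import csv
import io

# Assumed column layout: time, L1-L8, R1-R8, total left, total right (19 columns).
# The names are hypothetical labels chosen for this sketch.
COLUMNS = (
    ["time"]
    + [f"L{i}" for i in range(1, 9)]
    + [f"R{i}" for i in range(1, 9)]
    + ["total_left", "total_right"]
)


def parse_vgrf(text, delimiter="\t"):
    """Parse delimited VGRF text into a list of dicts keyed by COLUMNS."""
    rows = []
    for raw in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not raw:
            continue  # skip blank lines
        if len(raw) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(raw)}")
        rows.append(dict(zip(COLUMNS, map(float, raw))))
    return rows


def make_row(t, left, right):
    """Build one synthetic row; the totals are the sums of the eight sensors."""
    values = [t] + left + right + [sum(left), sum(right)]
    return "\t".join(str(v) for v in values)


# Synthetic two-row sample (made-up values, not patient data).
sample = "\n".join([
    make_row(0.00, [10.0] * 8, [0.0] * 8),
    make_row(0.01, [12.5] * 8, [1.0] * 8),
])
records = parse_vgrf(sample)
print(len(COLUMNS), records[1]["total_left"])
```

Keeping the column names in one list makes it easy to select a subset of attributes before training a classifier.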

Ensemble Model in Machine Learning
In machine learning, an ensemble model is a method that takes the output of numerous distinct models and uses it to generate a single prediction that is both more accurate and more robust. The fundamental premise underlying ensemble approaches is that by integrating the benefits of many models, the total performance can be enhanced and the shortcomings of individual models can be compensated for. From classification and regression to anomaly detection and beyond, ensemble approaches have proven to be highly effective in the machine learning field. Ensemble methods can be broadly categorised into two main types: bagging and boosting.
• Bagging: Also known as Bootstrap Aggregating, bagging involves training multiple models on different subsets of the dataset. To create these subsets, the training data is usually sampled at random with replacement. Every model is trained independently; during prediction, the outcome is determined by averaging (in regression) or voting (in classification) over the predictions made by the several models. Random Forest is a popular ensemble approach in the bagging style. It combines multiple decision trees to create a model that is accurate and resistant to overfitting.
• Boosting: Boosting trains a large number of weak models (models that perform marginally better than random guessing) sequentially, giving more weight to the cases that were misclassified by earlier models. This adaptive approach is designed to improve model accuracy by addressing shortcomings in previous iterations. Well-known boosting techniques include XGBoost (Extreme Gradient Boosting), Gradient Boosting (GB), and AdaBoost (Adaptive Boosting).
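The two ideas in the list above can be sketched in a few lines: bootstrap resampling (the sampling step of bagging) and one AdaBoost-style reweighting step (the step that shifts attention to misclassified cases). This is a minimal illustration with made-up data, not the configuration used in this study; the function names are hypothetical.

```python
import math
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items uniformly at random with replacement (bagging)."""
    return [rng.choice(data) for _ in data]


def adaboost_reweight(weights, misclassified):
    """Perform one AdaBoost-style weight update.

    weights: current sample weights, assumed to sum to 1.
    misclassified: booleans, True where the current weak model was wrong.
    Returns (alpha, new_weights), where alpha is the weak model's vote weight.
    """
    error = sum(w for w, miss in zip(weights, misclassified) if miss)
    if not 0.0 < error < 1.0:
        raise ValueError("weighted error must be strictly between 0 and 1")
    alpha = 0.5 * math.log((1.0 - error) / error)
    scaled = [
        w * math.exp(alpha if miss else -alpha)
        for w, miss in zip(weights, misclassified)
    ]
    total = sum(scaled)
    return alpha, [w / total for w in scaled]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = list(range(10))  # stand-in for ten training records
bags = [bootstrap_sample(data, rng) for _ in range(3)]

# Four equally weighted samples, the first one misclassified.
alpha, new_weights = adaboost_reweight([0.25] * 4, [True, False, False, False])
print(bags[0], new_weights)
```

After the update, the misclassified sample carries half of the total weight, so the next weak model is pushed to get it right.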
An ensemble method integrates the predictions from several separate models to provide a final prediction that is more reliable and accurate. Let T be the total number of base models, and let M1, M2, ..., MT be the individual base models in the ensemble. Each base model Mt is associated with a hypothesis function ht(X), which receives the input features X and generates a prediction yt. For a given input X, each base model Mt predicts a class label yt. The final prediction of the ensemble (y_ensemble) is obtained by combining the predictions of the base models. In a majority voting process, the class label that receives the most votes is chosen as the final prediction.
Owing to their capacity to tackle difficult tasks and to provide predictions that are more robust and accurate than those of a single model, ensemble methods have become commonplace in modern machine learning.
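The majority voting rule described above can be sketched as follows. The three base models are hypothetical threshold rules on made-up gait features (the feature names and cut-offs are assumptions for illustration only), standing in for the hypothesis functions h1, ..., hT.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label that occurs most often among the base predictions."""
    if not labels:
        raise ValueError("at least one prediction is required")
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(models, x):
    """Apply every base model h_t to x and combine the labels by majority vote."""
    return majority_vote([h(x) for h in models])


# Three hypothetical base models; thresholds are illustrative, not clinical values.
models = [
    lambda x: "PD" if x["stride_time"] > 1.2 else "healthy",
    lambda x: "PD" if x["stride_var"] > 0.05 else "healthy",
    lambda x: "PD" if x["speed"] < 0.9 else "healthy",
]

sample = {"stride_time": 1.3, "stride_var": 0.02, "speed": 0.8}
prediction = ensemble_predict(models, sample)  # votes: PD, healthy, PD
print(prediction)
```

Using an odd number of base models for a two-class problem avoids tied votes.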

Data Description
Gait data comprises an extensive array of measurements and information about an individual's ambulatory pattern. This resource provides significant insights into diverse facets of human movement and health. Gait data encompasses various factors, including step length, step width, walking speed, cadence (measured in steps per minute), and stride duration. These metrics collectively offer a comprehensive depiction of an individual's walking pattern. The utilisation of sophisticated technologies such as wearable sensors, inertial measurement units, and motion capture systems facilitates the precise acquisition and examination of gait data. We collected the VGRF signal pairs for three different walking patterns: i) Dual-Tasking, ii) Rhythmic Auditory Stimulation (RAS), and iii) Treadmill Walking.
• Dual-Tasking: Dual-tasking refers to the ability to perform two different tasks simultaneously, which involves cognitive and motor coordination. In the context of Parkinson's disease, it presents a significant challenge due to the complex interplay of motor and cognitive impairments that characterise the condition. Parkinson's disease often leads to difficulties in executing smooth and controlled movements, known as motor deficits, as well as cognitive impairments that can affect attention, memory, and executive functions. When individuals with Parkinson's disease engage in dual-task activities, such as walking while talking or carrying an object, their limited cognitive resources and compromised motor control can result in impaired performance. This phenomenon is known as dual-task interference.
• Rhythmic Auditory Stimulation (RAS): RAS is a therapy that uses rhythm and music to help people with neurological diseases, especially Parkinson's disease, improve their motor coordination and movement. RAS involves coordinating physical actions, such as walking or upper-limb exercises, with an external sound stimulus, usually a rhythmic beat or musical rhythm. The rhythmic cues from the auditory input act as an external clock that helps control and improve the consistency of motor actions. In Parkinson's disease, which can make muscle control difficult, RAS has been shown to help improve walking, reduce the times when walking stops, and improve the overall quality of movement. By giving people a structured beat to move to, RAS may tap into neural pathways that support smoother and better synchronised movement. RAS is not only a therapeutic way to deal with motor symptoms; it also illustrates the link between music and the brain's ability to control movement. It is an engaging and non-invasive way to help people with neurological conditions.
• Treadmill Walking: People with Parkinson's disease can benefit from using motorised treadmills because they offer a structured and controlled setting in which to work on motor-related problems. People with Parkinson's disease often have trouble walking and moving around. Motorised treadmills are used in rehabilitation because the speed and incline can be adjusted to help with these problems. Motorised treadmills help improve gait pattern, stride length, and cadence by providing visual and audible cues, such as goals and metronome beats. They also help reduce the times when movement stops. In addition, the constant movement of the treadmill makes it easier to start taking steps, which can be especially helpful for people with cognitive difficulties. People with Parkinson's disease who use motorised treadmills regularly may be able to improve their walking, balance, and general mobility, which could in turn improve their quality of life.
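Gait metrics such as stride duration and cadence, mentioned at the start of this section, can be estimated from a total-force signal by detecting when the foot makes contact with the ground. The sketch below is one simple way to do this under stated assumptions: a fixed 20 N contact threshold (an illustrative value), a 100 Hz sampling rate, and a synthetic square-wave signal in place of real VGRF data.

```python
def heel_strikes(times, force, threshold=20.0):
    """Return the times at which force rises from below to at/above threshold."""
    strikes = []
    for i in range(1, len(force)):
        if force[i - 1] < threshold <= force[i]:
            strikes.append(times[i])
    return strikes


def stride_durations(strikes):
    """Time between consecutive heel strikes of the same foot."""
    return [later - earlier for earlier, later in zip(strikes, strikes[1:])]


def cadence_steps_per_minute(durations):
    """Approximate cadence from one foot, assuming two steps per stride."""
    mean_stride = sum(durations) / len(durations)
    return 2 * 60.0 / mean_stride


SAMPLE_RATE_HZ = 100  # sampling rate assumed for this sketch

# Synthetic total-force signal for one foot: 0.6 s of stance at 700 N followed
# by 0.4 s of swing at 0 N, repeated for five seconds (made-up values).
times = [i / SAMPLE_RATE_HZ for i in range(500)]
force = [700.0 if i % 100 < 60 else 0.0 for i in range(500)]

strikes = heel_strikes(times, force)
durations = stride_durations(strikes)
cadence = cadence_steps_per_minute(durations)
print(strikes, durations, cadence)
```

Real signals are noisier than this square wave, so a practical detector would usually smooth the signal or require the force to stay above the threshold for several samples.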

Data Analysis and Findings
We used data from diagnosed PD patients to assess the significance of the symptoms. Parkinson's disease is characterised by the severity of gait symptoms. The current results are inferior to those published for several ML-based classification models for diagnosis. The decision tree for PD early detection is given below, and the accuracy of the five alternative classifiers and the ensemble algorithm is reported for the model. The proposed model is built from several phases and operators. In this model, we first used the select attributes operator to extract the important attributes from the aforementioned dataset and then used the set role operator to pick a categorical variable from among the important attributes as the label. We then applied the model and measured its performance through a cross-validation operator to check the accuracy and decision-making of the five algorithms: i) Deep Learning, ii) Neural Networks, iii) Support Vector Machine (SVM), iv) Gradient Boost Tree (GBT), and v) Random Forest. The following findings compare the effectiveness of these algorithms in detecting PD.
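The cross-validation step described above can be illustrated with a small k-fold sketch. It is not the operator used in this study; the fold count, the one-feature threshold classifier, and the synthetic stride-time values are all assumptions chosen to keep the example self-contained.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def fit_threshold(train_x, train_y):
    """Hypothetical one-feature classifier: cut halfway between class means."""
    pd_values = [x for x, y in zip(train_x, train_y) if y == "PD"]
    healthy_values = [x for x, y in zip(train_x, train_y) if y == "healthy"]
    cut = (sum(pd_values) / len(pd_values)
           + sum(healthy_values) / len(healthy_values)) / 2
    return lambda x: "PD" if x > cut else "healthy"


def cross_validate(features, labels, fit, k=5):
    """Return the mean test accuracy over k folds."""
    scores = []
    for train, test in k_fold_indices(len(labels), k):
        predict = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(features[i]) == labels[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


# Synthetic stride times in seconds (made-up, well separated by construction).
features = [1.00 + 0.01 * i for i in range(10)] + [1.30 + 0.01 * i for i in range(10)]
labels = ["healthy"] * 10 + ["PD"] * 10
mean_accuracy = cross_validate(features, labels, fit_threshold, k=5)
print(mean_accuracy)
```

Each record is used for testing exactly once, so the averaged score reflects performance on data the model did not see during training.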

Findings
The figure above shows the results of the six ML algorithms in the gait data analysis. The Deep Learning accuracy is 92.08% with a kappa of 0.87; the Neural Networks accuracy is 86.34% with a kappa of 0.78; the Support Vector Machine (SVM) accuracy is 84.62% with a kappa of 0.81; the Random Forest accuracy is 88.42% with a kappa of 0.82; the Gradient Boosted Trees (GBT) accuracy is 91.50% with a kappa of 0.86; and the Ensemble model shows the highest accuracy, 93.21%, with a kappa of 0.83. Based on these results, we consider our Ensemble model robust for gait data analysis in Parkinson's Disease (PD). The following figures show the classification of healthy subjects and PD patients from gait data for each walking condition. Figure 5 shows the Dual-Tasking left and right foot signals used to identify healthy subjects and PD patients. Figure 6 shows the RAS left and right foot signals used to identify healthy subjects and PD patients.
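For reference, accuracy, Cohen's kappa, sensitivity, and specificity can all be computed from a two-class confusion matrix. The sketch below uses made-up counts, treating PD as the positive class; the numbers are not results from this study.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute accuracy, sensitivity, specificity and Cohen's kappa.

    tp/fn: PD cases predicted as PD / as healthy.
    fp/tn: healthy cases predicted as PD / as healthy.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Agreement expected by chance, from the row and column marginals.
    p_pd = ((tp + fn) / n) * ((tp + fp) / n)
    p_healthy = ((fp + tn) / n) * ((fn + tn) / n)
    expected = p_pd + p_healthy
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical confusion matrix with 50 PD and 50 healthy cases.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
print(metrics)
```

Kappa corrects accuracy for agreement expected by chance, which is why a model can rank higher on accuracy yet lower on kappa than another model.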

Figure 1: Different walking patterns for VGRF. Vertical Ground Reaction Force (VGRF) at a 100 Hz sampling rate is shown in the figure.

Figure 2: Sample dataset of VGRF. The data contain 19 attributes: the first attribute is the time in seconds, L1 to L8 are the left-foot sensor readings (in newtons), R1 to R8 are the right-foot sensor readings (in newtons), TLP is the total pressure of the left foot, and TRP is the total pressure of the right foot.
Figure 3: Proposed Model

EAI Endorsed Transactions on Pervasive Health and Technology | Volume 10 | 2024 |

Figure 5: VGRF Dual Tasking Left and Right Foot
Figure 6: Rhythmic Auditory Stimulation (RAS)

Conclusion
In this study, we used machine learning methods to investigate the relationship between gait data and Parkinson's disease and to develop a model for identifying the disease. The training data is denoted by {(X1, y1), ..., (Xn, yn)}, and the classifiers Deep Learning, Neural Networks, Support Vector Machine (SVM), Random Forest, Gradient Boost Tree (GBT), and the Ensemble Model are used in our study. The classification of Parkinson's disease is critical for gaining a better understanding of the disease's causes, initiating treatment, and developing appropriate therapies. Based on the gait data, this study proposed an empirical model for automatically distinguishing between healthy subjects and PD patients. The different ML classifier algorithms demonstrated good detection capability with varying accuracy. Our proposed ensemble model compares favourably with the existing models, achieving better accuracy than the other classifier models: the highest accuracy among the other classifiers is 92.08%, whereas our ensemble model reached 92.31%. This indicates that the proposed ensemble model is robust.