EAI Endorsed Transactions on AI and Robotics

Design and Implementation of a Novel Parallel Algorithm for Efficient Image Compression in High-Performance Computing

2024-11-14T10:54:11+00:00

This paper describes the design and architecture of a new parallel algorithm for image compression in a High-Performance Computing (HPC) context. The proposed algorithm applies parallel processing strategies to the image data set to reduce the amount of computation while optimizing compression ratio and speed. By combining new parallelism strategies with both newly developed and existing data compression methods, it achieves a visibly better compression ratio and compression time than comparable existing algorithms. Experimental results from different HPC environments show that the proposed solution is scalable and efficient, so it should be considered for applications where real-time image processing is crucial. The work contributes to the image compression literature, with a special focus on parallel computation.
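The abstract does not specify the partitioning scheme or the codec, so the following is only a minimal sketch of the general idea of data-parallel compression. It assumes an 8-bit grayscale image stored row-major, horizontal strips as the unit of parallel work, and zlib as a stand-in codec; the names `split_rows`, `compress_parallel`, and `decompress_parallel` are hypothetical and not taken from the paper.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_rows(data: bytes, width: int, rows_per_tile: int) -> list[bytes]:
    """Split a row-major 8-bit grayscale image into horizontal strips."""
    stride = width * rows_per_tile
    return [data[i:i + stride] for i in range(0, len(data), stride)]


def compress_parallel(data: bytes, width: int, rows_per_tile: int = 64,
                      workers: int = 4) -> list[bytes]:
    """Compress each strip independently; zlib is an assumed stand-in codec."""
    tiles = split_rows(data, width, rows_per_tile)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so strips can be reassembled later.
        return list(pool.map(zlib.compress, tiles))


def decompress_parallel(chunks: list[bytes], workers: int = 4) -> bytes:
    """Decompress the strips in parallel and concatenate them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, chunks))
```

Compressing strips independently trades some compression ratio for parallelism, because each strip starts with an empty dictionary. The strip height is the knob that balances the two.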

Lightweight Keyword Spotting with Inter-Domain Interaction and Attention for Real-Time Voice-Controlled Robotics

2024-11-19T09:51:05+00:00

This study introduces a novel lightweight Keyword Spotting (KWS) model optimized for deployment on resource-constrained microcontrollers, with potential applications in robotic control and end-effector operations. The proposed model employs inter-domain interaction to effectively extract features from both Mel-frequency cepstral coefficients (MFCCs) and temporal audio characteristics, complemented by an attention mechanism to prioritize relevant audio segments for enhanced keyword detection. Achieving a 93.70% accuracy on the Google Command v2-12 commands dataset, the model outperforms existing benchmarks. It also demonstrates remarkable efficiency in inference speed (0.359 seconds) and resource utilization (34.9KB peak RAM and 98.7KB flash memory), offering a 3x faster inference time and reduced memory footprint compared to the DS-CNN-S model. These attributes make it particularly suitable for real-time voice command applications in low-power robotic systems, enabling intuitive and responsive control of robotic arms, end-effectors, and navigation systems. In this work, however, the KWS model is demonstrated in a simple non-destructive testing system for controlling sensor movement. This research lays the groundwork for advancing voice-activated robotic technologies on resource-limited hardware platforms.

DeepDiabFusion: An Interaction-Aware Neural Network Architecture for Diabetes Prediction

2024-11-30T06:02:24+00:00

Accurate prediction of diabetes onset is essential for effective early diagnosis and clinical intervention. This study presents a performance analysis of several machine learning (ML) algorithms applied to the Pima Indians Diabetes Dataset (PIDD), with a primary focus on a novel Artificial Neural Network (ANN) architecture, referred to as DeepDiabFusion. The proposed model integrates feature-wise normalization, parallel dense sublayers, and an interaction-aware fusion mechanism to capture complex feature relationships often overlooked by conventional models. Comparative experiments were conducted against seven traditional ML algorithms, including Logistic Regression, Random Forest, and Gradient Boosting, as well as state-of-the-art ANN-based models from recent literature. Performance was evaluated using accuracy, precision, recall, and area under the curve (AUC) metrics. The proposed model achieved an accuracy of 93.04%, precision of 86.21%, recall of 93.10%, and AUC of 0.951—outperforming all baseline and previously reported models. These results demonstrate the superior classification performance and practical applicability of the proposed ANN framework in clinical decision support systems for early diabetes detection and management.
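For readers unfamiliar with the four reported metrics, the sketch below shows how they are conventionally computed for a binary classifier. The function names are hypothetical and this is not the authors' evaluation code. AUC is computed here as the fraction of positive/negative pairs ranked correctly, which is equivalent to the area under the ROC curve.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision and recall from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def auc(scores: list, labels: list) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This O(n^2) form is meant for
    illustration, not for large datasets.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```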

Simulation and Control of the KUKA KR6 900EX Robot in Unity 3D: Advancing Industrial Automation through Virtual Environments

2024-12-02T16:22:18+00:00

This study presents the development of a virtual simulation of a KUKA robot within the Unity 3D platform, focusing on its ability to execute pick-and-place operations in an industrial setting. The research emphasizes the importance of digital simulations as cost-effective and safe alternatives to physical prototypes in industrial automation. By replicating robotic tasks in a virtual environment, organizations can mitigate wear and tear on expensive machinery and minimize safety hazards inherent in real-world operations. The simulation process commenced with the creation of a detailed 3D model of the KUKA robot utilizing Creo CAD software. This model was subsequently imported into the Unity 3D environment, where an interactive and realistic simulation environment was constructed. A manual control system was implemented through custom C# scripts, enabling precise joint manipulation via keyboard inputs. While the current control mechanism remains manual, this study provides a foundational framework for the future integration of advanced algorithms for trajectory planning and autonomous control. The simulation successfully demonstrates the feasibility of performing industrial robotic tasks within a virtual environment. It serves as a platform for further research, including the automation of robotic movements and the integration of virtual reality and digital twin technologies. These advancements have the potential to significantly enhance real-time monitoring, operator training, and overall operational efficiency in industrial applications. This work underscores the growing significance of virtual simulation technologies in industrial automation, presenting a scalable and flexible solution for prototyping, testing, and training within complex industrial ecosystems.

Apple Disease Detection and Classification using Random Forest (One-vs-All)

2024-12-04T01:41:51+00:00

Detecting and recognizing fruit diseases is a common problem worldwide. Fruit disease detection is an active research topic and is complex because of similarities in structure and color. In this research, we propose a new model to detect and classify apple diseases using digital image processing and machine learning. First, image processing techniques were applied to enhance image contrast and remove noise, which helped to segment the region of interest accurately and to extract features without spurious data. Then K-means clustering with the fuzzy C-means method was used to segment the images. GLCM (gray-level co-occurrence matrix) features were extracted after segmentation. Real images of apple diseases with multiple diseased regions were used. The extracted features were preprocessed with methods such as LDA. K-fold cross-validation was used for training and testing, in combination with a random forest classifier. The results showed high accuracy compared with existing techniques.
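As a rough illustration of the GLCM step, the sketch below builds a gray-level co-occurrence matrix for a single pixel offset and derives two common texture features from it. The abstract does not say which offsets, gray-level count, or features were used, so the offset `(dx, dy)`, the non-symmetric counting, and the choice of contrast and homogeneity are all assumptions.

```python
def glcm(image: list, levels: int, dx: int = 1, dy: int = 0) -> list:
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy).

    `image` is a list of rows of integer gray levels in [0, levels).
    Pairs are counted in one direction only (non-symmetric GLCM).
    """
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        return counts
    return [[c / total for c in row] for row in counts]


def contrast(p: list) -> float:
    """Sum of p[i][j] * (i - j)^2: large when neighbors differ strongly."""
    return sum(v * (i - j) ** 2 for i, row in enumerate(p) for j, v in enumerate(row))


def homogeneity(p: list) -> float:
    """Sum of p[i][j] / (1 + (i - j)^2): large when neighbors are similar."""
    return sum(v / (1 + (i - j) ** 2) for i, row in enumerate(p) for j, v in enumerate(row))
```

In a pipeline like the one described, such features would be computed on the segmented region and passed on to the classifier.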

Advancing Food Security through Precision Agriculture: YOLOv8’s Role in Efficient Pest Detection and Management

2024-12-04T09:56:28+00:00

In response to the growing global population and the consequent need for sustainable food security, effective pest management is critical for enhancing agricultural productivity. This research presents YOLOv8, a state-of-the-art deep learning model optimized for pest detection in agricultural environments, contributing to modern food security efforts. Evaluated using the complex IP102 dataset, YOLOv8 demonstrated notable improvements in pest detection accuracy, achieving scores of 66.9 mAP@0.5 and 42.1 mAP@[0.5:0.95]. These results underscore YOLOv8’s robust performance across diverse detection scenarios, enabling more precise pest control and reducing crop loss. However, in-depth dataset analysis revealed a bias towards larger pests, likely due to bounding box size variations, which presents an opportunity for model improvement. Future work will focus on addressing data imbalances, enhancing sensitivity to smaller pests, and validating YOLOv8 in varied real-world agricultural settings. These advancements are expected to significantly improve pest management practices, ultimately boosting agricultural productivity and supporting global food security through the application of modern agricultural technologies.
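The two reported scores depend on intersection over union (IoU), which decides whether a predicted box counts as a match for a ground-truth box. The sketch below is a minimal illustration, assuming boxes given as `(x1, y1, x2, y2)` corners; the function names are hypothetical. mAP@0.5 uses a single IoU threshold of 0.5, while mAP@[0.5:0.95] averages over ten thresholds from 0.5 to 0.95 in steps of 0.05, following the COCO convention.

```python
def iou(a: tuple, b: tuple) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Thresholds averaged by mAP@[0.5:0.95]: 0.50, 0.55, ..., 0.95.
COCO_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def is_match(pred: tuple, truth: tuple, threshold: float = 0.5) -> bool:
    """A prediction matches a ground-truth box if IoU reaches the threshold."""
    return iou(pred, truth) >= threshold
```

A fixed pixel offset costs a small box far more IoU than a large one, which is one plausible reason detectors score lower on small objects.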

Enhancing Virtual Reality Experiences in Architectural Visualization of an Academic Environment

2024-12-04T10:12:22+00:00

Virtual Reality (VR) technology possesses the capability to transport users into immersive, alternative environments, providing them with a convincing sense of presence within a simulated world. This project leverages VR to develop an interactive, educational system centered around the De Montfort University (DMU) Queens Building, simulating key facilities and infrastructure through the integration of 360-degree imagery and Adobe Captivate software. Designed in response to contemporary challenges, such as the COVID-19 pandemic, which underscored the need for flexible and innovative learning methodologies, the VR system offers an immersive educational platform enriched with essential information to enhance student engagement and learning outcomes. A comprehensive literature review explored the expanding applications of VR across diverse sectors, including education, healthcare, robotics, and manufacturing. The findings of this review underscored VR's transformative potential in enhancing educational engagement and facilitating a deeper understanding of complex concepts. The project methodology involved meticulously mapping the physical layout of the Queens Building, capturing targeted 360-degree scenarios using a Ricoh Theta V camera, and subsequently transforming these into immersive VR scenarios enriched with interactive hotspots, meticulously synchronized with the building's design layout. The VR system successfully achieved the project's objectives by simulating key educational and informational use cases. It provides students with an alternative learning medium, offering interactive insights into the functionalities of equipment and facilities within the building. Furthermore, the system enables a virtual tour of the DMU campus, facilitating familiarization with the university environment. Findings from the VR application highlight its potential as a dynamic educational tool, positioning it as a valuable complement to traditional learning methods. 
This innovative approach demonstrates the capacity of VR to enhance student understanding, support academic and research pursuits, and ultimately enrich the overall student experience.

From Social Media Reactions to Grades: A Machine Learning-Based SocialNet Analysis for Academic Performance Prediction

2024-12-12T18:34:53+00:00

The impact of social media on student academic performance has garnered significant research interest in recent years. The pervasive use of social networking sites (SNS) among college and university students, both in and outside classrooms, has raised concerns about its potential effects on academic achievement. This study investigates the relationship between social media usage and academic performance through a dataset of 550 participants. Machine learning models, including Random Forest, Decision Trees, and Long Short-Term Memory (LSTM), were employed to analyze and predict the impact of social media on students' academic outcomes. The models were trained using clean and well-engineered data. The results indicate a moderate influence of social media usage on academic performance, with the LSTM model outperforming traditional approaches in predictive accuracy. These findings highlight the importance of considering sequential usage patterns in understanding the academic implications of social media.

A Multi-Channel Spam Detection System Utilizing Natural Language Processing and Machine Learning

2024-12-29T15:58:56+00:00

As digital communication rapidly expands, the issue of unsolicited and unwanted messages, commonly known as spam, has become a major concern. This paper introduces an advanced spam detection system that integrates Natural Language Processing (NLP) and Machine Learning (ML) techniques. The system differentiates between spam and legitimate messages by employing a hybrid model that combines Naive Bayes, Support Vector Machines (SVM), and deep learning models like Bidirectional Encoder Representations from Transformers (BERT). The model demonstrates high effectiveness across various communication platforms, including emails, SMS, and social media, achieving an accuracy exceeding 98.5%.

A Deep Learning Based Optical Character Recognition Model for Old Turkic

2025-01-17T19:07:16+00:00

This study presents the development and evaluation of a deep learning-based optical character recognition (OCR) model specifically designed for recognizing Old Turkic script. Utilizing a convolutional neural network (CNN), the project aimed to achieve high classification accuracy across a dataset comprising 38 distinct Old Turkic characters. To enhance the model’s robustness and generalization capabilities, sophisticated data augmentation techniques were employed, generating 760 augmented images from the original 38 characters. The model was rigorously trained and validated, achieving an overall accuracy of 96.34%. Evaluation metrics such as precision, recall, and F1-scores were systematically analyzed, showing superior performance in most classes while identifying areas for further optimization. The results underscore the effectiveness of CNN architectures in specialized OCR tasks, demonstrating their potential in preserving and digitizing historical scripts. This study not only advances the field of document analysis and OCR but also contributes to the digital preservation and accessibility of ancient scripts.

A Comprehensive Approach to Indian Sign Language Recognition: Leveraging LSTM and MediaPipe Holistic for Dynamic and Static Hand Gesture Recognition

2025-02-12T18:31:31+00:00

Recognizing Indian Sign Language (ISL) gestures effectively is crucial for improving communication accessibility for the deaf community. This study introduces an approach that integrates a Sequential Long Short-Term Memory (LSTM) model with MediaPipe Holistic for accurate, real-time gesture recognition. The process is divided into three steps: extracting features from the data, cleaning and labelling it, and identifying gestures using MediaPipe Holistic. The system tracks landmarks on the face, hands, and body across video frames, capturing the temporal and spatial features needed to interpret gestures. First, the data is cleaned and labeled by removing unclear or fuzzy images and null entries. The processed data is then passed to a Sequential LSTM model, which has two LSTM layers and a dense output layer. The model’s performance is improved by using early stopping and a categorical cross-entropy loss. The model was trained and tested on a customized ISL dataset that included 11 distinct gestures, and it achieved an accuracy of 96.97%. The framework emphasizes the model's robustness across diverse lighting conditions and real-world scenarios, supporting its applicability in sectors such as healthcare, education, and public service. By enhancing communication for ISL users, it addresses existing gaps and improves accessibility in these domains.

A malware detection method based on LLM to mine semantics of API

2025-03-11T00:32:14+00:00

In recent years, large language models (LLMs) have played a growing role in an increasing number of fields, including network security. Some attackers use LLMs to generate malicious code, write phishing emails, and analyze software vulnerabilities. This inspires us to use LLMs to maintain network security. Past research on malware detection relied heavily on feature engineering that required expert analysis, which is difficult and resource-consuming because malware is updated frequently. In this paper, we propose a malware detection method based on intrinsic semantics. The method first designs an API intrinsic semantic feature encoder, which extracts intrinsic semantic features from API names and Microsoft's official API definitions using LLM prompt engineering and sentence embedding techniques. It then designs an API co-occurrence feature encoder, which mines the contextual co-occurrence features of APIs from API call sequences using word2vec. The API semantic features and API co-occurrence features are combined to improve malware detection performance. The method also uses TCN-GRU to capture dependencies between API calls. Results on several public datasets show that our method achieves better performance than other methods, and ablation study results demonstrate the important role of intrinsic semantics in malware detection algorithms.

Comparative Analysis of BAS and PSO in Image Transformation Optimization

2025-03-22T10:23:28+00:00

This paper presents a comparative study between the Particle Swarm Optimization (PSO) algorithm and the Beetle Antennae Search (BAS) algorithm for optimizing image transformations, with a focus on their performance in handling noisy and non-noisy images. Our experiments reveal that BAS consistently achieves better results in terms of pixel change when compared to PSO. The algorithms were evaluated based on their ability to minimize the objective function, which measures the error between the transformed reference image and the target image. Our results demonstrate that both BAS and PSO can effectively optimize image transformations, but BAS consistently outperformed PSO in terms of convergence speed and final objective value. Additional experiments with varying objective functions further validated the robustness and efficiency of BAS in achieving accurate image alignment.
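For orientation, the sketch below shows the basic BAS update rule as it is commonly described. At each iteration the beetle samples a random unit direction, evaluates the objective at a left and a right "antenna" position, and steps toward the side with the lower value while the step and antenna length decay. The parameter names, default values, and decay schedule are assumptions rather than the settings used in the paper, and the objective `f` is a placeholder for the image-alignment error.

```python
import math
import random


def beetle_antennae_search(f, x0, step=1.0, antenna=1.0, decay=0.95,
                           iters=200, seed=0):
    """Minimize f over a list of floats with a basic BAS loop (sketch)."""
    rng = random.Random(seed)
    x = list(x0)
    best_x, best_f = list(x), f(x)
    for _ in range(iters):
        # Random unit direction for the antennae.
        b = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(v * v for v in b)) or 1.0
        b = [v / norm for v in b]
        right = [xi + antenna * bi for xi, bi in zip(x, b)]
        left = [xi - antenna * bi for xi, bi in zip(x, b)]
        diff = f(right) - f(left)
        sign = (diff > 0) - (diff < 0)
        # Step away from the antenna that sensed the higher objective value.
        x = [xi - step * bi * sign for xi, bi in zip(x, b)]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = list(x), fx
        # Shrink the search scale; the 0.01 floor on the antenna is an assumption.
        antenna = decay * antenna + 0.01
        step *= decay
    return best_x, best_f
```

Unlike PSO, which maintains a swarm of particles, this loop tracks a single candidate and needs only two extra objective evaluations per iteration, which is consistent with the convergence-speed comparison in the abstract.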

Robust Robotic Arm Calibration combining Multi-Distance Optimization Approach with Lagrange Starfish Optimization Algorithm

2025-04-01T05:51:13+00:00

In response to the limitations of existing robotic parameter calibration methods in terms of computational complexity, convergence speed, data requirements, and accuracy, this study proposes an innovative calibration scheme that combines an improved Lagrangian Starfish Optimization Algorithm (LSFA) with a Support Vector Machine (SVM) algorithm. By incorporating Lagrange interpolation and a multi-dimensional distance metric model (including Mahalanobis distance, Manhattan distance, Chebyshev distance, cosine distance, standardized Euclidean distance, and Euclidean distance), the enhanced starfish optimization algorithm significantly improves global search capabilities and local search accuracy. This effectively addresses issues such as initial value sensitivity, noise, and outliers, with the algorithm specifically designed for kinematic parameter calibration of robotic arms. Furthermore, the improved local search mechanism optimizes the position update strategy of starfish through a weighted system, preventing the algorithm from becoming trapped in local optima. To further enhance the accuracy of dynamic parameter calibration, this study integrates the SVM algorithm into the LSFA framework, proposing the LSFA-SVM method specifically for dynamic parameter calibration of robotic arms. Experiments demonstrate a 38.59% reduction in error compared to traditional SVM. The results indicate that LSFA excels in kinematic calibration of robotic arms, achieving a root mean square error (RMSE) of 0.29 mm, a 29.27% improvement over the traditional Starfish Optimization Algorithm (SFOA). This study provides an efficient and precise solution for robotic parameter calibration in complex environments.
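Several of the building blocks named in the abstract have standard definitions, sketched below for reference. How the paper weights or combines them inside LSFA is not stated, so this block only illustrates the individual metrics and Lagrange interpolation. The function names are hypothetical, and Mahalanobis distance is omitted because it needs a covariance matrix estimated from data.

```python
import math


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def standardized_euclidean(a, b, variances):
    """Euclidean distance with each squared difference divided by its variance."""
    return math.sqrt(sum((x - y) ** 2 / v for x, y, v in zip(a, b, variances)))


def cosine_distance(a, b):
    """1 - cosine similarity; undefined for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (na * nb)


def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total
```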

Reimagining Asteroid Risk Assessment: A Comparative Review of Advanced Machine Learning Techniques

2025-04-21T08:35:28+00:00

The escalating discovery rate of Near-Earth Asteroids (NEAs) has intensified the need for advanced computational frameworks capable of evaluating their impact risks with high precision. Traditional machine learning models, while foundational for early NEA classification and trajectory prediction, increasingly falter when confronted with the intricate, high-dimensional dynamics of asteroid motion. This limitation underscores the necessity for sophisticated techniques that reconcile computational efficiency with predictive accuracy across large, multi-dimensional datasets. This review systematically evaluates state-of-the-art machine learning algorithms—including quantum-enhanced models, hybrid quantum-classical frameworks, and lightweight convolutional neural networks (CNNs)—for their efficacy in asteroid risk assessment. By analyzing outcomes from recent studies, we contrast performance metrics such as accuracy, computational cost, and scalability. For instance, Quantum K-Nearest Neighbors (QKNN) demonstrates a 15% accuracy improvement over classical counterparts in high-dimensional data classification, while XGBoost achieves 99.99% precision in asteroid diameter prediction. Lightweight CNNs, such as MobileNetV1, further enable real-time processing on resource-constrained platforms like CubeSats, reducing latency by 30%.