https://publications.eai.eu/index.php/airo/issue/feed EAI Endorsed Transactions on AI and Robotics 2025-07-25T17:37:58+00:00 Caitlin Roach publications@eai.eu Open Journal Systems <p>EAI Endorsed Transactions on AI and Robotics (eISSN: 2790-7511) covers all aspects of robotics and knowledge-based AI systems along with interdisciplinary approaches to computer science, control systems, computer vision, machine learning, electrical engineering, intelligent machines, mathematics, and other disciplines. An important goal of this journal is to extend cutting-edge technologies in the control and learning of both symbolic and sensory robots with regard to smart systems. Our journal contains articles on the theoretical, mathematical, computational, and experimental aspects of robotics and intelligent systems.</p> <p><strong>INDEXING</strong>: <a href="https://www.scimagojr.com/journalsearch.php?q=21101254818&amp;tip=sid&amp;clean=0">Scopus</a>, CrossRef, Google Scholar, ProQuest, EBSCO, CNKI, Dimensions</p> https://publications.eai.eu/index.php/airo/article/view/7832 Design and Implementation of a Novel Parallel Algorithm for Efficient Image Compression in High-Performance Computing 2024-11-14T10:54:11+00:00 Hewa Majeed Zangana hewa.zangana@dpu.edu.krd <p class="ICST-abstracttext"><span lang="EN-GB">This paper describes the development and architecture of a new parallel algorithm for image compression in a High-Performance Computing (HPC) context. The suggested algorithm applies parallel processing strategies to the image data set in order to minimize the amount of computation while optimizing the compression ratio and speed. Combining new parallelism strategies with newly developed and existing data compression methods yields visibly better compression ratio and time than comparable existing algorithms. 
Experiments on different HPC environments show that the proposed solution is scalable and efficient; it should therefore be considered in applications where real-time image processing is crucial. The work contributes to the image compression literature, with a special focus on parallel computation.</span></p> 2025-06-30T00:00:00+00:00 Copyright (c) 2025 Hewa Majeed Zangana https://publications.eai.eu/index.php/airo/article/view/7877 Lightweight Keyword Spotting with Inter-Domain Interaction and Attention for Real-Time Voice-Controlled Robotics 2024-11-19T09:51:05+00:00 Hien Vu Pham 20010739@st.phenikaa-uni.edu.vn Thuy Phuong Vu 20010751@st.phenikaa-uni.edu.vn Huong Thi Nguyen 20010741@st.phenikaa-uni.edu.vn Minhhuy Le leminhhuy8886@gmail.com <p>This study introduces a novel lightweight Keyword Spotting (KWS) model optimized for deployment on resource-constrained microcontrollers, with potential applications in robotic control and end-effector operations. The proposed model employs inter-domain interaction to effectively extract features from both Mel-frequency cepstral coefficients (MFCCs) and temporal audio characteristics, complemented by an attention mechanism to prioritize relevant audio segments for enhanced keyword detection. Achieving a 93.70% accuracy on the Google Command v2-12 commands dataset, the model outperforms existing benchmarks. It also demonstrates remarkable efficiency in inference speed (0.359 seconds) and resource utilization (34.9KB peak RAM and 98.7KB flash memory), offering a 3x faster inference time and reduced memory footprint compared to the DS-CNN-S model. These attributes make it particularly suitable for real-time voice command applications in low-power robotic systems, enabling intuitive and responsive control of robotic arms, end-effectors, and navigation systems. 
In this work, however, the KWS model is demonstrated in a simple non-destructive testing system for controlling sensor movement. This research lays the groundwork for advancing voice-activated robotic technologies on resource-limited hardware platforms.</p> 2025-03-18T00:00:00+00:00 Copyright (c) 2025 Hien Vu Pham, Thuy Phuong Vu, Huong Thi Nguyen, Minhhuy Le https://publications.eai.eu/index.php/airo/article/view/7998 DeepDiabFusion: An Interaction-Aware Neural Network Architecture for Diabetes Prediction 2024-11-30T06:02:24+00:00 Mukhriddin Arabboev mukhriddin.9207@gmail.com Shohruh Begmatov mukhriddin.9207@gmail.com Saidakmal Saydiakbarov mukhriddin.9207@gmail.com Sukhrob Bobojanov mukhriddin.9207@gmail.com Khabibullo Nosirov mukhriddin.9207@gmail.com Jean Chamberlain Chedjou mukhriddin.9207@gmail.com <p>Accurate prediction of diabetes onset is essential for effective early diagnosis and clinical intervention. This study presents a performance analysis of several machine learning (ML) algorithms applied to the Pima Indians Diabetes Dataset (PIDD), with a primary focus on a novel Artificial Neural Network (ANN) architecture, referred to as DeepDiabFusion. The proposed model integrates feature-wise normalization, parallel dense sublayers, and an interaction-aware fusion mechanism to capture complex feature relationships often overlooked by conventional models. Comparative experiments were conducted against seven traditional ML algorithms, including Logistic Regression, Random Forest, and Gradient Boosting, as well as state-of-the-art ANN-based models from recent literature. Performance was evaluated using accuracy, precision, recall, and area under the curve (AUC) metrics. The proposed model achieved an accuracy of 93.04%, precision of 86.21%, recall of 93.10%, and AUC of 0.951—outperforming all baseline and previously reported models. 
These results demonstrate the superior classification performance and practical applicability of the proposed ANN framework in clinical decision support systems for early diabetes detection and management.</p> 2025-06-04T00:00:00+00:00 Copyright (c) 2025 Mukhriddin Arabboev, Shohruh Begmatov, Saidakmal Saydiakbarov, Sukhrob Bobojanov, Khabibullo Nosirov, Jean Chamberlain Chedjou https://publications.eai.eu/index.php/airo/article/view/8026 Simulation and Control of the KUKA KR6 900EX Robot in Unity 3D: Advancing Industrial Automation through Virtual Environments 2024-12-02T16:22:18+00:00 Anand Ajayakumar Sujatha P2816248@my365.dmu.ac.uk Amin Kolahdooz amin.kolahdooz@dmu.ac.uk Mohammadreza Jafari P2836461@my365.dmu.ac.uk Alireza Hajfathalian amin.kolahdooz@dmu.ac.uk <p>This study presents the development of a virtual simulation of a KUKA robot within the Unity 3D platform, focusing on its ability to execute pick-and-place operations in an industrial setting. The research emphasizes the importance of digital simulations as cost-effective and safe alternatives to physical prototypes in industrial automation. By replicating robotic tasks in a virtual environment, organizations can mitigate wear and tear on expensive machinery and minimize safety hazards inherent in real-world operations. The simulation process commenced with the creation of a detailed 3D model of the KUKA robot utilizing Creo CAD software. This model was subsequently imported into the Unity 3D environment, where an interactive and realistic simulation environment was constructed. A manual control system was implemented through custom C# scripts, enabling precise joint manipulation via keyboard inputs. While the current control mechanism remains manual, this study provides a foundational framework for the future integration of advanced algorithms for trajectory planning and autonomous control. 
The simulation successfully demonstrates the feasibility of performing industrial robotic tasks within a virtual environment. It serves as a platform for further research, including the automation of robotic movements and the integration of virtual reality and digital twin technologies. These advancements have the potential to significantly enhance real-time monitoring, operator training, and overall operational efficiency in industrial applications. This work underscores the growing significance of virtual simulation technologies in industrial automation, presenting a scalable and flexible solution for prototyping, testing, and training within complex industrial ecosystems.</p> 2025-03-20T00:00:00+00:00 Copyright (c) 2025 Anand Ajayakumar Sujatha, Amin Kolahdooz, Mohammadreza Jafari, Alireza Hajfathalian https://publications.eai.eu/index.php/airo/article/view/8041 Apple Disease Detection and Classification using Random Forest (One-vs-All) 2024-12-04T01:41:51+00:00 Zengming Wen 761135325@qq.com Hong Lan lanhong69@163.com Muhammad Asim Khan m.asimkhattak@gmail.com <p>Detecting and recognizing fruit diseases is a common problem worldwide. Fruit disease detection is an active research topic and is made difficult by similarities in structure and color. In this research, we propose a new model to detect and classify apple diseases using digital image processing and machine learning. First, image processing techniques were applied to enhance image contrast and remove noise, which helped to segment the region of interest accurately and to extract features without garbage data. Then the K-means clustering technique, together with the fuzzy C-means method, was used to segment the images. GLCM feature extraction was applied after segmentation. Real images of apple disease with multi-disease regions were used in the research method. These features were preprocessed with methods such as LDA. 
K-fold cross-validation was used for training and testing, in combination with a random forest classifier. The results showed high accuracy compared with existing techniques.</p> 2025-01-22T00:00:00+00:00 Copyright (c) 2025 Zengming Wen, Hong Lan, Muhammad Asim Khan https://publications.eai.eu/index.php/airo/article/view/8049 Advancing Food Security through Precision Agriculture: YOLOv8’s Role in Efficient Pest Detection and Management 2024-12-04T09:56:28+00:00 Ameer Tamoor Khan atk@plen.ku.dk Sign Marie Jensen smj@plen.ku.dk Noman Khan nomankhan@pieas.edu.pk <p>In response to the growing global population and the consequent need for sustainable food security, effective pest management is critical for enhancing agricultural productivity. This research presents YOLOv8, a state-of-the-art deep learning model optimized for pest detection in agricultural environments, contributing to modern food security efforts. Evaluated using the complex IP102 dataset, YOLOv8 demonstrated notable improvements in pest detection accuracy, achieving scores of 66.9 mAP@0.5 and 42.1 mAP@[0.5:0.95]. These results underscore YOLOv8’s robust performance across diverse detection scenarios, enabling more precise pest control and reducing crop loss. However, in-depth dataset analysis revealed a bias towards larger pests, likely due to bounding box size variations, which presents an opportunity for model improvement. Future work will focus on addressing data imbalances, enhancing sensitivity to smaller pests, and validating YOLOv8 in varied real-world agricultural settings. 
These advancements are expected to significantly improve pest management practices, ultimately boosting agricultural productivity and supporting global food security through the application of modern agricultural technologies.</p> 2025-01-24T00:00:00+00:00 Copyright (c) 2025 Ameer Tamoor Khan, Sign Marie Jensen, Noman Khan https://publications.eai.eu/index.php/airo/article/view/8051 Enhancing Virtual Reality Experiences in Architectural Visualization of an Academic Environment 2024-12-04T10:12:22+00:00 Abiodun Durojaye Abiodun.r.durojaye@gmail.com Amin Kolahdooz amin.kolahdooz@dmu.ac.uk Alireza Hajfathalian a.hajfathalian@gmail.com <p>Virtual Reality (VR) technology possesses the capability to transport users into immersive, alternative environments, providing them with a convincing sense of presence within a simulated world. This project leverages VR to develop an interactive, educational system centered around the De Montfort University (DMU) Queens Building, simulating key facilities and infrastructure through the integration of 360-degree imagery and Adobe Captivate software. Designed in response to contemporary challenges, such as the COVID-19 pandemic, which underscored the need for flexible and innovative learning methodologies, the VR system offers an immersive educational platform enriched with essential information to enhance student engagement and learning outcomes. A comprehensive literature review explored the expanding applications of VR across diverse sectors, including education, healthcare, robotics, and manufacturing. The findings of this review underscored VR's transformative potential in enhancing educational engagement and facilitating a deeper understanding of complex concepts. 
The project methodology involved meticulously mapping the physical layout of the Queens Building, capturing targeted 360-degree scenarios using a Ricoh Theta V camera, and subsequently transforming these into immersive VR scenarios enriched with interactive hotspots, meticulously synchronized with the building's design layout. The VR system successfully achieved the project's objectives by simulating key educational and informational use cases. It provides students with an alternative learning medium, offering interactive insights into the functionalities of equipment and facilities within the building. Furthermore, the system enables a virtual tour of the DMU campus, facilitating familiarization with the university environment. Findings from the VR application highlight its potential as a dynamic educational tool, positioning it as a valuable complement to traditional learning methods. This innovative approach demonstrates the capacity of VR to enhance student understanding, support academic and research pursuits, and ultimately enrich the overall student experience.</p> 2025-01-29T00:00:00+00:00 Copyright (c) 2025 Abiodun Durojaye, Amin Kolahdooz, Alireza Hajfathalian https://publications.eai.eu/index.php/airo/article/view/8171 From Social Media Reactions to Grades: A Machine Learning-Based SocialNet Analysis for Academic Performance Prediction 2024-12-12T18:34:53+00:00 Muhammad Ramzan muhmdramzan2023@outlook.com Naeem Ahmed naeem.uoh@gmail.com <p>The impact of social media on student academic performance has garnered significant research interest in recent years. The pervasive use of social networking sites (SNS) among college and university students, both in and outside classrooms, has raised concerns about its potential effects on academic achievement. This study investigates the relationship between social media usage and academic performance through a dataset of 550 participants. 
Machine learning models, including Random Forest, Decision Trees, and Long Short-Term Memory (LSTM), were employed to analyze and predict the impact of social media on students' academic outcomes. The models were trained using clean and well-engineered data. The results indicate a moderate influence of social media usage on academic performance, with the LSTM model outperforming traditional approaches in predictive accuracy. These findings highlight the importance of considering sequential usage patterns in understanding the academic implications of social media.</p> 2025-03-17T00:00:00+00:00 Copyright (c) 2025 Muhammad Ramzan, Naeem Ahmed https://publications.eai.eu/index.php/airo/article/view/8309 A Multi-Channel Spam Detection System Utilizing Natural Language Processing and Machine Learning 2024-12-29T15:58:56+00:00 Mohini Tyagi tyagimohini7@gmail.com Pradeep Kumar Singh pkscs@mmmut.ac.in Shivam Kumar Yadav sy76076@gmail.com Sanjay Kumar Soni sksoniec@mmmut.ac.in <p>As digital communication rapidly expands, the issue of unsolicited and unwanted messages, commonly known as spam, has become a major concern. This paper introduces an advanced spam detection system that integrates Natural Language Processing (NLP) and Machine Learning (ML) techniques. The system differentiates between spam and legitimate messages by employing a hybrid model that combines Naive Bayes, Support Vector Machines (SVM), and deep learning models like Bidirectional Encoder Representations from Transformers (BERT). 
The model demonstrates high effectiveness across various communication platforms, including emails, SMS, and social media, achieving an accuracy exceeding 98.5%.</p> 2025-03-18T00:00:00+00:00 Copyright (c) 2025 Mohini Tyagi, Pradeep Kumar Singh, Shivam Kumar Yadav, Sanjay Kumar Soni https://publications.eai.eu/index.php/airo/article/view/8460 A Deep Learning Based Optical Character Recognition Model for Old Turkic 2025-01-17T19:07:16+00:00 Seyed Hossein Taheri s.h.taheri2001@gmail.com Houman Kosarirad hkosarirad2@huskers.unl.edu Isabel Adrover Gallego iadrovergallego2@huskers.unl.edu Nedasadat Taheri nedasadat.taheri1997@gmail.com <p class="ICST-abstracttext"><span lang="EN-GB">This study presents the development and evaluation of a deep learning-based optical character recognition (OCR) model specifically designed for recognizing Old Turkic script. Utilizing a convolutional neural network (CNN), the project aimed to achieve high classification accuracy across a dataset comprising 38 distinct Old Turkic characters. To enhance the model’s robustness and generalization capabilities, sophisticated data augmentation techniques were employed, generating 760 augmented images from the original 38 characters. The model was rigorously trained and validated, achieving an overall accuracy of 96.34%. Evaluation metrics such as precision, recall, and F1-scores were systematically analyzed, showing superior performance in most classes while identifying areas for further optimization. The results underscore the effectiveness of CNN architectures in specialized OCR tasks, demonstrating their potential in preserving and digitizing historical scripts. 
This study not only advances the field of document analysis and OCR but also contributes to the digital preservation and accessibility of ancient scripts.</span></p> 2025-04-10T00:00:00+00:00 Copyright (c) 2025 Seyed Hossein Taheri, Houman Kosarirad, Isabel Adrover Gallego, Nedasadat Taheri https://publications.eai.eu/index.php/airo/article/view/8693 A Comprehensive Approach to Indian Sign Language Recognition: Leveraging LSTM and MediaPipe Holistic for Dynamic and Static Hand Gesture Recognition 2025-02-12T18:31:31+00:00 Prachi Rawat dranujdhiman@gmail.com Papendra Kumar dranujdhiman@gmail.com Vivek Kumar Tamta dranujdhiman@gmail.com Anuj Kumar dranujdhiman@gmail.com <p>Recognizing Indian Sign Language (ISL) gestures effectively is crucial for improving communication accessibility for the deaf community. This study introduces an innovative approach that integrates a Sequential Long Short-Term Memory (LSTM) model with MediaPipe Holistic for accurate, real-time gesture recognition. The work outlines a straightforward approach to the task, divided into three steps: extracting features from the data, cleaning and labelling, and identifying gestures using MediaPipe Holistic. The system tracks landmarks on the face, hands, and body across video frames, capturing the temporal and spatial features needed to interpret gestures. The data is first cleaned and labeled, with unclear or blurry images and null entries removed. The processed data is then passed to a Sequential LSTM model, which has two LSTM layers and a dense output layer. In the proposed approach, the model’s performance is improved by using early stopping and a categorical cross-entropy loss. The model was trained and tested on a customized ISL dataset of 11 distinct gestures and achieved an accuracy of 96.97%. 
The framework emphasizes the model's robustness across diverse lighting conditions and real-world scenarios, ensuring its applicability in sectors such as healthcare, education, and public service. By enhancing communication for ISL users, it effectively addresses existing gaps and improves accessibility in these domains.</p> 2025-05-19T00:00:00+00:00 Copyright (c) 2025 Prachi Rawat, Papendra Kumar, Vivek Kumar Tamta, Anuj Kumar https://publications.eai.eu/index.php/airo/article/view/8870 Evaluating Open-Source Vision Language Models for Facial Emotion Recognition Against Traditional Deep Learning Models 2025-03-09T07:59:25+00:00 Vamsi Krishna Mulukutla 21pa1a05a3@vishnu.edu.in Sai Supriya Pavarala 21pa1a05d0@vishnu.edu.in Srinivasa Raju Rudraraju sridevi.b@vishnu.edu.in Sridevi Bonthu sridevi.b@vishnu.edu.in <p class="ICST-abstracttext"><span lang="EN-GB">Facial Emotion Recognition (FER) is crucial for applications such as human-computer interaction and mental health diagnostics. This study presents the first empirical comparison of open-source Vision-Language Models (VLMs), including Phi-3.5 Vision and CLIP, against traditional deep learning models—VGG19, ResNet-50, and EfficientNet-B0—on the challenging FER-2013 dataset, which contains 35,887 low-resolution, grayscale images across seven emotion classes. To address the mismatch between VLM training assumptions and the noisy nature of FER data, we introduce a novel pipeline that integrates GFPGAN-based image restoration with FER evaluation. Results show that traditional models, particularly EfficientNet-B0 (86.44%) and ResNet-50 (85.72%), significantly outperform VLMs like CLIP (64.07%) and Phi-3.5 Vision (51.66%), highlighting the limitations of VLMs in low-quality visual tasks. 
In addition to performance evaluation using precision, recall, F1-score, and accuracy, we provide a detailed computational cost analysis covering preprocessing, training, inference, and evaluation phases, offering practical insights for deployment. This work underscores the need for adapting VLMs to noisy environments and provides a reproducible benchmark for future research in emotion recognition.</span></p> 2025-08-11T00:00:00+00:00 Copyright (c) 2025 Vamsi Krishna Mulukutla, Sai Supriya Pavarala, Srinivasa Raju Rudraraju, Sridevi Bonthu https://publications.eai.eu/index.php/airo/article/view/8880 A malware detection method based on LLM to mine semantics of API 2025-03-11T00:32:14+00:00 Ronghao Hou hourh@stu2022.jnu.edu.cn Xiaoping Tian txp@bnu.edu.cn Guanggang Geng gggeng@jnu.edu.cn <p>In recent years, large language models (LLMs) have played a growing role in an increasing number of fields, including network security. Some attackers use LLMs to generate malicious code, write phishing emails, and analyze software vulnerabilities. This inspires us to use LLMs to maintain network security. Previous research on malware detection relied heavily on feature engineering by experts, which is difficult and resource-consuming because malware is updated frequently. In this paper, we propose a malware detection method based on the intrinsic semantics of APIs. The method first designs an API intrinsic semantic feature encoder, which extracts intrinsic semantic features from API names and Microsoft's official API definitions using LLM prompt engineering and sentence embedding techniques. An API co-occurrence feature encoder is then designed, which mines the contextual co-occurrence features of APIs from API call sequences using word2vec. The API semantic features and API co-occurrence features are combined to improve malware detection performance. 
Also, it uses TCN-GRU to capture dependencies between API calls. Results on several public datasets show that our method achieves better performance than other methods, and in addition, ablation study results demonstrate the important role of intrinsic semantics in malware detection algorithms.</p> 2025-05-07T00:00:00+00:00 Copyright (c) 2025 Ronghao Hou, Xiaoping Tian, Guanggang Geng https://publications.eai.eu/index.php/airo/article/view/8895 An Autonomous RL Agent Methodology for Dynamic Web UI Testing in a BDD Framework 2025-03-13T07:23:25+00:00 Ali Hassaan Mughal alihassaanmughal@hotmail.com <p>Modern software applications demand efficient and reliable testing methodologies to ensure robust user interface functionality. This paper introduces an autonomous reinforcement learning (RL) agent integrated within a Behavior-Driven Development (BDD) framework to enhance UI testing. By leveraging the adaptive decision-making capabilities of RL, the proposed approach dynamically generates and refines test scenarios aligned with specific business expectations and actual user behavior. A novel system architecture is presented, detailing the state representation, action space, and reward mechanisms that guide the autonomous exploration of UI states. Experimental evaluations on open-source web applications demonstrate significant improvements in defect detection, test coverage, and a reduction in manual testing efforts. 
This study establishes a foundation for integrating advanced RL techniques with BDD practices, aiming to transform software quality assurance and streamline continuous testing processes.</p> 2025-07-22T00:00:00+00:00 Copyright (c) 2025 Ali Hassaan Mughal https://publications.eai.eu/index.php/airo/article/view/8955 Comparative Analysis of BAS and PSO in Image Transformation Optimization 2025-03-22T10:23:28+00:00 Anik Dwivedi anikdwivedi8055@kgpian.iitkgp.ac.in Ameer Tamoor Khan anikdwivedi8055@kgpian.iitkgp.ac.in Shuai Li shuaili@ieee.org <p>This paper presents a comparative study between the Particle Swarm Optimization (PSO) algorithm and the Beetle Antennae Search (BAS) algorithm for optimizing image transformations, with a focus on their performance in handling noisy and non-noisy images. Our experiments reveal that BAS consistently achieves better results in terms of pixel change when compared to PSO. The algorithms were evaluated based on their ability to minimize the objective function, which measures the error between the transformed reference image and the target image. Our results demonstrate that both BAS and PSO can effectively optimize image transformations, but BAS consistently outperformed PSO in terms of convergence speed and final objective value. Additional experiments with varying objective functions further validated the robustness and efficiency of BAS in achieving accurate image alignment.</p> 2025-05-29T00:00:00+00:00 Copyright (c) 2025 Anik Dwivedi, Ameer Tamoor Khan, Shuai Li https://publications.eai.eu/index.php/airo/article/view/8983 Empowering Universal Robot Programming with Fine-Tuned Large Language Models 2025-03-28T08:25:00+00:00 Tien Dat Le dat.lt19010205@st.phenikaa-uni.edu.vn Minhhuy Le leminhhuy8886@gmail.com <p>LLMs are transforming AI but face challenges in robotics due to domain-specific requirements. This paper explores LLM-generated URScript code for Universal Robots (UR), improving automation accessibility. 
A fine-tuning dataset of 20,000 synthetic samples, based on 514 validated human-created examples, enhances performance. Using the Unsloth framework, we fine-tune and evaluate the model in real-world scenarios. Results demonstrate LLMs’ potential to simplify UR robot programming, highlighting their value in industrial automation. The video demo is available at the following link, and the codebase will be added soon: https://github.com/t1end4t/llm-robotics</p> 2025-07-15T00:00:00+00:00 Copyright (c) 2025 Tien Dat Le, Minhhuy Le https://publications.eai.eu/index.php/airo/article/view/8987 A novel knowledge enhancement method for large-scale natural language training model 2025-03-29T01:16:14+00:00 Qi Han aqiufenga@163.com Gilja So kjso@ysu.ac.kr <p>A knowledge enhancement-based large-scale natural language training model is an advanced language model that combines deep learning with knowledge enhancement. By learning from massive unlabeled data and incorporating external knowledge such as knowledge graphs, it overcomes the limitations of traditional models in interpretability and reasoning ability. Introducing knowledge into data-driven artificial intelligence models is an important way to realize human-machine hybrid intelligence. Because most pre-trained models are trained on large-scale unstructured corpus data, they have defects in certainty and explainability, which can be remedied to some extent by introducing external knowledge. To address these defects, we present a knowledge-enhanced large-scale natural language training model that integrates deep learning with external knowledge sources (e.g., knowledge graphs) to improve interpretability and reasoning ability. This approach addresses the limitations of traditional models trained on unstructured data by incorporating external knowledge to enhance certainty and explainability. We propose a new knowledge enhancement method and demonstrate its effectiveness through a long text representation model. 
This model processes structured, knowledge-rich long texts by extracting and integrating knowledge and semantic information at the sentence and document levels. It then fuses these representations to generate an enhanced long text representation. Experiments on legal case matching tasks show that our model significantly outperforms existing methods, highlighting its innovation and practical value.</p> 2025-07-15T00:00:00+00:00 Copyright (c) 2025 Qi Han, Gilja So https://publications.eai.eu/index.php/airo/article/view/9002 Robust Robotic Arm Calibration combining Multi-Distance Optimization Approach with Lagrange Starfish Optimization Algorithm 2025-04-01T05:51:13+00:00 Yongtao Qu ytqu09@gmail.com Zhiqiang Li Etesop0712@outlook.com Long Liao zgscsdysjyqll@163.com Xun Deng dengxunabc@foxmail.com Yuanchang Lin lyc@cigit.ac.cn Tinghui Chen Chenth199208@163.com Linlin Chen chenlinlin1109@outlook.com Jia Liu TuTL30@outook.com Peiyang Wei weipy@cuit.edu.cn Jianhong Gan gjh@cuit.edu.cn ZhenZhen Hu hzzuestc@163.com Can Hu hucan028@outlook.com Yonghong Deng dengyhcd@163.com Wei Li 18482057719@139.com Zhibin Li LiZhibin111@outlook.com <p>In response to the limitations of existing robotic parameter calibration methods in terms of computational complexity, convergence speed, data requirements, and accuracy, this study proposes an innovative calibration scheme that combines an improved Lagrangian Starfish Optimization Algorithm (LSFA) with a Support Vector Machine (SVM) algorithm. By incorporating Lagrange interpolation and a multi-dimensional distance metric model (including Mahalanobis distance, Manhattan distance, Chebyshev distance, cosine distance, standardized Euclidean distance, and Euclidean distance), the enhanced starfish optimization algorithm significantly improves global search capabilities and local search accuracy. 
This effectively addresses issues such as initial value sensitivity, noise, and outliers, with the algorithm specifically designed for kinematic parameter calibration of robotic arms. Furthermore, the improved local search mechanism optimizes the position update strategy of starfish through a weighted system, preventing the algorithm from becoming trapped in local optima. To further enhance the accuracy of dynamic parameter calibration, this study integrates the SVM algorithm into the LSFA framework, proposing the LSFA-SVM method specifically for dynamic parameter calibration of robotic arms. Experiments demonstrate a 38.59% reduction in error compared to traditional SVM. The results indicate that LSFA excels in kinematic calibration of robotic arms, achieving a root mean square error (RMSE) of 0.29 mm, a 29.27% improvement over the traditional Starfish Optimization Algorithm (SFOA). This study provides an efficient and precise solution for robotic parameter calibration in complex environments.</p> 2025-06-26T00:00:00+00:00 Copyright (c) 2025 Yongtao Qu, Zhiqiang Li, Long Liao, Xun Deng, Yuanchang Lin, Tinghui Chen, Linlin Chen, Jia Liu, Peiyang Wei, Jianhong Gan, ZhenZhen Hu, Can Hu, Yonghong Deng, Wei Li, Zhibin Li https://publications.eai.eu/index.php/airo/article/view/9234 Explainable AI Based Deep Ensemble Convolutional Learning for Multi-Categorical Ocular Disease Prediction 2025-05-03T21:10:58+00:00 Abu Kowshir Bitto abu.kowshir777@gmail.com Rezwana Karim rezwana.karim776@gmail.com Mst Halema Begum akmmasum@yahoo.com Md Fokrul Islam Khan Khan fokrulkhan837@gmail.com Dr. Md. Maruf Hassan drmdmaruf.hassan@seu.edu.bd Prof. Dr. Abdul kadar Muhammad Masum akmmasum@yahoo.com <p class="ICST-abstracttext"><span lang="EN-GB">Diseases of the eye such as diabetic retinopathy, glaucoma, and cataract remain among the leading causes of blindness and vision impairment worldwide. 
Diagnosis in its early stages followed by early treatment is crucial to preventing permanent loss of vision. Recent advances in Artificial Intelligence (AI), particularly Transfer Learning and Explainable AI (XAI), have proven highly promising in automating the identification of retinal pathologies from medical images. In this paper, we propose an ensemble deep learning approach that integrates four pre-trained convolutional neural networks, i.e., VGG16, MobileNet, DenseNet, and InceptionV3, to classify retinal images into four categories: diabetic retinopathy, glaucoma, cataracts, and normal. The ensemble method leverages the power of multiple models to improve classification accuracy. Additionally, Explainable AI techniques are applied to make the model more interpretable, providing visual explanations and insights into the system's decision-making and thereby establishing clinical trust and reliability. The system is evaluated on a new benchmark eye disease dataset from Hugging Face, and the results in terms of accuracy and model transparency are encouraging. This research contributes towards developing reliable, explainable, and efficient AI-driven diagnostic systems to assist healthcare professionals in the early detection and management of eye diseases.</span></p> 2025-07-28T00:00:00+00:00 Copyright (c) 2025 Abu Kowshir Bitto, Rezwana Karim, Mst Halema Begum, Md Fokrul Islam Khan Khan, Dr. Md. Maruf Hassan, Prof. Dr. 
Abdul kadar Muhammad Masum https://publications.eai.eu/index.php/airo/article/view/9292 P2PLLMEdge: Peer-to-Peer Framework for Localized Large Language Models using CPU only Resource-Constrained Edge 2025-05-11T14:28:03+00:00 Partha Pratim Ray parthapratimray1986@gmail.com Mohan Pratap Pradhan mppradhan@cus.ac.in <p>In this research, we present P2PLLMEdge, a pioneering peer-to-peer framework designed to enable localized Large Language Models (LLMs) to operate efficiently in resource-constrained edge environments, exemplified by devices such as the Raspberry Pi 4B and CPU-only laptops. The framework addresses critical challenges, including limited computational capacity, network overhead, and scalability, by leveraging lightweight RESTful communication protocols, model-specific quantization, and decentralized task distribution. Key results demonstrate that P2PLLMEdge achieves substantial performance improvements. On average, Peer 2 (CPU-only laptop) achieves a 44.7% reduction in total duration (t<sub>peer2,total</sub> = 15.87 × 10<sup>9</sup> ns) compared to Peer 1 (Raspberry Pi 4B, t<sub>peer1,total</sub> = 28.18 × 10<sup>9</sup> ns). The framework processes tokens at a rate of 21.77 tokens/second on advanced LLMs like Granite3.1-moe:1b, significantly outperforming the baseline. Peer 1, employing quantized LLMs such as smolm2:360m-instruct-q8_0, reduces prompt evaluation duration by 23.2% (t<sub>peer1,prompt_eval</sub> = 0.76 × 10<sup>9</sup> ns) compared to larger models like qwen2.5:0.5binstruct (t<sub>peer1,prompt_eval</sub> = 0.99 × 10<sup>9</sup> ns). Peer 2 demonstrates superior summarization capabilities, with evaluation durations (t<sub>peer2,eval</sub>) reduced by 72.8% (t<sub>peer2,eval</sub> = 5.15 × 10<sup>9</sup> ns) for explanation-type prompts relative to Peer 1 (t<sub>peer1,eval</sub> = 18.93 × 10<sup>9</sup> ns). The framework also achieves significant network efficiency, reducing inter-peer communication durations by up to 44.9% (t<sub>peer2,network</sub> = 25.83 × 10<sup>9</sup> ns vs. t<sub>peer1,network</sub> = 46.92 × 10<sup>9</sup> ns). 
Peer-to-peer synergy ensures seamless task execution, where Peer 1 generates text and offloads computationally intensive summarization tasks to Peer 2, achieving a balance between performance and resource utilization. The novelty of P2PLLMEdge lies in its ability to seamlessly integrate lightweight LLMs with decentralized edge devices, achieving advanced natural language processing functionalities entirely on edge devices traditionally deemed unsuitable for such tasks. This framework provides an adaptable and cost-effective approach for deploying quantized LLM-driven applications. Future directions include scaling the framework to multi-peer environments, optimizing task scheduling algorithms, and exploring integration with heterogeneous LLM-enabled systems. The code is available at https://github.com/ParthaPRay/peer_to_peer_local_llm_interaction.</p> 2025-07-08T00:00:00+00:00 Copyright (c) 2025 Partha Pratim Ray, Mohan Pratap Pradhan https://publications.eai.eu/index.php/airo/article/view/9344 MSCSO: A Hybrid Nature-Inspired Algorithm for High-Dimensional Traffic Optimization in Urban Environments 2025-05-18T10:30:44+00:00 Kuldeep Vayadande kuldeep.vayadande1@vit.edu Viomesh Kumar Singh kuldeep.vayadande1@vit.edu Amol Bhosle kuldeep.vayadande1@vit.edu Ranjana Gore kuldeep.vayadande1@vit.edu Yogesh Uttamrao Bodhe kuldeep.vayadande1@vit.edu Aditi Bhat kuldeep.vayadande1@vit.edu Zulfikar Charoliya kuldeep.vayadande1@vit.edu Aayush Chavan kuldeep.vayadande1@vit.edu Pranav Bachhav kuldeep.vayadande1@vit.edu Aditya Bhoyar kuldeep.vayadande1@vit.edu <p>Metropolitan regions face growing economic and environmental pressure because rapid urbanization has increased traffic congestion, which calls for more advanced optimization techniques. Traditional traffic models usually do not account for the high dimensionality and dynamic nature of urban mobility, so more capable computational approaches are required. 
Modified Sand Cat Swarm Optimization (MSCSO) improves the Sand Cat Swarm Optimization (SCSO) algorithm by adding Lévy flights for global exploration and roulette wheel selection for adaptive exploitation, so that it can solve complex, high-dimensional problems. When applied to urban traffic management, MSCSO processes large volumes of traffic, speed, weather, and incident data and may reduce the Travel Time Index by 15 percent during rush hours. Benchmark tests indicate that MSCSO performs better, scoring 0.0 on the Sphere, Ackley, and Rastrigin functions and 28.0753 on Rosenbrock, whereas Particle Swarm Optimization, Genetic Algorithms, Ant Colony Optimization, and SCSO score higher (e.g., 46). It supports urban planning through a Flask-based web interface that allows real-time traffic data to be entered and visualized in a simple way. The success of MSCSO depends on high-quality data and hardware-friendly algorithms, but it can scale to real-time data sources such as GPS, machine learning traffic projections, and cloud hosting, and is of potential use in logistics, energy delivery, and resource assignment.</p> 2025-07-11T00:00:00+00:00 Copyright (c) 2025 Kuldeep Vayadande, Viomesh Kumar Singh, Amol Bhosle, Ranjana Gore, Yogesh Uttamrao Bodhe, Aditi Bhat, Zulfikar Charoliya, Aayush Chavan, Pranav Bachhav, Aditya Bhoyar https://publications.eai.eu/index.php/airo/article/view/9532 AI-Powered Predictive Analytics for Financial Risk Management in U.S. 
Markets 2025-06-12T08:40:34+00:00 Md Zikar Hossan zikarhosen@gmail.com Muslima Begom Riipa mbriipa@gmail.com Md Azhad Hossain azhad17@gmail.com Sweety Rani Dhar Sweetyranidhar@gmail.com Al Modabbir Zaman almodabbirzaman48@gmail.com Mohammad Hossain mhossain.eee@gmail.com Arif Hossen arifhossen4295@gmail.com Hasan Mahmud Sozib sozib2019@gmail.com <p class="ICST-abstracttext"><span lang="EN-GB">In the fast-changing environment of financial complexity, efficient risk management is vital for economic stability as well as for growth. In this study, we present a robust AI-powered predictive analytics framework to improve financial risk classification in U.S. markets. The framework uses advanced machine learning techniques, specifically a hybrid CatBoost and SVM model, which allows it to handle challenges such as class imbalance in a high-dimensional dataset while keeping the model interpretable. To probe errors, we apply techniques such as Principal Component Analysis (PCA) and the Synthetic Minority Oversampling Technique (SMOTE) to support data quality and fairness in classification. Comprehensive experiments on a financial risk dataset show that the framework achieves high accuracy (95.93%) and F1-score (0.95) compared to traditional machine learning models such as Logistic Regression and Random Forest. Furthermore, a feature importance analysis identifies important predictors of financial risk, such as Total Debt-to-Income Ratio, Loan Duration, and Interest Rate, providing actionable insights for decision-making. Additionally, the proposed approach is not only highly scalable but also interpretable and adaptable to the dynamic demands of financial institutions. This study serves as a benchmark for predictive analytics applied to risk-associated challenges, supporting informed decision-making and economic stability by integrating AI and machine learning in financial systems. 
</span></p> 2025-08-22T00:00:00+00:00 Copyright (c) 2025 Md Zikar Hossan, Muslima Begom Riipa, Md Azhad Hossain, Sweety Rani Dhar, Al Modabbir Zaman, Mohammad Hossain, Arif Hossen, Hasan Mahmud Sozib https://publications.eai.eu/index.php/airo/article/view/9563 An adaptive traditional Chinese herbal medicine image recognition model via ED-HLOA-optimized DenseNet201 2025-06-17T09:10:51+00:00 Peiyang Wei weipy@cuit.edu.cn Rundong Zou aliwa8168@gmail.com Guangdong Dong 1007809206@qq.com Wen Qin qinwen@sicnu.edu.cn Jianhong Gan gjh@cuit.edu.cn Oifeng Su 1007809206@qq.com Zhibin Li gjh@cuit.edu.cn <p class="ICST-abstracttext"><span lang="EN-GB">Research shows that ginseng, fritillaria cirrhosa and other Chinese herbal medicines and their active components can fight tumors via immune regulation, apoptosis induction, and signaling pathway modulation. Thus, deep learning-based identification and classification of authentic Chinese herbal medicines is gaining more attention. Although convolutional neural network (CNN)-based image recognition models have made progress in CHB recognition, they often face limitations such as simple structures, fixed parameters, and a single optimization approach that relies primarily on learning rate adjustments, which impede achieving the high accuracy required for image recognition of CHB. To address this issue, this study proposes an elite differential mutation-based horned lizard optimization algorithm (ED-HLOA) and applies it to optimize a DenseNet201-based recognition model for CHB. It enables adaptive adjustments of the learning rate and compression factor for DenseNet201. Empirical studies on a dataset collected from practical applications demonstrate that the ED-HLOA-optimized DenseNet201 model achieves high accuracy in CHB image classification, verifying the effectiveness of the algorithm. 
Compared with several state-of-the-art optimization algorithms, ED-HLOA performs well on both the training and verification sets, effectively avoiding overfitting.</span></p> 2025-07-23T00:00:00+00:00 Copyright (c) 2025 Peiyang Wei, Rundong Zou, Guangdong Dong, Wen Qin, Jianhong Gan, Oifeng Su, Zhibin Li https://publications.eai.eu/index.php/airo/article/view/9795 An Explainable AI Based Deep Ensemble Transformer Framework for Gastrointestinal Disease Prediction from Endoscopic Images 2025-07-25T17:37:58+00:00 Prof. Dr. Abdul kadar Muhammad Masum akmmasum@yahoo.com Abu Kowshir Bitto abu.kowshir777@gmail.com Shafiqul Islam Talukder shafiqul.cse2017@gmail.com Md Fokrul Islam Khan fokrulkhan837@gmail.com Mohammed Shamsul Alam alam_cse@yahoo.com Khandaker Mohammad Mohi Uddin mohiuddin.kh@seu.edu.bd <p class="ICST-abstracttext"><span lang="EN-GB">Gastrointestinal diseases such as gastroesophageal reflux disease (GERD) and polyps remain prevalent and challenging to diagnose accurately due to overlapping visual features and inconsistent endoscopic image quality. In this study, we investigate the application of transformer-based deep learning models—Vision Transformer (ViT), Swin Transformer, and a novel Ensemble Transformer model—for classifying four categories: GERD, GERD Normal, Polyp, and Polyp Normal from endoscopic images. The dataset was curated and collected in collaboration with Zainul Haque Sikder Women's Medical College &amp; Hospital, ensuring high-quality clinical annotations. All models were evaluated using precision, recall, F1 score, and overall classification accuracy. Our proposed Ensemble Transformer model, which fuses the outputs of ViT and Swin Transformer, achieved superior performance by delivering well-balanced F1 scores across all classes, reducing misclassification, and improving robustness with an overall accuracy of 87%. 
Furthermore, we incorporated explainable AI (XAI) techniques such as Grad-CAM and Grad-CAM++ to generate visual explanations of the model’s predictions, enhancing interpretability for clinical validation. This work demonstrates the potential of integrating global and local attention mechanisms along with XAI in building reliable, real-time, AI-assisted diagnostic support systems for gastrointestinal disorders, particularly in resource-limited healthcare settings.</span></p> 2025-08-25T00:00:00+00:00 Copyright (c) 2025 Prof. Dr. Abdul kadar Muhammad Masum, Abu Kowshir Bitto, Shafiqul Islam Talukder, Md Fokrul Islam Khan, Mohammed Shamsul Alam, Khandaker Mohammad Mohi Uddin https://publications.eai.eu/index.php/airo/article/view/8945 Cutting-Edge Techniques for Detecting Fake Reviews 2025-03-20T07:48:36+00:00 Kuldeep Vayadande amey.kharade231@vit.edu Amit Mishra amey.kharade231@vit.edu Gajanan R. Patil amey.kharade231@vit.edu Yogesh Bodhe amey.kharade231@vit.edu Pavitha Nooji amey.kharade231@vit.edu Ninad Kale amey.kharade231@vit.edu Anish Katariya amey.kharade231@vit.edu Amey Kharade amey.kharade231@vit.edu Parth Supekar amey.kharade231@vit.edu Lalit Patil amey.kharade231@vit.edu <p>The paper reviews various approaches for detecting fake reviews using different machine learning techniques, each with distinct strengths and limitations. It examines existing literature on supervised learning methods, unsupervised techniques, graph-based models, and hybrid approaches. Among these, unsupervised models rely on pattern recognition, while supervised methods, including SVM and transformer-based models like BERT, offer high accuracy but struggle with class imbalance and computational efficiency. Unsupervised and graph-based models serve as effective alternatives when labeled data is scarce or when complex relationships between reviews and users must be analyzed. 
Additionally, hybrid approaches that integrate multiple techniques are gaining traction, as they enhance feature selection and model performance. In this paper, we explore different methodologies for fake review classification, analyze their advantages and drawbacks, and highlight key challenges in the field.</p> 2025-07-09T00:00:00+00:00 Copyright (c) 2025 Kuldeep Vayadande, Amit Mishra, Gajanan R. Patil, Yogesh Bodhe, Pavitha Nooji, Ninad Kale, Anish Katariya, Amey Kharade, Parth Supekar, Lalit Patil https://publications.eai.eu/index.php/airo/article/view/9142 Reimagining Asteroid Risk Assessment: A Comparative Review of Advanced Machine Learning Techniques 2025-04-21T08:35:28+00:00 Kuldeep Vayadande kuldeep.vayadande1@vit.edu Dnyaneshwar M. Bavkar kuldeep.vayadande1@vit.edu Ishwari Rohit Raskar kuldeep.vayadande1@vit.edu Umar Mubarak Mulani kuldeep.vayadande1@vit.edu Jyoti Kanjalkar kuldeep.vayadande1@vit.edu Rajashree Tukaram Gadhave kuldeep.vayadande1@vit.edu Preeti Bailke kuldeep.vayadande1@vit.edu Yogesh Bodhe bodheyog@gmail.com Ajit R. Patil kuldeep.vayadande1@vit.edu <p>The escalating discovery rate of Near-Earth Asteroids (NEAs) has intensified the need for advanced computational frameworks capable of evaluating their impact risks with high precision. Traditional machine learning models, while foundational for early NEA classification and trajectory prediction, increasingly falter when confronted with the intricate, high-dimensional dynamics of asteroid motion. This limitation underscores the necessity for sophisticated techniques that reconcile computational efficiency with predictive accuracy across large, multi-dimensional datasets. This review systematically evaluates state-of-the-art machine learning algorithms—including quantum-enhanced models, hybrid quantum-classical frameworks, and lightweight convolutional neural networks (CNNs)—for their efficacy in asteroid risk assessment. 
By analyzing outcomes from recent studies, we contrast performance metrics such as accuracy, computational cost, and scalability. For instance, Quantum K-Nearest Neighbors (QKNN) demonstrates a 15% accuracy improvement over classical counterparts in high-dimensional data classification, while XGBoost achieves 99.99% precision in asteroid diameter prediction. Lightweight CNNs, such as MobileNetV1, further enable real-time processing on resource-constrained platforms like CubeSats, reducing latency by 30%.</p> 2025-06-02T00:00:00+00:00 Copyright (c) 2025 Kuldeep Vayadande, Dnyaneshwar M. Bavkar, Ishwari Rohit Raskar, Umar Mubarak Mulani, Jyoti Kanjalkar, Rajashree Tukaram Gadhave, Preeti Bailke, Yogesh Bodhe, Ajit R. Patil