Methodologies enabling Mobile Robots Collective Motion: A Comprehensive Review
DOI:
https://doi.org/10.4108/airo.11922
Keywords:
Reinforcement Learning, Mobile Robots Collective Motion, Nature Inspired Algorithms, Self Propelled Particles, SPP algorithms, Learning Based Methods
Abstract
The demand for multi-robot systems continues to grow. As a result, researchers and industry experts have shown notable interest in developing control methods or policies that determine how the members of a multi-robot system should cooperate in order to enforce cohesion. These methods can be classified as nature-inspired, self-propelled particles (SPP) based, and learning-based algorithms. This paper presents a comprehensive review of these methods. The main aim is to identify, analyze, and discuss their strengths and weaknesses. Additionally, the paper aims to suggest the optimal control method for enabling effective collective motion of multiple robots, to highlight the identified research gaps, and to suggest how they can be addressed. Nature-inspired algorithms such as ant colony optimization (ACO) are simple and easy to implement compared with the other classes. However, they require careful parameter tuning to operate optimally. SPP-based algorithms, on the other hand, are decentralized, easy to configure, and produce naturalistic emergent behavior, but they suffer from high oscillation when subjected to inaccurate sensor readings and communication delays. Efforts to address the limitations of nature-inspired and SPP-based algorithms should focus on developing control methods with learning abilities. Although learning-based methods are computationally intensive, they are capable of handling sensor inaccuracies and communication latency, making them well suited to the collective motion requirements of mobile robots, particularly in highly dynamic environments. Various strategies, including network pruning, the use of TinyML, and Centralized Training with Decentralized Execution (CTDE), can be employed to optimize learning-based methods for robots with limited resources.
References
[1] Navarro I, Matia F. A survey of collective movement of mobile robots. International Journal of Advanced Robotic Systems. 2013;10(1):73.
[2] Sasaki T, Biro D. Cumulative culture can emerge from collective intelligence in animal groups. Nature communications. 2017;8(1):15049.
[3] Cho J, Sung J, Yoon J, Lee H. Towards persistent surveillance and reconnaissance using a connected swarm of multiple UAVs. IEEE Access. 2020;8:157906- 17.
[4] Wang Y, Bai P, Liang X, Wang W, Zhang J, Fu Q. Reconnaissance mission conducted by UAV swarms based on distributed PSO path planning algorithms. IEEE access. 2019;7:105086-99.
[5] Niculescu V, Polonelli T, Magno M, Benini L. Ultra- Lightweight Collaborative Mapping for Robot Swarms. arXiv preprint arXiv:240703136. 2024.
[6] Arkin RC, Balch T, et al. Cooperative multiagent robotic systems. Artificial intelligence and mobile robots. 1998:277-95.
[7] Viragh C, Vasarhelyi G, Tarcai N, Szorenyi T, Somorjai G, Nepusz T, et al. Flocking algorithm for autonomous flying robots. Bioinspiration & biomimetics. 2014;9(2):025012.
[8] Moshayedi AJ, Roy AS, Khan ZH, Yang S, Razi A, Andani ME. Path Planning for AGVs: Balancing Computational Efficiency and Optimality. In: 2025 5th International Conference on Conference on Robotics, Automation and Intelligent Control (ICRAIC). IEEE; 2025. p. 1-12.
[9] Moshayedi AJ, Xu D, Sharifdoust M, Khan AS, Khan ZH, Andani ME. Comparative performance analysis of a novel fusion-based algorithm for AGV navigation. Advances in Computational Intelligence. 2025;5(4):1- 31.
[10] Moshayedi AJ, Roy AS, Karan U, jun ZHANG M, Bassir D. An Integrated Beetle Antennae Search– Enabled Navigation Framework for Omnidirectional AGV Mobile Robots in Unknown Environments. EAI Endorsed Transactions on AI and Robotics. 2026;5.
[11] Moshayedi AJ, Li J, Khan AS, Khan ZH, Khoojine AS, Hu J, et al. Navigating the field: SLAM implementation for service robots in sports environment. In: Robotics and Artificial Intelligence in Sports Medicine and Sports Services. Elsevier; 2026. p. 241-75.
[12] Folk S, Paulos J, Kumar V. RotorPy: A Python-based Multirotor Simulator with Aerodynamics for Education and Research. arXiv preprint arXiv:230604485. 2023.
[13] Rohmer E, Singh SPN, Freese M. V-REP: A versatile and scalable robot simulation framework. In: 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems; 2013. p. 1321-6.
[14] Michel O. Cyberbotics ltd. webots™: professional mobile robot simulation. International Journal of Advanced Robotic Systems. 2004;1(1):5.
[15] Mondada F, Gambardella LM, Floreano D, Nolfi S, Deneuborg JL, Dorigo M. The cooperation of swarmbots: Physical interactions in collective robotics. IEEE Robotics & Automation Magazine. 2005;12(2):21-8.
[16] Dorigo M. SWARM-BOT: An experiment in swarm robotics. In: Proceedings 2005 IEEE Swarm Intelligence Symposium, 2005. SIS 2005. IEEE; 2005. p. 192-200.
[17] Giernacki W, Skwierczyński M, Witwicki W, Wroński P, Kozierski P. Crazyflie 2.0 quadrotor as a platform for research and education in robotics and control engineering. In: 2017 22nd International Conference on Methods and Models in Automation and Robotics (MMAR); 2017. p. 37-42.
[18] Turgut AE, Gokce F, Celikkanat H, Bayindir L, Sahin E. Kobot: A mobile robot designed specifically for swarm robotics research. Middle East Technical University, Ankara, Turkey, METU-CENG-TR Tech Rep. 2007;5(2007).
[19] Rubio F, Valero F, Llopis-Albert C. A review of mobile robots: Concepts, methods, theoretical framework, and applications. International Journal of Advanced Robotic Systems. 2019;16(2):1729881419839596.
[20] Sanchez-Ibanez JR, Perez-del Pulgar CJ, Garcia-Cerezo A. Path planning for autonomous mobile robots: A review. Sensors. 2021;21(23):7898.
[21] Mohanty PK, Parhi DR. Controlling the motion of an autonomous mobile robot using various techniques: a review. Journal of Advance Mechanical Engineering. 2013;1(1):24-39.
[22] Wang H, Qin J. Pheromone learning-enhanced neural ant colony optimization. Neurocomputing. 2026:133375.
[23] Hameed S, Qolomany B, Belhaouari SB, Abdallah M, Qadir J, Al-Fuqaha A. Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models; 2025. Available from: https://arxiv.org/abs/2504.14126.
[24] Lissaman PB, Shollenberger CA. Formation flight of birds. Science. 1970;168(3934):1003-5.
[25] Heppner FH. Avian flight formations. Bird-banding. 1974;45(2):160-9.
[26] Harel R, Duriez O, Spiegel O, Fluhr J, Horvitz N, Getz WM, et al. Decision-making by a soaring bird: time, energy and risk considerations at different spatio-temporal scales. Philosophical Transactions of the Royal Society B: Biological Sciences. 2016;371(1704):20150397.
[27] FengYing Y, Din A, HuiChao L, Babar M, Ahmad S. Decentralized consensus in robotic swarm for collective collision and avoidance. IEEE Access. 2024;12:72143- 54.
[28] Chu H, Yi J, Yang F. Chaos particle swarm optimization enhancement algorithm for UAV safe path planning. Applied Sciences. 2022;12(18):8977.
[29] Mirjalili S, Mirjalili SM, Lewis A. Grey wolf optimizer. Advances in engineering software. 2014;69:46-61.
[30] Tang J, Liu G, Pan Q. A review on representative swarm intelligence algorithms for solving optimization problems: Applications and trends. IEEE/CAA Journal of Automatica Sinica. 2021;8(10):1627-43.
[31] Can U, Alatas B. Physics Based Metaheuristic Algorithms for Global Optimization. American Journal of Information Science and Computer Engineering. 2015;1(3):94-106.
[32] Jiang Y, Hu T, Huang C, Wu X. An improved particle swarm optimization algorithm. Applied Mathematics and Computation. 2007;193(1):231-9.
[33] Bai Q. Analysis of particle swarm optimization algorithm. Computer and information science. 2010;3(1):180.
[34] Wang G, Li Q, Guo L. Multiple UAVs routes planning based on particle swarm optimization algorithm. In: 2010 2nd International Symposium on Information Engineering and Electronic Commerce. IEEE; 2010. p. 1-5.
[35] Zhang Y, Wu L, Wang S, et al. UCAV path planning by fitness-scaling adaptive chaotic particle swarm optimization. Mathematical Problems in Engineering. 2013;2013.
[36] Mathew TV. Genetic algorithm. Report submitted at IIT Bombay. 2012;53:18-9.
[37] Duan H, Luo Q, Shi Y, Ma G. ? Hybrid particle swarm optimization and genetic algorithm for multi- UAV formation reconfiguration. IEEE Computational intelligence magazine. 2013;8(3):16-27.
[38] Ghamry KA, Kamel MA, Zhang Y. Multiple UAVs in forest fire fighting mission using particle swarm optimization. In: 2017 International conference on unmanned aircraft systems (ICUAS). IEEE; 2017. p. 1404-9.
[39] Wu X, Bai W, Xie Y, Sun X, Deng C, Cui H. A hybrid algorithm of particle swarm optimization, metropolis criterion and RTS smoother for path planning of UAVs. Applied Soft Computing. 2018;73:735-47.
[40] Ahmed G, Sheltami T, Mahmoud A, Yasar A. IoD swarms collision avoidance via improved particle swarm optimization. Transportation Research Part A: Policy and Practice. 2020;142:260-78.
[41] Phung MD, Ha QP. Motion-encoded particle swarm optimization for moving target search using UAVs. Applied Soft Computing. 2020;97:106705.
[42] He W, Qi X, Liu L. A novel hybrid particle swarm optimization for multi-UAV cooperate path planning. Applied Intelligence. 2021;51:7350-64.
[43] Mesquita R, Gaspar PD. A novel path planning optimization algorithm based on particle swarm optimization for UAVs for bird monitoring and repelling. Processes. 2021;10(1):62.
[44] Prasetya DA, Nguyen PT, Faizullin R, Iswanto I, Armay EF. Resolving the shortest path problem using the haversine algorithm. Journal of critical reviews. 2020;7(1):62-4.
[45] Yan M, Yuan H, Xu J, Yu Y, Jin L. Task allocation and route planning of multiple UAVs in a marine environment based on an improved particle swarm optimization algorithm. EURASIP Journal on Advances in Signal Processing. 2021;2021:1-23.
[46] Yu Z, Si Z, Li X, Wang D, Song H. A novel hybrid particle swarm optimization algorithm for path planning of UAVs. IEEE Internet of Things Journal. 2022;9(22):22547-58.
[47] Rutenbar RA. Simulated annealing algorithms: An overview. IEEE Circuits and Devices magazine. 2002;5(1):19-26.
[48] Blum C. Ant colony optimization: Introduction and recent trends. Physics of Life reviews. 2005;2(4):353- 73.
[49] Junger M, Reinelt G, Rinaldi G. The traveling salesman problem. Handbooks in operations research and management science. 1995;7:225-330.
[50] Gaertner D, Clark KL, et al. On Optimal Parameters for Ant Colony Optimization Algorithms. In: IC-AI. Citeseer; 2005. p. 83-9. Dorigo M, Blum C. Ant colony optimization theory: A survey. Theoretical computer science. 2005;344(2- 3):243-78.
[52] Qiannan Z, Ziyang Z, Chen G, Ruyi D. Path planning of UAVs formation based on improved ant colony optimization algorithm. In: Proceedings of 2014 IEEE Chinese Guidance, Navigation and Control Conference. IEEE; 2014. p. 1549-52.
[53] Cekmez U, Ozsiginan M, Sahingoz OK. Multi colony ant optimization for UAV path planning with obstacle avoidance. In: 2016 international conference on unmanned aircraft systems (ICUAS). IEEE; 2016. p. 47- 52.
[54] Yang F, Ji X, Yang C, Li J, Li B. Cooperative search of UAV swarm based on improved ant colony algorithm in uncertain environment. In: 2017 IEEE International Conference on Unmanned Systems (ICUS). Ieee; 2017. p. 231-6.
[55] Rosalie M, Danoy G, Chaumette S, Bouvry P. From random process to chaotic behavior in swarms of UAVs. In: Proceedings of the 6th ACM Symposium on Development and Analysis of Intelligent Vehicular Networks and Applications; 2016. p. 9-15.
[56] Gaspard P, et al. Rossler systems. Encyclopedia of nonlinear science. 2005;231:808-11.
[57] Rosalie M, Dentler JE, Danoy G, Bouvry P, Kannan S, Olivares-Mendez MA, et al. Area exploration with a swarm of UAVs combining deterministic chaotic ant colony mobility with position MPC. In: 2017 International Conference on Unmanned Aircraft Systems (ICUAS). IEEE; 2017. p. 1392-7.
[58] Garcia CE, Prett DM, Morari M. Model predictive control: Theory and practice—A survey. Automatica. 1989;25(3):335-48.
[59] Dentler J, Rosalie M, Danoy G, Bouvry P, Kannan S, Olivares-Mendez MA, et al. Collision avoidance effects on the mobility of a UAV swarm using chaotic ant colony with model predictive control. Journal of Intelligent & Robotic Systems. 2019;93:227-43.
[60] Holkar KS, Waghmare LM. An overview of model predictive control. International Journal of control and automation. 2010;3(4):47-63.
[61] Stolfi DH, Brust MR, Danoy G, Bouvry P. Emerging inter-swarm collaboration for surveillance using pheromones and evolutionary techniques. Sensors. 2020;20(9):2566.
[62] Rosalie M, Kieffer E, Brust MR, Danoy G, Bouvry P. Bayesian optimisation to select Rossler system parameters used in Chaotic Ant Colony Optimisation for Coverage. Journal of computational science. 2020;41:101047.
[63] Kyriakakis NA, Marinaki M, Marinakis Y. A hybrid ant colony optimization-variable neighborhood descent approach for the cumulative capacitated vehicle routing problem. Computers & Operations Research. 2021;134:105397.
[64] Ke L, Feng Z. A two-phase metaheuristic for the cumulative capacitated vehicle routing problem. Computers & Operations Research. 2013;40(2):633-8.
[65] Duarte A, Mladenović N, Sanchez-Oro J, Todosijević R. Variable neighborhood descent. In: Handbook of heuristics. Springer; 2016. p. 1-27.
[66] Liu C, Wu L, Xiao W, Li G, Xu D, Guo J, et al. An improved heuristic mechanism ant colony optimization algorithm for solving path planning. Knowledge-Based Systems. 2023;271:110540.
[67] Lu Y, Ma Y, Wang J, Han L. Task assignment of UAV swarm based on wolf pack algorithm. Applied Sciences. 2020;10(23):8335.
[68] Lambora A, Gupta K, Chopra K. Genetic algorithm- A literature review. In: 2019 international conference on machine learning, big data, cloud and parallel computing (COMITCon). IEEE; 2019. p. 380-4.
[69] Lu Y, Ma Y, Wang J. Multi-population parallel Wolf Pack algorithm for task assignment of UAV swarm. Applied Sciences. 2021;11(24):11996.
[70] Dong L, Yuan X, Yan B, Song Y, Xu Q, Yang X. An improved grey wolf optimization with multistrategy ensemble for robot path planning. Sensors. 2022;22(18):6843.
[71] Wang Z, Zhang J. A task allocation algorithm for a swarm of unmanned aerial vehicles based on bionic wolf pack method. Knowledge-Based Systems. 2022;250:109072.
[72] Jonker R, Volgenant T. Improving the Hungarian assignment algorithm. Operations research letters. 1986;5(4):171-5.
[73] Firmansyah ER, Masruroh SU, Fahrianto F. Comparative analysis of a* and basic theta* algorithm in android-based pathfinding games. In: 2016 6th International Conference on Information and Communication Technology for The Muslim World (ICT4M). IEEE; 2016. p. 275-80.
[74] Jayaweera HM, Hanoun S. A dynamic artificial potential field (D-APF) UAV path planning technique for following ground moving targets. IEEE access. 2020;8:192760-76.
[75] Obadina OO, Thaha MA, Mohamed Z, Shaheed MH. Grey-box modelling and fuzzy logic control of a Leader–Follower robot manipulator system: A hybrid Grey Wolf–Whale Optimisation approach. ISA transactions. 2022;129:572-93.
[76] Mirjalili S, Lewis A. The whale optimization algorithm. Advances in engineering software. 2016;95:51-67.
[77] Ortega R, Loria A, Nicklasson PJ, Sira-Ramirez H. Euler-Lagrange systems. In: Passivity-based Control of Euler-Lagrange Systems: Mechanical, Electrical and Electromechanical Applications. Springer; 1998. p. 15- 37.
[78] Jing X, Hou M, Li W, Chen C, Feng Z, Wang M. Task Parameter Planning Algorithm for UAV Area Complete Coverage in EO Sector Scanning Mode. Aerospace. 2023;10(7):612.
[79] Mladenović N, Hansen P. Variable neighborhood search. Computers & operations research. 1997;24(11):1097-100.
[80] Reynolds CW. Flocks, herds and schools: A distributed behavioral model. In: Proceedings of the 14th annual conference on Computer graphics and interactive techniques; 1987. p. 25-34.
[81] Vicsek T, Zafeiris A. Collective motion. Physics reports. 2012;517(3-4):71-140. Strombom D. Collective motion from local attraction. Journal of Theoretical Biology. 2011;283(1):145- 51. Available from: https://www.sciencedirect. com/science/article/pii/S002251931100261X.
[83] Chen M, Dai F, Wang H, Lei L. DFM: A distributed flocking model for UAV swarm networks. IEEE Access. 2018;6:69141-50.
[84] Dai F, Chen M, Wei X, Wang H. Swarm intelligence-inspired autonomous flocking control in UAV networks. IEEE Access. 2019;7:61786-96.
[85] Hauert S, Leven S, Varga M, Ruini F, Cangelosi A, Zufferey JC, et al. Reynolds flocking in reality with fixed-wing robots: communication range vs. maximum turning rate. In: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE; 2011. p. 5015-20.
[86] Braga RG, da Silva RC, Ramos ACB, Reynolds CW. Development of a Swarming Algorithm Based on Reynolds Rules to control a group of multi-rotor UAVs using ROS; 2016. Available from: https://api. semanticscholar.org/CorpusID:221093766.
[87] Vasarhelyi G, Viragh C, Somorjai G, Tarcai N, Szorenyi T, Nepusz T, et al. Outdoor flocking and formation flight with autonomous aerial robots. In: 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE; 2014. p. 3866-73.
[88] Muinos-Landin S, Fischer A, Holubec V, Cichos F. Reinforcement learning with artificial microswimmers. Science Robotics. 2021;6(52):eabd9285.
[89] Hamet P, Tremblay J. Artificial intelligence in medicine. Metabolism. 2017;69:S36-40.
[90] ErtelW. Introduction to artificial intelligence. Springer; 2018.
[91] Ramesh A, Kambhampati C, Monson JR, Drew P. Artificial intelligence in medicine. Annals of the Royal College of Surgeons of England. 2004;86(5):334.
[92] Zhang Z, Zhang Z. Artificial neural network. Multivariate time series analysis in climate and environmental research. 2018:1-35.
[93] Sharma S, Sharma S, Athaiya A. Activation functions in neural networks. Towards Data Sci. 2017;6(12):310-6.
[94] Shanmuganathan S. Artificial neural network modelling: An introduction. Springer; 2016.
[95] Nwadiugwu MC. Neural networks, artificial intelligence and the computational brain. arXiv preprint arXiv:210108635. 2020.
[96] Shao F, Shen Z. How can artificial neural networks approximate the brain? Frontiers in psychology. 2023;13:970214.
[97] Wilamowski BM, Yu H. Neural network learning without backpropagation. IEEE Transactions on Neural Networks. 2010;21(11):1793-803.
[98] Ranganathan A. The levenberg-marquardt algorithm. Tutoral on LM algorithm. 2004;11(1):101-10.
[99] More JJ. The Levenberg-Marquardt algorithm: implementation and theory. In: Numerical analysis: proceedings of the biennial Conference held at Dundee, June 28–July 1, 1977. Springer; 2006. p. 105-16.
[100] Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G. The graph neural network model. IEEE transactions on neural networks. 2008;20(1):61-80.
[101] Wu F, Souza A, Zhang T, Fifty C, Yu T, Weinberger K. Simplifying graph convolutional networks. In: International conference on machine learning. Pmlr; 2019. p. 6861-71.
[102] Weiss GH. Random walks and their applications: Widely used as mathematical models, random walks play an important role in several areas of physics, chemistry, and biology. American Scientist. 1983;71(1):65-71.
[103] Chinea A. Understanding the principles of recursive neural networks: A generative approach to tackle model complexity. In: International Conference on Artificial Neural Networks. Springer; 2009. p. 952-63.
[104] Goller C, Kuchler A. Learning task-dependent distributed representations by backpropagation through structure. In: Proceedings of international conference on neural networks (ICNN’96). vol. 1. IEEE; 1996. p. 347-52.
[105] Frasconi P, Gori M, Sperduti A. A general framework for adaptive processing of data structures. IEEE transactions on Neural Networks. 1998;9(5):768-86.
[106] O’shea K, Nash R. An introduction to convolutional neural networks. arXiv preprint arXiv:151108458. 2015.
[107] Chen M, Wei Z, Huang Z, Ding B, Li Y. Simple and deep graph convolutional networks. In: International conference on machine learning. PMLR; 2020. p. 1725- 35.
[108] Scarselli F, Gori M, Tsoi A, Hagenbuchner M, Monfardini G. The Graph Neural Network Model. IEEE transactions on neural networks / a publication of the IEEE Neural Networks Council. 2009 01;20:61-80.
[109] Kortvelesy R, Prorok A. ModGNN: Expert policy approximation in multi-agent systems with a modular graph neural network architecture. In: 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE; 2021. p. 9161-7.
[110] Zhou J, Cheng J, Zhang L, Zhang W. A General Auxiliary Controller for Multi-agent Flocking. In: 2021 27th International Conference on Mechatronics and Machine Vision in Practice (M2VIP). IEEE; 2021. p. 789-94.
[111] Tolstaya EV, Gama F, Paulos J, Pappas G, Kumar VR, Ribeiro A. Learning Decentralized Controllers for Robot Swarms with Graph Neural Networks. ArXiv. 2019;abs/1903.10527. Available from: https://api. semanticscholar.org/CorpusID:85517736.
[112] Chen S, Sun Y, Li P, Zhou L, Lu CT. Spatial Temporal Graph Neural Networks for Decentralized Control of Robot Swarms. In: Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems. SIGSPATIAL ’23. ACM; 2023. p. 1–4. Available from: http://dx.doi.org/10.1145/ 3589132.3625630.
[113] Subasi A. Chapter 3 - Machine learning techniques. In: Subasi A, editor. Practical Machine Learning for Data Analysis Using Python. Academic Press; 2020. p. 91- 202. Available from: https://www.sciencedirect. com/science/article/pii/B9780128213797000035.
[114] Ghojogh B, Ghodsi A. Recurrent Neural Networks and Long Short-Term Memory Networks: Tutorial and Survey; 2023. Available from: https://arxiv.org/ abs/2304.11461.
[115] Song Y, Fang X, Liu B, Li C, Li Y, Yang SX. A novel foraging algorithm for swarm robotics based on virtual pheromones and neural network. Applied Soft Computing. 2020;90:106156.
[116] Tinoco CR, Oliveira G. Pherocom: decentralised and asynchronous swarm robotics coordination based on virtual pheromone and vibroacoustic communication. arXiv preprint arXiv:220213456. 2022.
[117] Le VT, Ngo TD. Virtual pheromone based network flow control for modular robotic systems. Electronics. 2020;9(3):481.
[118] Na S, Qiu Y, Turgut AE, Ulrich J, Krajnik T, Yue S, et al. Bio-inspired artificial pheromone system for swarm robotics applications. Adaptive Behavior. 2021;29(4):395-415.
[119] Liu T, Sun X, Hu C, Fu Q, Yue S. A multiple pheromone communication system for swarm intelligence. IEEE Access. 2021;9:148721-37.
[120] Na S, Raoufi M, Turgut AE, Krajnik T, Arvin F. Extended artificial pheromone system for swarm robotic applications. In: Artificial life conference proceedings. MIT Press One Rogers Street, Cambridge, MA 02142-1209, USA journals-info . . . ; 2019. p. 608- 15.
[121] Gosrich W, Mayya S, Li R, Paulos J, Yim M, Ribeiro A, et al.. Coverage Control in Multi-Robot Systems via Graph Neural Networks; 2021. Available from: https: //arxiv.org/abs/2109.15278.
[122] Agarwal S, Ribeiro A, Kumar V. Asynchronous Perception-Action-Communication with Graph Neural Networks; 2023. Available from: https://arxiv.org/ abs/2309.10164.
[123] Ghosh A, Sufian A, Sultana F, Chakrabarti A, De D. In: Fundamental Concepts of Convolutional Neural Network; 2020. p. 519-67.
[124] Muandet K, Fukumizu K, Sriperumbudur B, Scholkopf B. Kernel Mean Embedding of Distributions: A Review and Beyond. Foundations and Trends in Machine Learning. 2017;10(1–2):1–141. Available from: http: //dx.doi.org/10.1561/2200000060.
[125] Hussein A, Gaber MM, Elyan E, Jayne C. Imitation learning: A survey of learning methods. ACM Computing Surveys (CSUR). 2017;50(2):1-35.
[126] Tang C, Monteleoni C. On Lloyd’s algorithm: new theoretical insights for clustering in practice. In: Artificial Intelligence and Statistics. PMLR; 2016. p. 1280-9.
[127] Faber V, Gunzburger M. Centroidal Voronoi Tessellations: Applications and Algorithms. Siam Review - SIAM REV. 1999 12;41:637-76.
[128] Agarwal S, Muthukrishnan R, Gosrich W, Kumar V, Ribeiro A. LPAC: Learnable Perception-Action- Communication Loops with Applications to Coverage Control; 2024. Available from: https://arxiv.org/ abs/2401.04855.
[129] Chen S, Sun Y, Li P, Zhou L, Lu CT. Spatial Temporal Graph Neural Networks for Decentralized Control of Robot Swarms. In: Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems; 2023. p. 1-4.
[130] Chen S, Sun Y, Li P, Zhou L, Lu CT. Learning Decentralized Flocking Controllers with Spatio-Temporal Graph Neural Network. In: 2024 IEEE International Conference on Robotics and Automation (ICRA). IEEE; 2024. p. 2596-602.
[131] Liu S, Chang P, Liang W, Chakraborty N, Driggs- Campbell K. Decentralized structural-rnn for robot crowd navigation with deep reinforcement learning. In: 2021 IEEE international conference on robotics and automation (ICRA). IEEE; 2021. p. 3517-24.
[132] Trautman P, Krause A. Unfreezing the robot: Navigation in dense, interacting crowds. In: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems; 2010. p. 797-803.
[133] Long P, Liu W, Pan J. Deep-learned collision avoidance policy for distributed multiagent navigation. IEEE Robotics and Automation Letters. 2017;2(2):656-63.
[134] Golilarz NA, Gao H, Addeh A, Pirasteh S. ORCA optimization algorithm: A new meta-heuristic tool for complex optimization problems. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). IEEE; 2020. p. 198-204.
[135] Huang Z, Li R, Shin K, Driggs-Campbell K. Learning sparse interaction graphs of partially detected pedestrians for trajectory prediction. IEEE Robotics and Automation Letters. 2021;7(2):1198-205.
[136] Jang E, Gu S, Poole B. Categorical reparameterization with gumbel-softmax. arXiv preprint arXiv:161101144. 2016.
[137] Deng Z, Gao P, Jose WJ, Reardon C, Wigness M, Rogers J, et al. Coordinated multi-robot navigation with formation adaptation. In: 2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE; 2025. p. 11384-91.
[138] Doya K. Reinforcement learning: Computational theory and biological mechanisms. HFSP journal. 2007;1(1):30.
[139] Mahajan S. Reinforcement learning: A review from a machine learning perspective. International Journal. 2014;4(8).
[140] Jia J, Wang W. Review of reinforcement learning research. In: 2020 35th Youth Academic Annual Conference of Chinese Association of Automation (YAC). IEEE; 2020. p. 186-91.
[141] Lin B. Reinforcement learning and bandits for speech and language processing: Tutorial, review and outlook. Expert Systems with Applications. 2023:122254.
[142] Ssengonzi C, Kogeda OP, Olwal TO. A survey of deep reinforcement learning application in 5G and beyond network slicing and virtualization. Array. 2022;14:100142.
[143] Mnih V, Badia AP, Mirza M, Graves A, Lillicrap T, Harley T, et al. Asynchronous methods for deep reinforcement learning. In: International conference on machine learning. PmLR; 2016. p. 1928-37.
[144] Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O. Proximal policy optimization algorithms. arXiv preprint arXiv:170706347. 2017.
[145] Schulman J, Levine S, Abbeel P, Jordan M, Moritz P. Trust region policy optimization. In: International conference on machine learning. PMLR; 2015. p. 1889- 97.
[146] Lopez-Incera A, Ried K, Muller T, Briegel HJ. Development of swarm behavior in artificial learning agents that adapt to different foraging environments. PLoS One. 2020;15(12):e0243628.
[147] Abpeikar S, Kasmarik K, Garratt M, Hunjet R, Khan MM, Qiu H. Automatic collective motion tuning using actor-critic deep reinforcement learning. Swarm and Evolutionary Computation. 2022;72:101085.
[148] Zhu W, Yu S, Chen H, Gong Z. Review of Application of Model-free Reinforcement Learning in Intelligent Decision. Highlights in Science, Engineering and Technology. 2023;56:315-23.
[149] Xie C, Patil S, Moldovan T, Levine S, Abbeel P. Modelbased reinforcement learning with parametrized physical models and optimism-driven exploration. In: 2016 IEEE international conference on robotics and automation (ICRA). IEEE; 2016. p. 504-11.
[150] Kuvayev L, Sutton RS. Model-based reinforcement learning with an approximate, learned model. In: Proceedings of the ninth Yale workshop on adaptive and learning systems. Citeseer; 1996. p. 101-5.
[151] Neto G. From single-agent to multi-agent reinforcement learning: Foundational concepts and methods. Learn Theory Course 2005; 2;.
[152] Jin R, Chen Z, Lin Y, Song J, Wierman A. Approximate global convergence of independent learning in multiagent systems. arXiv preprint arXiv:240519811. 2024.
[153] Jaquier N, et al. Transfer learning in robotics: an upcoming breakthrough. A review of promises and challenges. 2023.
[154] Yuan L, Zhang Z, Li L, Guan C, Yu Y. A survey of progress on cooperative multi-agent reinforcement learning in open environment. arXiv 2023. arXiv preprint arXiv:231201058.
[155] Liu Q, Guo J, Lin S, Ma S, Zhu J, Li Y. MASQ: Multi- Agent Reinforcement Learning for Single Quadruped Robot Locomotion. arXiv preprint arXiv:240813759. 2024.
[156] Du W, Ding S. A survey on multi-agent deep reinforcement learning: from the perspective of challenges and applications. Artificial Intelligence Review. 2021;54(5):3215-38.
[157] Zai A, Brown B. Deep reinforcement learning in action. Manning Publications; 2020.
[158] Hong D, Lee S, Cho YH, Baek D, Kim J, Chang N. Energy-efficient online path planning of multiple drones using reinforcement learning. IEEE Transactions on Vehicular Technology. 2021;70(10):9725-40.
[159] Javaid A. Understanding Dijkstra’s algorithm. Available at SSRN 2340905. 2013.
[160] Wang H, Yu Y, Yuan Q. Application of Dijkstra algorithm in robot path-planning. In: 2011 second international conference on mechanic automation and control engineering. IEEE; 2011. p. 1067-9.
[161] Goh KC, Ng RB, Wong YK, Ho NJ, Chua MC. Aerial filming with synchronized drones using reinforcement learning. Multimedia Tools and Applications. 2021;80:18125-50.
[162] Venturini F, Mason F, Pase F, Chiariotti F, Testolin A, Zanella A, et al. Distributed reinforcement learning for flexible and efficient uav swarm control. IEEE Transactions on Cognitive Communications and Networking. 2021;7(3):955-69.
[163] Kozma R, Alippi C, Choe Y, Morabito FC. Artificial intelligence in the age of neural networks and brain computing. Academic Press; 2018.
[164] Ataur Khalil A, Byrne AJ, Ashiqur Rahman M, Manshaei MH. Efficient UAV Trajectory-Planning using Economic Reinforcement Learning. arXiv e-prints. 2021:arXiv-2103.
[165] Long P, Fan T, Liao X, Liu W, Zhang H, Pan J. Towards optimally decentralized multi-robot collision avoidance via deep reinforcement learning. In: 2018 IEEE international conference on robotics and automation (ICRA). IEEE; 2018. p. 6252-9.
[166] Batra S, Huang Z, Petrenko A, Kumar T, Molchanov A, Sukhatme GS. Decentralized control of quadrotor swarms with end-to-end deep reinforcement learning. In: Conference on Robot Learning. PMLR; 2022. p. 576- 86.
[167] Canese L, Cardarilli GC, Di Nunzio L, Fazzolari R, Giardino D, Re M, et al. Multi-agent reinforcement learning: A review of challenges and applications. Applied Sciences. 2021;11(11):4948.
[168] Kapoor S. Multi-agent reinforcement learning: A report on challenges and approaches. arXiv preprint arXiv:180709427. 2018.
[169] Zhao X, Yang R, Zhang Y, Yan M, Yue L. Deep reinforcement learning for intelligent dual-UAV reconnaissance mission planning. Electronics. 2022;11(13):2031.
[170] Singh A, Yang L, Hartikainen K, Finn C, Levine S. Endto- end robotic reinforcement learning without reward engineering. arXiv preprint arXiv:190407854. 2019.
[171] Qi J, Zhou Q, Lei L, Zheng K. Federated reinforcement learning: Techniques, applications, and open challenges. arXiv preprint arXiv:2108.11887. 2021.
[172] Zhuo HH, Feng W, Lin Y, Xu Q, Yang Q. Federated deep reinforcement learning. arXiv preprint arXiv:1901.08277. 2019.
[173] Lee W, et al. Federated reinforcement learning-based UAV swarm system for aerial remote sensing. Wireless Communications and Mobile Computing. 2022;2022.
[174] Abpeikar S, Kasmarik K, Garratt M. Iterative transfer learning for automatic collective motion tuning on multiple robot platforms. Frontiers in Neurorobotics. 2023;17:1113991.
[175] Xu Y, Lu X, Fei Y, Huang Y. Iterative self-transfer learning: A general methodology for response time-history prediction based on small dataset. Journal of Computational Design and Engineering. 2022;9(5):2089-102.
[176] Zhong X, Guo S, Shan H, Gao L, Xue D, Zhao N. Feature-based transfer learning based on distribution similarity. IEEE Access. 2018;6:35551-7.
[177] Lowe R, Wu YI, Tamar A, Harb J, Pieter Abbeel O, Mordatch I. Multi-agent actor-critic for mixed cooperative-competitive environments. Advances in neural information processing systems. 2017;30.
[178] Wang Z, Guo Y, Li N, Hu S, Wang M. Autonomous collaborative combat strategy of unmanned system group in continuous dynamic environment based on PDMADDPG. Computer Communications. 2023;200:182-204.
[179] Yin Y, Chen Z, Liu G, Yin J, Guo J. Autonomous navigation of mobile robots in unknown environments using off-policy reinforcement learning with curriculum learning. Expert Systems with Applications. 2024:123202.
[180] Haarnoja T, Zhou A, Abbeel P, Levine S. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: International conference on machine learning. PMLR; 2018. p. 1861-70.
[181] Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Advances in neural information processing systems. 2017;30.
[182] Hu K, Xu K, Xia Q, Li M, Song Z, Song L, et al. An overview: Attention mechanisms in multi-agent reinforcement learning. Neurocomputing. 2024;598:128015.
[183] Singh CD, He B, Fermuller C, Metzler C, Aloimonos Y. Minimal perception: enabling autonomy in resource constrained robots. Frontiers in Robotics and AI. 2024;11:1431826.
[184] Abdalwhab ABM, Beltrame G, Kahou SE, St-Onge D. Attention-Based Multi-Agent RL for Multi-Machine Tending Using Mobile Robots. AI. 2025;6(10):252.
[185] Abdalwhab A, Beltrame G, Kahou SE, St-Onge D. Learning multi-agent multi-machine tending by mobile robots. arXiv preprint arXiv:2408.16875. 2024.
[186] Liu S, Wu Z. Efficient multi-robot exploration via multi-head attention-based cooperation strategy. arXiv preprint arXiv:1911.01774. 2019.
[187] Escudie E, Matignon L, Saraydaryan J. Attention graph for multi-robot social navigation with deep reinforcement learning. arXiv preprint arXiv:2401.17914. 2024.
[188] Jianliang A. A multimodal educational robots driven via dynamic attention. Frontiers in Neurorobotics. 2024;18:1453061.
[189] Bhuyan B, Sarma HKD, Sarma N, Kar A, Mall R. Quality of service (QoS) provisions in wireless sensor networks and related challenges. Wireless Sensor Network. 2010;2(11):861.
[190] Ming J, Xie Z, Teng H. Optimization of A comprehensive dispatching system based on ant colony algorithm and dynamic weight power dispatching strategy. Scientific Reports. 2025;15(1):39441.
[191] Liu PX, Meng M, Ye X, Gu J. An UDP-based protocol for Internet robots. In: Proceedings of the 4th World Congress on Intelligent Control and Automation (Cat. No.02EX527). vol. 1; 2002. p. 59-65.
[192] Botta A, Rotbei S, Zinno S, Ventre G. Cyber security of robots: A comprehensive survey. Intelligent Systems with Applications. 2023;18:200237.
[193] Moroncelli A, Pacheco A, Strobel V, Lajoie PY, Dorigo M, Reina A. Byzantine fault detection in Swarm-SLAM using blockchain and geometric constraints. In: International Conference on Swarm Intelligence. Springer; 2024. p. 42-56.
[194] Yaacoub JPA, Noura HN, Salman O, Chehab A. Robotics cyber security: Vulnerabilities, attacks, countermeasures, and recommendations. International Journal of Information Security. 2022;21(1):115-58.
[195] Trujillo JC, Munguia R, Guerra E, Grau A. Visual-based SLAM configurations for cooperative multi-UAV systems with a lead agent: an observability-based approach. Sensors. 2018;18(12):4243.
[196] Trujillo JC, Munguia R, Guerra E, Grau A. Cooperative monocular-based SLAM for multi-UAV systems in GPS-denied environments. Sensors. 2018;18(5):1351.
[197] Itani M, Chen T, Yoshioka T, Gollakota S. Creating speech zones with self-distributing acoustic swarms. Nature Communications. 2023;14(1):5684.
[198] Lee S, Min BC. Distributed control of multi-robot systems in the presence of deception and denial of service attacks. arXiv preprint arXiv:2102.00098. 2021.
[199] Dev K, Madhwal Y, Shevelo S, Osinenko P, Yanovich Y. SwarmRaft: Leveraging Consensus for Robust Drone Swarm Coordination in GNSS-Degraded Environments. IEEE Internet of Things Journal. 2025.
[200] Chen H, Wen C, Li X. Resilient Multi-Dimensional Consensus and Distributed Optimization against Agent-Based and Denial-of-Service Attacks. arXiv preprint arXiv:2510.06835. 2025.
[201] Salem MA, Perez MC, Rabia AH. A TinyML Reinforcement Learning Approach for Energy-Efficient Light Control in Low-Cost Greenhouse Systems. In: 2025 Interdisciplinary Conference on Electrics and Computer (INTCEC). IEEE; 2025. p. 1-6.
[202] Alongi F, Ghielmetti N, Pau D, Terraneo F, Fornaciari W. Tiny neural networks for environmental predictions: An integrated approach with miosix. In: 2020 IEEE International Conference on Smart Computing (SMARTCOMP). IEEE; 2020. p. 350-5.
[203] Cereda E, Giusti A, Palossi D. Training on the Fly: On-device Self-supervised Learning aboard Nano-drones within 20 mW. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 2024;43(11):3685-95.
[204] Amato C. An introduction to centralized training for decentralized execution in cooperative multiagent reinforcement learning. arXiv preprint arXiv:2409.03052. 2024.
[205] Attaran M, Celik BG. Digital Twin: Benefits, use cases, challenges, and opportunities. Decision Analytics Journal. 2023;6:100165.
[206] Jawhar I, Mohamed N, Wu J, Al-Jaroodi J. Networking of multi-robot systems: Architectures and requirements. Journal of Sensor and Actuator Networks. 2018;7(4):52.
License
Copyright (c) 2025 Maikano Kenneth Oganne, Thabo Semong, Dimane Mpoeleng

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transforming, and building upon the material in any medium so long as the original work is properly cited.
