Risk-Aware Reinforcement Learning for Cooperative Autonomous Vehicle Coordination with Adaptive Risk Sensitivity and Multi-Agent Optimization
DOI: https://doi.org/10.4108/eetiot.10944
Keywords: risk-aware reinforcement learning, cooperative autonomous vehicles, adaptive risk sensitivity, multi-agent reinforcement learning, Bayesian risk modelling, intelligent transportation systems
Abstract
Ensuring safe and efficient coordination of autonomous vehicles (AVs) in intelligent transportation systems is particularly difficult under dense, uncertain, and rapidly changing traffic conditions. Many existing reinforcement learning (RL) methods show good performance in simplified environments but fail to fully account for heterogeneous risk exposure and non-stationary, multi-agent interactions. To address this gap, this paper introduces a Risk-Aware Reinforcement Learning (RARL) framework that couples adaptive risk sensitivity with Bayesian risk estimation in a cooperative multi-agent setting. Within RARL, reward signals are dynamically reshaped using real-time probabilistic risk measures, allowing AV agents to jointly balance safety and traffic efficiency across signalised intersections, multi-lane highways, and roundabout scenarios.
The proposed approach is evaluated using SUMO, CARLA, NGSIM, and INTERACTION benchmarks. Compared with strong multi-agent RL baselines such as Bi-AC, MACPO, and MAPPO-L, RARL achieves up to 30% fewer collisions, about 25% higher throughput, roughly 30% improvement in scenario-recognition accuracy, and around 20% faster training convergence. These empirical results show that explicit and adaptive risk modelling significantly enhances policy robustness, scalability, and cooperative behaviour in heterogeneous traffic. By tightly integrating risk-aware decision making with multi-agent coordination, RARL provides a scalable and practically deployable paradigm for next-generation autonomous driving, improving safety, reliability, and real-time adaptability.
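To make the reward-reshaping idea concrete, the following is a minimal sketch of how a probabilistic risk estimate could modulate an agent's reward signal. The paper does not publish its implementation, so everything here is an assumption: we model per-state collision risk with a simple Beta-posterior update over observed near-miss events, and subtract an adaptively weighted risk penalty from the task reward. The class and function names are hypothetical.

```python
class BayesianRiskEstimator:
    """Hypothetical Beta-posterior estimate of collision probability.

    Assumes a conjugate Beta(alpha, beta) prior updated from binary
    near-miss observations; the paper's actual risk model may differ.
    """

    def __init__(self, alpha: float = 1.0, beta: float = 1.0):
        self.alpha = alpha  # pseudo-count of risky (near-miss) events
        self.beta = beta    # pseudo-count of safe events

    def update(self, near_miss: bool) -> None:
        # Standard conjugate update: increment the matching pseudo-count.
        if near_miss:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def risk(self) -> float:
        # Posterior mean of the collision probability.
        return self.alpha / (self.alpha + self.beta)


def shaped_reward(base_reward: float, risk: float, sensitivity: float) -> float:
    """Reshape the task reward with a risk penalty.

    `sensitivity` stands in for the adaptive risk-sensitivity weight:
    higher values penalise risky states more strongly.
    """
    return base_reward - sensitivity * risk


# After repeated near-miss observations, the estimated risk rises and
# the shaped reward for the same base reward falls accordingly.
est = BayesianRiskEstimator()
for event in [True, False, True, True]:
    est.update(event)
print(round(est.risk, 3))
print(round(shaped_reward(1.0, est.risk, sensitivity=2.0), 3))
```

In a full multi-agent setup, each AV agent would maintain such an estimator per state region and the sensitivity weight itself would be adapted online, which is the coupling the abstract refers to.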
License
Copyright (c) 2026 Malikireddy Ramesh Reddy, Annalakshmi Govindaraj

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open-access article distributed under the terms of the Creative Commons Attribution CC BY 4.0 license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.
