Risk-Aware Reinforcement Learning for Cooperative Autonomous Vehicle Coordination with Adaptive Risk Sensitivity and Multi-Agent Optimization
DOI: https://doi.org/10.4108/eetiot.10944
Keywords: risk-aware reinforcement learning, cooperative autonomous vehicles, adaptive risk sensitivity, multi-agent reinforcement learning, Bayesian risk modelling, intelligent transportation systems
Abstract
Ensuring safe and efficient coordination of autonomous vehicles (AVs) in intelligent transportation systems is particularly difficult under dense, uncertain, and rapidly changing traffic conditions. Many existing reinforcement learning (RL) methods show good performance in simplified environments but fail to fully account for heterogeneous risk exposure and non-stationary, multi-agent interactions. To address this gap, this paper introduces a Risk-Aware Reinforcement Learning (RARL) framework that couples adaptive risk sensitivity with Bayesian risk estimation in a cooperative multi-agent setting. Within RARL, reward signals are dynamically reshaped using real-time probabilistic risk measures, allowing AV agents to jointly balance safety and traffic efficiency across signalised intersections, multi-lane highways, and roundabout scenarios.
The proposed approach is evaluated using SUMO, CARLA, NGSIM, and INTERACTION benchmarks. Compared with strong multi-agent RL baselines such as Bi-AC, MACPO, and MAPPO-L, RARL achieves up to 30% fewer collisions, about 25% higher throughput, roughly 30% improvement in scenario-recognition accuracy, and around 20% faster training convergence. These empirical results show that explicit and adaptive risk modelling significantly enhances policy robustness, scalability, and cooperative behaviour in heterogeneous traffic. By tightly integrating risk-aware decision making with multi-agent coordination, RARL provides a scalable and practically deployable paradigm for next-generation autonomous driving, improving safety, reliability, and real-time adaptability.
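To make the reward-reshaping idea concrete, the following is a minimal sketch of how a probabilistic risk estimate could modulate an agent's reward signal. The paper does not publish its implementation, so everything here is an assumption: we model per-state collision risk with a simple Beta-posterior update over observed near-miss events, and subtract an adaptively weighted risk penalty from the task reward. The class and function names are hypothetical.

```python
class BayesianRiskEstimator:
    """Hypothetical Beta-posterior estimate of collision probability.

    Assumes a conjugate Beta(alpha, beta) prior updated from binary
    near-miss observations; the paper's actual risk model may differ.
    """

    def __init__(self, alpha: float = 1.0, beta: float = 1.0):
        self.alpha = alpha  # pseudo-count of risky (near-miss) events
        self.beta = beta    # pseudo-count of safe events

    def update(self, near_miss: bool) -> None:
        # Standard conjugate update: increment the matching pseudo-count.
        if near_miss:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def risk(self) -> float:
        # Posterior mean of the collision probability.
        return self.alpha / (self.alpha + self.beta)


def shaped_reward(base_reward: float, risk: float, sensitivity: float) -> float:
    """Reshape the task reward with a risk penalty.

    `sensitivity` stands in for the adaptive risk-sensitivity weight:
    higher values penalise risky states more strongly.
    """
    return base_reward - sensitivity * risk


# After repeated near-miss observations, the estimated risk rises and
# the shaped reward for the same base reward falls accordingly.
est = BayesianRiskEstimator()
for event in [True, False, True, True]:
    est.update(event)
print(round(est.risk, 3))
print(round(shaped_reward(1.0, est.risk, sensitivity=2.0), 3))
```

In a full multi-agent setup, each AV agent would maintain such an estimator per state region and the sensitivity weight itself would be adapted online, which is the coupling the abstract refers to.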
License
Copyright (c) 2026 Malikireddy Ramesh Reddy, Annalakshmi Govindaraj

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open-access article distributed under the terms of the Creative Commons Attribution CC BY 4.0 license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.
