Deep Reinforcement Learning-Based Intelligent Control for Efficiency Enhancement in Thermal Power Plant Fuel Management
DOI:
https://doi.org/10.4108/ew.11848
Keywords:
Deep Reinforcement Learning, Thermal Power Plants, Fuel Management, Efficiency Optimization, Multi‑Objective Control, CO₂ Emissions
Abstract
Thermal power plants remain a significant component of global power generation; however, several limitations persist. This work proposes an intelligent fuel management system based on Deep Reinforcement Learning (DRL) with the Proximal Policy Optimization (PPO) algorithm, as a step toward more efficient and sustainable operation of thermal power plants. The fuel management problem is formulated as a Markov Decision Process (MDP) in which a DRL agent interacts with the boiler–turbine and condenser system using real efficiency data from a thermal power plant. A multi-objective reward function is designed with a reward shaping strategy, so that the reward signal is explicitly structured to guide the agent toward thermodynamically efficient and emission-aware plant operation. The reward maximizes thermal efficiency while penalizing higher heat rate, auxiliary power consumption, and CO₂ emissions. Experimental results show that the proposed DRL approach outperforms conventional control models. Thermal efficiency rises from 33.68% to 35.72%, an absolute gain of 2.04 percentage points, while auxiliary power demand falls from 6.08% to 5.73%. More significantly, the optimized policy is expected to reduce CO₂ emissions by 15–20% and lowers the heat rate from 14,000 kJ/kWh to 11,000–12,000 kJ/kWh. Convergence during training is indicated by rising episode rewards and decreasing loss values. This work represents a new application of PPO-based deep RL with a multi-objective reward design to real-time closed-loop fuel management, offering a scalable and adaptable alternative to rule-based and traditional supervised learning methods.
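The multi-objective reward described in the abstract could be sketched as below. This is a minimal illustration under stated assumptions, not the authors' formulation: the abstract does not give the functional form, so the linear weighted sum, the weights, the normalization constants, and the names `PlantState`, `RewardWeights`, and `shaped_reward` are all hypothetical. The efficiency, auxiliary power, and heat rate figures come from the abstract (with 11,500 kJ/kWh taken as the midpoint of the reported range); the CO₂ intensity values are placeholders chosen only to represent a 15% reduction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlantState:
    """One observation of plant performance (field names are hypothetical)."""
    thermal_efficiency_pct: float   # thermal efficiency, %
    heat_rate_kj_per_kwh: float     # heat rate, kJ/kWh
    aux_power_pct: float            # auxiliary power consumption, % of output
    co2_intensity: float            # CO2 emissions, arbitrary normalized unit


@dataclass(frozen=True)
class RewardWeights:
    """Relative importance of each objective (values are assumptions)."""
    efficiency: float = 1.0
    heat_rate: float = 0.5
    aux_power: float = 0.5
    co2: float = 0.5


# Normalization constants (assumptions) that bring each term to order 1.
HEAT_RATE_REF = 14_000.0   # kJ/kWh, the baseline heat rate quoted in the abstract
AUX_POWER_REF = 10.0       # %
CO2_REF = 1.0              # same arbitrary unit as co2_intensity


def shaped_reward(state: PlantState, w: RewardWeights = RewardWeights()) -> float:
    """Reward efficiency; penalize heat rate, auxiliary power, and CO2."""
    efficiency_term = state.thermal_efficiency_pct / 100.0
    heat_rate_term = state.heat_rate_kj_per_kwh / HEAT_RATE_REF
    aux_power_term = state.aux_power_pct / AUX_POWER_REF
    co2_term = state.co2_intensity / CO2_REF
    return (
        w.efficiency * efficiency_term
        - w.heat_rate * heat_rate_term
        - w.aux_power * aux_power_term
        - w.co2 * co2_term
    )


# Baseline and optimized operating points built from the abstract's figures.
baseline = PlantState(33.68, 14_000.0, 6.08, co2_intensity=1.00)
optimized = PlantState(35.72, 11_500.0, 5.73, co2_intensity=0.85)

if __name__ == "__main__":
    print("baseline reward: ", shaped_reward(baseline))
    print("optimized reward:", shaped_reward(optimized))
```

With these assumed weights, the optimized operating point scores higher than the baseline, which is the direction a PPO agent trained on such a signal would be pushed. In practice the weights would need tuning so that no single penalty dominates the efficiency term.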
License
Copyright (c) 2026 Rui Zhu, Qiang Liu, Guofeng Li, Xufeng Hong, Zhenlu Tian, Yanjun Guo, Shengju Hao

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open-access article distributed under the terms of the Creative Commons Attribution CC BY 4.0 license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.