Deep Reinforcement Learning-Based Intelligent Control for Efficiency Enhancement in Thermal Power Plant Fuel Management

Authors

  • Rui Zhu Shandong Energy Group Lingtai Thermal Power Generation Co., Ltd
  • Qiang Liu Shandong Energy Group Lingtai Thermal Power Generation Co., Ltd
  • Guofeng Li Shandong Energy Group Lingtai Thermal Power Generation Co., Ltd
  • Xufeng Hong Shandong Energy Group Lingtai Thermal Power Generation Co., Ltd
  • Zhenlu Tian Shandong Energy Group Lingtai Thermal Power Generation Co., Ltd
  • Yanjun Guo Xi’an Thermal Power Research Institute Co., Ltd
  • Shengju Hao Xi'an YTRG Co., Ltd

DOI:

https://doi.org/10.4108/ew.11848

Keywords:

Deep Reinforcement Learning, Thermal Power Plants, Fuel Management, Efficiency Optimization, Multi‑Objective Control, CO₂ Emissions

Abstract

Thermal power plants remain a significant component of global power generation; however, limitations in operating efficiency and sustainability persist. This work proposes an intelligent fuel management system based on Deep Reinforcement Learning with the Proximal Policy Optimization (PPO) algorithm to increase the efficiency and sustainability of thermal power plant operation. The fuel management problem is formulated as a Markov Decision Process (MDP) environment in which a Deep Reinforcement Learning agent interacts with the boiler–turbine and condenser system using real efficiency data from a thermal power plant. A multi-objective reward function was formulated using a reward shaping strategy, whereby the reward signal is explicitly structured to guide the reinforcement learning agent toward thermodynamically efficient and emission-aware plant operation. The reward formulation maximizes thermal efficiency while penalizing higher heat rate, auxiliary power consumption, and CO₂ emissions. Experimental results demonstrate that the proposed Deep Reinforcement Learning approach outperforms conventional control models. The system efficiency rises from 33.68% to 35.72%, an absolute improvement of 2.04 percentage points, and auxiliary power demand falls from 6.08% to 5.73%. More significantly, the optimized policy provides an expected 15–20% reduction in CO₂ emissions and lowers the heat rate from 14,000 kJ/kWh to 11,000–12,000 kJ/kWh. Training convergence is indicated by rising episode reward values and falling loss values. This work is an early step in applying PPO-based deep RL with a multi-objective reward design to real-time closed-loop fuel management, offering a scalable and adaptable alternative to rule-based and traditional supervised learning methods.
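The multi-objective reward described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' formulation: the weights, the baseline normalization, the `co2_index` input, and the names `RewardWeights` and `shaped_reward` are all assumptions introduced here for illustration. Only the four objectives (efficiency, heat rate, auxiliary power, CO₂) and the reference figures come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical trade-off weights; the abstract does not report them."""
    efficiency: float = 1.0
    heat_rate: float = 0.5
    auxiliary: float = 0.5
    co2: float = 0.5


def shaped_reward(efficiency_pct, heat_rate_kj_per_kwh, aux_power_pct, co2_index,
                  weights=RewardWeights(),
                  ref_efficiency_pct=33.68, ref_heat_rate=14000.0, ref_aux_pct=6.08):
    """Reward thermal efficiency; penalize heat rate, auxiliary power, and CO2.

    Each term is divided by a baseline value (taken from the abstract) so the
    four objectives are on a comparable scale. co2_index is assumed to be
    emissions relative to the baseline, with 1.0 meaning no change.
    """
    gain = weights.efficiency * (efficiency_pct / ref_efficiency_pct)
    penalty = (weights.heat_rate * heat_rate_kj_per_kwh / ref_heat_rate
               + weights.auxiliary * aux_power_pct / ref_aux_pct
               + weights.co2 * co2_index)
    return gain - penalty


# Baseline operating point versus one illustrative optimized point.
# 11,500 kJ/kWh and 0.85 are picked from inside the reported ranges.
baseline = shaped_reward(33.68, 14000.0, 6.08, 1.0)
optimized = shaped_reward(35.72, 11500.0, 5.73, 0.85)
print(f"baseline reward:  {baseline:.3f}")
print(f"optimized reward: {optimized:.3f}")
```

With these assumed weights, the optimized operating point scores higher than the baseline, which is the direction a shaped reward of this kind is meant to push the PPO agent. Different weights would change the trade-off between efficiency gains and emission penalties.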

Published

14-05-2026

How to Cite

1.
Zhu R, Liu Q, Li G, Hong X, Tian Z, Guo Y, et al. Deep Reinforcement Learning-Based Intelligent Control for Efficiency Enhancement in Thermal Power Plant Fuel Management. EAI Endorsed Trans Energy Web [Internet]. 2026 May 14 [cited 2026 May 15];13. Available from: https://publications.eai.eu/index.php/ew/article/view/11848