A Distributed and Secure Resource Allocation Method for Power Communication Networks Based on Policy Distillation

Authors

  • Yue Zhang Inner Mongolia Power Communication Company
  • Zongtao Li Inner Mongolia Power Communication Company
  • Si Chen Inner Mongolia Power (Group) Co., Ltd.
  • Guoqiang Hu Inner Mongolia Power (Group) Co., Ltd.
  • Pengcheng Li Inner Mongolia Power Communication Company
  • Ruimei Wu Inner Mongolia Power Communication Company

DOI:

https://doi.org/10.4108/eetsis.11995

Keywords:

Power Communication Networks, Reinforcement Learning, Policy Distillation, Distributed and Secure Resource Allocation

Abstract

INTRODUCTION: In next-generation smart grid communication architectures, achieving secure, dynamic, and fine-grained network resource allocation that guarantees differentiated QoS for diverse services has become a key challenge.

OBJECTIVES: This study therefore proposes a lightweight resource allocation method based on constrained policy distillation, addressing the difficulty of balancing lightweight deployment with strong security assurance in power communication networks.

METHODS: By integrating Graph Neural Networks (GNNs) and Bidirectional LSTM (Bi-LSTM), the model extracts three-dimensional features of topology, service, and resources to construct a 128-dimensional joint state representation. A multi-objective reward function is designed, and a double Q-network mitigates value overestimation while generating a high-fidelity decision trajectory library. Through service-constrained policy distillation, which combines a KL-divergence loss, a squared hard-constraint loss, and a soft-constraint L2 loss, the teacher model is compressed into a student model that is then compiled and deployed at the edge. Finally, a rule-engine layer dynamically adjusts priorities, intercepts critical violations, and safeguards the power system.

RESULTS: Experiments on real-world power grid datasets demonstrate that the model achieves superior resource efficiency, security, and edge effectiveness, effectively balancing lightweight deployment with strong security assurance in resource allocation for power communication networks.

CONCLUSION: The method thus enables distributed and secure resource allocation in power communication network environments, providing reliable QoS guarantees for new-type power systems.
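The abstract states that the teacher model uses a double Q-network to mitigate value overestimation. The core of that technique (following van Hasselt et al. [11]) is a decoupled target: the online network selects the next action and a separate target network evaluates it. The sketch below illustrates this target computation only; the function name, reward scale, and discount factor are illustrative assumptions, not details from the paper.

```python
import numpy as np

def double_q_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double Q-learning target.

    The online network picks the greedy next action, while the target
    network supplies its value estimate, which reduces the upward bias
    of evaluating an action with the same network that selected it.
    """
    if done:
        return float(reward)
    a_star = int(np.argmax(next_q_online))            # action selection (online net)
    return float(reward + gamma * next_q_target[a_star])  # action evaluation (target net)
```

For example, with `next_q_online = [1.0, 3.0, 2.0]` the online network selects action 1, and the target value becomes `reward + gamma * next_q_target[1]` rather than `reward + gamma * max(next_q_target)`.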
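The service-constrained distillation objective described above combines a KL-divergence term between teacher and student policies with a squared hard-constraint penalty and an L2 soft-constraint penalty. A minimal numerical sketch of such a combined loss is given below, assuming temperature-softened discrete action distributions and scalar/vector violation measures; the weights `lam_hard` and `lam_soft`, the temperature `tau`, and the exact form of the constraint terms are assumptions for illustration, not the paper's published formulation.

```python
import numpy as np

def softmax(logits, tau=1.0):
    """Temperature-softened softmax with numerical stabilization."""
    z = np.asarray(logits, dtype=float) / tau
    z -= z.max()                      # stabilize before exponentiation
    e = np.exp(z)
    return e / e.sum()

def distill_loss(teacher_logits, student_logits, hard_violation, soft_residual,
                 tau=2.0, lam_hard=10.0, lam_soft=0.1):
    """Combined distillation loss: KL + squared hard constraint + L2 soft constraint."""
    p = softmax(teacher_logits, tau)  # teacher action distribution
    q = softmax(student_logits, tau)  # student action distribution
    kl = float(np.sum(p * (np.log(p) - np.log(q))))      # KL(teacher || student)
    hard = float(np.maximum(hard_violation, 0.0) ** 2)   # penalize only positive violations
    soft = float(np.sum(np.square(soft_residual)))       # L2 penalty on soft residuals
    return kl + lam_hard * hard + lam_soft * soft
```

When the student matches the teacher exactly and no constraint is violated, the loss is zero; a hard-constraint violation is penalized quadratically, which dominates the objective for large violations.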

References

[1] He H, Meng X, Wang Y, et al. Deep reinforcement learning based energy management strategies for electrified vehicles: Recent advances and perspectives. Renewable and Sustainable Energy Reviews. 2024; 192: 114248.

[2] Zhao H, Sun W, Ni Y, et al. Deep deterministic policy gradient-based rate maximization for RIS-UAV-assisted vehicular communication networks. IEEE Transactions on Intelligent Transportation Systems. 2024; 25(11):15732-15744.

[3] Luo L, Yan X. Scheduling of stochastic distributed hybrid flow-shop by hybrid estimation of distribution algorithm and proximal policy optimization. Expert Systems with Applications. 2025; 271: 126523.

[4] Rahmani A M, Haider A, Moghaddasi K, et al. Self-learning adaptive power management scheme for energy-efficient IoT-MEC systems using soft actor-critic algorithm. Internet of Things, 2025; 31: 101587.

[5] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015.

[6] Vrahatis AG, Lazaros K, Kotsiantis S. Graph attention networks: a comprehensive review of methods and applications. Future Internet, 2024; 16(9): 318.

[7] Rathi M, Gomathy C. Smart agriculture resource allocation and energy optimization using bidirectional long short-term memory with ant colony optimization (Bi-LSTM–ACO). Frontiers in Communications and Networks, 2025; 6: 1587402.

[8] Zhang X, Peng M, Yan S, et al. Deep-reinforcement-learning-based mode selection and resource allocation for cellular V2X communications. IEEE Internet of Things Journal. 2019; 7(7): 6380-6391.

[9] Liu S, Yu G, Wen D, et al. Communication and energy efficient decentralized learning over D2D networks. IEEE Transactions on Wireless Communications. 2023; 22(12): 9549-9563.

[10] Ji M, Wu Q, Fan P, et al. Graph neural networks and deep reinforcement learning based resource allocation for V2X communications. IEEE Internet of Things Journal. 2024; 12(4): 3613-3628.

[11] Van Hasselt H, Guez A, Silver D. Deep reinforcement learning with double Q-learning. Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12-17 February 2016; pp. 2094-2100.

[12] Safavinejad R, Chang H, Liu L. Deep reinforcement learning for dynamic spectrum access: Convergence analysis and system design. IEEE Transactions on Wireless Communications. 2024; 23(12): 18888-18902.

[13] Iqbal A, Tham M L, Chang Y C. Double deep Q-network-based energy-efficient resource allocation in cloud radio access network. IEEE Access. 2021; 9: 20440-20449.

[14] Rashid HU, Jeong SH. Resource allocation in multi-cell networks: A deep reinforcement learning approach. 2023 International Conference on Information and Communication Technology Convergence. IEEE, Jeju Island, Korea, 11-13 October 2023; pp. 793-795.

[15] Hussain F, Hassan S A, Hussain R, et al. Machine learning for resource management in cellular and IoT networks: Potentials, current solutions, and open challenges. IEEE Communications Surveys & Tutorials. 2020; 22(2): 1251-1275.

[16] Ayepah-Mensah D, Sun G, Owusu Boateng G, et al. Federated Policy Distillation for Digital Twin-Enabled Intelligent Resource Trading in 5G Network Slicing. IEEE Transactions on Network and Service Management. 2025; 22(1): 361-379.

[18] Ma L, Cheng N, Wang X, et al. Distilling knowledge from resource management algorithms to neural networks: A unified training assistance approach. 2023 IEEE 98th Vehicular Technology Conference (VTC2023-Fall), IEEE, Hong Kong, 10-13 October 2023; pp. 1-5.

[19] Zhang Q, Wang J, Shen Y, et al. Privilege-guided knowledge distillation for edge deployment in excavator activity recognition. Automation in Construction, 2024; 166: 105688.

[20] Chen Y, Wang Z, Cai H, et al. Federated Knowledge Distillation using Hierarchical Reinforcement Learning in Resource-Constrained IoT Edge-Cloud Computing Environments. IEEE Transactions on Mobile Computing, 2025; 7:1-15.

[21] Mao T, Zhu J, Zhang M, et al. A Decentralized Actor–Critic Algorithm With Entropy Regularization and Its Finite-Time Analysis. IEEE Transactions on Neural Networks and Learning Systems. 2025; 36(10):19423-19436.

[22] Ihle F, Menth M. MPLS Network Actions: Technological overview and P4-based implementation on a high-speed switching ASIC. IEEE Open Journal of the Communications Society. 2025; 6:3480-3501.

Published

16-03-2026

How to Cite

Zhang Y, Li Z, Chen S, Hu G, Li P, Wu R. A Distributed and Secure Resource Allocation Method for Power Communication Networks Based on Policy Distillation. EAI Endorsed Scal Inf Syst [Internet]. 2026 Mar. 16 [cited 2026 Mar. 18];12(8). Available from: https://publications.eai.eu/index.php/sis/article/view/11995
