Deep Reinforcement Learning for Intelligent Reflecting Surface-assisted D2D Communications
DOI: https://doi.org/10.4108/eetinis.v10i1.2864
Keywords: intelligent reflecting surface (IRS), D2D communications, deep reinforcement learning
Abstract
In this paper, we propose a deep reinforcement learning (DRL) approach for solving the network sum-rate optimisation problem in device-to-device (D2D) communications supported by an intelligent reflecting surface (IRS). The IRS is deployed to mitigate interference and enhance the signal between the D2D transmitter and the associated D2D receiver. Our objective is to jointly optimise the transmit power at the D2D transmitter and the phase-shift matrix at the IRS to maximise the network sum-rate. We formulate the problem as a Markov decision process and then propose proximal policy optimisation (PPO) to solve the resulting maximisation problem. Simulation results show strong performance in terms of the achievable rate and processing time.
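To make the formulation concrete, the sketch below encodes a toy version of such a Markov decision process in Python: the state is the current channel realisation, the action bundles the D2D transmit powers with the IRS phase shifts, and the reward is the resulting network sum-rate. The channel model (i.i.d. Rayleigh block fading), the number of interfering D2D pairs, the class name IrsD2DEnv and all parameter values are illustrative assumptions and are not taken from the paper; the authors' PPO agent would be trained on top of an environment of this kind.

```python
import numpy as np

class IrsD2DEnv:
    """Toy IRS-assisted D2D environment (illustrative only, not the paper's exact model).

    State  : current channel realisations (flattened real/imaginary parts).
    Action : [per-pair transmit power in (0, p_max]] + [N IRS phase shifts in [0, 2*pi)].
    Reward : instantaneous network sum-rate over all D2D pairs.
    """

    def __init__(self, num_pairs=3, num_elements=16, p_max=1.0, noise_power=1e-3):
        self.K = num_pairs        # D2D transmitter/receiver pairs (assumption)
        self.N = num_elements     # reflecting elements at the IRS
        self.p_max = p_max
        self.noise = noise_power
        self.reset()

    def reset(self):
        # Rayleigh-fading direct, Tx-to-IRS and IRS-to-Rx channels (assumed model).
        self.h_d = (np.random.randn(self.K, self.K) + 1j * np.random.randn(self.K, self.K)) / np.sqrt(2)
        self.h_ti = (np.random.randn(self.K, self.N) + 1j * np.random.randn(self.K, self.N)) / np.sqrt(2)
        self.h_ir = (np.random.randn(self.N, self.K) + 1j * np.random.randn(self.N, self.K)) / np.sqrt(2)
        return self._state()

    def _state(self):
        h = np.concatenate([self.h_d.ravel(), self.h_ti.ravel(), self.h_ir.ravel()])
        return np.concatenate([h.real, h.imag]).astype(np.float32)

    def step(self, action):
        p = np.clip(action[: self.K], 1e-6, self.p_max)   # transmit powers
        theta = np.mod(action[self.K:], 2 * np.pi)         # IRS phase shifts
        phi = np.diag(np.exp(1j * theta))                   # diagonal phase-shift matrix
        # Effective channel from transmitter j to receiver k: direct path + reflected path.
        h_eff = self.h_d + self.h_ti @ phi @ self.h_ir
        gains = np.abs(h_eff) ** 2
        total_rx = p @ gains                                 # total received power at each receiver
        signal = p * np.diag(gains)                          # power from each pair's own transmitter
        sinr = signal / (total_rx - signal + self.noise)
        reward = float(np.log2(1.0 + sinr).sum())            # network sum-rate (bit/s/Hz)
        next_state = self.reset()                            # i.i.d. block fading between steps (assumption)
        return next_state, reward, False, {}

# Quick smoke test with random actions; a trained PPO policy would replace these.
env = IrsD2DEnv()
state = env.reset()
for _ in range(5):
    action = np.concatenate([np.random.uniform(0, env.p_max, env.K),
                             np.random.uniform(0, 2 * np.pi, env.N)])
    state, reward, done, _ = env.step(action)
    print(f"sum-rate: {reward:.3f} bit/s/Hz")
```

The random actions in the smoke test stand in for a learned policy; the PPO clipped-surrogate update itself can be taken from any standard implementation and trained against this environment's state, action and reward interface.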
License
Copyright (c) 2023 EAI Endorsed Transactions on Industrial Networks and Intelligent Systems
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
This is an open-access article distributed under the terms of the Creative Commons Attribution CC BY 3.0 license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.