Deep Reinforcement Learning for Intelligent Reflecting Surface-assisted D2D Communications

Khoi Khac Nguyen; Antonino Masaracchia; Cheng Yin

doi:10.4108/eetinis.v10i1.2864

Authors

Khoi Khac Nguyen Queen's University Belfast
Antonino Masaracchia Queen's University Belfast https://orcid.org/0000-0002-2299-8487
Cheng Yin University of Surrey

Keywords:

Intelligent reflecting surface (IRS), D2D communications, deep reinforcement learning

Abstract

In this paper, we propose a deep reinforcement learning (DRL) approach to the network sum-rate optimisation problem in device-to-device (D2D) communications supported by an intelligent reflecting surface (IRS). The IRS is deployed to mitigate interference and enhance the signal between the D2D transmitter and the associated D2D receiver. Our objective is to jointly optimise the transmit power at the D2D transmitter and the phase shift matrix at the IRS to maximise the network sum-rate. We formulate the problem as a Markov decision process and then use proximal policy optimisation (PPO) to solve the maximisation problem. Simulation results show that the proposed approach performs well in terms of both achievable rate and processing time.
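
To make the optimisation objective concrete, the following is a minimal sketch of how a network sum-rate could be evaluated for a given choice of transmit powers and IRS phase shifts. The abstract does not state the system model, so everything here is an assumption made for illustration: N single-antenna D2D pairs sharing one band, a K-element IRS with unit-modulus reflection coefficients, an effective channel equal to the direct path plus the reflected path, and a Shannon-rate SINR expression. The function name `sum_rate` and all argument names are hypothetical.

```python
import numpy as np


def sum_rate(p, theta, direct, tx_to_irs, irs_to_rx, noise_power=1e-3):
    """Sum-rate in bit/s/Hz under an ASSUMED IRS-assisted D2D model.

    p          : (N,) transmit power of each D2D transmitter
    theta      : (K,) IRS phase shifts in radians
    direct     : (N, N) complex, direct[i, j] = transmitter j -> receiver i
    tx_to_irs  : (K, N) complex, column j = transmitter j -> IRS elements
    irs_to_rx  : (N, K) complex, row i = IRS elements -> receiver i
    """
    # Unit-modulus reflection coefficients (diagonal of the phase shift matrix).
    phase = np.exp(1j * np.asarray(theta))
    # effective[i, j] = direct[i, j] + sum_k irs_to_rx[i, k] * phase[k] * tx_to_irs[k, j]
    effective = direct + (irs_to_rx * phase) @ tx_to_irs
    # received[i, j] = power received at receiver i from transmitter j.
    received = np.abs(effective) ** 2 * np.asarray(p)
    signal = np.diag(received)
    interference = received.sum(axis=1) - signal
    sinr = signal / (interference + noise_power)
    return float(np.sum(np.log2(1.0 + sinr)))


if __name__ == "__main__":
    # Illustrative random channels only; these are not the paper's settings.
    rng = np.random.default_rng(0)
    n_pairs, n_elements = 3, 8
    shape_c = lambda *s: rng.normal(size=s) + 1j * rng.normal(size=s)
    rate = sum_rate(
        p=np.full(n_pairs, 0.5),
        theta=rng.uniform(0.0, 2.0 * np.pi, size=n_elements),
        direct=shape_c(n_pairs, n_pairs),
        tx_to_irs=shape_c(n_elements, n_pairs),
        irs_to_rx=shape_c(n_pairs, n_elements),
    )
    print(f"sum-rate: {rate:.3f} bit/s/Hz")
```

In a DRL formulation of this kind, a function like this would typically serve as the reward, with the powers `p` and phase shifts `theta` forming the continuous action chosen by the PPO policy. That mapping is an illustrative reading of the abstract, not a description of the authors' exact design.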
