Distributed Joint Channel Assignment and Power Control for Sum Rate Maximization of D2D-Enabled Massive MIMO System

Device-to-device (D2D) communication underlaying massive multiple-input multiple-output (MIMO) systems has been recognized as a promising candidate technology for meeting the challenging fifth-generation (5G) network requirements. This integration enhances network throughput, improves spectral efficiency, and offloads traffic from base stations. However, the co-tier and cross-tier interference between cellular and D2D communications caused by resource sharing is a significant challenge, especially when dense D2D users exist in underlay mode. In this paper, we jointly optimize the channel assignment and power allocation to maximize the sum data rate while satisfying the interference constraints of cellular links. Due to the lack of network-wide information in large-scale networks, resource management and interference coordination are hard to implement in a centralized way. Therefore, we propose a three-stage stable and distributed resource allocation and interference management scheme that is based on local information and requires little coordination and communication between devices. We model the channel allocation optimization problem in the first stage as a many-to-one matching game. In the second stage, the algorithm adopts a cost charging policy to solve each user's power control problem as a non-cooperative game. In the third stage, the algorithm searches for swap blocking pairs until a stable matching exists. We show that the proposed algorithm converges to a stable matching and terminates after a finite number of iterations. Simulation results show that the proposed algorithm achieves more than 86% of the average transmission rate of the optimal matching with lower complexity.


Introduction
The explosive increase in demand for mobile broadband services and the desire to support a broad range of Internet of Things (IoT) applications needing faster, more reliable and higher-capacity networks drive the wireless industry to develop a new fifth-generation (5G) network architecture [1,7]. Future cellular networks can support 1000 times higher mobile data volume and 5 times assignment and transmission power control for D2D-assisted massive MIMO system.

Motivation
As a single technology cannot achieve the diverse set of 5G requirements, it is essential to investigate the effect of integrating multiple technologies in one system. Early work on D2D communications has focused on single-antenna systems. However, moving towards multi-antenna systems is unavoidable, due to the large diversity and multiplexing gains of massive MIMO systems. The potential benefits of these technologies are known individually, but the challenges and benefits of their coexistence require further study. When D2D is underlaid in massive MIMO cellular networks, a complicated interference management problem arises due to the following two key factors:
• Unlike a traditional BS, a massive MIMO BS allows a large number of cellular transmissions over shared channels, which makes cellular-to-D2D interference much higher.
• The D2D users are also expected to be dense enough to offload the cellular traffic efficiently. Therefore, the D2D-to-cellular and D2D-to-D2D interference will be more severe than before.
Currently, interference management and resource optimization for D2D underlaid massive MIMO networks remain an open problem. Previous resource allocation schemes proposed for D2D underlaid massive MIMO systems generally do not jointly optimize the channel assignment and power control to alleviate interference. As opposed to previous work, this paper proposes a stable and distributed joint channel assignment and power control scheme employing the concept of matching game theory to:
• Mitigate interference and maximize the throughput gains of these different solutions when they coexist and share network resources.
• Study the effect of the additional degrees of freedom provided by the massive-antenna BS on the throughput when D2D communications are integrated with massive MIMO systems.

Contribution
In this work, our objective is to optimize the average sum rate of a D2D underlaid massive MIMO cellular network while protecting the cellular users. For practicality and network scalability, we propose a three-stage stable and distributed resource optimization scheme in which the D2D users and the BS interact and make resource allocation decisions based on locally available information. Different from existing works, this paper addresses the following challenges specific to D2D-enabled massive MIMO systems:
• First, we formulate the joint resource allocation problem of optimizing the average sum rate, subject to the interference constraints for cellular communications, as a nonlinear optimization problem, which is NP-hard.
• Second, to obtain a distributed and stable solution, we decompose the original problem into three cascaded subproblems: channel assignment, power control, and swapping. We then propose a three-stage distributed algorithm. In the first stage, we model the channel allocation optimization problem as a many-to-one matching game. Each D2D pair proposes to its preferred channel according to its utility value, and the BS accepts the most preferred requests. In the second stage, we model the power allocation optimization problem as a non-cooperative game. Each D2D pair optimizes its utility value according to its communication channel gain and interference channel gain to limit the D2D-to-cellular interference. Finally, in the third stage, the algorithm considers the peer effect, which results from the mutual interference among D2D pairs sharing the same resource, and searches for blocking pairs until a stable matching is established. The proposed algorithm can be implemented via distributed decisions at each device based on locally available information. To the best of our knowledge, no previous work has presented a distributed matching game-based joint channel-power allocation optimization to enhance throughput when D2D is underlaid in a massive MIMO system. In addition, we prove the stability of the proposed matching-based resource optimization algorithm.
• Finally, through extensive simulations, we characterize the throughput performance of our algorithm. Simulation results show that the proposed algorithm is efficient with limited complexity, and that the throughput loss compared to the optimal method is small.

Main Assumptions Considered
• Each cellular channel is shared by multiple cellular users (CUs) and D2D pairs, and only one resource block is assigned to each D2D pair at a time.
• All D2D pairs and cellular users transmit simultaneously, and their mode of communication (i.e., D2D mode or cellular mode) has already been determined.
• Perfect channel state information (CSI) at the transceiver: since the aim of this work is to analyze and propose a distributed joint channel assignment and power control scheme to manage interference and enhance throughput in D2D-enabled massive MIMO communication, perfect CSI is assumed at the transceiver for simplicity.

Related Work
The main challenge in integrating D2D into a cellular network is dealing with the co-channel interference between D2D and cellular users caused by spectrum sharing. Extensive research efforts have been spent on solving the problem through efficient interference management, power control, and resource allocation [16-30, 35-41, 41, 43-45]. Because of their promising potential to attain challenging 5G requirements, the coexistence of D2D and massive MIMO has received wide research attention in the last few years. The benefits and challenges of the coexistence of massive MIMO and D2D have been studied in uplink [17-22] and downlink transmissions [16, 23-30, 35, 36]. The authors in [17] proposed a scheme to optimize the channel assignment, power control and precoding for a D2D underlaid massive MIMO network to maximize the sum rate of both cellular and D2D users. However, only one cellular user and one D2D pair were allowed to share each cellular channel, which can limit the multiplexing gain of the system. Enabling multiple cellular users and D2D pairs to share the same channel can enhance the spectrum efficiency of the massive MIMO system. The authors in [18] employed partial zero-forcing at the receivers to deal with the tradeoff between cellular and D2D spectral efficiencies under perfect and imperfect channel state information (CSI). The authors state that under perfect CSI, zero-forcing at the receivers can completely overcome the loss in cellular spectral efficiency due to the D2D underlay. However, that paper does not consider any resource management scheme to alleviate interference. In [19], the authors proposed a power control and pilot allocation scheme for a D2D-enabled multi-cell massive MIMO system to optimize spectral efficiency. The work provided a detailed analysis of the joint optimization of the data and pilot transmission power. The authors proposed a successive approximation-based algorithm to solve the resulting data and pilot transmission power problem.
In [20], the authors proposed a D2D mode selection scheme based on user position to reduce pilot contamination and optimize system SE when D2D is underlaid in a massive MIMO system. In [21], the authors proposed an open-loop transmission power control solution to alleviate the cellular-to-D2D and D2D-to-cellular interference. An analytical approach is used for the proposed power control paradigm to evaluate spectral and energy efficiency. The authors in [16, 23-25, 35] studied the tradeoff between achievable data rate and energy efficiency for D2D-enabled downlink massive MIMO systems employing a stochastic geometry-based analytical framework. However, these works did not consider any resource allocation scheme for interference management, and they showed that the density of D2D users limits the benefits of the coexistence of D2D and massive MIMO.
In [16], the authors investigated the tradeoff between average sum rate and energy efficiency when D2D and massive MIMO communications coexist. The authors derived exact analytical expressions for the average sum rate and energy efficiency. They stated that the coexistence of underlay D2D communications and massive MIMO is mainly beneficial at low D2D user density. The work in [23] investigated the optimum number of cellular users that could be switched from cellular mode to D2D mode to maximize the total system throughput. The authors pointed out that there is an optimal number of users switched to D2D mode that maximizes the total capacity, and that it strongly depends on network parameters such as the number of BS antennas, the D2D link distance and the transmission power of the BS, cellular and D2D users. The authors in [25] presented an analytical framework based on stochastic geometry for D2D underlaying a multi-cell massive MIMO communication system. Utilizing a linear precoding scheme for cellular downlink transmission, the impact of RF mismatches and the achievable cellular rate are analytically derived. However, these works did not consider any resource allocation strategy for interference control, and they revealed that the density of D2D users limits the benefits of the coexistence of D2D and massive MIMO.
The work in [24] considered a scenario in which a massive MIMO system is underlaid with multi-antenna D2D users. The authors employed beamforming techniques at the D2D transmitter and at the massive MIMO BS to reduce D2D-to-cellular and cellular-to-D2D interference. The authors in [26] proposed a reinforcement learning-based rate adaptation algorithm for D2D communications underlaying downlink massive MIMO networks. They derived the asymptotic SINR for both cellular and D2D links and showed that the interfering channel distributions can be approximated by Gaussian random variables. The work in [27] developed a heuristic pilot optimization and pilot allocation strategy to optimize SE and EE when D2D and massive MIMO coexist. The authors stated that pilot allocation with an optimal pilot length could significantly improve energy and spectral efficiency compared with an entirely reused or orthogonal pilot scheduling strategy. The authors in [35] investigated D2D-enabled user cooperation in a frequency division duplexing massive MIMO system using cascaded precoding techniques to reduce the channel estimation overhead. The authors in [36] proposed BS precoding and D2D power allocation techniques for a downlink D2D-enabled single-cell massive MIMO network to enhance the achievable data rates of D2D pairs.
Most of the existing resource optimization and interference management schemes proposed for D2D-underlaid massive MIMO systems lack a distributed joint channel assignment and power control approach to manage both co-tier and cross-tier interference. A considerable number of works on D2D communication have concentrated on conventional single-antenna BSs and consider various channel assignment and power control strategies to handle interference and optimize network performance [39-45]. However, due to the large diversity and multiplexing gains of massive MIMO, moving toward a multi-antenna BS is expected and is the focus of recent research activities. Moreover, as both D2D and massive MIMO permit multiple transmissions over a shared channel, the mutual interference between cellular and D2D communications becomes severe. Without effective resource coordination and interference management, spectrum efficiency can degrade. Therefore, it is non-trivial to design efficient channel allocation and power control strategies that maintain substantial performance gains when D2D and massive MIMO coexist in the emerging cellular network. To the best of our knowledge, algorithms for efficient distributed channel assignment and non-binary power allocation for D2D underlaying massive MIMO systems have not been discussed before, and this work is intended to fill that gap. This work considers a single-cell scenario and does not consider the impact of user mobility or inter-cell interference. In Table 2 we provide a brief comparison between existing works and ours.
The remainder of this paper is organized as follows. Section 3 presents the proposed heterogeneous multi-layer system model for the D2D underlaid massive MIMO enabled network and formulates the transmission rate optimization problem. Section 4 presents the proposed distributed resource optimization algorithm. Numerical evaluations are provided in Section 5, and conclusions are drawn in Section 6.

System Model and Problem Definition
Throughout the paper, the following notation is used. Matrices are represented by boldface capital letters and vectors by boldface lowercase letters. The superscript $(\cdot)^H$ stands for the conjugate transpose. $h_{i,j}$ stands for the entry in the $i$-th row and $j$-th column of the matrix. The notation is summarized in Table 3.

System Model
We consider the massive MIMO system with Device   ..,f |F| }. Each resource $f_n \in F$ serves $K$ single-antenna cellular users and $q \le |D|$ D2D pairs sharing the same resource with the cellular users. The set of D2D pairs sharing the channel $f_n$ is denoted by $D_{f_n} \subset D$, and $D_{f_n} \cap D_{f_{n'}} = \emptyset$ when $n \neq n'$. To reuse the cellular channels, the D2D UEs first need to send their requests to the BS. It is assumed that at any time each channel can be shared among $K$ CUEs and $q$ D2D pairs. Thus, the BS assigns one channel to each D2D pair request while guaranteeing the quality of service (QoS) requirements of cellular communication. Hence, a D2D pair is assigned resources only when the interference it causes to cellular communications is below a specific threshold.
Let $\mathbf{H} \in \mathbb{C}^{M \times K}$ be the channel matrix between the $K$ cellular users and the BS antenna array, where the $i$-th column of $\mathbf{H}$, denoted by $\mathbf{h}_i^c$, represents the $M \times 1$ channel vector between the $i$-th cellular user and the BS. Let $\mathbf{G} \in \mathbb{C}^{M \times D}$ represent the interference channel matrix between the D2D pairs and the BS, where the $j$-th column of $\mathbf{G}$, denoted by $\mathbf{g}_j^d$, represents the $M \times 1$ channel vector between the $j$-th D2D pair and the BS. $p_i^c$ and $p_j^d$ are the transmission powers of the $i$-th cellular user and the $j$-th D2D user. Uplink transmission is the scenario in which the $K$ users transmit signals to the BS. Let $x_i^c$ be the signal transmitted from the $i$-th cellular user on resource block $f_n$. Since $K$ CUEs share the same time-frequency resource, the $M \times 1$ received signal vector over channel block $f_n$ at the BS is the combination of all signals transmitted from the $K$ cellular users plus the signals transmitted from the D2D pairs sharing the same resource, and can be given by:
where $\mathbf{p}^c$ and $\mathbf{p}^d$ are the transmission power vectors of the cellular and D2D users.
The corresponding signal received at the $j$-th D2D pair over cellular channel $f_n$ would be:
The channel gain of the $j$-th D2D pair is $g_{d_j d_j}$, whereas $g_{d_j d_{j'}}$ is the gain between different D2D pairs sharing the same cellular channel. $h_{c_i d_j}$ is the channel gain between the $i$-th cellular user and the $j$-th D2D receiver, and $n$ is noise with zero mean and unit variance. With linear uplink decoding schemes at the BS, the received signal $\mathbf{y}_c^{f_n}$ is decomposed into $K$ streams by multiplying it with an $M \times K$ linear detection matrix $\mathbf{A}$:
Each stream is then decoded independently. The $i$-th element of $\hat{\mathbf{y}}_c^{f_n}$, which is used to decode $x_i^c$, is given by:
where $\mathbf{a}_i^c$ denotes the $i$-th column of $\mathbf{A}$. Hence, the received signal-to-interference-plus-noise ratio (SINR) of the $i$-th element over channel $f_n$ is given by:
The transmission rates of the $i$-th cellular user and the $j$-th D2D pair over channel resource $f_n$ can be given, respectively, by:
Then the average sum rate (ASR), obtained from the total data rate of both D2D and cellular users, is given by:
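As a hedged sketch of the signal model described in this subsection, the following expressions are one standard way to write it that is consistent with the definitions above. They are assumptions, not the paper's numbered equations: $x_j^d$ (the D2D transmit symbol) and the noise vector $\mathbf{n}$ are symbols introduced here, the scalar gains $g_{d_j d_j}$, $g_{d_{j'} d_j}$ and $h_{c_i d_j}$ are treated as power gains, and the noise is taken to have unit variance as stated in the text.

```latex
% BS received signal on f_n: cellular signals + co-channel D2D signals + noise
\mathbf{y}_c^{f_n} = \sum_{i=1}^{K} \sqrt{p_i^c}\,\mathbf{h}_i^c x_i^c
  + \sum_{d_j \in D_{f_n}} \sqrt{p_j^d}\,\mathbf{g}_j^d x_j^d + \mathbf{n}

% Linear detection with the M x K matrix A
\hat{\mathbf{y}}_c^{f_n} = \mathbf{A}^H \mathbf{y}_c^{f_n}

% SINR of the i-th cellular stream (unit noise variance)
\mathrm{SINR}_{c_i}^{f_n} =
  \frac{p_i^c \,|(\mathbf{a}_i^c)^H \mathbf{h}_i^c|^2}
       {\sum_{i' \neq i} p_{i'}^c \,|(\mathbf{a}_i^c)^H \mathbf{h}_{i'}^c|^2
        + \sum_{d_j \in D_{f_n}} p_j^d \,|(\mathbf{a}_i^c)^H \mathbf{g}_j^d|^2
        + \|\mathbf{a}_i^c\|^2}

% SINR of the j-th D2D pair (cellular-to-D2D and D2D-to-D2D interference)
\mathrm{SINR}_{d_j}^{f_n} =
  \frac{p_j^d \, g_{d_j d_j}}
       {\sum_{i=1}^{K} p_i^c \, h_{c_i d_j}
        + \sum_{d_{j'} \in D_{f_n},\, j' \neq j} p_{j'}^d \, g_{d_{j'} d_j} + 1}

% Per-link rate
R = \log_2\!\left(1 + \mathrm{SINR}\right)
```

Under these assumptions, the sum rate on a channel is the sum of $\log_2(1 + \mathrm{SINR})$ over the $K$ cellular streams and the D2D pairs in $D_{f_n}$.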

Problem Definition
Here, we jointly optimize the channel assignment and power control to maximize the average sum data rate while guaranteeing the interference requirements of the CUEs. Mathematically, the resource optimization problem of maximizing the transmission rate can be formulated as:
where $\mathbf{p}$ and $\mathbf{X}$ are the transmission power vector and the channel assignment matrix, respectively. The $n$-th column of $\mathbf{X}$ represents the D2D pairs sharing resource block $f_n$. Constraint (13b) ensures the protection of the cellular users by keeping the total interference from the D2D pairs below a predefined threshold. Each resource block can be shared by at most $q$ D2D pairs, and each D2D pair can be assigned to one resource block, as represented by constraints (13c) and (13d). $P_m$ is the maximum transmission power, as denoted by constraint (13e). Problem (13) is a nonlinear optimization problem that requires exponential computational effort to obtain the optimal solution through exhaustive search. Due to the lack of global information in large-scale networks, resource allocation is hard to implement in a centralized way. To attain a stable and distributed solution, we model P1 as a many-to-one matching and have each D2D pair solve it in a distributed manner. In this regard, this paper develops a matching theory-based distributed channel-power allocation scheme using local information only.
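The constraints described above can be illustrated with a small feasibility check. This is a minimal sketch under stated assumptions: the function name `is_feasible`, the argument layout, and the scalar effective interference gain `g_bs[j][n]` from D2D pair j to the BS on channel n are all hypothetical, and the interference threshold is applied per channel.

```python
def is_feasible(X, p, g_bs, q, i_max, p_max):
    """Check a candidate allocation against the constraints described in the text.

    X[j][n]    -- 1 if D2D pair j is assigned to channel n, else 0
    p[j]       -- transmit power of D2D pair j
    g_bs[j][n] -- assumed effective interference gain from pair j to the BS on channel n
    q          -- maximum number of D2D pairs per channel
    i_max      -- interference threshold protecting the cellular users
    p_max      -- maximum D2D transmit power
    """
    num_pairs, num_channels = len(X), len(X[0])
    for n in range(num_channels):
        users = [j for j in range(num_pairs) if X[j][n] == 1]
        # Quota: at most q D2D pairs share a channel.
        if len(users) > q:
            return False
        # Interference: total D2D-to-cellular interference stays below i_max.
        if sum(p[j] * g_bs[j][n] for j in users) > i_max:
            return False
    for j in range(num_pairs):
        # Each D2D pair occupies at most one channel.
        if sum(X[j]) > 1:
            return False
        # Power: 0 <= p_j <= p_max.
        if not 0.0 <= p[j] <= p_max:
            return False
    return True
```

An exhaustive search would have to run such a check over every assignment matrix, which is what makes the centralized approach expensive.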

Resource Allocation Algorithm
Here, we will define and formulate the proposed resource allocation optimization algorithm. We first introduce some basic concepts of the matching theory. Then, we develop a three-step iterative matching algorithm to effectively pair D2D users and cellular resources, which are critical for the introduced many-to-one matching model. Finally, we present theoretical analysis of the properties of the proposed algorithm, which involves convergence, stability, and computational complexity.

Matching Game Definition
A matching game is a powerful mathematical tool for studying the formation of mutually beneficial relations among distinct players based on their preferences [45-48]. It divides the players into two distinct groups, and each player ranks the members of the other group in order of preference. The preference for one player over another is derived from locally available information. In our context, one side of the matching is the set of D2D pairs $D$, and the other side is the set of resource blocks $F$. In many-to-one matching, each frequency channel is allowed to pair with more than one D2D user from the opposite side, while each D2D user is allowed to match with at most one resource block. This matching is defined as follows:
Definition1. Consider two disjoint and finite sets of players, $D = \{d_j\}_{j=1}^{|D|}$ and $F = \{f_i\}_{i=1}^{|F|}$. A many-to-one matching $\mu$ is defined by a function from the set $D \cup F$ into the set of elements of $D \cup F$ such that:
1) $|\mu(d_j)| \le 1$ and $\mu(d_j) \in F$, $\forall d_j \in D$
2) $|\mu(f_n)| \le q$ and $\mu(f_n) \in D$, $\forall f_n \in F$
3) $\mu(d_j) = f_n$ if and only if $d_j$ is in $\mu(f_n)$
where $q$ is the number of D2D pairs allowed to match with each channel. The first two properties indicate that the matching is a many-to-one relation. The third condition states that if a D2D pair $d_j$ is matched with channel $f_n$, then channel $f_n$ is also matched with D2D pair $d_j$.
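The three conditions of the definition can be checked mechanically. The sketch below is illustrative only; the function name `is_valid_matching` and the dictionary representation of the matching are assumptions introduced here.

```python
def is_valid_matching(mu_d, mu_f, channels, q):
    """Check the three many-to-one matching conditions.

    mu_d     -- dict: D2D pair -> matched channel, or None if unmatched
    mu_f     -- dict: channel -> set of D2D pairs matched to it
    channels -- set of all channels F
    q        -- quota of D2D pairs per channel
    """
    # Condition 1: each D2D pair holds at most one channel, and it lies in F.
    for d, f in mu_d.items():
        if f is not None and f not in channels:
            return False
    # Condition 2: each channel holds at most q D2D pairs, all drawn from D.
    for f, pairs in mu_f.items():
        if f not in channels or len(pairs) > q:
            return False
        if any(d not in mu_d for d in pairs):
            return False
    # Condition 3: mu(d) = f if and only if d is in mu(f).
    for d, f in mu_d.items():
        for ch in channels:
            if (f == ch) != (d in mu_f.get(ch, set())):
                return False
    return True
```

For example, two pairs on one channel with `q = 2` pass the check, while the same matching with `q = 1` violates the quota.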
Utility Function. In matching theory, utility is a function that quantifies a player's performance in relation to others. A player's choice over another is defined by its utility function.
Preference Profiles. In the matching game, each player builds a ranking list about the other side's players by using its utility function. This ranking list is named preference profile and indicates each player's performance on the opposite side based on its objectives and locally available information.

Joint Channel Assignment and Power Allocation Algorithm
This section proposes a distributed matching algorithm to solve the channel assignment and power allocation problem P1. In the proposed algorithm, the BS shares cellular resources with D2D pairs to improve the data rate while minimizing the interference incurred by the existing cellular communications.
Because of the mutual interference between co-channel D2D pairs, the preference of each D2D pair $d_j$ is also affected by the choices of other D2D pairs. This kind of matching is called matching with externalities or peer effect. The proposed matching algorithm considers the peer effect among players and is called many-to-one matching with peer effect. Each D2D pair seeks to maximize its own transmission rate. The details are given in Algorithm1 and explained below.
The proposed algorithm has three stages: channel assignment, power allocation, and swapping.
Channel assignment. We address channel assignment by solving the following optimization problem: subject to: The interference and transmission power constraints are handled during the power control stage. To solve problem P2, which is a combinatorial optimization problem, we model it as a many-to-one matching game, which is suitable for solving distributed assignment problems with locally available information, as explained below. During channel assignment, each D2D pair and each resource block $f_n$ build their preference lists $P_{d_j}$ and $P_{f_n}$. During this phase, a D2D pair does not take into account which channels the other D2D pairs will match with. Hence, the preference value of D2D pair $d_j$ for resource $f_n$ depends only on the cellular-to-D2D interference and can be given by the following utility function, the channel-to-interference-plus-noise ratio (CINR).
Based on the preference value, each D2D pair ranks the available channels in decreasing order in its preference profile, represented by $P_{d_j}$. According to (15), a channel $f_n \in F$ that produces a higher utility value, i.e. CINR, is preferred by a D2D pair $d_j$ over a channel $f_{n'}$, i.e., $f_n \succ_{d_j} f_{n'}$, and is placed higher in its preference list.
Similarly, for each resource block $f_n$, the BS builds a preference list of the D2D pairs according to the interference channel gain between the D2D transmitter and the BS, i.e. $(\mathbf{H}^H \mathbf{H})^{-1} \mathbf{H}^H \mathbf{G}$. Accordingly, each resource block $f_n$ assigns a lower utility to a D2D pair that creates higher interference.
Then, according to its preference profile $P_{d_j}$, each unassigned D2D pair $d_j$ proposes to its most preferred channel $f_n$ that has not rejected it before. Each resource block $f_n$ accepts its $q$ most preferred D2D pairs and rejects the rest. The channel assignment phase terminates when the channels have accepted all D2D pairs that do not violate the optimization constraint.
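The propose and accept/reject procedure of the channel assignment stage can be sketched as a many-to-one deferred acceptance loop. This is a minimal illustration, not the paper's implementation: the function names, the dictionary inputs, and the use of a scalar interference score per (channel, pair) are assumptions, and the cellular interference threshold is not enforced here.

```python
def build_d2d_prefs(cinr):
    """Rank channels for each D2D pair by decreasing utility (CINR).

    cinr -- dict: pair -> dict: channel -> utility value
    """
    return {d: sorted(vals, key=vals.get, reverse=True) for d, vals in cinr.items()}


def deferred_acceptance(d2d_prefs, interference, q):
    """Many-to-one deferred acceptance with quota q.

    d2d_prefs    -- dict: pair -> list of channels, most preferred first
    interference -- dict: channel -> dict: pair -> interference score at the BS
                    (lower score means the channel prefers that pair)
    Returns dict: channel -> list of accepted pairs, most preferred first.
    """
    remaining = {d: list(prefs) for d, prefs in d2d_prefs.items()}
    matched = {f: [] for f in interference}
    unmatched = list(d2d_prefs)
    while unmatched:
        d = unmatched.pop(0)
        if not remaining[d]:
            continue  # d has been rejected by every channel on its list
        f = remaining[d].pop(0)  # propose to the best channel not yet tried
        matched[f].append(d)
        matched[f].sort(key=lambda x: interference[f][x])
        if len(matched[f]) > q:
            # The channel keeps its q best proposals and rejects the worst one.
            unmatched.append(matched[f].pop())
    return matched
```

Each pair proposes to a given channel at most once, so the loop performs at most |D| x |F| proposals before it stops.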
Power allocation. After performing the resource assignment in stage 1, we have the channel assignment matrix $\mathbf{X}$. During the power allocation phase, each D2D pair obtains the transmit power $p_j^d$ required to optimize its transmission rate on the channel assigned in stage 1 while satisfying the interference and transmission power constraints. As each D2D pair $d_j \in D_{f_n}$ seeks to maximize its own utility independently, the power allocation optimization problem can be modeled as a non-cooperative game. Let $\mathbf{p}^{f_n}$ denote the set of power action profiles of all D2D pairs matched to the channel $f_n$; then the data rate of D2D pair $d_j$ can be rewritten as: and the corresponding power allocation optimization problem can be formulated as: subject to: which is the interference channel gain between the $j$-th D2D pair and the BS. The first term in the objective function can be considered as a gain, which approximates the achievable data rate. The second term is the cost charged, which is proportional to the ratio between the interference caused at the BS and the D2D link quality. The D2D pairs with the highest communication link quality and a deeply faded interference link to the BS obtain the best transmission power. As the objective function is concave in $p_j^d$, after solving problem (17), we can obtain the best transmit power of D2D pair $d_j$:
Swapping phase. During this phase, for the transmission power given by stage 2, the algorithm determines the peer effect and searches for blocking pairs until a stable matching between channels and D2D pairs is found. The utility of each D2D pair depends on the underlying matching state $\mu$ and is given by: where $\mathrm{SINR}_{d_j}^{f_n}$ is given by eqn. 9. The algorithm swaps two D2D pairs $d_j$ and $d_{j'}$ only when: where $\Delta R_{\mu \to \mu'}$ is the difference in system sum rate between the matchings $\mu'$ and $\mu$.
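The pricing-based power control of the second stage can be illustrated with a closed-form best response. The utility used below, $\log_2(1 + p\,g/I) - c\,p\,h/g$, is an assumed instance of the "rate gain minus interference cost" structure described in the text, not the paper's exact objective; the names `g_dd` (D2D link gain), `g_db` (D2D-to-BS interference gain), `price` and `interference_plus_noise` are hypothetical.

```python
import math


def utility(p, g_dd, g_db, interference_plus_noise, price):
    """Assumed utility: rate gain minus a cost proportional to g_db / g_dd."""
    gain = math.log2(1.0 + p * g_dd / interference_plus_noise)
    cost = price * p * g_db / g_dd
    return gain - cost


def best_response_power(g_dd, g_db, interference_plus_noise, price, p_max):
    """Maximize the assumed concave utility over 0 <= p <= p_max.

    Setting the derivative to zero gives
        p* = g_dd / (price * g_db * ln 2) - interference_plus_noise / g_dd,
    and, because the utility is concave, clipping p* to [0, p_max]
    yields the constrained maximizer.
    """
    p_star = g_dd / (price * g_db * math.log(2)) - interference_plus_noise / g_dd
    return min(max(p_star, 0.0), p_max)
```

In this sketch a strong D2D link and a weak interference link to the BS both push the unconstrained optimum up, which matches the qualitative behavior described above.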

Initialization
For all D2D pairs $d_j \in D$ and channels $f_n \in F$, build the preference lists $P_{d_j}$ and $P_{f_n}$ based on their preference functions.

Repeat
(a) Propose
Each D2D pair $d_j \in D_{unmatch}$ proposes to its most preferred channel $f_n$ and removes it from its preference list.
(b) Accept/reject
Each channel $f_n$ keeps its most preferred D2D proposals and removes them from the unmatched D2D list.
Until the set of unmatched D2D pairs $D_{unmatch} = \emptyset$ or the preference list of every unmatched D2D pair is empty.

Repeat
(a) If $(d_j, d_{j'})$ form a swap blocking pair, such that $f_n \in \mu(d_j)$ and $f_{n'} \in \mu(d_{j'})$:
i. Update the current matching state to $\mu^* = (\mu^*(d_j), \mu^*(f_n))$, such that
ii. Else, if no blocking pair exists, hold the current matching state.
Until there is no blocking pair or the matchings of two consecutive iterations are identical.
End of the algorithm.
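The swapping loop at the end of the algorithm can be sketched as follows. This is an illustration under a stated assumption: a swap is accepted whenever it strictly increases a user-supplied system sum-rate function, which stands in for the swap condition of the text. The names `swap_matching`, `sum_rate` and `max_rounds` are hypothetical.

```python
def swap_matching(assignment, sum_rate, max_rounds=100):
    """Repeatedly swap the channels of two D2D pairs while the sum rate improves.

    assignment -- dict: D2D pair -> channel (output of the first two stages)
    sum_rate   -- callable taking an assignment dict and returning the
                  system sum rate (assumed to include co-channel interference)
    max_rounds -- safety cap on the number of passes over all pairs
    """
    current = dict(assignment)
    pairs = list(current)
    for _ in range(max_rounds):
        swapped = False
        for i, d1 in enumerate(pairs):
            for d2 in pairs[i + 1:]:
                if current[d1] == current[d2]:
                    continue  # swapping within one channel changes nothing
                candidate = dict(current)
                candidate[d1], candidate[d2] = current[d2], current[d1]
                # Accept the swap only if the system sum rate strictly increases.
                if sum_rate(candidate) > sum_rate(current):
                    current = candidate
                    swapped = True
        if not swapped:
            break  # two consecutive matchings are identical
    return current
```

Swapping two pairs leaves the number of pairs on each channel unchanged, so the per-channel quota $q$ is preserved throughout.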
Remark: In the implementation, the BS makes decisions on behalf of the resource blocks. The interference channel gain between the D2D pairs and the BS can be estimated by the BS using the pilot signal or any standard channel estimation technique. Once this information is acquired, the BS can rank each D2D pair for each RB in the preference profile. A D2D pair can be any pair of devices equipped with D2D technology that can switch between direct communication mode and cellular communication mode as needed.

Properties of the Proposed Algorithm
In this subsection, we analyze the properties of Algorithm1 in terms of stability, convergence and complexity in detail.
Stability. Algorithm1 converges to a stable allocation.
To prove the stability of our proposed algorithm, consider the following stability definition. Definition2. A matching $\mu$ is swap stable if there exists no swap blocking pair $(d_j, f_n)$ and $(d_{j'}, f_{n'})$, $\forall d_j$, where $\mu_{t-1}(d_j)$ and $\mu_{t-1}(d_{j'})$ represent the current matched partners of $d_j$ and $d_{j'}$, i.e. $f_n$ and $f_{n'}$ respectively.
where $\mu_{t-1}(f_n)$ and $\mu_{t-1}(f_{n'})$ represent the current matched partners of $f_n$ and $f_{n'}$, i.e. $d_j$ and $d_{j'}$ respectively.
From these stability properties, the matching derived from our proposed algorithm is stable. Proof: To prove the stability of the proposed algorithm by contradiction, let us first assume that there exists a blocking pair formed by D2D pair $d_j \in D$ and resource block $f_n \in F$ under the matching $\mu$, i.e. $f_n \succ_{d_j} \mu(d_j)$ and $\mu(d_j) \neq f_n$, $d_j \succ_{f_n} \mu(f_n)$.
In the matching process, each D2D pair proposes to its most preferred resource block according to its preference profile in order to maximize its utility value.
Given the assumption $f_n \succ_{d_j} \mu(d_j)$, D2D pair $d_j$ must have already proposed to resource block $f_n$ before proposing to $\mu(d_j)$, according to the rules defined in Algorithm1. However, the fact that $\mu(f_n) \neq d_j$ in the matching means that resource block $f_n$ prefers $\mu(f_n)$ to D2D pair $d_j$. Therefore, the channel block $f_n$ is not willing to break its current matching to pair with D2D pair $d_j$, i.e. the condition $d_j \succ_{f_n} \mu(f_n)$ cannot hold when $f_n \succ_{d_j} \mu(d_j)$. Hence, the blocking pair formed by D2D pair $d_j$ and resource block $f_n$ does not exist, which contradicts the original assumption. Thus, the matching $\mu$ derived from our algorithm is stable.

Convergence.
As stated in the proposed algorithm, each D2D pair that has not been matched proposes to its most preferred channel based on its preference list. Each channel accepts its most preferred D2D pairs and rejects the rest. As each D2D pair proposes only once to each channel in its preference list, the channel assignment procedure ends when all D2D pairs have been accepted or rejected by the BS. From eqn. 20, the system sum rate of Algorithm1 increases after each successful swap operation. Since the system sum rate has an upper bound due to the cellular interference constraint and limited spectrum resources, the swap operations terminate when the system sum rate saturates. Hence, we can conclude that the proposed matching algorithm converges after a finite number of iterations.
Complexity. We recall that Algorithm1 consists of three stages. In the first stage, each D2D pair calculates its preference function for all resources to create its channel preference profile. Hence, the complexity of the channel assignment stage is $O(|F| \cdot |D|)$, where $|F|$ and $|D|$ are the numbers of channels and D2D pairs. The complexity of the second stage is $O(|F| \cdot q)$. The complexity of the swapping stage depends on the number of iterations needed for Algorithm1 to converge. As will be seen in Section 5, Algorithm1 converges within a few iterations. Hence, the complexity of the swapping stage is upper bounded by $O(|F| \cdot q \cdot I_{iteration})$, where $I_{iteration}$ represents the number of iterations required to converge. Then, the overall computational complexity of Algorithm1 can be approximated as $O(|F|(|D| + q(1 + I_{iteration})))$.
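As a worked illustration of how the three stage costs combine, consider the hypothetical values $|F| = 10$, $|D| = 50$, $q = 5$ and $I_{iteration} = 3$. These numbers are chosen here only for the arithmetic and are not simulation settings from the paper.

```latex
\underbrace{|F|\,|D|}_{\text{stage 1}} + \underbrace{|F|\,q}_{\text{stage 2}}
  + \underbrace{|F|\,q\,I_{iteration}}_{\text{stage 3}}
  = 10 \cdot 50 + 10 \cdot 5 + 10 \cdot 5 \cdot 3
  = 500 + 50 + 150 = 700

|F|\,\bigl(|D| + q\,(1 + I_{iteration})\bigr) = 10\,(50 + 5 \cdot 4) = 700
```

For these values the channel assignment stage dominates the count, and the swapping stage grows only linearly with the number of iterations.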

Numerical Simulation
In this section, we discuss and analyze simulation results to evaluate the average sum rate performance of Algorithm1. We use MATLAB to compute the numerical solutions of our proposed scheme. In the simulation, we consider a single cell with a radius of 600 m and a multi-antenna BS located at the cell center, in which the D2D pairs and cellular users are randomly distributed. The channel used in the simulation is modeled as $h = \beta d^{-\alpha}$, where $\beta$ is the small-scale fast fading gain, $\alpha$ is the path loss exponent and $d$ is the distance between the transmitter and the receiver. Other simulation parameters are listed in Table 4 unless mentioned otherwise.
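The channel model and random user placement described above can be sketched in a few lines. This is a hedged illustration rather than the paper's MATLAB setup: it assumes Rayleigh fading (a unit-variance complex Gaussian $\beta$), a path loss of the form $d^{-\alpha}$, and uniform placement over a disc. The function names are hypothetical.

```python
import math
import random


def random_position(radius_m, rng):
    """Draw a point uniformly at random inside a disc of the given radius."""
    r = radius_m * math.sqrt(rng.random())  # sqrt keeps the area density uniform
    theta = 2.0 * math.pi * rng.random()
    return (r * math.cos(theta), r * math.sin(theta))


def sample_channel(distance_m, alpha, rng):
    """Sample h = beta * d^(-alpha) with beta ~ CN(0, 1) (assumed Rayleigh fading)."""
    beta = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) / math.sqrt(2.0)
    return beta * distance_m ** (-alpha)


rng = random.Random(0)
x, y = random_position(600.0, rng)           # user dropped in a 600 m cell
h = sample_channel(math.hypot(x, y), 3.0, rng)  # channel to the BS at the origin
```

With the fading sample held fixed, doubling the distance scales $|h|$ by $2^{-\alpha}$, which is the path loss behavior the model encodes.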
To validate the performance of our proposed algorithm, we evaluate and compare the performance of Algorithm1 with the following four different benchmark schemes.

1. Random Channel Assignment and Equal Power Allocation (RCAEPA): Here, we randomly assign a channel to each D2D pair while meeting the interference tolerance requirements of the cellular users. The transmit power of each D2D user is set to the maximum transmission power.
2. Random Channel Assignment and Power Allocation (RCAPA): Here, we randomly assign a channel to each D2D pair while meeting the interference tolerance requirements of the cellular users. The transmit power of each D2D user is set according to the proposed scheme in Algorithm1.
3. Channel Assignment and Equal Power Allocation (CAEPA): Here, the channel assignment is performed according to Algorithm1, while the transmit power of each D2D user is set to the maximum transmission power.
4. Optimal Matching: The optimal matching explores every possible channel assignment to find the optimum one, while the transmit power of each D2D user is set according to Algorithm1.
Figure 2. Average sum rate of Algorithm1 compared to optimal and random matchings, where Imax = -25 dB and q = 5.
Figure 2 evaluates the achievable average sum rate for different D2D transmission powers to compare the proposed algorithm with the random and optimal matching algorithms. In the random matching, each D2D user selects resources randomly. The exhaustive (optimal) matching algorithm explores every possible solution to find the optimum one. As shown in Figure 2, the gap between the average sum rate of the proposed algorithm and that of the exhaustive one is small: the proposed algorithm reaches 86% of the optimal result. The proposed algorithm also achieves a much higher data rate than the random algorithm.
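The two reference matchings, random and exhaustive, can be sketched on a toy instance. This is a simplified illustration under stated assumptions, not the simulation code: the rate of a D2D pair on a channel is taken from a fixed hypothetical table (in the paper it depends on the power control and on co-channel users), the interference constraint is modeled as a cap `i_max` on the summed interference per channel, and all names are hypothetical. The exhaustive search enumerates every assignment, so its cost grows exponentially with the number of D2D pairs.

```python
import itertools
import random


def sum_rate(assignment, rate):
    """Total rate of the matched pairs; assignment[d] is a channel index or None."""
    return sum(rate[d][f] for d, f in enumerate(assignment) if f is not None)


def feasible(assignment, interference, i_max, quota):
    """Check the quota and the aggregate interference cap on every used channel."""
    for f in {g for g in assignment if g is not None}:
        members = [d for d, g in enumerate(assignment) if g == f]
        if len(members) > quota:
            return False
        if sum(interference[d][f] for d in members) > i_max:
            return False
    return True


def exhaustive_matching(rate, interference, i_max, quota):
    """Try every assignment (including leaving pairs unmatched) and keep the best."""
    n_pairs, n_channels = len(rate), len(rate[0])
    options = list(range(n_channels)) + [None]
    best, best_value = None, float("-inf")
    for assignment in itertools.product(options, repeat=n_pairs):
        if feasible(assignment, interference, i_max, quota):
            value = sum_rate(assignment, rate)
            if value > best_value:
                best, best_value = assignment, value
    return best, best_value


def random_matching(rate, interference, i_max, quota, rng):
    """Each pair requests a random channel; the BS rejects infeasible requests."""
    n_pairs, n_channels = len(rate), len(rate[0])
    assignment = [None] * n_pairs
    for d in range(n_pairs):
        assignment[d] = rng.randrange(n_channels)
        if not feasible(assignment, interference, i_max, quota):
            assignment[d] = None               # request rejected, pair stays unmatched
    return tuple(assignment)


# Hypothetical toy data: 3 D2D pairs, 2 channels.
rate = [[4, 1], [3, 2], [1, 5]]
interference = [[1, 1], [1, 1], [1, 1]]
best, best_value = exhaustive_matching(rate, interference, i_max=1, quota=2)
rand = random_matching(rate, interference, i_max=1, quota=2, rng=random.Random(0))
print(best, best_value, rand, sum_rate(rand, rate))
```

By construction the random matching can never exceed the exhaustive optimum; the simulated gap between the two is what Figure 2 quantifies.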
The performance of the proposed matching algorithm is compared with random and optimal matching in terms of the sum rate of the cellular and D2D communications in Figure 3 and Figure 4, respectively. Figure 3 shows that as the D2D transmission power increases, the sum rate of the CUs decreases, because a higher D2D transmission power causes more interference to the existing cellular communication. In the random matching algorithm, each D2D user selects a channel randomly and sends a request for this channel to the BS. If the mutual interference between the D2D user and the existing cellular communication is above the threshold, the BS rejects the request. Figure 3 also shows that the cellular users achieve poor performance under the random scheme. The reason is that the cellular users are not paired with their best D2D partners, and hence the D2D-to-cellular interference is higher. Figure 4 shows that the sum data rate of D2D communications under the proposed matching is much higher than under random matching. This is because, in the proposed matching, each D2D user requests channels from the BS based on its own preference function. Moreover, as Figure 4 shows, the D2D sum data rate of the proposed matching algorithm is more than 83% of that of the optimal matching. Therefore, the proposed matching algorithm provides near-optimal performance for D2D communications.
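The trend in Figure 3 can be illustrated with a generic single-interferer rate expression. This is only a sketch under assumptions: it uses the textbook Shannon rate log2(1 + SINR) with one cellular user and one co-channel D2D transmitter, not the massive MIMO rate expression of the system model, and all gains, powers, and the noise level are hypothetical values.

```python
import math


def cellular_rate(p_cu, g_cu, p_d2d, g_d2d_to_bs, noise):
    """Rate in bit/s/Hz of a cellular uplink with one co-channel D2D interferer."""
    sinr = p_cu * g_cu / (noise + p_d2d * g_d2d_to_bs)
    return math.log2(1.0 + sinr)


# Hypothetical values: powers in watts, linear channel gains, noise power in watts.
d2d_powers = [0.01, 0.05, 0.1, 0.2]
rates = [
    cellular_rate(p_cu=0.2, g_cu=1e-9, p_d2d=p, g_d2d_to_bs=1e-10, noise=1e-13)
    for p in d2d_powers
]
for p, r in zip(d2d_powers, rates):
    print(f"D2D power {p:.2f} W -> cellular rate {r:.2f} bit/s/Hz")
```

Since the D2D power appears only in the denominator of the SINR, the cellular rate decreases monotonically as the D2D transmission power grows.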
Figure 5 compares the average sum rate of Algorithm1 with that of the three benchmark schemes CAEPA, RCAPA, and RCAEPA. As the interference tolerance level increases, the average sum rate increases, because more D2D pairs are allowed to share the channel. Algorithm1 always achieves the highest average data rate. This is because Algorithm1 matches the cellular resources and D2D pairs according to their preference values, which the random-assignment schemes RCAPA and RCAEPA do not, and thereby makes full use of the benefits of D2D communications. Compared to the CAEPA scheme, Algorithm1 further improves the sum data rate through its power allocation algorithm. The data rate gaps between Algorithm1 and the other three schemes in Figure 5 indicate that the joint optimization of channel assignment and power control effectively improves the system performance.
Figure 6 shows the average data rate as a function of the number of cellular users under three different interference tolerance levels, i.e., Im = -25 dB, Im = -20 dB, and Im = -15 dB. The average sum rate increases with the interference tolerance, because a higher interference tolerance allows more D2D pairs to share the resource. However, when a small number of cellular users is introduced, there is a substantial probability that the interference from the cellular users reduces the D2D user rates, and this reduction is not compensated in the average sum rate by the contribution of the cellular user rates. Furthermore, as the number of cellular users increases, even though the rate per D2D link decreases, there is a local minimum after which the average sum rate begins to increase again.
Figure 7 and Figure 8 show the cellular and D2D users' data rates, respectively, as a function of the number of cellular users for three different interference tolerance levels.
As shown in Figure 7, the sum data rate of the cellular users increases monotonically with the number of cellular users and decreases as the interference tolerance requirement grows. On the other hand, Figure 8 shows that the sum data rate of the D2D pairs decreases with the number of cellular users and increases with the interference tolerance requirement.
Figure 9 shows the achievable average sum rate as a function of the number of D2D users for three different maximum interference tolerance thresholds, i.e., Im = -25 dB, Im = -20 dB, and Im = -15 dB. The average sum rate increases with the number of D2D users but saturates once the number of D2D users becomes large enough. This is because some D2D pairs may not be allowed to transmit due to the interference constraints.
Figure 10 and Figure 11 show the cellular and D2D users' data rates, respectively, as a function of the number of D2D users for three different interference tolerance levels. As can be seen from Figure 10, the sum data rate of the cellular users decreases as the number of D2D users increases and increases as the interference tolerance requirement decreases. On the other hand, Figure 11 shows that the sum data rate of the D2D pairs increases with both the number of D2D users and the interference tolerance requirement.
Figure 12 demonstrates how the additional degrees of freedom provided by the multi-antenna BS affect the achievable average sum rate in the D2D underlay massive MIMO system. The average sum rate increases with the number of BS antennas. This is because increasing the number of BS antennas yields a favorable propagation environment in which the channel vectors between the users and the BS are pairwise orthogonal, which diminishes the effect of inter-user interference.
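The favorable propagation effect behind Figure 12 can be checked numerically. The sketch below assumes independent Rayleigh fading, that is, channel vectors with i.i.d. zero-mean, unit-variance complex Gaussian entries; this fading model and the function names are assumptions made for illustration only. It estimates the average magnitude of the normalized inner product between two user channel vectors for an increasing number of BS antennas.

```python
import random


def complex_gaussian_vector(m, rng):
    """Vector of m i.i.d. complex Gaussian entries with zero mean and unit variance."""
    s = 0.5 ** 0.5                             # per-component standard deviation
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(m)]


def mean_normalized_correlation(m, trials, rng):
    """Monte Carlo estimate of E[|h1^H h2|] / m for two independent users."""
    total = 0.0
    for _ in range(trials):
        h1 = complex_gaussian_vector(m, rng)
        h2 = complex_gaussian_vector(m, rng)
        inner = sum(a.conjugate() * b for a, b in zip(h1, h2))
        total += abs(inner) / m
    return total / trials


rng = random.Random(0)
correlation = {m: mean_normalized_correlation(m, 200, rng) for m in (8, 64, 512)}
for m, value in correlation.items():
    print(f"M = {m:3d} antennas: mean normalized correlation = {value:.3f}")
```

Under this assumption the normalized correlation shrinks roughly in proportion to one over the square root of the number of antennas, so the user channels become closer to orthogonal as the array grows.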
This result indicates that, for a D2D-underlaid system with a multi-antenna BS, Algorithm1 can address the severe D2D-to-cellular interference; hence, D2D communications and massive MIMO can coexist while maintaining the performance requirements of the primary cellular users.
Figure 13 and Figure 14 show the D2D and cellular users' data rates, respectively, as a function of the number of D2D users for three different numbers of BS antennas. As shown in Figure 13, the sum data rate of D2D communications increases with the number of BS antennas. This is because of the increasing diversity gain and orthogonality of the channel vectors. Hence, in the D2D-enabled massive MIMO system, the performance of the D2D communications sharing the cellular channels can be improved by increasing the number of BS antennas while maintaining the cellular users' quality of service requirements. On the other hand, Figure 14 shows that as the number of D2D users increases, the sum rate of the cellular users decreases, because the D2D-to-cellular interference grows as more D2D pairs are allowed to share the same channel. Increasing the number of BS antennas also increases the cellular transmission rate.
Figure 15 shows the average data rate versus the number of iterations needed to reach a stable matching. The average data rate increases until a stable matching is established. The average data rate also increases with the interference threshold value. This is because a higher interference threshold raises the interference tolerance of the existing cellular communication, so more D2D pairs are matched and, consequently, the data rate of the proposed matching algorithm increases. The simulation result thus demonstrates the convergence of the proposed algorithm.
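The monotone increase and termination seen in the convergence curve can be reproduced with a simplified swap loop. The sketch makes assumptions that differ from Algorithm1: the rate of a pair on a channel comes from a fixed hypothetical table, and a swap is accepted whenever it strictly increases the sum rate, which is a simpler rule than the swap-blocking-pair condition. Since every accepted swap strictly increases the sum rate and the number of assignments is finite, the loop must stop.

```python
def swap_until_stable(assignment, rate):
    """Exchange the channels of two D2D pairs while doing so raises the sum rate.

    assignment[d] is the channel index of pair d; rate[d][f] is a fixed rate table.
    Returns the final assignment and the sum rate after each accepted swap.
    """
    assignment = list(assignment)
    n = len(assignment)
    history = [sum(rate[d][assignment[d]] for d in range(n))]
    improved = True
    while improved:
        improved = False
        for i in range(n):
            for j in range(i + 1, n):
                fi, fj = assignment[i], assignment[j]
                if fi == fj:
                    continue                   # same channel: nothing to exchange
                gain = rate[i][fj] + rate[j][fi] - rate[i][fi] - rate[j][fj]
                if gain > 0:                   # accept only strictly improving swaps
                    assignment[i], assignment[j] = fj, fi
                    history.append(history[-1] + gain)
                    improved = True
    return assignment, history


# Hypothetical rate table for 3 D2D pairs and 3 channels.
swap_rate = [[1, 5, 2], [4, 2, 3], [3, 1, 6]]
final_assignment, rate_history = swap_until_stable([0, 1, 2], swap_rate)
print(final_assignment, rate_history)
```

The recorded history plays the role of the curve in Figure 15: it is non-decreasing and flattens once no improving swap remains.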

Conclusion
In this paper, we modeled the joint channel assignment and power control problem for managing interference and improving the average sum rate when D2D communications and massive MIMO coexist as a non-linear optimization problem, which is NP-hard. To address this problem while maintaining low complexity and network scalability, we proposed a three-stage stable and distributed resource optimization scheme in which the D2D users and the BS interact and make resource allocation decisions based on locally available information. The results demonstrated that, compared to its random counterparts, the proposed scheme can largely improve the average data rate performance with much lower overhead and complexity.