A K-Anonymous Location Privacy-Preserving Scheme for Mobile Terminals

Mobile terminals have boosted the growth of location-based services (LBS), which are now involved in every aspect of people's daily lives and are increasingly used in various industries. To address the security and efficiency problems of existing location privacy protection schemes, a K-anonymity location privacy preservation scheme based on the mobile terminal is proposed. First, a number of rational dummy locations are selected from the cloaking region, and the more favorable locations among them are then filtered according to location entropy, so that a better anonymity effect can be achieved. Second, a secure and efficient m-out-of-n oblivious transfer protocol is adopted, which not only removes the dependency on the trusted anonymity center found in existing schemes and thereby improves efficiency, but also supports querying multiple points of interest at one time. Security analyses demonstrate that the scheme satisfies security properties such as anonymity, non-forgeability and resistance to replay attack, and simulation results show that the scheme achieves higher execution efficiency and privacy level while incurring low communication costs.


Introduction
With the development of computer technology, global positioning systems and wireless communication networks, location-based service (LBS) technology [1][2][3] has become increasingly popular. In an LBS, a location service provider obtains the location of a device through various positioning technologies and provides the device with the specific services it requests via the Internet. Typical applications include vehicle navigation, online taxi hailing, takeout and ticketing services. LBS not only brings great convenience to users, but also changes people's daily behavior. However, as users become increasingly dependent on LBS, their location privacy is at correspondingly greater risk of disclosure [4][5].
When requesting an LBS, a mobile user needs to submit the current location and query information to the LBS server in real time, which makes it possible to reconstruct the user's location trajectory from the spatio-temporal relationship between requests. This trajectory can be used to infer the user's whereabouts, home address and workplace, and further to obtain private information such as religious beliefs, living habits and medical information [6][7]. If an attacker obtains such information, the user's location-related privacy is at great risk of disclosure. Consequently, location privacy protection is one of the research focuses in the current field of mobile network security.
Focusing on the problem of location privacy protection, this paper combines the K-anonymity technique with the Oblivious Transfer (OT) protocol to propose a K-anonymity location privacy protection scheme based on the mobile terminal (KBMT). In this scheme, the user first sends k IDs (including that of the real user) to the LBS server as a registration request, and the LBS server generates pseudonyms and public/private key pairs for the k IDs after receiving the request. When registration is completed, the mobile terminal generates and selects dummy locations and completes the location service query with the LBS server on the basis of the OT protocol.
This paper makes the following contributions. First, we propose a K-anonymity location privacy protection scheme based on the mobile terminal, which combines an ID-based cryptosystem, K-anonymity technology and the OT protocol to remove the dependency of privacy security on the trusted third party in existing schemes, and to accomplish multiple services with a single request. Second, we design a security enhancement algorithm for generating and selecting virtual locations, which reduces the risk of leaking a user's real location by providing more confusing virtual locations. Third, we verify the effectiveness and security of the scheme in a simulation experiment environment.
The rest of this paper is organized as follows. Sect. 2 presents work related to the research in this paper. Some necessary preliminaries are described in Sect. 3. The K-anonymity privacy protection scheme based on the mobile terminal is proposed in Sect. 4. The security analysis of the proposed scheme is given in Sect. 5. Sect. 6 focuses on the simulation of the proposed scheme, and conclusions are drawn in Sect. 7.

Location privacy
In recent years, scholars at home and abroad have done a great deal of research on privacy protection technology and achieved positive results [24][25]. According to the architecture of the location privacy protection system, these results can be divided into two categories: location privacy protection based on a trusted anonymity center, and location privacy protection based on the mobile terminal.
The architecture based on a trusted anonymity center is also called the trusted third-party (TTP) architecture and was first proposed by Gedik and Liu [8]. A location privacy protection scheme based on the TTP architecture introduces a trusted anonymity center between the user and the LBS server. This center usually adopts privacy protection techniques [9][10][11] to anonymize the user's service request message and relays messages between the user and the LBS server, which protects the user's location privacy and reduces the storage and computation costs of the user's terminal. K-anonymity [12][13][14][15], the most widely used location privacy protection technology, forms a cloaking region that includes at least K-1 other users; this cloaking region then replaces the real user in the service request sent to the LBS server. This reduces the precision of the user's location, so that the probability of an attacker identifying the real user in the cloaking region is less than 1/K. To protect both the user's location and the query location, Kuang et al. [16] carried out a bidirectional K-disturbance based on the semantics of the user's location and the query location, in which the user matches the road section with the highest security according to a sensitivity preference while satisfying K-anonymity. K-anonymity can prevent the leakage of identity information, but fails to prevent the leakage of attribute information. To address this problem, Tu et al. [17] proposed a privacy-preserving scheme that prevents semantic and re-identification attacks by employing three data masking methods: k-anonymity, l-diversity and t-closeness. Wang et al. [21] proposed a location privacy preservation method based on k-anonymity and Voronoi maps, which ensures the privacy of location information while guaranteeing the security of the process and high-quality service.
As the performance of mobile terminals keeps improving, their computation and storage capabilities have also improved greatly. As a result, privacy protection schemes based on mobile terminals have become feasible, and they can help to solve the performance bottleneck and the dependency of security on the trusted anonymity center in existing schemes. Li et al. [22] proposed using a hidden Markov transfer matrix model to predict the user's motion trajectory, with the position predicted for the next moment used as the query content of the previous moment. Yang et al. [23] proposed a k-anonymous location privacy protection scheme based on dummies and a Stackelberg game, which can effectively resist single-point attacks and inference attacks while balancing service quality and location privacy. Although these schemes avoid the dependency on the trusted anonymity center, they focus only on the user's privacy and ignore the privacy of the LBS server. If a single user is able to infer the overall information of the LBS server from the partial information obtained, other users' private information is at risk of disclosure and the LBS server may be rendered ineffective. Therefore, the privacy of the LBS server should also be protected.

Architecture of location privacy protection system
Based on K-anonymity and the OT protocol, this paper devises a location privacy protection architecture without a third party. It is composed of two entities, the mobile terminal (MT) and the LBS server, as shown in Figure 1. Their functions are as follows:
MT: sends the anonymization request to the LBS server, generates and selects dummy location nodes, sends the location query request to the LBS server, and receives the query result.
LBS server: handles the user's registration and queries for points of interest, encrypts the query results, and returns them to MT.

Position entropy
Given a cloaking region that contains $k$ candidate locations, the probability that location $i$ is the real location is denoted $\Pr(i)$ (Equation (1)), and the location entropy is then
$$H = -\sum_{i=1}^{k} \Pr(i) \log_2 \Pr(i). \qquad (2)$$
Equations (1) and (2) can be used to obtain the location entropy of the candidate nodes; the higher the value, the more secure the privacy protection. In particular, when all $\Pr(i)$ are equal, the entropy reaches its maximum and the privacy protection is strongest.
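The entropy computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: it assumes that Pr(i) is obtained by normalising non-negative per-location weights (for example, historical query counts), and the function name `location_entropy` is hypothetical.

```python
import math

def location_entropy(weights):
    """Shannon entropy (in bits) of the normalised weights.

    Assumption: Pr(i) = weights[i] / sum(weights); the paper's own
    definition of Pr(i) may differ.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights]
    # Locations with zero probability contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = location_entropy([1, 1, 1, 1])  # four equally likely locations
skewed = location_entropy([7, 1, 1, 1])   # one location dominates
```

With four equally likely locations the entropy is 2 bits, the maximum for four candidates; the skewed set scores lower, so it hides the real location less well.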

System Initialization
In this phase, system parameters are generated as follows:
Step 1: Select two cyclic groups $G_1$ and $G_2$ of order $q$, in which $G_1$ is an additive cyclic group, $G_2$ is a multiplicative cyclic group, and $q$ is a large prime number. Let $e: G_1 \times G_1 \rightarrow G_2$ denote a bilinear pairing.
Step 2: Define three hash functions $H_1$, $H_2$ and $H_3$ [...]; $H_3$ is the SHA-256 hash function; $n$ denotes an integer, while $\{0,1\}^*$ is the set of binary strings of arbitrary length.
Step 3: The LBS server selects a random number $s \in Z_q^*$ as the system's private key and calculates its public key $PK = sP$, in which $P$ is the generator of $G_1$.
Step 4: The LBS server stores the system's private key $s$ and publishes the public parameters $\{G_1, G_2, e, n, q, P, PK, H_1, H_2\}$.

User registration
In this phase, MT sends $k$ identities (including its own) to the LBS server as a registration query message; the LBS server then generates pseudonyms and the corresponding public/private key pairs for these identities and returns them to the user. The specific steps are as follows:
Step 1: MT sends $k$ pieces of identity information to the LBS server as a registration query, in which the user's identity $ID_U$ is located at the $u$-th position in

Generation and selection of dummy locations
The user B generates $2k$ dummy locations via MT, from which the $k-1$ more favorable locations are selected. The specific procedure is as follows:
Step 1: Centered on $L_B$, the location of the user B, MT generates a dummy location $L_i$ by using an algorithm for uniformly distributed random points in a rectangular region. Suppose the rectangular region is [...], where [...] is a member of the dummy location set $C$, and add it to the location set [...]
Step 4: Based on formula (1), calculate $\Pr(i)$, the probability of each dummy location being the real location, then select from the $2k$ dummy locations the $k-1$ dummy locations with higher [...]
Step 5: MT allocates fake identities to the $k$ location nodes (including MT itself).
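The generate-then-select procedure can be sketched as follows. This is a hedged illustration of Steps 1, 4 and 5 only. The function names, the rectangle parameters `half_width` and `half_height`, and `toy_score` are all hypothetical, and the ranking criterion is passed in as a caller-supplied `score` function because the exact quantity used for ranking is not recoverable from the text.

```python
import random

def generate_dummies(center, half_width, half_height, k, rng=None):
    """Draw 2k points uniformly from a rectangle centred on `center`."""
    rng = rng or random.Random()
    cx, cy = center
    return [
        (rng.uniform(cx - half_width, cx + half_width),
         rng.uniform(cy - half_height, cy + half_height))
        for _ in range(2 * k)
    ]

def select_dummies(candidates, score, k):
    """Keep the k-1 candidates with the highest score."""
    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[:k - 1]

def toy_score(point):
    # Hypothetical stand-in for the paper's ranking quantity:
    # points closer to the origin score higher.
    x, y = point
    return -(x * x + y * y)

rng = random.Random(0)                      # fixed seed for repeatability
real_location = (0.0, 0.0)
candidates = generate_dummies(real_location, 1.0, 0.5, k=5, rng=rng)
chosen = select_dummies(candidates, toy_score, k=5)
anonymity_set = chosen + [real_location]    # k locations in total
```

In a real deployment the score would come from map data or query statistics, and the real location would be placed at a random position in the set, not appended at the end.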

Location service request
In this phase, MT sends a location service request to the LBS server, and the LBS server answers the request. The steps are as follows:
Step 1: The LBS server randomly selects $P_1, P_2, \ldots, P_n$ and releases them as base points for selection.
Step 2: The user B randomly selects [...]
Step 3: Gather the generated fake identities, location nodes and dummy query information to form a query set [...] (for multiple query requests, calculate multiple query results); if not valid, the user B discards this query result, returns to Step 1 and restarts the service request.

Correctness Analysis
The correctness of this scheme can be proved if the user can succeed in obtaining the query result $m_u$ [...], which is negligible. Therefore, the scheme meets the requirement for anonymity.

Resistance to replay attack
Definition 2: The attacker A re-sends the user's request messages for registration and location service that have already been processed by the LBS server, so as to obtain the same results as the user B. The information known to the attacker A includes $Q$, the request message for location service, and $P_1, P_2, \ldots, P_n$, the selected base points that the LBS server publishes.
Theorem 2: If the probability that the attacker A obtains the same registration results and location service request results as the user B is negligible, then this scheme is able to resist replay attack.
Proof: It is known that the result of the user's registration request is [...], in which the pseudonym is [...]. Since $r$ and $a_i$ are both one-time random numbers generated by the LBS server, the attacker still cannot obtain the same $Y_0$ and [...] $(i = 1, 2, \ldots, k)$ as the user, even if he obtains the location request message and replays it. Moreover, each ciphertext has the form $c = m \oplus H_2(e(\cdot, \cdot))$ [...], in which $r$ is a one-time random number, so the attacker cannot obtain the same ciphertexts $c_i$ $(i = 1, 2, \ldots, k)$ from the location service request. Even if the attacker obtains a ciphertext $c_i$, he still has to decrypt it.
Through the analysis above, if the attacker A attempts to obtain the same registration results and location service results as the user B via replay, the probability of success is negligible, so this scheme is able to resist replay attack.
[...] $\{G_1, G_2, e, n, q, P, PK, H_1, H_2\}$ and confidential master key $s$.
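The replay argument rests on the server drawing a fresh one-time random number for every response. The sketch below illustrates that idea only. It assumes, purely for illustration, that a pseudonym is the SHA-256 digest of the identity concatenated with a fresh nonce; the paper's actual pseudonym formula is not recoverable from the text, and `issue_pseudonym` is a hypothetical name.

```python
import hashlib
import secrets

def issue_pseudonym(identity):
    """Return (pseudonym, nonce) for one registration request.

    Assumption: pseudonym = SHA-256(identity || nonce). This stands in
    for the paper's construction, which is not reproduced here.
    """
    nonce = secrets.token_bytes(16)  # fresh one-time random number
    digest = hashlib.sha256(identity.encode("utf-8") + nonce).hexdigest()
    return digest, nonce

first, r1 = issue_pseudonym("ID_u")
replayed, r2 = issue_pseudonym("ID_u")  # attacker resends the same identity
```

Because the nonce differs on every call, resending an identical request yields a different pseudonym except with negligible probability, which is the property the proof relies on.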

Non-forgeability
(2) Training stage: The attacker A sends identity information $ID_i$ to the challenger C and requests an answer from the random oracle $H_1$; the challenger C runs the key generation algorithm and returns the corresponding public/private key pair to the attacker A. This stage can be repeated a polynomially bounded number of times.
(3) Challenge stage: The attacker A randomly specifies a user's identity $ID_u$ [...] $\{G_1, G_2, e, n, q, P, PK, H_1, H_2\}$ and the communication messages between the user and the LBS server, in which $PK = sP$ and $s$ is a random number generated by the system that is unknown to the attacker A.
During the user registration stage, the attacker issues adaptive queries to the random oracle $H_1$ and obtains the corresponding hash values. The process is as follows: The attacker A asks the challenger C for the hash value

Communications costs
In a complete location service request, communication costs arise mainly in the user registration and location service request stages, because those are the only stages in which information is exchanged. The major data packages are the user's registration request message, the registration results, the location service request message and the set of encrypted results. The communication cost depends on the size of the data packages and on the anonymity degree K. The parameter K ranges over [5, 15] in this experiment. As shown in Fig. 2, communication costs grow linearly with the anonymity degree K in this scheme and in the other schemes. Since the scheme in reference [20] is based on a third-party anonymity center and all of its communication must pass through that center, its communication costs increase with K faster than those of the other schemes. The scheme in reference [18] must verify the user's historical query probability and perform discrete selection when constructing the dummy anonymous set, so its communication overhead is slightly higher than that of the present scheme. The scheme in reference [19] submits only a single perturbed location for querying, so its communication overhead is slightly lower than that of this scheme.
These results demonstrate that this scheme reduces communication costs relative to the schemes in references [18] and [20], and thus has a certain advantage among existing K-anonymity techniques.

Execution time
In this simulation experiment, the efficiency of each scheme is measured by the time required to execute its algorithms. The time cost of this scheme and of the comparison schemes arises mainly in the virtual location generation and selection phase. Since the anonymity degree determines both the number of virtual locations to be generated and the selection of the optimal virtual locations, the execution time varies with the anonymity degree K. The parameter K ranges over [10, 80] in this experiment. As shown in Fig. 3, when K is low, the execution time of this scheme is close to that of references [18] and [20] and slightly higher than that of reference [19]; as K increases, however, the execution time of reference [19] gradually exceeds that of this scheme. Reference [18] constructs an anonymous set while also requiring discrete selection of locations, so its running time gradually exceeds that of this scheme. The execution time of reference [20] increases in a stable linear fashion as K increases, but its algorithm is more complex and less efficient to execute than this scheme.

Privacy level
The location privacy level is usually measured by location entropy, which reflects the quality of the candidate nodes chosen in the dummy location generation and selection stage. The higher the entropy, the better the privacy is protected. Ideally, when all location nodes (including the real location) are equally likely, the location entropy reaches its maximum. The higher the anonymity degree K, the harder it is to distinguish the user's real location; but if K is excessively high, communication costs and efficiency suffer, so K is taken from the range [10, 100]. As shown in Fig. 4, the privacy level of all three schemes increases with the anonymity degree K. However, once K reaches a certain level, the growth slows. This is because the obfuscation ability tends to saturate as the dummy locations in the cloaking region become excessively dense, so adding more dummy locations has little further effect on privacy. Since the random scheme does not take into account map information or the rationality of dummy locations, it provides the weakest privacy protection. Although reference [19] fully considers the query probability of points of interest when generating anonymous sets, and selects points of interest whose query probability is similar to that of the user's location to constitute the anonymous set, its privacy level is still 0.2% lower than that of the present scheme.
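As a rough complement to the saturation argument above, the following worked bound assumes the Shannon form of location entropy over $K$ candidate locations. It is a sketch under that assumption and is not taken from the paper's experiments.

```latex
H \le \log_2 K, \qquad \text{with equality iff } \Pr(i) = \tfrac{1}{K} \text{ for all } i,
\\
\log_2(K+1) - \log_2 K = \log_2\!\left(1 + \tfrac{1}{K}\right) \xrightarrow[K \to \infty]{} 0,
\\
\log_2 10 \approx 3.32, \qquad \log_2 100 \approx 6.64, \qquad
\log_2\tfrac{11}{10} \approx 0.138, \qquad \log_2\tfrac{100}{99} \approx 0.0145 .
```

Even in the ideal uniform case, each additional dummy location adds less entropy than the previous one, which is consistent with the flattening of the curves as K grows.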

Conclusions
This paper proposes a K-anonymity location privacy protection scheme based on the mobile terminal. By adopting a secure and efficient m-out-of-n oblivious transfer protocol, it avoids the dependency on the trusted anonymity center in existing schemes, improves execution efficiency, and reduces communication costs.
Moreover, the scheme randomly selects 2k rational dummy locations from the cloaking region and then selects the k-1 more favorable dummy locations according to location entropy, which improves the privacy level. The security analyses demonstrate that the scheme satisfies security properties such as anonymity, resistance to replay attack and non-forgeability. A simulation experiment was also conducted to evaluate communication costs, execution efficiency and privacy level, and the results show that the proposed scheme compares favorably with the other schemes. Therefore, the scheme has theoretical significance and practical value for security research on location privacy protection. Future work will consider applying deep learning models to location privacy to further enhance security.

Figure 1. Architecture of location privacy protection system in this scheme
[...] figuring out [...]-time random number generated by the LBS server by means of a pseudo-random number generator based on encryption. Therefore, even if the attacker obtains the same identity information and resends the registration request, he still cannot acquire the identical pseudonym, which means he cannot obtain the same results from the registration request. It is known that the result of the user's registration request is [...]
[...] corresponding Random Oracle Model (ROM), devise a non-forgeability game based on chosen plaintext attack. The two sides in the game are the challenger C and the attacker A, and the model operates as follows:
(1) Initialization: the challenger C generates the public system parameters

Figure 2. The relationship between degree of anonymity K and communications costs
[...] send it to the LBS server, as attack target [...]