Protecting Privacy in Data Release by Giovanni Livraga

Author: Giovanni Livraga
Language: eng
Format: epub
Publisher: Springer International Publishing, Cham


3.10 Queries and Data Utility with Loose Associations

Group associations among fragments, which represent vertical views over the original data, are published to provide some (imprecise) information on the associations among the tuples in the fragments without exposing the sensitive associations defined among their attributes (for which the degree of uncertainty $k$ must be maintained). Group associations thus increase the utility of the released data for queries involving different fragments. However, given a set of fragments, different group associations can be defined that satisfy a given degree $k$ of looseness. Two issues must be addressed in the construction of group associations: how to select the size $k_i$ of the groups of each fragment $f_i$ such that the product of any two $k_i$ is greater than or equal to $k$; and how to group tuples within the fragments so as to maximize utility.

With respect to the first issue, sizing the groups, different combinations of values for the $k_i$ can satisfy the degree $k$ of protection. For instance, for a group association between two fragments, we can use $(k,1)$, $(\sqrt{k},\sqrt{k})$, and $(1,k)$. In the case of multiple fragments, the best utility is achieved by distributing the group sizes as evenly as possible, hence imposing on each group a size close to $\sqrt{k}$. An uneven distribution would in fact over-protect the group associations over some of the fragments (yielding a looseness much higher than the required $k$ for constraints covered by only a subset of the fragments). Experiments show that this leads to a significant reduction in the precision of the queries. For instance, a looseness of 12 over three fragments can be achieved with a (3,4,4)-grouping; a (1,12,12)-grouping would also provide the required protection overall but would probably provide little utility for the association between the second and third fragments (which would in fact be 144-loose for the constraints relevant to the second and third fragments only).
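The sizing argument above can be illustrated with a minimal Python sketch. It is not the author's algorithm: the function names `even_group_sizes` and `pairwise_looseness` are hypothetical, and the heuristic (one group size of ceil(k/c), all others c = ceil(sqrt(k))) is only one simple way to obtain sizes whose pairwise products all reach k while staying close to the square root of k.

```python
from itertools import combinations
from math import isqrt


def even_group_sizes(k, n_fragments):
    """Hypothetical heuristic: near-even group sizes whose pairwise products are >= k."""
    if k < 1 or n_fragments < 2:
        raise ValueError("need k >= 1 and at least two fragments")
    root = isqrt(k)
    c = root if root * root == k else root + 1  # c = ceil(sqrt(k))
    a = -(-k // c)                              # a = ceil(k / c), so a <= c and a * c >= k
    # At most one size may fall below sqrt(k); every other fragment uses c.
    return (a,) + (c,) * (n_fragments - 1)


def pairwise_looseness(sizes):
    """Looseness (product of group sizes) for every pair of fragments, 0-indexed."""
    return {(i, j): sizes[i] * sizes[j]
            for i, j in combinations(range(len(sizes)), 2)}


if __name__ == "__main__":
    sizes = even_group_sizes(12, 3)
    print(sizes, pairwise_looseness(sizes))
    print((1, 12, 12), pairwise_looseness((1, 12, 12)))
```

For k = 12 and three fragments the sketch yields the (3,4,4)-grouping from the text, whose pairwise products are 12, 12, and 16. Applying `pairwise_looseness` to the (1,12,12)-grouping shows the over-protection discussed above: the pair formed by the second and third fragments is 144-loose.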

With respect to the issue of grouping within a fragment, we first note that queries that involve a single fragment (i.e., all the attributes in the query belong to the same fragment) are not affected by fragmentation, as they can be answered exactly by querying the fragment. For instance, with respect to the fragments in Fig. 3.22, query q = “select avg(Salary) from MedicalData group by Job” involves attributes that belong to fragment $F_r$ only. Hence, the execution of the query over fragment $F_r$ returns exactly the same result as its execution over the original relation MedicalData in Fig. 3.17a. We therefore focus our discussion on queries that involve two or more fragments, on which group associations are to be defined, with the goal of determining how to group the tuples in fragments so that the induced group associations maximize query utility. In particular,


