Generalizing Data to Provide Anonymity when Disclosing Information


	Computer Science Laboratory

	Contact

	People at CSL

	Projects at CSL

	Research Programs

	Division Home

Generalizing Data to Provide Anonymity when Disclosing Information
by Pierangela Samarati and Latanya Sweeney.

March, 1998.

Abstract

The proliferation of information on the Internet and access to fast computers with large storage capacities has increased the volume of information collected and disseminated about individuals. The existence of these data sources makes it much easier to re-identify individuals whose private information is released in data believed to be anonymous. At the same time, increasing demands are made on organizations to release individualized data rather than aggregate statistical information. Even when explicit identifiers, such as name and phone number, are removed or encrypted when releasing individualized data, other characteristic data, which we term quasi-identifiers, can exist which allow the data recipient to re-identify individuals to whom the data refer.

In this paper, we provide a computational disclosure technique for releasing information from a private table such that the identity of any individual to whom the released data refer cannot be definitively recognized. Our approach protects against linking to other data. It is based on the concepts of generalization, by which stored values can be replaced with semantically consistent but less precise alternatives, and of k-anonymity. A table is said to provide k-anonymity when the contained data do not allow the recipient to associate the released information with a set of individuals smaller than k. We introduce the notions of generalized table and of minimal generalization of a table with respect to a k-anonymity requirement. As an optimization problem, the objective is to minimally distort the data while providing adequate protection. We describe an algorithm that, given a table, efficiently computes a preferred minimal generalization to provide anonymity.

BibT_EX Entry @techreport{SRI-CSL-98-04, AUTHOR = {Pierangela Samarati and Latanya Sweeney}, TITLE = {Generalizing Data to Provide Anonymity when Disclosing Information}, YEAR = {1998}, MONTH = {mar}, URL = {http://www.sdl.sri.com/papers/344/} }

About Us R&D Divisions Careers Newsroom Contact Us
© 2025 SRI International 333 Ravenswood Avenue, Menlo Park, CA 94025-3493
SRI International is an independent, nonprofit corporation. Privacy policy