Disclosure Risk Measurement with Entropy in Two-Dimensional Sample Based Frequency Tables

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We extend a disclosure risk measure defined for population-based frequency tables to sample-based frequency tables. The measure is built from information-theoretic quantities, such as entropy and conditional entropy, that reflect the properties of attribute disclosure. Estimating the disclosure risk of a sample-based frequency table requires taking the underlying population into account, so both the population and the sample frequencies are needed. Population frequencies may not be known, however, in which case they must be estimated from the sample. We consider two probabilistic models for estimating them: a log-linear model and a so-called Pólya urn model. Numerical results suggest that the Pólya urn model may be a feasible alternative to the log-linear model for estimating population frequencies and the disclosure risk measure.
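As a rough illustration of the two information-theoretic quantities named in the abstract, the following Python sketch computes the Shannon entropy of a vector of counts and the conditional entropy of the column variable given the row variable in a two-dimensional frequency table. It is not the paper's disclosure risk measure, which is not specified in this record; the function names and the example table are hypothetical and chosen only for illustration.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of the distribution implied by non-negative counts."""
    total = sum(counts)
    # Cells with a zero count contribute nothing to the sum and are skipped.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def conditional_entropy(table):
    """H(column | row) in bits for a two-dimensional frequency table.

    `table` is a list of rows, each a list of non-negative cell counts.
    """
    total = sum(sum(row) for row in table)
    h = 0.0
    for row in table:
        row_total = sum(row)
        if row_total > 0:
            # Weight each row's entropy by the share of records in that row.
            h += (row_total / total) * entropy(row)
    return h


# Hypothetical 2x2 table: the second row has all of its records in one column.
table = [[5, 5], [10, 0]]
row_entropies = [entropy(row) for row in table]
h_col_given_row = conditional_entropy(table)
```

In this made-up table the second row has zero entropy, meaning every record in that row shares the same column value. Rows of this kind are the ones usually associated with attribute disclosure, which is presumably why entropy-based expressions are a natural building block for such a measure.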

Bibliographical metadata

Original language: English
Title of host publication: Proceedings of UNECE worksession on statistical confidentiality
Subtitle of host publication: Helsinki, 5-7 October 2015
Pages: 1-10
Publication status: Published - Mar 2015
Event: UNECE worksession on Statistical Confidentiality - Tarragona
Event duration: 1 Jan 1824 → …

Conference

Conference: UNECE worksession on Statistical Confidentiality
City: Tarragona
Period: 1/01/24 → …