Comparison of Three Post-tabular Confidentiality Approaches for Survey Weighted Frequency Tables

Research output: Contribution to journal > Article > peer-review

Abstract

One of the most common forms of data release by National Statistical Institutes (NSIs) is the frequency table arising from censuses and surveys, and such tables have been the focus of statistical disclosure limitation (SDL) techniques for decades. With the need to modernize dissemination strategies, NSIs are considering web-based flexible table builders where users can generate their own tables of interest without the need for human intervention. This has led to a shift in the traditional disclosure risks of concern and a move towards inferential disclosure risk, where statistical data can be manipulated and combined with other data sources to reveal sensitive information with a high degree of certainty. To protect against inferential disclosure risk, perturbative methods with more formal privacy guarantees are necessary. We examine three post-tabular confidentiality protection methods based on additive random noise that can easily be applied ‘on-the-fly’ in a flexible table builder for generating survey weighted frequency tables: the computer science approach, which guarantees a formal privacy model called differential privacy, and two SDL approaches, post-randomization and a new technique called drop/add-up-to-q. We demonstrate and compare their application in a simulation study based on survey weighted counts in tables.

Bibliographical metadata

Original language: English
Pages (from-to): 145-168
Journal: Transactions on Data Privacy
Volume: 12
Issue number: 3
Publication status: Published - 2019