Journal of Official Statistics, Vol. 22, No. 3, 2006, pp. 507–524

A Comparison of Multiple Imputation and Data Perturbation for Masking Numerical Variables

Statistical disclosure limitation techniques are designed to provide legitimate users with access to useful data while simultaneously preventing disclosure of sensitive information. Two techniques that can be used to limit disclosure of sensitive numerical data are multiple imputation and data perturbation. While many studies have addressed the effectiveness of perturbation and multiple imputation individually, none has directly compared the two techniques. In this study, we compare the effectiveness of multiple imputation and data perturbation for masking numerical microdata. The results indicate that, in the absence of missing data, data perturbation performs better than multiple imputation. In addition, because data perturbation releases only a single perturbed data set, whereas multiple imputation releases several imputed data sets, data perturbation eases the burden on users of such data.

Confidentiality, privacy, data dissemination

Copyright Statistics Sweden 1996-2018.  Open Access
ISSN 0282-423X