Journal of Official Statistics, Vol. 14, No. 4, 1998, pp. 373–383
A Bayesian, Species-Sampling-Inspired Approach to the Uniques Problem in Microdata Disclosure Risk Assessment
Stephen M. Samuels
Abstract: One important measure of disclosure risk for microdata is the proportion of sample
uniques which are also population uniques. The distribution of this random variable
depends on the population only through its partition structure: the distribution of the
numbers of cells of each size. Partition distributions have been extensively studied in
population genetics. Portions of that research can be adapted to provide us with the
promise of a mathematical framework based on plausible prior distributions with
easy-to-interpret parameters, and a modified Polya urn sampling model from which risk assessment
is easily obtained.
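
The risk measure described above can be illustrated with a small simulation. The sketch below is only an assumption-laden illustration, not the paper's method: it generates a population partition with the standard one-parameter Hoppe (Polya-type) urn, which may differ from the modified urn the abstract refers to, then draws a simple random sample without replacement and computes the proportion of sample uniques that are also population uniques. The function names `polya_urn_population` and `unique_risk` and the parameter `theta` are hypothetical, chosen for this example.

```python
import random
from collections import Counter


def polya_urn_population(n, theta, rng):
    """Assign n individuals to cells with a Hoppe (Polya-type) urn.

    Assumption: the standard one-parameter urn, not necessarily the
    modified urn model of the paper. Requires theta > 0.
    """
    cells = []      # cells[i] is the cell label of individual i
    n_cells = 0     # number of distinct cells created so far
    for i in range(n):
        # Individual i opens a new cell with probability theta / (theta + i).
        if rng.random() < theta / (theta + i):
            cells.append(n_cells)
            n_cells += 1
        else:
            # Copying the cell of a uniformly chosen earlier individual
            # selects an existing cell with probability proportional to its size.
            cells.append(cells[rng.randrange(i)])
    return cells


def unique_risk(cells, sample_size, rng):
    """Proportion of sample uniques that are also population uniques.

    Returns None when the sample contains no unique cells.
    """
    population_counts = Counter(cells)
    # Simple random sample of individuals, without replacement.
    sample = rng.sample(cells, sample_size)
    sample_counts = Counter(sample)
    sample_uniques = [c for c, m in sample_counts.items() if m == 1]
    if not sample_uniques:
        return None
    also_population_unique = sum(
        1 for c in sample_uniques if population_counts[c] == 1
    )
    return also_population_unique / len(sample_uniques)


if __name__ == "__main__":
    rng = random.Random(0)
    population = polya_urn_population(10_000, theta=50.0, rng=rng)
    print(unique_risk(population, sample_size=500, rng=rng))
```

Because the measure depends on the population only through the cell sizes, the cell labels here are arbitrary integers; only the counts per label matter.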
Keywords: Partition structure; Polya urn; Poisson-Dirichlet.
Copyright © Statistics Sweden 1996-2018. Open Access. ISSN 0282-423X.