Journal of Official Statistics, Vol. 19, No. 4, 2003, pp. 383-402

A Fast and Simple Algorithm for Automatic Editing of Mixed Data

To automate the data editing process, the so-called error localisation problem, i.e., the problem of identifying the erroneous fields in an erroneous record, has to be solved. A new algorithm for solving the error localisation problem for mixed data, i.e., a combination of continuous and categorical data, has recently been developed. This algorithm constructs a binary tree and then searches this tree for optimal solutions to the error localisation problem. In the present article we provide a mathematical description of the algorithm and prove that it determines all optimal solutions to the error localisation problem. We also provide computational results for several realistic data sets involving only numerical data.
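The tree search summarised above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the data are purely numerical, every edit is a linear inequality of the form a·x <= b, all fields carry equal weight, and each tree node either fixes a field to its observed value or removes it by Fourier-Motzkin elimination. The names `fix`, `eliminate`, `contradictory` and `localise_errors` are hypothetical.

```python
from fractions import Fraction


def fix(edits, j, value):
    """Substitute the observed value for variable j in every edit a.x <= b."""
    out = []
    for a, b in edits:
        a2 = list(a)
        a2[j] = Fraction(0)
        out.append((a2, b - a[j] * value))
    return out


def eliminate(edits, j):
    """Fourier-Motzkin elimination of variable j from edits a.x <= b."""
    pos = [e for e in edits if e[0][j] > 0]
    neg = [e for e in edits if e[0][j] < 0]
    out = [e for e in edits if e[0][j] == 0]
    for ap, bp in pos:
        for an, bn in neg:
            sp, sn = 1 / ap[j], -1 / an[j]  # positive scaling factors
            coeffs = [sp * cp + sn * cn for cp, cn in zip(ap, an)]
            out.append((coeffs, sp * bp + sn * bn))
    return out


def contradictory(edits):
    """True if some edit has reduced to 0 <= b with b negative."""
    return any(all(c == 0 for c in a) and b < 0 for a, b in edits)


def localise_errors(edits, record):
    """Return all smallest sets of field indices that could be changed
    so that the record satisfies every edit (unit weights assumed)."""
    edits = [([Fraction(c) for c in a], Fraction(b)) for a, b in edits]
    record = [Fraction(v) for v in record]
    best = {"size": len(record) + 1, "sets": []}

    def search(j, current, changed):
        # Bound: drop contradictory nodes and nodes worse than the best so far.
        if contradictory(current) or len(changed) > best["size"]:
            return
        if j == len(record):
            if len(changed) < best["size"]:
                best["size"], best["sets"] = len(changed), []
            best["sets"].append(set(changed))
            return
        # Branch 1: trust field j and fix it to its observed value.
        search(j + 1, fix(current, j, record[j]), changed)
        # Branch 2: treat field j as erroneous and eliminate it.
        search(j + 1, eliminate(current, j), changed + [j])

    search(0, edits, [])
    return best["sets"]


# Hypothetical example with fields (turnover, costs, profit):
# turnover - costs = profit, turnover >= 0, costs >= 0.
EDITS = [
    ([1, -1, -1], 0),   # turnover - costs - profit <= 0
    ([-1, 1, 1], 0),    # -(turnover - costs - profit) <= 0
    ([-1, 0, 0], 0),    # turnover >= 0
    ([0, -1, 0], 0),    # costs >= 0
]
solutions = localise_errors(EDITS, [100, 60, 10])
```

In this example the record violates the balance edit, and changing any single field can restore consistency, so the sketch reports the three one-field sets. Exact rational arithmetic via `Fraction` is a design choice of the sketch that avoids rounding issues when edits are combined.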

Key words: Branch-and-bound; data editing; Fellegi-Holt method; Fellegi-Holt paradigm; Fourier-Motzkin elimination.

Copyright Statistics Sweden 1996-2018.  Open Access
ISSN 0282-423X