Journal of Official Statistics, Vol. 19, No. 4, 2003, pp. 383–402
A Fast and Simple Algorithm for Automatic Editing of Mixed Data
Ton de Waal and Ronan Quere
Abstract: In order to automate the data editing process, the so-called error localisation problem, i.e., the problem of identifying the erroneous fields in an erroneous record, has to be solved. A new algorithm for solving the error localisation problem for mixed data, i.e., a combination of continuous and categorical data, has recently been developed. This algorithm is based on constructing a binary tree, and subsequently searching this tree for optimal solutions to the error localisation problem. In the present article we provide a mathematical description of the algorithm, and prove that the algorithm determines all optimal solutions to the error localisation problem. We also provide computational results for several realistic data sets involving only numerical data.
Keywords: Branch-and-bound; data editing; Fellegi-Holt method; Fellegi-Holt paradigm; Fourier-Motzkin elimination.
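Illustration (not taken from the article): the sketch below shows, for purely numerical data, what an optimal solution to the error localisation problem looks like. It is not the binary-tree algorithm described in the abstract. It assumes linear edits of the form a·x <= b, enumerates every subset of fields exhaustively, and uses Fourier-Motzkin elimination only to check whether freeing a given subset allows all edits to be satisfied. The function names (`eliminate`, `can_repair`, `localise`), the reliability weights, and the example record are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def eliminate(rows, j):
    """One Fourier-Motzkin step: project variable j out of rows (a, b), each meaning a.x <= b."""
    pos = [r for r in rows if r[0][j] > 0]
    neg = [r for r in rows if r[0][j] < 0]
    out = [r for r in rows if r[0][j] == 0]
    for ap, bp in pos:
        for an, bn in neg:
            # Scale both rows so the coefficients of x_j are +1 and -1, then add them.
            sp, sn = 1 / ap[j], 1 / -an[j]
            a = [sp * x + sn * y for x, y in zip(ap, an)]
            out.append((a, sp * bp + sn * bn))
    return out


def can_repair(edits, record, free):
    """True if the edits can be satisfied by changing only the fields listed in `free`."""
    rows = []
    for a, b in edits:
        # Fields that keep their observed value move to the right-hand side.
        fixed = sum(a[i] * record[i] for i in range(len(a)) if i not in free)
        coeffs = [Fraction(a[i]) if i in free else Fraction(0) for i in range(len(a))]
        rows.append((coeffs, Fraction(b) - fixed))
    for j in free:
        rows = eliminate(rows, j)
    # Only constant rows "0 <= b" remain; the system is feasible iff all of them hold.
    return all(b >= 0 for _, b in rows)


def localise(edits, record, weights):
    """Return (minimum total weight, all minimum-weight sets of fields to change)."""
    n = len(record)
    best, solutions = None, []
    for k in range(n + 1):
        for free in combinations(range(n), k):
            w = sum(weights[i] for i in free)
            if best is not None and w > best:
                continue
            if can_repair(edits, record, free):
                if best is None or w < best:
                    best, solutions = w, [free]
                else:
                    solutions.append(free)
    return best, solutions


# Hypothetical record with fields (turnover T, costs C, profit P).
# Edits: T - C - P = 0 (written as two inequalities), T >= 0, C >= 0, 5C <= 4T.
edits = [
    ([1, -1, -1], 0),
    ([-1, 1, 1], 0),
    ([-1, 0, 0], 0),
    ([0, -1, 0], 0),
    ([-4, 5, 0], 0),
]
record = [100, 60, 10]
weights = [1, 1, 1]
print(localise(edits, record, weights))  # (1, [(2,)]): only the profit field needs to change
```

The enumeration grows exponentially with the number of fields, which is the kind of cost a branch-and-bound search over a tree is meant to reduce; the sketch is only intended to make the problem statement concrete.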
Copyright © Statistics Sweden 1996-2018. Open Access. ISSN 0282-423X.