Journal of Official Statistics, Vol. 15, No. 4, 1999, pp. 517–535

Statistical Methods for Developing Ratio Edit Tolerances for Economic Data

The U.S. Census Bureau developed general-purpose ratio edit software for use by the ten sectors of the 1997 Economic Census. This software requires explicit bounds (tolerances) for each ratio edit. We investigated statistical methods for setting these tolerance limits automatically, examining three methods: robust estimation (15% trimmed mean and standard deviation); resistant fences (an EDA method based on the first and third quartiles and the interquartile range); and gap analysis (Distance Measurement Algorithm for the Selection of Outliers, D_MASO). We also developed an approach for symmetrizing skewed distributions of ratios using power transformations prior to tolerance development. We evaluated these methods on two sets of historical data: the 1994 Annual Survey of Manufactures (ASM) and the 1992 Business Census. In both data sets, we achieved success with some variation of resistant fences, and we recommend that this methodology be used in the absence of subject-matter expertise or known mathematical bounds on a ratio relationship.
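
As a rough illustration of the resistant fences idea named above, the following minimal Python sketch sets a lower and an upper tolerance from the first and third quartiles and the interquartile range. It is not the Census Bureau software. The function names, the multiplier k = 1.5 (the conventional inner-fence value), and the quartile interpolation rule are all assumptions made for this example; the variations actually evaluated may use different multipliers or apply the fences after a power transformation.

```python
import statistics


def resistant_fences(ratios, k=1.5):
    """Return (lower, upper) tolerances: Q1 - k*IQR and Q3 + k*IQR.

    k = 1.5 is an assumed default (the conventional inner fence),
    not a value taken from the paper.
    """
    # The "inclusive" interpolation rule is one of several quartile
    # definitions; the choice here is an assumption of this sketch.
    q1, _, q3 = statistics.quantiles(ratios, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_ratios(ratios, k=1.5):
    """Return the ratios that fall outside the resistant fences."""
    lower, upper = resistant_fences(ratios, k)
    return [r for r in ratios if r < lower or r > upper]
```

For example, `flag_ratios([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])` would flag only the value 100, because the quartiles and the interquartile range are barely affected by a single extreme ratio.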

EDA (Exploratory Data Analysis); resistant; robust; ratio edit.

