Journal of Official Statistics, Vol. 3, No. 4, 1987, pp. 419–429
Correction for Misclassification Using Doubly Sampled Data
Anders Ekholm and Juni Palmgren
Abstract: In doubly sampled data the units of a subsample are classified jointly by two methods: (i) a fallible but inexpensive one, and (ii) a reliable but expensive one. The rest of the units are classified only by method (i). We propose an extension of the generalized linear model (Nelder and Wedderburn (1972)) for such data. We model the nonsampling errors, i.e., the probabilities of misclassification, explicitly. We then incorporate these into the model for the dependence of the response on the explanatory factors. Misclassification may occur both in the response and in the explanatory factors.
A car accident data set is analyzed in which 80 084 accidents were categorized only by the police, and 1 796 accidents were categorized both by the police and by personal interview of the accident victims. Our model is more explicit concerning the nonsampling errors than the models used for these data by Hochberg (1977) and by Espeland and Odoroff (1985).
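To make the double sampling scheme concrete, the following is a minimal sketch for the simplest case: a single binary classification with no explanatory factors. It is not the generalized linear model extension proposed in the paper. It only illustrates how a subsample classified by both methods can correct a proportion estimated from the fallible method alone. The function name and all counts are hypothetical, chosen for illustration; they are not the accident data.

```python
def corrected_proportion(validation, fallible_only):
    """Estimate the proportion of units whose true class is 1.

    validation[t][f]  : count of doubly classified units with reliable
                        class t and fallible class f (t, f in {0, 1}).
    fallible_only[f]  : count of units classified only by the fallible
                        method, with fallible class f.
    """
    # Units in the validation subsample, grouped by fallible class.
    n_val = [validation[0][f] + validation[1][f] for f in (0, 1)]
    # All units (both samples), grouped by fallible class.
    n_all = [n_val[f] + fallible_only[f] for f in (0, 1)]
    total = sum(n_all)

    estimate = 0.0
    for f in (0, 1):
        # P(true = 1 | fallible = f), estimated from the subsample only.
        p_true_given_f = validation[1][f] / n_val[f]
        # P(fallible = f), estimated from all units.
        p_f = n_all[f] / total
        estimate += p_true_given_f * p_f
    return estimate


# Hypothetical counts, for illustration only.
validation = [[80, 20],   # reliable class 0: fallible 0, fallible 1
              [10, 90]]   # reliable class 1: fallible 0, fallible 1
fallible_only = [600, 200]

naive = (validation[0][1] + validation[1][1] + fallible_only[1]) / 1000
corrected = corrected_proportion(validation, fallible_only)
```

With these made-up counts the uncorrected proportion from the fallible method is 0.31, while the corrected estimate is 109/330, about 0.33. The sketch assumes the doubly classified units are a random subsample of all units.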
Keywords: Error in explanatory factor; error in binary response; exponential family nonlinear model; generalized linear model; GLIM; misclassification model; structural model.
Copyright © Statistics Sweden 1996-2018. Open Access. ISSN 0282-423X.