Design and Estimation for Split Questionnaire Surveys
James O. Chipperfield, David G. Steel
When sampling from a finite population to estimate the means or totals of K population characteristics of interest, survey designs typically impose the constraint that information on all K characteristics (or data items) is collected from every unit in the sample. Relaxing this constraint means that information on only a subset of the K data items may be collected from any given unit in the sample. Such a design, called a split questionnaire design (SQD), has three advantages over the typical design: design objectives can be met more efficiently, because the number of sample units from which a particular data item is collected is allowed to vary; estimation is more efficient, because the correlation between the K data items can be exploited; and the maximum number of data items collected from a unit can be restricted to fewer than K. An SQD can be viewed as designing the pattern of missing data. This article considers several estimators for an SQD, including the Best Linear Unbiased Estimator (BLUE). The results show that significant gains can be achieved. The size of the SQD gains depends upon the function describing the survey costs, the design constraints, and the covariance matrix of the data items of interest. These methods are evaluated in a simulation study with four data items.
Missing data, multi-phase, optimal sample design, cost function
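
The second advantage named in the abstract, exploiting the correlation between data items, can be illustrated with a minimal sketch. It is not the BLUE or the simulation study from the article. It assumes a simplified setting with only two items, where item y1 is collected from all n sample units and item y2 from only m of them, and it uses the standard two-phase regression estimator of the mean of y2. The function names, the sample sizes, the correlation rho = 0.8, and the choice to draw from a bivariate normal model rather than a finite population are all assumptions made for illustration.

```python
import random
import statistics


def simulate_once(rng, n=200, m=80, rho=0.8):
    """One replicate of a hypothetical two-item split design."""
    # Draw n units with two correlated standard-normal items (y1, y2).
    y1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y2 = [rho * a + (1 - rho**2) ** 0.5 * rng.gauss(0.0, 1.0) for a in y1]

    # Split design (assumed): y1 is collected from all n units,
    # y2 only from the first m units.
    y1_sub, y2_sub = y1[:m], y2[:m]
    mean1_all = statistics.fmean(y1)
    mean1_sub = statistics.fmean(y1_sub)
    mean2_sub = statistics.fmean(y2_sub)

    # Slope of y2 on y1, estimated from the m units that have both items.
    sxy = sum((a - mean1_sub) * (b - mean2_sub) for a, b in zip(y1_sub, y2_sub))
    sxx = sum((a - mean1_sub) ** 2 for a in y1_sub)
    slope = sxy / sxx

    # Regression estimator: adjust the subsample mean of y2 using the
    # extra information about y1 available from all n units.
    regression_est = mean2_sub + slope * (mean1_all - mean1_sub)
    return mean2_sub, regression_est


def compare_variances(reps=2000, seed=1):
    """Monte Carlo variances of the simple mean and the regression estimator."""
    rng = random.Random(seed)
    results = [simulate_once(rng) for _ in range(reps)]
    var_simple = statistics.pvariance([r[0] for r in results])
    var_reg = statistics.pvariance([r[1] for r in results])
    return var_simple, var_reg


if __name__ == "__main__":
    var_simple, var_reg = compare_variances()
    print(f"variance of simple mean of y2:      {var_simple:.5f}")
    print(f"variance of regression estimator:   {var_reg:.5f}")
```

Under these assumptions the simple mean has variance of roughly 1/m, while the regression estimator has variance of roughly (1 - rho^2)/m + rho^2/n, so the gain grows with the correlation between the two items and with the number of units that supply y1 only.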