Improving Comparability of Existing Data by Response Conversion
Stef van Buuren, Sophie Eyres, Alan Tennant and Marijke Hopman-Rock
Incomparability of information is the key problem in international comparisons. The usual way to improve comparability is to harmonise data collection efforts. However, harmonisation is not an option if the data have already been collected, or if appropriate harmonisation cannot be achieved for some other reason. This normally leaves no option other than either to make strong, unwarranted assumptions about the data or to abandon comparative work altogether. This article proposes an approach that, under certain circumstances, might provide useful comparative analyses from existing incomparable data.
The method, termed Response Conversion (RC), addresses the problem of divergent formulations of survey questions. RC attempts to transform responses obtained on different items (questions) onto a common scale. Where this can be done, comparisons can be made using the common scale. The method consists of two steps. The first step is the construction of a conversion key by means of a statistical model, in our case the polytomous Rasch model. This can only be done if enough overlapping information is available between the different items. When such overlap exists, RC makes no assumptions about the distribution of the common scores across populations. A linkage map of studies by items provides an important tool to assess whether such overlapping information is available in the data at hand. The second step uses the conversion key to convert the information onto the common scale. This step is straightforward, and can be repeated routinely as new information arrives.
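As an illustration of how a linkage map might be inspected, the following minimal sketch treats studies and items as the nodes of a graph and checks whether every study can be reached from every other through shared items. The study names, item names, and function names are hypothetical and are not taken from the article. A connected map is only a necessary condition here: whether the overlap is "enough" for estimating the conversion key is a separate statistical question that this check does not answer.

```python
from collections import defaultdict


def linkage_components(linkage):
    """Return the connected components of a studies-by-items linkage map.

    `linkage` maps each study name to the set of items it administered.
    Studies and items are nodes; a study is joined to each of its items.
    """
    neighbours = defaultdict(set)
    for study, items in linkage.items():
        study_node = ("study", study)
        neighbours[study_node]  # register the study even if it has no items
        for item in items:
            item_node = ("item", item)
            neighbours[study_node].add(item_node)
            neighbours[item_node].add(study_node)

    seen = set()
    components = []
    for start in list(neighbours):
        if start in seen:
            continue
        # Depth-first search collecting everything reachable from `start`.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(neighbours[node] - seen)
        components.append(component)
    return components


def is_linked(linkage):
    """True if all studies and items form a single connected component."""
    return len(linkage_components(linkage)) == 1


# Hypothetical designs: in the first, item "walk2" bridges studies A and B
# and item "walk3" bridges B and C; in the second, nothing is shared.
linked_design = {
    "A": {"walk1", "walk2"},
    "B": {"walk2", "walk3"},
    "C": {"walk3"},
}
broken_design = {
    "A": {"walk1"},
    "B": {"walk2"},
}

print(is_linked(linked_design))
print(is_linked(broken_design))
```

In the second design the two studies share no item, so the map falls apart into two components and no common scale could be built from those data alone.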
The properties of the Rasch model are well known, but the model's application in this context introduces some methodological issues. These include the assessment of model fit in sparse data situations (including the assessment of unidimensionality and the absence of differential item functioning), the robustness of the results to the choice of the prior distribution, and the uncertainty introduced if only one item is measured. We believe that all of these issues can be adequately addressed, and that RC is one of the very few principled approaches for analysing incomparable data.
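The role of the prior and the uncertainty attached to a single item can be illustrated with a minimal sketch. It assumes a partial credit parametrisation of the polytomous Rasch model, a normal prior on the common scale, and numerical integration on a grid. None of these choices, nor the threshold values or function names, are taken from the article, so the code is an illustration of the idea and not the authors' conversion procedure.

```python
import math


def category_probs(theta, thresholds):
    """Category probabilities for one item under a partial credit model.

    P(X = k | theta) is proportional to exp(sum_{j<=k} (theta - delta_j)),
    with the empty sum for k = 0. `thresholds` holds delta_1..delta_m.
    """
    logits = [0.0]
    for delta in thresholds:
        logits.append(logits[-1] + (theta - delta))
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def convert_response(category, thresholds, prior_mean=0.0, prior_sd=1.0,
                     grid_size=401, span=6.0):
    """Posterior mean and SD on the common scale given one observed category.

    Uses a normal prior (an assumption of this sketch) evaluated on an
    evenly spaced grid covering prior_mean +/- span * prior_sd.
    """
    step = 2.0 * span * prior_sd / (grid_size - 1)
    grid = [prior_mean - span * prior_sd + i * step for i in range(grid_size)]
    weights = []
    for theta in grid:
        prior = math.exp(-0.5 * ((theta - prior_mean) / prior_sd) ** 2)
        weights.append(prior * category_probs(theta, thresholds)[category])
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    variance = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(variance)


# Hypothetical three-category item with thresholds at -1 and +1.
example_thresholds = [-1.0, 1.0]
for observed in range(3):
    print(observed, convert_response(observed, example_thresholds))
```

With only one item observed, the posterior standard deviation remains a substantial fraction of the prior standard deviation, and rerunning the conversion with a wider prior (for example `prior_sd=2.0`) shifts the estimate for the extreme categories. This is the kind of sensitivity that a robustness analysis would need to examine.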
The method was developed within the EC Health Monitoring Program, and is illustrated by estimating walking disability in different countries.