Support #759


Data analysis
I wonder why the SF-12 variables are available twice in Waves 2 to 5: once as regular variables and once as self-completion variables. For instance, b_sf1 and b_scsf1 represent the same question on general health. What is most confusing is that many participants have valid answers on both variables, and a simple cross-tabulation of these variables shows that the answers are not identical for many of them. That is, the same person may have reported that their health is excellent (according to b_sf1) and fair (according to b_scsf1). I would be happy to find out which of these (groups of) variables is the "real" record of participants' responses to the SF-12.


Hi Maria,
In Wave 1 the SF-12 was carried in the face-to-face (f2f) questionnaire, but it was then decided to move it to the self-completion questionnaire. However, some people do not take part in the self-completion, and we would then have no general health information for them, so the first item of the SF-12 was carried in both questionnaires. As to using the information: both accounts are ‘real’ and both may be affected by measurement error. If you want to use the entire SF-12, you can only use the f2f version in Wave 1 and the self-completion version in later waves. If you are only interested in the general health question and want a longitudinally consistent measure, you may prefer to use the f2f version only. For Wave 2 onwards, you could also use the self-completion version for comparison. Differences could arise, for instance, from instrument effects (in the self-completion, the interviewer does not know what the respondent answers).
You may also find useful a paper by our former colleague Alexandru Cernat on mode effects in SF-12 data collection. It was published in Survey Research Methods in 2015.
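
If it helps to quantify how often the two versions of the general health question disagree, here is a minimal sketch in plain Python of the kind of cross-tabulation described in the question. It is only an illustration under stated assumptions: the records are invented toy data, the answer coding (1 = excellent to 5 = poor) and the treatment of negative codes as missing are assumptions to check against the documentation, and the helper names `is_valid`, `crosstab` and `agreement_rate` are hypothetical.

```python
from collections import Counter

# Invented toy records for illustration only; not real survey data.
# Assumed coding: 1 = excellent ... 5 = poor; negative codes = missing.
records = [
    {"pidp": 1, "b_sf1": 1, "b_scsf1": 1},
    {"pidp": 2, "b_sf1": 1, "b_scsf1": 4},
    {"pidp": 3, "b_sf1": 3, "b_scsf1": 3},
    {"pidp": 4, "b_sf1": 2, "b_scsf1": -9},  # no self-completion answer
    {"pidp": 5, "b_sf1": 4, "b_scsf1": 3},
    {"pidp": 6, "b_sf1": -1, "b_scsf1": 2},  # no valid f2f answer
]


def is_valid(code):
    """Treat only positive codes as valid answers (assumption)."""
    return code is not None and code > 0


def crosstab(rows, f2f_var, sc_var):
    """Count (f2f, self-completion) answer pairs for rows valid on both."""
    return Counter(
        (row[f2f_var], row[sc_var])
        for row in rows
        if is_valid(row[f2f_var]) and is_valid(row[sc_var])
    )


def agreement_rate(table):
    """Share of respondents giving the same answer in both modes."""
    total = sum(table.values())
    if total == 0:
        return None
    same = sum(n for (f2f, sc), n in table.items() if f2f == sc)
    return same / total


table = crosstab(records, "b_sf1", "b_scsf1")
rate = agreement_rate(table)

for (f2f, sc), n in sorted(table.items()):
    print(f"f2f={f2f} self-completion={sc} n={n}")
print("agreement rate:", rate)
```

Restricting the table to respondents with valid answers in both modes keeps the comparison like for like. The off-diagonal cells are the cases where the two accounts differ, which is where instrument effects of the kind mentioned above would show up.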

