Coding of 'missing' in earlier and later versions of Wave 1 data of variable 'a_scghq2_dv'
I first accessed the Wave 1 dataset around 2012 at which time there was a ‘missing’ category for the derived variable ‘a_scghq2_dv’, but no 'inapplicable'.
I am now returning to Wave 1 data for a new analysis, and have loaded the most recent version of the Wave 1 data. However, I have noticed that variable 'a_scghq2_dv' now has both a ‘missing’ and an ‘inapplicable’ category, and that the N for ‘missing’ is much lower than it was previously.
Can I assume that in the original dataset the ‘missing’ and ‘inapplicable’ were combined into an overall ‘missing’ category, but that in the most recent version of the dataset the ‘missing’ and the ‘inapplicable’ categories have been differentiated?
#1 Updated by Gundi Knies almost 5 years ago
- Category set to Derived variables
- Assignee set to Gundi Knies
- Target version set to X M
- % Done changed from 0 to 90
We try to assign the most appropriate missing values according to our definitions, i.e. -9 "missing or wild", -8 "inapplicable", -2 "refused", -1 "don't know". With derived variables this is always a bit more problematic because, technically, the respondent was not asked a question that they could refuse or answer "don't know". The default is -9 until we have got around to working out what the more appropriate missing value is.
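As a rough illustration of the coding scheme above, the sketch below tabulates a variable by missing-value category. It covers only the four codes named in this reply; the helper name `tabulate_missing` and the sample values are made up for the example and are not taken from the data.

```python
from collections import Counter

# Missing-value codes as listed in the reply above (only these four are assumed).
MISSING_LABELS = {
    -9: "missing or wild",
    -8: "inapplicable",
    -2: "refused",
    -1: "don't know",
}

def tabulate_missing(values):
    """Count values by missing-value label; anything else is counted as 'valid'."""
    counts = Counter()
    for value in values:
        counts[MISSING_LABELS.get(value, "valid")] += 1
    return counts

# Hypothetical values standing in for a_scghq2_dv.
sample = [-9, -8, -8, 3, 0, -1]
counts = tabulate_missing(sample)
for label, n in sorted(counts.items()):
    print(label, n)
```

Keeping -9 and -8 as separate categories, rather than collapsing all negative values into one "missing" group, is what makes the change between releases visible.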
In this specific case it is fair to assume that the new -8 values are recodes of previous -9 values: these respondents did not provide a self-completion, so none of the questions that are part of the self-completion were applicable to them. To be absolutely sure you can, of course, match the variable across the two release versions of the data and compare on a case-by-case basis.
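The case-by-case comparison could be sketched along the following lines. This is only a minimal illustration: it assumes each release has been read into a list of records, that both share a person identifier (called `pidp` here, which is an assumption), and the toy records are invented rather than real data.

```python
from collections import Counter

VAR = "a_scghq2_dv"

def crosstab_releases(old_rows, new_rows, id_key="pidp", var=VAR):
    """Count (old_value, new_value) pairs for cases present in both releases."""
    new_by_id = {row[id_key]: row[var] for row in new_rows}
    pairs = Counter()
    for row in old_rows:
        case_id = row[id_key]
        if case_id in new_by_id:
            pairs[(row[var], new_by_id[case_id])] += 1
    return pairs

# Invented records standing in for the earlier and the current release.
old_release = [
    {"pidp": 1, VAR: -9},
    {"pidp": 2, VAR: -9},
    {"pidp": 3, VAR: 4},
    {"pidp": 4, VAR: -9},
]
new_release = [
    {"pidp": 1, VAR: -8},
    {"pidp": 2, VAR: -9},
    {"pidp": 3, VAR: 4},
    {"pidp": 4, VAR: -8},
]

table = crosstab_releases(old_release, new_release)
for (old_value, new_value), n in sorted(table.items()):
    print(f"old={old_value} new={new_value} n={n}")

# Any new -8 that was not -9 in the earlier release would contradict the assumption.
unexpected = {pair: n for pair, n in table.items()
              if pair[1] == -8 and pair[0] != -9}
```

If `unexpected` is empty and the (-9, -8) cell accounts for the drop in 'missing', the recode explanation holds for the matched cases.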
Hope this helps,