Distributions of NS-SEC variables in earlier and later versions of Wave 1 data
I first accessed the Wave 1 dataset around 2012, at which time there was a 'refusal' category for the derived variable 'a_jbnssec5_dv'.
I am now returning to Wave 1 data for a new analysis, and have loaded the most recent version of the Wave 1 data. However, I have noticed that there is now no 'refusal' category for this variable, and also that the distributions of the categories of the NS-SEC variables in the more recent dataset appear slightly different when compared with earlier versions of the dataset.
Any clarification would be welcome.
#1 Updated by Gundi Knies over 4 years ago
- Category set to Derived variables
- Status changed from New to In Progress
- Assignee set to Gundi Knies
- Target version set to X M
- % Done changed from 0 to 90
We try to assign the most appropriate missing values according to our definitions, i.e. -9 "missing or wild", -8 "inapplicable", -2 "refused", -1 "don't know". With derived variables this is always a bit more problematic, as technically the respondent was not asked a question that they could refuse or answer "don't know" to. The default is -9 until we have got around to working out what the more appropriate missing values are.
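To see how the missing-value codes are distributed in a given release, it can help to tabulate them separately from the valid categories. The following is only a minimal sketch in plain Python: the code-to-label mapping is taken from the definitions above, but the function name `tabulate` and the sample values are hypothetical and are not drawn from the actual data.

```python
from collections import Counter

# Missing-value codes as listed in the reply above.
MISSING_LABELS = {
    -9: "missing or wild",
    -8: "inapplicable",
    -2: "refused",
    -1: "don't know",
}


def tabulate(values):
    """Return (valid, missing) frequency tables for a list of coded values."""
    valid = Counter(v for v in values if v not in MISSING_LABELS)
    missing = Counter(MISSING_LABELS[v] for v in values if v in MISSING_LABELS)
    return valid, missing


# Hypothetical values standing in for a_jbnssec5_dv (illustration only).
sample = [1, 2, 2, 5, -9, -8, -8, 3, -9]
valid, missing = tabulate(sample)

print("valid categories:", dict(valid))
print("missing codes:", dict(missing))
```

Running the same tabulation on an older and a newer release would show whether cases previously coded -2 now appear under -9 or another code.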
With respect to changes in the distribution of NS-SEC, we have made a number of changes over the years to the way in which we derive socio-economic classifications, as there were a number of coding issues. In this specific example, I think what might have caused the small changes in the distribution at W1 is that we started to use a new version of jbes2000 in the construction of jbnssec_dv. jbes2000 used to be computed directly from the interview in W1, but from W2 onward this was only done if respondents had job changes. In order to provide this for all workers, we started using code provided by CAMSIS and applied it to all waves in the same way. We have also identified a number of SOC 2000 codes in our data that are not listed in the official code frame.
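One way to quantify how much the distribution has shifted between two releases is to compare the category proportions among valid cases. This is a rough sketch under stated assumptions: the function names `proportions` and `compare` are hypothetical, the two lists are made-up values rather than real data, and all four missing codes (-9, -8, -2, -1) are simply excluded before the proportions are computed.

```python
from collections import Counter

MISSING_CODES = (-9, -8, -2, -1)


def proportions(values):
    """Share of each valid category, ignoring missing-value codes."""
    valid = [v for v in values if v not in MISSING_CODES]
    if not valid:
        return {}
    counts = Counter(valid)
    return {cat: counts[cat] / len(valid) for cat in sorted(counts)}


def compare(old, new):
    """Difference in proportion (new minus old) for every category seen."""
    p_old, p_new = proportions(old), proportions(new)
    categories = sorted(set(p_old) | set(p_new))
    return {cat: p_new.get(cat, 0.0) - p_old.get(cat, 0.0) for cat in categories}


# Hypothetical values for the same variable in two releases (illustration only).
old_release = [1, 1, 2, 3, 3, -2]
new_release = [1, 2, 2, 3, 3, -9]

diff = compare(old_release, new_release)
for cat, change in diff.items():
    print(f"category {cat}: {change:+.2f}")
```

Small differences of this kind would be consistent with a change in how an input variable is derived, whereas large shifts would suggest that something else differs between the two files.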
Please consult the variable notes in the online documentation to see which variables have been used in the construction of derived variables. The User Guide also includes more information on derived variables, and on any changes that have occurred since the last release.
More generally, we always recommend that users work with the latest release, as we continue to fix issues that users have reported (and occasionally we add content that needed more data cleaning or coding).
Hope this helps,