Inapplicable and missing data
I am working with an appended dataset of the 7 waves, and I noticed something odd.
Some 'basic' variables, such as sex and highest qualification (qfhigh), have a large number of missing or 'not applicable' values. Also, for the same pidp, the variable sex sometimes has a value in one year and is missing in another.
Is there a procedure for recovering the value of these variables when it is missing? And why does this happen?
Thanks, best, g
#1 Updated by Stephanie Auty over 1 year ago
- Category set to Data documentation
- Status changed from New to In Progress
- Assignee set to Stephanie Auty
- Target version set to X M
- % Done changed from 0 to 10
Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.
Stephanie Auty - Understanding Society User Support Officer
#3 Updated by Stephanie Auty over 1 year ago
- Status changed from In Progress to Feedback
- Assignee changed from Stephanie Auty to Giorgio Piccitto
- % Done changed from 10 to 70
Which data files are you working with? I have checked w_sex in w_indresp and this does not seem to have many missing values. It may be inconsistent over time, because this variable simply reflects the data as collected and is not edited. We produce a derived variable, w_sex_dv, which is consistent over time and which we recommend using instead.
w_qfhigh has a high number of missing values because it is only asked of new entrants. Please use the variable w_qfhigh_dv which incorporates answers to w_qfhigh from previous waves and also new qualifications asked about since the respondent's first interview.
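To see how much the raw variables are affected in an appended file, a check along the following lines may help. This is only a minimal sketch in plain Python under several assumptions: the records are in long format with a `pidp` key, a numeric `wave` column added by the user, and the wave prefix already stripped from variable names; negative codes are treated as missing or inapplicable (please confirm against the codebook). The helper names `is_missing`, `inconsistent_pidps` and `carry_forward` are hypothetical. The carry-forward step only illustrates why a variable asked once of new entrants looks missing in later waves; it is not how w_sex_dv or w_qfhigh_dv are constructed, and the derived variables remain the recommended choice.

```python
# Hypothetical long-format records: one row per person (pidp) per wave.
# Assumption: negative codes mark missing or inapplicable values.


def is_missing(value):
    """Treat None and negative codes as missing (assumption, see codebook)."""
    return value is None or value < 0


def inconsistent_pidps(rows, var):
    """Return the sorted pidps whose valid values of `var` differ across waves."""
    seen = {}
    for row in rows:
        value = row.get(var)
        if not is_missing(value):
            seen.setdefault(row["pidp"], set()).add(value)
    return sorted(pidp for pidp, values in seen.items() if len(values) > 1)


def carry_forward(rows, var):
    """Return copies of rows, ordered by pidp and wave, where a missing `var`
    is filled with the same person's most recent valid value from an earlier wave."""
    last_valid = {}
    filled = []
    for row in sorted(rows, key=lambda r: (r["pidp"], r["wave"])):
        new_row = dict(row)
        value = row.get(var)
        if is_missing(value):
            if row["pidp"] in last_valid:
                new_row[var] = last_valid[row["pidp"]]
        else:
            last_valid[row["pidp"]] = value
        filled.append(new_row)
    return filled


# Made-up example data, not taken from the study.
rows = [
    {"pidp": 1, "wave": 1, "sex": 2, "qfhigh": 3},
    {"pidp": 1, "wave": 2, "sex": -9, "qfhigh": -8},
    {"pidp": 2, "wave": 1, "sex": 1, "qfhigh": 5},
    {"pidp": 2, "wave": 2, "sex": 2, "qfhigh": -8},
]

flagged = inconsistent_pidps(rows, "sex")
filled_qf = [r["qfhigh"] for r in carry_forward(rows, "qfhigh")]
```

In this made-up data, person 2 is flagged because two different valid values of sex were recorded, while person 1 only has a gap. A simple carry-forward like this would also miss qualifications gained after the first interview, which is one reason to prefer w_qfhigh_dv.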