YPDKLM: irregular variable distribution
Dear US team,
I have recently started working with the alcohol consumption variables you provide in the youth self-completion questionnaire. Graphing the distribution of ypdklm, I found that it is irregular across years. The questionnaires show that the question has 5 response options, but the data show 6 response options in 2010, 2012 and 2014 and 5 in 2009, 2011, 2013 and 2015. I have attached the histogram to this email.
Could you please provide me with some clarification?
#1 Updated by Stephanie Auty about 1 year ago
- Category set to Data inconsistency
- Status changed from New to In Progress
- Assignee set to Stephanie Auty
- Target version set to X M
- % Done changed from 0 to 10
- Private changed from Yes to No
Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.
Stephanie Auty - Understanding Society User Support Officer
#2 Updated by Stephanie Auty about 1 year ago
- Status changed from In Progress to Feedback
- Assignee changed from Stephanie Auty to Amy Orben
- % Done changed from 10 to 80
If you look at the value labels you will see that the sixth response option is labeled "never but q45 answered"; see, for example, https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/2/datafile/b_youth/variable/b_ypdklm
Then, looking at the questionnaire you can see that q45, yp5alcdr, is "Thinking back over the last four weeks, how many times (if any) have you had five or more drinks on one occasion? (A ‘drink’ is one pint/bottle/can of beer or cider, 2 alcopops, one small glass of wine, a single measure of spirits)."
This follow-up question is not asked in odd-numbered waves, which is why the sixth response option does not appear in data from those waves.
Among respondents who answered "never" at ypdklm, around 97-98% answered "none" to the follow-up question, so I think these cases can reasonably be combined with response 5 at ypdklm. You will need to decide how to treat the other responses depending on your research question.
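A minimal sketch of that recode in plain Python, for illustration only. It assumes the numeric codes are 5 for "never" and 6 for "never but q45 answered"; these values and the function name are assumptions, so check them against the value labels in your own extract before using anything like this.

```python
# Assumed value codes for ypdklm (verify against the value labels in your data).
NEVER = 5                    # assumed code for "never"
NEVER_BUT_Q45_ANSWERED = 6   # assumed code for "never but q45 answered"


def harmonise_ypdklm(value):
    """Fold the extra even-wave category into "never"; leave other codes unchanged."""
    return NEVER if value == NEVER_BUT_Q45_ANSWERED else value


# Hypothetical responses from an even-numbered wave.
responses = [1, 3, 5, 6, 6, 2]
harmonised = [harmonise_ypdklm(v) for v in responses]
print(harmonised)
```

This treats every code 6 case as "never", including the small share whose follow-up answer was not "none"; whether that is acceptable depends on the research question.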
#4 Updated by Amy Orben about 1 year ago
Actually, I have another question. See attached: the distribution shows that ypdklm has many more missing values in 2011, 2013 and 2015. I have a feeling it has something to do with ypregalco, but I am not sure why this only occurs in certain years, as ypregalco is asked every year.
#5 Updated by Stephanie Auty about 1 year ago
If you read the questionnaires you will see that these missing answers are due to routing at ypeveralc. Waves 3, 5 and 7 included an additional question, ypregalco, and routed respondents past both ypregalco and ypdklm if the answer to ypeveralc was no. Waves 1, 2, 4, 6 and 8 did not include that routing, so respondents answered ypdklm regardless of their answer to ypeveralc. This has been corrected for future waves. As in my previous answer, respondents who said they have never had an alcoholic drink and then answered "never" at ypdklm could be grouped with those who said they have never had an alcoholic drink and skipped ypdklm because of the routing.
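A rough sketch of that grouping in plain Python, under stated assumptions: ypeveralc code 2 means "no", ypdklm code 5 means "never", and a routed-past ypdklm is stored as -8. All three codes, the function name and the example records are hypothetical and should be checked against the data documentation.

```python
EVERALC_NO = 2      # assumed code for "no" at ypeveralc
DKLM_NEVER = 5      # assumed code for "never" at ypdklm
INAPPLICABLE = -8   # assumed code for a question skipped by routing


def is_never_drinker(ypeveralc, ypdklm):
    """True if the respondent has never had a drink and either was routed
    past ypdklm or answered "never" there."""
    if ypeveralc != EVERALC_NO:
        return False
    return ypdklm in (INAPPLICABLE, DKLM_NEVER)


# Hypothetical records illustrating the two routing regimes.
records = [
    {"wave": 3, "ypeveralc": 2, "ypdklm": -8},  # routed past ypdklm
    {"wave": 4, "ypeveralc": 2, "ypdklm": 5},   # no routing, answered "never"
    {"wave": 4, "ypeveralc": 1, "ypdklm": 3},   # has had a drink
    {"wave": 4, "ypeveralc": 2, "ypdklm": 2},   # inconsistent answers, left ungrouped
]
flags = [is_never_drinker(r["ypeveralc"], r["ypdklm"]) for r in records]
print(flags)
```

Respondents who say they have never had a drink but give a ypdklm answer other than "never" are deliberately left out of the group here; how to handle them is a separate analytic decision.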