Derived interview and demographic characteristics
I am unable to understand the derivations of several variables:
- w_intdaty_dv, w_intdatm_dv, w_intdatd_dv
For these, there is no derivation note and I do not understand how exactly this information differs from w_istrtdaty, w_istrtdatm, w_istrtdatd.
- w_age_dv, w_doby_dv
For these, the derivation note says "Derived from date of birth held in the sample administration data base and the derived date of interview. Recoded to missing for sample members whose interview outcome is inconsistent with the suggested age +/- one year."
- I do not understand what the sample administration database is, and therefore how the date of birth from this data set relates to the date of birth reported by the respondent. I checked and verified that the discrepancy between the derived and reported year of birth exists for both proxy and main interview individuals, so the discrepancy cannot come only from proxy interviews.
- I have also checked that in the dataset the discrepancy between reported and derived year of birth can be more than +/- 1 (actually it can go up to 24 years). So, I do not understand how exactly the rule "age +/- one year" was applied.
- w_ethn_dv, w_racel_dv
These variables have identical derived variable notes, which say that the derivation takes into account previous waves and reports from other hh members, and yet the values they can take are not identical. I don't really understand which of the two takes into account a larger set of the information provided in the dataset.
Thank you for your attention!
#2 Updated by Alita Nandi 11 months ago
- % Done changed from 10 to 50
- Assignee changed from Stephanie Auty to Nurfatima Jandarova
w_racel_dv: The ethnic group question, w_racel (as well as the versions asked of telephone respondents w_racelt etc) is asked in adult (16+ year old) interviews but only the first time a person is interviewed. So, this information can be found in different variables for different respondents. All this information is combined to create w_racel_dv.
w_ethn_dv: combines the ethnic group reported by adult respondents and the ethnic group reported by 10-15 year olds in the youth questionnaire. If this information is missing (for non-respondents and children below age 10), then information collected in the household interviews or the ethnic group reported by parents is used to impute it.
w_ethn_dv is thus available for some non-respondents and children under 10 years, while w_racel_dv is missing for anyone who did not participate in an adult interview.
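The difference between the two variables can be sketched roughly as below. This is only an illustration in Python, not the actual derivation code: the function and argument names are hypothetical, missing values are represented by None rather than the survey's own missing codes, and the relative priority of household-interview and parent-reported information is an assumption, since it is not stated above.

```python
from typing import Optional


def derive_racel(adult_report: Optional[int]) -> Optional[int]:
    """Sketch of w_racel_dv: only information from adult interviews is used."""
    return adult_report


def derive_ethn(adult_report: Optional[int],
                youth_report: Optional[int],
                household_report: Optional[int],
                parent_report: Optional[int]) -> Optional[int]:
    """Sketch of w_ethn_dv: return the first non-missing source.

    The order of the last two sources is assumed for illustration.
    """
    for value in (adult_report, youth_report, household_report, parent_report):
        if value is not None:
            return value
    return None
```

For a child under 10 with only a parent-reported code, derive_ethn returns that code while derive_racel returns None, which matches the pattern described above.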
w_istrtdaty, w_istrtdatm, w_istrtdatd are the interview dates for adult (16+ year old) interviews, so they are missing for children and non-respondents. For those cases the date is imputed from the household interview dates, and the imputed versions of these variables are w_intdaty_dv, w_intdatm_dv, w_intdatd_dv. For further details see: https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation/wave/1/datafile/a_child/variable/a_intdaty_dv
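As a rough illustration of this imputation, the sketch below falls back to the household interview date when the individual interview date is missing. It is written in Python under stated assumptions: the function and argument names are hypothetical, missing dates are represented by None, and the real derivation may involve further rules not described here.

```python
from datetime import date
from typing import Optional, Tuple


def derive_intdat(individual_date: Optional[date],
                  household_date: Optional[date]
                  ) -> Tuple[Optional[int], Optional[int], Optional[int]]:
    """Return (year, month, day) in the spirit of w_intdaty_dv, w_intdatm_dv, w_intdatd_dv.

    Uses the individual interview date when present, otherwise the
    household interview date; returns Nones if both are missing.
    """
    chosen = individual_date if individual_date is not None else household_date
    if chosen is None:
        return (None, None, None)
    return (chosen.year, chosen.month, chosen.day)
```

For an adult respondent the result simply repeats the individual interview date, so the derived and non-derived variables should differ only for children and non-respondents.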
#4 Updated by Stephanie Auty 10 months ago
- % Done changed from 50 to 80
- Status changed from In Progress to Feedback
Dear Nurfatima Jandarova,
Apologies for the delay in responding to the remaining part of this question.
The sample administration database is our own database for administering the survey so you will not have access to it. The interview outcome is recorded in f_ivfio. For example, if the suggested age, calculated from the date of birth and date of interview, was over 17 but the individual had completed a youth interview then this would be inconsistent and so the derived age would be recoded to missing.
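A minimal sketch of this kind of consistency check is given below, in Python. It is not the actual derivation code. The outcome labels "adult" and "youth" are hypothetical stand-ins for the codes in f_ivfio, the age bands (10-15 for youth, 16+ for adult) are taken from the earlier replies in this thread, and the exact way the one-year tolerance is applied at the boundaries is an assumption.

```python
from datetime import date
from typing import Optional


def age_at(dob: date, interview: date) -> int:
    """Completed years of age on the interview date."""
    years = interview.year - dob.year
    # Subtract one if the birthday has not yet occurred in the interview year.
    if (interview.month, interview.day) < (dob.month, dob.day):
        years -= 1
    return years


def derive_age(dob: date, interview: date, outcome: str) -> Optional[int]:
    """Return the suggested age, or None if it is inconsistent with the outcome.

    'outcome' is a hypothetical label: "youth" for a completed youth
    interview, "adult" for a completed adult interview. A tolerance of
    one year is allowed around each age band (assumed handling).
    """
    age = age_at(dob, interview)
    if outcome == "youth" and not (10 - 1 <= age <= 15 + 1):
        return None
    if outcome == "adult" and age < 16 - 1:
        return None
    return age
```

Under this sketch, someone whose administrative date of birth implies an age in the twenties but who completed a youth interview would get a missing derived age, as in the example above.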