We are using individual-level data from BHPS waves G, L and Q, as well as wave 4 of the Understanding Society survey, so that the selected waves are spaced at 5-year intervals between 1997/98 and 2012/13. We are running an autoregressive model with lagged dependent variables to understand the longitudinal relationship between two variables from the main individual-level survey dataset.
Three questions on longitudinal weighting:
1) Just to check - what are the correct weights to use, given that we do not use all the waves, and only use the data for those individuals who are part of all 4 waves that we selected (2,155 complete cases)? "wlrght" (BHPS) and "w_indin91_lw" (Understanding Society) look most suitable to me.
2) How do we avoid the high number of zero weights? I understand that these always occur if individuals did not respond continuously throughout all the waves (which would not matter in our case, because the variables of interest were only asked every 5 years in the BHPS anyway)?
3) If I understand the documentation correctly, only the last of the longitudinal weights from wave 4 in the Understanding Society survey will have to be used in a model, even though the model incorporates data from waves G, L and Q (BHPS) as well?
Many thanks in advance for your help.
#1 Updated by Alita Nandi over 4 years ago
2. You are correct that for your analysis you do not need respondents to have responded continuously from waves 1 to 18 as part of the BHPS sample and waves 1-4 as part of the Understanding Society sample, but the weights provided are designed for the specific sample that responded continuously over this entire period. So, you can use the longitudinal weights from wave 4, but you will end up losing a large part of your sample: anyone who missed at least one interview will have a zero weight.
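To see how much of the sample would be lost before committing to the wave 4 longitudinal weight, it may help to count how many of the complete cases keep a positive weight. The following is only a minimal sketch in Python with pandas, not an official procedure: the column name `lw_weight`, the function `summarise_weight_loss` and the toy data are hypothetical stand-ins for your own merged file and the longitudinal weight variable you choose.

```python
import pandas as pd


def summarise_weight_loss(df, weight_col):
    """Count cases with a positive weight versus a zero or missing weight."""
    # Treat missing weights like zero weights: both drop out of a weighted model.
    weights = df[weight_col].fillna(0)
    n_total = len(df)
    n_positive = int((weights > 0).sum())
    return {
        "n_total": n_total,
        "n_positive": n_positive,
        "n_zero": n_total - n_positive,
        "share_retained": n_positive / n_total if n_total else float("nan"),
    }


# Toy data standing in for the complete cases across the four selected waves.
# "lw_weight" is a hypothetical name for the wave 4 longitudinal weight column.
complete_cases = pd.DataFrame(
    {
        "pidp": [1, 2, 3, 4, 5],
        "lw_weight": [1.2, 0.0, 0.8, 0.0, 1.5],
    }
)

summary = summarise_weight_loss(complete_cases, "lw_weight")
print(summary)
```

Running the same function on the real 2,155 complete cases would show how many of them remain once zero-weight respondents are excluded.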