Weights for longitudinal analysis
I am currently working with the long file built from BHPS and USOC data (waves 1-25), and I am struggling to understand which weight is best for a longitudinal analysis that combines the two datasets (bhps+usoc).
I have tried using the weights lrwght (bhps) and indin91_lw (usoc), as described in the manual. However, I lose a very large number of observations by doing that. Is that correct? What is it due to? Is it only because of sample attrition over time, or are the weights correcting for something else I am not aware of?
Will the problem be the same if I use the 2001 version of the weight (lrwtuk1 for bhps, indin01_lw for usoc)? If I analyse data from 2001 to 2016, should I use the 2001 version? Since the longitudinal weight is missing in the first wave (2001), is it correct to treat it as 1 in the first year?
#4 Updated by Olena Kaminska about 2 years ago
Thank you for your question. Indeed, the choice of weights depends on the type of analysis you are doing. If you are doing pure longitudinal analysis, you don't really need to put the data in long format, and you should use the '91' weight from the last wave of your analysis (if you are going back to 1991).
But because you are putting data in the long format I am wondering whether you are doing pooled analysis. Are you using each observation per wave or are you looking at differences between two adjacent waves? Depending on this the weights will be different.
#6 Updated by Rossella Icardi about 2 years ago
Thanks for the response. Let me clarify: I am doing a longitudinal analysis using waves 1991-2015. I have tried including the '91' weight, but I lose a large number of observations. Is that correct? Is it only due to attrition, or does the weight correct for anything else?
Alternatively, which weighting approach would you recommend to run a longitudinal analysis for the years 1991-2015?
#7 Updated by Olena Kaminska about 2 years ago
Thanks for the clarification. If you are doing a longitudinal analysis you need just one weight - the weight that comes from the last wave in your analysis. If you start in 1991 then the weight is '91'. You drop lots of people partly because of attrition, but mainly because there have been a number of boosts to the sample since 1991 (the biggest being Understanding Society in 2009). People who joined through those boosts were not in the 1991 sample, so they cannot receive a '91' longitudinal weight. You will have a higher number of people if you start in 2001 (use '01' weight) and in 2010 (if you use ub weight) etc.
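As a rough illustration of the "one weight from the last wave" rule, here is a minimal Python sketch. It assumes a long-format file held as a list of records, and the field names `pidp`, `wave`, `lw_weight` and `analysis_weight`, as well as the function name, are placeholders for this example rather than actual variable names in the data. The idea is to look up each person's longitudinal weight in the last wave of the analysis, keep only people whose weight there is positive, and apply that single weight to all of their rows, so no separate value has to be assumed for the first wave.

```python
def attach_last_wave_weight(rows, last_wave, weight_key="lw_weight"):
    """Copy each person's last-wave longitudinal weight onto all their rows.

    People with a missing or zero weight in the last wave (for example
    because of attrition, or because they joined through a later boost
    sample) are dropped from the result.
    """
    # Weight per person, taken only from the last wave of the analysis.
    final_weight = {
        r["pidp"]: r[weight_key]
        for r in rows
        if r["wave"] == last_wave
        and r.get(weight_key) is not None
        and r[weight_key] > 0
    }
    # Keep every wave's row for people who have a usable last-wave weight.
    return [
        {**r, "analysis_weight": final_weight[r["pidp"]]}
        for r in rows
        if r["pidp"] in final_weight
    ]


# Toy data (hypothetical values): two waves, three people.
rows = [
    {"pidp": 1, "wave": 1, "lw_weight": None},
    {"pidp": 1, "wave": 2, "lw_weight": 1.2},
    {"pidp": 2, "wave": 1, "lw_weight": None},  # not interviewed in wave 2
    {"pidp": 3, "wave": 2, "lw_weight": 0.0},   # joined later: no usable weight
]
weighted = attach_last_wave_weight(rows, last_wave=2)
```

In this toy example only person 1 is kept, and both of that person's rows carry the wave 2 weight. The drop in sample size seen with the '91' weight corresponds to the filtering step here.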
Hope this helps,