I'm trying to use this longitudinal weight variable to analyse self-completion questionnaire data across waves 2-5. However, about 21,000 cases have been assigned a weight of 0, and software such as Mplus treats a weight of 0 as missing.
Any recommendations for how to get around this? I was thinking that it might be possible to make the 0 values non-zero (e.g. 0.0000000000000001) and was wondering what your thoughts would be on this? Any other suggestions would be welcomed.
#1 Updated by Olena Kaminska over 4 years ago
For most of the 0-weight cases in the lw sc weight, there is at least one missing response to the self-completion questionnaire between waves 1 and 5. The weights simply reflect this. Most of these people should not be in your analysis to start with (if you are using self-completion between waves 2 and 5).
Changing the weight from 0 to any other value may not affect your analysis if you are lucky and such cases drop out automatically because they have missing data. If they stay in your analysis, though, they are likely to distort it: point estimates may not change much, but standard errors will definitely be hugely inflated. This approach is wrong both statistically and theoretically, so avoid it.
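As a rough illustration of the point-estimate side of this, here is a minimal Python sketch on made-up data. The scores, weights and the helper weighted_mean are all hypothetical and are not taken from the study files. It compares a weighted mean with 0 weights, with the 0 weights replaced by a tiny value, and with the 0-weight cases dropped. It says nothing about standard errors, which depend on how the software estimates variance.

```python
# Toy data only: the scores and weights below are invented for illustration.
def weighted_mean(values, weights):
    """Weighted mean; cases with a weight of 0 contribute nothing."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

scores = [50.0, 42.0, 61.0, 10.0, 95.0]
weights = [1.2, 0.8, 1.0, 0.0, 0.0]  # the last two cases have a 0 weight

# Option raised in the question: replace 0 with a tiny positive value.
EPS = 1e-16
patched = [w if w > 0 else EPS for w in weights]

# Alternative: drop the 0-weight cases explicitly before the analysis.
kept = [(s, w) for s, w in zip(scores, weights) if w > 0]
kept_scores = [s for s, _ in kept]
kept_weights = [w for _, w in kept]

mean_zero = weighted_mean(scores, weights)
mean_patched = weighted_mean(scores, patched)
mean_kept = weighted_mean(kept_scores, kept_weights)

# Number of cases the software would see under each option.
print(len(scores), len(kept_scores))
# The tiny weights leave the point estimate essentially unchanged.
print(abs(mean_patched - mean_zero) < 1e-9)
```

In this toy example the weighted mean is effectively the same in all three versions, but the patched version keeps every case in the data set, whereas dropping the 0-weight cases makes the analytic sample explicit.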
Hope this helps,
#2 Updated by Orla McBride over 4 years ago
A few more queries (if I may). I'm trying to analyse data from individuals from the self-completion questionnaire - the SF-12 specifically. Are you saying that even if a person has full data on the SF-12 across Waves 2-5 but they have missing data on another variable within the self-completion questionnaire between W1-W5, their weight value for e_indscus_lw is 0 and therefore they need to be excluded from my analysis? I was trying to include in the analysis as many cases as possible - i.e. those with data for the SF-12 at 2 or more of the four waves (W2-5).
The sample for weighted analysis will be cut in half if this is the case.
#3 Updated by Olena Kaminska over 4 years ago
A 0 weight indicates skipping the whole self-completion questionnaire, not a few questions within it. From your clarification I am not sure whether you are after longitudinal analysis. Longitudinal weights are for longitudinal analysis, i.e. if you are using information from waves 1-5 (or similar) for each person. From your question I wonder if you are doing pooled analysis (balanced panel). If so, you may want to use cross-sectional weights instead.
Furthermore, given that you are using Mplus, which allows for complex models, it is worth checking whether your model can control for nonresponse within the model. This is usually possible if the model does not require full information from each person. If so, use the 'base' weight (e.g. longitudinal sc weight at wave 2) and rely on nonresponse correction by the model. I would still cross-check it with the e_indscus_lw model based on full data - the results should be similar. If they differ, you should be very careful as you may have misspecified the model.
Hope this helps,
#4 Updated by Orla McBride over 4 years ago
Thanks for the help. Sorry, but I need to clarify. I am interested in longitudinal analysis - I want to model change in SF-12 scores at the individual level across waves 2-5. I want to include as many individuals as possible in the analytic sample, not just those with complete data on the SF-12 from waves 2-5. So if people have data for the SF-12 at 2 or more waves, I want to include them in my analytic sample. I want to conduct a weighted analysis. My issue is that a lot of people seem to have a weight of 0 on e_indscus_lw but have full data on the SF-12 for these waves. Take for example pidp 2853965, 68025847, 68035365, 68063251. There are many others. I still don't understand why these are 0 on this weight.
Thanks for your patience!
#5 Updated by Olena Kaminska over 4 years ago
For most longitudinal analyses one can use only full data, i.e. only people who have responded in all waves of interest. That's what longitudinal weights are for.
Only a few models can deal with missing information within people. If you are sure that your model can use people who only completed w2, or w2 and w3, or any other combination, then you should use the 'base' weight. Because you start at wave 2 you should use b_indscus_lw as your weight in your analysis.
There will still be some 0-weights, but your sample size will be much higher now. Some of these cases missed the w1 self-completion, some are from BHPS (and are not part of the sc lw weight), and some are TSMs and should be out by design.
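A minimal Python sketch of this selection, on invented records, might look like the following. The weight name b_indscus_lw is the one mentioned above; the keys sf12_w2 to sf12_w5, the pidp values and the helper functions are placeholders made up for the example, and missing scores are represented as None, which is an assumption about how the data have been prepared.

```python
# Hypothetical records: pidp values, scores and weights are invented.
# The keys "sf12_w2".."sf12_w5" are placeholders, not real variable names.
WAVE_KEYS = ["sf12_w2", "sf12_w3", "sf12_w4", "sf12_w5"]

def n_observed(record):
    """Count the waves (2-5) with a non-missing SF-12 score."""
    return sum(record.get(k) is not None for k in WAVE_KEYS)

def select_sample(records, weight_key="b_indscus_lw", min_waves=2):
    """Keep cases with a positive base weight and enough observed waves."""
    return [
        r for r in records
        if (r.get(weight_key) or 0) > 0 and n_observed(r) >= min_waves
    ]

records = [
    {"pidp": 1, "b_indscus_lw": 1.1,
     "sf12_w2": 48.0, "sf12_w3": 50.0, "sf12_w4": None, "sf12_w5": None},
    {"pidp": 2, "b_indscus_lw": 0.0,
     "sf12_w2": 52.0, "sf12_w3": 51.0, "sf12_w4": 49.0, "sf12_w5": 50.0},
    {"pidp": 3, "b_indscus_lw": 0.9,
     "sf12_w2": 45.0, "sf12_w3": None, "sf12_w4": None, "sf12_w5": None},
    {"pidp": 4, "b_indscus_lw": 1.3,
     "sf12_w2": 55.0, "sf12_w3": 54.0, "sf12_w4": 53.0, "sf12_w5": 56.0},
]

sample = select_sample(records)
selected_ids = [r["pidp"] for r in sample]
print(selected_ids)
```

Here case 2 is dropped because its base weight is 0, even though it has scores at every wave, and case 3 is dropped because it has a score at only one wave. If missing values in your files are stored as special codes rather than as None, the check in n_observed would need adapting.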
Hope this answers your question,