BHPS weights for BHPS components in UKHLS wave 6
Our team is analysing the gender pay gap using BHPS samples. We require two separate weights, A and B, and we exclude the Northern Ireland and IEMBS samples.
A: We'd like to use UKHLS wave 6, adult cross-sectional data, BHPS sample only. I can't seem to find a weighting variable for the BHPS sub-sample. Could you advise me? Currently, we use "indpxui_xw", but I'm wondering if there's a weight for the BHPS sample only? (In the wage regression, we're entering work-life history from 1991 to 2015, in order to analyse the long-term impact of work-life history.)
B: We'd like to use UKHLS wave 6, cross-sectional data, UKHLS with the BHPS sub-sample. (In the wage regression, we're entering work-life history from 2010 to 2015.) Again, we currently specify "indpxui_xw" as the weight, but this includes IEMBS.
Please let me know if this is clear and any advice would be greatly appreciated.
#1 Updated by Stephanie Auty over 2 years ago
- Category set to Weights
- Status changed from New to In Progress
- Assignee set to Peter Lynn
- Target version set to BHPS
- % Done changed from 0 to 10
- Private changed from Yes to No
Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.
Stephanie Auty - Understanding Society User Support Officer
#2 Updated by Peter Lynn over 2 years ago
For analysis A, if you are using data from 1991-2015, you should use f_indin91_lw.
For B, if you are using data from 2010-15, you should use f_indinub_lw or f_indpxub_lw (depending on whether all your variables are included in the proxy interview).
#4 Updated by Sook Kim over 2 years ago
Thank you for the reply. Currently, we're not conducting longitudinal analysis. Maybe I have confused you with the information about work history elements. Please ignore this part.
We're only utilising UKHLS wave 6. I believe there are no appropriate adult main (or proxy) cross-sectional weights for the BHPS components in UKHLS wave 6. The most relevant weighting variable seems to be "indpxui_xw", but my understanding is that this covers a combination of the UKHLS samples, BHPS and IEMBS, which seems unsuitable for our current project.
Let me also explain the aim of the project. The current project's gender pay gap results, based on the latest BHPS cohorts (2014/15), will be compared with earlier work based on older BHPS samples (2007). This means we exclude the UKHLS cohorts in the Understanding Society data.
If it's easier, would it be possible for me to explain by phone? Please let me know what's best.
#6 Updated by Peter Lynn over 2 years ago
- Assignee changed from Peter Lynn to Sook Kim
- % Done changed from 60 to 80
Apologies if I had misunderstood. Yes, if your analysis uses only variables from wave 6, then a cross-sectional weight would be appropriate. However, as you note, the only cross-sectional weights included in the initial data release for wave 6 were indpxui_xw and indinui_xw, neither of which is suitable for use with the BHPS sample alone or with the BHPS+GPS+EMB sample (excluding IEMBS). Our reasoning was that for any analysis using only wave 6 data, more precise estimates can be obtained by including all samples.
However, the most recent update to the data files (released 29 March 2017) included some extra weights, responding to user demand. These include f_indpxub_xw, f_indinub_xw and f_indscub_xw. These should enable you to carry out the second part (B) of your analysis.
But there is still no weight optimised for analysis of the BHPS sample alone using only wave 6 data. The most appropriate weight for such analysis would be f_indin01_lw. Using this weight, you will lose some statistical power as some persons will be dropped from the analysis (zero weight) due to having missed some earlier wave(s), but estimates should still be unbiased (or rather, as unbiased as they would be with the optimal weight).
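To illustrate what "dropped from the analysis (zero weight)" means in practice, here is a minimal sketch in Python. It assumes the wave 6 records have already been loaded as dictionaries. Only the weight name f_indin01_lw comes from this thread. The column names "sex" and "hourly_pay", the helper function and the four records are hypothetical and made up for illustration. It computes weighted means rather than the wage regression discussed above, and it ignores the survey design (clustering and stratification), so treat it as a rough illustration only.

```python
from collections import defaultdict


def weighted_group_means(rows, value_key, group_key, weight_key):
    """Return {group: weighted mean of value_key}.

    Rows whose weight is missing, zero or negative are skipped,
    so they contribute nothing to the estimate.
    """
    totals = defaultdict(float)   # sum of weight * value per group
    weights = defaultdict(float)  # sum of weights per group
    for row in rows:
        w = row.get(weight_key)
        if not w or w <= 0:
            continue  # e.g. respondent missed an earlier wave
        totals[row[group_key]] += w * row[value_key]
        weights[row[group_key]] += w
    return {g: totals[g] / weights[g] for g in totals}


# Made-up records; "sex" and "hourly_pay" are hypothetical column names.
sample = [
    {"sex": "male", "hourly_pay": 20.0, "f_indin01_lw": 1.0},
    {"sex": "male", "hourly_pay": 10.0, "f_indin01_lw": 3.0},
    {"sex": "female", "hourly_pay": 12.0, "f_indin01_lw": 2.0},
    {"sex": "female", "hourly_pay": 30.0, "f_indin01_lw": 0.0},  # zero weight
]

means = weighted_group_means(sample, "hourly_pay", "sex", "f_indin01_lw")

# Raw (unadjusted) gap as a share of the male mean.
gap = 1 - means["female"] / means["male"]
```

The fourth record has a zero weight, so it does not enter the female mean at all. This is the loss of statistical power mentioned above: the case is present in the file but carries no information into the estimate.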