Share of EU born respondents with UK citizenship only
I am writing to ask about the unusually high share of EU-born respondents reporting only British citizenship (that is, no dual UK-EU nationality) in waves a_ to e_ (around 30%). In wave h_ this share is 8%, which makes more sense. I am aware that respondents holding a UK passport in wave x are not asked about their citizenship again in wave x + 1 or in subsequent waves.
My objective is to identify the citizenship(s) of wave h_ respondents, including those with dual nationality. That means I need to take into account the answers given at previous waves by those respondents who are not asked the citizenship question in wave h_ (h_citzn1==-8).
I’ve merged the citizenship variables in indresp from waves a_ to h_ and created a file named citizen.dta. My new citizenship variable takes the wave h_ value where there is one, and falls back on the wave a_ value only for those respondents who were not asked the question in wave h_. The final variable shows an unusually high share of EU-born respondents with UK citizenship only. This high share cannot be accounted for by naturalisations: since 1990 there have been 250,000 naturalisations of EU citizens, which is about 8% of the population who arrived since that year.
I’ve attached a do file with the code I've used to calculate this (the relevant part starts at line 23).
#1 Updated by Stephanie Auty 4 months ago
- Private changed from Yes to No
- % Done changed from 0 to 10
- Assignee changed from Alita Nandi to Stephanie Auty
- Status changed from New to In Progress
Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.
Stephanie Auty - Understanding Society User Support Officer
#2 Updated by Stephanie Auty 3 months ago
- % Done changed from 10 to 50
- Assignee changed from Stephanie Auty to Olena Kaminska
- Category changed from Data analysis to Weights
Apologies for the delay in getting back to you.
Firstly, in calculating your final variable you will need to use data from all waves, not just Waves 1 and 8, because people who joined the survey between these waves will have answered the citizenship question in the wave they joined.
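To illustrate the point about using all waves, here is a minimal sketch in Python (not the Stata used in the attached do file) that picks each respondent's most recent valid answer from waves a_ to h_. The respondent IDs and citizenship codes are made up for illustration, the stem citzn1 is taken from h_citzn1 in the original post, and treating every negative code as missing is an assumption to check against the documentation. The same logic would need repeating for any further citizenship variables that hold a second nationality.

```python
WAVES = "abcdefgh"  # wave prefixes a_ to h_
INAPPLICABLE = -8   # "not asked" code mentioned in the thread (h_citzn1 == -8)


def latest_valid(row, stem="citzn1"):
    """Return (wave, value) for the most recent non-missing answer, or None."""
    for wave in reversed(WAVES):
        value = row.get(f"{wave}_{stem}")
        # Assumption: every negative code marks some kind of missing value.
        if value is not None and value >= 0:
            return wave, value
    return None


# Hypothetical respondents; codes 1 and 2 are illustrative, not real value labels.
people = {
    101: {"a_citzn1": 2, "h_citzn1": INAPPLICABLE},  # answered in wave a_ only
    102: {"f_citzn1": 1, "g_citzn1": INAPPLICABLE, "h_citzn1": INAPPLICABLE},  # joined at wave f_
    103: {"h_citzn1": 2},                            # asked in wave h_
    104: {"h_citzn1": INAPPLICABLE},                 # no valid answer in any wave
}

combined = {pid: latest_valid(row) for pid, row in people.items()}

for pid, result in combined.items():
    print(pid, result)
```

Respondent 102 shows why waves a_ and h_ alone are not enough: the only valid answer sits in wave f_, the wave in which that person joined.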
In your tabulation at the end of your syntax file, you are not weighting the data. In Wave 6 we introduced the immigrant and ethnic minority boost sample, so that is why you are seeing a large jump in types of answers. I will assign this issue to our survey statistician next to discuss which weights would make the results comparable.
#3 Updated by Olena Kaminska 3 months ago
I agree with Stephanie that you need to use weights to account correctly for, among other things, the immigrant and ethnic minority boost introduced at wave 6 and the ethnic minority boost that started at wave 1. Let me know if you need help with selecting the correct weights for your analysis.
If you haven't used weights in your analysis, this most likely explains the differences in your estimates. Please let us know if you still observe these differences once you use our weights.
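As a rough illustration of why weighting matters here, the following Python sketch compares an unweighted and a weighted share on made-up data. The flags and weights are hypothetical and the function name is not from any survey package; in practice the weight would be whichever Understanding Society weight variable suits the analysis.

```python
def weighted_share(flags, weights):
    """Share of total weight held by respondents whose flag is True."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w for f, w in zip(flags, weights) if f) / total


# Hypothetical data: True means "UK citizenship only".
uk_only = [True, True, False, False]
weights = [0.5, 0.5, 2.0, 1.0]  # boost-sample cases typically carry smaller weights

unweighted = sum(uk_only) / len(uk_only)
weighted = weighted_share(uk_only, weights)

print(f"unweighted share: {unweighted:.2f}")
print(f"weighted share:   {weighted:.2f}")
```

In this toy example the two respondents with small weights are both "UK only", so the unweighted share (0.50) is twice the weighted one (0.25).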
Just to point out one important thing: UKHLS is a longitudinal study, and it does not represent recent immigrants in the years between boosts. In other words, it represents the cross-sectional population, including all immigrants, in 1991 (GB only), 2001 (NI only), 2009-10 (UK) and 2014-15 (UK). In the years between, for example, 2009-10 and 2014-15, only immigrants who move in with people who were in the UK before 2009-10 are represented, and even these are represented in lower proportions than in the cross-sectional population. We recognize this, and it was one of the reasons we boosted our sample at wave 6, concentrating specifically on recent immigrants among other groups. From wave 6 onwards the sample of immigrants is also larger, which gives more precise estimates. So if you compare estimates relating to immigrants before and after wave 6, it is worth checking the confidence intervals (weighted, of course).
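For a quick feel of how wide a weighted confidence interval might be, here is a rough Python sketch. It uses Kish's effective sample size and a normal approximation, which is a simplification swapped in for illustration: it ignores clustering and stratification, so a proper design-based estimate from survey software will differ and should be preferred for any reported result. All data are hypothetical.

```python
import math


def approx_weighted_ci(flags, weights, z=1.96):
    """Weighted share with an approximate CI based on Kish's effective sample size.

    Returns (share, lower, upper, n_eff). Ignores clustering and stratification.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    share = sum(w for f, w in zip(flags, weights) if f) / total
    # Kish effective sample size: (sum of weights)^2 / sum of squared weights.
    n_eff = total ** 2 / sum(w * w for w in weights)
    half_width = z * math.sqrt(share * (1 - share) / n_eff)
    return share, max(0.0, share - half_width), min(1.0, share + half_width), n_eff


# Hypothetical sample: 30 of 100 respondents flagged, all with equal weight.
flags = [True] * 30 + [False] * 70
weights = [1.0] * 100

share, lower, upper, n_eff = approx_weighted_ci(flags, weights)
print(f"share={share:.3f}, CI=({lower:.3f}, {upper:.3f}), n_eff={n_eff:.0f}")
```

With equal weights the effective sample size equals the actual sample size; the more the weights vary, the smaller it gets and the wider the interval becomes.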
Hope this helps,
#4 Updated by Olena Kaminska 3 months ago
Apologies. I just realized that you very helpfully included your syntax. I had a look at it and noticed that you are using longitudinal weights. I feel that cross-sectional weights may be more appropriate for you. Can you confirm that the only reason you use longitudinal weights is that you created the citizenship variable using all the previous waves? If so, and you are not analysing people over time, you can use xw weights. This is because, to be in your analysis, people did not have to participate in every wave from 1 to 8; they only had to have a value from at least one wave. If this is correct, try using xw weights.