Valid occupation code versus job hours
Dear Understanding Society team,
I'm puzzled that in my data (all BHPS and US waves merged) I have approximately 35,000 individuals working with jbhrs>0, but when I count valid occupation codes for jbisco88_cc I get 47,000 unique observations. Do you know why there seem to be more observations with an occupation code than people working in an occupation?
#1 Updated by Alita Nandi about 2 months ago
- Private changed from Yes to No
- % Done changed from 0 to 80
- Assignee set to C Josten
- Status changed from New to Feedback
The occupation question is asked of anyone who did paid work in the week before the interview (jbhas=1 OR jboff=1). But hours worked (jbhrs) is asked only of those who did paid work in the week before the interview (jbhas=1 OR jboff=1) AND were employees rather than self-employed (jbsemp=1). So there are more people with a valid occupation code than with positive hours in paid employment.
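As a rough illustration of the routing described above, here is a minimal Python sketch. The four records are invented, and the function names `asked_occupation` and `asked_hours` are hypothetical helpers, not part of any Understanding Society tooling. It assumes jbsemp=2 marks the self-employed and uses -8 as the inapplicable code.

```python
# Toy person-wave records; all values are invented for illustration.
# jbsemp: 1 = employee, 2 = self-employed; -8 = inapplicable (assumed coding).
records = [
    {"pidp": 1, "jbhas": 1, "jboff": -8, "jbsemp": 1},   # employee, worked last week
    {"pidp": 2, "jbhas": 1, "jboff": -8, "jbsemp": 2},   # self-employed, worked last week
    {"pidp": 3, "jbhas": 2, "jboff": 1, "jbsemp": 1},    # employee, away from job
    {"pidp": 4, "jbhas": 2, "jboff": 2, "jbsemp": -8},   # no job
]

def asked_occupation(r):
    # Routing for the occupation question: jbhas=1 OR jboff=1.
    return r["jbhas"] == 1 or r["jboff"] == 1

def asked_hours(r):
    # Routing for jbhrs: same condition AND jbsemp=1.
    return asked_occupation(r) and r["jbsemp"] == 1

n_occ = sum(asked_occupation(r) for r in records)
n_hrs = sum(asked_hours(r) for r in records)
print(n_occ, n_hrs)
```

In this toy data the self-employed respondent is routed to the occupation question but not to jbhrs, so the occupation count is the larger of the two.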
Hope this helps.
On behalf of Understanding Society User Support Team
#2 Updated by C Josten about 2 months ago
Thanks for your fast and very helpful reply, which makes sense. One follow-up question: why do I also have only 35,000 observations when restricting unique observations to jbsemp>0? This should include both self-employed and employed, right?
#4 Updated by C Josten about 2 months ago
I am interested in both employed and self-employed. This should match the number of individuals with a valid occupation code, right? But when I restrict to employed and self-employed (jbsemp=1 or jbsemp=2) I get a much smaller number of unique observations than when I look at valid occupation codes (jbisco88_cc). So I am just puzzled how this can be the case...
#5 Updated by Alita Nandi about 2 months ago
Those who have a job in the week before the interview (jbhas=1 OR jboff=1), whether in paid employment or self-employed (jbsemp=1 or jbsemp=2), are asked about their occupation. This is coded to SOC codes (JBSOC00) and then transformed into JBISCO codes using a look-up file. See description here
But for around 10k cases this translation was not possible, and so JBISCO88 is missing even though JBSOC00 is available.
Also note that the _cc versions of these variables are the less disclosive versions and so are available in the End User License version of the data. If you want the complete coding, you would need to apply for the Special License version of the data.
#6 Updated by C Josten about 2 months ago
Hi Alita! Sorry to bother you again, but just to reiterate: the occupation code actually has more unique observations than there are people who have a job in the week before the interview (jbhas=1 OR jboff=1). How can that be? If you're saying that around 10k JBISCO88 codes are missing, then if anything there should be fewer observations in jbisco88 than people in some form of employment, no?
Thanks again for your help!!!
#7 Updated by Alita Nandi about 2 months ago
No problem. I have looked at the data again. Over the 18+9 waves of data:
jbhas=1|jboff=1 for 370979 cases
jbocc90_cc or jbsoc00_cc is not -8 (i.e., valid cases either >0 or -1 -2 -9) for 370412 cases
jbisco88_cc is not -8 (i.e., valid cases either >0 or -1 -2 -9) for 360642 cases
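A quick check of the gaps between the three counts above, written in Python purely as arithmetic; the variable names are made up for this sketch.

```python
# Case counts quoted above (person-wave cases over the 18+9 waves).
n_has_job = 370979   # jbhas=1 | jboff=1
n_soc = 370412       # jbocc90_cc or jbsoc00_cc is not -8
n_isco = 360642      # jbisco88_cc is not -8

# Cases with a job but no SOC-type occupation code.
gap_job_soc = n_has_job - n_soc
# Cases with a SOC-type code but no ISCO-88 code.
gap_soc_isco = n_soc - n_isco

print(gap_job_soc, gap_soc_isco)  # 567 9770
```

The second gap, 9770, is consistent with the roughly 10k cases for which JBISCO88 is missing even though a SOC code is available.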
Are you using Stata? Then I can share the syntax file which I have used to get these figures.
#8 Updated by C Josten about 2 months ago
Thanks so much for looking into this. I think I have figured it out now. I was looking at unique observations (i.e. by pidp), so it makes sense that there are more occupation codes than observations of work status if individuals stay employed but switch jobs. My sample is smaller as I have restricted for things.
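To show how the counting unit changes the totals, here is a small Python sketch. The rows and occupation codes are invented placeholders, and the assumption that the larger count came from distinct person-code combinations is my own reading, not something confirmed by the data.

```python
# Toy (pidp, wave, occupation code) rows; all values are invented placeholders.
rows = [
    (1, 1, 411), (1, 2, 411), (1, 3, 241),   # person 1 switches job at wave 3
    (2, 1, 522), (2, 2, 522),                # person 2 keeps the same job
]

# Person-wave observations.
n_rows = len(rows)
# Distinct individuals (unique pidp).
n_persons = len({pidp for pidp, _, _ in rows})
# Distinct person-code combinations: a job switch adds one for the same pidp.
n_person_code = len({(pidp, code) for pidp, _, code in rows})

print(n_rows, n_persons, n_person_code)
```

The three totals differ even though the underlying rows are identical, which is why a count of unique occupation codes can exceed a count of unique working individuals.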
Thanks for your help in understanding the difference in those variables and best!