Fertility History and birth events
I have a couple of questions about how to build fertility histories and birth events for individuals in the BHPS. I am aiming to match this information with answers to the gender-attitude variables (i.e. variables Wopfam*), which are answered only by individuals whose individual interview outcome is "full interview" (i.e. Wivfio=1). Ultimately I would like to build a panel dataset for individuals with Wivfio=1 that records the number of children they have and is updated every time there is a birth event (I'd like to distinguish between sons and daughters).
What is the best way to do this?
If I understand correctly I have at least two different procedures I could follow:
- If I am interested in children cohabiting with the respondent, then I should take n_children=Wnchild and measure birth events using changes in the variable Wnchild in Windresp.dta, together with information on the age of the natural children taken from Windall.dta (in this way I can check that an increase in Wnchild is actually due to a child aged 0 entering the sample rather than an older child moving back in with their parents).
- If instead, which would be my preferred option, I want to measure n_children as "the stock" of all children, I should rely on bchildnt.dta and focus only on respondents present at that point in time, for whom I can reconstruct the whole fertility history. Then I should merge this information by pid with both Windresp.dta and Windall.dta and follow the procedure above. In other words, this second strategy would allow me to be certain about an individual's "stock of children" and to update it with birth events as described in the point above.
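To make the first procedure concrete, here is a minimal pandas sketch of the check described above: an increase in the co-resident child count is treated as a birth only if a child aged 0 is present in that wave. The data are toy values and the column names (`nchild`, `parent_pid`, `child_age`) are hypothetical stand-ins, not actual BHPS variable names. It also assumes a newborn is aged 0 at the interview, which may miss a child born shortly after one interview who is already 1 at the next.

```python
import pandas as pd

# Toy long-format panel: one row per respondent (pid) and wave.
# Column names are hypothetical, not real BHPS variable names.
indresp = pd.DataFrame({
    "pid":    [1, 1, 1, 2, 2, 2],
    "wave":   [1, 2, 3, 1, 2, 3],
    "nchild": [0, 1, 1, 1, 2, 2],
})

# Toy child records: one row per co-resident natural child and wave.
children = pd.DataFrame({
    "parent_pid": [1, 1, 2, 2, 2, 2, 2],
    "wave":       [2, 3, 1, 2, 2, 3, 3],
    "child_age":  [0, 1, 5, 6, 14, 7, 15],
})

# Wave-to-wave change in the number of co-resident children.
indresp = indresp.sort_values(["pid", "wave"])
indresp["d_nchild"] = indresp.groupby("pid")["nchild"].diff()

# Number of children aged 0 per parent and wave.
infants = (
    children[children["child_age"] == 0]
    .groupby(["parent_pid", "wave"]).size()
    .rename("n_infants").reset_index()
    .rename(columns={"parent_pid": "pid"})
)

panel = indresp.merge(infants, on=["pid", "wave"], how="left")
panel["n_infants"] = panel["n_infants"].fillna(0).astype(int)

# A birth event needs both an increase in nchild and an infant present.
panel["birth_event"] = (panel["d_nchild"] > 0) & (panel["n_infants"] > 0)
print(panel)
```

In the toy data, pid 1 gains a child aged 0 in wave 2 and is flagged, while pid 2's increase in wave 2 comes from a 14-year-old joining the household and is not.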
Could I please ask you whether this is the right way to proceed? Should I use different variables to record births rather than Wnchild from Windresp.dta and the age of the child from Windall.dta? I have seen in previous open issues that users mentioned other variables, so I am unsure about the best way to do this.
Many thanks for your help!
#1 Updated by Maddalena Ronchi 11 months ago
- File example.PNG added
I'd like to raise an extra point: there are some instances in which the number of children computed from bchildnt.dta does not correspond to the number of children observed in bindall.dta, even when the children are younger than 16 years old.
For example, in the case of bhid=<removed> (see attachment): according to bchildnt.dta the adult individuals in the household have 3 children, 2 boys and one girl, all under the age of 16.
However, once I look for that same household in bindall.dta I observe only two natural children, one girl and one boy, and the other boy is missing.
Which information should I trust?
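A quick way to list all such cases, rather than spotting them one household at a time, is to count children per parent in each source and compare. The sketch below uses toy data and hypothetical column names (`pid`, `parent_pid`); it does not say which source to trust, it only shows where the two disagree and in which direction.

```python
import pandas as pd

# Toy stand-ins for the two sources; column names are hypothetical.
# One row per child recorded in the fertility history file.
childnt = pd.DataFrame({"pid": [10, 10, 10, 20, 40]})
# One row per natural child observed in the household grid.
indall_kids = pd.DataFrame({"parent_pid": [10, 10, 30, 40]})

n_hist = childnt.groupby("pid").size().rename("n_childnt")
n_hh = (
    indall_kids.rename(columns={"parent_pid": "pid"})
    .groupby("pid").size().rename("n_indall")
)

# Outer alignment on pid: a parent missing from one source gets a count of 0.
cmp = pd.concat([n_hist, n_hh], axis=1).fillna(0).astype(int).sort_index()
cmp["diff"] = cmp["n_childnt"] - cmp["n_indall"]

# Positive diff: more children in the history file than in the household grid.
# Negative diff: the opposite case.
mismatch = cmp[cmp["diff"] != 0]
print(mismatch)
```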
#2 Updated by Maddalena Ronchi 11 months ago
- File example_bindall.PNG added
There are also cases of the opposite kind, i.e. cases in which a supposedly natural child (according to bindall.dta) doesn't appear in bchildnt.dta (see the screenshot of bhid=<removed> in bindall.dta, which is missing in bchildnt.dta).
#3 Updated by Stephanie Auty 11 months ago
- % Done changed from 0 to 10
- Assignee set to Stephanie Auty
- Category set to Data inconsistency
Many thanks for your enquiry. The Understanding Society team is looking into it and we will get back to you as soon as we can.
Please do not upload data from our datasets to the forum, as they should not be accessed other than through the UKDA. I have saved the files locally for my reference and deleted them from this post.
Stephanie Auty - Understanding Society User Support Officer
#7 Updated by Maddalena Ronchi 10 months ago
Thank you very much Stephanie and sorry for uploading the file, it won't happen again now that I know I shouldn't do that.
While working on the data I have come up with a new question: how should I interpret the fact that some pids are present in both egoalt and indall for some waves, while for other waves the same pid is present only in indall? Does it mean the relationship with the opid "broke"?
Thanks a lot!
#8 Updated by Maddalena Ronchi 10 months ago
Any news about my question? I'd be happy to just replicate the strategy suggested by the BHPS, if any, to build the fertility history of individuals.
I have recently bumped into this paper: https://economics.yale.edu/sites/default/files/files/Faculty/washington/mommy_effect_20june2018.pdf
They start from the fertility interview at the second wave and update it from there. They say they follow the same method suggested by the BHPS. Is the code for the method they refer to publicly available? If so, I could just follow it.
Thanks a lot for your help!
#9 Updated by Stephanie Auty 10 months ago
- % Done changed from 10 to 60
- Assignee changed from Stephanie Auty to Maddalena Ronchi
The dataset British Household Panel Survey Consolidated Marital, Cohabitation and Fertility Histories, 1991-2009 is available here: https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=5629#!/details
If you are interested in the code or methods used then please contact the principal investigator as listed on the Data Service website.
To answer your question about egoalt and indall, egoalt only contains information about relationships between individuals in the household and so does not contain respondents who live in a one person household.
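As a rough illustration of this point, the check below lists the person-waves that appear in indall but not in egoalt and confirms that they are all one-person households. The data are toy values and the column names (`hid`, `hh_size`) are hypothetical, so this is only a sketch of how one might verify the pattern, not a description of the actual files.

```python
import pandas as pd

# Toy data with hypothetical column names.
# Wave 1: pids 1 and 2 share household 100; pid 3 lives alone.
# Wave 2: pids 1 and 2 each live alone.
indall = pd.DataFrame({
    "wave": [1, 1, 1, 2, 2],
    "hid":  [100, 100, 101, 200, 201],
    "pid":  [1, 2, 3, 1, 2],
})
# One row per ordered pair of household members.
egoalt = pd.DataFrame({
    "wave": [1, 1],
    "hid":  [100, 100],
    "pid":  [1, 2],
    "opid": [2, 1],
})

# Household size from the household grid.
indall["hh_size"] = indall.groupby(["wave", "hid"])["pid"].transform("count")

# Flag person-waves that have no row at all in egoalt.
ego_keys = egoalt[["wave", "pid"]].drop_duplicates()
merged = indall.merge(ego_keys, on=["wave", "pid"], how="left", indicator=True)
missing = merged[merged["_merge"] == "left_only"]

# If egoalt only holds within-household pairs, these should all be size 1.
all_single = bool((missing["hh_size"] == 1).all())
print(missing[["wave", "pid", "hh_size"]], all_single)
```

So a pid dropping out of egoalt in a given wave indicates that nobody else was enumerated in the household in that wave, which is not by itself evidence about what happened to a particular pid-opid relationship.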
#10 Updated by Maddalena Ronchi 10 months ago
Thanks a lot Stephanie. I have written to the principal investigator as you suggested because unfortunately the gender of the children is not included in the database you mentioned.
I'd like to ask your advice on one last issue: suppose I want to start building my database from bchildnt.dta. Which file should I then use to update the fertility histories? In particular, I can think of two ways:
1) use Windall with the variable Wiviow1 (in particular when it takes value=7) / bnewhy=1
2) use Wegoalt and update over time the number of pid-opid relationships where rel=4 (i.e. opid is the natural child of pid)
Is there any other (better) way? I am confused because the two methodologies lead to different results.
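One possible source of the discrepancy, given that egoalt only covers people living in the same household, is that a per-wave count of rel=4 links falls when a child moves out. The sketch below contrasts that per-wave count with a cumulative count of distinct children ever observed, split by the child's sex. The data are toy values, the column `osex` is hypothetical, and the meaning of rel=4 is taken from the description above rather than checked against the documentation.

```python
import pandas as pd

# Toy egoalt-style data; column names other than pid/opid/rel are hypothetical.
# Child 11 is present in waves 2-3 and then leaves; child 13 arrives in wave 4.
egoalt = pd.DataFrame({
    "wave": [2, 3, 3, 4, 4],
    "pid":  [1, 1, 1, 1, 1],
    "opid": [11, 11, 12, 12, 13],
    "rel":  [4, 4, 4, 4, 4],
    "osex": ["f", "f", "m", "m", "m"],
})

kids = egoalt[egoalt["rel"] == 4]

# Per-wave count of co-resident natural children (drops when a child leaves).
coresident = kids.groupby(["pid", "wave"]).size()

# First wave in which each distinct child is observed with this parent.
first_seen = kids.sort_values("wave").drop_duplicates(["pid", "opid"])

# New children per wave by sex, then the running stock of sons and daughters.
new_by_wave = (
    first_seen.groupby(["pid", "wave", "osex"]).size()
    .unstack("osex", fill_value=0)
)
stock = new_by_wave.groupby(level="pid").cumsum()
print(coresident, stock, sep="\n")
```

In the toy data the co-resident count stays at 2 between waves 3 and 4 even though a new child appears, while the cumulative stock rises to 3. A child first seen in a later wave is not necessarily a newborn, so an age check would still be needed before calling it a birth event.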
Thanks a lot!