Support #1368

Unmatched households when merging hhresp with lsoa data

Added by Natalie Bennett 3 months ago. Updated 3 months ago.

Status: Feedback
Priority: Normal
Category: Special license
Target version:
Start date: 06/23/2020
Due date:
% Done: 80%
Estimated time:
Description

Hi, I am trying to attach LSOA identifiers to household-level data using wave 3. I have an issue when I merge the c_hhresp file with the c_lsoa11_protect file using the Stata code below:

use c_hhresp.dta, clear
merge m:1 c_hidp using c_lsoa11_protect.dta

When performing this command I find 7 households from the hhresp file which are not found in the LSOA file, and 6,507 households from the LSOA file not found in the hhresp file.

I haven't been able to identify a reason from looking at the data as to why these data are only found in one file, rather than both. I was wondering if you could confirm if the above merge is correct and if so, what the reason is for these data which cannot be matched?

Many thanks

Natalie

History

#1 Updated by Gundi Knies 3 months ago

  • Private changed from Yes to No
  • % Done changed from 0 to 80
  • Target version set to X M
  • Assignee set to Natalie Bennett
  • Category set to Special license

Hi Natalie,
The universe of cases in the geographical identifier datasets is sampled households with a valid postcode on the ONS Postcode Directory; see the user guidance accompanying the file. Not all sampled households have a valid postcode, and not all sampled households participate in the household interview. The hhresp data file only contains households with a household interview record.

Hope this explains the non-matches you find.
Best wishes,
Gundi

#2 Updated by Alita Nandi 3 months ago

  • Status changed from New to Feedback

To follow up on what Gundi has said,

All households in c_hhsamp that have a valid postcode for their address are available in the c_lsoa11_protect file. c_hhresp comprises only those sampled households that completed a household questionnaire, so the 6,507 cases found only in c_lsoa11_protect are households that are present in c_hhsamp but not in c_hhresp. The 7 cases in c_hhresp but not in c_lsoa11_protect are households without a valid postcode.

You can check this by merging c_hhsamp with c_lsoa11_protect using c_hidp. You will find that there are no cases in c_lsoa11_protect that are not in c_hhsamp, but there are 2,898 cases in c_hhsamp that are not in c_lsoa11_protect.

Note that c_hhsamp, c_hhresp and c_lsoa11_protect are all at the c_hidp level (one row per household), so m:1 merging is not needed. You can use 1:1 merging.
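
As a rough illustration of the check and the 1:1 merge described above, the following Stata sketch may help. It assumes all three wave 3 files sit in the current working directory; the keep(master match) option in the second merge is only one possible choice for handling the non-matches, not a recommendation from the guidance.

```stata
* Sketch only: assumes c_hhsamp.dta, c_hhresp.dta and c_lsoa11_protect.dta
* are in the current working directory.

* Check 1: compare the sampled households with the LSOA file.
use c_hhsamp.dta, clear
merge 1:1 c_hidp using c_lsoa11_protect.dta
* _merge == 1: in c_hhsamp only (households without a valid postcode)
* _merge == 2: in c_lsoa11_protect only (none expected)
* _merge == 3: in both files
tabulate _merge

* Check 2: attach LSOA identifiers to the interviewed households.
use c_hhresp.dta, clear
* keep(master match) is an illustrative choice: it retains every
* household in c_hhresp and drops rows found only in the LSOA file.
merge 1:1 c_hidp using c_lsoa11_protect.dta, keep(master match)
tabulate _merge
```

If the files really are unique on c_hidp, the 1:1 merge runs without complaint; if it stops with an error saying c_hidp does not uniquely identify observations, that would itself be worth investigating.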

Best wishes,
Alita
On behalf of Understanding Society User Support Team

#3 Updated by Natalie Bennett 3 months ago

Hi Alita and Gundi

Thanks both for your help, I understand the source of the non-matches now.

Thanks so much.

Best wishes

Natalie

#4 Updated by Natalie Bennett 3 months ago

Sorry, just to follow up so that I can justify this later: could you confirm why those 2,898 cases are not in the LSOA file?

Many thanks

Natalie

#5 Updated by Alita Nandi 3 months ago

Hi Natalie,

These are the households without valid postcodes.

Best wishes,
Alita
