IHSN Survey Catalog
Community Survey 2016 - IPUMS Subset

South Africa, 2016
Reference ID
ZAF_2016_PHC_v01_M_v7.5_A_IPUMS
Producer(s)
Statistics South Africa, IPUMS
Created on
Sep 03, 2025
Last modified
Sep 03, 2025
  • Identification

    Survey ID number

    ZAF_2016_PHC_v01_M_v7.5_A_IPUMS

    Title

    Community Survey 2016 - IPUMS Subset

    Abbreviation or Acronym

    PHC South Africa 2016 (IPUMS Harmonized Subset)

    Country
    Name            Country code
    South Africa    ZAF
    Study type

    Population and Housing Census [hh/popcen] IPUMS International

    Series Information

    DOI:10.18128/D020.V7.5

    Kind of Data

    Sample survey data [ssd]

    Unit of Analysis

    Persons and households

    UNITS IDENTIFIED:

    • Dwellings: no
    • Vacant Units: no
    • Households: yes
    • Individuals: yes
    • Group quarters: no

    UNIT DESCRIPTIONS:

    • Dwellings: A structure, part of a structure, or group of structures occupied or meant to be occupied by one or more households.
    • Households: A household is a group of persons who live together and provide themselves jointly with food or other essentials for living, or a single person who lives alone.
    • Group quarters: Not enumerated

    Version

    Version Description

    Version 7.5. The datasets contain selected variables from the original census microdata plus harmonized variables from the IPUMS-International database.

    Version Date

    2024-10-05

    Scope

    Notes

    Additional notes on a sample that is part of this study: South Africa 2016

    Topics
    Topic Vocabulary
    Demographic Variables -- PERSON IPUMS
    Appliances, Mechanicals, Other Amenities Variables -- HOUSEHOLD IPUMS
    Other Household Variables -- HOUSEHOLD IPUMS
    Geography: Global Variables -- HOUSEHOLD IPUMS
    Fertility and Mortality Variables -- PERSON IPUMS
    Nativity and Birthplace Variables -- PERSON IPUMS
    Utilities Variables -- HOUSEHOLD IPUMS
    Technical Household Variables -- HOUSEHOLD IPUMS
    Geography: IPUMS-I, IPUMS-DHS Variables -- HOUSEHOLD IPUMS
    Disability Variables -- PERSON IPUMS
    Education Variables -- PERSON IPUMS
    Constructed Family Interrelationship Variables -- PERSON IPUMS
    Geography: O-Z Variables -- HOUSEHOLD IPUMS
    Migration: Global Variables -- PERSON IPUMS
    Group Quarters Variables -- HOUSEHOLD IPUMS
    Constructed Household Variables -- HOUSEHOLD IPUMS
    Ethnicity and Language Variables -- PERSON IPUMS
    Migration: O-Z Variables -- PERSON IPUMS
    Household Economic Variables -- HOUSEHOLD IPUMS
    Technical Person Variables -- PERSON IPUMS
    Dwelling Characteristics Variables -- HOUSEHOLD IPUMS
    Work Variables -- PERSON IPUMS

    Coverage

    Geographic Unit

    Municipality

    Universe

    Population residing in private dwellings.

    Producers and sponsors

    Primary investigators
    Name                       Affiliation
    Statistics South Africa
    IPUMS                      University of Minnesota

    Sampling

    Sampling Procedure

    MICRODATA SOURCE: Statistics South Africa

    SAMPLE SIZE (person records): 3328793.

    SAMPLE DESIGN: Systematic stratified sample drawn by Statistics South Africa based on the 2011 census.
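
    As a rough illustration of what a systematic stratified design involves, the sketch below selects every k-th unit within each stratum of a toy sampling frame. It is not the procedure Statistics South Africa used; the frame, the stratum labels, the interval, and the function names are all hypothetical. In practice the starting position is normally chosen at random, but it is fixed here so the result is reproducible.

```python
from collections import defaultdict


def systematic_sample(units, interval, start=0):
    """Select every `interval`-th unit, beginning at index `start`."""
    if interval < 1 or not 0 <= start < interval:
        raise ValueError("need interval >= 1 and 0 <= start < interval")
    return units[start::interval]


def stratified_systematic_sample(frame, interval, start=0):
    """Apply systematic selection separately within each stratum.

    `frame` is a list of (stratum, unit_id) pairs, assumed to be listed
    in the order in which the sampling frame orders them.
    """
    by_stratum = defaultdict(list)
    for stratum, unit_id in frame:
        by_stratum[stratum].append(unit_id)
    return {
        stratum: systematic_sample(ids, interval, start)
        for stratum, ids in by_stratum.items()
    }


# Hypothetical frame: stratum "A" has 6 units, stratum "B" has 4.
frame = [("A", i) for i in range(6)] + [("B", i) for i in range(4)]
sample = stratified_systematic_sample(frame, interval=2, start=1)
print(sample)
```

    Selecting within each stratum separately guarantees that every stratum is represented in proportion to the chosen interval.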

    Weighting

    Computed by census agency and should be used for most types of analysis.
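
    A minimal sketch of what using the weights means in practice: each person record counts for as many people as its weight says. PERWT is the person weight named elsewhere in this record; the `age` field, the record values, and the function names are hypothetical and for illustration only.

```python
def weighted_total(records, weight_key="PERWT"):
    """Estimate the population size as the sum of the person weights."""
    return sum(r[weight_key] for r in records)


def weighted_mean(records, value_key, weight_key="PERWT"):
    """Weighted mean of `value_key`, each record weighted by `weight_key`."""
    total_weight = weighted_total(records, weight_key)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    weighted_sum = sum(r[value_key] * r[weight_key] for r in records)
    return weighted_sum / total_weight


# Hypothetical person records, not taken from the actual sample.
records = [
    {"age": 30, "PERWT": 10.0},
    {"age": 50, "PERWT": 30.0},
]

population_estimate = weighted_total(records)
mean_age = weighted_mean(records, "age")
print(population_estimate, mean_age)
```

    In this toy example the unweighted mean age would be 40, while the weighted mean is 45, because the second record represents three times as many people as the first.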

    Survey instrument

    Questionnaires

    A single "Household Questionnaire" for information on dwelling, household, and individuals. All data to be collected on a tablet computer.

    Data collection

    Dates of Data Collection
    Start         End
    2016-03-07    2016-04-22
    Time periods
    Start date    End date
    2016-03-16    2016-03-16
    Mode of data collection
    • Face-to-face [f2f]
    Data Collection Notes

    de facto, CENSUS DAY: March 7, 2016

    Data Access

    Access authority
    Name
    Statistics South Africa
    Confidentiality
    Is signing of a confidentiality declaration required? Yes.

    Confidentiality declaration text

    IPUMS International distributes integrated microdata of individuals and households only by agreement of collaborating national statistical offices and under the strictest of confidence. Before data may be distributed to an individual researcher, an electronic license agreement must be signed and approved. To gain access to the data, a researcher must agree to the following:

    (1) Implement security measures to prevent unauthorized access to census microdata. Under IPUMS International agreements with collaborating agencies, redistribution of the data to third parties is prohibited.

    (2) Use the microdata for the exclusive purposes of scholarly research and education. Researchers must explicitly agree to not use microdata acquired for any commercial or income-generating venture.

    (3) Maintain the confidentiality of persons, households, and other entities. Any attempt to ascertain the identity of persons or households from the microdata is prohibited. Alleging that a person or household has been identified is also prohibited.

    (4) Report all publications based on these data to IPUMS International, which will in turn pass the information on to the relevant national statistical agencies.

    Once a project is approved, a password is issued and data may be acquired through the Internet. Penalties for violating the license include: revocation of the license, recall of all microdata acquired, filing of a motion of censure to the appropriate professional organizations, and civil prosecution under the relevant national or international statutes.

    These safeguards mirror the principles from the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality. Employees of the Minnesota Population Center who work with the census microdata to produce the harmonized database also sign agreements to respect the confidentiality of the data. IPUMS International works with each country's statistical office to minimize the risk of disclosure of respondent information.

    The details of the confidentiality protections vary across countries, but in all cases, names and detailed geographic information are suppressed and top-codes are imposed on variables such as income that might identify specific persons. In addition, IPUMS International uses a variety of technical procedures to enhance confidentiality protection. These include the following:

    (1) Swapping an undisclosed fraction of records from one administrative district to another to make positive identification of individuals impossible.

    (2) Randomizing the placement of households within districts to disguise the order in which individuals were enumerated or the data processed.

    (3) Aggregating codes of sensitive characteristics (e.g., grouping together very small ethnic categories).

    (4) Top- and bottom-coding continuous variables to prevent identification of extreme cases.

    The safety record for public-use census microdata is apparently perfect. In almost four decades of use, there has not been a single verified breach of statistical confidentiality. The measures implemented by IPUMS International are designed to extend this record.
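
    Top- and bottom-coding, one of the technical procedures listed above, can be sketched as clamping a continuous variable to a fixed range so that extreme values no longer stand out. The thresholds and income values below are hypothetical; the actual cut-offs applied by IPUMS International are not given in this record.

```python
def top_bottom_code(values, lower, upper):
    """Clamp each value into the closed range [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return [min(max(v, lower), upper) for v in values]


# Hypothetical incomes with one extreme low and one extreme high case.
incomes = [-500, 1200, 45000, 990000]
coded = top_bottom_code(incomes, lower=0, upper=100000)
print(coded)
```

    Values inside the range are left unchanged; only the extreme cases are replaced by the boundary codes.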
    Access conditions

    An adapted version of the dataset, harmonized for international comparability, is available from IPUMS International (https://international.ipums.org/international/) under the following conditions:

    IPUMS International distributes integrated microdata of individuals and households only by agreement of collaborating national statistical offices and under the strictest of confidence. Before data may be distributed to an individual researcher, an electronic license agreement must be signed and approved. To gain access to the data, a researcher must agree to the following:

    (1) Implement security measures to prevent unauthorized access to census microdata. Under IPUMS International agreements with collaborating agencies, redistribution of the data to third parties is prohibited.

    (2) Use the microdata for the exclusive purposes of scholarly research and education. Researchers must explicitly agree to not use microdata acquired for any commercial or income-generating venture.

    (3) Maintain the confidentiality of persons, households, and other entities. Any attempt to ascertain the identity of persons or households from the microdata is prohibited. Alleging that a person or household has been identified is also prohibited.

    (4) Report all publications based on these data to IPUMS International, which will in turn pass the information on to the relevant national statistical agencies.

    Once a project is approved, a password is issued and data may be acquired through the Internet. Penalties for violating the license include: revocation of the license, recall of all microdata acquired, filing of a motion of censure to the appropriate professional organizations, and civil prosecution under the relevant national or international statutes.

    These safeguards mirror the principles from the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality. Employees of the Minnesota Population Center who work with the census microdata to produce the harmonized database also sign agreements to respect the confidentiality of the data.

    Citation requirements

    Steven Ruggles, Lara Cleveland, Rodrigo Lovaton, Sula Sarkar, Matthew Sobek, Derek Burk, Dan Ehrlich, Quinn Heimann, Jane Lee. Integrated Public Use Microdata Series, International: Version 7.5 [dataset]. Minneapolis, MN: IPUMS, 2024. https://doi.org/10.18128/D020.V7.5

    Researchers should also acknowledge the statistical agency that originally produced the data: South Africa, Statistics South Africa. Community Survey 2016

    The licensing agreement for use of IPUMS International data requires that users supply IPUMS International with the title and full citation for any publications, research reports, or educational materials making use of the data or documentation.

    Copies of such materials are also gratefully received at ipums@umn.edu.

    Printed matter should be sent to:
    IPUMS International
    Minnesota Population Center
    University of Minnesota
    50 Willey Hall
    225 19th Avenue South
    Minneapolis, MN 55455

    Disclaimer and copyrights

    Disclaimer

    The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.

    Copyright

    (c) Copyright 2016, Statistics South Africa and Minnesota Population Center

    Contacts

    Contacts
    Name
    Statistics South Africa

    Metadata production

    DDI Document ID

    DDI_ZAF_2016_PHC_v01_M_v7.5_A_IPUMS

    Producers
    Name     Abbreviation    Affiliation                Role
    IPUMS    IPUMS           University of Minnesota    Integration, Harmonization, Documentation
    Date of Metadata Production

    May 21, 2024

    Metadata version

    DDI Document version

    Version 7.5 October 2024. NEW FEATURES.

    --Historical data from NAPP project now available from IPUMS-International.
    --Historical census data from Canada, Denmark, the United Kingdom, Germany, Iceland, Norway, Sweden, and the United States for the period 1703 to 1911 are now available from IPUMS-International. The complete count and sample datasets were previously disseminated by the North Atlantic Population Project (NAPP). Where possible, the data have been integrated into existing IPUMS-International variable coding schema. Some new variables have been created that are available only for these pre-1960 datasets. NAPP data users should note that many NAPP variables are available from IPUMS-International by different names. For a complete list of NAPP variables that have been renamed in IPUMS-International, refer to the crosswalk.
    --Individual country shapefiles for the third-level administrative level of geography are now available for a few IPUMS samples.
    --New spatially harmonized previous-residence variables at the second administrative level of geography are available for several samples in this data release. More information is available here. Users should note that many older migration variables are available by different names. Refer to this table for a crosswalk of old and corresponding new migration variables.
    --IPUMS now hosts the Census Mosaic data collection. Census Mosaic identifies, gathers, harmonizes, and distributes surviving historical census microdata from regions of Continental Europe where complete centralized records are not available. The Mosaic project was founded by a consortium of historical social scientists in Europe. Data can be downloaded as static files from the Census Mosaic website. Although the data are not yet integrated fully into IPUMS International, variables have been standardized and harmonized to be roughly compatible with IPUMS coding structures.

    NEW SAMPLES.

    --Full-count datasets for Great Britain 1851, 1861, 1871 (Scotland only), 1891, and 1901.
    --Full-count datasets for Sweden 1910 and Denmark (1845, 1880, and 1885).
    --Labor force surveys from Spain and eight new labor force surveys from Italy added to the series.

    Newly added countries:
    Benin, Cote d'Ivoire, Finland, Guatemala, Honduras, Laos, Lesotho, Mauritius, Myanmar, Papua New Guinea, Russia, Slovak Republic, Suriname, Togo, and Zimbabwe

    New samples for:
    Bolivia, Cambodia, Chile, Cuba, Cote d'Ivoire, Egypt (1848 and 1868, historical samples), Fiji, Guinea, Ireland, Israel, Italy, Lao PDR, Mexico, Morocco, Nepal, Netherlands, Palestine, Peru, Philippines, Puerto Rico, Rwanda, Senegal, Sierra Leone, South Africa, Switzerland, Uganda, United Kingdom, United States, Vietnam, and Zimbabwe

    SUPPLEMENTAL DATA.

    Data from censuses from Benin and Lesotho that record individual fertility and/or mortality events were made available in IPUMS-International. These files can be downloaded and linked to data produced by the extract system.

    NEW VARIABLES.

    --IPUMS-International now provides harmonized and year-specific geography variables for all countries including 13 new samples from Dominican Republic, Germany, Indonesia, Israel, Malaysia, Mongolia, Nicaragua, Nigeria, Palestine, Paraguay, Thailand, United Kingdom, and Uruguay. First-level and second-level year specific geography variables are also available for all countries. IPUMS provides corresponding, downloadable GIS boundary files for all harmonized and year specific geography variables. More information about IPUMS geography variables is available here.
    --IPUMS International now provides spatially harmonized previous-residence variables at the first administrative level of geography. The codes for the spatially harmonized previous-residence variables match the spatially harmonized place of current residence. More information is available here.
    --IPUMS International provides spatially harmonized previous-residence variables at the first administrative level of geography for all samples; previously available country-specific migration variables at the first administrative level that were not fully harmonized spatially have been phased out. Spatially harmonized previous-residence variables at the second administrative level of geography are available for selected samples. More information is available here. Users should note that many older migration variables are available by different names. Refer to this table for a crosswalk of old and corresponding new migration variables.
    --IPUMS International now provides spatially harmonized previous-residence variables at the first administrative level of geography for all samples. Spatially harmonized previous-residence variables at the second administrative level of geography are available for several samples in this data release. More information is available here. Users should note that many older migration variables are available by different names. Refer to this table for a crosswalk of old and corresponding new migration variables.
    --Lower (third) level geography codes and GIS files have been added for Bangladesh, China, Ethiopia, Mali, Rwanda, and Zimbabwe. Some geography codes and labels might have changed for these countries to accommodate the newer lower level geography.
    --Added more detailed 3-digit industry and occupation variables for China 2000.

    EDITED SAMPLES.

    --Revised full-count data for Great Britain 1881
    --Revised full-count datasets for Sweden 1890 and 1900. The revision includes the following changes that improve comparability across Sweden datasets:
    --Revisions to certain ethnicity and work variables (and the underlying source data): ORIGIN, LABFORCE, OCCHISCO, OCRELATE, OCSTATUS.
    --Revisions to unharmonized source variables: SE1890A_HISCOSE, SE1890A_HISCRELSE, SE1890A_HISCSTATSE, SE1890A_OCCMULTISE, SE1900A_HISCOSE, SE1900A_HISCRELSE, SE1900A_HISCSTATSE, SE1900A_OCCMULTISE.
    --A new United States 1850 full-count dataset now matches the corresponding dataset distributed by the USA IPUMS data project. The source variable US1850A_0502 (HISTID) provides a linking key to match person records to the USA version of the data. The IPUMS International version of the data contains names, which the USA version cannot distribute.
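
    The linking described for the United States 1850 dataset amounts to a join on a shared key. The sketch below assumes each file has been loaded as a list of dictionaries, with the key stored under US1850A_0502 on the IPUMS International side and under HISTID on the USA side; the record values, the other field names, and the function name are hypothetical.

```python
def link_records(left, right, left_key, right_key):
    """Inner-join two lists of dict records where left_key equals right_key."""
    # Index the right-hand file by its key for constant-time lookups.
    right_index = {row[right_key]: row for row in right}
    linked = []
    for row in left:
        match = right_index.get(row[left_key])
        if match is not None:
            # Fields from both files are merged into one record.
            linked.append({**row, **match})
    return linked


# Hypothetical records; real HISTID values look different.
international = [
    {"US1850A_0502": "h1", "name": "A. Example"},
    {"US1850A_0502": "h2", "name": "B. Example"},
]
usa = [
    {"HISTID": "h2", "occupation": "farmer"},
    {"HISTID": "h3", "occupation": "clerk"},
]

linked = link_records(international, usa, "US1850A_0502", "HISTID")
print(linked)
```

    Only records present in both files are kept; an unmatched key on either side is simply dropped from the result.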

    EDITED VARIABLES.

    An error affecting HHWT for South Africa 2007 was corrected. The existing values were adjusted by a factor of 0.01.

    AGEMARR was edited to add data for Hungary 1980 and 1990.

    Harmonized and year-specific geography variables for Brazil and Colombia have been edited to accommodate the availability of refined municipal boundaries. Users should be aware that codes and labels have changed in all harmonized and year-specific geography variables for these two countries.
    Errors affecting BPLSE2 (formerly BPLPARSE) for Sweden 1890 and the underlying source variable were corrected. Several thousand cases were incorrectly coded as 258101000. These cases have been updated with the correct code: 258171000.

    Harmonized geography variables for Italy, Philippines, Rwanda, and United States have been edited to accommodate new samples. Users should be aware that codes and labels have changed in all harmonized and year-specific geography variables for these countries. More information about IPUMS geography variables is available here.
    The codes for the source variable RW2002A_0419 were corrected to include 0 and 8 as possible responses, which were previously identified as 'unknown years' within primary education.

    Errors affecting EDUCFJ for Fiji 2006 were corrected.
    A problem with PERWT for Tanzania 2012 was corrected. The previous weights were adjusted to properly reflect population totals.

    MOMLOC, POPLOC, and PARRULE were updated for the United States 2010 and 2015 samples to include additional information on subfamilies. Prior to this correction, persons above age 17 were not receiving links to their co-resident mothers and fathers.

    An error affecting codes for the URBAN variable in Egypt 1986 for Cairo, Alexandria, Port-Said, and Suez was corrected.

    An error in INCEARN affecting Venezuela 2001 was corrected. Earned income in the source variable VE2001A_0440 is interpreted as a monthly amount, so the adjustments previously applied to convert data from daily or weekly income were suppressed.

    All the six Brazil samples in IPUMS International were replaced with higher density samples.

    An edited version of the Chile 2017 sample was introduced to correct an error in household breaks.

    Errors affecting codes for GEO1_ZA in South Africa 2011 and ENUTS1 in United Kingdom 1991 were corrected.

    Harmonized geography variables for Cambodia, Fiji, and Nepal have been edited to accommodate new samples. Users should be aware that codes and labels have changed in all harmonized and year-specific geography variables for these countries. More information about IPUMS geography variables is available here.
    An error in PERWT affecting Nepal 2001 was corrected.
    Errors affecting a code in GQ for Brazil 2010 and Indonesia 2010 were corrected. Both census samples now identify 1-person units created by splitting a large household.

    An error in MARRNUM affecting Indonesia 1976 was corrected. Some codes for GEO1_EG2006 and GEO2_EG2006 were edited.

    Harmonized geography variables for Bolivia, Cuba, Guinea, Ireland, Morocco, Palestine, Senegal, South Africa, and Uganda have been edited to accommodate new samples. Users should be aware that codes and labels have changed in all harmonized and year-specific geography variables for these countries. More information about IPUMS geography variables is available here.
    An error in INCEARN affecting Brazil 1980 was corrected.
    An error in EDATTAIN affecting Ireland 1971 and 1981 was corrected.

    A small proportion of person records in Mexico 1960 were re-classified in MIGRATEP based on information about their current and previous residence. These were previously coded to 'different major administrative unit', even though their place of residence suggests that their last move was within the same major administrative unit.
    The second-level technician (higher) degrees for Spain 1991, 2001, and 2011 were re-classified into post-secondary technical education in EDATTAIN.
    An error affecting codes for SEX for Egypt 1848 and 1868 was corrected. The values for male and female had been reversed.

    A problem with HHWT and PERWT for Canada 2011 was corrected. The previous weights were adjusted to properly reflect population totals.
    Harmonized geography variables for Cambodia, Lao PDR, Mexico, Peru, Switzerland, Vietnam, Puerto Rico, United Kingdom, and United States have been edited to accommodate new samples. Users should be aware that codes and labels have changed in all harmonized and year-specific geography variables for these countries. More information about IPUMS geography variables is available here.

    Harmonized geography variables for Chile and Sierra Leone have been edited to accommodate new samples. Users should be aware that codes and labels have changed in all harmonized and year-specific geography variables for these countries. More information about IPUMS geography variables is available here.
    An error affecting codes for COMPUTER for Senegal 2013 was corrected.
    An error affecting labels available in IND for Peru 1993 was corrected.
    An error affecting codes for persons previously residing abroad for MIG1_5_BO in Bolivia 2001 and 2012 was corrected.
    EDUCAR, EDATTAIN, and YRSCHOOL were adjusted in the Argentina samples to incorporate information on completion of education levels in the data harmonization.
    HHWT and PERWT were calibrated in Kenya 1979 to properly reflect the population distribution by province.
    In GQ (group quarters status), persons residing in hospitals of all types were reclassified to 'institutional group quarters' from 'other group quarters,' making their treatment consistent with GQTYPE.

    Errors affecting codes for BPLBJ2 in Benin 1979, 1992, and 2002 were corrected.
    Errors affecting codes for GEO2_BR1970 in Brazil 1970 were corrected.
