In October 2001, South Africans were enumerated to collect information on persons and households throughout the country, using a uniform methodology.
The data collected covered each household and each person present in the household on Census night, as well as the services available to the household. Data were also captured on household residents, on residents of hostels and other types of collective living quarters, and on individuals who spent Census night in institutions and hotels.
Kind of Data
Census/enumeration data [cen]
Unit of Analysis
The units of analysis for the South African Census 2001 were households and individuals.
v1.1: Edited, anonymised dataset for licensed distribution
Version 1.0 of the South African Census 2001 was released by Statistics South Africa in 2004.
This version, version 1.1, was downloaded from Statistics South Africa's website on 24 October 2011.
Version 1.1 differs from the previous version in the following way:
(1) Geography variables, formerly provided in a separate data file, are now included in the "Person" and "Household" files.
(2) All variable names have been changed: the prefix "qp" has been changed to "p" for all variables in the "Person" file, the prefix "qh" has been changed to "h" for all variables in the "Household" file, and original variable names have been changed further, e.g. "qh29_telephone" is now "h29_tele".
(3) Values for "Missing" responses have been changed: missing values denoted with "." in version 1.0 are now represented by "999" (missing), and some have been redefined as "Not Applicable".
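For users moving analysis code from version 1.0 to version 1.1, the renaming and missing-value changes listed above could be handled along the following lines. This is only a minimal sketch: the helper names (`rename_variable`, `recode_missing`) and the variable "qp02_age" are hypothetical, and the prefix-only fallback is an assumption, since the documentation notes that some names were changed beyond the prefix. Only "qh29_telephone", "h29_tele", the "p" and "h" prefixes and the 999 code come from the description above.

```python
# Code used for "missing" in version 1.1 (was "." in version 1.0).
MISSING_CODE = 999

# Explicit old-to-new name map. Names were shortened beyond the prefix change,
# so an explicit map is safer than a blanket prefix rewrite. Only this one
# pair is documented; any further entries would have to come from the codebook.
RENAMES_V10_TO_V11 = {"qh29_telephone": "h29_tele"}


def rename_variable(old_name):
    """Return the version 1.1 name for a version 1.0 variable name."""
    if old_name in RENAMES_V10_TO_V11:
        return RENAMES_V10_TO_V11[old_name]
    # Assumed fallback: apply only the documented prefix change.
    for old_prefix, new_prefix in (("qp", "p"), ("qh", "h")):
        if old_name.startswith(old_prefix):
            return new_prefix + old_name[len(old_prefix):]
    return old_name


def recode_missing(record, variables):
    """Return a copy of `record` with 999 replaced by None in `variables`."""
    cleaned = dict(record)
    for name in variables:
        if cleaned.get(name) == MISSING_CODE:
            cleaned[name] = None
    return cleaned


if __name__ == "__main__":
    print(rename_variable("qh29_telephone"))
    print(recode_missing({"h29_tele": 999}, ["h29_tele"]))
```

The recode is applied only to variables named explicitly, because 999 may be a legitimate value in other variables; the codebook should be checked before treating it as missing anywhere else.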
The South African Census 2001 dealt with the following topics:
Household characteristics, including dwelling type, home ownership, household assets, access to services and energy sources;
Individuals' characteristics, including age, population group, language, religion, citizenship, migration, fertility, mortality and disability, as well as means of travel; and economic characteristics of individuals, including employment status and employment activities.
This dataset is a 10% sample dataset. It includes all variables from data obtained via the census questionnaire, as well as derived variables and imputation flags. Geographic type is excluded from the final sample. Instead, two additional geographical variables are supplied: urban and rural (Census 1996 classification), and size and density of locality.
The South African Census 2001 has national coverage.
To preserve confidentiality, enumeration area (EA) numbers are excluded from the dataset, and the lowest geographical level in the original Census 2001 dataset is sub-place. As further assurance of the confidentiality of the data, municipalities with 200 or fewer households are logically grouped with adjacent municipalities. The following municipalities are grouped:
There are approximately 21 000 sub-place units across the country. Many of these sub-places cover large areas and populations, and are therefore not useful for research requiring detailed geography. In response to requests for geographically disaggregated Census 2001 data, Statistics SA has provided small area statistics data based on Census 2001. This national product is the only product provided to users who request data at a level lower than sub-place. The product is based on a small area layer (SAL) that was created by combining all EAs with a population of less than 500 with adjacent EAs within the same sub-place. The final SAL consists of 56 255 polygons. Apart from the SAL, the product also contains all the higher levels of geography.
The South African Census 2001 covered every person present in South Africa on Census night, 9-10 October 2001, including all de jure household members and residents of institutions.
Producers and sponsors
Statistics South Africa
The data in the South African Census 2001 dataset is a 10% unit level sample drawn from Census 2001 as follows:
• A 10% sample of households in housing units.
• A 10% sample of collective living quarters (both institutional and non-institutional) and the homeless.
• A sample consisting of all persons in the households and collective living quarters, and the homeless, drawn for the samples described above.
• A sample consisting of all mortality information for the households in housing units drawn in the 10% sample of households.
The 10% household and person sample files each contain a weight variable. This weight variable is the adjustment factor for the undercount (for households or persons, as appropriate) multiplied by 10 to inflate the 10% samples to the relevant population. In the person records, aggregated totals of sparsely populated codes, such as very old ages, might differ substantially from real totals due to sampling fluctuations – no scaling of the weights was done. In the household records aggregated totals will be approximately equal to real totals.
Mortality was not adjusted for undercount and therefore there is no weight variable in the Mortality data file.
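The weighting described above can be illustrated with a short sketch. It assumes each household or person record carries its weight under a key named `weight` (a hypothetical name; the actual variable name should be taken from the data files), and that the weight equals the undercount adjustment factor multiplied by 10, as stated above. The adjustment factors and ages in the example are invented purely for illustration.

```python
def sample_weight(undercount_factor):
    """Weight as described above: undercount adjustment factor times 10."""
    return undercount_factor * 10


def weighted_total(records, weight_key="weight", predicate=None):
    """Estimate a population total by summing the weights of matching records."""
    total = 0.0
    for record in records:
        if predicate is None or predicate(record):
            total += record[weight_key]
    return total


if __name__ == "__main__":
    # Two illustrative person records with made-up adjustment factors.
    persons = [
        {"age": 30, "weight": sample_weight(1.5)},
        {"age": 70, "weight": sample_weight(1.25)},
    ]
    print(weighted_total(persons))
    print(weighted_total(persons, predicate=lambda r: r["age"] >= 65))
```

Estimates for sparsely populated categories, such as very old ages, rest on few sample records and can therefore differ substantially from the full census totals. The Mortality file has no weight variable, so this approach does not apply to it.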
Dates of Data Collection
Data Collection Mode
Data Collection Notes
The enumeration primarily took place over the period 10 October to 30 October 2001. However, in some situations it was necessary to continue enumeration through to November 2001 to ensure that as many people as possible were included.
Three questionnaires were administered for the South African Census 2001: questionnaire A (for persons in households), questionnaire B (for persons in institutions) and questionnaire C (for institutions). The household questionnaire covered household characteristics, such as dwelling type, home ownership, household assets, access to services and energy sources. A component of the questionnaire captured fertility data. Both the household questionnaire and the persons-in-institutions questionnaire collected data on individuals' characteristics, including age, population group, language, religion, citizenship, migration, mortality and disability, as well as means of travel. Economic characteristics of individuals included employment activities and data on unemployment.
The following publication can be consulted for a detailed account of the editing undertaken for the South African Census 2001:
Computer editing specifications / Statistics South Africa. Pretoria: Statistics South Africa, 2003. 369p. [Report No. 03-02-43 (2001)]. ISBN 0-621-34566-0.
Adjusting for undercount
In every census, some people, households, or even entire EAs are bound to be missed, and some people are counted twice. During November 2001, a post-enumeration survey (PES) was undertaken to determine the degree of undercount or overcount in Census 2001. A separate publication describing the methodology of the PES is available for those interested in the details. The numbers and percentages relating to households and hostels in all Census 2001 products are adjusted according to the PES findings through the application of weights. Data relating to other collective living quarters are not weighted, as the PES did not cover these.
However, this version is the ten percent sample dataset, which provides raw and weighted data for a small sample of questionnaires, and was not adjusted with regard to the undercount.
As part of the quality check for Census 2001, a Post-Enumeration Survey (PES) was conducted in November 2001, approximately one month after the census. Fieldworkers revisited a scientifically selected sample of almost 1% of the census enumeration areas to carry out an independent recount. The published census results are adjusted for undercount according to the findings of the PES. In addition to the check on coverage, the PES also involved an independent re-measurement of the basic characteristics of the population. Details on this process are available in the publication:
Statistics South Africa. 2004. Census 2001: post-enumeration survey: results and methodology. Report no. 03-02-17 (2001).
Publications based on datasets distributed by DataFirst should acknowledge relevant sources by means of bibliographic citations. To ensure that such source attributions are captured for social science bibliographic utilities, citations must appear in footnotes or in the reference section of publications. The bibliographic citation for this dataset is:
Statistics South Africa. South African Census, 2001 [dataset]. Version 1.1. Pretoria: Statistics South Africa [producer], 2003. Cape Town: DataFirst [distributor], 2011.
The information products and services of Statistics South Africa are protected in terms of the Copyright Act, 1978 (Act 98 of 1978). As the State President is the holder of State copyright, all organs of State enjoy unhindered use of the Department's information products and services, without a need for further permission to copy in terms of that copyright. Where a copy of the information is made available to any third party outside the State, the third party must be made aware of the existence of State copyright and ownership of the information by the State. The State (through Statistics SA) retains the full ownership of its information, products and services at all times; access to information does not give ownership of the information to the client. The use of any data is subject to acknowledgement of Stats SA as the supplier and owner of copyright. Statistics South Africa (Stats SA) will not be liable for any damages or losses, except to the extent that such losses or damages are attributable to a breach by Stats SA of its obligations in terms of an existing agreement or to the negligence or wilful act or omissions of the Stats SA, its servants or agents, arising out of the supply of data and or digital products in terms of that agreement. The user indemnifies Stats SA against any claims of whatsoever nature (including legal costs) by third parties arising from the reformatting, restructuring, reprocessing and/or addition of the data, by the user.
Copyright 2003, Statistics South Africa
DDI Document ID
University of Cape Town
Date of Metadata Production
DDI Document version
Version 1.2 (August 2011) additional metadata added and metadata document id changed.
Version 1.3 (August 2012) metadata on the Small Area Statistics spatial data added in the coverage section.