Sampling Procedure
The 1998 South African Demographic and Health Survey (SADHS) covered the population living in private households in the country. The design for the SADHS called for a representative probability sample of approximately 12,000 completed individual interviews with women between the ages of 15 and 49. It was designed principally to produce reliable estimates of demographic rates (particularly fertility and childhood mortality rates), of maternal and child health indicators, and of contraceptive knowledge and use for the country as a whole, the urban and the non-urban areas separately, and for the nine provinces. As far as possible, estimates were to be produced for the four South African population groups. Also, in the Eastern Cape province, estimates of selected indicators were required for each of the five health regions.
In addition to the main survey of households and women 15-49 that followed the DHS model, an adult health module was administered to a sample of adults aged 15 and over in half of the households selected for the main survey. The adult health module collected information on oral health, occupational hazards, and chronic diseases of lifestyle.
SAMPLING FRAME
The sampling frame for the SADHS was the list of approximately 86,000 enumeration areas (EAs) created by Central Statistics (CSS, now Statistics South Africa, SSA) for the Census conducted in October 1996. The EAs ranged from about 100 to 250 households and were stratified by province, by urban and non-urban residence, and by EA type. The number of households in the EA served as the measure of size of the EA.
CHARACTERISTICS OF THE SADHS SAMPLE
The sample for the SADHS was selected in two stages within each stratum. Because the census data were confidential, the sampling was carried out by experts at the CSS according to specifications developed by members of the SADHS team. The primary sampling units (PSUs) corresponded to the EAs and were selected with probability proportional to size (PPS), the size being the number of households residing in the EA or, where this was not available, the number of census visiting points in the EA. This led to 972 PSUs being selected for the SADHS (690 in urban areas and 282 in non-urban areas).
Where provided by SSA, the lists of visiting points together with the households found in these visiting points, or alternatively a map of the EA showing the households, were used as the frame for second-stage sampling to select the households to be visited by the SADHS interviewing teams during the main survey fieldwork. This sampling was carried out by the MRC on behalf of the SADHS working group. If neither a list of visiting points nor a map was available from SSA, the survey team took a systematic sample of visiting points in the field. In an urban EA ten visiting points were sampled, while in a non-urban EA twenty visiting points were sampled. The survey team then interviewed the household in the selected visiting point. If there were two households in the selected visiting point, both households were interviewed. If there were three or more households, the team randomly selected one household for interview.
In each selected household, a household questionnaire was administered; all women between the ages of 15 and 49 were identified and interviewed with a woman questionnaire. In half of the selected households (identified by the SADHS working group), all adults aged 15 and over were also identified and interviewed with an adult health questionnaire.
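The field procedure for EAs without a list or map can be illustrated with a minimal Python sketch. The source does not say how the systematic sample of visiting points was drawn, so the fractional interval and random start used here are assumptions, and the function names `systematic_sample` and `households_to_interview` are hypothetical.

```python
import random

def systematic_sample(items, n, rng):
    """Equal-probability systematic sample of n items from an ordered list.

    Assumption: a fractional interval with a random start; the report
    does not describe the exact field procedure.
    """
    if n >= len(items):
        return list(items)
    step = len(items) / n            # sampling interval (may be fractional)
    start = rng.random() * step      # random start in [0, step)
    last = len(items) - 1
    # min() guards against floating-point round-off at the end of the list
    return [items[min(int(start + k * step), last)] for k in range(n)]

def households_to_interview(households, rng):
    """Visiting-point rule from the text: one or two households are all
    interviewed; with three or more, one is chosen at random."""
    if len(households) <= 2:
        return list(households)
    return [rng.choice(households)]

# Hypothetical urban EA with 150 visiting points: sample ten of them.
rng = random.Random(1998)
visiting_points = list(range(1, 151))
sampled_points = systematic_sample(visiting_points, 10, rng)
```

In a non-urban EA the same call would use `n=20`, following the counts given above.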
SAMPLE ALLOCATION
Except for Eastern Cape, the provinces were stratified by urban and non-urban areas, for a total of 16 sampling strata. Eastern Cape was stratified by the five health regions and urban and non-urban within each region, for a total of 10 sampling strata. There were thus 26 strata in total.
Originally, it was decided that a sample of 9,000 women 15-49 with complete interviews, allocated equally to the nine provinces, would be adequate to provide estimates for each province separately; results of other demographic and health surveys have shown that a minimum sample of 1,000 women is required to obtain estimates of fertility and childhood mortality rates with an acceptable level of sampling error. However, one of the objectives of the SADHS was also to provide separate estimates for each of the four population groups, and an allocation of 1,000 women per province would not provide enough cases for the Asian population group, which represented only 2.6 percent of the population (according to the results of the 1994 October Household Survey conducted by SSA). The decision was therefore taken to add a sample of 1,000 women to the urban areas of KwaZulu-Natal and Gauteng, where Asians are mostly found, to try to capture as many Asian women as possible. A more specific sampling scheme to obtain an exact number of Asian women was not possible for two reasons: the population distribution by population group was not yet available from the 1996 census, and, according to SSA, the sampling frame of EAs could not be stratified by population group because the old system of identifying EAs by population group had been abolished.
An additional sample of 2,000 women was added to Eastern Cape at the request of the Eastern Cape province, which funded this additional sample. In Eastern Cape, results can be given by urban and non-urban areas. Results of selected indicators such as contraceptive knowledge and use can also be produced separately for each of the five health regions, but not for urban/non-urban within health region.
The result is an allocation of the target sample of 12,000 women by province and by urban/non-urban residence. Within each province, the sample was allocated proportionately to the urban and non-urban areas.
In this allocation, the urban areas of KwaZulu-Natal have been oversampled by about 57 percent while those of Gauteng have been oversampled by less than 1 percent. For comparison, a proportional allocation of the 12,000 women to the nine provinces would result in a completely self-weighting sample but would not allow for reliable estimates for at least four provinces (Northern Cape, Free State, Mpumalanga and North-West).
The number of households to be selected for each stratum was calculated as follows:
$$\text{Number of households} = \frac{\text{Number of women}}{\text{Women per household} \times \text{Overall response rate}}$$
- According to the 1994 October Household Survey, the estimated number of women 15-49 per household is 1.2. The overall response rate was assumed to be about 80 percent; that is, of the households selected for the survey only 90 percent would be successfully interviewed, and of the women identified in the households with completed interviews, only 90 percent would have a complete woman questionnaire (0.9 × 0.9 = 0.81). Using these two parameters in the above equation, we would expect to select approximately 12,500 households in order to yield the target sample of women.
- The number of sample points (or clusters) to be selected for each stratum was calculated by dividing the number of households in the stratum by the average "take" in the cluster. In the SADHS, each cluster corresponds to a census EA. Analytical studies of surveys of the same nature suggest that the optimum number of women to be interviewed is around 20-25 in each urban cluster and 30-35 in each non-urban cluster. However, it was decided that these numbers would be lower for the SADHS, given the practice of a small cluster "take" in surveys conducted in South Africa and given that field costs are generally reasonable. The distribution of sample points (EAs) was therefore based on selecting 10 households in each urban cluster and 20 households in each non-urban cluster.
- Some rearrangement was then necessary so that each stratum contained an even number of EAs. This is recommended for calculating sampling errors using Taylor linearization, in which the first step is to form pairs of homogeneous clusters.
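The arithmetic in the steps above can be sketched in Python. This is an illustration only: the function names are hypothetical, the stratum household counts in the usage lines are made-up inputs and not figures from the survey, and rounding an odd cluster count up to the next even number is an assumption, since the text says only that "some rearrangement" was made.

```python
import math

def households_needed(target_women, women_per_household, response_rate):
    """Households to select so that the expected number of completed
    woman interviews equals target_women."""
    return target_women / (women_per_household * response_rate)

def clusters_for_stratum(stratum_households, take, force_even=True):
    """Number of EAs for a stratum given the per-cluster household 'take'.

    Assumption: odd counts are rounded up to the next even number so that
    clusters can be paired for sampling-error calculation.
    """
    n = math.ceil(stratum_households / take)
    if force_even and n % 2 == 1:
        n += 1
    return n

# Parameters stated in the text: 12,000 women, 1.2 women per household,
# overall response rate of about 80 percent.
total_households = round(households_needed(12_000, 1.2, 0.80))  # 12500

# Hypothetical strata (illustrative household counts only).
urban_eas = clusters_for_stratum(430, take=10)      # urban take of 10 households
non_urban_eas = clusters_for_stratum(400, take=20)  # non-urban take of 20 households
```

As a rough consistency check using the PSU counts reported earlier, 690 urban EAs at 10 households plus 282 non-urban EAs at 20 households gives 12,540 households, close to the 12,500 target.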
In the Eastern Cape, the sample was distributed equally among the five health regions since estimates are required at the level of health region. Within each health region the sample was distributed proportionally to urban/non-urban according to the distribution of population in 1993. Table A7 shows the proposed number of EAs to be selected.
- In allocating the number of EAs to the five health regions of the Eastern Cape, we tried to follow the rule of an even number of clusters per sampling stratum while aiming for a regional sample of approximately 600 households (resulting in about 600 women aged 15-49).
STRATIFICATION AND SYSTEMATIC SELECTION OF EAS
Stratification and selection of the EAs for the SADHS were done by CSS according to the following specifications. Explicit stratification of the EAs was by province and by urban/non-urban within province, except in Eastern Cape, where the strata were the urban and non-urban areas of each of the five health regions. EAs that contained only institutions such as prisons and mine hostels were excluded from the sampling frame. Within each EA type, the EAs were ordered according to geographic or administrative units as adopted by SSA for the census. The required number of EAs was then selected independently within each explicit stratum, with probability proportional to size. The measure of size used for selection was the number of households enumerated in each EA by the census.
The selection procedure that SSA used in each explicit stratum was as follows:
1. calculating the selection interval for the EAs:
$$I = \frac{M_i}{a}$$
where $I$ is the selection interval, $M_i$ is the size of the stratum (total number of households or population in the stratum according to the census) and $a$ is the number of EAs to be selected in the stratum;
2. calculating the cumulated size of each EA;
3. calculating the series of sampling numbers R, R+I, R+2I, ..., R+(a-1)I, where R is a random number between 1 and I;
4. comparing each sampling number with the cumulated sizes.
The first EA to be selected was the first EA on the list whose cumulated size was equal to or greater than the first sampling number. The second EA to be selected was the next EA on the list (after the first selected one) whose cumulated size was equal to or greater than the second sampling number, and so on.
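The four steps above can be sketched in Python as follows. This is a minimal illustration, not the program used by SSA: the function name `pps_systematic` and the EA sizes in the usage lines are hypothetical, and the sketch assumes the selection interval is at least 1 and that far fewer EAs are selected than the stratum contains.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def pps_systematic(sizes, a, rng):
    """Return list positions of `a` EAs selected by systematic PPS.

    sizes : household counts of the EAs, in their ordered list position
    a     : number of EAs to select in the stratum
    rng   : random.Random-like object providing uniform(lo, hi)
    """
    cumulated = list(accumulate(sizes))    # step 2: cumulated size of each EA
    interval = cumulated[-1] / a           # step 1: I = M_i / a
    start = rng.uniform(1, interval)       # step 3: random start R between 1 and I
    selected = []
    lo = 0                                 # search only after the last selected EA
    for k in range(a):
        number = start + k * interval      # sampling number R + k*I
        # step 4: first EA from position `lo` whose cumulated size >= number
        idx = bisect_left(cumulated, number, lo)
        idx = min(idx, len(cumulated) - 1) # guard against float round-off
        selected.append(idx)
        lo = idx + 1
    return selected

# Hypothetical stratum of ten EAs (household counts are illustrative only).
ea_sizes = [120, 200, 150, 250, 100, 180, 220, 160, 140, 210]
chosen = pps_systematic(ea_sizes, a=4, rng=random.Random(1996))
```

Because larger EAs cover a longer stretch of the cumulated scale, they are more likely to contain a sampling number, which is what makes the selection probability proportional to size.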