Sampling Procedure
The 2010 MDHS called for a nationally representative sample of about 25,600 interviews of women between the ages of 15 and 49. The survey was designed to provide information on fertility and childhood mortality, family planning, maternal and child health, knowledge and behaviour regarding AIDS and other sexually transmitted infections (STIs), domestic violence, and HIV prevalence and other health issues among the adult population.
Administratively, Malawi is divided into 28 districts. The sample was designed to provide estimates in 27 districts for most health and demographic indicators. The district of Likoma is small and therefore was combined with Nkhata Bay. Indicators are also shown for the Northern, Central, and Southern Regions of the country.
- Northern Region: Chitipa, Karonga, Likoma, Mzimba, Nkhata Bay, and Rumphi
- Central Region: Dedza, Dowa, Kasungu, Lilongwe, Mchinji, Nkhotakota, Ntcheu, Ntchisi, and Salima
- Southern Region: Balaka, Blantyre, Chikhwawa, Chiradzulu, Machinga, Mangochi, Mulanje, Mwanza, Neno, Nsanje, Phalombe, Thyolo, and Zomba
In addition, a men's survey was conducted in a subsample of one in three households selected for the women's survey. All men age 15-54 in the subsample of households were eligible for the men's survey. The men's survey was designed to collect information on family planning, knowledge and behaviour regarding AIDS and other STIs, and adult health issues. All men age 15-54 and all women age 15-49 in the households selected for the men's survey were also eligible for HIV testing.
SAMPLING FRAME
The sampling frame used for the 2010 MDHS was based on summary data for the enumeration areas (EAs) of the 2008 Malawi Population and Housing Census (PHC). The sampling frame consists of 9,145 EAs throughout the nation. Maps delineating the EA boundaries were created. Of the 9,145 EAs, 1,076 are urban and 8,069 are rural. The EA size (i.e., number of regular households in the EA or village) varies from 0 to 954, with an average of 249 households. The sampling frame was stratified into the 27 districts. Within each of the districts, the sampling frame was further stratified by urban and rural areas.
SAMPLE ALLOCATION
Sample allocation plays an important part in sample design because it relates to the survey precision at the national level. In the absence of accurate information on the main survey indicators at the domain level, the best allocation is proportional allocation, in which each domain's sample is proportional to its population size. Because the desired sample size at the national level is large (at least 27,200 households), survey precision at the national level was not the only goal for the design of the 2010 MDHS. Rather, given the number of study domains (27 domains), the survey precision at the domain level was an important objective for the 2010 MDHS.
To ensure comparability across the study domains, the sample size for each domain should be similar. Because the districts vary widely in population size, however, proportional allocation could not be used: it would lead to very different levels of precision between the estimates for these districts. The initial plan for the sample design included a flat sample of 1,000 households per district. However, this plan was revised to allow for a larger sample size in the districts of Lilongwe and Blantyre because these two districts contain the major urban centers in the country. The sample size in these two districts was increased to 1,300 households each. To keep the national target at approximately the same number of households (27,345), the target sample size in the eight smallest districts was decreased from 1,000 households to 950. Using this approach, the larger domains would be undersampled and the smaller domains would be oversampled to achieve accurate representation of each domain. Given the small size of the urban population (10 percent), oversampling is applied to urban areas to ensure that the survey precision is comparable across urban and rural areas.
The sample allocation between urban and rural areas is a power allocation, which is an allocation between proportional allocation and equal size allocation. A power value is applied to achieve a satisfactory sample size. Oversampling or undersampling any particular domain does not pose any problems for representativeness if sampling weights are properly calculated and applied in tabulation.
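The idea of a power allocation can be illustrated with a minimal sketch. The report does not state the power value that was used, so the exponent, the stratum populations, and the function name `power_allocation` below are hypothetical and chosen only for illustration. A power of 1 reproduces proportional allocation, a power of 0 reproduces equal-size allocation, and values in between give a compromise that oversamples the smaller stratum.

```python
def power_allocation(populations, total_sample, power):
    """Split total_sample across strata in proportion to population ** power.

    power = 1.0 gives proportional allocation, power = 0.0 gives equal
    allocation, and intermediate values fall between the two.
    """
    weights = [p ** power for p in populations]
    total_weight = sum(weights)
    return [total_sample * w / total_weight for w in weights]


# Hypothetical domain: 10 percent urban, 90 percent rural, 1,000 households.
populations = [100_000, 900_000]  # [urban, rural], illustrative figures only

proportional = power_allocation(populations, 1000, power=1.0)
equal = power_allocation(populations, 1000, power=0.0)
compromise = power_allocation(populations, 1000, power=0.5)  # assumed exponent

for label, alloc in [("proportional", proportional),
                     ("equal", equal),
                     ("power 0.5", compromise)]:
    print(label, [round(a) for a in alloc])
```

With these illustrative figures, the urban share of the sample rises from 10 percent under proportional allocation to about 25 percent under a power of 0.5.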
The sample allocation must be converted to a number of primary sampling units (PSUs). It was decided to select 20 households in an urban cluster and 35 households in a rural cluster. The total number of clusters is 849, with 158 urban clusters and 691 rural clusters. The total number of households selected is 27,345, with 3,160 urban households and 24,185 rural households.
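The household totals follow directly from the cluster counts and the fixed take per cluster. The short check below uses only the figures quoted above; the variable names are illustrative.

```python
# Figures quoted in the text.
URBAN_CLUSTERS, RURAL_CLUSTERS = 158, 691
HH_PER_URBAN_CLUSTER, HH_PER_RURAL_CLUSTER = 20, 35

urban_households = URBAN_CLUSTERS * HH_PER_URBAN_CLUSTER
rural_households = RURAL_CLUSTERS * HH_PER_RURAL_CLUSTER
total_clusters = URBAN_CLUSTERS + RURAL_CLUSTERS
total_households = urban_households + rural_households

print(total_clusters, urban_households, rural_households, total_households)
```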
SAMPLING PROCEDURE AND UPDATING OF THE SAMPLING FRAME
The 2010 MDHS sample is a stratified sample selected in two stages. Stratification is achieved by separating each study domain into urban and rural areas. Areas are defined as urban or rural based on the classification in the 2008 Malawi PHC. Therefore, the 27 domains are stratified into a total of 54 sampling strata.
Samples are selected independently in every stratum, by a two-stage selection. This means that 54 independent samples were selected, one from each sampling stratum. Implicit stratification was achieved at each of the lower geographical or administrative levels by sorting the sampling frame according to the geographical/administrative order and by using probability proportional to size selection in the first stage of sampling. The explicit and implicit stratifications together guarantee a better scattering of the sampled points.
In the 2010 MDHS design the primary sampling units (PSUs) are the enumeration areas (EAs) from the 2008 Malawi PHC, and the secondary sampling units (SSUs) are the households.
In the first stage of selection for the 2010 MDHS, the 849 EAs were selected with probability proportional to EA size, where the EA size is the number of households the EA contains. After this selection and before the data collection, a household listing operation was conducted during May-June 2009 in all of the selected 849 EAs. The listing operation consisted of visits to every selected EA. During the visits, every structure found on the ground was recorded; structures were identified by type (residential or not); the number of households in each residential structure was recorded; and a location map and a sketch map were drawn to show the boundaries of the EA and the location of each structure within it. A household list was set up for each selected EA (or PSU). The resulting lists of households served as the sampling frame for the selection of households in the second stage.
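First-stage selection with probability proportional to size is commonly carried out systematically on a sorted frame, which also produces the implicit stratification described earlier. The sketch below shows that general technique within a single stratum. It is not the survey's actual selection program: the function name `pps_systematic`, the EA sizes, and the number of EAs drawn are all hypothetical.

```python
import random


def pps_systematic(sizes, n, rng=None):
    """Select n units from a sorted frame with probability proportional to size.

    sizes: household counts of the EAs, in frame (geographic) order.
    Returns the indices of the selected EAs. An EA larger than the sampling
    interval can be selected more than once.
    """
    rng = rng or random.Random()
    total = sum(sizes)
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)

    selected = []
    cumulative = 0  # households in all EAs before the current index
    idx = 0
    for k in range(n):
        target = start + k * interval
        # Advance until the target falls inside the current EA's size range.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


# Hypothetical stratum of five EAs (one with zero households), two to select.
ea_sizes = [120, 300, 0, 250, 330]
print(pps_systematic(ea_sizes, 2, random.Random(2010)))
```

Because the selection targets are spaced evenly along the cumulated sizes, the sample spreads across the sorted frame, and an EA with zero households is skipped unless it is the last entry in the frame.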
In the second stage of selection, a fixed number of 20 households were selected in urban PSUs and 35 households were selected in rural PSUs by equal probability systematic sampling. To improve the sampling frame and minimize the task of household listing, a few large EAs were subdivided into smaller segments. During fieldwork, a few clusters were found to be dramatically smaller than they were at the time of listing. Despite selecting every household in these clusters, the sample size did not reach the predetermined number. This situation resulted in a net decrease of 38 households between the sample design and fieldwork phases of the survey. Thus, the final sample included 27,307 eligible households.
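Equal probability systematic sampling from a household listing can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in the field: the function name `systematic_sample` and the listing size are hypothetical, and the fractional-interval method shown is one common variant. When a cluster lists no more households than the fixed take, every household is selected, which mirrors the shortfall described above.

```python
import random


def systematic_sample(listing, n, rng=None):
    """Draw n households from a listing by equal probability systematic sampling.

    If the listing has n or fewer households, all of them are taken.
    """
    rng = rng or random.Random()
    if len(listing) <= n:
        return list(listing)

    interval = len(listing) / n          # fractional sampling interval (> 1)
    start = rng.random() * interval      # random start in [0, interval)
    last = len(listing) - 1
    # Guard against floating-point rounding pushing the last position too far.
    positions = [min(int(start + k * interval), last) for k in range(n)]
    return [listing[p] for p in positions]


# Hypothetical rural cluster with 100 listed households, 35 to select.
households = list(range(1, 101))
sample = systematic_sample(households, 35, random.Random(2010))
print(len(sample), sample[:5])

# Hypothetical shrunken cluster: only 10 households remain, so all are taken.
small_cluster = systematic_sample(list(range(1, 11)), 35)
print(len(small_cluster))
```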
The decision on the number of households selected per PSU is a trade-off between fieldwork efficiency and precision. All women age 15-49 in the selected households and all men age 15-54 in one-third of the selected households were eligible to be interviewed. The advantages of this two-stage selection procedure are:
- 1: The selection procedure is simple to implement and reduces possible nonsampling errors in the selection process.
- 2: It is easy to locate the selected households, reducing nonsampling errors and nonresponse.
- 3: The interviewers interview only the households in the pre-selected dwellings. No replacement of dwellings was permitted, preventing survey bias.
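For a two-stage design of this kind, a household's overall selection probability is the product of its EA's first-stage probability and its own second-stage probability within the EA, and the design weight is the inverse of that product. The sketch below illustrates this standard calculation only. It ignores segmentation of large EAs and any nonresponse adjustment, and the function name `design_weight` and all input figures are hypothetical rather than taken from the survey.

```python
def design_weight(n_clusters, ea_size, stratum_size, n_selected, n_listed):
    """Inverse of the two-stage selection probability for one household.

    n_clusters:   EAs selected in the stratum (first stage)
    ea_size:      households in the EA according to the sampling frame
    stratum_size: households in the whole stratum according to the frame
    n_selected:   households selected in the EA (second stage)
    n_listed:     households found in the EA during the listing operation
    """
    p_first = n_clusters * ea_size / stratum_size  # PPS selection of the EA
    p_second = n_selected / n_listed               # equal probability within EA
    return 1.0 / (p_first * p_second)


# Hypothetical rural stratum: 25 EAs drawn from a frame of 50,000 households;
# the EA had 250 households in the frame and 280 at listing; 35 were selected.
weight = design_weight(25, 250, 50_000, 35, 280)
print(weight)
```

In this illustration each selected household would represent 64 households in its stratum.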
MEN'S SUBSAMPLE
In the households selected for the women's survey in each PSU, a subsample of one in three households was selected for the men's survey. All men age 15-54 in the selected households were eligible for the men's survey. The men's survey was limited to a subsample of the selected households because of budget restrictions, yet the subsample still allowed men's indicators to be calculated with acceptable precision. The minimum sample size is larger for women than for men because complex indicators, such as total fertility and infant and child mortality rates, require larger sample sizes to achieve sampling errors of acceptable size, and these data come from interviews with women. The men's subsample was selected randomly from the list of selected households in each PSU. The men's sample is representative for the study domains and for the country as a whole.