The MICS survey, carried out by the National Statistics Office (INE) with technical and financial support from the United Nations Children's Fund (UNICEF), provides new data that establish a representative information base on health, nutrition, water, hygiene and sanitation, education, and demography, among other areas. First, these statistics are intended to update the statistical baseline on the living conditions of the Angolan population. In future they will be used by the sectors mentioned above in their regular planning, programming, monitoring and evaluation tasks. This survey is the first and most extensive data collection operation of its kind to be carried out across the country as a whole since Independence, and as such we believe that it will fulfil this objective. Secondly, the results presented here can serve as a point of departure for more detailed studies, which could contribute to a better understanding of the factors that determine the living conditions of the population, and to a better definition of programmes and policies that favour children, other vulnerable groups and the most disadvantaged. To further these objectives, INE will be able to make the database available to interested parties in order to facilitate more specific research.
Kind of Data
Sample survey data [ssd]
The survey was national in scope, covering all of the country's provinces, urban and rural areas, and areas that were at that time administered and controlled either by the Government or by UNITA.
Producers and sponsors
Authoring entity/Primary investigators
Instituto Nacional de Estadística
United Nations Children's Fund
Department for International Development - British Government
National Committee of UNICEF of Portugal
National Committee of UNICEF of Spain
Interviews were carried out in 4,337 households during the fieldwork stage, which lasted from August to December 1996.
The sampling plan for MICS was intended to yield a multi-purpose sample covering six extended regions, defined as the research areas on the basis of UNICEF's intervention plan in Angola on the one hand and their geographical features on the other.
From this sample it was possible to obtain estimates at the national level and for each of the six geographically defined regions. Estimates at a more disaggregated level were not advisable, since the results would risk no longer being representative.
Because of the war, the country faced factors such as a displaced population and difficult access to certain localities. The last population census dates from 1983-84; its data are out of date and, moreover, cover only part of the country. For this reason, all available information from various sources was used to construct the sampling frame from which the Primary Sampling Units (PSUs) were selected. These sources included the Electoral Register/Census of 1992, information from the Ministry of Territorial Administration (MAT) and the Provincial Governments, and the social and economic provincial profiles prepared for the donors' round table in Brussels in 1995 for UNDP.
With the exception of the first stratum of the region "Capital C" (made up of the Province of Luanda), the selection of families for the research was probabilistic, with three selection stages. In each region a potentially self-weighted sample of households was selected, though this self-weighting property could be lost due to various factors, especially variations in population estimates.
The unit used in the first stage (PSU) was the "comuna" (the smallest administrative area in Angola) whose selection within each region was made independently, systematically and with probability proportional to the estimated size of the population.
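As an illustration of the first-stage procedure described above, the following Python sketch shows one common way of carrying out systematic selection with probability proportional to size. It is not taken from the survey documentation: the function name `pps_systematic`, the `start` parameter and the example population figures are assumptions made for illustration only.

```python
import random


def pps_systematic(sizes, n, start=None):
    """Select n units systematically with probability proportional to size.

    sizes: estimated population of each unit, in sampling-frame order.
    start: optional random start in [0, step); drawn at random if omitted.
    Returns the indices of the selected units. A unit larger than the
    sampling interval can be selected more than once.
    """
    total = sum(sizes)
    step = total / n                      # sampling interval
    if start is None:
        start = random.random() * step    # random start in [0, step)
    selected = []
    cumulative = 0                        # population of all units before `index`
    index = 0
    for k in range(n):
        point = start + k * step          # k-th selection point on the cumulative scale
        # Advance until the selection point falls inside the current unit.
        while index < len(sizes) - 1 and cumulative + sizes[index] <= point:
            cumulative += sizes[index]
            index += 1
        selected.append(index)
    return selected


# Hypothetical frame of four comunas with estimated populations.
example = pps_systematic([100, 300, 50, 550], n=2, start=120)
```

With a fixed start of 120 and an interval of 500, the selection points 120 and 620 fall inside the second and fourth units of this hypothetical frame.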
The village in rural areas, or the neighbourhood in urban areas, was the unit used in the second stage (SSU). It was selected without replacement from a list of accessible villages compiled from information collected by the regional co-ordinators. The selection was generally proportional to the number of inhabitants of each village. Where population information was lacking, the procedure was as follows:
1. When no information was available on the population of the villages, the selection was made by simple random sampling from the list (enumeration method);
2. When neither a list of villages nor any information on their population existed, the map was divided into 20 parts and a point on it was selected at random (map method).
Finally, the family constituted the Third Stage Unit (TSU), selected without replacement and with equal probability within each selected village. The method used by PAV (Extended Vaccination Programme) was applied to barrios or neighbourhoods outside Luanda. This method consists of spinning a bottle to select a random direction. The first family to be surveyed is then selected at random along this direction, and the remaining families are those closest to the first.
In the case of Luanda, the sample was probabilistic with two selection stages. The unit used in the first stage was the census section from the Demographic Census of 1983/84, updated by INE in 1995 when the Priority Survey on Household Living Conditions (IPCVD) was carried out. The primary sampling units were selected independently and systematically, with probability proportional to the number of dwellings. The secondary sampling unit was the family, selected without replacement and with equal probability within each selected census section, using a complete list of the families in that section.
The final probability of selection for each household is the product of the selection probabilities at each stage. Weighted analysis of the results was used to produce national and regional estimates and to correct for the information used in selecting PSUs and SSUs at the next selection stage.
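The product rule above can be sketched in a few lines of Python. This is a minimal illustration, not the survey's own procedure: the function names and the three stage probabilities are hypothetical, and taking the weight as the inverse of the selection probability is the usual convention, assumed here because the source does not say how the weights were computed.

```python
import math


def selection_probability(stage_probabilities):
    """Overall selection probability: product of the per-stage probabilities."""
    return math.prod(stage_probabilities)


def design_weight(stage_probabilities):
    """Design weight, assumed here to be the inverse of the selection probability."""
    return 1.0 / selection_probability(stage_probabilities)


# Hypothetical three-stage example: comuna, village, family.
weight = design_weight([0.5, 0.1, 0.2])
```

Under these made-up probabilities a sampled household would stand for roughly 100 households in the population.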
The sample size was defined at a 95% confidence level for estimating proportions of the key variables of the research, based on information available to UNICEF. The level of precision was 5%, except for some variables linked to breast-feeding, for which narrower age groups were used; in these cases the level of precision was 8%.
The necessary sample size was estimated separately for each of these key variables. The resulting sizes differed considerably from variable to variable, and in the end the largest was adopted. This gives the other variables a higher level of accuracy than originally expected. In other words, estimates can be obtained from the survey data with a maximum error of plus or minus 5%, except for the variables related to breast-feeding, where the maximum error is plus or minus 8%.
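The source does not give the formula used for these calculations. As a rough guide, the standard sample size formula for estimating a proportion can be sketched as follows; the function name, the prevalence of 0.5 and the optional `deff` multiplier are assumptions for illustration, not values reported by the survey.

```python
import math


def required_sample_size(p, margin, z=1.96, deff=1.0):
    """Sample size needed to estimate a proportion p within +/- margin.

    z=1.96 corresponds to a 95% confidence level. deff is an optional
    design-effect multiplier (1.0 means simple random sampling).
    """
    n_srs = (z ** 2) * p * (1 - p) / margin ** 2   # simple random sample size
    return math.ceil(n_srs * deff)                 # round up to whole units


# Hypothetical worst-case prevalence of 50% at the two precision levels.
n_general = required_sample_size(p=0.5, margin=0.05)
n_breastfeeding = required_sample_size(p=0.5, margin=0.08)
```

Running the calculation once per key variable and keeping the largest result mirrors the approach described in the text.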
The "Design Effect" (Deff) is a factor that relates the variance obtained under a complex sampling design using clusters to the variance of a simple random sample of the same size.
In defining the sample size, a Deff of 2 was assumed as the lowest value and 10 as the highest, the highest value being used only for Water and Sanitation.
In the data analysis, 95% confidence intervals were calculated for the main indicators using Epi Info 6, which calculates the value of Deff directly from the data.
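To show how a design effect widens a confidence interval, here is a minimal Python sketch based on the normal approximation for a proportion. It is not the algorithm used by Epi Info 6; the function name and the example values (a proportion of 0.5, 735 families, Deff of 2) are assumptions chosen only to make the arithmetic concrete.

```python
import math


def proportion_confidence_interval(p, n, deff=1.0, z=1.96):
    """Approximate confidence interval for a proportion under a design effect.

    The simple-random-sample variance p(1-p)/n is multiplied by deff,
    and the interval is clipped to the range [0, 1].
    """
    standard_error = math.sqrt(deff * p * (1 - p) / n)
    half_width = z * standard_error
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical regional estimate: p = 0.5 from 735 families with Deff = 2.
low, high = proportion_confidence_interval(p=0.5, n=735, deff=2.0)
```

In this hypothetical case the half-width comes out at about 5 percentage points, whereas with Deff equal to 1 it would be about 3.6 points.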
The sample size was fixed at 4,410 families distributed equally among the six regions, giving 735 families, 21 primary units (PSUs) and 21 secondary units (SSUs) in each region. Thus 35 families were chosen in each selected secondary unit.
In summary, the size of the national sample was defined as: (21 clusters per region) × (35 families per cluster) × (6 regions) = 4,410 families.
Deviations from the Sample Design
MICS is the first survey since the country's independence to be carried out on a national scale. During its implementation it was necessary to call on the co-operation and help of a large number of organisations in order to overcome a whole series of political and logistical difficulties.
In spite of this help it was not always possible to reach the selected “comunas” and in some cases it was not possible to have access to all the villages which constitute the “comuna”. This lack of access was generally due to mines, collapsed bridges or lack of security.
A total of 28 of the initially selected “comunas” proved to be inaccessible. Each was replaced by the nearest accessible “comuna”, with “nearest” defined in terms of the distance between the main towns and villages of the “comunas”. Replacing inaccessible “comunas” maintained the sample size in each region, with the nearest “comuna” standing in for the one that had been replaced.
Obviously, the situation in the replaced “comunas” is likely to be different from, and probably worse than, that in the “comunas” used to replace them.
This means that the survey estimates cannot be taken to represent the whole Angolan population or the whole population of each region, but only the population that was accessible. In the analysis of results, weights could be used to adjust the estimates approximately for the inaccessible part of the population, but exact information about that population was never available. All the data should be seen in this light.
However, if it is considered that the population of these “comunas” might have been overestimated on the basis of the survey, and that some of them were practically under-populated, then the proportion of the initial sample that was lost can be estimated at between 10% and 20%. The regions with the greatest access problems were the East and the Central South, and data for these regions may well be affected.
Dates of Data Collection (YYYY/MM/DD)
Mode of data collection
Seventeen people in the computing department entered the MICS data from September until the end of December 1996, under the oversight of the co-ordinator of the IT area and the IT supervisors. The IMPS and SPSS PC software packages were used to process the data. The data were subsequently edited, a task that lasted almost two and a half months and was carried out jointly by the analysis team and the computer processing team. The editing served to:
- Calculate the identification numbers of the individuals and their families;
- Calculate ages using the dates of birth of those interviewed;
- Make links between the modules, especially between the list of family members (Module B) and the modules for individuals, in order to evaluate losses at the individual level;
- Carry out coherence tests on the questions and check any incoherent data in the questionnaires;
- Examine the distribution of variables for extreme or improbable frequency values and check questionnaires.
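Two of the editing steps listed above, calculating ages from dates of birth and testing the coherence of responses, can be sketched in Python as follows. The source does not describe the actual rules applied in IMPS or SPSS, so the function names, the one-year tolerance and the example dates are assumptions for illustration.

```python
from datetime import date


def completed_years(date_of_birth, interview_date):
    """Age in completed years on the day of the interview."""
    if date_of_birth > interview_date:
        raise ValueError("date of birth is after the interview date")
    years = interview_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the interview year.
    if (interview_date.month, interview_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def age_is_coherent(reported_age, date_of_birth, interview_date, tolerance=1):
    """Coherence test: reported age agrees with the date of birth within a tolerance."""
    return abs(reported_age - completed_years(date_of_birth, interview_date)) <= tolerance


# Hypothetical child born in October 1990 and interviewed in September 1996.
age = completed_years(date(1990, 10, 15), date(1996, 9, 1))
```

Records that fail such a test would be flagged for checking against the paper questionnaire rather than corrected automatically.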
The quality of the data was evaluated by the National Co-ordinating Committee during the fieldwork stage and by the IT team during data entry, and also by the analysis team during editing and analysis.
The following problems were identified:
- Analysis of problems related to the fieldwork reports;
- Difficulties in the field;
- Some of the interviewers had difficulty in understanding technical words, e.g. the meaning of “drains”;
- Difficulty in drawing up calendars of local events due to lack of information; in UNITA areas, national events could not be used, as they relate to political events and dates decided on by the party in power;
- The interviewers tended not to use the ages calendar;
- The more elderly people interviewed did not know their dates of birth;
- Difficulties with the module on women (Module D), in which women had to report past births;
- Nearly 300 cases were not accounted for in the last version of Module A due to IT problems.
During the analysis it was necessary to look at questions where the proportion of responses "Don't know", "No reply" or without any type of reply (missing) was high. Amongst them there were notably questions relating to nutrition (anthropometry) with 5% of "don't knows" or with no response. To a great extent evaluation of quality was carried out using reports from control visits undertaken by the technical team (INE and UNICEF); some problems were detected during data editing carried out by the same technical team. Evaluation was made of the co-ordinators and of the information provided on ages.
Use of the dataset must be acknowledged using a citation that includes:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download of the data files (for datasets obtained on-line)
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.