The primary objective of the Cambodia Socio-Economic Survey (CSES) 1997 was to obtain data for measuring living standards across geographic strata and different segments of Cambodian society. Other objectives were to provide information needed by a variety of users, such as government institutions, donor agencies, and non-government organizations, and to help the National Institute of Statistics (NIS) train its staff in planning, designing, and conducting a household-based survey system and institutionalize its survey-taking capability. A further important objective was to expand the scope of the survey to meet the data needs of a wide variety of users, thereby minimizing the duplication of household surveys and promoting the acceptance of CSES as the national household survey programme.
Specifically, the survey had the following objectives:
i) To provide data required for the measurement of living standards through a single source of data for a comprehensive and detailed analysis of living standards and poverty in Cambodia.
ii) To provide information on school facilities, schooling and enrollments, cost of education and related information.
iii) To provide information on health issues, utilization of health facilities and costs incurred in treating illnesses.
iv) To provide information on demographic and economic characteristics of the population such as age-sex distribution, marital status, fertility, mortality, literacy, employment and incomes.
v) To derive information on socio-economic conditions of villages including infrastructure and access to education and health facilities.
vi) To establish survey taking capability within NIS for the Institute to conduct multi-objective large scale household-based survey programmes.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
Producers and sponsors
Authoring entity/Primary investigators
National Institute of Statistics
Ministry of Planning
Project Executing Agency
United Nations Development Programme
Swedish International Development Cooperation Agency
A sampling frame based on a national population census was not available for Cambodia. The sampling frame used instead was a list of villages and village populations prepared by the United Nations Transitional Authority in Cambodia (UNTAC) for the general elections held in 1993, which had been updated for the household surveys conducted during the past few years.
As in other recent household surveys, the coverage of the survey had to be restricted for security reasons: the excluded areas were considered not safe for enumerators to conduct fieldwork. Accordingly, two provinces and a number of communes in 15 other provinces were excluded from the frame. The truncated frame used for the survey covered 100% of the villages in Phnom Penh, 91.2% of villages in the Other Urban towns and 86.3% of the Rural villages. The proportions of excluded households were lower, and amounted to only 4.8% of households in other urban areas and 11.6% of households in the rural sector.
A two-stage stratified random sampling design was adopted, with villages as primary sampling units (PSU's) and households as secondary sampling units (SSU's). Considering the socio-economic stratification, the spread of the items canvassed, and the sample size of the survey, Cambodia was divided into 3 strata, viz. Phnom Penh, Other Urban areas and Rural areas. The frame, which had villages grouped by commune and communes grouped by district and province, in effect provided an implicit stratification of the universe for the probability proportional to size (PPS) systematic random sampling procedure adopted in the selection of the PSU's. The procedure also provided for the preparation of estimates for the four geographic zones of the country, namely the Plains, Tonle Sap Lake, Plateau and Mountains, and Coastal regions.
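The PPS systematic selection of villages described above can be illustrated with a minimal Python sketch. It is not the NIS implementation: the function name, the use of a single random start, and the example sizes are assumptions made for illustration. Keeping the units in frame order (villages within communes within districts) is what gives the implicit stratification, because the selection points are spread evenly along the cumulated size measure.

```python
import random
from itertools import accumulate


def pps_systematic_sample(sizes, n_select, rng=None):
    """Return indices of units chosen by PPS systematic sampling.

    sizes    -- size measure of each unit, listed in frame order
    n_select -- number of units (PSUs) to select
    rng      -- optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    total = sum(sizes)
    interval = total / n_select            # sampling interval
    start = rng.random() * interval        # random start in [0, interval)
    cumulative = list(accumulate(sizes))   # running total of sizes

    selected = []
    unit = 0
    for k in range(n_select):
        point = start + k * interval       # k-th selection point
        # Advance to the unit whose cumulated range contains the point.
        while unit < len(cumulative) - 1 and cumulative[unit] <= point:
            unit += 1
        selected.append(unit)
    return selected


# Hypothetical village sizes (households), in frame order.
village_sizes = [50, 150, 100, 300, 200, 200]
chosen = pps_systematic_sample(village_sizes, 3, random.Random(0))
```

A unit whose size exceeds the sampling interval can be hit by more than one selection point; in practice such large units are usually handled separately, for example by segmenting them.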
Deviations from the Sample Design
The sampling design for the CSES 1997 considered several factors including the precision of data required by the users, the capacity of the national statistics office to conduct the survey, and most importantly the time constraint imposed to complete survey field work before the end of July 1997. Taking into account these factors, and especially the experience gained from the two socio-economic surveys conducted in 1993/94 and 1996, including estimates of feasible workloads, a sample of 6000 households to be selected from 474 villages was considered to be sufficient and manageable.
The design also took into consideration the need for separate analyses of three geographical domains, namely Phnom Penh, other urban areas aggregated together, and the rural area. In deciding the sample allocation to the three domains, it was decided that a size of around 1000 households would be adequate for the first two domains and the rest should be allocated to Domain 3 - Rural area, since it was envisaged that more detailed analysis of the poverty groups in this domain would be undertaken.
Despite the length of the questionnaire, respondents cooperated with the survey staff and provided answers to both questionnaires, and a 100% response rate was achieved. At this stage it is not possible to comment on item non-response, on the completeness of the information provided by respondents, or on respondent fatigue arising from the length of the interviews, which may have had a bearing on these issues.
The estimates have been formed by weighting the data from the sample households to provide estimates that relate to all households in each domain. The weighting factors were calculated based on the probabilities of selection for the sample.
The design weights are used to compensate for differences in the selection probabilities. The weight for the PSU is inversely proportional to its selection probability.
The probability of selection of the $j$th household in normal size PSU's and blocks in the $h$th domain is

$$p_h(i) \times p_h(j \mid i) = p_h(ij) \tag{Eq. 3}$$

where $p_h(i) = a_h M_{hi} / M_h$
and $p_h(j \mid i) = n_h / M^{*}_{hi}$

Thus the design weights $w_{hij}$ for these units are

$$w_{hij} = \frac{1}{p_h(ij)} = \frac{M_h \times M^{*}_{hi}}{a_h \times M_{hi} \times n_h} \tag{Eq. 4}$$

For the large PSU's which were segmented, the probability of selection of the $j$th household in the $s$th segment in the $i$th PSU in the $h$th domain is

$$p_h(i) \times p_h(s \mid i) \times p_h(j \mid is) = p_h(isj) \tag{Eq. 5}$$

where $p_h(i) = a_h M_{hi} / M_h$
$p_h(s \mid i) = 1 / s_i$
and

$$p_h(j \mid is) = n_h / M^{*}_{his} \tag{Eq. 6}$$

The design weight for such large PSU is

$$w_{hisj} = \frac{1}{p_h(isj)} = \frac{M_h \times M^{*}_{his} \times s_i}{a_h \times M_{hi} \times n_h} \tag{Eq. 7}$$
The design for CSES is not self-weighting, and it is therefore necessary to compute a weight for each PSU, block or segment selected in the sample; these weights have to be used in the estimation procedure.
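The weight formulas (Eq. 4 and Eq. 7) and their use in estimation can be sketched in Python as follows. The section does not define its symbols, so the readings used here are assumptions: a_h as the number of PSUs selected in domain h, M_hi as the frame size measure of PSU i, M_h as the total size measure of the domain, the starred quantity as the number of households listed in the PSU or segment, n_h as the number of households selected per PSU, and s_i as the number of segments. The function names and example figures are hypothetical.

```python
def design_weight(M_h, a_h, M_hi, n_h, M_listed, s_i=1):
    """Inverse of the overall selection probability of a household.

    With s_i == 1 and M_listed = Mhi*, this is Eq. 4.
    With s_i > 1 and M_listed = Mhis*, this is Eq. 7.
    """
    p_psu = a_h * M_hi / M_h        # first stage: PPS selection of the PSU
    p_segment = 1 / s_i             # one segment chosen out of s_i
    p_household = n_h / M_listed    # second stage: households from the listing
    return 1 / (p_psu * p_segment * p_household)


def weighted_mean(values, weights):
    """Weighted estimate of a mean from sample values and design weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


# Hypothetical figures, not taken from the survey.
w_normal = design_weight(M_h=100_000, a_h=100, M_hi=200, n_h=10, M_listed=250)
w_segmented = design_weight(M_h=100_000, a_h=100, M_hi=200, n_h=10,
                            M_listed=150, s_i=4)
```

Because the listed count in the weight can differ from the frame size used at the first stage, weights vary from PSU to PSU, which is why the design is not self-weighting.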
Dates of Data Collection (YYYY/MM/DD)
Mode of data collection
The supervisor is responsible for:
(i) administering the Village Questionnaires (Form 2),
(ii) preparing the two Household Questionnaires for each village (for example, completing certain information on the Cover Page of each questionnaire, as described in this manual),
(iii) checking all completed questionnaires to ensure that they have been filled in completely and properly, and
(iv) making random visits to households that have been interviewed by an interviewer to make sure that the answers are consistent with the completed questionnaire.
The supervisor is also expected to occasionally observe interviewers while they are conducting household interviews, especially during the first one or two weeks of the field work.
The district-level supervisor is responsible for checking the village questionnaires and for monitoring the survey's overall progress in those villages.
Type of Research Instrument
Four questionnaires were used in the survey.
Form 1: Household Listing Form was used to prepare the current list of households for sampling.
Form 2: Village Questionnaire was used to collect village level data on socio-economic infrastructure and facilities including prices and wages from key informants.
Form 3: Core Questionnaire was used to collect demographic and socio-economic characteristics of the population.
Form 4: Social Sector Module was designed to collect detailed information on education and health service utilization and related household expenditures.
All completed questionnaires were brought to NIS for processing. Although completed questionnaires were checked and edited by supervisors in the field, manual editing and coding by trained staff was accepted as an essential priority activity for producing a clean data file without delay, especially given the length of the questionnaires and the complexity of the topics covered. In all, 39 staff comprising 35 processing staff and 4 supervisors were trained for three days by the project staff. An instruction manual for manual editing and coding was prepared and translated into Khmer for the guidance of processing staff. Manual processing of questionnaires commenced in mid-August 1997.
To produce an unedited data file, with the data keyed in as recorded by field enumerators and supervisors (without subjecting the data to manual edit, as required by the Analysis Component Project staff), it was necessary to structure manual editing as a two-phase operation. In the first phase, the processing staff coded the questions that required coding, such as those on migration, industry, and occupation. Editing was restricted to selected structural edits and some error corrections. These edits were restricted to checking the completeness and consistency of responses, legibility, and totaling of selected questions. Error corrections were made without canceling or obliterating the original entry made by the enumerator, by inserting the correction close to the original entry.
Much of the manual editing was carried out in the second phase, after key entry, one hundred percent verification, and extraction of error printouts. A wide range of errors had to be corrected, which was expected in view of the complexity of the survey and the skill background of the enumeration and processing staff. The manual edits involved the correction of errors arising from incorrect key entry, incorrect or missing identification, miscoding of answers, failure to follow skip patterns, misinterpretation of measures, range errors, and other consistency errors.
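Two of the error types listed above, range errors and failures to follow skip patterns, lend themselves to automated checks of the kind that produce error printouts. The following Python sketch shows the idea only. The field names, codes, and limits are hypothetical and are not taken from the CSES questionnaires or from the NIS processing system.

```python
# Hypothetical edit rules: valid ranges, and fields that must be blank
# when a filter question has a given answer.
RULES = {
    "ranges": {"age": (0, 110)},
    "skips": [("attends_school", 2, "school_fees")],  # 2 = "no" (assumed code)
}


def check_record(record, rules=RULES):
    """Return a list of error messages for one keyed-in record."""
    errors = []
    # Range check: value must lie within the permitted limits.
    for field, (low, high) in rules["ranges"].items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            errors.append(f"range: {field}={value} outside [{low}, {high}]")
    # Skip-pattern check: skipped question must not carry an answer.
    for filter_field, filter_value, skipped_field in rules["skips"]:
        if (record.get(filter_field) == filter_value
                and record.get(skipped_field) is not None):
            errors.append(
                f"skip: {skipped_field} should be blank "
                f"when {filter_field}={filter_value}"
            )
    return errors


bad = {"age": 130, "attends_school": 2, "school_fees": 5000}
good = {"age": 12, "attends_school": 1, "school_fees": 5000}
```

Here `check_record(bad)` reports one range error and one skip-pattern error, while `check_record(good)` returns an empty list. Flagged records would still need to be resolved against the paper questionnaire.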
The results provide estimates at the level of the three domains (Phnom Penh, other urban areas, and the rural sector) into which the entire geographical area covered by the survey was divided. The survey design has provided for statistically reliable estimates for most characteristics at these levels of stratification.
The expenditure data from CSES 1997 presented here are not strictly comparable with the data from the SESC 1993/94, which canvassed very detailed data on consumer expenditure. SESC 1993/94 collected data on over 450 items of consumption expenditure, the type of information required to establish weights in the construction of consumer price indices. At that level of disaggregation it is possible to achieve results closer to actual consumption levels. Such surveys are required only infrequently, once every 5 to 7 years, because of the costs and time involved in designing, conducting and processing them. CSES 1997 used a shorter list of 33 commonly used consumer items that were considered adequate to monitor consumption expenditure over time. In addition to this issue arising from differences in the scope of the two surveys, researchers should take note of the decline in household size and changes in household structure, which are important determinants of household expenditure.
National Institute of Statistics
Each dataset has an "Access policy". The NIS recommends three levels of accessibility:
- Public use files, accessible to all
- Licensed datasets, accessible under conditions
- Datasets only accessible in a data enclave, for the most sensitive and confidential data.
1. The data and other materials will not be redistributed or sold to other individuals, institutions, or organizations without the written agreement of the National Institute of Statistics.
2. The data will be used for statistical and scientific research purposes only. They will be used solely for reporting of aggregated information, and not for investigation of specific individuals or organizations.
3. No attempt will be made to re-identify respondents, and no use will be made of the identity of any person or establishment discovered inadvertently. Any such discovery would immediately be reported to the National Institute of Statistics.
4. No attempt will be made to produce links among datasets provided by the National Institute of Statistics, or among data from the National Institute of Statistics and other datasets that could identify individuals or organizations.
5. Any books, articles, conference papers, theses, dissertations, reports, or other publications that employ data obtained from the National Institute of Statistics will cite the source of data in accordance with the Citation Requirement provided with each dataset.
6. An electronic copy of all reports and publications based on the requested data will be sent to the National Institute of Statistics.
Use of the dataset must be acknowledged using a citation which would include:
- the Identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download of the data files (for datasets obtained on-line)
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
National Institute of Statistics
Ministry of Planning
Development Economics Data Group
Documentation of the DDI
Date of Production
Version 01 (June 2011) - Adopted from "KHM-NIS-CSES-1997-v1" DDI. Source http://www.nis.gov.kh/nada/?page=catalog