The Cambodia Socio-Economic Survey (CSES) 2003-2004 is the fifth socio-economic survey conducted by the National Institute of Statistics (NIS), following the Socio-Economic Surveys of 1993/94, 1996, 1997 and 1999. The CSES was subsequently conducted annually from 2007 to 2010.
The CSES is a household survey with questions to households and the household members. In the household questionnaire there are a number of modules with questions relating to the living conditions, e.g. housing conditions, education, health, expenditure/income and labour force. It is designed to provide information on social and economic conditions of households for policy studies on poverty, household production and final consumption for the National Accounts and weights for the CPI.
The main objective of the survey is to collect statistical information about the living standards of the population and the extent of poverty. Essential areas are household production and cash income, household level and structure of consumption including poverty and nutrition, education and access to schooling, health and access to medical care, transport and communication, housing and amenities, and family and social relations. For recording expenditure, consumption and income, the Diary Method was applied for the first time. The survey also included a Time Use Form detailing the activities of household members during a 24-hour period.
The survey is also intended to provide accurate statistical information about living standards and the extent of poverty as an essential instrument to assist the government in diagnosing problems, designing effective policies for reducing poverty, and evaluating the progress of poverty reduction, which are the main priorities in the "Rectangular Strategy" of the Royal Government of Cambodia.
The 2003-2004 Cambodia Socio-Economic Survey covered the following topics:
- Basic Household Information
- Information on Migration
- Food Consumption during the Last 7 Days
- Education and Literacy
- Household Economic Activities
- Household Liabilities
- Household Income from Other Sources
- Durable Goods and Other Expenses
- Construction Activities in the Past 12 Months
- Fertility and Child Care
- Health Check of Children
- Current Economic Activity
- Household Expenditure and Consumption
- Household Income and Receipts
- Time Use
- Demographic Information
- Economy and Infrastructure
- Rainfall and Natural Disasters
- Retail Prices
- Employment and Wages
- Access to Common Property Resources during the Last 5 Years
- Sales Prices of Agricultural Land in the Village
- Recruitment of Children for Work outside the Village
Producers and sponsors
National Institute of Statistics
Ministry of Planning
The World Bank
United Nations Development Programme
Swedish International Development Agency
Statistics Sweden (SCB)
The Cambodia Socio-Economic Survey 2003-04 (CSES) was conducted on a nationally representative sample of 15,000 households in 900 sampling units (villages). The sample is divided into 15 monthly representative samples of 1,000 households in 60 villages each.
The sample was designed and implemented in March 2003. A three-stage sample design was devised. NIS already had a master sample of 600 villages based on the 1998 Population Census, and this was used. To reach the preferred number of 900 villages, the sample was extended by an additional 300 villages.
In the first stage, a sample of villages was selected in the head office. The villages were initially stratified into 45 strata (province*urban/rural). Within each stratum the villages were selected using systematic sampling with probabilities proportionate to size (PPS). The size measure used for the selection was the number of households in the village according to the 1998 Census. The resulting sample consisted of 900 villages, of which 600 are in rural areas and 300 in urban areas.
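As an illustration of the first-stage selection described above, the following is a minimal Python sketch of systematic PPS sampling within a single stratum. The function name `systematic_pps` and the example village sizes are hypothetical. It shows one common way to implement the technique, not the procedure NIS actually ran.

```python
import random
from itertools import accumulate


def systematic_pps(sizes, n, rng=None):
    """Select n units systematically with probability proportional to size.

    sizes: size measure per unit (e.g. census household counts per village).
    Returns the indices of the selected units. A unit whose size exceeds the
    sampling interval can be hit more than once.
    """
    rng = rng or random.Random()
    cumulative = list(accumulate(sizes))
    step = cumulative[-1] / n          # sampling interval
    start = rng.random() * step        # random start in [0, step)
    selected = []
    i = 0
    for k in range(n):
        point = start + k * step
        # Advance to the unit whose cumulative size range contains the point.
        while i < len(cumulative) - 1 and cumulative[i] <= point:
            i += 1
        selected.append(i)
    return selected


# Hypothetical stratum with five villages, selecting two of them.
village_sizes = [50, 200, 100, 150, 500]
print(systematic_pps(village_sizes, 2, random.Random(7)))
```

Ordering the frame geographically before the selection, as was done for Phnom Penh, spreads the selected villages across the stratum.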
In the second stage one Census Enumeration Area (EA or alternatively PSU) was selected randomly also in the head office. At the beginning of the fieldwork, all households in the selected EA were listed using a household listing form, and following internationally recommended procedures. A systematic sample of households was then drawn in a third stage. The third stage sample was 20 households in rural areas and 10 households in the urban areas.
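The third-stage draw from the household listing can be sketched in the same way. This is a minimal illustration that assumes a fractional sampling interval and a random start. The function name `systematic_sample` and the listing of 87 households are hypothetical, and the exact rule used in the field is not stated here.

```python
import random


def systematic_sample(listing, n, rng=None):
    """Draw an equal-probability systematic sample of n items from a listing."""
    rng = rng or random.Random()
    total = len(listing)
    if n >= total:
        return list(listing)           # take everything if the EA is small
    step = total / n                   # fractional sampling interval (> 1)
    start = rng.random() * step        # random start in [0, step)
    positions = (min(int(start + k * step), total - 1) for k in range(n))
    return [listing[p] for p in positions]


# Hypothetical EA listing with 87 households; a rural take of 20.
households = list(range(1, 88))
print(systematic_sample(households, 20, random.Random(1)))
```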
The work on sample design was carried out in the following areas:
- Estimation of sampling errors and design effects in the CSES 1999
- Calculation of optimal sample size within primary sampling units
- Sample size and sample allocation for CSES 2003
The work was done by a group of NIS staff in the form of expert-assisted, hands-on training in sampling design and calculation of sampling errors.
In previous surveys PSUs have been villages. It was decided to use village as PSU also for the CSES 2004 mainly because the communes were considered too large (and too few) to serve efficiently as PSUs. Another factor weighing in favor of villages was the fact that there already exists a master sample of villages at NIS.
The master sample consists of 600 villages (88 urban and 512 rural villages). The selection of villages was made with PPS sampling, hence facilitating an approximately self-weighting design with equal workloads in the villages. It was discussed whether a further stratification into 3-4 crude income-level strata should be done in urban Phnom Penh in order to secure a good spread of the sample over different income levels. It was decided not to do such stratification. Phnom Penh has a large sample (90 villages) selected with systematic sampling over a geographically ordered sample frame; this will in itself secure a reasonably good spread of PSUs.
The master sample is allocated over the strata proportionally to the total number of households in the strata. A problem with the master sample is that due to the proportional allocation the urban sample is too small to provide for good estimates in the urban domain. It was therefore decided to expand the sample to include 600 rural villages and 300 urban villages.
Secondary Sampling Units (SSU)
The 600 villages in the master sample are divided into small segments of approximately ten households each, using census enumeration area maps. The boundaries of such small segments would be difficult to identify in the field, and there would be a risk that housing units constructed after the census would be missed when households are listed within segments during the fieldwork. It was therefore decided not to use the segments in the second stage sampling. The available options in this situation are either (a) to select households directly in the village or (b) to use the enumeration areas as secondary sampling units. Selecting households directly would require a listing of all households in the village prior to the fieldwork. Such a listing would be time-consuming in large villages. It was therefore decided that enumeration areas would be used as SSUs, and that one enumeration area would be selected within each sampled village.
Villages were selected with a systematic PPS procedure within each stratum. For each sampled village one census enumeration area (EA) was selected. As the enumeration areas are roughly of the same size, the selection was done with equal probability sampling.
Ten (10) households were selected in each sampled village in the CSES 99. Calculations indicated that this sample size was close to optimum. Since the optimum is rather flat, the loss in efficiency from sample sizes of 12-15 is fairly small.
From a purely sampling efficiency point of view, a larger sample than 15 households per village should not be taken. However, factors relating to interviewers' security and well-being weighed in favor of having two interviewers per village in the rural areas. A workload of 10 households between the two interviewers in the village was considered too small. A workload of 15-20 households would be reasonable. All things taken together resulted in a sample of 10 households in urban areas (with one interviewer per village) and 20 households in rural areas.
The resulting sample consisted of 300 urban PSUs and 600 rural PSUs. From each urban PSU 10 households were selected, while 20 households were selected from each rural PSU. The sample thus contained 15,000 households to be interviewed during 15 fieldwork months, with 1,000 different households each month.
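The sample size arithmetic stated above can be checked with a few lines of Python. The constant names are illustrative only; the figures are those given in the text.

```python
URBAN_PSUS, RURAL_PSUS = 300, 600      # sampled villages by domain
URBAN_TAKE, RURAL_TAKE = 10, 20        # households selected per village
MONTHS = 15                            # fieldwork months

total_households = URBAN_PSUS * URBAN_TAKE + RURAL_PSUS * RURAL_TAKE
households_per_month = total_households // MONTHS
villages_per_month = (URBAN_PSUS + RURAL_PSUS) // MONTHS

print(total_households, households_per_month, villages_per_month)
# prints: 15000 1000 60
```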
The CSES 2004 enjoyed a response rate of almost 100 percent. The high response rate, together with close and systematic fieldwork supervision by the core group members, made a major contribution to achieving high-quality survey results.
The design weights are used to compensate for differences in the selection probabilities. The weight for the PSU is inversely proportional to its selection probability.
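A minimal sketch of the first-stage design weight under PPS selection, using the symbols Mh, nh and Mhi that appear in this description (stratum size, number of villages sampled in the stratum, and village size). The formula is the standard PPS inclusion probability and is an assumption here; the function name and the example numbers are hypothetical.

```python
def first_stage_weight(M_h, n_h, M_hi):
    """Inverse of the PPS inclusion probability n_h * M_hi / M_h."""
    inclusion_prob = n_h * M_hi / M_h
    return 1.0 / inclusion_prob


# Hypothetical stratum of 10,000 households with 20 sampled villages.
print(first_stage_weight(10_000, 20, 100))   # ordinary village: weight 5.0
print(first_stage_weight(10_000, 20, 600))   # village larger than M_h / n_h: weight below 1
```

The second call shows why very large villages are a problem: once Mhi exceeds Mh/nh the nominal inclusion probability is above 1 and the weight drops below 1.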
A further adjustment of the weights was done in order to calibrate the weights so that estimates of population totals would agree with projections based on the 1998 Population Census. Weights for households in the household file were adjusted so that estimated number of households agreed with census projections for each zone, urban and rural. Weights for individuals in the person file were adjusted so that the estimated number agreed with the projected number in each sex and age group in each zone (urban and rural).
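The calibration step can be sketched as a simple ratio adjustment within each group, so that the weighted count matches the projected total. This assumes plain post-stratification. The function name `poststratify`, the group labels and all numbers are hypothetical, and the actual adjustment used for the CSES may have differed in detail.

```python
from collections import defaultdict


def poststratify(weights, groups, targets):
    """Scale weights so that each group's weight sum equals its target total."""
    group_sums = defaultdict(float)
    for w, g in zip(weights, groups):
        group_sums[g] += w
    return [w * targets[g] / group_sums[g] for w, g in zip(weights, groups)]


# Hypothetical household weights, their domains, and projected household counts.
weights = [100.0, 150.0, 250.0, 200.0]
groups = ["urban", "urban", "rural", "rural"]
targets = {"urban": 500.0, "rural": 900.0}
print(poststratify(weights, groups, targets))
```

For the person file the same function could be applied with groups defined by zone, sex and age group.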
The calculation of the sampling weights is shown step by step in the file "sample weights for 900 villages.xls". There is also a copy of the file in SPSS format. Some of the villages are very large. The best procedure would have been to put the very large villages (villages with a size Mhi larger than Mh/nh) in a separate stratum. This was not done. As a result there are a few villages where the inclusion probability exceeds 1.00 and consequently the first stage sampling weight is below 1.00. To rectify this we set the first stage sampling weight (W1) equal to 1.00 for these villages. When doing so we had to adjust the weights downwards for the other villages in the stratum so that the sum of weights for the stratum remained the same as before the adjustment. (The original first stage weights are called prelW1 in the Excel file and the adjusted weights are called W1. The corrections are calculated in the file "correction of weights for big villages.xls".)
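The correction for very large villages can be sketched as follows. This is a minimal single-pass illustration that assumes the downward adjustment is spread proportionally over the other villages in the stratum; how the adjustment was actually distributed is documented only in the Excel file. The function name and the example weights are hypothetical.

```python
def cap_first_stage_weights(prel_w1):
    """Set weights below 1.0 to 1.0 and rescale the rest to keep the stratum sum.

    prel_w1: preliminary first-stage weights for all sampled villages in one
    stratum. Assumes at least one weight is 1.0 or above.
    """
    total = sum(prel_w1)
    n_capped = sum(1 for w in prel_w1 if w < 1.0)
    rest = sum(w for w in prel_w1 if w >= 1.0)
    # Factor below 1 that absorbs the weight added to the capped villages.
    factor = (total - n_capped * 1.0) / rest
    return [1.0 if w < 1.0 else w * factor for w in prel_w1]


# Hypothetical stratum with one very large village (preliminary weight 0.8).
prel = [0.8, 4.0, 6.0, 9.2]
w1 = cap_first_stage_weights(prel)
print(w1, sum(w1))
```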
The second stage sampling weights are calculated as the number of households in the village (according to the chairman) divided by mhij, the number of sampled households (10 or 20). There were actually two variables indicating the number of households in the village: one was the number obtained by the interviewer (variable E_HHs) and the other was the number obtained by the supervisor (HHs_Vill). A check revealed some apparent data entry errors, which were corrected. The second stage weights are shown as variable W2 in the Excel file.
The household sampling weights (Wprel) are calculated by multiplying W1 by W2. All the sampled households in a village get the same household weight. A check of the weights revealed that there were a few extremely low and high weights. These “outliers” will tend to inflate the variance for some estimates. We decided to trim the weights by adjusting the extreme weights towards the center. Weights above 300 were adjusted downwards to 300 and weights below 30 were adjusted upwards to 30. In all, only six weights were adjusted. The set of weights after trimming is named W in the file. The distribution of the household weights is shown in diagrams 1 and 2 below. The variation of the weights reflects changes in village sizes between the 1998 Census and the time of the survey. If the current number of households were the same as during the census in all the sample villages, there would be no variation at all in the weights. The rather large variation in the weights is by and large a consequence of the long time lag between the census and the survey.
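The household weight calculation and trimming described above can be sketched in a few lines. The function name and the example inputs are hypothetical; the trimming bounds of 30 and 300 are those stated in the text.

```python
def household_weight(w1, village_households, sampled_households,
                     low=30.0, high=300.0):
    """Return the trimmed household weight W for one village."""
    w2 = village_households / sampled_households   # second-stage weight W2
    w_prel = w1 * w2                               # preliminary weight Wprel
    return min(max(w_prel, low), high)             # trim to [low, high]


print(household_weight(12.0, 150, 10))   # 180.0, inside the bounds
print(household_weight(2.0, 100, 20))    # 10.0 is raised to 30.0
print(household_weight(25.0, 400, 20))   # 500.0 is lowered to 300.0
```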
The distribution of the weights for the urban households shows a tendency towards bimodality. Most of the weights are centered around 100 - 120 but there is a cluster of weights around 200 - 220. The reason for this slight abnormality is that the proportional allocation of the sample to strata was not strictly followed. In two urban strata the sample size was below proportion, resulting in substantially larger sampling weights.
Dates of Data Collection
Data Collection Mode
Any survey of the dimensions of the CSES needs a comprehensive system for quality management and monitoring. Only then can deviations from the target be attended to in time to avoid shortfalls.
The CSES management group within NIS therefore set up a meticulous monitoring scheme to be implemented from the very beginning. The monitoring team included at least five NIS staff. The DG of NIS commonly spent one week per month in the field, while other top-ranked NIS officers were out for two weeks on average. At times other officials from NIS or the Ministry participated.
Inspections entailed both announced and unannounced visits. Every team was visited at least twice during its fieldwork periods. These visits served several purposes. One important purpose was the disciplinary effect on supervisors and enumerators of knowing that such inspections must be expected throughout the fieldwork month, including at the very end of the diary month. Also important was to give feedback and encouragement to fieldworkers, to complement training with advice and suggestions, and to sort out any problem that had arisen in the course of fieldwork in the village. Another area of concern was to ensure that the household listing and sampling were done in accordance with the procedures that were devised. Annex 6 includes an example (January 2004) of the Field Supervision Plans; the other months look very much the same.
The Supervisor's Role
The supervisor was the leader of the team and was responsible for coordinating the interviews, collaborating with local authorities, and checking the completed questionnaires for errors. Enumerators were required to re-interview when errors were found in a questionnaire. The supervisor then brought the final checked questionnaires to NIS. The supervisor was also responsible for the village questionnaire and the interviews with the village chief or representative.
In the early stages of survey planning, a Survey Core Group of high-level NIS officers, chaired by the DG, was formed. After supervisors and enumerators had been assigned to the villages, five core group members had the task of monitoring fieldwork activities in the sampled villages. This supervision was done during two weeks each month. The major tasks were to check the presence of the supervisor and enumerators, the status of the fieldwork, and the cooperation between the village authorities and the fieldworkers. They also addressed non-responding households and partially completed questionnaires. It was found that non-response overall was very low and that cooperation with local authorities was good. Taken together, the fieldwork was completed in a very satisfactory manner.
Data Collection Notes
Interviewers and supervisors were initially divided into teams of five persons (one supervisor and four interviewers), making in total 50 teams for the fieldwork. Each month, 25 teams were working in the field with a workload of 10 households per interviewer. In urban areas, 4 PSUs were allocated to one team while in rural areas, 2 PSUs were allocated. The fieldwork plan was designed in order to gather around 60 households monthly per team.
For a given month, the team arrived in the village three days before the first day of the month to attend to preparatory tasks such as holding discussions with village authorities, filling out the Household Listing Form, and then sampling the households to be interviewed.
The Village Form was filled out by the supervisor.
The Household Questionnaire had 16 sections that were filled out by the interviewer during the first visit to the household, and in the following four weeks according to the following scheme:
FIRST VISIT: Initial visit
WEEK 1: Education and literacy, Housing
WEEK 2: Household economic activities, Household liabilities, Household income from other sources, and other expenditures (partial non-food recall)
WEEK 3: Durable goods and other expenses, Construction activities in the past 12 months, Nutrition, Fertility and child care, Mortality
WEEK 4: Health check of children, Current economic activity, Health, HIV/AIDS, Victimization
Once the month ended, the team went back to the NIS headquarters in Phnom Penh.
Questionnaires from the same PSU were delivered to the Data Management team by the supervisor in a packet containing all of the documents used and produced in the fieldwork, including maps, enumeration lists, questionnaires, diaries, etc. Before going to the villages, teams were briefed and introduced to minor adjustments of the interviewing procedure that had to be made as a result of monitoring activities and feedback from the data processing. Annex 5 contains an example (the first survey month) of the allocation of teams to PSUs.
The fieldwork started in November 2003 and was scheduled to end in December 2004. However, some more basic data was needed for the analyses, and the fieldwork was extended to include January 2005.
Fifty (50) supervisors and 200 enumerators were recruited by NIS and trained for the fieldwork. The training took place in Phnom Penh and lasted three weeks for supervisors and two weeks for enumerators. Before the start of each fieldwork month, there were briefing and retraining sessions. Each fieldwork team included one supervisor and four enumerators. In urban areas one enumerator was responsible for one PSU and for interviewing 10 households, while in rural areas two enumerators were responsible for one PSU and for interviewing 20 households. In all, 125 enumerators and supervisors, divided into 25 teams, were carrying out the fieldwork at the same time. Two such team groups were formed and each team group alternated monthly.
Enumerator and Supervisor training
Initial training was provided during nine days for a group of 20-30 staff (not all were attending all the time). This training included a translation into Khmer of selected parts of the questionnaire, and a field test in a village outside Phnom Penh where the participants performed test interviews in 16 households. The experiences from this exercise were followed up during the course. The course also included general aspects on survey methodology and ways of controlling for errors. Many of the findings from this training served as input to later stages.
Prior to the start of the fieldwork, intensive interviewer and supervisor training was carried out. The 200 interviewers and 50 supervisors recruited were split into two groups, each consisting of 100 interviewers and 25 supervisors. The two groups later alternated so that the first group did their fieldwork during odd survey months (i.e. November, January, March …) while the second group covered the even survey months (i.e. December 2003, February, April …).
The training was designed with this in mind. The first group was trained in October 2003 while the second group was trained in November 2003 using premises at the NIS head office. Training of the first group was provided in English by a WB consultant and simultaneously interpreted in Khmer by the appointed NIS officer. The second group was trained by NIS only.
In both cases the supervisors were first trained for one week and then jointly with their interviewers for two weeks. Before every fieldwork month, the group whose turn it was gathered at NIS to walk through the questionnaire and manuals in order to correct errors that had been detected during the briefing sessions or the monitoring operations, and to learn how to handle any changes that had been introduced to the survey instruments.
Five different questionnaires or forms were used in the survey:
Form 1: Household listing sheets to be used in the sampling procedure in the enumeration areas.
Form 2: Village questionnaire answered by the village leader about economy and infrastructure, crop production, health, education, retail prices, sales prices of agricultural land, employment and wages, and recruitment of children for work outside the village.
Form 3: Household questionnaire with questions for each household member, including modules on migration, education and literacy, housing conditions, crop production, household liabilities, durable goods, construction activities, nutrition, fertility and child care, child feeding and vaccination, health of children, mortality, current economic activity, health and illness, smoking, HIV/AIDS awareness, and victimization.
Form 4: Diary form on daily household expenditure and income
Form 5: Time use form detailing activities of household members during one 24-hour period.
The questionnaire is one of the first items in a strategy for quality control in data collection through surveys. Any piece of information to be collected must be formulated as a question so that all interviewers can be trained to read the questions in the same way. The questions must be formulated in such a way that all interviewers feel comfortable reading the questions aloud and that all respondents understand the questions in the same way. The layout of the questionnaire must be done so that the interviewer immediately understands how the respondent's answer should be recorded. A lot of work is normally needed to meet these requirements that are built into the process of communication in the interview situation. This is the kind of work in which final perfection is elusive and further improvements can always be made.
The initial work on questionnaire design resulted in a first draft prepared by NIS in early 2003. With expert assistance from Statistics Sweden in March the same year, a systematic walk-through question by question was done. A number of essential problems to be solved were then identified while errors or minor problems were attended to at once. At the end of the exercise some issues remained that were discussed at a meeting with users and stakeholders and then were referred to a larger group within NIS.
Another set of serious deficiencies in the questionnaires was recognized by a WB consultant in June 2003. At this stage the questionnaires included additional modules or questions demanded by stakeholders. It was suggested that the growing complexity of the whole survey would require more training of the fieldwork staff than was originally planned for. Major revisions of the questionnaires were also made at this point.
The pilot was carried out in June in two provinces with a sample of 870 households. When analyzing the pilot survey, the NIS core team identified a number of further problems still to be attended to and agreed on some fairly substantial changes to the questionnaires that had been used. The many interventions, already mentioned, by various parties with special interests had along the way led to the inclusion of several modules and questions with different formats and standards. A main thrust of the agreements by the NIS core team was to ensure comparability with previous rounds of the CSES and more uniformity of formats and standards in the questionnaires.
The proposals that the NIS core team for the CSES 2004 agreed upon were then implemented. Using the combined technical expertise at hand, improvements were made to the formulations and formatting of all the questionnaires. At the beginning of August the household questionnaire was finished, while the diary forms, the time use form and the village questionnaire remained to be completed.
NIS was able to start work with the translation of the household questionnaire from English to Khmer in September. In the meantime, a World Bank consultant had started to review the existing supervisor and interviewer manuals.
According to the NIS time plan, translation and printing of the questionnaires and manuals had to be completed by early October for training of supervisors to start 6 October and of interviewers to start 13 October. The first group of fieldworkers could then be dispatched to the sampled villages all over Cambodia after completed training 25 October. The training cycle of the second group of supervisors and interviewers would be done in November for the fieldwork starting 1 December.
The resulting set contains five forms or questionnaires:
Household listing form
Village questionnaire
Household questionnaire
Diary form
Time Use form
The Household listing was done prior to the sampling, and recorded household information on e.g. location, number of members, principal economic activity. The Village questionnaire was used to gather basic common information on demography, economy, infrastructure, rainfall and natural disasters, education, health, retail prices, employment and wages, access to common property, sales prices of land, and recruitment of children for work.
During the first months of data entry there were delays in the editing and coding operations, caused by the staff situation, that induced backlogs in the data entry operation. In July 2004 a number of entry operators were retrained to assist in carrying out the coding work to eliminate the backlogs. A WB consultant was assigned to help out with problems from the coding operation and to correct possible errors in the entry program.
Estimates of Sampling Error
To provide a basis for assessing the reliability or precision of CSES estimates, the magnitude of the sampling error in the survey data is estimated. Since most of the estimates from the survey are in the form of weighted ratios, variances for ratio estimates are presented.
The Coefficients of Variation (CV) on national level estimates are generally below 4 percent. The exception is the CV for total value of assets where there are rather high CVs especially in the urban areas, which should be expected.
The CVs are somewhat higher in the urban and rural domains but still generally below 7 percent. For the five zones, the average CVs are in the range 5 to 13 percent with a few exceptions where the CVs are above 20 percent. For provinces the CVs for food consumption are 9 percent on average.
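The coefficient of variation of a weighted ratio estimate can be approximated as sketched below. This is a minimal illustration that assumes Taylor linearization with PSUs treated as ultimate clusters within strata. The method actually used for the CSES is not stated here, and the function name, record layout and example data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def ratio_cv(records):
    """Estimate R = sum(w*y) / sum(w*x) and its coefficient of variation.

    records: iterable of (stratum, psu, weight, y, x) tuples.
    Strata with a single PSU contribute nothing to the variance (assumption).
    """
    records = list(records)
    y_total = sum(w * y for _, _, w, y, _ in records)
    x_total = sum(w * x for _, _, w, _, x in records)
    ratio = y_total / x_total

    # Weighted linearized residuals summed to PSU level within each stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y, x in records:
        psu_totals[stratum][psu] += w * (y - ratio * x)

    variance = 0.0
    for psus in psu_totals.values():
        n_h = len(psus)
        if n_h < 2:
            continue
        mean = sum(psus.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - mean) ** 2 for z in psus.values())
    variance /= x_total ** 2
    return ratio, sqrt(variance) / ratio


# Hypothetical data: one stratum, two PSUs, one household each.
print(ratio_cv([("A", 1, 10.0, 4.0, 2.0), ("A", 2, 10.0, 2.0, 2.0)]))
```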
The sample take within Primary Sampling Units (PSU) was set to 10 households per PSU in the CSES 1999. When data on variances became available, it was possible to make crude calculations of the optimal sample take within PSU. Calculations on some of the central estimates in the CSES 1999 show that the design effects in most cases are in the range 1 to 5.
Intra-cluster correlation coefficients have been calculated based on the design effects. These correlation coefficients are somewhat high. The reason is that the characteristics that are measured tend to be concentrated (clustered) within the PSUs. The optimal sample size within PSUs under different assumptions on cost ratios and intra-cluster correlation coefficients was then calculated. The cost ratio is the average cost for adding a village to the sample divided by the average cost of including an extra household in the sample. In the CSES, it was chosen to adopt a fairly low cost ratio due to the fact that the interview time per household is long. Under this assumption the optimal sample size is probably around 10 households per village for many of the CSES indicators.
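The two calculations described above can be sketched with the standard textbook approximations: deff = 1 + (m - 1) * rho for the design effect, and m_opt = sqrt(cost_ratio * (1 - rho) / rho) for the optimal take per PSU. Whether exactly these formulas were used is an assumption, and the function names and example values are hypothetical.

```python
from math import sqrt


def intra_cluster_corr(deff, m):
    """Solve deff = 1 + (m - 1) * rho for rho, given m households per PSU."""
    return (deff - 1.0) / (m - 1.0)


def optimal_take(cost_ratio, rho):
    """Optimal households per PSU for a given cost ratio (PSU cost / household cost)."""
    return sqrt(cost_ratio * (1.0 - rho) / rho)


# Hypothetical values: a design effect of 1.9 with 10 households per PSU.
rho = intra_cluster_corr(1.9, 10)
print(rho, optimal_take(100 / 9, rho))
```

A low cost ratio or a high intra-cluster correlation both push the optimum towards fewer households per village, which matches the reasoning given for a take of around 10.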
Total population: 13,439,000
New population estimates show that the population increased from close to 11 million in 1994 to 13.5 million in 2004. It is expected to pass 15 million by 2010 according to a revised population projection.
Percentage of Labor force participation aged 10 and over (%)
Phnom Penh: 60.8
Other Urban: 69.5
Percentage of economically active population aged 10 and over (%)
The Statistics Law Article 22 specifies matters of confidentiality. It explicitly says that all staff working with statistics within the Government of Cambodia "shall ensure confidentiality of all individual information obtained from respondents, except under special circumstances with the consent of the Minister of Planning. The information collected under this Law is to be used only for statistical purposes."
1. The data and other materials will not be redistributed or sold to other individuals, institutions, or organizations without the written agreement of the National Institute of Statistics.
2. The data will be used for statistical and scientific research purposes only. They will be used solely for reporting of aggregated information, and not for investigation of specific individuals or organizations.
3. No attempt will be made to re-identify respondents, and no use will be made of the identity of any person or establishment discovered inadvertently. Any such discovery would immediately be reported to the National Institute of Statistics.
4. No attempt will be made to produce links among datasets provided by the National Institute of Statistics, or among data from the National Institute of Statistics and other datasets that could identify individuals or organizations.
5. Any books, articles, conference papers, theses, dissertations, reports, or other publications that employ data obtained from the National Institute of Statistics will cite the source of data in accordance with the Citation Requirement provided with each dataset.
6. An electronic copy of all reports and publications based on the requested data will be sent to the National Institute of Statistics.
Use of the dataset must be acknowledged using a citation which would include:
- the Identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download of the data files (for datasets obtained on-line)
National Institute of Statistics
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
DDI Document ID
Development Economics Data Group
Documentation of the DDI
Date of Metadata Production
DDI Document version
Version 02 (October 2011) - Datasets imported
Version 01 (June 2011) - Adopted from "KHM-NIS-CSES-200304-v1.0" DDI. Source http://www.nis.gov.kh/nada/?page=catalog