Sampling Procedure
A total of 1,000 face-to-face household interviews per country were to be conducted, with adult (18 years and over) occupants and with no upper limit for age. The sample was to be nationally representative. The EBRD’s preferred procedure was a two stage sampling method, with census enumeration areas (CEA) as primary sampling units and households as secondary sampling units. To the extent possible, the EBRD wished the sampling procedure to apply no more than 2 stages.
The first stage of selection was to use as a sampling frame the list of CEA's generated by the most recent census. Ideally, 50 primary sampling units (PSU's) were to be selected from that sample frame, with probability proportional to size (PPS), using as a measure of size either the population, or the number of households.
The second sampling stage was to select households within each of the primary sampling units, using as a sampling frame a specially developed list of all households in each of the selected PSU's defined above. Households to be interviewed were to be selected from that list by systematic, equal probability sampling. Twenty households were to be selected in each of the 50 PSU's.
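The systematic, equal probability selection described above can be sketched as follows. This is a minimal illustration, not the procedure actually coded for the survey: the function name `systematic_sample`, the fractional sampling interval and the use of Python's `random` module are all assumptions made for the example.

```python
import random

def systematic_sample(items, n, rng=random):
    """Select n items from an ordered list with equal probability.

    Uses a fractional sampling interval k = N / n and a random start
    in [0, k), so every item has inclusion probability n / N.
    (Hypothetical helper; not taken from the survey report.)
    """
    total = len(items)
    if not 0 < n <= total:
        raise ValueError("n must be between 1 and len(items)")
    interval = total / n
    start = rng.random() * interval
    # Selection j picks the item at position floor(start + j * interval);
    # min() only guards against floating-point rounding at the upper end.
    return [items[min(int(start + j * interval), total - 1)] for j in range(n)]

# Example: 20 households from a hypothetical listing of 240 in one PSU.
household_list = list(range(1, 241))
selected_households = systematic_sample(household_list, 20, random.Random(1))
```

With 240 listed households the interval is 12, so the selected households are spread evenly through the list in its listing order.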
The individuals to be interviewed in each household were to be selected at random, within each of the selected households, with no substitution if possible.
ESTABLISHMENT OF THE SAMPLE FRAME OF PSU’s
In each country we established the most recent sample frame of PSU’s which would best serve the purposes of the LITS sampling methodology. Details of the PSU sample frames in each country are shown in table 1 (page 10) of the survey report.
In the cases of Armenia, Azerbaijan, Kazakhstan, Serbia and Uzbekistan, CEA’s were used. In Croatia we also used CEA’s but in this case, because the CEA’s were very small and we would not have been able to complete the targeted number of interviews within each PSU, we merged together adjoining CEA’s and constructed a sample of 1,732 Merged Enumeration Areas. The same was the case in Montenegro.
In Estonia, Hungary, Lithuania, Poland and the Slovak Republic we used Eurostat’s NUTS area classification system.
[NOTE: NUTS (from the French "Nomenclature des unités territoriales statistiques", in English "Nomenclature of territorial units for statistics") is a uniform and consistent system that runs on five different levels and is widely used for EU surveys, including the Eurobarometer (a survey comparable to the Life in Transition Survey). As a hierarchical system, NUTS subdivides the territory of a country into a defined number of regions at NUTS 1 level (population 3-7 million), NUTS 2 level (800,000-3 million) and NUTS 3 level (150,000-800,000). At a more detailed level, NUTS 3 regions are subdivided into smaller units (districts and municipalities) called "Local Administrative Units" (LAU). LAUs are further divided into upper LAU (LAU 1, formerly NUTS 4) and lower LAU (LAU 2, formerly NUTS 5).]
Albania, Bulgaria, the Czech Republic, Georgia, Moldova and Romania used the electoral register as the basis for the PSU sample frame. In the other cases, the PSU sample frame was chosen using either local geographical or administrative and territorial classification systems. The total number of PSU’s in the sample frame varied from 182 in the case of Mongolia to over 48,000 in the case of Turkey. To ensure the safety of our fieldworkers, we excluded from the sample frame PSU’s in territories (in countries such as Georgia, Azerbaijan, Moldova and Russia) in which there was conflict or political instability. We also excluded areas which were not easily accessible because of their terrain or which were sparsely populated.
In the majority of cases, the source for this information was the national statistical body of the country in question, or the relevant central electoral committee. In establishing the sample frames we tried, to the extent possible, to maintain a uniform measure of size, namely the population aged 18 years and over, which was more pertinent to the LITS methodology. Where the PSU was based on CEA’s, the measure was usually the total population, whereas the electoral register provided data on the population aged 18 years and above, the normal voting age in all sampled countries. Although the NUTS classification provided data on the total population, we filtered the information where possible and used the population aged 18 and above as the measure of size. The other classification systems used usually measure the total population of a country. However, in the case of Azerbaijan, which used CEA’s, and Slovenia, where a classification system based on administrative and territorial areas was employed, the measure of size was the number of households in each PSU.
The accuracy of the PSU information depended, to a large extent, on how recently the data had been collected. Where the data had been collected recently, the information could be considered relatively accurate. However, in some countries we believed that more recent information was available, but because the relevant authorities were not prepared to share it with us, citing secrecy reasons, we had no alternative but to use less up-to-date data. In some countries the age of the available data makes the figures less certain. An obvious case in point is Bosnia and Herzegovina, where the latest available figures date back to 1991, before the Balkan wars. The population figures available take no account of the casualties suffered among the civilian population, the resulting displacement and the subsequent migration of people.
Equally, some countries have experienced economic migration in recent years, whether to the European Union from countries that acceded in May 2004, such as Hungary, Poland and the Baltic states, or to other countries within the region (e.g. Armenians to Russia, Albanians to Greece and Italy); the available figures may not accurately reflect this. As most economic migrants tend to be men, the actual proportion of females in a population was, in many cases, higher than the available statistics would suggest. Migration in recent years has also occurred from rural to urban areas in Albania and the majority of the Asian Republics, and on a continuous basis in Mongolia, in this case because of the country’s nomadic population.
SAMPLING METHODOLOGY
Brief Overview
In broad terms the following sampling methodology was employed:
· From the sample frame of PSU’s we selected 50 units
· Within each selected PSU, we sampled 20 households, resulting in 1,000 interviews per country
· Within each household we sampled 1 and sometimes 2 respondents
The sampling procedures were designed to leave no free choice to the interviewers. Details on each of the above steps as well as country specific procedures adapted to suit the availability, depth and quality of the PSU information and local operational issues are described in the following sections.
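As a worked illustration of the intended two-stage design (50 PSU’s selected with PPS, then 20 households per PSU with equal probability), the overall selection probability of a household can be written out. The symbols are assumptions introduced here and do not appear in the survey report: M_i is the number of households in PSU i, M is the total number of households in the sample frame, and π_ij is the probability that household j in PSU i is selected. The sketch further assumes that the measure of size is the number of households, that the household list is complete, and that no PSU is larger than the sampling interval M/50.

```latex
\pi_{ij}
  \;=\; \underbrace{\frac{50\,M_i}{M}}_{\text{PSU } i \text{ selected with PPS}}
  \times
  \underbrace{\frac{20}{M_i}}_{\text{household } j \text{ selected within PSU } i}
  \;=\; \frac{1000}{M}
```

Under these assumptions M_i cancels, so every household in the frame would have the same chance of selection. Where the measure of size was population rather than households, or where additional sampling stages were needed, the equality holds only approximately.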
Selection of PSU’s
The PSU’s of each country (all in electronic format) were sorted first into metropolitan, urban and rural areas (in that order), and within each of these categories by region/oblast/province in alphabetical order. This ensured a consistent sorting methodology across all countries and also that the randomness of the selection process could be supervised.
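The sorting step might look like the following minimal sketch. The record fields `name`, `area_type` and `region`, and the example regions, are hypothetical and were chosen only for illustration; the actual frame layout differed by country.

```python
# Assumed ordering of area types, as described in the text.
AREA_ORDER = {"metropolitan": 0, "urban": 1, "rural": 2}

def sort_frame(psus):
    """Sort PSU records by area type (metropolitan, urban, rural),
    then alphabetically by region within each area type."""
    return sorted(psus, key=lambda p: (AREA_ORDER[p["area_type"]], p["region"]))

# Hypothetical frame of five PSU's.
frame = [
    {"name": "A", "area_type": "rural", "region": "North"},
    {"name": "B", "area_type": "metropolitan", "region": "Capital"},
    {"name": "C", "area_type": "urban", "region": "South"},
    {"name": "D", "area_type": "urban", "region": "East"},
    {"name": "E", "area_type": "rural", "region": "East"},
]
sorted_frame = sort_frame(frame)
```

Because a systematic selection is then drawn from the sorted list, the sample is spread across area types and regions roughly in proportion to their size.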
To select the 50 PSU’s from the sample frame, we employed implicit stratification and sampled with PPS. Implicit stratification ensured that the sample of PSU’s was spread across the primary categories of the sorting variables and gave a better representation of the population without explicitly stratifying the PSU’s, thus avoiding difficulties in calculating the sampling errors at a later stage.
In brief, the PPS involved the following calculations:
· Cumulated size of the PSU’s (CEA, NUTS, etc) in the sorted sample frame
· Scaled cumulated size based on the number of selected PSU’s (50) and the total size of the PSU’s (depending on country)
· Randomly shifted scaled cumulated size using a random number between 0 and 1
The selected PSU’s were those where the integer part of the shifted scaled cumulated size changed.
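The three calculations above can be sketched in a few lines. This is an illustrative reconstruction from the description, not the code used for the survey: the function name `pps_systematic` is hypothetical, sizes are assumed to be integers listed in the sorted frame order, and the shift is added to (rather than subtracted from) the scaled cumulated size.

```python
import math
import random

def pps_systematic(sizes, n, rng=random):
    """Return indices of units selected with probability proportional to size.

    sizes: integer measure of size of each unit, in sorted frame order.
    n: number of selections (50 in the survey design).
    A unit is returned more than once if the integer part jumps by more
    than 1 while passing it, i.e. if the unit is very large.
    """
    total = sum(sizes)
    shift = rng.random()                 # random number in [0, 1)
    selected = []
    cumulated = 0
    previous = 0                         # integer part before the first unit
    for index, size in enumerate(sizes):
        cumulated += size                # cumulated size
        scaled = cumulated * n / total   # scaled cumulated size, from 0 to n
        # Integer part of the shifted value; min() guards against rounding.
        current = min(math.floor(scaled + shift), n)
        # Each unit step in the integer part is one selection.
        selected.extend([index] * (current - previous))
        previous = current
    return selected

# Example: 50 selections from a hypothetical frame of 400 units.
sizes = [random.Random(2).randint(200, 5000) for _ in range(400)]
chosen = pps_systematic(sizes, 50, random.Random(3))
```

A unit whose size exceeds the sampling interval (total size divided by 50) can make the integer part change by more than one, which is how a very large PSU can receive more than one selection.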
Appendix A of the survey report (organised in country sections) shows the 50 PSU’s selected in each country, as well as where these were geographically located. As can be seen from the selected PSU’s in each country, the population in each PSU ranged from a few hundred people to several hundred thousand, especially in metropolitan and urban areas. In some large PSU’s (e.g. Tashkent in Uzbekistan, Almaty in Kazakhstan) the PPS procedure apportioned more than 1 sampling area to the same PSU because of the large population of those units.
Although we would have liked to have PSU’s of approximately equal size (preferably with a population of fewer than around 2,000 inhabitants), this was not feasible because the PSU’s obtained from the various sources described in section 4.3.1 did not go down to that level of detail.
The PSU sampling methodology described in this section was implemented in 28 countries. The exception was Mongolia, where we had to adapt the PSU sampling process to account for the current availability and quality of the data, the very low population density, and the fact that between 30% and 50% (according to some estimates) of the population live nomadic lives in both urban and rural areas.
The normal stratification used in Mongolia for comparable surveys (like the Asiabarometer), which we also followed in this case, is to stratify the sample explicitly, allocating 19 PSU’s (38%) to the capital Ulaanbaatar (metropolitan; 1st stratum) and the remaining 31 to other urban and rural areas (2nd stratum). We then used PPS selection of PSU’s within each stratum.
PSU changes
In a number of countries (Armenia, Bosnia and Herzegovina, Estonia, FYROM, Kyrgyz Republic, Lithuania, Romania, Russia, Tajikistan, Ukraine and Uzbekistan), a few (between 1 and 9) of the originally selected PSU’s, mostly in rural areas, had to be replaced during the course of the fieldwork. The replaced PSU’s are given in Appendix A, under each country section. To the extent possible, we replaced PSU’s with other PSU’s that matched the originally selected areas in population, socioeconomic profile and proximity.
The most common reason for PSU replacement was geographical remoteness and the consequent difficulty in accessing the area, especially given the poor road and transport infrastructure in many rural parts. There were also cases where PSU’s had low population densities, which meant that distances between settlements were great, and where villages which were shown on maps had subsequently been broken up or abandoned. Had we known before the PSU selection how difficult it was to access these PSU’s, we would have excluded them from selection from the outset.
In some other cases, poor weather conditions and localised flooding exacerbated the problems and, because of time limitations, we could not wait until the weather conditions improved to re-visit the PSU’s, which were ultimately replaced.
PSU’s excluded from sampling
Certain territories of some countries (Albania, Azerbaijan, Kazakhstan, Mongolia, Moldova, Russia, Serbia, and Tajikistan) were excluded from the original sampling, either because there was conflict or political instability in those areas, or because the areas were inaccessible. In Serbia’s case it was agreed before the start of the project that Kosovo would not be included in this survey.
Selection of dwellings within each chosen PSU
This part of the sampling process presented the most challenges because of the significant differences between countries in the size of PSU’s and in the quality, depth and availability of data at this level. As can be seen from the selected PSU’s, some of the PSU’s were very large. Listing all eligible households and applying a single stage of sampling within each PSU (the 2nd stage of the overall process) was impracticable because of timescale and budget limitations. Listing all the households, especially in large PSU’s (sometimes whole cities), would have meant census enumeration plus listings.
2nd stage sampling
In most of the countries it was necessary to apply more than two sampling stages to select households. These stages are described below.
The 2nd stage involved the selection of 4 segments/areas within each PSU, which made the listing of dwellings and ultimately the sampling of households more practicable. For each selected PSU we obtained a hard copy map of the area and split this into small segments/zones. To the extent possible we aimed to have zones with equal populations although, as it turned out, this was not always feasible. Each segment was then given an identification number starting from the north-east segment. As illustrated in the diagram below, we numbered the segments from left to right ("reading a book" method). Segments which did not contain dwellings (such as parks and non-built up areas) were not numbered and were excluded from sampling.
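The numbering of segments can be illustrated with a short sketch. It assumes, purely for the example, that the map has been split into a rectangular grid, that rows are read from top to bottom and each row from left to right, and that each cell is flagged according to whether it contains dwellings; the function name `number_segments` is hypothetical. The corner from which numbering starts would follow the field instructions.

```python
def number_segments(grid):
    """Assign consecutive identification numbers to map segments.

    grid: list of rows (top to bottom), each a list of booleans
    (left to right) that are True where the segment contains dwellings.
    Returns a dict mapping (row, column) to the segment number.
    Segments without dwellings receive no number.
    """
    numbers = {}
    next_id = 1
    for r, row in enumerate(grid):
        for c, has_dwellings in enumerate(row):
            if has_dwellings:
                numbers[(r, c)] = next_id
                next_id += 1
    return numbers

# Hypothetical PSU map split into 2 rows of 3 segments;
# two segments (e.g. a park and open land) contain no dwellings.
psu_map = [
    [True, False, True],
    [True, True, False],
]
segment_ids = number_segments(psu_map)
```

The numbered segments then form the list from which the zones are drawn.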
The next step was to select 4 zones with the intention of conducting 5 household interviews in each (total of 20 per PSU). The selection of the zones was done using systematic, equal probability sampling.
Prior to fieldwork commencing, interviewers accompanied by fieldwork supervisors visited each selected segment/area and listed on paper all eligible dwellings (those likely to be inhabited by households), including apartments in blocks of flats. Each eligible dwelling was assigned a unique serial number. It is important to note that during this exercise we were listing dwellings and not households, as the latter would have taken considerable time. Furthermore, we did not want to disturb some households twice (i.e., the first time to find out how many households lived in a dwelling and the second time to interview, if selected). For the purposes of this research we assumed that each dwelling was inhabited by one household. The same assumption was made for the apartments in blocks of flats.
Non-eligible dwellings such as hospitals, prisons, night clubs, offices etc, were not listed as these were excluded from the scope of the LITS. In the case of remote settlements, it was not always feasible to conduct this preparatory work because of the logistical difficulties involved. In such cases, we estimated the number of dwellings from the population and average size of the household in that area.
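The estimate used for remote settlements amounts to a single division, sketched below. The function name, the rounding to the nearest whole dwelling and the example figures are assumptions made for illustration.

```python
def estimate_dwellings(population, avg_household_size):
    """Estimate the number of dwellings in an area as
    population / average household size, relying on the survey's
    assumption of one household per dwelling."""
    if population < 0 or avg_household_size <= 0:
        raise ValueError("population must be >= 0 and household size > 0")
    return round(population / avg_household_size)

# Hypothetical settlement: 1,800 inhabitants, 4.5 persons per household.
estimated = estimate_dwellings(1800, 4.5)
```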
3rd stage
The 3rd sampling stage involved the selection of the eligible dwellings (assuming 1 household in each) within each of the selected areas. The nominal number of dwellings was 5. However, before proceeding with the sampling process each country estimated - based on previous experience - the number of household contacts needed to complete 5 interviews by taking into account the usual refusal rate and the likelihood of no interviews for reasons such as not finding anybody at home, or no reply. The number of additional dwellings varied between 3 and 4 depending on the country and the PSU.
The total number of dwellings (5 plus 3-4 possible replacements) was selected from the lists prepared by the fieldworkers during the listing exercise using systematic, equal probability sampling. From the selected dwellings (5 + replacements) we again applied systematic, equal probability sampling ("4th stage"), but in this case the purpose was to "isolate" those which were replacements. The interviewers were provided with the contact details of the 5 selected dwellings (primary targets) and were told that they should exhaust all possible efforts to conduct interviews with the households of those dwellings only. The interviewers were not told about the reserve dwellings; their existence, and the possibility of using them, was known only to fieldwork managers and senior supervisors.
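The two passes described above (drawing 5 plus reserves, then isolating the reserves) can be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, the second pass is assumed to pick the reserves (the report does not say whether the reserves or the primary targets were drawn), and dwellings are identified by their unique serial numbers.

```python
import random

def systematic_sample(items, n, rng=random):
    """Equal probability systematic sample of n items from an ordered list."""
    interval = len(items) / n
    start = rng.random() * interval
    return [items[min(int(start + j * interval), len(items) - 1)]
            for j in range(n)]

def split_primary_and_reserve(dwellings, n_primary=5, n_reserve=3, rng=random):
    """Draw primary and reserve dwellings from a listing of serial numbers.

    First pass: systematic sample of n_primary + n_reserve dwellings.
    Second pass: systematic sample of n_reserve of those, set aside as
    reserves; the remainder are the primary targets given to interviewers.
    """
    drawn = systematic_sample(dwellings, n_primary + n_reserve, rng)
    reserve = systematic_sample(drawn, n_reserve, rng)
    reserve_set = set(reserve)
    primary = [d for d in drawn if d not in reserve_set]
    return primary, reserve

# Hypothetical listing of 60 dwellings in one selected area.
listing = list(range(1, 61))
primary, reserve = split_primary_and_reserve(listing, 5, 3, random.Random(7))
```

Only the `primary` list would be passed to interviewers; the `reserve` list would stay with fieldwork managers and senior supervisors.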
Our aim whilst developing and implementing the sampling methodology was to ensure that the sampling procedures left no free choice to the interviewers. In those cases where more than one household resided in the same dwelling we interviewed the household which first opened the door. We made 3 attempts to interview the selected households before proceeding to the replacement households.
Additional sampling stages
In some cases, once the 4 areas had been selected (as discussed in the previous section), it was necessary to apply additional sampling stages. This occurred when the field team visited an area to list all its dwellings and discovered that, because of the large number of dwellings, it would have been impracticable to list all of them. In such cases the originally selected areas (the four described in the previous section) were further divided into smaller segments. Numbering and selection of the smaller segments was done using the same procedures as those discussed in section 4.3.2.3 of the survey report.
Country sampling stages
In the majority of countries, the sampling process involved 3 stages: the 1st for PSU’s, the 2nd for areas within PSU’s and the 3rd for dwellings within areas. In Azerbaijan, Bulgaria, Serbia, Montenegro and Estonia, we applied two stages of sampling. In Azerbaijan and Bulgaria we had information on the number of dwellings in each PSU and we did the selection using systematic, equal probability sampling. In Serbia, Montenegro and Estonia, although information on the number of dwellings within each PSU was available, the holders of this information refused to share it with us. In these countries, selection of the dwellings was done by the statistical institutes using systematic, equal probability sampling and a list was provided to us. In Hungary and Russia it was necessary to apply more than 3 stages for some (not all) PSU’s, as explained in section 4.3.2.3.1 of the survey report.
Selection of household respondents
In each household we sampled either one or two respondents. The first respondent was always the head of the household or another knowledgeable member, being the person deemed to have the most knowledge of household issues (roster and expenses). The second respondent was the household member aged 18 years or over who had most recently had a birthday.
Where the head of the household did not know the precise date of birth of adult members, or the list of birthdays was incomplete, we used the Kish grid method to select the "principal" respondent. There were cases where the head of the household and the principal respondent were the same person. This happened if the head of the household was also the person who had most recently had a birthday. There could never be more than two respondents per household. The head of the household was responsible for answering Sections 1 and 2 of the questionnaire (household roster and expenses) and the principal respondent Sections 3-7 (life in transition).
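The last-birthday rule for the principal respondent can be sketched as below. The function names, the tuple layout of the household roster, the handling of 29 February birthdays and the tie-breaking by roster order are all assumptions made for illustration; the Kish grid fallback is not shown.

```python
from datetime import date

def birthday_in_year(birth, year):
    """Birthday anniversary in a given year (29 Feb falls back to 28 Feb)."""
    try:
        return birth.replace(year=year)
    except ValueError:  # born on 29 February, target year is not a leap year
        return date(year, 2, 28)

def age_on(birth, day):
    """Age in completed years on the given day."""
    had_birthday = (day.month, day.day) >= (birth.month, birth.day)
    return day.year - birth.year - (0 if had_birthday else 1)

def days_since_birthday(birth, day):
    """Days elapsed since the most recent birthday on or before `day`."""
    latest = birthday_in_year(birth, day.year)
    if latest > day:
        latest = birthday_in_year(birth, day.year - 1)
    return (day - latest).days

def last_birthday_respondent(members, interview_date):
    """members: list of (name, birth_date) tuples from the household roster.
    Returns the name of the adult (18+) who most recently had a birthday,
    or None if the household has no adult member. Ties go to roster order."""
    adults = [m for m in members if age_on(m[1], interview_date) >= 18]
    if not adults:
        return None
    return min(adults, key=lambda m: days_since_birthday(m[1], interview_date))[0]

# Hypothetical household interviewed on 15 September 2006.
roster = [
    ("Ana", date(1960, 3, 10)),
    ("Boris", date(1958, 8, 30)),
    ("Clara", date(1990, 9, 1)),   # aged 16, not eligible
    ("Dan", date(1985, 12, 20)),
]
principal = last_birthday_respondent(roster, date(2006, 9, 15))
```

In this example Clara had the most recent birthday but is under 18, so Boris is selected.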