YEM_2013_LFS_v01_M
Labour Force Survey 2013-2014
Name | Country code |
---|---|
Yemen, Rep. | YEM |
Labor Force Survey [hh/lfs]
The Labour Force Survey (LFS) 2013-14 is the second such survey that has been carried out in the last 15 years in Yemen. Similar to the first survey in 1999, the LFS 2013-14 was conducted by the Central Statistical Organization with assistance from the International Labour Office. More recently, CSO has conducted a population census in 2004 and a household child labour survey of adults and children in 2010. The primary objective of LFS 2013-14 was to provide current data on the employment and unemployment situation at national and governorate level using the preliminary version of the new standards concerning statistics of work, employment and labour underutilization adopted by the 19th International Conference of Labour Statisticians (Geneva, October 2013).
After reprocessing the LFS 2013-14 data according to the international standards of 1982 (13th ICLS) to make the results comparable to the definitions used in the population census of 2004 and the household child labour survey of adults and children conducted in 2010, the results show that the labour force participation rate has somewhat increased during the ten-year period from 2004 to 2013-14. Both the number of unemployed and the unemployment rate, measured on a comparable basis, show a slight increase from 2004 to 2010 before a sharp decrease in 2013-14.
The survey covered all civilian non-institutional households living in urban and rural areas of the country. During field operations, certain sample areas could not be covered due to particular circumstances.
Sample survey data [ssd]
Individuals
Households
The scope of this study includes:
Employment, unemployment and underemployment.
Qualification of the labour force and participation in training programs.
Characteristics of labour migration.
Microfinance projects evaluation.
National Coverage.
The survey covered the civilian non-institutional settled population, excluding certain areas with difficult access or low population densities. In particular, it excluded the nomad population; displaced populations who are homeless; the population living in public housing (boarding, hotels, prisons, hospitals, etc.); and individuals enlisted in the Armed Forces who reside permanently within camps and do not spend most days of the year with their families. Marine crews, expatriates outside the country and other categories of persons in remote islands were likewise excluded.
Name | Affiliation |
---|---|
Central Statistical Organization | Yemen |
Name |
---|
The Ministry of Social Affairs and Labour |
International Labour Organization |
The sample design is a two-stage stratified sample of enumeration areas in the first stage of sampling and a fixed number of sample households at the second stage of sampling. The resulting sample is spread evenly over the four quarters of the survey period.
Accordingly, the Central Statistical Organization (CSO) has drawn a stratified sample of census enumeration areas recomposed as primary sampling units (PSUs). Sample selection has been made with probability proportional to the number of households as determined in the 2004 population census. In the second stage of sampling, after relisting of the sample enumeration areas, a fixed number of households (16 sample households) are drawn as clusters with equal probability from each sample enumeration area. The strata consist of the urban and rural areas of the 21 governorates in Yemen.
According to the sample design, urban areas are oversampled and rural areas under-sampled. This is because a relatively larger sample size is required in urban areas, where heterogeneity is greater than in rural areas. Also, because the cost of transportation and field operations is relatively greater in rural areas, it is more cost-effective to under-sample the rural areas relative to the less costly operations in urban areas. The differential sampling rates are then corrected through the sample weights so that the final results accurately reflect the overall employment pattern.
The cluster of 16 households in each sample enumeration area was drawn after a fresh listing of all households living in the sample enumeration area at the time of listing. This procedure updates the census information that dates back to 2004. The listing operations are carried out in each quarter before survey interviewing. The updated lists are sent to CSO in Sana'a for data entry and sample selection of households for transmission to the survey team in each area. Instructions were given that sample households that could not be found in the field, were absent or refused to be interviewed should not be substituted with other households, as substitution may introduce bias in the results. Instructions were also given that where the number of households in a sample enumeration area was found to be less than the required 16 in a given quarter, all households in the enumeration area should be taken into the sample.
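The two selection stages described above can be illustrated with a minimal Python sketch. It assumes systematic selection for the first stage, which is one common way to implement probability-proportional-to-size sampling; the source does not state which PPS method CSO used. The function names and the toy sizes are hypothetical.

```python
import random


def select_enumeration_areas(census_households, n_select, rng):
    """First stage: systematic PPS selection of enumeration areas.

    census_households: number of households per area in the frame.
    Returns the indices of the selected areas. An area larger than the
    sampling interval can be selected more than once.
    Systematic selection is an assumption, not the documented method.
    """
    total = sum(census_households)
    interval = total / n_select
    start = rng.random() * interval
    selected = []
    cumulative = 0
    idx = 0
    for k in range(n_select):
        point = start + k * interval
        # Advance until the selection point falls inside area idx.
        while cumulative + census_households[idx] <= point:
            cumulative += census_households[idx]
            idx += 1
        selected.append(idx)
    return selected


def select_households(listed_ids, take, rng):
    """Second stage: equal-probability draw of `take` listed households.

    If fewer than `take` households were listed, all are taken,
    and no substitution is made for non-responding households.
    """
    if len(listed_ids) <= take:
        return list(listed_ids)
    return rng.sample(listed_ids, take)


rng = random.Random(2013)
areas = select_enumeration_areas([120, 340, 95, 410, 230], 2, rng)
households = select_households(list(range(1, 181)), 16, rng)
```

Passing an explicit random generator keeps the draw reproducible, which is convenient when the selection has to be documented or audited.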
The sample size in terms of number of sample households is given in Table B2 for each quarter and for urban and rural areas separately. The designed sample was 13,376 households at national level (3,344 per quarter), of which 6,656 were urban (1,664 per quarter) and 6,720 were rural (1,680 per quarter). The effective sample size was lower due to non-response and other problems of coverage.
The total sample size was determined on the basis of the requirement of producing estimates of the unemployment rate with a 1.5% margin of error at the national level, assuming an overall non-response rate of 15% and a design effect of 3. For the determination of the national sample size, the expected unemployment rate was set at 15% and the expected number of sample households to reach one person of working age, 15 years old and over, in the labour force was set at 0.6.
The sampling weights were calculated based on the sample design and response rates. The sample design determines the probability of selection of each unit, which in principle is a known non-zero value between 0 and 1. The response rates were obtained from the result of the visit to the sample household, as recorded on the cover page of the LFS questionnaire: completed interview, partially completed interview, absent, refusal, vacant/demolished, out-of-scope (shop, workshop, office, …) and other. The basic weight is the inverse of the probability of selection, and the response rate is the proportion of households with completed or partially completed interviews in the total number of sample households selected.
For the calculation of the probability of selection, the number of households listed is obtained from the listing form and the number of sample households selected is the sample-take, the fixed number of sample households selected in each sample enumeration area, b=16. The number of sample households with completed and partially completed interviews is obtained from the cover page of the filled-in household questionnaire. The probability of selection of the sample enumeration area is proportional to its size, measured by the number of households in the sample enumeration area according to the sampling frame.
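The weight calculation described above can be sketched as follows. This is an illustration under stated assumptions, not CSO's actual computation: the number of sample enumeration areas per stratum and all numeric inputs are hypothetical, and the function names are invented for the example.

```python
def design_weight(n_sample_areas, area_frame_households,
                  stratum_frame_households, households_listed,
                  sample_take=16):
    """Basic weight: inverse of the overall probability of selection.

    First stage: the area is selected with probability proportional to
    its number of households in the sampling frame (2004 census).
    Second stage: `sample_take` households are drawn with equal
    probability from the freshly listed households.
    """
    p_area = n_sample_areas * area_frame_households / stratum_frame_households
    p_household = min(1.0, sample_take / households_listed)
    return 1.0 / (p_area * p_household)


def adjust_for_nonresponse(weight, n_selected, n_responding):
    """Inflate the weight by the inverse of the area's response rate.

    n_responding counts completed and partially completed interviews.
    """
    response_rate = n_responding / n_selected
    return weight / response_rate


# Hypothetical area: 10 sample areas in the stratum, 200 frame households
# in this area, 20,000 in the stratum, 240 households found at listing.
base = design_weight(10, 200, 20_000, 240)      # 1 / (0.1 * 16/240) = 150
final = adjust_for_nonresponse(base, 16, 15)    # 150 / (15/16) = 160
```

Because the second-stage probability uses the fresh listing count rather than the census count, growth or decline of an area since 2004 is reflected in its weight.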
Finally, the extrapolation weights were adjusted to conform to known population projections. This process of adjustment is called calibration. Calibration means using calibrated weights such that the application of these weights to the auxiliary variables will give estimates exactly equal to the known population totals on those auxiliary variables. Here calibration was made on the sampling weights of all quarters except the first to conform to the projected population of the quarters based on the population estimate of the first quarter. This procedure was adopted because independent and reliable projections were not available.
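As an illustration of calibration in its simplest form, the sketch below rescales the weights by a single ratio so that the weighted count of persons matches one known population total. This is an assumption made for the example; the source does not say how many auxiliary variables were used or which calibration method was applied. All names and numbers are hypothetical.

```python
def calibrate_to_total(weights, persons_per_household, known_total):
    """Ratio calibration to a single population total.

    After adjustment, the weighted sum of persons equals known_total.
    """
    estimated_total = sum(w * x for w, x in zip(weights, persons_per_household))
    factor = known_total / estimated_total
    return [w * factor for w in weights]


weights = [100.0, 200.0]
persons = [5, 5]
# Weighted estimate is 1,500 persons; projected population is 1,650.
calibrated = calibrate_to_total(weights, persons, 1650.0)  # factor 1.1
```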
For the calculation of annual estimates, the quarterly sampling weights were simply divided by four. This procedure is equivalent to calculating annual estimates by the simple arithmetic average of the quarterly estimates.
The questionnaire of the Yemen LFS 2013-14 was designed on the basis of the ILO model LFS questionnaire (version A) and other national LFS questionnaires used in the region. The draft questionnaire was field tested with six households in Sana’a, each member of the field staff interviewing one sample household in his or her area. The experience gained in the field test was reviewed and led to some modifications of the draft questionnaire. The English version of the final questionnaire is reproduced in Annex C of the revised report.
Apart from the cover page and the back page, the core LFS questionnaire contains 52 questions. There are 11 questions on the social and demographic characteristics of the household members in the household roster. In the individual questionnaire addressed to the working age population of 15 years of age or older, there are 3 questions to identify the employed persons and 19 questions on their employment characteristics, including time-related underemployment, followed by 8 additional questions on income from employment. The individual questionnaire also includes 5 questions to identify the unemployed and the potential labour force and 5 follow-up questions on unemployment characteristics.
Start | End | Cycle |
---|---|---|
2013-08-17 | 2013-09-01 | Q1 |
2013-11-02 | 2013-11-17 | Q2 |
2014-02-01 | 2014-02-16 | Q3 |
2014-05-01 | 2014-05-16 | Q4 |
Start date | End date |
---|---|
2014 | 2014 |
Data entry was carried out in parallel with the interviewing of sample households. It was conducted at the Central Statistical Organization headquarters in Sana’a, where all data processing operations except tabulation were centralized.
The supervisory staff of the data entry operations was responsible for editing the questionnaires before actual data entry. Editing at this stage involved review of the questionnaire regarding its filled-in contents including ensuring that there is no missing block of information for household members aged 15 years old and over and correct coding of occupation, branch of economic activity and other variables.
Occupation was coded at the 6-digit level using the International Standard Classification of Occupations (ISCO-88). Branch of economic activity was coded at the 5-digit level, based on the International Standard Industrial Classification of All Economic Activities (ISIC Rev. 3.1).
The data files were further processed at ILO headquarters in Geneva. They were first converted into a single file with 86,778 records and augmented with several fields, in particular, the sampling weights (“weight”) and the key derived variables: employed (E), unemployed (U), time-related underemployment (TRU), potential labour force (PLF) as well as other derived variables such as informal sector employment (IS) and informal employment (IE).
The following rounding rule was adopted for the presentation of the results. Estimates of levels are rounded to thousands (’000) for values equal to or above 1,000. Estimates of percentage rates are rounded to the first decimal place.
Sampling errors arise due to the fact that the survey does not cover all elements of the population, but only a selected portion. The sampling error of an estimate is based on the difference between the estimate and the value that would have been obtained on the basis of a complete count of the population under otherwise identical conditions.
Knowing the magnitude of sampling errors is crucial for interpreting the survey results. It allows the precision of the estimates, and the degree of confidence that may be attached to them, to be judged. This is especially relevant in the case of small population subgroups, for which the survey results may not be statistically significant due to the small number of observations on which the estimates are based. Information on sampling errors is also crucial for the sample design of future surveys.
In principle, sampling errors may be decomposed into two components: (i) sampling bias; and (ii) sampling variance. Sampling bias reflects the systematic error that may occur due to failures of the sample design, for example, certain elements of the population receiving zero probability of selection. The sampling variance, on the other hand, reflects the uncertainty associated with a sample estimate due to the particular sample used for its calculation, among all other possible samples that could have been selected from the frame with the same sampling design.
The calculation of the sampling variance of survey estimates for complex multistage designs is generally based on the following principle: the variance contributed by the later stages of sampling is, under broad conditions, reflected in the observed variation among the sample results for first-stage units. Thus, the sampling variance of a variety of statistics, such as totals, means, ratios, proportions, and their differences can be obtained on the basis of totals calculated for primary sampling units (PSUs).
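The principle above is often implemented as the "ultimate cluster" variance estimator, sketched below for an estimated total. This is a generic textbook form that treats first-stage selection as with replacement and ignores finite population corrections; the source does not state which estimator or software was used for the LFS, and the names and numbers are hypothetical.

```python
def variance_of_total(psu_totals_by_stratum):
    """Ultimate-cluster variance estimate of a weighted total.

    psu_totals_by_stratum maps each stratum to the list of weighted
    totals of its sample PSUs. Within each stratum the contribution is
    n / (n - 1) * sum((t_i - mean)^2), summed over strata.
    """
    variance = 0.0
    for stratum, totals in psu_totals_by_stratum.items():
        n = len(totals)
        if n < 2:
            raise ValueError(f"stratum {stratum!r} needs at least two PSUs")
        mean = sum(totals) / n
        variance += n / (n - 1) * sum((t - mean) ** 2 for t in totals)
    return variance


psu_totals = {
    "urban": [10.0, 14.0],          # contributes 2/1 * 8  = 16
    "rural": [20.0, 20.0, 26.0],    # contributes 3/2 * 24 = 36
}
var_total = variance_of_total(psu_totals)   # 52.0
std_error = var_total ** 0.5
```

Only PSU-level totals are needed, which is why the variability from household selection within each enumeration area does not have to be modelled separately.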
One use of the standard deviation is to assess the level of precision of survey estimates. The relative standard deviation of an estimate is the ratio of the standard deviation to the size of the estimate. In general, the lower the relative standard deviation of an estimate, the higher the precision of the estimate. Thus, the working age population and the labour force, with relative standard deviations of 1.4%, are estimated with slightly more precision than employment, with a relative standard deviation of 1.5%. The results also show that unemployment and time-related underemployment are the estimates with the least precision, with relative standard deviations of 3.8% and 11.4%, respectively.
Another use of the standard error is for the calculation of confidence intervals. Under certain broad assumptions, it can be stated that the true value of the variable of interest lies, with a certain degree of probability, within an interval given by the survey estimate plus or minus a multiple of the standard error.
As it is not practical to compute and report sampling variances and standard deviations for every published statistic of the labour force survey, it is customary to calculate general standard errors using an approximate relationship between the variance of an estimate and its size. Estimates with high values, of more than 10 million, have relative standard deviations of less than 1.7% and confidence intervals (margins of error) of around +/- 225,000. At the other extreme, estimates below 50,000 have high relative standard deviations, of more than 14.3%, and margins of error of around 14,000.
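The relationship between the relative standard deviation, the standard error and the margin of error can be sketched as follows. The multiplier 1.96 (a 95% confidence level under a normal approximation) is an assumption; the source does not state the confidence level used. With that assumption, an estimate of 50,000 with a relative standard deviation of 14.3% gives a margin of error close to the figure of around 14,000 quoted above.

```python
def standard_error_from_relative(estimate, relative_sd):
    """Standard error implied by a relative standard deviation."""
    return estimate * relative_sd


def confidence_interval(estimate, standard_error, z=1.96):
    """Interval: estimate plus or minus z times the standard error."""
    margin = z * standard_error
    return estimate - margin, estimate + margin


se = standard_error_from_relative(50_000, 0.143)   # 7,150
low, high = confidence_interval(50_000, se)        # margin of about 14,014
```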
COVERAGE ERRORS
Probability sampling requires each element in the target population to have a known non-zero probability of being selected in the sample. This condition is violated if the target population is not fully represented in the sample frame or if the sample selection of units from the frame is not according to the procedures specified in the sample design. The violation of these conditions generates coverage errors.
Coverage errors may occur due to imperfect frame (under-coverage, over-coverage, or duplication of units) or to practical problems such as confusion in boundary of units or in rules of association between units of different types. Coverage errors may also occur at the stage of selection of individual persons in the sample household because of failure to identify some eligible persons, for example, lodgers, domestic workers or other non-family members of the household. It can even happen due to incorrect data on personal characteristics, for example, if the age of the person is incorrectly recorded as below the age set for measuring labour force characteristics (under-coverage error), or vice versa the age is incorrectly recorded as above the threshold age (over-coverage error).
A measure of coverage errors in the LFS 2013-14 is obtained by comparing the number of households in the sampling enumeration area obtained during the listing operations with the corresponding number according to the population census 2004, discussed earlier in connection with Table B5.
NON-RESPONSE ERRORS
Non-response occurs due to failure to obtain the required information from the units selected in the sample (unit non-response) or to failure to obtain some items of information for the selected unit (item non-response). Unit non-response may occur due to incorrect address of the sample household, or inaccessibility of certain dwellings or refusal of the sample household to be interviewed, or because no one was at home when the interviewer contacted the household, or for other reasons. Vacant or demolished dwellings, non-existent or out-of-scope addresses, such as finding an enterprise or workshop instead of a household dwelling, are not generally considered as unit non-response.
Among the 13,167 target sample households, some 12,646 provided data for all members of the household and 16 for some but not all members. In addition, 323 eligible sample households could not be contacted due to temporary absence and 155 refused to participate in the survey. There were also 11 sample households that could not be contacted because the dwelling was found vacant or the address could not be found. Finally, there were 2 sample dwellings found destroyed and an additional 14 that could not be interviewed for other reasons.
In total, there were 13,140 eligible households, among which 12,662 responded and 478 did not respond, giving a non-response rate of 3.6%. The non-response rate was about the same in all quarters (4.4% in Q1, 3.3% in Q2, 2.8% in Q3 and 4.0% in Q4). Corrections for non-response errors were made by inflating the design weights for each quarter by the inverse of the response rate (one minus the non-response rate defined above) for each sample enumeration area, as described earlier. This procedure assumes that non-respondent households within an enumeration area have characteristics similar to those of the responding households in that area.
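The non-response rate can be recomputed from the household counts reported above. In this sketch the outcome labels are hypothetical names for the categories in the text, and eligibility is taken to mean completed, partially completed, absent and refused households, which is the grouping that reproduces the 13,140 eligible households.

```python
# Visit outcomes for the 13,167 target sample households, as reported.
outcomes = {
    "completed": 12_646,
    "partially_completed": 16,
    "absent": 323,
    "refused": 155,
    "vacant_or_not_found": 11,
    "destroyed": 2,
    "other": 14,
}

responding = outcomes["completed"] + outcomes["partially_completed"]   # 12,662
non_responding = outcomes["absent"] + outcomes["refused"]              # 478
eligible = responding + non_responding                                 # 13,140

non_response_rate = non_responding / eligible      # about 0.036, i.e. 3.6%
weight_inflation = 1 / (1 - non_response_rate)     # inverse of response rate
```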
RESPONSE ERRORS
Response errors refer to errors originating at the data collection stage. In relation to an individual respondent, response errors may occur because the respondent was unwilling to divulge certain information, did not know the answer to the question asked or did not fully understand the meaning of the question. Response errors can also occur due to memory lapses, for example by forgetting to report an event or incorrectly reporting its timing. Response errors may also occur because of errors made by the interviewer or by the instrument used for measurement. Interviewers may introduce errors because of haste and misreporting of the responses, because of misunderstanding of the survey concepts and procedures, or because of preconceptions and subjective biases. The questionnaire itself may be faulty, with wrong question wordings and incorrect skipping patterns.
The measurement of response errors is one of the most difficult parts of quality assessment of survey data. It generally requires carefully designed re-interview programs. In the absence of such data, the quality of survey responses may be assessed by measuring the degree of self-response against proxy-response, or by testing the internal consistency of certain sets of inter-related responses, or by comparison of the survey results with corresponding information from more reliable external sources such as administrative sources.
The total number of teachers in primary and secondary education from the administrative source (229,405) is substantially higher than the corresponding estimate from the labour force survey (171,722). In relative terms, the difference is more significant for secondary education than for primary education. The difference between the two sources may be due to differences in definitions and classifications. The survey estimates refer to teachers in their main jobs, while the administrative source covers all teachers, whether teaching is their main or secondary activity.
The number of civil service employees from the administrative source (589,806) is about ten percent higher than the corresponding estimate from the labour force survey (533,444). The unaccounted difference (-56,162) may be partly due to civil service employees classified in other branches of economic activity in the labour force survey, for example, civil service employees employed in public radio and television institutions, national museums or embassies abroad. The differences between the survey estimates and the administrative source are larger than the sampling errors, especially for women employees.
Other comparisons with administrative sources may be performed, for example, comparing the survey estimate of the number of unemployed jobseekers reporting to be registered at the labour office or at civil service bureau with the corresponding data from the administrative source.
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.