Survey ID Number
THA_1987_DHS_v01_M
Title
Demographic and Health Survey 1987
Estimates of Sampling Error
The sample of women selected in the TDHS is only one of many samples of the same size that could have been selected from the same population, using the same design. Each one would have yielded results that differed somewhat from the actual sample selected. The variability observed between all possible samples constitutes sampling error, which, although it is not known exactly, can be estimated from the survey results. Sampling error is usually measured in terms of the "standard error" of a particular statistic (mean, percentage, etc.), which is the square root of the variance of the statistic across all possible samples of equal size and design. The standard error can be used to calculate confidence intervals within which one can be reasonably assured that the true value of the variable for the whole population falls. For example, for any given statistic calculated from a sample survey, the value of that same statistic as measured in 95 percent of all possible samples of identical size and design will fall within a range of plus or minus two times the standard error of that statistic.
If the sample of women had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the TDHS sample design involved stratification, multiple stages, and clusters; consequently, it is necessary to use more complex formulas. The computer package CLUSTERS was used to compute the sampling errors with the proper statistical methodology.
In addition to the standard errors, CLUSTERS computes the design effect (DEFT) for each estimate, which is defined as the ratio between the standard error using the given sample design and the standard error that would result if a simple random sample had been used. A DEFT value of one indicates that the sample design is as efficient as a simple random sample and a value greater than one indicates the increase in the sampling error due to the use of a more complex and less statistically efficient design.
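The DEFT definition above can be illustrated with a minimal sketch. The function names and the numbers below are hypothetical and chosen only for illustration; they are not TDHS results, and this is not the CLUSTERS implementation.

```python
import math


def srs_se_proportion(p: float, n: int) -> float:
    """Standard error of a proportion under simple random sampling
    (no finite population correction; an assumption of this sketch)."""
    return math.sqrt(p * (1.0 - p) / n)


def deft(se_design: float, se_srs: float) -> float:
    """Ratio of the standard error under the actual design to the
    standard error a simple random sample would have given."""
    if se_srs <= 0:
        raise ValueError("se_srs must be positive")
    return se_design / se_srs


# Hypothetical inputs: a proportion of 0.5 estimated from 2,500 cases,
# with a design-based standard error of 0.013.
se_srs = srs_se_proportion(0.5, 2500)   # 0.01
example_deft = deft(0.013, se_srs)
print(round(example_deft, 2))           # 1.3
```

A value of 1.3 here would mean the standard error is about 30 percent larger than under simple random sampling with the same number of cases.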
On the survey data file, sample blocks/villages have been given sequential numbers reflecting the order in which they were selected. For the two-stage sample in Bangkok, the clusters (241-288) form the primary sampling units. Because of systematic selection in the specified order, these can be taken as pairs to form 24 "implicit" strata for variance computation. (Alternatively, they can be paired successively, number 241 with 242, 242 with 243, etc., to form 47 successive pairs for more stable variance estimates.) In each of the remaining sampling domains, with three sampling stages, each pair of successive blocks/villages forms a single primary sampling unit (PSU), e.g., 001 and 002 together, 003 and 004 together, etc. This gives 24 PSUs per domain. The resulting PSUs can in turn be paired to form 12 implicit strata.
Practical methods of variance computation require certain weighted aggregates only at the PSU level, separated into implicit strata. Sample weights have been coded onto the record of each individual sample case in the survey data file. Variances can therefore be estimated on the basis of the above information reflecting the structure of the sample.
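As a rough illustration of how PSU-level weighted aggregates and implicit strata enter the computation, the sketch below estimates the standard error of a weighted ratio (for a mean, the denominator variable is 1 for every case) using the common Taylor linearization formula. This is an assumption about method: the source does not state the exact formula used by CLUSTERS. The record layout, the `pair_stratum` mapping, and the toy data are all hypothetical, and the finite population correction is ignored.

```python
import math
from collections import defaultdict


def ratio_se(records, stratum_of):
    """Estimate r = sum(w*y) / sum(w*x) and its standard error.

    records    : iterable of (psu, weight, y, x) tuples, one per case
    stratum_of : function mapping a PSU number to its implicit stratum
    """
    # Weighted aggregates are needed only at the PSU level.
    y_psu = defaultdict(float)
    x_psu = defaultdict(float)
    for psu, w, y, x in records:
        y_psu[psu] += w * y
        x_psu[psu] += w * x

    y_tot = sum(y_psu.values())
    x_tot = sum(x_psu.values())
    r = y_tot / x_tot

    # Linearized PSU residuals z = y - r*x, grouped by implicit stratum.
    z_by_stratum = defaultdict(list)
    for psu in y_psu:
        z_by_stratum[stratum_of(psu)].append(y_psu[psu] - r * x_psu[psu])

    var_sum = 0.0
    for z in z_by_stratum.values():
        m = len(z)
        if m < 2:
            raise ValueError("each stratum needs at least two PSUs")
        z_sum = sum(z)
        var_sum += m / (m - 1) * (sum(v * v for v in z) - z_sum ** 2 / m)

    return r, math.sqrt(var_sum) / x_tot


def pair_stratum(psu):
    """Hypothetical pairing: PSUs 1-2 form stratum 0, PSUs 3-4 stratum 1, etc."""
    return (psu - 1) // 2


# Toy data (not TDHS): four PSUs, two women each, weight 1, y = children ever born.
toy = [
    (1, 1.0, 2, 1), (1, 1.0, 4, 1),
    (2, 1.0, 1, 1), (2, 1.0, 3, 1),
    (3, 1.0, 3, 1), (3, 1.0, 3, 1),
    (4, 1.0, 2, 1), (4, 1.0, 2, 1),
]
r, se = ratio_se(toy, pair_stratum)
print(r, round(se, 4))  # 2.5 0.3536
```

With exactly two PSUs per stratum, each stratum's contribution reduces to the squared difference between the two PSU residuals, which is why pairing PSUs into implicit strata keeps the computation simple.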
In general, the sampling errors for the country as a whole are small, which means that the TDHS results are reliable. For example, for the variable children ever born, the overall mean from the sample is 2.747 and its standard error is 0.042. To obtain the 95 percent confidence limits, one adds twice the standard error to the sample estimate and subtracts twice the standard error from it, which means there is a high probability (95 percent) that the true average number of children ever born for all Thai women falls within the interval of 2.664 to 2.830.
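The arithmetic behind these confidence limits can be sketched as follows. The function name is hypothetical. Using the rounded figures quoted above (2.747 and 0.042), the limits come out as 2.663 and 2.831, which differ from the published interval in the last digit; presumably the published limits were computed from unrounded values, though that is an assumption.

```python
def confidence_limits(estimate, standard_error, multiplier=2.0):
    """Return (lower, upper) limits as estimate -/+ multiplier * standard error."""
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width


# Mean number of children ever born and its standard error, as quoted in the text.
lower, upper = confidence_limits(2.747, 0.042)
print(f"{lower:.3f} to {upper:.3f}")  # 2.663 to 2.831
```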