Estimates of Sampling Error

The estimates from a sample survey are affected by two types of errors--nonsampling and sampling. Nonsampling errors result from mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made to minimise this type of error during the implementation of the TDHS, nonsampling errors are impossible to avoid and difficult to evaluate statistically.

Sampling errors, on the other hand, can be evaluated statistically. The sample of women selected in the TDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that would differ somewhat from the results of the actual sample selected. The sampling error is a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.

Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the ratio of the standard deviation to the square root of the sample size. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
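As a minimal illustration of the definition above, the following Python sketch computes the standard error of a mean under simple random sampling and the corresponding limits at plus or minus two standard errors. The function names and the sample values are hypothetical and are not taken from the TDHS data.

```python
import math
import statistics


def standard_error_srs(values):
    """Standard error of a mean under simple random sampling: s / sqrt(n)."""
    n = len(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)


def confidence_limits(estimate, se, multiplier=2.0):
    """Lower and upper limits at estimate -/+ multiplier * se."""
    return estimate - multiplier * se, estimate + multiplier * se


# Hypothetical sample of eight observations (illustration only).
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(sample)
se = standard_error_srs(sample)
low, high = confidence_limits(mean, se)
print(f"mean={mean:.3f} se={se:.3f} limits=({low:.3f}, {high:.3f})")
```

This simple formula applies only to a simple random sample; it understates the sampling error of a clustered design such as the TDHS.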

If the sample of women had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the TDHS sample is the result of a three-stage stratified design, and, consequently, it was necessary to use more complex formulas. The computer package CLUSTERS, developed by the International Statistical Institute for the World Fertility Survey, was used to compute the sampling errors for 42 variables with the proper statistical methodology.

The CLUSTERS package treats any percentage or average as a ratio estimate, r = y/x, where y represents the total sample value for variable y, and x represents the total number of cases in the group or subgroup under consideration. The variance of r is computed using the formula given below, with the standard error being the square root of the variance,

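$$\mathrm{var}(r) = \frac{1-f}{x^{2}} \sum_{h=1}^{H} \left[ \frac{m_h}{m_h - 1} \left( \sum_{i=1}^{m_h} z_{hi}^{2} - \frac{z_h^{2}}{m_h} \right) \right]$$

in which

$$z_{hi} = y_{hi} - r\,x_{hi}, \quad \text{and} \quad z_h = y_h - r\,x_h$$

where h denotes the stratum (from 1 to H), m_h is the number of clusters selected in stratum h, y_hi and x_hi are the totals of y and of the number of cases in cluster i of stratum h, y_h and x_h are the corresponding stratum totals, and f is the overall sampling fraction.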
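The calculation can be sketched in Python as follows. This is not the CLUSTERS code; it is a small illustration that assumes the usual stratified-cluster form of the variance, in which each stratum contributes m_h/(m_h - 1) times the sum of squared cluster residuals z_hi minus z_h squared over m_h. The function name, the data layout, and the toy figures are hypothetical.

```python
import math


def ratio_variance(strata, f=0.0):
    """Ratio estimate r = y/x and its linearized variance.

    strata: list of strata; each stratum is a list of (y_hi, x_hi)
            cluster totals. f: overall sampling fraction (assumed
            negligible by default).
    """
    y = sum(y_hi for clusters in strata for y_hi, _ in clusters)
    x = sum(x_hi for clusters in strata for _, x_hi in clusters)
    r = y / x

    total = 0.0
    for clusters in strata:
        m_h = len(clusters)
        if m_h < 2:
            raise ValueError("each stratum needs at least two clusters")
        z = [y_hi - r * x_hi for y_hi, x_hi in clusters]  # z_hi
        z_h = sum(z)                                       # z_h = y_h - r * x_h
        total += m_h / (m_h - 1) * (sum(v * v for v in z) - z_h ** 2 / m_h)

    return r, (1.0 - f) / x ** 2 * total


# Toy data: two strata with two clusters each (illustration only).
toy = [[(3, 10), (5, 10)], [(4, 10), (8, 10)]]
r, var_r = ratio_variance(toy)
se_r = math.sqrt(var_r)
print(f"r={r:.3f} var={var_r:.5f} se={se_r:.4f}")
```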

In addition to the standard errors, CLUSTERS computes the design effect (DEFT) for each estimate, which is defined as the ratio of the standard error using the given sample design to the standard error that would result if a simple random sample had been used. A DEFT value of 1.0 indicates that the sample design is as efficient as a simple random sample, whereas a value greater than 1.0 indicates the increase in the sampling error due to the use of a more complex and less statistically efficient design. CLUSTERS also computes the relative error and confidence limits for the estimates.
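The DEFT ratio can be illustrated with a short Python sketch for a proportion. It assumes that the simple random sample standard error is approximated by sqrt(p(1 - p)/n); the function name and the input values are hypothetical, and CLUSTERS may compute the simple random sample term differently.

```python
import math


def deft_for_proportion(se_design, p, n):
    """DEFT = design-based SE divided by the SRS SE of a proportion."""
    se_srs = math.sqrt(p * (1.0 - p) / n)  # assumed SRS approximation
    return se_design / se_srs


# Hypothetical inputs: proportion 0.5, 2,500 cases, design-based SE 0.013.
value = deft_for_proportion(0.013, 0.5, 2500)
print(f"DEFT={value:.2f}")
```

A value above 1.0, as here, means the complex design gives a larger standard error than a simple random sample of the same size would.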

The results for the 42 variables mentioned, which are those considered to be of primary interest, are presented in an appendix to the Final Report for the country as a whole, for urban and rural areas, for the five regions, and for age groups. The type of statistic (mean or proportion) and the base population for each variable are given in Table C.1 of the Final Report. Tables C.2 to C.12 present the value of the statistic (R), its standard error (SE), the number of unweighted (N) and weighted (WN) cases, the design effect (DEFT), the relative standard error (SE/R), and the 95 percent confidence limits (R ± 2SE), for each variable.

Additionally, sampling errors were calculated for the total fertility rate of the last year prior to the survey date and the infant mortality rate for the 5 years preceding the survey, for the national total, and for urban-rural areas. These calculations were undertaken using the Jackknife methodology rather than the CLUSTERS package because of the nature of these two estimates. The Jackknife methodology is based on having replicate values for the estimates and applying the simple standard error formulae to these replicates.

The TDHS included 478 clusters. Each replication uses all clusters except one, deleting a different cluster each time, and yields one pseudoindependent replicate. In total, 478 replications for the infant mortality and total fertility rates create the pseudoindependent values:

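$$e_{(i)} = 478 \cdot \text{estimate(all clusters)} - 477 \cdot \text{estimate(all clusters minus the } i\text{th)}$$

$$e = \text{estimate(all clusters)}$$

and the sampling error for the estimate is given by:

$$SE(\text{estimate}) = \left\{ \frac{\sum_{i=1}^{478} \left( e_{(i)} - e \right)^{2}}{478\,(478-1)} \right\}^{1/2}$$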
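The delete-one-cluster procedure can be sketched in Python as follows, with k standing in for the 478 clusters. The function names, the estimator, and the toy cluster totals are hypothetical; the real calculation applies the fertility and mortality rate estimators to the TDHS cluster data.

```python
import math


def jackknife_se(clusters, estimator):
    """Delete-one-cluster Jackknife standard error.

    clusters: list of per-cluster data; estimator: function mapping a
    list of clusters to a single estimate.
    """
    k = len(clusters)
    full = estimator(clusters)  # e = estimate(all clusters)
    pseudo = []
    for i in range(k):
        reduced = clusters[:i] + clusters[i + 1:]  # all clusters minus the i-th
        pseudo.append(k * full - (k - 1) * estimator(reduced))
    variance = sum((e_i - full) ** 2 for e_i in pseudo) / (k * (k - 1))
    return full, math.sqrt(variance)


def ratio_of_totals(clusters):
    """Hypothetical estimator: sum of y over sum of x across clusters."""
    return sum(y for y, _ in clusters) / sum(x for _, x in clusters)


# Toy data: four clusters of (y, x) totals (illustration only).
toy_clusters = [(3, 10), (5, 10), (4, 10), (8, 10)]
estimate, se = jackknife_se(toy_clusters, ratio_of_totals)
print(f"estimate={estimate:.3f} se={se:.4f}")
```

Passing the estimator as a function keeps the replication loop independent of the statistic, which is why the same procedure can serve both the infant mortality rate and the total fertility rate.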

The results of the calculations using the Jackknife methodology to estimate sampling errors for the infant mortality rate and the total fertility rate for the national total, for urban and rural areas, and for the five major regions are shown in Table C.13 of the Final Report.

The confidence interval (e.g., as calculated for EVBORN) can be interpreted as follows: the overall average from the national sample is 3.041 and the standard error is 0.044. Therefore, to obtain the 95 percent confidence limits, one adds and subtracts twice the standard error to the sample estimate, i.e., 3.041 ± 0.088. There is a high probability (95 percent) that the true average number of children ever born to all women age 15 to 49 is between 2.954 and 3.128.

Of the 42 variables for which CLUSTERS was used for the estimation of sampling errors, 28 are based on women, and 14 are based on children under age 5. In general, the relative standard error for most estimates for the country as a whole is small, except for estimates of very small proportions. There are some differentials in the relative standard error for the estimates of subpopulations such as urban and rural areas. For example, for the variable SECATT (secondary school attendance), the relative standard errors as a percent of the estimated proportion for urban and rural areas are 4.6 percent and 12.5 percent, respectively. The same is true for SECGRD (proportion of women who completed secondary school) with values of 5 percent and 14.2 percent, for XCUPIL (current use of the pill) with values of 8.1 and 13.6 percent, for XCUIUD (current use of IUD) with values of 3.4 and 8.5 percent, and for XCUPAB (current use of periodic abstinence) with values of 17 percent and 0 percent, for urban and rural areas, respectively for each variable. Of the 42 variables, 24 were found to have SE/R values of less than 0.03, which means that the SE of those variables is at most 3 percent of the estimate. SE/R values are between 0.031 and 0.059 for 13 variables, and greater than 0.06 for only 5 variables, with a maximum value of 0.166 (16.6 percent). The variables with the highest SE/R values are the ones calculated for relatively rare events.

The DEFT value is less than 1.3 for 24 variables; between 1.31 and 1.5 for 13 variables; and greater than 1.51 for only 5 variables. The maximum DEFT value obtained is 1.668. The average DEFT across the 42 variables is 1.301. For 41 variables (the URBAN variable being excluded), the average is 1.213 in urban areas and 1.293 in rural areas.