Survey ID Number
RUS_1996_RLMS-R7_v01_M
Title
Russia Longitudinal Monitoring Survey - Higher School of Economics 1996
Sampling Procedure
The sample was designed to allow the analysis of household data, as well as of data on all individuals residing in those households. "Household" was defined as a group of people who live together in a given domicile, and who share common income and expenditures. Households were defined to include unmarried children, 18 years of age or younger, who were temporarily residing outside the domicile at the time of the survey.
In addition, however, we naturally kept track of the identity of particular households and individuals so that it would be possible to conduct meaningful longitudinal analyses. Occasionally, this proved to be complicated. For example, several households in Round V split into two households without moving from their dwelling units in Round VI. They were no longer sharing income and expenditures, and therefore no longer qualified as a single household under the definition given above. Both households were interviewed, and the link to the common household in Round V is provided in the data set.
The same is true of split households in Round VII, as well as of two joined households in which people in different dwelling units married and continued to live in a dwelling in the sample. Furthermore, as of Round VII, we followed households who moved out of the sample of dwellings in order to maintain the quality of longitudinal studies as well as possible. These moved households and individuals are not part of the sample of households based on dwellings, and a convenient indicator variable allows analysts to omit them from cross-sectional analyses. However, they are part of the sample of Round V and VI households followed over time, and can legitimately be used in longitudinal analyses.
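The distinction between the dwelling-based cross-sectional sample and the followed longitudinal sample can be illustrated with a minimal sketch. The record layout and the field name `in_dwelling_sample` are hypothetical stand-ins for the indicator variable mentioned above; the actual variable name in the data set is not given here.

```python
# Hypothetical household records. "in_dwelling_sample" is an assumed name
# for the indicator that flags households still living in sampled dwellings.
households = [
    {"household_id": 1, "in_dwelling_sample": True},
    {"household_id": 2, "in_dwelling_sample": False},  # moved out, but followed
    {"household_id": 3, "in_dwelling_sample": True},
]

def cross_sectional(records):
    """Keep only households that belong to the dwelling-based sample."""
    return [r for r in records if r["in_dwelling_sample"]]

def longitudinal(records):
    """Keep all followed households, including those that moved."""
    return list(records)

cross_ids = [r["household_id"] for r in cross_sectional(households)]
panel_ids = [r["household_id"] for r in longitudinal(households)]
```

In this toy example the moved household is excluded from the cross-sectional list and retained in the longitudinal one.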
A multistage probability sample was employed to draw the sample of dwelling units. First, a list of 2,029 consolidated regions (similar to counties) was created from which to draw primary sample units (PSUs). These were allocated into 38 strata based largely on geographical factors and level of urbanization, but also based on ethnicity where there was salient variability. As in many national surveys involving face-to-face interviews, some remote areas were eliminated to contain costs; also, Chechnya was eliminated due to armed conflict. From among the remaining regions (containing more than 95% of the population), three very large population units were selected with certainty: Moscow city, Moscow Oblast, and St. Petersburg city each constituted self-representing (SR) strata. The remaining non-self-representing regions (NSRs) were allocated to 35 equal-sized strata. One region was then selected from each NSR stratum using the method "probability proportional to size" (PPS). That is, the probability that a region in a given NSR stratum was selected was directly proportional to its measure of population size.
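The PPS step described above can be sketched as follows. This is a minimal illustration of drawing one region per stratum with probability proportional to size, not the procedure actually used by the survey team; the region names, population figures, and function names are invented for the example.

```python
import random

def selection_probabilities(regions):
    """Return each region's selection probability: its size over the stratum total."""
    total = sum(size for _, size in regions)
    return {name: size / total for name, size in regions}

def select_pps(regions, rng):
    """Draw one region from a stratum with probability proportional to size."""
    total = sum(size for _, size in regions)
    threshold = rng.random() * total  # uniform point on [0, total)
    cumulative = 0.0
    for name, size in regions:
        cumulative += size
        if threshold < cumulative:
            return name
    return regions[-1][0]  # guard against floating-point edge cases

# Hypothetical stratum: (region name, measure of population size).
stratum = [("Region A", 120_000), ("Region B", 60_000), ("Region C", 20_000)]

probs = selection_probabilities(stratum)
chosen = select_pps(stratum, random.Random(0))
```

Here Region A holds 60% of the stratum's population, so it has a 0.6 chance of being the selected PSU, while Region C has a 0.1 chance.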
In addition, however, the interviewer conducted individual interviews with as many household members 14 and older as possible, acquiring data about their individual activities and health. Data for children 13 and younger were obtained from adults in the household, and were entered in children's questionnaires. In the relatively small percentage of cases where adults refused or were absent, surrogate adults in the family were not used to supply information for the missing adult. By virtue of the fact that virtually all members of households were interviewed, the sample constitutes a proper probability sample of individuals as well as of households, without any special weighting beyond that used for dwellings or households.
The sample was designed in an effort to obviate the need for weighting as much as possible. In general, this aim was achieved. It is unlikely that using weights will affect substantive results. Nevertheless, two kinds of weights have been calculated to compensate for imperfections in the sampling procedure. First, though the sampling procedure aimed at giving all dwelling units equal probability of selection, in practice this goal was not perfectly met. One set of weights, then, corrects for the fact that some strata were slightly larger than others, and that some SSUs selected with equal probability (rather than with PPS) turned out to be larger than others within the same PSU. It also corrects for disparate response rates across PSUs and SSUs. The second set of weights matches the sample of households and individuals to the 1989 census. The household sample is matched by urban-rural distribution and by household size; the individual sample is matched by the joint distribution of age, sex, and urban-rural location.
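The second set of weights can be illustrated with a minimal post-stratification sketch, in which each unit's weight is the population share of its cell divided by the sample share of that cell. The cell labels and the census shares below are invented for illustration and are not the 1989 census figures; the function name is likewise an assumption, and the actual RLMS weights may have been computed differently.

```python
from collections import Counter

def post_stratification_weights(sample_cells, population_shares):
    """Weight each unit by (population share of its cell) / (sample share of its cell)."""
    n = len(sample_cells)
    counts = Counter(sample_cells)
    factor = {
        cell: population_shares[cell] / (counts[cell] / n)
        for cell in counts
    }
    return [factor[cell] for cell in sample_cells]

# Hypothetical sample of 10 households: 8 urban, 2 rural.
sample = ["urban"] * 8 + ["rural"] * 2
# Hypothetical census shares (illustrative only).
shares = {"urban": 0.7, "rural": 0.3}

weights = post_stratification_weights(sample, shares)
weighted_urban_share = sum(
    w for w, cell in zip(weights, sample) if cell == "urban"
) / sum(weights)
```

In this example urban households are over-represented in the sample (80% against an assumed 70% in the population), so they receive a weight below 1 and rural households a weight above 1; after weighting, the urban share of the sample matches the assumed population share.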
The general observation is that the combined influence of nonresponse attrition and household turnover does not seriously distort the geographic distribution of the sample or its size or household-head characteristics. The distributions for the geographic variables indicate that, between Round V and Round VII, there is a decline in the nominal representation of households in the Moscow/St. Petersburg region, reflected in a decline in the proportion of sample households from the urban domain. Households with a male head aged 18-59 may be subject to slightly higher than average attrition/net loss in replacement. If we focus only on these characteristics, the problem is not serious.
In summary, the net effect of nonresponse attrition and change in dwelling unit occupants across rounds on the marginal characteristics of the observed cross-sectional samples is modest. Loss in nominal "sample share" between Rounds V and VII is greatest for residents of Moscow/St. Petersburg--a loss in representation that is readily corrected with the combined sample selection/nonresponse adjustment factors that have been computed for each round. It is important to note that the simple analysis described here cannot demonstrate that no uncorrected attrition bias remains. The potential for uncorrected nonresponse bias can be specific to the dependent variable under study. Nevertheless, it appears that, with the nonresponse and post-stratification adjustments developed by Michael Swafford, the potential for serious attrition bias in repeated cross-section analysis is small.
On the basis of a probability sample of 3,591 households, as well as some 10,000 members of those households, the RLMS-Round VII provides more than 3,000 variables from which to construct many indices of material well-being at several levels of measurement: individual, household, and community. Since the files are linked, it is possible to study contextual effects on the welfare of individuals and households, as well as change over time among households and individuals.