The Survey Assessment of Vietnamese Youth is the first nationwide baseline survey of youth ever undertaken in Viet Nam.
The Survey Assessment of Vietnamese Youth (SAVY), undertaken in late 2003, was a collaboration between the Ministry of Health and the General Statistics Office, with technical and financial support from the World Health Organization (WHO) and the United Nations Children’s Fund (UNICEF).
This is the first nationwide baseline survey of youth ever undertaken in Viet Nam. It mainly aims to collect data on various aspects of youth life in order to inform policy and programmes in the adolescent and youth health and development area.
SAVY reveals a positive picture of Vietnamese youth as they face both challenges and opportunities in a changing economic and social environment. Compared with young people in other Asian countries, Vietnamese youth display relatively less risky behaviour, are supported by protective factors and are optimistic and eager to build a prosperous country. However, this survey does reveal that some young people will encounter considerable challenges in their transition to adulthood, unless provided with support. It is important that parents, the community and the government, with the support of international agencies and young people, work together to ensure the healthy development of young people in Viet Nam.
The survey involved 7,584 youth aged 14-25 years from 42 provinces across the country, from the smallest rural hamlet to the largest cities. Using a household sample, youth were invited to a central location to complete both a face-to-face interview and a self-administered anonymous survey which contained sensitive questions young people could answer in private. What results is the most extensive understanding of the social life, attitudes and aspirations of young Vietnamese people today.
- Provide information that can best inform future initiatives to promote the healthy development of youth across the country;
- Inform policy and program development in the Adolescent and Youth Health area in the immediate future; and
- Provide baseline data about Vietnamese youth to identify trends and patterns in the coming years.
The questionnaire was designed through a very dynamic process, where experience from previous surveys was examined and opinions of young people were actively solicited to ensure quality and relevance. The specific information collected through the questionnaire includes:
Vocational training, Work and employment
Puberty: knowledge and behaviors about reproductive health
Dating and friendships
Injury, illness and physical health
Attitudes, perceptions and behaviors
Social factors and emotional wellbeing
SAVY is a collaborative effort between many agencies and young people. It is the result of extensive investment and partnership building between the Vietnamese Government, through the Ministry of Health and the General Statistics Office, and United Nations agencies, notably the World Health Organisation and the United Nations Children's Fund. Several other organizations, from a variety of sectors, also contributed to the endeavor, notably the Ministry of Education and Training (MoET), the Central Youth Union (YU) and the Vietnam Women's Union (VWU). In order to ensure that the survey was methodologically sound, the East-West Center (Honolulu, Hawaii) provided intensive technical assistance.
Results from the survey include national reports and micro-level datasets. The datasets are provided in *.sav (SPSS) and *.dta (Stata) formats.
For more information and electronic files of SAVY, visit: http://www.moh.gov.vn/SKSS/Savy_htm/savy.htm
Kind of Data
Sample survey data [ssd]
Unit of Analysis
Youth aged 14-25 years
Version 01: Edited data used for preliminary report.
The scope of the Survey Assessment of Vietnamese Youth includes:
Young People as part of Vietnamese Families; Education; Work and Employment; Friendships, Dating, Sexuality and Reproductive Health; Pregnancy and Abortion Experiences; Awareness, Knowledge and Sources of Reproductive Health Information; HIV/AIDS; Substance Use; Health Compromising and Problem Behaviours; Accidents, Injury and Physical Health; Mental Wellbeing, Aspirations and Expectations.
The survey covered all youths aged 14-25 years resident in the household. The SAVY sample did not include Vietnamese youth not living with their families nor those living in military barracks, social protection centers, dormitories, re-education centers and drug treatment centers.
Producers and sponsors
General Statistics Office
Ministry of Planning and Investment
Ministry of Health
Government of Vietnam
World Health Organisation
United Nations Children's Fund
The SAVY sample is a nationally representative sample of youth (persons aged 14-25 years) living in households across the eight economic regions of Viet Nam. The sample was drawn from the sub-sample of 45,000 households in the 2002 Viet Nam Living Standards Survey (VLSS 2002), within a multi-stage, stratified design. The youth in the SAVY sample design are sufficient to represent the nation as a whole, as well as urban and rural areas separately. The largest cities (Hanoi and Ho Chi Minh City) were oversampled in order to provide increased statistical power in that segment of the total population of youth.
Forty-two out of 61 provinces were selected for the SAVY sample, using the probability proportional to size (PPS) method to maintain representativeness. At the next stage of sampling, enumeration areas (EAs) in each province were selected. In the sampled EAs, all youth aged 14 through 25 (i.e., those born between 1978 and 1989), male and female, married and unmarried, were identified from the 20 households that had been selected for the VLSS 2002. The youth cohort represents all youth except those living in special arrangements, such as barracks, re-education centers, social protection centers, factories and dormitories.
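The PPS selection described above can be illustrated with a short sketch. The source does not say which PPS variant was used, so the systematic variant below is an assumption, and the province names and sizes are hypothetical placeholders rather than SAVY data.

```python
import random

def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    sizes: dict mapping unit name -> size measure (hypothetical populations).
    Returns a list of selected names; a very large unit can appear twice.
    """
    names = list(sizes)
    total = sum(sizes.values())
    interval = total / n                  # sampling interval on the cumulative scale
    start = rng.random() * interval       # random start inside the first interval
    selected = []
    idx = 0
    cumulative = sizes[names[0]]
    for k in range(n):
        target = start + k * interval
        # Move forward until the cumulative size passes the target point.
        while cumulative <= target and idx < len(names) - 1:
            idx += 1
            cumulative += sizes[names[idx]]
        selected.append(names[idx])
    return selected

# Hypothetical size measures for four units (not real province data).
province_sizes = {"A": 600, "B": 100, "C": 100, "D": 200}
selected = pps_systematic(province_sizes, 2, random.Random(42))
print(selected)
```

Because unit "A" is larger than the sampling interval (500), it is selected on every draw, which is the behaviour PPS is meant to produce for large units.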
The 61 provinces in the VLSS 2002 sample included 2,250 EAs, and the 42 provinces selected for SAVY included 1,643 EAs. From these, a total of 446 EAs were selected for the SAVY sample. These EAs contained 8,920 households, corresponding to a population of 40,140 (about 4.5 persons per household). Since youth aged 14-25 account for 24.5% of the total population (the figure in the 1999 census), the anticipated number of youth in the SAVY sample was approximately 9,835. If the mobilization rate (percentage of eligible youth actually interviewed) were 90%, the number of youth interviewed would be estimated at about 8,850. In the actual SAVY field experience, the mobilization rate was 85% and the number of completed interviews was 7,584.
The sample is therefore representative and provides sufficient cases for analysis at the national level, for urban and rural sectors, by gender, and for each of the regions. Further detail on the sampling methodology is provided in the Appendix of the Final Report.
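The sample-size arithmetic in the paragraph above can be retraced in a few lines. This is only a sketch using the figures quoted in the text; the variable names are illustrative.

```python
# Figures quoted in the text above.
eas_selected = 446
households_per_ea = 20
persons_per_household = 4.5
youth_share = 0.245            # share of population aged 14-25 (1999 census)
anticipated_youth = 9835       # rounded figure given in the text
completed_interviews = 7584

households = eas_selected * households_per_ea        # households in sampled EAs
population = households * persons_per_household      # implied household population
expected_youth = population * youth_share            # anticipated eligible youth
planned_interviews = 0.90 * anticipated_youth        # under a 90% mobilization rate
realized_share = completed_interviews / anticipated_youth

print(f"households:         {households:,}")
print(f"population:         {population:,.0f}")
print(f"expected youth:     {expected_youth:,.1f}")
print(f"planned interviews: {planned_interviews:,.1f}")
print(f"completed / anticipated: {realized_share:.1%}")
```

The last ratio corresponds to the 77.1 percent response rate quoted elsewhere in this document. It differs from the 85% mobilization rate, which presumably uses a different denominator that the source does not spell out.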
Of the 9,835 youths selected for the sample, all were found to be occupied. Among these youths, 7,584 youths were successfully interviewed, which corresponds to a response rate of 77.1 percent.
Among those who did agree to go to the central location for interviewing, almost no one refused to answer the questions or fill in the self-completed part of the survey. The survey method (face-to-face interview with a self-administered second part), the quality of interviewers and the organization of the field work, including extensive supervision, were important factors that ensured the quality of the SAVY data.
Sample weights were calculated and attached to the datafiles.
Dates of Data Collection
Data Collection Mode
Interviewing was conducted by teams of interviewers. Each interviewing team comprised 4 interviewers and a field editor.
The role of the supervisor was to coordinate field data collection activities, including management of the field teams, supplies and equipment, finances, maps and listings; to coordinate with local authorities concerning the survey plan; and to make arrangements for accommodation and travel. Additionally, the field supervisor assigned the work to the interviewers, spot-checked work, maintained field control documents, and sent completed questionnaires and progress reports to the central office.
The field editor was responsible for reviewing each questionnaire at the end of the day, checking for missed questions, skip errors, fields incorrectly completed, and checking for inconsistencies in the data. The field editor also observed interviews and conducted review sessions with interviewers.
Responsibilities of the supervisors and field editors are described in the Instructions for Supervisors and Field Editors, together with the different field controls that were in place to control the quality of the fieldwork. Field visits were also made by a team of central staff on a periodic basis during fieldwork. The senior staff of GSO also made 3 visits to field teams to provide support and to review progress.
Data Collection Notes
Data from young people were collected by assembling them together in a public space away from their homes (such as a people’s committee office, or the cultural house of the hamlet), but still arranging for privacy during interviews. The fact that youth were asked to gather in one place had the potential to affect the response rate. To minimize this, the GSO coordinated very closely with the Women’s Union, the Youth Union and the local government, informing them about the purpose and importance of the survey so that they would then mobilize youth effectively.
Young people were interviewed by an interviewer of the same sex, sitting side-by-side, a method that had previously been tested during the training and testing phases. The face-to-face interview was of assistance to respondents who were generally not familiar with questionnaires and the coding of the responses. The interviewers checked the respondents’ ability to self-complete the sensitive part (through interviewing and checking levels of education) and gave clear instructions on how to fill in the questionnaire before handing it over to the respondent.
After finishing the anonymous questionnaire, respondents were asked to post them in the box provided. This procedure was designed to ensure and demonstrate the privacy of the information.
Training and Field Data Collection
A central steering committee for the survey was established with members from the General Statistics Office (GSO), Ministry of Health (MoH), World Health Organization (WHO), United Nations Children's Fund (UNICEF), Central Youth Union (YU) and Viet Nam Women's Union (WU).
Since this was a sociological survey asking for information covering a wide range of areas, including sensitive issues, the training to ensure high quality field staff was considered crucial to the quality of the results. Three training courses, over five-day periods including one day of practice, were organized by the GSO for 150 recruited data collectors. These collectors were GSO staff from provincial offices.
Skills covered in training included how to communicate with youth, the specific content of the questionnaire, the underlying intent of each of the questions, and specific instructions for coding responses. The training methods included classroom instruction, group discussion and hands-on practice. More than 150 youth from different social backgrounds were invited to the sessions so that interviewers could practice interview skills and fieldwork procedures. Feedback from the young people was very valuable for both the instructors and the trainee interviewers. Only trainees who passed a post-training examination were allowed to participate in the survey.
Field data collection took place over 53 days. The trained interviewers were organized into 19 teams: each was comprised of four interviewers and one team head. The field schedule was prepared detailing the expected itinerary and travel of each survey team between EAs, districts and provinces. The People’s Committee at commune level, through the YU and WU, provided notice to young people and mobilized them to participate in the survey. Fieldwork began on the 5th of October, 2003 and concluded in mid December, 2003.
General Statistics Office of Vietnam
The questionnaire was designed through a very dynamic process, where experience from previous surveys was examined and opinions of young people were actively solicited to ensure quality and relevance. This process also helped to define the methodology and implications for fieldwork planning.
A number of stakeholders’ agencies, including research institutes, were involved in the development of the questionnaire. This process ensured broad participation and ownership of the questionnaire and the survey.
The questionnaire design took place in two stages. In the first stage, experienced researchers, and others interested in the survey as stakeholders, were convened to a workshop by the MoH. Potential topics, and the possible phrasing of questions using the questionnaire bank from previous studies in the region as reference, were fully discussed. Since some of the topics were deemed to be more sensitive than others, it was recommended that the questionnaire should be organized into two parts, one for an interview and the other for self-completion. On the basis of that workshop, a draft questionnaire was created for review by the workshop members and numerous others in stakeholder agencies, as well as by young people through a series of consultations.
Eight focus group discussions were conducted in Hanoi and HCMC, with around 60 young people of different ages in the 14-25 range who were either married or unmarried and either attending or not attending school. Participants gave detailed feedback about the terminology, the ways in which questions were posed and the sequencing of the questions, as well as which specific questions or issues they would prefer to respond to on their own, rather than with an interviewer. This process resulted in the rephrasing of a number of questions and changes to the self-completed section.
Preliminary training was conducted for field-testing of the questionnaire. Participants came from the GSO offices in Tuyen Quang, Hue and HCMC, representing the north, central and south regions of Viet Nam. A group of 50 young males and females, either married or unmarried and either attending or not attending school, participated in the interviewers’ practice session. In the debriefing discussions, these young people expressed their feelings about the interviews, the questions asked, what they liked and did not like about the process, seating arrangements, ideas of what topics/issues they thought might still be missing in the draft questionnaire, and what they thought would be needed to make good interviewers. Field testing with around 180 young people from six communes in these three provinces then took place.
The second stage involved further vetting of questionnaire sections and was coordinated by the GSO. The review meeting following the field trips recommended the need for another field testing exercise, particularly because little experience had been gained from testing with urban young people and interviewing ethnic minority young people through interpreters. Following the second round of field-testing in Hanoi and Yen Bai, the feedback was incorporated to finalise the questionnaire for the interviewers training. At the training, further revision and refinement of the questionnaire occurred prior to the field work.
The resulting questionnaire consisted of a total of more than 200 questions. SAVY experts then further modified the questionnaire in order to ensure the best phrasing possible, and to avoid technical terms. The first section was conducted as a face-to-face interview, with general questions categorized into topics. The second part of the questionnaire – and the part that makes the survey special – was an anonymous self-administered section, including 52 sensitive questions that youth preferred to answer in private. Originally, it was envisaged that the self completed section would contain between 10-15 questions, but it became much longer as the youth consulted suggested that a lot more questions they perceived to be sensitive should be placed there. The questionnaire could be completed in 60 to 80 minutes, though this was longer for those unable to read the questionnaire and who required translation.
The specific information collected through the questionnaire includes:
- Personal demographics
- Schooling, education
- Vocational training, Work and employment
- Puberty: knowledge and behaviors about reproductive health
- Dating and friendship
- Injury, illness and physical health
- Attitudes, perceptions and behaviors
- Social factors and emotional wellbeing
- Mass media
- Future aspirations
Data editing took place at a number of stages throughout the processing including:
a) Office editing and coding
b) During data entry
c) Structure checking and completeness
d) Secondary editing
e) Structural checking of SPSS data files
Detailed documentation of the editing of data can be found in the data processing guidelines.
Data were processed in clusters, with each cluster being processed as a complete unit through each stage of data processing. Each cluster goes through the following steps:
1) Questionnaire reception
2) Office editing and coding
3) Data entry
4) Structure and completeness checking
5) Verification entry
6) Comparison of verification data
7) Back up of raw data
8) Secondary editing
9) Edited data back up
After all clusters are processed, all data is concatenated together and then the following steps are completed for all data files:
10) Export to SPSS in one file
11) Recoding of variables needed for analysis
12) Adding of sample weights
13) Structural checking of SPSS files
14) Data quality tabulations
15) Production of analysis tabulations
Details of each of these steps can be found in the data processing documentation, data editing guidelines, data processing programs in EPI6 and SPSS, and tabulation guidelines.
Data entry was conducted by six data entry operators, supervised by one data entry supervisor, using a total of 7 computers (6 data entry computers plus one supervisor's computer). All data entry was conducted at the GenCenStat head office using manual data entry. For data entry, EPI 6 was used with a highly structured data entry program, using a system-controlled approach that controlled entry of each variable. All range checks and skips were controlled by the program and operators could not override these. A limited set of consistency checks was also included in the data entry program. In addition, the calculation of anthropometric Z-scores was also included in the data entry programs for use during analysis. Open-ended responses ("Other" answers) were not entered or coded, except in rare circumstances where the response matched an existing code in the questionnaire.
Structure and completeness checking ensured that all questionnaires for the cluster had been entered.
100% verification of all variables was performed using independent verification, i.e. double entry of data, with separate comparison of data followed by modification of one or both datasets to correct keying errors by original operators who first keyed the files.
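The comparison step in double-entry verification can be sketched as follows. The actual work was done in EPI 6; this Python version, with its hypothetical record IDs and variable names, only illustrates the idea of comparing two independently keyed files and listing the disagreements for correction.

```python
def compare_entries(first_pass, second_pass):
    """Return (record_id, variable, value_1, value_2) for every disagreement.

    Both arguments map a record ID to a dict of variable -> keyed value.
    A record or variable missing from one pass shows up with value None.
    """
    mismatches = []
    for record_id in sorted(set(first_pass) | set(second_pass)):
        rec1 = first_pass.get(record_id, {})
        rec2 = second_pass.get(record_id, {})
        for var in sorted(set(rec1) | set(rec2)):
            v1, v2 = rec1.get(var), rec2.get(var)
            if v1 != v2:
                mismatches.append((record_id, var, v1, v2))
    return mismatches

# Hypothetical keyed data: the second operator transposed the digits of an age.
first = {"0001": {"age": 17, "sex": 1}, "0002": {"age": 21, "sex": 2}}
second = {"0001": {"age": 17, "sex": 1}, "0002": {"age": 12, "sex": 2}}

diffs = compare_entries(first, second)
for record_id, var, v1, v2 in diffs:
    print(f"record {record_id}: {var} keyed as {v1} and {v2}; check questionnaire")
```

Each listed mismatch would then be resolved against the paper questionnaire, which is the "modification of one or both datasets" described above.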
After completion of all processing in EPI 6, all individual cluster files were backed up before concatenating data together using the EPI 6 file concatenate utility.
After transferring all files to SPSS, certain variables were recoded for use as background characteristics in the tabulation of the data, including grouping age, education, geographic areas as needed for analysis. In the process of recoding ages and dates some random imputation of dates (within calculated constraints) was performed to handle missing or "don't know" ages or dates. Additionally, a wealth (asset) index of household members was calculated using principal components analysis, based on household assets, and both the score and quintiles were included in the datasets for use in tabulations.
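A wealth index of this kind is commonly built by taking the first principal component of standardized asset indicators and cutting the scores into quintiles. The sketch below shows that general technique in plain Python; the asset list and household data are hypothetical, and it is not the SPSS procedure actually used for SAVY. It assumes every asset varies across households (otherwise standardization would divide by zero).

```python
import math

def first_component_scores(rows):
    """Score each row on the first principal component of its standardized columns."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    # Correlation matrix of the asset indicators.
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    # Power iteration converges to the leading eigenvector (the loadings).
    v = [1.0] * p
    for _ in range(200):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    if sum(v) < 0:                      # orient so that more assets means a higher score
        v = [-x for x in v]
    return [sum(zr[j] * v[j] for j in range(p)) for zr in z]

def quintiles(scores):
    """Assign quintile 1 (poorest) to 5 (richest) by rank; ties are split arbitrarily."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, i in enumerate(order):
        result[i] = rank * 5 // len(scores) + 1
    return result

# Hypothetical households: (owns TV, owns motorbike, owns refrigerator).
households = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 0, 0),
              (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 1, 0), (1, 1, 1)]
scores = first_component_scores(households)
quintile_of = quintiles(scores)
print(list(zip(households, quintile_of)))
```

In a real application the quintile cut-offs would normally be computed with the sample weights, which this unweighted sketch ignores.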
Estimates of Sampling Error
Estimates from a sample survey are affected by two types of errors: 1) non-sampling errors and 2) sampling errors. Sampling errors occur because observations are made on only a sample of the population rather than the entire population. Sampling errors have been calculated for a select set of statistics (all of which are proportions due to the limitations of the Taylor linearization method) for the national sample, urban and rural areas, and for each of the five regions. For each statistic, the tables give the estimate, its standard error, the coefficient of variation (or relative error: the ratio between the standard error and the estimate), the design effect (DEFF), the square root of the design effect (DEFT: the ratio between the standard error using the given sample design and the standard error that would result if a simple random sample had been used), and the 95 percent confidence interval (+/-2 standard errors). Details of the sampling errors are presented in the sampling errors appendix to the report and in the sampling errors table presented in the external resources.
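The quantities listed above are related by simple formulas, sketched here for a proportion. The estimate and standard error in the example are hypothetical, not values from the SAVY tables, and the simple-random-sample standard error uses the textbook formula sqrt(p(1-p)/n), which ignores weights and any finite population correction.

```python
import math

def sampling_error_summary(p, se, n):
    """Derive CV, DEFT, DEFF and a +/-2 SE interval for a proportion.

    p:  estimated proportion
    se: standard error under the actual (complex) sample design
    n:  number of cases the estimate is based on
    """
    srs_se = math.sqrt(p * (1 - p) / n)   # SE under simple random sampling
    deft = se / srs_se                    # ratio of the two standard errors
    return {
        "estimate": p,
        "se": se,
        "cv": se / p,                     # coefficient of variation (relative error)
        "deff": deft ** 2,                # design effect
        "deft": deft,
        "ci_low": p - 2 * se,             # 95% confidence interval, +/-2 SE
        "ci_high": p + 2 * se,
    }

# Hypothetical inputs for illustration only.
summary = sampling_error_summary(p=0.40, se=0.01, n=7584)
for key, value in summary.items():
    print(f"{key:>8}: {value:.4f}")
```

A DEFT above 1 indicates that the clustered, stratified design is less precise than a simple random sample of the same size would have been for that statistic.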
The particular sample used in SAVY is one of a large number of all possible samples of the same size that could have been selected using the same sample design. The particular value of the estimate – the point estimate – derived from each of the different samples would differ from each other. The deviation of a sample estimate from the average of all possible samples is called sampling error. While it is not possible to calculate the actual sampling error since we only have data from one of the possible samples, the standard error of a given estimate as calculated in this report is nevertheless an estimate of the sampling error. The estimated standard error also partially measures the effect of some non-sampling errors such as that which can be attributed to variability among interviewers and coders but does not measure any systematic biases in the data.
The point estimate from the sample for a given variable or indicator, and an estimate of its standard error, permit us to construct interval estimates with prescribed confidence that the interval includes the average result of all possible samples. To illustrate, if all possible samples were selected, each were surveyed under the same conditions and an indicator and its estimated standard error were calculated from each sample, then approximately 95% of the intervals from two standard errors below to two standard errors above the indicator would include the average value of all possible samples: the so-called 95% confidence interval. Details about standard errors can be found in the Appendix of the Final Report.
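The repeated-sampling interpretation above can be checked with a small simulation. This sketch draws many simple random samples from a hypothetical population with a known proportion and counts how often the interval of two standard errors either side of the estimate contains the true value. It assumes simple random sampling, not the multi-stage SAVY design, and the population proportion and sample size are made up.

```python
import math
import random

def coverage(true_p, n, replicates, rng):
    """Share of simulated samples whose +/-2 SE interval contains true_p."""
    hits = 0
    for _ in range(replicates):
        successes = sum(1 for _ in range(n) if rng.random() < true_p)
        p_hat = successes / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - 2 * se <= true_p <= p_hat + 2 * se:
            hits += 1
    return hits / replicates

# Hypothetical setting: true proportion 0.30, samples of 400, 2000 replicates.
rate = coverage(true_p=0.30, n=400, replicates=2000, rng=random.Random(2003))
print(f"share of intervals covering the true value: {rate:.3f}")
```

The printed share should be close to 0.95, which is what "95% confidence" means in this repeated-sampling sense.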
Non-sampling error is more difficult to quantify than sampling error because it has many components, each one of which requires its own evaluation study to assess effects on the survey results. Among the many sources contributing to non-sampling error are (1) non-response from some of the households and/or persons selected to participate in the sample; (2) conceptual and definitional difficulties in the design of the questions that are asked of the respondents; (3) inability or unwillingness to provide correct information on the part of respondents; (4) mistakes by interviewers and data entry staff in recording or coding the data obtained; and (5) other errors of collection, processing and coverage. The evaluation of nonsampling error is also constrained because the statistical theory for doing so is underdeveloped compared to that for sampling errors, and also because there is an absence of knowledge about the true values in the target populations under study. For these reasons the designers and producers of large-scale surveys, including SAVY, rarely provide empirical results showing the type and magnitude of nonsampling error that may be present. Survey budget constraints effectively make its assessment infeasible.
Instead, efforts are usually directed at controlling and minimizing non-sampling error through such means as using previously validated questionnaire items, pretesting of new questions, pilot testing of survey methods and operations, careful and intense training of interviewers, sample verification of data entry and coding, plus close supervision, observation and spot-checking in the field. All these steps were taken during the SAVY operations. Nevertheless it is important to describe, where known, the kinds of non-sampling error, or biases, that may be present in the data and to indicate, qualitatively, what the effects of these may be.
An important source of non-sampling error in SAVY is non-response. Its magnitude is measured by the non-response rate, calculated as (1 - I/n), where I is the number of youth aged 14-25 for whom interviews were obtained and n is the number selected in the sample. In SAVY there were 7,584 interviewed youth and 9,989 selected, for a non-response rate of (1 - 7584/9989), or 24.1%. This level of non-response is in line with the experience of other surveys focused on youth age groups. Moreover, the SAVY field design eliminated non-response at the household level, an additional source of non-response in most other surveys. The cited level of non-response can have a significant biasing effect on the results because, without additional information, we can only assume that the non-responding youth are similar, in terms of their characteristics and distribution, to the responding ones, an assumption which cannot be independently verified. Comparisons between survey distributions and distributions available from other sources, notably the Health Surveys and the national censuses, suggest only a very limited bias is involved, though one exception should be noted. The SAVY sample of young people may slightly under-represent those in the older part of the age range who are married or who are working, or who are not enrolled in school. This is apparent in comparisons of age- and sex-specific percentages single, enrolled, and working in SAVY versus the 1989 and 1999 censuses. This bias should not have much effect on analysis of SAVY data, but must be kept in mind nevertheless.
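The non-response rate formula above, (1 - I/n), is simple enough to verify directly. The function name below is illustrative; the inputs are the figures quoted in the paragraph.

```python
def non_response_rate(interviewed, selected):
    """Non-response rate 1 - I/n, with I interviewed and n selected."""
    return 1 - interviewed / selected

rate = non_response_rate(interviewed=7584, selected=9989)
print(f"non-response rate: {rate:.1%}")
```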
Director, Social-Environmental Statistics Department
Confidentiality of respondents is guaranteed by the General Statistics Office of Vietnam.
Before being granted access to the dataset, all users have to formally agree:
1. To make no copies of any files or portions of files to which s/he is granted access except those authorized by the data depositor.
2. Not to use any technique in an attempt to learn the identity of any person, establishment, or sampling unit not identified on public use data files.
3. To hold in strictest confidence the identification of any establishment or individual that may be inadvertently revealed in any documents or discussion, or analysis. Such inadvertent identification revealed in her/his analysis will be immediately brought to the attention of the data depositor.
The GSO official microdata access policy is available on its website: www.gso.gov.vn <http://www.gso.gov.vn>
Requests for access to the datasets can be made through the GSO website.
The dataset has been anonymized and is available as a Public Use Dataset. It is accessible to all for statistical and research purposes only, under the following terms and conditions:
1. The data and other materials will not be redistributed or sold to other individuals, institutions, or organizations without the written agreement of the General Statistics Office of Vietnam.
2. The data will be used for statistical and scientific research purposes only. They will be used solely for reporting of aggregated information, and not for investigation of specific individuals or organizations.
3. No attempt will be made to re-identify respondents, and no use will be made of the identity of any person or establishment discovered inadvertently. Any such discovery would immediately be reported to the Department of Agriculture, Forestry and Fishery Statistics of the General Statistics Office of Vietnam.
4. No attempt will be made to produce links among datasets provided by the General Statistics Office of Vietnam, or among data from the National Data Archive and other datasets that could identify individuals or organizations.
5. Any books, articles, conference papers, theses, dissertations, reports, or other publications that employ data obtained from the GSO will cite the source of data in accordance with the Citation Requirement provided with each dataset.
6. An electronic copy of all reports and publications based on the requested data will be sent to the General Statistics Office of Vietnam.
The original collector of the data, the General Statistics Office of Vietnam, and the relevant funding agencies bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
Use of the dataset must be acknowledged using a citation which would include:
- the Identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
General Statistics Office of Vietnam. Survey Assessment of Vietnamese Youth 2003. Ref. VNM_2003_SAVY_v01_M. Dataset downloaded from http://www.gso.gov.vn on [date].