Southern and Eastern Africa Consortium for Monitoring Educational Quality 2000
SACMEQ II Project
Socio-Economic/Monitoring Survey [hh/sems]
The origins of the Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ) date back to 1991, the year when several Ministries of Education in Eastern and Southern Africa started working closely with UNESCO's International Institute for Educational Planning (IIEP) on the implementation of integrated educational policy research and training programmes.
In 1995 these Ministries of Education formalized their collaboration by establishing a network that is widely known as SACMEQ. Fifteen Ministries are now members of SACMEQ: Botswana, Kenya, Lesotho, Malawi, Mauritius, Mozambique, Namibia, Seychelles, South Africa, Swaziland, Tanzania (Mainland), Tanzania (Zanzibar), Uganda, Zambia, and Zimbabwe.
The Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ) undertook three large-scale, cross-national studies of the quality of education: SACMEQ I (1995-1999, reading) with seven ministries; SACMEQ II (2000-2004, reading and mathematics) with 14 ministries; and SACMEQ III (2006-2010, reading, mathematics, and HIV and AIDS knowledge) with 15 ministries.
In 1991 the International Institute for Educational Planning (IIEP) and a number of Ministries of Education in Southern and Eastern Africa began to work together in order to address training and research needs in Education. The focus for this work was on establishing long-term strategies for building the capacity of educational planners to monitor and evaluate the quality of their basic education systems. The first two educational policy research projects undertaken by SACMEQ (widely known as "SACMEQ I" and "SACMEQ II") were designed to provide detailed information that could be used to guide planning decisions aimed at improving the quality of education in primary school systems.
During 1995-1998 seven Ministries of Education participated in the SACMEQ I Project. The SACMEQ II Project commenced in 1998 and the surveys of schools, involving 14 Ministries of Education, took place between 2000 and 2004. The survey was undertaken in schools in Botswana, Kenya, Lesotho, Malawi, Mauritius, Mozambique, Namibia, Seychelles, South Africa, Swaziland, Tanzania, Uganda, Zambia and Zanzibar.
Moving from the SACMEQ I Project (covering around 1100 schools and 20,000 pupils) to the SACMEQ II Project (covering around 2500 schools and 45,000 pupils) resulted in a major increase in the scale and complexity of SACMEQ's research and training programmes.
SACMEQ's mission is to:
a) Expand opportunities for educational planners to gain the technical skills required to monitor and evaluate the quality of their education systems; and
b) Generate information that can be used by decision-makers to plan and improve the quality of education.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
v01: Edited data for licensed distribution
Data were collected on pupils' home backgrounds and their school life; classrooms, teaching practices, teachers' working conditions, and teacher housing; and school heads, enrolments, school buildings and facilities, and school management.
basic skills education [6.1]
The target population for SACMEQ's Initial Project was defined as "all pupils at the Grade 6 level in 1995 who were attending registered government or non-government schools". Grade 6 was chosen because it was the grade level where the basics of reading literacy were expected to have been acquired.
Producers and sponsors
Southern and Eastern Africa Consortium for Monitoring Educational Quality
International Institute for Educational Planning (IIEP)
United Nations Educational, Scientific and Cultural Organization
Funding the project
Ministries of Education and Human Resources, Mauritius
Funding the project
Sample designs in the field of education are usually prepared amid a network of competing constraints. These designs need to adhere to established survey sampling theory and, at the same time, give due recognition to the financial, administrative, and socio-political settings in which they are to be applied. The "best" sample design for a particular project is one that provides levels of sampling accuracy that are acceptable in terms of the main aims of the project, while simultaneously limiting cost, logistic, and procedural demands to manageable levels. The major constraints that were established prior to the preparation of the sample designs for the SACMEQ II Project have been listed below.
Target Population: The target population definitions should focus on Grade 6 pupils attending registered mainstream government or non-government schools. In addition, the defined target population should be constructed by excluding no more than 5 percent of pupils from the desired target population.
Bias Control: The sampling should conform to the accepted rules of scientific probability sampling. That is, the members of the defined target population should have a known and non-zero probability of selection into the sample so that any potential for bias in sample estimates due to variations from "epsem sampling" (equal probability of selection method) may be addressed through the use of appropriate sampling weights.
Sampling Errors: The sample estimates for the main criterion variables should conform to the sampling accuracy requirements set down by the International Association for the Evaluation of Educational Achievement. That is, the standard error of sampling for the pupil tests should be of a magnitude that is equal to, or smaller than, what would be achieved by employing a simple random sample of 400 pupils.
Response Rates: Each SACMEQ country should aim to achieve an overall response rate for pupils of 80 percent. This figure was based on the wish to achieve or exceed a response rate of 90 percent for schools and a response rate of 90 percent for pupils within schools.
Administrative and Financial Costs: The number of schools selected in each country should recognize limitations in the administrative and financial resources available for data collection.
Other Constraints: The number of pupils selected to participate in the data collection in each selected school should be set at a level that will maximize validity of the within-school data collection for the pupil reading and mathematics tests.
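The sampling-error and response-rate constraints above can be illustrated with a short calculation. This is a minimal sketch, not the SAMDEM procedure: it assumes the common approximation deff = 1 + (b - 1) * rho for clusters of equal size b, uses the 20 pupils per school mentioned later in this record, and takes purely hypothetical values for the intraclass correlation rho. The function names are illustrative.

```python
import math


def design_effect(cluster_size, rho):
    """Approximate design effect for equal-sized clusters: 1 + (b - 1) * rho."""
    return 1.0 + (cluster_size - 1) * rho


def schools_required(effective_n, cluster_size, rho):
    """Smallest number of schools giving the accuracy of an SRS of effective_n pupils."""
    pupils_needed = effective_n * design_effect(cluster_size, rho)
    # round() guards against floating-point noise before taking the ceiling
    return math.ceil(round(pupils_needed / cluster_size, 9))


def overall_response_rate(school_rate, pupil_rate):
    """Overall pupil response rate as the product of the two stage rates."""
    return school_rate * pupil_rate


if __name__ == "__main__":
    # rho values are hypothetical; 400 and 20 come from the text
    for rho in (0.25, 0.5):
        print(rho, schools_required(400, 20, rho))
    print(overall_response_rate(0.9, 0.9))
```

The product of the two 90 percent targets is 81 percent, which is consistent with the stated overall target of 80 percent. The school counts show how quickly the required number of schools grows as pupils within a school become more alike.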
The Specification of the Target Population
The target population for both the SACMEQ I and SACMEQ II Projects was focussed on the Grade 6 level for three main reasons.
First, Grade 6 identified a point near the end of primary schooling where school participation rates were reasonably high for most of the seven countries that participated in the SACMEQ I data collection during 1995-1997, and also reasonably high for most of the fourteen countries that participated in the SACMEQ II collection during 2000-2002. For this reason, Grade 6 represented a point that was suitable for making an assessment of the contribution of primary schooling towards the literacy and numeracy levels of a broad cross-section of society.
Second, the NRCs considered that testing pupils at grade levels lower than Grade 6 was problematic - because in some SACMEQ countries the lower grades were too close to the transition point between the use of local and national languages by teachers in the classroom. This transition point generally occurred at around Grade 3 level - but in some rural areas of some countries it was thought to be as high as Grade 4 level.
Third, the NRCs were of the opinion that the collection of home background information from pupils at grade levels lower than Grade 6 was likely to lack validity for certain key "explanatory" variables. For example, the NRCs felt that children at lower grade levels did not know how many years of education that their parents had received, and they also had difficulty in accurately describing the socioeconomic environment of their own homes (for example, the number of books at home).
The Stratification Procedures
The stratification procedures adopted for the study employed explicit and implicit strata. The explicit stratification variable, "Region", was applied by separating each sampling frame into separate regional lists of schools prior to undertaking the sampling. The implicit stratification variable was "School Size" - as measured by the number of Grade 6 pupils.
The main reason for choosing Region as the explicit stratification variable was that the SACMEQ Ministries of Education wanted to have education administration regions as "domains" for the study. That is, the Ministries wanted to have reasonably accurate sample estimates of population characteristics for each region.
There were two other reasons for selecting Region as the main stratification variable. First, this was expected to provide an increment in sampling precision due to known between-region differences in the educational achievement of pupils - especially between predominantly urban and predominantly rural regions. Second, this approach provided a broad geographical coverage for the sample - which was necessary in order to spread the fieldwork across each country in a manner that prevented the occurrence of excessive administrative demands in particular regions.
The use of School Size as an implicit stratification variable within regions also offered increased sampling precision because it provided a way of sorting the schools from "mostly rural" (small schools) to "mostly urban" (large schools). It was known that this kind of sorting was linked to the main criterion variables for the study - with urban schools likely to have higher resource levels and better pupil achievement scores than rural schools.
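The combination of explicit and implicit stratification described above can be sketched as follows. This is an illustration under stated assumptions, not the SACMEQ selection procedure: the field names (`region`, `grade6_pupils`) and the allocation of schools per region are hypothetical, and schools are drawn with equal probability within each region because this section does not state the selection rule actually used.

```python
import random
from collections import defaultdict


def systematic_sample(items, n, rng):
    """Equal-probability systematic selection of n items from an ordered list."""
    if n >= len(items):
        return list(items)
    step = len(items) / n
    start = rng.random() * step
    return [items[min(int(start + i * step), len(items) - 1)] for i in range(n)]


def select_schools(frame, schools_per_region, seed=0):
    """Explicit strata: one list per region. Implicit strata: sort by school size."""
    rng = random.Random(seed)
    by_region = defaultdict(list)
    for school in frame:
        by_region[school["region"]].append(school)
    sample = {}
    for region, schools in by_region.items():
        ordered = sorted(schools, key=lambda s: s["grade6_pupils"])
        sample[region] = systematic_sample(ordered, schools_per_region[region], rng)
    return sample


if __name__ == "__main__":
    # Hypothetical sampling frame: 10 schools in each of two regions
    frame = [
        {"id": f"{region}{i}", "region": region, "grade6_pupils": 10 * i + 5}
        for region in ("North", "South")
        for i in range(10)
    ]
    for region, schools in select_schools(frame, {"North": 4, "South": 3}).items():
        print(region, [s["id"] for s in schools])
```

Because each regional list is sorted by size before the systematic draw, the selected schools are spread from small to large schools within every region.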
Sample Design Framework
The SACMEQ II sample designs were prepared by using a specialized software system (SAMDEM) that enabled the high-speed generation of a range of sampling options which satisfied the statistical accuracy constraints set down for the project, and at the same time also addressed the logistical and financial realities of each country.
Note: Details of sampling design procedures are presented in the "Mauritius Working Report".
Response rates for pupils and schools respectively were 92 percent and 99 percent.
The calculation of sampling weights was conducted after all files had been cleaned and merged. Sampling weights were used to adjust for missing data and for variations in probabilities of selection that arose from the application of stratified multi-stage sample designs. There were also certain country-specific aspects of the sampling procedures, and these had to be reflected in the calculation of sampling weights.
Two forms of sampling weights were prepared for the SACMEQ II Project. The first sampling weight (RF2) was the inverse of the probability of selecting a pupil into the sample. These "raising factors" were equal to the number of pupils in the defined target population that were "represented by a single pupil" in the sample. The second sampling weight (pweight2) was obtained by multiplying the raising factors by a constant so that the sum of the sampling weights was equal to the achieved sample size.
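The relationship between the two weights can be shown with a small calculation. This sketch follows the definitions given above; the function names and the selection probabilities in the example are hypothetical, and adjustments for missing data are not included.

```python
def raising_factors(selection_probs):
    """RF2-style weights: the inverse of each pupil's probability of selection."""
    return [1.0 / p for p in selection_probs]


def scaled_weights(rf):
    """pweight2-style weights: raising factors rescaled to sum to the sample size."""
    n = len(rf)
    constant = n / sum(rf)
    return [w * constant for w in rf]


if __name__ == "__main__":
    probs = [0.1, 0.1, 0.05, 0.02]  # hypothetical selection probabilities
    rf2 = raising_factors(probs)
    pweight2 = scaled_weights(rf2)
    print(rf2)
    print(pweight2, sum(pweight2))
```

The rescaling changes only the total, not the relative size of the weights, so weighted means and proportions are identical under either weight. Population totals need the raising factors.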
Dates of Data Collection
Data Collection Mode
Data Collection Notes
The main SACMEQ II data collection for 12 of the 15 SACMEQ Ministries of Education took place between September and December 2000. The Mauritius data collection was completed in July 2001, and the Malawi data collection in September 2002.
The numbers of schools involved in the data collection for each school system ranged from 24 in the Seychelles (where the whole target population of schools and Grade 6 pupils were involved), to 275 in Namibia (where the known magnitude of the coefficient of intraclass correlation and the requirement to gather data in "new" administrative regions added substantially to the required number of schools). The average number of schools per country for the designed samples was around 165.
In smaller countries it was possible to assemble the whole data collection team at the head office of the Ministry of Education and then travel out to sample schools. However, the management of transportation represented a major undertaking for NRCs in larger countries such as Kenya, Namibia, and Mozambique - where much greater distances had to be travelled, and sample schools were sometimes located in extremely remote and difficult-to-find locations. For these countries, the NRCs enlisted the assistance of Regional and District Education Offices.
Two days of data collection were required for each sample school. On the first day pupils were given the pupil questionnaire and the pupil reading test, and on the second day they were given the mathematics test. The teachers (who completed a questionnaire and one of, or both of, the reading and mathematics tests) and school heads (who completed a questionnaire) were asked to respond on the first day. These arrangements made it possible for the data collectors to check all completed questionnaires (pupil, teacher, and school head) during the evening of the first day and then, if necessary, obtain any missing or incomplete information on the second day.
The data collection for teachers was in three parts: questionnaire, reading test, and mathematics test. Where sample teachers taught both reading and mathematics, they took both tests. Where they taught only one of these subjects, they were given the relevant test.
The manual used by the data collectors contained detailed instructions concerning the random selection of 20 sample pupils and up to 6 sample teachers within schools. The data collectors were given intensive prior training in the strict application of these procedures. It was necessary to do this because the validity of the whole SACMEQ II data collection could have been seriously damaged if "outside influences" had been applied to selecting respondents. A further measure that was applied in order to avoid the inclusion of unknown biases into the data collection was to absolutely forbid the replacement of absent pupils.
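The two rules in this paragraph (random selection within the school, and no replacement of absent pupils) can be sketched as follows. The manual's exact selection procedure is not reproduced in this record, so simple random sampling without replacement is assumed here, and the function names are illustrative.

```python
import random


def select_pupils(roster, n=20, seed=None):
    """Draw n pupils at random, without replacement, from the Grade 6 roster."""
    rng = random.Random(seed)
    if len(roster) <= n:
        return list(roster)  # small schools: every Grade 6 pupil is selected
    return rng.sample(roster, n)


def pupils_to_test(selected, present):
    """Absent pupils are dropped from testing and are never replaced."""
    return [p for p in selected if p in present]


if __name__ == "__main__":
    roster = list(range(1, 61))  # hypothetical roster of 60 pupil numbers
    selected = select_pupils(roster, seed=1)
    present = set(selected[:18])  # suppose two selected pupils are absent
    print(len(selected), len(pupils_to_test(selected, present)))
```

Leaving absent pupils unreplaced reduces the achieved sample size, but it keeps each pupil's probability of selection known, which is what the sampling weights rely on.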
The data collectors were provided with a 40-point checklist in order to ensure that they completed all important tasks that were required before, during, and after their visits to schools. Each task was cross-referenced to specific pages of instructions in the data collectors' manual.
The data collection for SACMEQ’s Initial Project took place in October 1995 and involved the administration of questionnaires to pupils, teachers, and school heads.
The pupil questionnaire contained questions about the pupils’ home backgrounds and their school life; the teacher questionnaire asked about classrooms, teaching practices, working conditions, and teacher housing; and the school head questionnaire collected information about teachers, enrolments, buildings, facilities, and management. A reading literacy test was also given to the pupils. The test was based on items that were selected after a trial-testing programme had been completed.
Data Checking and Data Entry
Data preparation commenced soon after the main data collection was completed. The NRCs had to organize the safe return of all materials to the Ministry of Education where the data collection instruments could be checked, entered into computers, and then "cleaned" to remove errors prior to data analysis. The data-checking involved the "hand editing" of data collection instruments by a team of trained staff. They were required to check that: (i) all questionnaires, tests, and forms had arrived back from the sample schools, (ii) the identification numbers on all instruments were complete and accurate, and (iii) certain logical linkages between questions made sense (for example, the two questions to school heads concerning "Do you have a school library?" and "How many books do you have in your school library?").
The next step was the entry of data into computers using the WINDEM software. A team of 5-10 staff normally undertook this work. In some cases the data were "double entered" in order to monitor accuracy.
The numbers of keystrokes required to enter one copy of each data collection instrument were as follows: pupil questionnaire: 150; pupil reading test: 85; pupil mathematics test: 65; teacher questionnaire: 587; teacher reading test: 51; teacher mathematics test: 43; school head questionnaire: 319; school form: 58; and pupil name form: 51.
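These counts give a rough sense of the data entry workload per school. The sketch below is an upper-bound estimate under assumptions that are not stated in this record: 20 pupils and 6 teachers per school, every teacher taking both tests, and one school head questionnaire, one school form, and one pupil name form per school.

```python
# Keystrokes per instrument, as listed in the text
KEYSTROKES = {
    "pupil_questionnaire": 150,
    "pupil_reading_test": 85,
    "pupil_mathematics_test": 65,
    "teacher_questionnaire": 587,
    "teacher_reading_test": 51,
    "teacher_mathematics_test": 43,
    "school_head_questionnaire": 319,
    "school_form": 58,
    "pupil_name_form": 51,
}


def keystrokes_per_school(pupils=20, teachers=6):
    """Upper-bound keystrokes for one school under the assumptions above."""
    per_pupil = sum(v for k, v in KEYSTROKES.items()
                    if k.startswith("pupil_") and k != "pupil_name_form")
    per_teacher = sum(v for k, v in KEYSTROKES.items() if k.startswith("teacher_"))
    per_school = (KEYSTROKES["school_head_questionnaire"]
                  + KEYSTROKES["school_form"]
                  + KEYSTROKES["pupil_name_form"])
    return pupils * per_pupil + teachers * per_teacher + per_school


if __name__ == "__main__":
    print(keystrokes_per_school())
```

Double entry, where it was used, would roughly double this figure.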
The NRCs received written instructions and follow-up support from IIEP staff in the basic steps of data cleaning using the WINDEM software. This permitted the NRCs to (i) identify major errors in the sequence of identification numbers, (ii) cross-check identification numbers across files (for example, to ensure that all pupils were linked with their own reading and mathematics teachers), (iii) ensure that all schools listed on the original sampling frame also had valid data collection instruments and vice-versa, (iv) check for "wild codes" that occurred when some variables had values that fell outside pre-specified reasonable limits, and (v) validate that variables used as linkage devices in later file merges were available and accurate.
A second phase of data preparation directed efforts towards the identification and correction of "wild codes" (which refer to data values that fall outside credible limits), and "inconsistencies" (which refer to different responses to the same, or related, questions). There were also some errors in the identification codes for teachers that needed to be corrected before data could be merged.
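The two kinds of check described here can be sketched in a few lines. This is an illustration only, not the WINDEM implementation: the variable names, the credible limits, and the coding of the school library questions are hypothetical.

```python
def find_wild_codes(records, limits):
    """Return (record id, variable, value) for values outside credible limits."""
    problems = []
    for rec in records:
        for var, (low, high) in limits.items():
            value = rec.get(var)
            # Missing values are left for a separate check
            if value is not None and not (low <= value <= high):
                problems.append((rec["id"], var, value))
    return problems


def find_library_inconsistencies(school_heads):
    """Flag schools that report no library but a positive number of library books."""
    return [
        rec["id"]
        for rec in school_heads
        if not rec["has_library"] and rec["library_books"] > 0
    ]


if __name__ == "__main__":
    pupils = [{"id": 1, "age_months": 140}, {"id": 2, "age_months": 999}]
    print(find_wild_codes(pupils, {"age_months": (96, 300)}))
    heads = [
        {"id": "S1", "has_library": True, "library_books": 250},
        {"id": "S2", "has_library": False, "library_books": 40},
    ]
    print(find_library_inconsistencies(heads))
```

Checks of this kind only flag records. Deciding on the correct value still requires going back to the original data collection instrument.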
During 2002 a supplementary training programme was prepared and delivered to all countries via the Internet. This training led each SACMEQ Research Team step-by-step through the required data cleaning procedures - with the NRCs supervising "hands-on" data cleaning activities and IIEP staff occasionally using advanced software systems to validate the quality of the work involved in each data-cleaning step.
This resulted in a "cyclical" process whereby data files were cleaned by the NRC and then emailed to the IIEP for checking and then emailed back to the NRC for further cleaning.
The number of cycles required to complete all of the data cleaning ranged from lows of 5 and 9 cycles in the Seychelles and Namibia, respectively, to highs of 27 and 31 cycles in Zanzibar and Uganda, respectively. The time required to complete all of the data cleaning ranged from lows of 4 and 9 months in the Seychelles and Namibia, respectively, to highs of 23 and 24 months in Uganda and Mozambique, respectively.
As each NRC finalized the cleaning of the SACMEQ II data for his/her country, the data from all sources within a country were merged and weighted.
The merging process required the construction of a single data file for each school system in which pupils were the units of analysis. This was achieved by "disaggregating" the teacher and school head data over the pupil data. That is, each record of the final data file for a country consisted of the following four components: (a) the questionnaire and test data for an individual pupil, (b) the questionnaire and test data for his/her mathematics and reading teacher, (c) the questionnaire data for his/her school head, and (d) school and pupil "tracking forms" that were required for data cleaning purposes.
The merged file enabled linkages to be made among pupils, teachers, and school heads at the "between-pupil" level of analysis. To illustrate, with the merged file it was possible to examine questions of the following kind: "What are the average reading and mathematics test scores (based on information taken from the pupil tests) for groups of pupils who attend urban or rural schools (based on information taken from the school head questionnaire), and who are taught by male or female teachers (based on information taken from the teacher questionnaire)?"
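The merge and the example question above can be sketched as follows. The field names are hypothetical, only the reading teacher is linked, and the means are unweighted for brevity. An actual analysis would apply the sampling weights.

```python
from collections import defaultdict


def merge_pupil_file(pupils, teachers, school_heads):
    """Build one record per pupil that also carries teacher and school head data."""
    teacher_by_id = {t["teacher_id"]: t for t in teachers}
    head_by_school = {h["school_id"]: h for h in school_heads}
    merged = []
    for pupil in pupils:
        record = dict(pupil)
        teacher = teacher_by_id[pupil["reading_teacher_id"]]
        head = head_by_school[pupil["school_id"]]
        record["teacher_sex"] = teacher["sex"]
        record["school_location"] = head["location"]
        merged.append(record)
    return merged


def mean_by_group(merged, score_field):
    """Mean score for each (school location, teacher sex) group of pupils."""
    groups = defaultdict(list)
    for record in merged:
        key = (record["school_location"], record["teacher_sex"])
        groups[key].append(record[score_field])
    return {key: sum(values) / len(values) for key, values in groups.items()}


if __name__ == "__main__":
    pupils = [
        {"pupil_id": 1, "school_id": "S1", "reading_teacher_id": "T1", "reading_score": 500},
        {"pupil_id": 2, "school_id": "S1", "reading_teacher_id": "T1", "reading_score": 520},
        {"pupil_id": 3, "school_id": "S2", "reading_teacher_id": "T2", "reading_score": 450},
    ]
    teachers = [{"teacher_id": "T1", "sex": "female"}, {"teacher_id": "T2", "sex": "male"}]
    heads = [{"school_id": "S1", "location": "urban"}, {"school_id": "S2", "location": "rural"}]
    print(mean_by_group(merge_pupil_file(pupils, teachers, heads), "reading_score"))
```

Teacher and school head values are repeated on every pupil record they apply to, which is why the pupil remains the unit of analysis in the merged file.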
Analyzing the Data
The data analyses for the SACMEQ II Project were very clearly defined because they were focussed specifically on generating results that could be used to "fill in the blank entries" in the Dummy Tables described above. There were two main tasks in this area. First, the SPSS software system was used to construct new variables (often referred to as "indices") or to recode existing variables. For example, an index of "socioeconomic level" was constructed by combining recoded variables that described the educational level of the pupils' parents, the materials used in the construction of pupils' homes, and the number of possessions in pupils' homes. Second, the IIEP's specialized data analysis software, IIEPJACK, was used to "fill" the Dummy Tables with appropriate statistics along with their correct measures of sampling error.
Estimates of Sampling Error
The sample designs employed in the SACMEQ Projects departed markedly from the usual "textbook model" of simple random sampling. This departure demanded that special steps be taken in order to calculate "sampling errors" (that is, measures of the stability of sample estimates of population characteristics).
In the report (Mauritius Working Report) a brief overview of various aspects of the general concept of "sampling error" has been presented. This has included a discussion of notions of "design effect", "the effective sample size", and the "Jackknife procedure" for estimating sampling errors.
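The three notions named above can be illustrated with a generic delete-one-school jackknife. This is a sketch under stated assumptions, not the IIEPJACK algorithm, whose details are not given in this record: it ignores stratification, treats each school as one cluster, and compares the jackknife variance with the variance expected under simple random sampling to obtain the design effect and the effective sample size.

```python
import math


def weighted_mean(rows):
    """rows: list of (school_id, score, weight) tuples."""
    total_weight = sum(w for _, _, w in rows)
    return sum(y * w for _, y, w in rows) / total_weight


def jackknife_se(rows):
    """Delete-one-school jackknife standard error of the weighted mean."""
    schools = sorted({s for s, _, _ in rows})
    k = len(schools)  # requires at least two schools
    replicates = [weighted_mean([r for r in rows if r[0] != s]) for s in schools]
    centre = sum(replicates) / k
    variance = (k - 1) / k * sum((t - centre) ** 2 for t in replicates)
    return math.sqrt(variance)


def design_effect(rows):
    """Return (deff, effective sample size) relative to simple random sampling."""
    n = len(rows)
    mean = weighted_mean(rows)
    total_weight = sum(w for _, _, w in rows)
    s2 = sum(w * (y - mean) ** 2 for _, y, w in rows) / total_weight * n / (n - 1)
    var_srs = s2 / n
    deff = jackknife_se(rows) ** 2 / var_srs
    return deff, n / deff


if __name__ == "__main__":
    # Hypothetical data: three schools, two pupils each, equal weights
    rows = [("A", 1, 1), ("A", 1, 1), ("B", 3, 1), ("B", 3, 1), ("C", 5, 1), ("C", 5, 1)]
    print(jackknife_se(rows), design_effect(rows))
```

In this toy example pupils within a school are identical, so the six pupils carry far less information than six independently sampled pupils would, and the effective sample size falls well below six.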
Director- Southern Africa Consortium for Monitoring Educational Quality (SACMEQ)
International Institute for Educational Planning (UNESCO)
Director- Southern Africa Consortium for Monitoring Educational Quality
International Institute for Educational Planning - UNESCO
International Institute for Educational Planning
United Nations Educational, Scientific and Cultural Organization (UNESCO)
TERMS AND CONDITIONS FOR USE OF THE SACMEQ DATA ARCHIVE
The Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ) Co-ordinating Centre (SCC) has produced a data archive containing all information collected for SACMEQ's first three educational policy research projects (SACMEQ I, SACMEQ II, and SACMEQ III). This archive is now available online on the SACMEQ website so as to give bona fide researchers and students online access to SACMEQ data and documents.
The SACMEQ data sets have been developed at great cost and with the application of stringent quality controls. They are being made available to eligible users because they have great potential to contribute to educational policy development beyond what has already been achieved in this respect through the reports written by the National Research Co-ordinators (NRCs) and Deputy National Research Coordinators (DNRCs). It is expected that many researchers and students will wish to use the Data Archive for research, publications, and/or training purposes.
The Terms and Conditions serve two purposes. Firstly, they provide interested applicants with guidelines on how to access this valuable information resource. Secondly, they are intended to safeguard against the danger of users being unaware of the complexities of the data collection process and consequently arriving at misinterpretations that could lead to incorrect conclusions.
2.0 How can the user gain such access?
In order to obtain SACMEQ Data Archive for any of the SACMEQ school systems, the applicant should follow these steps:
2.1 Read and Agree to these "Terms and Conditions for the Use of the SACMEQ Data Archive."
2.2 Complete an online application form.
3.0 What rules govern the use of the SACMEQ data archive?
3.1 The Data Archive is the outcome of expensive and time-consuming activities of the staff of the represented Ministries of Education spread over many years. For this reason, the SACMEQ Ministries of Education described in the Data Archive should:
3.1.1 be notified by the SACMEQ SCC of any request for data;
3.1.2 have an opportunity to review reports based on the data archive so as to correct any gross errors before they are published; and
3.1.3 satisfy themselves that the data have been used in such a manner that they contribute positively to the development of relevant education policies in relevant SACMEQ member countries.
3.2 It is the National Research Coordinators (NRCs) and Deputy National Research Coordinators (DNRCs) who have spearheaded the collection and compilation of SACMEQ data. In acknowledgment of their efforts, the applicant(s) will be required to invite the relevant country's National Research Coordinator to participate in the study associated with the use of the data. Where an individual other than the NRC or DNRC is co-opted, the relevant NRC and DNRC shall be given the first right of refusal.
3.3 This provision does not apply in situations where the SACMEQ Data Archive is used purely for purposes of individual academic research by a student, and where the results are not intended for publication.
3.4 All relevant NRCs and DNRCs will be informed by the SCC about the recipients of the Data Archive.
3.5 SACMEQ provides the SACMEQ Data Archive to applicants on the basis of the intended use stated in the application. The applicant, therefore, should not use the data for any purpose other than the one stated in the application. Should the applicant(s) wish to use the data for a purpose other than that stated in the agreement, then he/she/they must first secure the written approval of SACMEQ before he/she/they proceed to do so.
3.6 SACMEQ data are provided for the sole and exclusive use of the applicant specified in the agreement. The successful applicant should, therefore, not share the SACMEQ Data Archive with, or pass it on to, any other unauthorized person(s).
3.7 The authorized user shall take responsibility for the safe custody of the SACMEQ Data Archive and also take reasonable steps to ensure that no unauthorized persons gain access to it.
3.8 The authorized user shall give due credit to SACMEQ for providing the Data Archive by providing written acknowledgement of this in any publication emanating from their use.
3.9 As the Data Archive remains the property of SACMEQ, no other person(s), including the successful applicants or the member Ministry, shall re-distribute or offer for sale the SACMEQ Data Archive.
3.10 All reports based on the SACMEQ Data Archive have to secure the written approval of the SCC prior to publication in order to confirm compliance with our terms and conditions, and also to ensure that there is no misunderstanding or misinterpretation of the data.
3.11 Once authorization has been granted to access the archive, you will see a link on the website which will take you to the Data Archive.
3.12 All relevant NRCs will be informed by the SCC about the recipients of the SACMEQ Data Archive.
3.13 Full acknowledgement of the source of the data (including reference to the SACMEQ Data Archive) must be given whenever the data are used.
3.14 A copy of any published article or report based on the SACMEQ Data Archive must be provided free of charge to (a) the SACMEQ Co-ordinating Centre, and (b) the Ministry(ies) of Education from whose data the report has been generated.
Southern and Eastern Africa Consortium for Monitoring Educational Quality. SACMEQ II Project 2000 [dataset]. Version 4. Harare: SACMEQ [producer], 2004. Paris: International Institute for Educational Planning, UNESCO [distributor], 2010.
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.