
Africa Health Research Institute INDEPTH Core Dataset 2000 - 2015 Residents only (Release 2017)

South Africa, 2000 - 2015
Kobus Herbst, Frank Tanser, Deenan Pillay
Created on March 29, 2019. Last modified March 29, 2019.
  • Identification
  • Version
  • Scope
  • Coverage
  • Producers and sponsors
  • Sampling
  • Data Collection
  • Questionnaires
  • Data Processing
  • Data Appraisal
  • Access policy
  • Disclaimer and copyrights
  • Metadata production

Identification

Survey ID Number
ZAF_2000-2015_INDEPTH-ACDIS_v01_M
Title
Africa Health Research Institute INDEPTH Core Dataset 2000 - 2015 Residents only (Release 2017)
Subtitle
Residents only, Release 2014
Country
Name Country code
South Africa ZAF
Study type
Demographic Surveillance
Series Information
This dataset contains rounds 1 to 37 of demographic surveillance data covering the period from 1 Jan 2000 to 31 December 2015. Two rounds of data collection took place annually except in 2002 when three surveillance rounds were conducted. In 2012 we reverted to three rounds of data collection again. Note that this does not mean the dataset contains 37 individual cross-sectional components; rather, the information (events) associated with each individual could have been updated at each of these 37 surveillance rounds if the individual was under surveillance during that round.
Abstract
The health and demography of the South African population has been undergoing substantial changes as a result of the rapidly progressing HIV epidemic. Researchers at the University of KwaZulu-Natal and the South African Medical Research Council established The Africa Health Research Studies in 1997 funded by a core grant from The Wellcome Trust, UK. Given the urgent need for high quality longitudinal data with which to monitor these changes, and with which to evaluate interventions to mitigate impact, a demographic surveillance system (DSS) was established in a rural South African population facing a rapid and severe HIV epidemic. The DSS, referred to as the Africa Health Research Institute Demographic Information System (ACDIS), started in 2000.

ACDIS was established to ‘describe the demographic, social and health impact of the HIV epidemic in a population going through the health transition’ and to monitor the impact of intervention strategies on the epidemic. South Africa’s political and economic history has resulted in highly mobile urban and rural populations, coupled with complex, fluid households. In order to successfully monitor the epidemic, it was necessary to collect longitudinal demographic data (e.g. mortality, fertility, migration) on the population and to mirror this complex social reality within the design of the demographic information system. To this end, three primary subjects are observed longitudinally in ACDIS: physical structures (e.g. homesteads, clinics and schools), households and individuals. The information about these subjects, and all related information, is stored in a single MSSQL Server database, in a truly longitudinal way—i.e. not as a series of cross-sections.

The surveillance area is located near the market town of Mtubatuba in the uMkhanyakude district of KwaZulu-Natal. The area is 438 square kilometers in size and includes a population of approximately 85 000 people who are members of approximately 11 000 households. The population is almost exclusively Zulu-speaking. The area is typical of many rural areas of South Africa in that while predominantly rural, it contains an urban township and informal peri-urban settlements. The area is characterized by large variations in population densities (20–3000 people/km²). In the rural areas, homesteads are scattered rather than grouped. Most households are multi-generational, with an average size of 7.9 (SD: 4.7) members. Despite being a predominantly rural area, the principal source of income for most households is waged employment and state pensions rather than agriculture. In 2006, approximately 77% of households in the surveillance area had access to piped water and toilet facilities.

To fulfil the eligibility criteria for the ACDIS cohort, individuals must be a member of a household within the surveillance area but not necessarily resident within it. Crucially, this means that ACDIS collects information on resident and non-resident members of households and makes a distinction between membership (self-defined on the basis of links to other household members) and residency (residing at a physical structure within the surveillance area at a particular point in time). Individuals can be members of more than one household at any point in time (e.g. polygamously married men whose wives maintain separate households). As of June 2006, there were 85 855 people under surveillance of whom 33% were not resident within the surveillance area. Obtaining information on non-resident members is vital for a number of reasons. Most importantly, understanding patterns of HIV transmission within rural areas requires knowledge about patterns of circulation and about sexual contacts between residents and their non-resident partners. To be consistent with similar datasets from other INDEPTH Member centres, this data set contains data from resident members only.

During data collection, households are visited by fieldworkers and information is supplied by a single key informant. All births, deaths and migrations of household members are recorded. If household members have moved internally within the surveillance area, such moves are reconciled and the internal migrant retains the original identifier associated with him/her.
Kind of Data
Event history data
Unit of Analysis
Individual

Version

Version Description
v1 : Version extracted from analytical database ACDIS_A20161215

This study represents only a portion of the total data associated with the complete AHRI Population Intervention Platform as described in the study abstract.
Version Date
2017

Scope

Notes
The study only includes the events defining the resident exposure of individuals under surveillance as well as the delivery events of resident women. Each type of event contains minimal attributes describing the event.

Attributes common to each event:
Event Type
Event Date
Observation date

Migration:
Origin and Destination

Death:
Cause

Delivery:
Live born and Still born counts
Parity
Topics
Topic Vocabulary URI
Demography [N01.224] MeSH http://www.ncbi.nlm.nih.gov/mesh
Age Distribution [N01.224.033] MeSH http://www.ncbi.nlm.nih.gov/mesh
Emigration and Immigration [N01.224.625.350] MeSH http://www.ncbi.nlm.nih.gov/mesh
Residential Mobility [N01.224.791.700] MeSH http://www.ncbi.nlm.nih.gov/mesh
Sex Distribution [N01.224.803] MeSH http://www.ncbi.nlm.nih.gov/mesh
Vital Statistics [N01.224.935] MeSH http://www.ncbi.nlm.nih.gov/mesh
Life Expectancy [N01.224.935.464] MeSH http://www.ncbi.nlm.nih.gov/mesh
Mortality [N01.224.935.698] MeSH http://www.ncbi.nlm.nih.gov/mesh
Cause of Death [N01.224.935.698.100] MeSH http://www.ncbi.nlm.nih.gov/mesh
Birth Rate [N01.224.935.849.500] MeSH http://www.ncbi.nlm.nih.gov/mesh
Rural Population [N01.600.725] MeSH http://www.ncbi.nlm.nih.gov/mesh
Maternal Age [N06.850.490.250.550] MeSH http://www.ncbi.nlm.nih.gov/mesh
Parity [N06.850.490.812.600] MeSH http://www.ncbi.nlm.nih.gov/mesh
Survival Analysis [N06.850.520.830.998] MeSH http://www.ncbi.nlm.nih.gov/mesh

Coverage

Geographic Coverage
Demographic surveillance area situated in the south-east portion of the uMkhanyakude district of KwaZulu-Natal province near the town of Mtubatuba. It is bounded on the west by the Umfolozi-Hluhluwe nature reserve, on the south by the Umfolozi river, on the east by the N2 highway (except for portions where the Kwamsane township straddles the highway) and in the north by the Inyalazi river for portions of the boundary. The area is 438 square kilometers.
Universe
Resident household members of households resident within the demographic surveillance area. Inmigrants are defined by intention to become resident, but actual residence episodes of less than 180 days are censored. Outmigrants are defined by intention to become resident elsewhere, but actual periods of non-residence less than 180 days are censored. Children born to resident women are considered resident by default, irrespective of actual place of birth. The dataset contains the events of all individuals ever resident during the study period (1 Jan 2000 to 31 Dec 2015).

Producers and sponsors

Primary investigators
Name Affiliation
Kobus Herbst Africa Health Research Institute (ZA031)
Frank Tanser Africa Health Research Institute (ZA031)
Deenan Pillay Africa Health Research Institute (ZA031)
Producers
Name Affiliation Role
Tinofa Mutevedzi Africa Health Research Institute (ZA031) Data Collection
Funding Agency/Sponsor
Name Abbreviation Role
Wellcome Trust WT
Wellcome Trust WT prior funder
Other Identifications/Acknowledgments
Name Affiliation Role
Dickman Gareta Africa Health Research Institute (ZA031) Database Scientist
Sweetness Dube Africa Health Research Institute (ZA031) Data Documentation Archivist

Sampling

Sampling Procedure
This dataset is not based on a sample but contains information from the complete demographic surveillance area.

Response units (households) by year:
Year Households
2000 11856
2001 12321
2002 12981
2003 12165
2004 11841
2005 11312
2006 12065
2007 12165
2008 11790
2009 12145
2010 12485
2011 12455
2012 12087
2013 11988
2014 11778
2015 11938

In 2006 the number of response units increased due to the addition of a new village into the demographic surveillance area.
Deviations from the Sample Design
None
Response Rate
Household response rates are as follows. A household that has not responded for 2 years following its last recorded visit is assumed to be lost to follow-up and is no longer part of the response rate denominator:

Year Response Rate
2000 94%
2001 93%
2002 96%
2003 91%
2004 88%
2005 84%
2006 88%
2007 89%
2008 87%
2009 88%
2010 89%
2011 89%
2012 89%
2013 90%
2014 89%
2015 91%
Weighting
Not applicable

Data Collection

Dates of Data Collection
Start End Cycle
2000-01-01 2015-12-31 Release coverage
2000-08-01 2001-02-01 Round 2
2001-02-01 2001-06-25 Round 3
2001-06-25 2002-01-07 Round 4
2002-01-07 2002-05-06 Round 5
2002-05-06 2002-09-02 Round 6
2002-09-02 2003-01-08 Round 7
2003-01-08 2003-06-18 Round 8
2003-06-18 2003-11-27 Round 9
2003-11-27 2004-06-07 Round 10
2004-06-07 2005-01-01 Round 11
2005-01-01 2005-07-04 Round 12
2005-07-04 2006-01-10 Round 13
2006-01-10 2006-07-17 Round 14
2006-07-17 2007-01-10 Round 15
2007-01-10 2007-07-11 Round 16
2007-07-11 2008-01-13 Round 17
2008-01-13 2008-07-01 Round 18
2008-07-01 2009-01-11 Round 19
2009-01-11 2009-07-13 Round 20
2009-07-13 2009-12-11 Round 21
2009-12-11 2010-06-11 Round 22
2010-06-11 2010-12-21 Round 23
2010-12-21 2011-06-20 Round 24
2011-06-20 2011-11-17 Round 25
2012-01-11 2012-05-29 Round 26
Frequency of Data Collection
This dataset contains rounds 1 to 37 of demographic surveillance data covering the period from 1 Jan 2000 to 31 December 2015. Two rounds of data collection took place annually except in 2002 when three surveillance rounds were conducted. From 1 Jan 2015 onwards there are three surveillance rounds per annum.
Time periods
Start date End date Cycle
2000-01-01 2011-12-31 Release coverage
Data Collection Mode
Proxy Respondent [proxy]
Supervision
Fieldworkers operated in teams of between 8 and 12, each team supervised by a fieldwork supervisor. Supervisors conduct supervised visits and quality control visits, and review fieldworkers' data collection.
Data Collection Notes
Enumerators were trained immediately prior to the baseline data collection, and refresher training was then conducted for one week between each surveillance round. New fieldworkers received a standardised 6-week training course prior to appointment as data collectors. Data entry staff received fieldwork training in addition to training in the use of the data entry programs.
Data Collectors
Name Abbreviation Affiliation
The Africa Health Research Institute ZA031 UKZN

Questionnaires

Questionnaires
Bounded structure registration (BSR) or update (BSU) form:
- Used to register characteristics of the BS
- Updates characteristics of the BS
- Information as at previous round is preprinted

Household registration (HHR) or update (HHU) form:
- Used to register characteristics of the HH
- Used to update information about the composition of the household
- Information on composition and all registered households as at previous round is preprinted

Household Membership Registration (HMR) or update (HMU):
- Used to link individuals to households
- Used to update information about the household memberships and member status observations
- Information on member status observations as at previous round is preprinted

Individual registration form (IDR):
- Used to uniquely identify each individual
- Mainly to ensure members with multiple household memberships are appropriately captured

Migration notification form (MGN):
- Used to record change in the BS of residency of individuals or households
- Migrants are tracked and updated in the database

Pregnancy history form (PGH) & pregnancy outcome notification form (PON):
- Records details of pregnancies and their outcomes
- Only if woman is a new member
- Only if woman has never completed WHL or WGH

Death notification form (DTN):
- Records all deaths that have recently occurred
- Includes information about time, place, circumstances and possible cause of death

Data Processing

Data Editing
At data entry, data consistency and plausibility were checked by 455 data validation rules at database level. If a data validation failure was due to a data collection error, the questionnaire was referred back to the field for revisit and correction. If the error was due to data inconsistencies that could not be directly traced to a data collection error, the record was referred to the data quality team under the supervision of the senior database scientist. This team could request further field-level investigation by a team of trackers or could correct the inconsistency directly at database level.

No imputations were done on the resulting micro data set, except for:

a. If an out-migration (OMG) event is followed by a homestead entry event (ENT) and the gap between OMG event and ENT event is greater than 180 days, the ENT event was changed to an in-migration event (IMG).
b. If an out-migration (OMG) event is followed by a homestead entry event (ENT) and the gap between OMG event and ENT event is less than 180 days, the OMG event was changed to a homestead exit event (EXT) and the ENT event date changed to the day following the original OMG event.
c. If a homestead exit event (EXT) is followed by an in-migration event (IMG) and the gap between the EXT event and the IMG event is greater than 180 days, the EXT event was changed to an out-migration event (OMG).
d. If a homestead exit event (EXT) is followed by an in-migration event (IMG) and the gap between the EXT event and the IMG event is less than 180 days, the IMG event was changed to a homestead entry event (ENT) with a date equal to the day following the EXT event.
e. If the last recorded event for an individual is homestead exit (EXT) and this event is more than 180 days prior to the end of the surveillance period, then the EXT event is changed to an out-migration event (OMG).
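Rules a to e can be read as a single pass over each individual's date-ordered events. The following is a minimal Python sketch of that reading, not the code used to build the dataset. The `Event` class and the function name are hypothetical, and because the rules do not say what happens when the gap is exactly 180 days, such pairs are left unchanged here.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta

GAP_DAYS = 180  # threshold stated in rules a to e


@dataclass(frozen=True)
class Event:
    code: str   # "OMG", "ENT", "EXT", "IMG", ... (hypothetical container)
    when: date


def apply_residency_rules(events, end_of_surveillance):
    """Return a new event list for one individual with rules a-e applied.

    `events` must be sorted by date. A gap of exactly 180 days is not
    covered by the rules as written, so it is left unchanged (assumption).
    """
    out = list(events)
    for i in range(len(out) - 1):
        first, second = out[i], out[i + 1]
        gap = (second.when - first.when).days
        pair = (first.code, second.code)
        if pair == ("OMG", "ENT"):
            if gap > GAP_DAYS:        # rule a: ENT becomes IMG
                out[i + 1] = replace(second, code="IMG")
            elif gap < GAP_DAYS:      # rule b: OMG becomes EXT, ENT is re-dated
                out[i] = replace(first, code="EXT")
                out[i + 1] = replace(second, when=first.when + timedelta(days=1))
        elif pair == ("EXT", "IMG"):
            if gap > GAP_DAYS:        # rule c: EXT becomes OMG
                out[i] = replace(first, code="OMG")
            elif gap < GAP_DAYS:      # rule d: IMG becomes ENT on the next day
                out[i + 1] = Event("ENT", first.when + timedelta(days=1))
    # rule e: a trailing EXT well before the end of surveillance becomes OMG
    if out and out[-1].code == "EXT":
        if (end_of_surveillance - out[-1].when).days > GAP_DAYS:
            out[-1] = replace(out[-1], code="OMG")
    return out
```

For example, an OMG on 2005-01-01 followed by an ENT on 2005-03-01 (a gap under 180 days) would be rewritten under rule b as an EXT on 2005-01-01 and an ENT on 2005-01-02.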

In the case of the village that was added (enumerated) in 2006, some individuals may have out-migrated from the original surveillance area and settled in the new village prior to the first enumeration. Where the records of such individuals have been linked, an individual can legitimately have an out-migration event (OMG) followed by an enumeration event (ENU). In a few of these cases a homestead exit event (EXT) was followed by an enumeration event. In these instances the EXT events were changed to out-migration events (OMG).

To produce this micro-data set, the episode table is processed using the Pentaho Kettle ETL program into a standard event-history format.

The following processing checks are done during the ETL process.

1. If the first event is legal. The first event must be enumeration, birth or inmigration.
2. If the last event is legal. The last event must be end of observation, death or outmigration.
3. If the transition events are legal.
The list of legal transitions:

Birth followed by death
Birth followed by exit
Birth followed by end of observation
Birth followed by outmigration

Death followed by none

Entry followed by death
Entry followed by exit
Entry followed by end of observation
Entry followed by outmigration

Enumeration followed by death
Enumeration followed by exit
Enumeration followed by outmigration

Exit followed by entry

Inmigration followed by death
Inmigration followed by exit
Inmigration followed by end of observation
Inmigration followed by outmigration

End of observation followed by none

Outmigration followed by none
Outmigration followed by enumeration
Outmigration followed by inmigration

The list of illegal transitions:

Birth followed by none
Birth followed by birth
Birth followed by entry
Birth followed by enumeration
Birth followed by inmigration

Death followed by birth
Death followed by death
Death followed by entry
Death followed by enumeration
Death followed by exit
Death followed by inmigration
Death followed by outmigration
Death followed by end of observation

Entry followed by none
Entry followed by birth
Entry followed by entry
Entry followed by enumeration
Entry followed by inmigration

Enumeration followed by none
Enumeration followed by birth
Enumeration followed by entry
Enumeration followed by enumeration
Enumeration followed by inmigration

Exit followed by birth
Exit followed by death
Exit followed by exit
Exit followed by end of observation
Exit followed by outmigration

Inmigration followed by none
Inmigration followed by birth
Inmigration followed by entry
Inmigration followed by enumeration
Inmigration followed by inmigration

End of observation followed by birth
End of observation followed by death
End of observation followed by entry
End of observation followed by enumeration
End of observation followed by exit
End of observation followed by inmigration
End of observation followed by end of observation
End of observation followed by outmigration

Outmigration followed by birth
Outmigration followed by death
Outmigration followed by exit
Outmigration followed by end of observation
Outmigration followed by outmigration

List of edited events:

Exit followed by none
Exit followed by enumeration
Exit followed by inmigration

Outmigration followed by entry
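
The three processing checks can be sketched as a lookup against the list of legal transitions. The Python below is an illustrative sketch only, not the Pentaho Kettle transformation itself. The function name and the lowercase event labels are hypothetical, and the table is transcribed from the legal list as given, so any transition absent from that list (including the edited ones) is simply reported.

```python
# Event labels are written out in full; None stands for "followed by none".
LEGAL_FIRST = {"enumeration", "birth", "inmigration"}
LEGAL_LAST = {"end of observation", "death", "outmigration"}

# Transcribed from the list of legal transitions as given.
LEGAL_NEXT = {
    "birth": {"death", "exit", "end of observation", "outmigration"},
    "death": {None},
    "entry": {"death", "exit", "end of observation", "outmigration"},
    "enumeration": {"death", "exit", "outmigration"},
    "exit": {"entry"},
    "inmigration": {"death", "exit", "end of observation", "outmigration"},
    "end of observation": {None},
    "outmigration": {None, "enumeration", "inmigration"},
}


def check_history(events):
    """Return a list of problems found in one individual's ordered events."""
    if not events:
        return ["empty history"]
    problems = []
    if events[0] not in LEGAL_FIRST:          # check 1
        problems.append(f"illegal first event: {events[0]}")
    if events[-1] not in LEGAL_LAST:          # check 2
        problems.append(f"illegal last event: {events[-1]}")
    followers = list(events[1:]) + [None]     # check 3: each event and its successor
    for current, following in zip(events, followers):
        if following not in LEGAL_NEXT.get(current, set()):
            label = "none" if following is None else following
            problems.append(f"transition not listed as legal: {current} -> {label}")
    return problems
```

A history such as birth, exit, entry, death passes all three checks and yields an empty list, while any pair drawn from the illegal or edited lists is reported by name.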
Other Processing
All homesteads in the Hlabisa sub-district were geocoded and entered into a geographic information system (GIS) prior to the start of surveillance. The demographic surveillance area was selected on the basis of this information to include an area with clear geographic boundaries and an estimated population size suitable for the envisaged research agenda. Since then the GIS database has been updated based on notification of new homesteads from the fieldwork and periodic reviews of satellite and aerial photography.

Mapping teams used differentially corrected global positioning system (GPS) units (accuracy <2m) to geocode homesteads.

How was document control conducted to ensure all census forms were completed?

Before each round, a SQL script generated a list of questionnaires to be printed for each household resident in the surveillance area. Each questionnaire is given a unique integer key which is printed as a barcode on the questionnaire. A series of web-based reports called 'Unified Reports' are then used to track and control the status of each questionnaire from document production, data collection, data entry and document archiving. A strict chain of custody is enforced for all questionnaire movements.

Data entry is performed by a team of 6 data capturers with one supervisor using in-house developed software (Delphi and .NET C#). Double-entry is not routinely used except in the case of verbal autopsy questionnaires.

Data is stored in a MS SQL database, with transaction logging, daily backups and twice weekly off-site backups. Constraints and validation rules placed on the database help in checking data quality during data entry.

All data entry done by each data capturer in the first five days of each round is 100% rechecked by the supervisor. If during those 5 days the data capturer's work is consistently error-free, only 20% of their work will be subjected to rechecking by a supervisor. If any error is picked up in the 20% rechecking, then their work gets subjected to 100% recheck for another 5 consecutive days.
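
The recheck schedule described above behaves like a small two-state machine. Below is a hedged Python sketch of one possible reading; the function name and parameters are hypothetical. The text does not say what happens when an error is found during a 100% recheck period, so this sketch assumes the count of five error-free days starts again.

```python
def recheck_fractions(daily_error_found, full_days=5, reduced_fraction=0.2):
    """Return the recheck fraction applied on each working day of a round.

    `daily_error_found` holds one boolean per day: True if the supervisor
    found an error in that day's rechecked work.
    Assumption: an error during a 100% period restarts the 5-day count.
    """
    fractions = []
    clean_streak = 0   # consecutive error-free days under 100% recheck
    full = True        # every round starts with 100% rechecking
    for error in daily_error_found:
        fractions.append(1.0 if full else reduced_fraction)
        if full:
            clean_streak = 0 if error else clean_streak + 1
            if clean_streak >= full_days:
                full = False          # drop to 20% rechecking
        elif error:
            full = True               # back to 100% for another 5 days
            clean_streak = 0
    return fractions
```

With five error-free days followed by a clean day, a day with an error and one more day, the sketch returns 100% for the first five days, 20% for the next two, and 100% again on the last day.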

Field QC Procedures

- Supervised visits - this exercise is carried out by the fieldworker and the supervisor jointly. The two select a sample of bounded structures which they will visit together. During a supervised visit, the supervisor listens and observes, without interrupting, as the fieldworker conducts the interviews. The supervisor uses a checklist to write observations and comments for feedback and further training of a particular fieldworker immediately after departure from a BS. The supervised visit checklist is submitted to the QC section and is used for performance analysis, as well as for identification of training needs.

- Quality Control visits - these are repeat data collection visits conducted by a fieldwork supervisor soon after the fieldworker completes routine data collection at a homestead. This is done mainly to ensure accuracy and reliability of the information collected by fieldworkers. Homesteads for quality control visits are selected randomly by computer as a 5% sample of the total number of homesteads to be visited in each round. The original copy and the supervisor's copy are then compared by the quality controllers to identify discrepancies. If discrepancies are found, both copies are sent back to the field for reconciliation. The records are also kept at the quality control section for analysis towards the end of the round, which also contributes to performance management of individual employees.

QC at the office before data entry

After data collection and before data entry, the office-based QC section checks questionnaires for completeness, consistency and accuracy. If a questionnaire fails to meet the quality standard requirements, the QC clerk sends the questionnaire back to the fieldworker's supervisor.

Specify how the data was extracted (including which software program was used) to produce the core micro data set. How were inconsistent records dealt with during this process?

Following data collection and data entry completion at the end of a surveillance round, a snapshot of the operational database is created as an analytical database. Each such snapshot is uniquely identified and analytical datasets must reference the analytical database they originated from. Analytical datasets are never produced directly from the operational database, as this database is continually in flux as data is updated through the data collection and entry processes.

An SQL script produces a normalised episode table each time an analytical database is created. This episode table contains an exposure record for each exposure episode for an individual, from initial enumeration, birth or in-migration, up to eventual death or out-migration. The episode table contains the start event and date of the exposure as well as the end event and date of the exposure. Individuals that out-migrate and later in-migrate are reconciled as far as possible using individual identifiers (national identity number, names, sex and date of birth) under a single individual identity. All internal movements (migrations) are reconciled and residencies at different homesteads within the surveillance area are reflected as separate episodes in the episode table.
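
To illustrate the relationship between the episode table and an event-history layout, the Python sketch below flattens episode records into one row per event. It is an illustration under assumptions, not the SQL script or ETL process used for this dataset; the field names (`individual_id`, `start_event`, `start_date`, `end_event`, `end_date`) and the function name are hypothetical.

```python
from datetime import date


def episodes_to_events(episodes):
    """Flatten episode records into (individual_id, event, date) rows.

    Each episode contributes its start event and its end event. Rows are
    sorted by individual and then by date; Python's sort is stable, so a
    start and end event on the same date keep their original order.
    """
    rows = []
    for ep in episodes:
        rows.append((ep["individual_id"], ep["start_event"], ep["start_date"]))
        rows.append((ep["individual_id"], ep["end_event"], ep["end_date"]))
    rows.sort(key=lambda row: (row[0], row[2]))
    return rows


# Hypothetical example: one individual with two residency episodes.
example = [
    {"individual_id": 1, "start_event": "enumeration", "start_date": date(2000, 1, 1),
     "end_event": "exit", "end_date": date(2003, 5, 1)},
    {"individual_id": 1, "start_event": "entry", "start_date": date(2003, 5, 2),
     "end_event": "death", "end_date": date(2010, 2, 3)},
]
events = episodes_to_events(example)
```

Here two episodes for the same individual become four event rows in chronological order: enumeration, exit, entry, death.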

In the case of deaths, the next of kin are visited by a verbal autopsy nurse and a derivation of the INDEPTH standard verbal autopsy questionnaire is used to document the death. The verbal autopsy questionnaires are interpreted by the InterVA-4 program to derive cause of death information.

Data Appraisal

Estimates of Sampling Error
Not applicable

Access policy

Access authority
Name Affiliation Email
iSHARE2 Help desk INDEPTH help-data@indepth-network.org
Contacts
Name Affiliation Email URL
iSHARE2 Helpdesk INDEPTH help-data@indepth-network.org http://www.indepth-ishare.org/index.php/howtouse
Confidentiality
This data is anonymised and no confidentiality agreement in addition to the general data use agreement is required.
Access conditions
This data is made available for licensed access under the following conditions:

1. Data and other material provided by INDEPTH will not be redistributed or sold to other individuals, institutions or organisations without INDEPTH's written agreement.

2. In the case of multi-centre datasets, data originating from a single contributing member centre of the INDEPTH Network may not be analysed or reported on in isolation without the express permission of the member centre concerned.

3. No attempt will be made to re-identify respondents, and there will be no use of the identity of any person or establishment discovered inadvertently. Any such discovery will be reported immediately to INDEPTH.

4. No attempt will be made to produce links between datasets provided by INDEPTH or between INDEPTH data and other datasets that could identify individuals.

5. Any books, articles, conference papers, theses, dissertations, reports or other publications employing data obtained from INDEPTH will cite the source, in line with the citation requirement provided with the dataset.

6. An electronic copy of all publications based on the requested data will be sent to INDEPTH.

7. The original collector of the data, INDEPTH, and the relevant funding agencies bear no responsibility for the data's use or interpretation or inferences based upon it.
Citation requirements
Any use of this dataset must cite the digital object identifier (doi) associated with this dataset, using the following form:

"Africa Health Research Institute INDEPTH Core Dataset 2000-2015 (Residents only) - Release 2017. Jul 2017. Provided by the INDEPTH Network Data Repository. www.indepth-network.org <http://www.indepth-network.org>. doi:10.7796/INDEPTH.ZA031.CMD2015.v1"
Location of Data Collection
INDEPTH Data Repository
Archive where study is originally stored
Africa Centre (ZA031)

Disclaimer and copyrights

Disclaimer
The user of the data acknowledges that the original collector of the data, INDEPTH, and the relevant funding agencies bear no responsibility for the data's use or interpretation or inferences based upon it.
Copyright
This dataset documentation is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. The dataset is shared in terms of the data-use agreement accepted at the time of data download.

Metadata production

DDI Document ID
DDI_ZAF_2000-2015_INDEPTH-ACDIS_v01_M
Producers
Name Abbreviation Affiliation Role
iSHARE2 Technical Team iS2TT INDEPTH Network Documentation of the study
AJ Herbst Africa Health Research Institute (ZA031) DDI author
SH Dube Africa Health Research Institute (ZA031) Data documentation archivist
Date of Metadata Production
2017-06-29
DDI Document version
- v01 (June 2017)
The DDI was produced by INDEPTH Network. It was downloaded on October 12, 2017 from http://www.indepth-ishare.org/index.php/catalog/137/ by the World Bank Microdata Library documentation team.

- v02 (October 2017)
Modifications in the study ID and DDI ID were done by the World Bank Microdata Library documentation team to match the standard used by the library and the IHSN Survey Catalog. Some metadata fields were also edited.
© IHSN Survey Catalog, All Rights Reserved.