<?xml version="1.0" encoding="UTF-8"?>
<codeBook version="1.2.2" ID="MDA_2013_MCC-VCT_v01_M" xml-lang="en" xmlns="http://www.icpsr.umich.edu/DDI" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.icpsr.umich.edu/DDI http://www.icpsr.umich.edu/DDI/Version1-2-2.xsd">
<docDscr>
  <citation>
    <titlStmt>
      <IDNo>DDI_MDA_2013_MCC-VCT_v01_M</IDNo>
    </titlStmt>
    <prodStmt>
      <producer abbr="MCC" affiliation="" role="Review of Metadata">Millennium Challenge Corporation</producer>
      <prodDate date="2014-10-27">2014-10-27</prodDate>
      <software version="v5">NADA</software>
    </prodStmt>
    <verStmt>
      <version>Version 1.1 (Original 2014-9-22)
Version 2.0 (April 2015). Edited version based on Version 01 (DDI-MCC-MDA-IE-AG-2012-v1.1) that was done by Millennium Challenge Corporation.</version>
    </verStmt>
  </citation>
</docDscr>
<stdyDscr>
  <citation>
    <titlStmt>
      <titl>Value Chain Training 2013</titl>
      <subTitl/>
      <altTitl>MCC-VCT 2013</altTitl>
      <parTitl/>
      <IDNo>MDA_2013_MCC-VCT_v01_M</IDNo>
    </titlStmt>
    <rspStmt>
      <AuthEnty affiliation="">Mathematica Policy Research</AuthEnty>
    </rspStmt>
    <prodStmt>
      <copyright/>
      <software version="5.0" date="2021-04-09">NADA</software>
      <fundAg abbr="MCC" role="">Millennium Challenge Corporation</fundAg>
      <grantNo/>
    </prodStmt>
    <distStmt>
      <contact affiliation="Millennium Challenge Corporation" URI="" email="impact-eval@mcc.gov">Monitoring &amp; Evaluation Division</contact>
      <depDate date=""/>
      <distDate date=""/>
    </distStmt>
    <serStmt>
      <serName>Independent Impact Evaluation</serName>
      <serInfo/>
    </serStmt>
    <verStmt>
      <version date="">Anonymized dataset for public distribution</version>
      <verResp/>
      <notes/>
    </verStmt>
    <biblCit format=""/>
    <notes/>
  </citation>
  <stdyInfo>
    <studyBudget/>
    <subject>
      <keyword vocab="" vocabURI="">Moldova</keyword>
      <keyword vocab="" vocabURI="">agriculture</keyword>
      <keyword vocab="" vocabURI="">farmer training</keyword>
      <keyword vocab="" vocabURI="">impact evaluation</keyword>
      <keyword vocab="" vocabURI="">randomization</keyword>
      <topcClas vocab="" vocabURI="">Moldova</topcClas>
      <topcClas vocab="" vocabURI="">agriculture</topcClas>
      <topcClas vocab="" vocabURI="">farmer training</topcClas>
      <topcClas vocab="" vocabURI="">impact evaluation</topcClas>
      <topcClas vocab="" vocabURI="">randomization</topcClas>
    </subject>
    <abstract>The impact evaluation is not designed to measure the overall impact of the GHS activity. Instead, the impact evaluation will be able to provide evidence on the impact of the value chain training subactivity (alone) in an environment in which other value chain constraints are concurrently addressed. 

The evaluation of the GHS value chain training subactivity will focus on measuring the extent, if any, to which the training activities improved the productivity and profitability of participants. In particular, the evaluation will address the following research questions: 
1. What is the impact of GHS value chain training on adoption of new practices and production (yield) within the context of a value chain project? Do these impacts vary by value chain? Were some practices or combinations of practices adopted more than others, and why or why not?
2. Does distance from a GHS value chain training site affect participation in GHS value chain training? What other factors affect participation?
3. To what degree are new practices adopted by value chain participants who do not themselves participate in GHS value chain training activities? Can adoption by nonparticipants be attributed to program ripple effects, rather than broader trends? 
4. How does the impact of value chain training on adoption of new practices and production vary with the characteristics of farm operators and farm households?

The impact evaluation of the GHS value chain training subactivity will use a random assignment evaluation design. Eighty potential training sites were randomly assigned to a treatment group (48 sites)--at which training activities will be conducted--or to a control group (32 sites)--at which training activities will not be conducted. Though random assignment will determine where GHS value chain training activities are held, it will not necessarily determine which farmers participate in training. Farmers living in communities that are near control sites will be free to attend trainings held in other communities and may travel to do so; likewise, not all farmers living near treatment sites will attend trainings. If all farmers in treatment sites attended training while all farmers in control sites did not, the impacts of training could be estimated by comparing the outcomes of treatment group farmers to the outcomes of control group farmers at follow-up. If instead some farmers living near treatment sites choose not to attend training while some farmers living near control sites do attend training--which is our expectation--the evaluation approach will have to account for this phenomenon. 

The evaluator will be able to measure the impacts of the GHS value chain training subactivity as long as farmers living near treatment sites are more likely to attend GHS value chain training activities than farmers who live near control sites. The estimation approach will exploit the variation in the likelihood of attending GHS value chain training activities induced by random assignment. In particular, the impact of the GHS value chain training subactivity will be estimated using an instrumental variables (IV) framework, using distance from training as an instrument for participation in training. In this context, using an IV approach is not unlike a comparison of farmers in treatment and control sites, except that it adjusts for the fact that some control farmers will participate in GHS value chain training activities and some treatment farmers will not participate. 

The IV approach is credible in this context because training sites were assigned randomly. Because training locations were assigned randomly, we can assume that farmers near treatment sites are the same, on average, as farmers living near control sites (before training activities take place). The IV approach isolates the component of participation that is driven by the instrument (here, distance). The IV estimates can be interpreted as the impact for a key group affected by the training subactivity--farmers who undertake training if it is offered nearby, but not if it is offered far away. 

This evaluation design will enable the evaluator to measure the impacts of participating in GHS value chain training activities. Importantly, all value chain participants could benefit from the activities, whether or not they participate in training; furthermore, other activities in the value chain could amplify the benefits of training. Therefore, impacts measured through the evaluation will tell us the impacts of training in an environment in which other value chain barriers are addressed; they will not tell us the full impact of all of the activities or what the impact of training would be in the absence of other, related activities.</abstract>
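    <!-- Illustrative sketch, not part of the original study record.
    The IV logic described in the abstract can be sketched as a simple Wald estimator: the
    treatment/control difference in an outcome divided by the treatment/control difference in
    training attendance. This is a minimal illustration, not the evaluator's estimation code.
    It assumes a binary instrument (living near a treatment site) instead of the distance
    measure named in the abstract, and it ignores strata, weights, and clustering. The function
    name, variable names, and data values are hypothetical.

```python
from statistics import mean


def wald_iv_estimate(outcome, attended, near_treatment_site):
    """Wald (IV) estimate with a binary instrument.

    outcome: outcome value per farmer (for example, yield)
    attended: 1 if the farmer attended training, else 0
    near_treatment_site: 1 if the farmer lives near a treatment site, else 0
    """
    y_treat = [y for y, z in zip(outcome, near_treatment_site) if z == 1]
    y_ctrl = [y for y, z in zip(outcome, near_treatment_site) if z == 0]
    d_treat = [d for d, z in zip(attended, near_treatment_site) if z == 1]
    d_ctrl = [d for d, z in zip(attended, near_treatment_site) if z == 0]

    # Reduced form: effect of site assignment on the outcome.
    reduced_form = mean(y_treat) - mean(y_ctrl)
    # First stage: effect of site assignment on training attendance.
    first_stage = mean(d_treat) - mean(d_ctrl)
    if first_stage == 0:
        raise ValueError("Assignment does not shift attendance; estimate undefined.")
    return reduced_form / first_stage


# Made-up data for eight hypothetical farmers (not study data).
near_site = [1, 1, 1, 1, 0, 0, 0, 0]
attended = [1, 1, 1, 0, 1, 0, 0, 0]
outcome = [10, 10, 10, 6, 10, 6, 6, 6]
estimate = wald_iv_estimate(outcome, attended, near_site)
```

    The estimate is defined only when attendance differs between the two groups, which mirrors
    the condition stated in the abstract that farmers near treatment sites must be more likely
    to attend training than farmers near control sites.
    -->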
    <sumDscr>
      <collDate date="2013-01-01" event="start" cycle=""/>
      <collDate date="2013-03-31" event="end" cycle=""/>
      <nation abbr="MDA">Moldova</nation>
      <geogCover>Data were collected from farmers in communities throughout rural Moldova, but only in communities that were considered for training. Training was not necessarily offered in every such community; for example, it was not offered in communities randomly assigned to the control group.</geogCover>
      <geogUnit/>
      <anlyUnit>Farms</anlyUnit>
      <universe>The study population includes farm operators in approximately 88 communities--48 treatment communities, 32 control communities, and 8 A-list communities (high priority sites that were purposefully selected to receive training). To be included in the study, farmers must have cultivated targeted crops (which, for each community, were identified in advance by the implementer). Across these 88 communities, about 2100 farmers were interviewed in the 2013-2014 FOS.

Because this is a panel survey, the same households will be interviewed across multiple rounds.</universe>
      <dataKind>Sample survey data [ssd]</dataKind>
    </sumDscr>
    <!-- qualityStatement - ddi2.5 - complex type
     
     This structure consists of two parts, standardsCompliance and otherQualityStatements. 
     In standardsCompliance list all specific standards complied with during the execution of this 
     study. Note the standard name and producer and how the study complied with the standard. 
     Enter any additional quality statements in otherQualityStatements.
     
     -->
    <qualityStatement>
      <standardsCompliance>
        <standard>
          <standardName/>
          <producer/>
        </standard>
        <complianceDescription/>
      </standardsCompliance>
      <otherQualityStatement/>
    </qualityStatement>
    <notes/>
    <!-- exPostEvaluation ddi2.5
      Use this section to describe evaluation procedures not addressed in data evaluation processes. 
      These may include issues such as timing of the study, sequencing issues, cost/budget issues, 
      relevance, institutional or legal arrangements etc. of the study. 
      
      The completionDate attribute holds the date the evaluation was completed. 
      The type attribute is an optional type to identify the type of evaluation with or without 
      the use of a controlled vocabulary.
    -->
    <exPostEvaluation completionDate="" type="">
      <evaluationProcess/>
      <outcomes/>
    </exPostEvaluation>
  </stdyInfo>
  <method>
    <dataColl>
      <timeMeth/>
      <dataCollector abbr="ACT" affiliation="">ACT Research</dataCollector>
      <!-- collectorTraining - DDI2.5
        
        Collector Training

        Describes the training provided to data collectors including interviewer training, process testing, 
        compliance with standards etc. This is repeatable for language and to capture different aspects of the 
        training process. The type attribute allows specification of the type of training being described.
        
        -->
      <collectorTraining type=""/>
      <frequenc/>
      <sampProc>1. Sample frame
For the sample frame, the survey contractor developed a list of all farm operators cultivating crops in targeted value chains in the 80 study communities (treatment and control) and 8 A-list communities (high priority sites that were purposefully selected to receive training). This list included information about farm size and which of the targeted crops the farm operator cultivated. In three communities, the survey contractor did not identify any farmers cultivating targeted crops, so the final sample frame included 77 study communities and 8 A-list communities. Information on total farm size was used to draw separate samples for farms of different sizes. 

2. Drawing the sample
For small farms (less than 10 hectares), we drew a random sample of farm operators in targeted value chains in each community. To determine the number of farmers to select in each community and to select farmers, we implemented the following steps:

· We allocated the total small-farm sample across communities in proportion to their size (the number of small-farm operators in targeted value chains). For example, if one community had twice as many treatment small-farm operators as another, we allocated twice as many small-farm operators to that community. To ensure that very small communities were adequately represented and that very large communities did not drive the impact estimates, no community's sample could be below a minimum of 20 or above a maximum of 150 small farmers. Allocating the sample in this way ensured that the sample was balanced across communities but still close to self-weighting.

· We drew the sample in each community using implicit stratification by value chain: we sorted farmers in each community by value chain and selected the sample so that it was evenly spread across this ordered list. This ensured that the randomly selected sample provided proportional representation of the different value chains in each community.

For medium (between 10 and 100 hectares) and large (100 hectares or larger) farms, we determined that there were relatively few farms in the value chain training sample frame (174 medium farms and 77 large farms). We therefore attempted to interview all operators of these farms so that we would have precise estimates for these groups. 

3. Use of replacements
In some cases, the survey contractor was unable to conduct an interview with a selected farm operator. This occurred for various reasons, such as refusal to participate or ineligibility for the survey (if it was determined that the operator did not cultivate the targeted value chains). To account for this, we developed a list of replacement farmers in each community at the same time that we selected our initial sample. Because all medium and large farmers were selected for the sample, the replacement list included only small farmers. These procedures were designed to help ensure that we reached our target sample sizes for the analysis while maintaining the representativeness of the sample to the extent possible and keeping the replacement procedure reasonably straightforward.</sampProc>
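      <!-- Illustrative sketch, not part of the original study record.
      The small-farm sampling steps described above (proportional allocation with a floor of 20
      and a ceiling of 150, then implicit stratification by value chain) can be sketched as
      follows. This is a simplified illustration, not the procedure actually run for the study:
      it does not redistribute sample after clamping, it caps each allocation at the community's
      size (an assumption not stated in the record), and it implements the "evenly spread"
      selection as systematic sampling with a random start. All names, community sizes, and
      value chain labels are hypothetical.

```python
import random


def allocate_sample(community_sizes, total_sample, minimum=20, maximum=150):
    """Allocate a total sample across communities in proportion to size,
    clamped to [minimum, maximum] and never above the community's size."""
    total = sum(community_sizes.values())
    allocation = {}
    for name, size in community_sizes.items():
        target = round(total_sample * size / total)
        target = max(minimum, min(maximum, target))
        allocation[name] = min(target, size)  # assumption: cannot exceed the frame
    return allocation


def systematic_sample(farmers, n, rng):
    """Implicit stratification: sort farmers by value chain, then select n of them
    at evenly spaced positions from a random start."""
    if n == 0:
        return []
    if n > len(farmers):
        raise ValueError("Cannot select more farmers than are listed.")
    ordered = sorted(farmers, key=lambda f: (f["value_chain"], f["id"]))
    step = len(ordered) / n
    start = rng.random() * step
    last = len(ordered) - 1
    return [ordered[min(int(start + i * step), last)] for i in range(n)]


# Hypothetical community sizes (number of eligible small-farm operators).
allocation = allocate_sample({"A": 400, "B": 200, "C": 10}, total_sample=300)

# Hypothetical frame for one community: 6 fruit, 4 grape, 2 vegetable growers.
chains = ["fruit"] * 6 + ["grapes"] * 4 + ["vegetables"] * 2
frame = [{"id": i, "value_chain": c} for i, c in enumerate(chains)]
selected = systematic_sample(frame, n=6, rng=random.Random(0))
chain_counts = {c: sum(f["value_chain"] == c for f in selected) for c in set(chains)}
```

      Because the list is sorted by value chain before selection, each value chain contributes
      to the sample roughly in proportion to its share of the community's eligible farmers.
      -->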
      <sampleFrame>
        <sampleFrameName/>
        <custodian/>
        <universe/>
        <frameUnit isPrimary="">
          <unitType numberOfUnits=""/>
        </frameUnit>
        <updateProcedure/>
      </sampleFrame>
      <deviat>The analysis sample does not include all respondents to the survey. The analysis sample excludes farmers from one stratum that had five treatment communities and three control communities. This stratum was excluded because it contained virtually no control farmers. As a result, the analysis sample includes 902 farmers in 41 treatment communities, 563 farmers in 28 control communities, and 200 farmers in 8 A-list communities.</deviat>
      <collMode/>
      <resInstru>The evaluation will draw on three key sources of data. 

The first is longitudinal survey data from farm operators living near treatment and control sites that will enable us to track outcome changes over time. This survey, the Moldova Farm Operator Survey, included two questionnaires: one for small and medium farms (&lt; 100 ha) and a separate one for large farms (&gt;= 100 ha). The questionnaires were provided in Romanian (English translations are available). In some cases, the interview may have been conducted in Russian instead of Romanian. The questionnaires cover numerous domains, including household/farm characteristics, production, sales, farm income, use of agricultural practices, participation in agricultural training, and credit. In general, they focused on outcomes from the 2013 agricultural season.

For the impact analysis, these survey data will be linked to a second source, which is implementation data about GHS value chain training activities--such as locations, value chains and topics covered, and dates. The final source is qualitative data from focus groups and interviews.</resInstru>
      <!-- instrumentDevelopment - DDI2.5             
        Describe any development work on the data collection instrument. Type attribute allows for the optional use of a defined development type with or without use of a controlled vocabulary.
        -->
      <instrumentDevelopment type=""/>
      <collSitu/>
      <actMin/>
      <ConOps/>
      <weight>Our sampling strategy attempted to create a survey sample that was as close to self-weighting as possible. However, we still need to apply weights to ensure that our analysis sample is representative of farm operators in the targeted value chains in the treatment and control communities. We constructed weights to account for:

· Differences in sampling probabilities across farmers. We drew the sample of eligible small farmers using implicit stratification in each community. The sampling probability for small farmers in a given community was therefore determined by the fraction of small farmers sampled in that community. Because the community allocations were roughly proportional to the number of eligible farmers in each community (except for small deviations due to the minima and maxima we imposed), this sampling probability was similar for most small farmers. Nevertheless, we need to adjust for the small deviations in this probability. We surveyed all medium and large farmers; therefore, their sampling probability was one. The inverse of the sampling probability was used to obtain a farm-level sampling weight for each farmer.

· Possible differential nonresponse across different types of farmers. To adjust for possible systematic nonresponse among certain types of farmers, we computed response rates within cells that we defined by random assignment stratum, treatment status, and farm size (small, medium, or large). We used the inverse of the response rate to obtain a nonresponse weight for all farmers in a given cell. 

We then multiplied these weights to yield preliminary farm-level weights. In addition, to ensure that treatment status was not correlated with random assignment stratum, we reweighted the control farms in each stratum so that their (weighted) sum was equal to the (weighted) sum of treatment observations in that stratum. Finally, we normalized these adjusted weights so that their sum was equal to the number of observations for each farm size group (small, medium, and large).</weight>
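      <!-- Illustrative sketch, not part of the original study record.
      The weighting steps described above can be sketched as four operations: inverse sampling
      probability, inverse response rate by cell, rescaling of control weights within each
      stratum, and normalization within each farm size group. This is a minimal illustration
      under stated assumptions, not the evaluator's code. In particular, it assumes the control
      rescaling is done across all farm sizes within a stratum, which the record does not
      specify. Field names and data values are hypothetical.

```python
from collections import defaultdict


def build_weights(farms, response_rates):
    """farms: list of dicts with keys stratum, treatment (bool), size, sampling_prob.
    response_rates: dict keyed by (stratum, treatment, size)."""
    # Steps 1 and 2: sampling weight times nonresponse weight.
    weights = []
    for farm in farms:
        cell = (farm["stratum"], farm["treatment"], farm["size"])
        weights.append((1.0 / farm["sampling_prob"]) * (1.0 / response_rates[cell]))

    # Step 3: rescale control weights so that, within each stratum, their sum
    # equals the sum of the treatment weights.
    group_sum = defaultdict(float)
    for farm, w in zip(farms, weights):
        group_sum[(farm["stratum"], farm["treatment"])] += w
    for i, farm in enumerate(farms):
        if not farm["treatment"]:
            s = farm["stratum"]
            weights[i] *= group_sum[(s, True)] / group_sum[(s, False)]

    # Step 4: normalize so weights sum to the number of observations
    # within each farm size group.
    size_sum = defaultdict(float)
    size_n = defaultdict(int)
    for farm, w in zip(farms, weights):
        size_sum[farm["size"]] += w
        size_n[farm["size"]] += 1
    return [w * size_n[f["size"]] / size_sum[f["size"]] for f, w in zip(farms, weights)]


# Made-up example: one stratum, four small farms (not study data).
farms = [
    {"stratum": 1, "treatment": True, "size": "small", "sampling_prob": 0.5},
    {"stratum": 1, "treatment": True, "size": "small", "sampling_prob": 0.5},
    {"stratum": 1, "treatment": False, "size": "small", "sampling_prob": 0.25},
    {"stratum": 1, "treatment": False, "size": "small", "sampling_prob": 0.5},
]
rates = {(1, True, "small"): 0.8, (1, False, "small"): 1.0}
final_weights = build_weights(farms, rates)
```

      With several farm size groups, the size-group normalization in step 4 can slightly disturb
      the stratum balance created in step 3; how the study handled that interaction is not
      stated in the record.
      -->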
      <cleanOps/>
    </dataColl>
    <notes/>
    <anlyInfo>
      <respRate>The overall response rate was 83 percent in treatment and control communities.</respRate>
      <EstSmpErr>Baseline differences between the treatment and control groups were estimated in a regression framework. This regression model enabled us to account for the features of the evaluation design, specifically the stratified random assignment. In addition, because the unit of random assignment is the community, to obtain the correct standard error for the baseline differences we had to account for the fact that outcomes in the same communities are likely correlated. The regression model enabled us to account for this using the “cluster” correction in Stata, with the community as the level of clustering.</EstSmpErr>
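      <!-- Illustrative sketch, not part of the original study record.
      The clustering adjustment described above can be sketched for the simplest case: a
      regression of an outcome on a treatment indicator, with a cluster-robust standard error
      computed at the community level. This is not the study's Stata code. It omits the
      random assignment strata and the weights, and the optional finite-sample factor
      G/(G-1) * (N-1)/(N-K) is included on the assumption that it matches the adjustment
      Stata applies with clustered errors. Function and variable names and the data are
      hypothetical.

```python
from collections import defaultdict
from math import sqrt


def clustered_difference(outcome, treated, community, small_sample=True):
    """Return (treatment/control difference, cluster-robust standard error)
    from a regression of outcome on an intercept and a treatment indicator."""
    n = len(outcome)
    t_bar = sum(treated) / n
    y_bar = sum(outcome) / n
    t_dev = [t - t_bar for t in treated]
    sxx = sum(d * d for d in t_dev)

    slope = sum(d * y for d, y in zip(t_dev, outcome)) / sxx
    intercept = y_bar - slope * t_bar
    residuals = [y - intercept - slope * t for y, t in zip(outcome, treated)]

    # Sum the score (demeaned treatment times residual) within each community,
    # so correlated errors inside a community are not treated as independent.
    scores = defaultdict(float)
    for d, e, g in zip(t_dev, residuals, community):
        scores[g] += d * e
    variance = sum(s * s for s in scores.values()) / (sxx * sxx)

    if small_sample:
        g_count = len(scores)
        k = 2  # intercept and treatment indicator
        variance *= (g_count / (g_count - 1)) * ((n - 1) / (n - k))
    return slope, sqrt(variance)


# Made-up data: four communities with two farmers each (not study data).
y = [7, 7, 5, 5, 5, 5, 3, 3]
t = [1, 1, 1, 1, 0, 0, 0, 0]
c = ["A", "A", "B", "B", "C", "C", "D", "D"]
diff, se_plain = clustered_difference(y, t, c, small_sample=False)
_, se_adjusted = clustered_difference(y, t, c, small_sample=True)
```

      In this made-up example outcomes are identical within each community, so the clustered
      standard error is larger than one that treats the eight farmers as independent.
      -->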
      <dataAppr/>
    </anlyInfo>
    <stdyClas/>
    <dataProcessing type=""/>
    <codingInstructions relatedProcesses="" type="">
      <txt/>
      <command formalLanguage=""/>
    </codingInstructions>
  </method>
  <dataAccs>
    <setAvail>
      <accsPlac URI="http://data.mcc.gov/evaluations/index.php/catalog/121">Millennium Challenge Corporation</accsPlac>
      <origArch>Millennium Challenge Corporation
http://data.mcc.gov/evaluations/index.php/catalog/121
Cost: None</origArch>
      <avlStatus/>
      <collSize/>
      <complete/>
      <fileQnty/>
      <notes/>
    </setAvail>
    <useStmt>
      <confDec required="no" formNo="" URI=""/>
      <restrctn/>
      <citReq/>
      <deposReq/>
      <conditions/>
      <disclaimer/>
    </useStmt>
    <notes/>
  </dataAccs>
  <notes/>
</stdyDscr>
<dataDscr>
</dataDscr></codeBook>
