## Abstract

This paper estimates neighborhood effects on adult labor market outcomes using the Moving to Opportunity (MTO) housing mobility experiment. We propose and implement a new strategy for identifying transition-specific effects that exploits identification of the unobserved component of a neighborhood choice model. Estimated local average treatment effects (LATEs) are large, result from moves between the first and second deciles of the national distribution of neighborhood quality, and pertain to a subpopulation of nine percent of program participants.

## I. Introduction

Moving to Opportunity (MTO) was a once-in-a-generation housing mobility experiment designed to identify neighborhood effects (Shroder & Orr, 2012). Through the program, households living in high-poverty neighborhoods were randomly assigned housing vouchers for use in low-poverty neighborhoods. Researchers found that receiving an MTO voucher had no effect on labor market or education outcomes four to seven years after assignment (Kling, Liebman, & Katz, 2007a; Sanbonmatsu et al., 2006).

The lack of effects from receiving an MTO voucher has been interpreted as evidence against the existence of neighborhood effects (Ludwig et al., 2008, 2013). One of the strongest assumptions required for these program effects to be informative about effects from neighborhood environments is that neighborhood poverty measures neighborhood quality (Aliprantis, 2017a). This assumption did not hold in MTO because voucher recipients tended to move to predominantly black low-poverty neighborhoods (Clampet-Lundquist & Massey, 2008; Sampson, 2008), and black low-poverty neighborhoods are comparable to white high-poverty neighborhoods along dimensions other than poverty (Aliprantis & Kolliner, 2015).

This paper identifies neighborhood effects with the MTO data. We propose a methodological innovation through which transition-specific effects in an ordered treatment model can be identified with a single discrete instrument. We also use a measure of neighborhood quality that includes characteristics other than poverty.

We find that increasing neighborhood quality improves adult outcomes. Because our standard errors are large, we think focusing on any single point estimate is less useful than looking at the broad picture painted by our estimates. All of the local average treatment effects (LATEs) we estimate conform to the theory that living in higher-quality neighborhoods improves adult labor market and health outcomes while decreasing receipt of welfare benefits.

Identifying LATEs in a model with multiple treatment levels is difficult (Athey & Imbens, 2017). Two prominent approaches either identify weighted averages of transition-specific effects (Angrist & Imbens, 1995) or require transition-specific instruments for each level of treatment (Heckman, Urzúa, & Vytlacil, 2006). Our identification strategy combines the strengths of each of these approaches in that it is both empirically feasible and economically interesting. The key insight of the identification strategy is that by observing the continuous level of treatment associated with each individual, it is possible to identify the unobserved, idiosyncratic component of their latent index in the treatment choice model. Assuming that this unobserved component is normally distributed generates an ordered probit model, which allows us to use well-known statistical methods in a novel approach to identification. Identification is obtained by comparing individuals who are similar along the observed and unobserved dimensions of the ordered choice model and who therefore select into different treatment levels just because of the randomly assigned instrument.

We think there are two key reasons why, in contrast to previous studies, we find evidence of neighborhood effects in the interim MTO data. The first is that previous studies have estimated effects for broader populations that on average did not experience large changes in neighborhood quality. For example, estimates of neighborhood effects on adult labor market outcomes are absent from the most prominent analysis of MTO because the program was found to have little effect on such outcomes for larger subpopulations than we study (Kling et al., 2007a). To find movers from the first to the second decile of neighborhood quality, we focus on a subpopulation so restrictive that our LATE estimates pertain to 9% of program participants.

The second reason is that while poverty may be the best single variable for characterizing neighborhood quality, other characteristics may be important drivers of outcomes. Consider that quasi-experimental studies show that neighborhood effects on employment could result from neighborhood referrals (Bayer, Ross, & Topa, 2008) or proximity to jobs (Andersson et al., 2018). These mechanisms are likely to be captured by our neighborhood quality index because the index includes unemployment and the employment-to-population ratio. Important mechanisms may be missed when considering poverty alone if low-poverty neighborhoods in MTO were negatively selected along other dimensions (Davis et al., 2020).

We do not conclude that MTO was a failed policy because few households were living in better neighborhoods four to seven years after randomization. While this limited mobility does restrict the evidence provided by MTO about neighborhood effects (Wilson, 1987), we think this mobility provides clear evidence for policy. Providing access to opportunity neighborhoods will require trying more interventions, whether that means making subsidies tied to smaller geographic areas (Collinson & Ganong, 2018), supporting landlords with capital-needs financing and program streamlining (Galvez, Simington, & Treskon, 2010), or experimenting with other models of assistance to make voucher tenants more attractive to landlords (Phillips, 2017) and to increase supply (Geyer & Sieg, 2013).

The paper proceeds as follows. Section II specifies our joint model of neighborhood choice and potential outcomes, as well as our strategy for identifying treatment effect parameters from this model. Section III describes the MTO housing mobility program, the data used in estimation, and descriptive statistics of those data. Section IV presents our empirical specification and estimation results, and section V concludes. The appendix presents a formal justification for our identification strategy, the algorithm we use to estimate the model, and the full specification of the estimated model.

## II. Model

We present a new approach to estimating LATEs where a multilevel treatment is modeled through an ordered choice model with a restricted class of random thresholds and a binary instrument. Along with a parametric specification of the ordered choice model, this threshold restriction allows us to estimate an unobserved component of choice that then determines the specific transitions across treatment for which LATEs can be recovered.

We are interested in estimating transition-specific LATEs for an outcome of interest $Y$ in response to an ordered treatment $D$, with instrument $Z$ influencing the level of treatment selection. Our application is the effect of neighborhood quality level on individual economic outcomes, using the random assignment of a housing mobility voucher as an instrument.

Let $\Omega$ be the sample space of events representing all individual types $i$ in the experiment, with each individual possibly defining its own type.

The following random variables on $Ω$ are observed:

• $X(i):\Omega\to\mathbb{R}^n$ is a vector of characteristics observed by the researcher.

• $Z(i):\Omega\to\{0,1\}$ is an instrument.

• $W(i):\Omega\to\{0,1\}$ is take-up of the instrument.

• $D(i):\Omega\to\{1,\dots,K\}$ is the treatment.

• $Y(i):\Omega\to\mathbb{R}$ is the outcome of interest.

The following random variables are unobserved:

• $V(i):\Omega\to\mathbb{R}$ represents an unobserved component of neighborhood choice.

• $V_M(i):\Omega\to\mathbb{R}$ represents an unobserved component of instrument take-up.

We are interested in learning about the following counterfactual random variables:

• $W_Z(i):\Omega\to\{0,1\}$ are the potential outcomes for take-up.

• $D_Z(i):\Omega\to\{1,\dots,K\}$ are the potential outcomes for treatment.

• $Y_k(i):\Omega\to\mathbb{R}$ is the potential outcome for individual $i$ associated with treatment level $k$.

We will use lowercase letters and subindexes to denote realized values of these variables (e.g., $X(i)$ is realized as $x_i$).

Observed outcomes are related to potential outcomes as
$W(i)=W_1(i)Z(i)+W_0(i)(1-Z(i)),\quad D(i)=D_1(i)Z(i)+D_0(i)(1-Z(i)),\quad Y(i)=\sum_{k=1}^{K}Y_k(i)\,\mathbb{1}\{D(i)=k\}.$
Treatment selection among the $K$ possible treatment levels, $k=1,\dots,K$, is determined through an ordered choice model with random thresholds defined by
$D_Z(i)=k\iff C_{k-1}-W_Z(i)\gamma_{k-1}\le\mu(X(i))-V(i)<C_k-W_Z(i)\gamma_k,$
(1)
where $\{C_k\}_{k=1}^{K}$ are positive and monotonically increasing in $k$, $\{\gamma_k\}_{k=1}^{K}$ are nonnegative, and $\mu(X(i))$ is a linear function of $X(i)$. We assume that the take-up of a voucher is determined by the latent index model
$W_Z(i)=\mathbb{1}\{\mu_M(X(i),Z(i))-V_M(i)\ge 0\}.$

The individual-specific nature of the thresholds $C_k-W_Z(i)\gamma_k$ reflects that the instrument's cost reduction at each level $k$ is allowed to be heterogeneous across individuals. The level-specific nature of the $W_Z(i)\gamma_k$ is an important way that our model is different from the ordered choice models in Vytlacil (2006b), where a function of the instrument shifts each threshold by the same amount. In our model, setting the instrument $Z$ from 0 to 1 can influence each threshold differently. It is not only possible but probable that $\gamma_1\ne\gamma_2\ne\cdots\ne\gamma_K$. The differential impact of the instrument across individuals at each threshold $k$ depends on the values of $\gamma_k$.

For $Z$ to satisfy standard instrument conditions, we require that $Z$ be randomly assigned:
$Z(i)\perp(X(i),V(i),V_M(i),Y_1(i),\dots,Y_K(i)).$
(2)
We note that the nonnegativity of the $\{\gamma_k\}$ implies monotonicity, but need not imply that the instrument is relevant (Imbens & Angrist, 1994, condition 1ii) without further assuming that (a) $\gamma_k^H>\gamma_k^L$ for some $k$ and (b) $\Pr(W_1(i)-W_0(i)=1)>0$.

### A. Identification: Ordered Choice Model

We are able to estimate the parameters of a random thresholds ordered choice model under three key restrictions:

• OC1: Each individual-specific threshold has only a binary support, with $W_Z(i)\gamma_k\in\{0,\gamma_k\}$.

• OC2: Individuals cannot take up a voucher without it first being assigned, $W_0(i)=0$ for all $i$.

• OC3: $V(i)\sim N(0,1)$.

Invoking a binary support assumption like OC1, a related latent class model is estimated in Greene et al. (2014) and discussed at length in section 8.2 of Greene and Hensher (2010). Our version of OC1 is different from these latent class models because we use data on instrument take-up to infer class membership.

While there is little reason to doubt OC2 in the case of experimental MTO vouchers, it is possible that control group households sought out Section 8 vouchers through other programs. We do not observe this information, so we make assumption OC2, which we operationalize by assuming that in the selection equation $W_Z(i)=\mathbb{1}\{\mu_M(X(i),Z(i))-V_M(i)\ge 0\}$, we have
$\mu_M(X(i),Z(i))=\mu_M(X(i))-1{,}000{,}000\times(1-Z(i)).$

OC3 is the only parametric assumption on unobservables necessary to identify LATEs, as we show in appendix C; parametric assumptions about the joint distribution of $(V,V_M)$ are made only to learn about the relationship between unobservables in the neighborhood selection and take-up models. Under OC1 to OC3, equation (1) becomes an ordered probit model with fixed thresholds whose parameters can be estimated via maximum likelihood.
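To make the ordered probit structure concrete, the following is a minimal sketch of the level probabilities and log-likelihood implied by equation (1) under OC1 to OC3. It is an illustration only, not the likelihood of appendix C: the function names, the unweighted sum, and the convention that the lowest and highest levels have thresholds at minus and plus infinity are assumptions of the sketch.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF, as implied by OC3


def level_probability(k, mu, w, cuts, gammas):
    """P(D = k | mu, w) for k in 1..K, where K = len(cuts) + 1.

    cuts[j-1] and gammas[j-1] hold C_j and gamma_j for the interior
    thresholds j = 1..K-1. Treating the outermost thresholds as
    -inf and +inf is an assumption of this sketch.
    """
    K = len(cuts) + 1
    # D >= k requires V <= mu - C_{k-1} + w * gamma_{k-1}
    upper = 1.0 if k == 1 else PHI(mu - cuts[k - 2] + w * gammas[k - 2])
    # D > k requires V <= mu - C_k + w * gamma_k
    lower = 0.0 if k == K else PHI(mu - cuts[k - 1] + w * gammas[k - 1])
    return upper - lower


def log_likelihood(data, beta, cuts, gammas):
    """Unweighted sum of log P(D_i = d_i | x_i, w_i) over (x, w, d) records."""
    total = 0.0
    for x, w, d in data:
        mu = sum(b * xj for b, xj in zip(beta, x))  # linear index mu(x)
        prob = level_probability(d, mu, w, cuts, gammas)
        total += math.log(max(prob, 1e-300))  # guard against log(0)
    return total
```

Because take-up enters only through the shift $w\gamma_k$, raising $w$ from 0 to 1 moves probability mass toward higher levels whenever the $\gamma_k$ are nonnegative, which is the monotonicity property discussed above. A numerical optimizer would maximize `log_likelihood` over `beta`, `cuts`, and `gammas`.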

Assumptions OC1 to OC3 are consistent with the way in which the MTO experiment was implemented. We can think of the counterfactual neighborhood choice facing experiment participants as a two-step process. The first step determines whether an offered voucher is taken up. In the second step, the individual chooses the level of quality of her neighborhood, which is measured four to seven years after voucher assignment.

In the first step, we interpret willingness to participate in the MTO program as the choice to use a voucher if feasible. If an individual is offered a voucher but does not take it up, implying $W_1(i)=0$, we interpret this outcome as an indication that it was not feasible for the household to improve its housing using a Section 8 voucher under the program rules for reasons unknown to the household at the time it volunteered for the program.

Identifying transition-specific LATEs by allowing thresholds to be a function of instrument assignment is attractive relative to the alternatives, even if we must restrict the heterogeneity in response to the instrument to be binary. While a single instrument can more easily identify the average causal response (ACR), this weighted average of transition-specific LATEs (Angrist & Imbens, 1995) would be difficult to interpret in our application, where transitions are rare. And while transition-specific instruments might impose weaker assumptions for identifying transition-specific LATEs (Heckman et al., 2006), in many applications, only a single instrument will be available. One alternative possibility that could be of interest in future work is to estimate average partial effects (Masten & Torgovitsky, 2016) over restricted ranges of neighborhood quality.

The key step in our identification strategy is identifying the unobserved component of neighborhood choice $v_i$. This identification is achieved by interpolating between the discrete ordered choice model estimates to obtain estimates for a continuous choice model, which can then be used to infer $v_i$ over its continuous support ($\mathbb{R}$). A necessary condition on the data for this strategy to be feasible is that we observe a continuous measure of treatment, even when the ordered choice model assumes a discretized version of treatment. In our application, we do indeed observe a continuous measure of the treatment variable: neighborhood quality. This allows us to set up a continuous version of the treatment choice model in equation (1), in which $q$ denotes the observed continuous level of neighborhood quality. In this case, the optimal quality level of neighborhood $q^*$ satisfies the following first-order condition (FOC):
$\mu(x_i)-v_i-C(q_i^*)+w_{zi}\gamma(q_i^*)=0.$
(3)
By interpolating between estimates of $\{\hat{C}_k\}$ and $\{\hat{\gamma}_k\}$ from the estimated model in equation (1), we obtain estimates of the continuous functions $\hat{C}(q)$ and $\hat{\gamma}(q)$ that can be substituted into the FOC:
$\hat{\mu}(x_i)-v_i-\hat{C}(q_i^*)+w_{zi}\hat{\gamma}(q_i^*)=0,$
(4)
from which we derive $\hat{v}_i=\hat{\mu}(x_i)-\hat{C}(q_i)+w_{zi}\hat{\gamma}(q_i)$.
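The recovery of the unobserved component from the estimated FOC can be sketched as follows. This is a hedged illustration rather than the algorithm of appendix B: piecewise-linear interpolation, the choice of knots (quality values at which the discrete thresholds are assumed to sit), holding the functions flat outside the knot range, and all function names are assumptions of the sketch.

```python
from bisect import bisect_right


def interpolate(q, knots, values):
    """Piecewise-linear interpolation, held flat outside the knot range."""
    if q <= knots[0]:
        return values[0]
    if q >= knots[-1]:
        return values[-1]
    j = bisect_right(knots, q)  # knots[j-1] <= q < knots[j]
    share = (q - knots[j - 1]) / (knots[j] - knots[j - 1])
    return values[j - 1] + share * (values[j] - values[j - 1])


def recover_v(mu_hat, q, w, knots, c_hat, gamma_hat):
    """v_hat = mu_hat - C_hat(q) + w * gamma_hat(q), rearranged from the FOC."""
    c_q = interpolate(q, knots, c_hat)          # interpolated cost C_hat(q)
    gamma_q = interpolate(q, knots, gamma_hat)  # interpolated shift gamma_hat(q)
    return mu_hat - c_q + w * gamma_q
```

For a household with estimated index `mu_hat`, observed quality `q`, and take-up `w`, `recover_v` returns the value of the unobservable that rationalizes the observed choice. Two households with the same index and quality but different take-up are assigned different values, because the voucher lowers the threshold that the user faced.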

### B. Identification: LATEs

Having estimated this individual unobserved component of choice for all participants, we are able to identify transition-specific LATEs by estimating average outcome differences among those with similar $v_i$'s and $x_i$'s but different instrument assignments. As Heckman et al. (2006) noted, the LATE is remarkable for the patterns of selection under which the parameter is interpretable and potentially identified. In our application, this includes sorting on unobservables and matching on neighborhood amenities (Graham, 2018).

The causal effects identified with any data set are determined by selection into treatment and response to the instrument. In our case, we will only be able to identify $j$ to $j+1$ transition-specific LATEs (for $j\in\{1,\dots,J-1\}$) that pertain to a specific region of $x_i$'s and $v_i$'s where $Z$ induces changes in treatment.

In the empirical analysis, it will be convenient to refer interchangeably to parameters and sets defined in terms of $V$ and $U_D$, where $U_D(i)\equiv F_V(V(i))$ and $F_V$ is the cumulative distribution function of $V$. We are interested in the regions of observed and unobserved characteristics, $(\mu(X(i)),U_D(i))$, corresponding to households that would select into treatment level $j$ in the absence of a voucher, but could possibly select into treatment level $j+1$ if given an MTO voucher. To focus on this group, we define
$\Omega_j\equiv\{i\in\Omega\mid D_0(i)=j,\ D_1(i)\in\{j,j+1\},\ \Pr(D_1(i)=j+1)>0\}$
and the identification support set $S_j$ as
$S_j\equiv\{(\mu(X(i)),U_D(i))\mid i\in\Omega_j\}.$
We stress that membership in the identification support set $S_j$ can be determined only after identifying each household's $v_i$ as established in the previous section.

Determining $S_j$ gives us the ability to identify transition-specific LATEs with a single discrete instrument. In appendix A, we prove the following proposition.

Proposition 1.
The Wald estimator applied to the subsample of experimental and control households in $S_j$ identifies the $j$ to $j+1$ transition-specific LATE:
$\Delta^{\mathrm{LATE}}_{j,j+1}(Z^M)\equiv E[Y_{j+1}-Y_j\mid i\in C_j]=\frac{E[Y\mid Z^M=1,(\mu(X),U_D)\in S_j^M]-E[Y\mid Z^M=0,(\mu(X),U_D)\in S_j^M]}{E[D\mid Z^M=1,(\mu(X),U_D)\in S_j^M]-E[D\mid Z^M=0,(\mu(X),U_D)\in S_j^M]},$

where $C_j$ is the subset of $S_j$ for which $D_1(i)=j+1$.
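As an illustration of the estimator in Proposition 1, the sketch below computes a weighted Wald ratio on the subsample that falls inside a support set. The record layout, the key names, and the `in_support` predicate standing in for membership in $S_j$ are all hypothetical; how the support set is delimited in practice is not specified here.

```python
def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def wald_late(records, in_support):
    """Wald ratio on records whose (mu, u_d) lies in the support set.

    Each record is a dict with hypothetical keys: z (instrument), d
    (treatment level), y (outcome), weight, mu, and u_d.
    """
    kept = [r for r in records if in_support(r["mu"], r["u_d"])]
    means = {}
    for z in (0, 1):
        arm = [r for r in kept if r["z"] == z]
        if not arm:
            raise ValueError("no observations with z = %d in the support set" % z)
        weights = [r["weight"] for r in arm]
        means[z] = (weighted_mean([r["y"] for r in arm], weights),
                    weighted_mean([r["d"] for r in arm], weights))
    # Denominator: the instrument's effect on the treatment level
    first_stage = means[1][1] - means[0][1]
    if first_stage == 0:
        raise ValueError("the instrument does not shift treatment in this subsample")
    # Numerator: the instrument's effect on the outcome
    return (means[1][0] - means[0][0]) / first_stage
```

Restricting to the support set is what turns the usual weighted average of transition effects into a single transition-specific effect: within $S_j$, the only treatment change the instrument can induce is from $j$ to $j+1$, so the denominator equals the share of compliers.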

## III. Moving to Opportunity

### A. Program Description

Moving to Opportunity (MTO) was inspired by the promising results of the Gautreaux housing mobility program (Polikoff, 2006). The initial relocation process of the Gautreaux program created a quasi-experiment, and its results indicated housing mobility could be an effective policy. Relative to city movers, suburban movers from Gautreaux were more likely to be employed (Mendenhall, DeLuca, & Duncan, 2006), and the children of suburban movers attended better schools; were more likely to complete high school, attend college, and be employed; and had higher wages than city movers (Rosenbaum, 1995).

MTO was designed to replicate these beneficial effects, offering housing vouchers to eligible households between September 1994 and July 1998 in Baltimore, Boston, Chicago, Los Angeles, and New York (Goering, 2003). Households were eligible to participate in MTO if they were low income, had at least one child under age 18, were residing in either public housing or Section 8 project–based housing located in a census tract with a poverty rate of at least 40%, were current in their rent payment, and all household members were on the current lease and were without criminal records (Orr et al., 2003).

Households were drawn from the MTO waiting list through a random lottery. After being drawn, households were randomly allocated into one of three treatment groups. Households in the experimental group were offered Section 8 housing vouchers but were restricted to using them in census tracts with 1990 poverty rates of less than 10%. Households in this group were also provided with counseling and education through a local nonprofit. After one year had passed, households in the experimental group were unrestricted in where they used their Section 8 vouchers. Households in the Section-8-only comparison group were provided with no counseling and were offered Section 8 housing vouchers without any restriction on their place of use. And households in the control group continued receiving project-based assistance.

### B. Data

The first source of data we use in our analysis is the MTO Interim Evaluation. The evaluation contains variables listing the census tracts in which households lived at both the baseline and in 2002, the time the interim evaluation was conducted. Neighborhood characteristics are measured at the tract level after merging the MTO sample with decennial census data from the National Historical Geographic Information System (NHGIS; Minnesota Population Center, 2004).

#### Variables.

We create an index of neighborhood quality using a combination of several neighborhood characteristics. Neighborhood characteristics measured by NHGIS variables are first transformed into percentiles of the national distribution from the 2000 Census. Principal components analysis is then used to determine which single vector accounts for the most variation in the national distribution of the poverty rate, the percent with a high school diploma, the percent with a BA or higher, the percent of single-headed households, the male employment-to-population ratio (EPR), and the female unemployment rate.
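The two steps of the index construction, percentile transformation followed by extraction of the first principal component, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses power iteration on the covariance matrix instead of a full eigendecomposition, it treats ties and percentile conventions in one particular way, and the function names and the `bad_column` orientation argument are hypothetical. Each inner list is meant to hold one characteristic across national census tracts.

```python
import math
from bisect import bisect_right


def percentile_ranks(values):
    """Percentile rank (0 to 100) of each value within its own list."""
    ordered = sorted(values)
    return [100.0 * bisect_right(ordered, v) / len(values) for v in values]


def first_component(columns, iterations=200):
    """Loadings of the leading principal component, by power iteration."""
    n, p = len(columns[0]), len(columns)
    means = [sum(col) / n for col in columns]
    centered = [[v - m for v in col] for col, m in zip(columns, means)]
    cov = [[sum(a * b for a, b in zip(centered[r], centered[c])) / (n - 1)
            for c in range(p)] for r in range(p)]
    # Uneven start vector, so it is unlikely to be orthogonal to the answer
    vec = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        nxt = [sum(cov[r][c] * vec[c] for c in range(p)) for r in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec, centered


def quality_index(columns, bad_column=0):
    """Score each tract on the first component of percentile-ranked inputs.

    The sign of a principal component is arbitrary; here it is oriented so
    that the column at bad_column (for example, the poverty rate) loads
    negatively, making higher scores mean higher quality.
    """
    ranks = [percentile_ranks(col) for col in columns]
    loadings, centered = first_component(ranks)
    if loadings[bad_column] > 0:
        loadings = [-x for x in loadings]
    n = len(columns[0])
    scores = [sum(l * col[i] for l, col in zip(loadings, centered))
              for i in range(n)]
    return loadings, scores
```

Applying `percentile_ranks` to the resulting scores would express the index itself in percentiles of the distribution it was built from, which is one way to arrive at statements such as "below the 10th percentile of quality"; whether this matches the exact construction used here is an assumption.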

With respect to interpretation, we think of effects from this index of neighborhood quality as resulting from the neighborhood-level determinants of social interactions, access to resources, and exposure to institutions. In terms of the categories from Manski (1993) and Graham (2018), such effects include correlated effects from common exposure to neighborhood-level factors influencing the labor market and health outcomes of interest. These factors span anything from police interactions and strategies to, for example, schools, safety, lead paint, pollution, access to fresh food, and transportation. We note that our neighborhood effects will not include all correlated effects due to racial discrimination, such as the stress from exposure to racism in labor markets, the criminal justice system, the health care system, and interpersonal encounters. Another issue is that we cannot distinguish between endogenous effects like those from peers engaging in violence versus correlated effects due to exposure to violence resulting from weak state institutions (Aliprantis, 2017b).

With respect to identification, we note that in our model, each level of neighborhood quality represents homogeneous types of social interactions and resources available at the neighborhood level. We assume agents experience neighborhood quality but cannot influence it because we empirically define neighborhoods as census tracts, which contain about 4,000 residents on average. If we were thinking about smaller reference groups, such as social groups in classrooms, we would be more interested in incorporating the endogenous formation of reference groups into the model and directly modeling interference (Brock & Durlauf, 2007; Manski, 2013). Since our treatment is determined by large reference groups outside of our MTO sample, we believe our partial equilibrium abstraction is appropriate.

Baseline characteristics of the MTO households used in the model include baseline neighborhood quality, whether the respondent had family living in his or her neighborhood of residence, whether a member of the household was a victim of a crime in the previous six months, and whether there were teenage children in the household. Site of residence is the only other observed characteristic included in $X$; when models were estimated with additional variables such as the number of children or residence in an early HOPE VI building, the coefficients on the other observables were all statistically insignificant.

Outcome variables for adults from the MTO Interim Evaluation include the labor market status of the adult at the time of the interim survey (two binary variables, one indicating labor force participation and the other indicating whether the adult was employed), the self-reported total household income (all sources summed), the individual earnings in 2001 of the sample adult, receipt of Temporary Assistance for Needy Families (TANF) benefits, and the respondent's BMI. Weights are used in constructing all estimates.

#### Sample and descriptive statistics.

The focus of this analysis is adults in the MTO Interim Evaluation sample satisfying two conditions. The first restriction ensures that we are focusing on a relatively homogeneous population. To satisfy this restriction, we drop all households living at baseline in a neighborhood above the 10th percentile of the national distribution of quality. These are exceptional households in MTO: the median baseline neighborhood quality for MTO participants was below the 1st percentile of the national distribution (see figure 1a). For Chicago, Los Angeles, and New York City, nearly all participants lived at baseline in neighborhoods below the 10th percentile of the national distribution. In Baltimore and Boston, however, at baseline, a nontrivial share of program participants lived in higher-quality neighborhoods, driven mainly by the male EPR and the share of adults holding a BA in their neighborhoods. These individuals represent a little under 15% of the interim evaluation sample and are dropped from our estimation sample.
Figure 1.

Neighborhood Quality in MTO

This figure shows the neighborhood quality distribution of MTO adults in various subsamples. A color version is available online.

The second sample restriction facilitates the estimation of the ordered choice model. To satisfy this restriction, we top-code neighborhood quality at the median of the national distribution of quality in 2000. Figure 1b shows the final results of these restrictions.

The final estimation sample used in our analysis has approximately 3,100 adults (a little over 85% of the interim sample and a little under 75% of the original adult sample). Our sample represents “the other 1%.” At baseline, 67% of the estimation sample lived in a neighborhood whose quality was below the 1st percentile of the national distribution of neighborhood quality. There was enough mobility in the control group so that by 2002, only 39% were living in 1st percentile neighborhoods. On the other hand, though, the mobility of the control group was not extraordinary. By 2002, about 80% of the sample in the control group lived in a neighborhood whose quality was less than the 10th percentile of the national distribution of neighborhood quality.

## IV. Empirical Specification and Estimation Results

### A. Ordered Choice Model Specification

When estimating the parameters of the ordered choice model, we define the discrete treatment levels by dividing the estimation sample into its deciles at the time of the interim survey. So $(\underline{q}_k,\overline{q}_k]$ are discrete levels of treatment with values for $\underline{q}_k$ and $\overline{q}_k$ in $\{0,0.2,0.6,1.1,1.9,3.6,6.1,11,19,33\}$, with $k=1$ corresponding to $(0,0.2]$, and $K=10$ corresponding to $(33,50]$.
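The mapping from continuous quality to the ten discrete levels can be written compactly. The sketch below uses the bin edges listed above and the top-coding at the national median described in the sample restrictions; the function name and the decision to reject nonpositive values are assumptions of the sketch.

```python
from bisect import bisect_left

# Lower edges of the ten half-open bins (lower, upper], in national percentiles
LOWER_EDGES = [0, 0.2, 0.6, 1.1, 1.9, 3.6, 6.1, 11, 19, 33]
TOP_CODE = 50  # quality is top-coded at the national median


def treatment_level(q):
    """Return k in 1..10 such that the top-coded q lies in (lower_k, upper_k]."""
    if q <= 0:
        raise ValueError("quality must be positive to fall in a bin")
    q = min(q, TOP_CODE)
    # bisect_left counts the edges strictly below q, which is the bin number
    return bisect_left(LOWER_EDGES, q)
```

Using `bisect_left` rather than `bisect_right` is what makes the bins closed on the right: a value exactly equal to an edge, such as 0.2, is assigned to the lower bin.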

The marginal benefit of choosing to move from treatment level $k$ to $k+1$ in the ordered choice model is specified to be
$MB_k(i)=\mu(X(i))+W_Z(i)\gamma_k-C_k-V(i)$
with components
$\mu(X(i))=\beta_1X_1(i)+\cdots+\beta_8X_8(i);\quad\gamma_k=\Gamma_0+\Gamma_k;\quad C_k=\delta_0+\delta_k.$

We interpret the marginal benefit function as resulting from a combination of multiple outcomes and housing costs (see appendix D for a full description). We can interpret $V(i)$ as the unobserved cost for household $i$ of moving up in the absence of a voucher program and $W_z(i)$ as the voucher take-up outcome observed under $Z(i)=z$. We assume for the sake of identification that $V(i)\sim$ i.i.d. $N(0,1)$. Assuming that $V(i)$ follows a normal distribution does impose a parametric assumption. But in compensation, we can interpret neighborhood choice in terms of an ordered probit model. Moreover, this functional form assumption might be best viewed as a normalization since the specification of $C(q)$ is flexible. Our identification approach joins the recent literature offering novel solutions to identification problems when only discrete instruments are available (Brinch, Mogstad, & Wiswall, 2017; Kline & Walters, 2019; Pinto, 2018). The full empirical specification and likelihood function are specified in appendix C.

The first four variables in $X(i)$ are baseline neighborhood quality, whether the respondent had family living in the neighborhood of residence, whether a member of the household was a victim of a crime in the previous six months, and whether there were teenage children in the household at baseline. The final four variables in $X(i)$ are site indicators.

$\Gamma_0$'s are site-specific fixed effects that capture differences in factors such as local labor and housing markets across sites. We do not attempt to explicitly model housing market prices since these are largely offset by the nature of the payment structure of the rental vouchers and project-based programs. Individuals pay 30% of their income toward rent in project-based units and pay this same rent when using vouchers as long as the price of rent is not above the fair market rent (FMR). Thus, the ability to lease up with a Section 8 voucher is more salient than price.

Like Galiani, Murphy, and Pantano (2015), we interpret $Z$ as the random assignment of a potential reduction in the cost of accessing a higher-quality neighborhood relative to staying in the baseline neighborhood. It is important to note that secular trends outside the control of the program might swamp this cost reduction. One can imagine changing costs of accessing higher-quality neighborhoods due to changes in the local labor or housing markets, changes in school quality due to the provision of magnet schools, or simply an improvement in the quality of the baseline neighborhood.

We estimate the parameters of this ordered choice model via maximum likelihood using the log-likelihood function in appendix C. We then interpolate to identify the continuous functions and estimate the $V(i)$, as described in appendix B.

#### Potential outcomes specification.

When estimating treatment effect parameters, we define the discrete treatment levels in the intervals $(\underline{q}_j,\overline{q}_j]$ in terms of deciles of the national distribution so that $\underline{q}_j\in\{0,10,20,30,40\}$, with $j=1$ corresponding to $(0,10]$, and $J=5$ corresponding to $(40,50]$.

We choose deciles to discretize neighborhood quality when investigating treatment effects not because we believe treatment should have an effect when crossing the particular thresholds of neighborhood quality used in this definition, but because we believe it offers the best balance between theoretical ideal and practical necessity. The model assumes that moves within a given level of treatment will not have effects on outcomes. Even if they do, it is enough to assume that individuals do not select within treatment levels based on rich information regarding neighborhood quality. If these assumptions do not hold within entire deciles of quality, the effects from such moves will likely enter the estimation results through the unobserved components of the potential outcomes $Y_j(i)$. Theoretically, one way to handle this issue would be to increase the number of bins until moves within a given level do not have effects on outcomes. Another way to handle this problem would be to reformulate the model to accommodate a continuous treatment (Florens et al., 2008; Masten & Torgovitsky, 2016).

Because of the limited mobility induced by MTO, we believe deciles of quality are likely to offer the smallest window on which it is feasible to estimate neighborhood effects using the MTO interim survey data. As the next section shows, even this discretization leaves us with an undesirably small sample size of compliers, resulting in noisy estimates. As a result, the only LATEs we attempt to estimate are of moves between $j=1$ ($q∈(0,10]$) and $j=2$ ($q∈(10,20]$).

### B. Estimation Results: Ordered Choice Model

The full model we estimate, specified in appendix C, takes into account the two types of vouchers used in MTO. Parameters corresponding to the experimental group voucher will be superindexed with $M$, while those corresponding to the standard Section 8 voucher will be superindexed by $S$.

The MTO voucher was much more effective than the standard Section 8 voucher in getting moving households to access higher-quality neighborhoods. Figure 2 shows program participants, color-coded by whether they lived in a neighborhood at the time of the interim evaluation ranked in the first, second, third, or fourth decile of the national distribution of neighborhood quality. On the $x$-axis is the $\hat{\mu}(x_i)$ of each household, and on the $y$-axis is the percentile of the household's unobserved determinant of selection in the absence of a program, $\hat{U}_D(i)\equiv F_V(\hat{V}(i))=\Phi(\hat{V}(i))$. Since vouchers were randomly assigned in MTO, figures 2a and 2b illustrate counterfactual distributions of neighborhood quality under external manipulations to voucher type. For each household, given observed variables summarized by $\mu(x_i)$ and unobserved variables $u_{Di}$, these figures show the neighborhood quality households would select into under each setting of the vouchers.
Figure 2.

Selection into Treatment

This figure shows how MTO adults sorted into the discrete quality levels $D(i)∈{1,2,3,4}$ as a function of their estimated choice model observed characteristics ($μ^(xi)$ on the $x$-axis) and unobserved characteristics ($U^D(i)$ on the $y$-axis). Since these characteristics determine how households would select into $D(i)$ in the absence of any voucher, and vouchers were randomly assigned to households, panels a and b characterize counterfactual choices under ideal interventions to voucher status. A color version is available online.
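The $y$-axis variable is the standard normal CDF evaluated at the estimated unobserved component. A minimal sketch of that transformation, assuming $V(i) \sim N(0,1)$ as stated in note 11, is given below; the function name `unobserved_percentile` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution assumed for the unobserved component V(i).
_STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def unobserved_percentile(v_hat: float) -> float:
    """Return Phi(v_hat), the percentile of the unobserved selection component."""
    return _STANDARD_NORMAL.cdf(v_hat)
```

A household with an estimated unobserved component of zero sits at the median of the $y$-axis, and larger values map to higher percentiles.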

In figure 2a, we can see that almost all of the control group remained in low-quality neighborhoods, most remaining in the first decile of neighborhood quality. Only households with very high observed factors $\hat{\mu}(x_i)$ and very low unobserved cost factors $\hat{U}_D(i)$ managed to move to higher-quality neighborhoods, even when defined as moving only to the second, third, or fourth deciles of the national distribution.

Figure 2b shows that the MTO voucher induced some households to move to higher-quality neighborhoods. Most of this mobility is from the first to the second decile of the national distribution of quality. Although it was still relatively rare, the MTO voucher did induce some households to move into the third and fourth deciles of neighborhood quality.

If a richer set of variables had been recorded through MTO, we could have gained more insight into the personal circumstances that influence neighborhood quality choice. This in turn would have refined the comparison of outcomes for individuals across similar unobservables, likely improving the precision of our LATE estimates. However, in the absence of more informative variables, our estimated choice model is still revealing of meaningful differences across observables like cities and baseline neighborhood quality.

We draw three conclusions for housing policy from the ordered choice model estimates. Our first conclusion is that programs should be tailored to their local housing market, as programs that work for the housing authority in one city might not work in other locations. We draw this conclusion from the fact that in both the absence of any program and in terms of program uptake, there tends to be greater variation across cities than across household characteristics. In the first column of table 1, the difference in predicted neighborhood quality between living in Boston and either Los Angeles or New York City is equal to nearly a full standard deviation of the unobserved factors determining neighborhood quality ($V$).

Table 1. Ordered Choice Model Parameter Estimates

| $X_k$ and $V$ | $\hat{\beta}_k$ | $\hat{\beta}_k^S$ | $\hat{\beta}_k^M$ |
| --- | --- | --- | --- |
| Baseline characteristics | | | |
| Teens in household | −0.08 (0.05) | −0.48 (0.10) | −0.39 (0.09) |
| Family in neighborhood | −0.14 (0.05) | −0.16 (0.12) | 0.00 (0.06) |
| Household member victim | 0.03 (0.05) | 0.10 (0.10) | 0.10 (0.10) |
| Baseline neighborhood quality | 0.13 (0.01) | −0.02 (0.03) | −0.10 (0.02) |
| Site fixed effects/constant | | | |
| Baltimore | 0 (—) | 0.03 (0.13) | −0.25 (0.13) |
| Boston | 0.31 (0.10) | −0.49 (0.21) | 0.02 (0.02) |
| Chicago | −0.04 (0.09) | 0.04 (0.13) | −0.50 (0.13) |
| Los Angeles | −0.52 (0.10) | 0.39 (0.17) | 0.47 (0.13) |
| New York City | −0.58 (0.09) | 0.60 (0.15) | −0.13 (0.15) |
| Unobserved factors | | | |
| $\rho^S$ and $\rho^M$ | — | 0.07 (0.10) | −0.17 (0.11) |
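As a worked check of the cross-city comparison, the gaps implied by the site fixed effects in the first column of table 1 are

$$
0.31 - (-0.52) = 0.83 \quad \text{(Boston versus Los Angeles)}, \qquad 0.31 - (-0.58) = 0.89 \quad \text{(Boston versus New York City)}.
$$

Because $V(i)$ is assumed to be standard normal (note 11), its standard deviation is 1, so both gaps amount to a little under one standard deviation of the unobserved factors.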

Large differences across cities are estimated for the take-up and impacts of vouchers, and these differences vary across program design. In the second column of table 1, large differences in the likelihood of take-up for a standard Section 8 voucher can be seen between Boston and either Los Angeles or New York City. However, the largest differences in the take-up of the experimental MTO voucher are between residents of Los Angeles and either Chicago or Baltimore.15 Appendix C.3 shows that there were large differences in cost reduction across cities within program design: the cost reduction from MTO take-up in Los Angeles and New York was almost double the reduction in Baltimore and Boston, and the same was true for the standard Section 8 voucher.16

Our second conclusion is that changing program requirements will change the composition of households that take up the program. While initial neighborhood quality had little influence on the take-up of standard Section 8 vouchers, take-up of the experimental MTO voucher was skewed toward households initially living in the lowest-quality neighborhoods. Initially living in a neighborhood at the 9th percentile of quality rather than a neighborhood at the 1st percentile, for example, would decrease the likelihood of taking up the MTO voucher as much as a standard deviation in the unobserved factors determining take-up ($V^M$).
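The arithmetic behind this comparison can be sketched as follows, under two assumptions that are not stated in this section: that baseline neighborhood quality enters the take-up index in percentile points, and that the unobserved take-up factor has unit variance. With the MTO take-up coefficient on baseline neighborhood quality from table 1,

$$
(9 - 1) \times (-0.10) = -0.80,
$$

so the shift in the take-up index is close to one standard deviation of the unobserved factors. The corresponding shift for the standard Section 8 voucher is $(9 - 1) \times (-0.02) = -0.16$.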

Third, while some household characteristics had surprisingly small influences over take-up, household structure did matter. Having a teen in the household reduced the likelihood of moving with either type of voucher. It is possible that the majority of respondents' motivation for moving out of public housing, to get away from drugs and gangs (Kling et al., 2007a), was felt most strongly for younger children. This explanation would be consistent with the coefficients on young children in Shroder (2002). Another possibility we suspect is that having teens in one's household created a hurdle to the successful take-up of a voucher. We think it is most likely that room occupancy restrictions according to age and gender of children may have made it harder for households with older children to find housing, since younger children are more likely to be allowed to share a room.17

### C. Estimation Results: LATEs of Neighborhood Quality

#### What effects are identified?

Recalling the counterfactual distributions displayed in figure 2, there is a range of values of $(\mu(x_i), U_D(i))$ for which households could be induced by receiving an MTO voucher to move from a $D=1$ quality neighborhood to quality $D=2$. Since $D_Z$ is not only a function of $\mu(x_i)$ and $U_D(i)$ but also of $W_Z(i)$, households with the same $\mu(x_i)$ and $U_D(i)$ could be in different levels of $D$ depending on whether they take up the voucher. There is another range for which households would not move from $D=1$ (those with high $U_D(i)$), and there are also ranges for which households could be induced to make other moves, such as from $D=2$ to $D=3$.

Owing to the observed patterns of neighborhood selection displayed in figure 2, we focus on identifying the effects of moving from the first to the second decile of neighborhood quality.18 Table 2 characterizes some of the changes in neighborhood characteristics that would typically accompany a move from $D=1$ to $D=2$. On average, the poverty rate would decline from 33% to 22%, BA attainment would rise from 7% to 11%, the share of single-headed households would drop from 52% to 38%, and the female unemployment rate would drop from 16% to 10%. While these changes in neighborhood characteristics are nontrivial, it is worth pointing out that the resulting characteristics are still far worse than those of the unconditional median neighborhood in the United States in 2000, and changes of these magnitudes would have to occur several times to achieve the characteristics of the highest-quality neighborhoods. As discussed in Aliprantis (2017a) and elsewhere, these are moves from the most extreme areas of the left tail of the distribution of quality to neighborhoods that are still within the left tail of quality.19

Table 2. Average Neighborhood Characteristics in 2000

| Neighborhood Characteristic | Mean $\mid D=1$ | Mean $\mid D=2$ | Unconditional Median | Mean $\mid D=10$ |
| --- | --- | --- | --- | --- |
| Poverty rate (%) | 33 | 22 | 9 | 3 |
| High school diploma (%) | 55 | 65 | 83 | 95 |
| BA (%) | 7 | 11 | 19 | 52 |
| Single-headed households (%) | 52 | 38 | 24 | 11 |
| Female unemployment rate (%) | 16 | 10 | 5 | 2 |
| Male EPR ($\times 100$) | 55 | 65 | 79 | 89 |

#### For whom are effects identified?

Appendix B shows how we determine the support of $(\mu(x_i), U_D(i))$ for which LATEs of moving from $D=1$ to $D=2$ are identified. The identification support set, shown by the shaded region in figure 3, is

$$
S_{1,2}^M \equiv \{(\mu(x_i), U_D(i)) \mid \mu(x_i) \in [-0.6, 0.4],\ U_D(i) \in [0.43 + 0.30\mu(x_i),\ 0.68 + 0.15\mu(x_i)]\}.
$$

Families in this group select into neighborhood quality $D=1$ without any voucher, and into neighborhood quality $D=1$ or $D=2$ with an MTO voucher assigned. We now include the superscript on $S$ to indicate that the set is associated with the MTO voucher and subscripts for both $j$ and $j+1$ to be clear about the transition with which the set is associated.

Figure 3. Selection into Treatment and Identification Support Set $S_{1,2}^M$

This figure shows how MTO adults sorted into the discrete quality levels $D(i) \in \{1,2,3,4\}$ as a function of their estimated choice model observed characteristics ($\hat{\mu}(x_i)$ on the $x$-axis) and unobserved characteristics ($\hat{U}_D(i)$ on the $y$-axis). We now also show the identification support set $S_{1,2}^M$. A color version is available online.
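Membership in the identification support set is a simple pair of interval checks. The sketch below encodes the bounds reported in the text; the function name `in_identification_set` is hypothetical and the snippet is an illustration rather than the code used for estimation.

```python
def in_identification_set(mu: float, u: float) -> bool:
    """Check whether (mu, u) lies in the set S_{1,2}^M reported in the text.

    mu is the observed choice model index and u is the percentile of the
    unobserved component, a number between 0 and 1.
    """
    if not -0.6 <= mu <= 0.4:
        return False
    lower = 0.43 + 0.30 * mu  # lower boundary of the shaded region
    upper = 0.68 + 0.15 * mu  # upper boundary of the shaded region
    return lower <= u <= upper
```

For example, a household with $\mu(x_i) = 0$ belongs to the set when its unobserved percentile lies between 0.43 and 0.68, so `in_identification_set(0.0, 0.5)` is true while `in_identification_set(0.0, 0.8)` is false.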

#### LATEs of neighborhood quality estimation results.

Estimates of LATEs of neighborhood quality are reported for the subpopulation in $S_{1,2}^M$ in table 3. We interpret the large standard errors as a result of the small sample size used in estimation and note that the mechanical relationship generating bias in linear-in-means models does not hold in our model (see note 8). Since the estimates pertain to the select subgroup induced to move to higher-quality neighborhoods through MTO, they are not generalizable.

Table 3.

| Outcome | $\Delta_{1,2}^{\mathrm{LATE}_{Z^M}}$ | Control Mean in $S_{1,2}^M$ |
| --- | --- | --- |
| Labor market | | |
| In labor force (%) | 25.8 (18.3) | 53.2 |
| Employed (%) | 31.2 (20.1) | 41.7 |
| Household income (\$) | 5,616 (3,914) | 13,506 |
| Earnings (\$) | 1,970 (4,066) | 7,642 |
| Welfare benefits | | |
| Received TANF (%) | −40.0 (19.2) | 39.9 |
| Health | | |
| BMI (raw) | −3.1 (2.8) | 30.9 |

$\Delta_{1,2}^{\mathrm{LATE}_{Z^M}}$ estimates pertain to individuals with observed and unobserved choice model components in $S_{1,2}^M \equiv \{(\mu(x_i), U_D(i)) \mid \mu(x_i) \in [-0.6, 0.4],\ U_D(i) \in [0.43 + 0.30\mu(x_i),\ 0.68 + 0.15\mu(x_i)]\}$. Control means are also computed for the subsample in this region and outside this region (both conditional on $D(i)=1$). Standard errors are computed using 200 bootstrap replications.
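For readers unfamiliar with bootstrap standard errors, the generic procedure can be sketched as below. This is a plain nonparametric bootstrap over observations and is only an assumption about the general approach: the text does not say whether each replication re-estimates the choice model or how the weights in note 9 enter, and the names `bootstrap_se`, `reps`, and `seed` are hypothetical.

```python
import random
import statistics
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def bootstrap_se(sample: Sequence[T],
                 estimator: Callable[[Sequence[T]], float],
                 reps: int = 200,
                 seed: int = 0) -> float:
    """Standard deviation of an estimator across bootstrap resamples."""
    rng = random.Random(seed)
    n = len(sample)
    draws = []
    for _ in range(reps):
        # Draw n observations with replacement and re-compute the estimate.
        resample = rng.choices(sample, k=n)
        draws.append(estimator(resample))
    return statistics.stdev(draws)
```

A call such as `bootstrap_se(outcomes, statistics.mean)` returns the bootstrap standard error of a sample mean; replacing `statistics.mean` with a function that computes a LATE on the resampled households would give the analogous quantity for a treatment effect.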

All of our point estimates are large. However, all of our standard errors are also large. Thus, we think focusing on any single point estimate is less useful than looking at the broad picture painted by these estimates.

Viewing MTO through the lens of our model of neighborhood effects produces a very different picture from viewing MTO through its program effects. All of the LATEs in table 3 conform to the theory that living in higher-quality neighborhoods improves adult labor market and health outcomes while decreasing receipt of welfare benefits. This contrasts with prominent research on MTO that interprets the program as evidence against the theory that living in higher-quality neighborhoods improves adult labor market outcomes.

We think distance and information frictions are the most likely mechanisms for explaining our results on labor market outcomes. As measured by the commute to work, distance could operate through employers' discrimination (Phillips, forthcoming) or commuting costs to the employee (Gobillon & Selod, 2014). Information frictions could operate through the employer's search due to differences in referral networks across neighborhoods (Bayer et al., 2008) or through the employee's search due to inner-city workers gaining information about suburban employment opportunities after moving closer to them (Holzer & Reaser, 2000).

Our labor market estimates are consistent with several strands of the literature. Evidence on local labor markets, neighborhood effects, labor supply elasticities, and spatial mismatch all suggest there could be especially large neighborhood effects for the MTO population. Recent work indicates that local labor market effects can be experienced over a very small spatial scale (Manning & Petrongolo, 2017; Mansfield, 2018). Looking at the effects of specific neighborhood characteristics, Weinberg, Reagan, and Yankow (2004) find highly nonlinear effects on hours worked. Improving female employment from 1 standard deviation below the mean to the mean would increase an individual's hours worked by 11%, while the next improvement of 1 standard deviation would increase hours worked by only 4%.

We suspect our estimates should be larger than those in Weinberg et al. (2004) because their effects are estimated on a somewhat representative sample of men, while the MTO population comprises primarily single, low-income, black women living in the lowest-quality neighborhoods in the United States. Bargain, Orsini, and Peichl's (2014) cross-country analysis finds large (married) female labor supply wage elasticities in countries where female participation is low, as well as particularly large elasticities among low-wage single individuals. Andersson et al. (2018) find that better job accessibility considerably decreases the duration of joblessness among lower-income displaced workers, with especially large effects for blacks and women. In addition to Weinberg et al.'s (2004) own suggestive but inconclusive evidence that neighborhood effects are larger for blacks, evidence on the spatial mismatch hypothesis has found that black workers are negatively affected when establishments move to the suburbs (Zax & Kain, 1996), and that the black share of employees is decreasing in distance from the central business district, even within firms that operate multiple establishments in the same city (Miller, 2018).

Our point estimates must be interpreted with the caveat that they all have large standard errors. This is not surprising, since our estimates apply to such a small subpopulation of the MTO participants. Subject to this caveat, we find large point estimates plausible when considering the literature on local labor markets, nonlinear neighborhood effects, female labor elasticities, and the spatial mismatch hypothesis together with the highly selected nature of our subsample.20

#### Falsification test: Outcomes for non-complier households.

As a check on our selection model and to highlight the difference between our analysis and the program effect approach adopted in most of the literature on MTO (Ludwig et al., 2008, 2013; Chetty, Hendren, & Katz, 2016), we now use figure 4 to define a falsification set $F_{1,2}^M$ for which households would remain in neighborhoods of quality $D=1$ even if they were assigned an MTO voucher:

$$
F_{1,2}^M \equiv \{(\mu(x_i), U_D(i)) \mid D_0(i) = D_1(i) = 1\}.
$$

Household membership in the falsification set is determined not only by the values of $(\mu(x_i), U_D(i))$ but also by the general cost function $C(q)$, the value of the cost reduction function $\gamma^M(q)$ at various levels of quality $q$, and instrument take-up $W^M(i)$. We can see from figure 4 that restricting households to those with $U_D(i) \in [0.7, 1]$ identifies the falsification set $F_{1,2}^M$.

Figure 4. Falsification Set $F_{1,2}^M$

This figure shows how MTO adults sorted into the discrete quality levels $D(i) \in \{1,2,3,4\}$ as a function of their estimated choice model observed characteristics ($\hat{\mu}(x_i)$ on the $x$-axis) and unobserved characteristics ($\hat{U}_D(i)$ on the $y$-axis). The falsification set $F_{1,2}^M$ is the group of households identified by the choice model to counterfactually all have $D_0(i) = D_1(i) = 1$. A color version is available online.

Neighborhood selection for the subpopulations in $S_{1,2}^M$ and $F_{1,2}^M$ is shown in the CDFs in figure 5. Figure 5a shows considerable variation in the neighborhood quality selected by the control group and MTO voucher holders in the identification support set $S_{1,2}^M$: no households in the control group selected into $D=2$, while 37% of MTO voucher holders did. Neighborhood selection for households in the falsification set $F_{1,2}^M$ was quite different: no households in the control group selected into $D=2$, and only 2% of MTO voucher holders selected into $D=2$.

Figure 5. Selection into Neighborhood Quality for Various Subpopulations

This figure shows the distributions of $q_Z(i)$ for (a) the identification support set $S_{1,2}^M$ and (b) the falsification set $F_{1,2}^M$. We see that the model does appear able to identify groups whose counterfactual neighborhood selection is quite different, and that selection into the continuous measure of neighborhood quality is consistent with the selection into the discrete measure of neighborhood quality shown earlier. A color version is available online.

The effects of the MTO program are compared in table 4 for households in the LATE identification support set $S_{1,2}^M$ and for households in the falsification set $F_{1,2}^M$. While receiving an MTO voucher resulted in large improvements to labor force participation rates for households in the group with improvements in neighborhood quality (those in $S_{1,2}^M$), there was no effect on labor force participation for households in the group with no improvement in neighborhood quality. Employment actually went down for those whose neighborhood quality did not improve, perhaps due to the disruptiveness of moving without the benefits of moving closer to jobs (Weinberg, 2000; Andersson et al., 2018). And while welfare (TANF) receipt and BMI decreased for voucher recipients who did not move to higher-quality neighborhoods, this effect was much larger for those who did move to a higher-quality neighborhood. Table 5 compares the ITT effects for the subpopulations in table 4 with the ITT effects estimated in the literature for all adults.

Table 4. Adult Program Effects Estimates by Neighborhood Selection Groups

Columns headed $F$ refer to the falsification set, $(\mu(x_i), U_D(i)) \in F_{1,2}^M$ (no change in neighborhood quality). Columns headed $S$ refer to the identification set, $(\mu(x_i), U_D(i)) \in S_{1,2}^M$ (improvement in neighborhood quality). The difference columns report $E[Y \mid Z^M=1] - E[Y \mid Z^M=0]$.

| Outcome | $F$: $E[Y \mid Z^M=1]$ | $F$: $E[Y \mid Z^M=0]$ | $F$: Difference | $S$: $E[Y \mid Z^M=1]$ | $S$: $E[Y \mid Z^M=0]$ | $S$: Difference |
| --- | --- | --- | --- | --- | --- | --- |
| Neighborhood selection | | | | | | |
| Neighborhood quality ($D$) | 1.02 (0.07) | 1.00 (0.00) | 0.02 (0.07) | 1.37 (0.10) | 1.00 (0.09) | 0.37 (0.13) |
| Neighborhood quality ($q$) | 1.7 (0.8) | 0.4 (0.1) | 1.2 (0.8) | 6.4 (1.0) | 1.1 (0.9) | 5.3 (1.3) |
| Labor market | | | | | | |
| In labor force (%) | 63.6 (3.9) | 63.6 (5.4) | 0.0 (6.4) | 63.0 (3.2) | 53.2 (3.8) | 9.8 (4.7) |
| Employed (%) | 47.1 (4.2) | 53.6 (5.4) | −6.5 (6.8) | 53.5 (3.3) | 41.7 (3.9) | 11.8 (5.0) |
| Household income (\$) | 14,252 (924) | 14,134 (998) | 119 (1,366) | 15,629 (847) | 13,506 (883) | 2,123 (1,175) |
| Earnings (\$) | 7,583 (914) | 8,554 (992) | −971 (1,375) | 8,364 (611) | 7,642 (767) | 722 (917) |
| Welfare benefits | | | | | | |
| Received TANF (%) | 32.2 (3.7) | 33.7 (5.0) | −1.5 (6.6) | 24.9 (3.0) | 39.9 (3.4) | −15.0 (4.6) |
| Health | | | | | | |
| BMI (raw) | 30.0 (0.5) | 30.4 (0.8) | −0.3 (1.0) | 29.7 (0.5) | 30.9 (0.5) | −1.2 (0.7) |

The first three columns of this table report the effects of receiving an experimental MTO voucher for households predicted by the estimated choice model to reside in a low-quality neighborhood even when receiving an MTO voucher. The last three columns report the effects of receiving an experimental MTO voucher for households predicted by the estimated choice model to potentially move to a higher-quality neighborhood when receiving an MTO voucher.
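With a binary instrument and a single induced transition, a LATE can be approximated by the Wald ratio of the program (ITT) effect on the outcome to the program effect on the discrete treatment $D$. The sketch below applies that ratio to the identification-set differences in table 4. It is an illustration rather than the estimator used for table 3, whose point estimates (25.8, 31.2, 5,616, 1,970, −40.0, and −3.1) are close to, but not identical to, these ratios; the reason for the gap is not stated here and may involve weighting or rounding. The names `wald_late`, `FIRST_STAGE`, and `ITT_EFFECTS` are hypothetical.

```python
# Program effect on discrete neighborhood quality D in the identification set
# (table 4, last column, first row).
FIRST_STAGE = 0.37

# Program (ITT) effects on outcomes in the identification set (table 4, last column).
ITT_EFFECTS = {
    "In labor force (%)": 9.8,
    "Employed (%)": 11.8,
    "Household income ($)": 2123.0,
    "Earnings ($)": 722.0,
    "Received TANF (%)": -15.0,
    "BMI (raw)": -1.2,
}


def wald_late(itt_effect: float, first_stage: float) -> float:
    """Wald ratio: effect on the outcome per unit change in treatment."""
    if first_stage == 0:
        raise ValueError("first stage must be nonzero")
    return itt_effect / first_stage


WALD_ESTIMATES = {name: wald_late(effect, FIRST_STAGE)
                  for name, effect in ITT_EFFECTS.items()}

for name, estimate in WALD_ESTIMATES.items():
    print(f"{name}: {estimate:.1f}")
```

The same ratio is undefined or extremely noisy in the falsification set, where the first stage is only 0.02, which is one way to see why program effects for that group say little about neighborhood effects.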

Table 5. Adult Program Effects Estimates by Neighborhood Selection Groups

All columns report $E[Y \mid Z^M=1] - E[Y \mid Z^M=0]$.

| Outcome | $(\mu(x_i), U_D(i)) \in S_{1,2}^M$ | $(\mu(x_i), U_D(i)) \in F_{1,2}^M$ | All Adults |
| --- | --- | --- | --- |
| Labor market | | | |
| In labor force (%) | 9.8 (4.7) | 0.0 (3.9) | 3.8 (2.0) |
| Employed (%) | 11.8 (5.0) | −6.5 (6.8) | 1.4 (2.1) |
| Household income (\$) | 2,123 (1,175) | 119 (1,366) | 239 (571) |
| Earnings (\$) | 722 (917) | −971 (1,375) | 136 (443) |
| Welfare benefits | | | |
| Received TANF (%) | −15.0 (4.6) | −1.5 (6.6) | −2.1 (1.9) |
| Health | | | |
| BMI (raw) | −1.2 (0.7) | −0.3 (1.0) | |

The first column of this table reports the ITT effects of receiving an experimental MTO voucher for households predicted by the estimated choice model to potentially move to a higher-quality neighborhood when receiving an MTO voucher. The second column reports the ITT effects of receiving an experimental MTO voucher for households predicted by the estimated choice model to remain in a low-quality neighborhood even when receiving an MTO voucher. The final column reports estimates from the literature of the ITT effects of receiving an experimental MTO voucher for all adults. The ITT on TANF receipt for all adults comes from table F3 from Kling, Liebman, and Katz (2007b), with the rest of the all adults ITT effects coming from tables D7.1a, D8.1, and D8.2 in Orr et al. (2003). The effects reported in the final column are regression-adjusted with robust standard errors.

This falsification test helps to illustrate that the effects of the MTO program are not interchangeable with the effects from neighborhood quality. A list of assumptions must be made before translating effects of variation in MTO voucher assignment into effects of variation in neighborhood quality. Our assumptions have been stated explicitly in sections II to IV; most assumptions in the MTO literature have been made implicitly (Aliprantis, 2017a).

## V. Conclusion

Because households endogenously sort into neighborhoods, identifying the causal effects of neighborhood environments has proven to be a substantial challenge. The Moving to Opportunity (MTO) housing mobility experiment gave households living in high-poverty neighborhoods in five US cities the ability to enter a lottery for housing vouchers to be used in low-poverty neighborhoods. The results from MTO have led to surprising and controversial inferences about neighborhood effects.

This paper identified neighborhood effects in MTO using a methodological innovation allowing for the identification of transition-specific effects with a binary instrument. We found that moving to a higher-quality neighborhood had large, positive effects on adult labor market outcomes and welfare receipt. While our estimates are noisy, we found no evidence from MTO against the theory that increasing neighborhood quality improves adult outcomes.

It is important to think about our results while being conscious of the difficulty of interpreting experiments in social settings (Deaton, 2010). Programs and neighborhood changes differentially affecting the treatment and control groups will interfere with the ability to identify neighborhood effects using voucher assignment as an instrument. Ideally, the voucher assignment would induce a change in the cost of moving, holding all else equal. However, households may have responded to their group assignment, and baseline neighborhood conditions may well have changed during the multiple-year period between the decision to move and the time of the interim evaluation when outcomes were measured. For example, households assigned to the control group might have responded by applying for Section 8 vouchers on their own outside of the MTO program (Orr et al., 2003). And according to de Souza Briggs, Popkin, and Goering (2010), during the implementation of MTO, the Jobs-Plus program saturated public housing developments with state-of-the-art employment, training, and child care services, while providing rent incentives to encourage employment. In addition, the United States enacted major welfare reform legislation in August 1996, precisely while MTO vouchers were being assigned (Blank, 2002).

We also think that our results support MTO-like policy innovations. The fact that households were more likely to move with Section 8 vouchers than MTO vouchers does not imply that Section 8 vouchers are preferable to MTO vouchers. Changes in neighborhood quality were much smaller for Section 8 movers than for MTO movers, and variation by site was large. Since only about a quarter of eligible households are currently able to obtain a Section 8 housing voucher (Sard & Fischer, 2012), an area for future research is understanding what changes in voucher policy might optimize the extent to which households are able to realize positive neighborhood effects through the subsidy, and which of these policies might be feasible to implement (McClure, 2010; Collinson & Ganong, 2018).

## Notes

1

Kling et al. (2007a) focus their attention on “outcomes that exhibit significant treatment effects” (p. 83).

2

While it is possible that more than two individuals will correspond to the same type $i$, we will refer to $i$ as individual rather than individual type for ease of notation. Because individuals are assumed to be household heads, we sometimes also refer to household $i$.

3

Our selection model rules out the possibility of interference. While there is reason to believe this assumption is violated in the context of MTO (Ellen, Suher, & Torrats-Espinosa, 2019), we leave relaxing this assumption to future work because doing so creates significant new obstacles to identification (Sobel, 2006; Manski, 2013).

4

The full likelihood is specified in appendix C.

5

We would generally interpret this as time restrictions on initial take-up combined with supply constraints, where supply constraints could be driven at least in part by landlords avoiding voucher holders (Phillips, 2017). This outcome could also result from the realization of idiosyncratic shocks after volunteering for the program.

6

Section 8 vouchers pay part of a tenant's private-market rent. Project-based assistance gives the option of a reduced-rent unit tied to a specific structure.

7

The coefficients relating each component to the index vector all have magnitudes similar to that of the coefficient for poverty. While poverty is, as expected, negatively correlated with quality, Aliprantis (2017a) shows that there exist neighborhoods that are both low poverty and low quality.

8

Following the discussion on page 106 of Angrist (2014), our model and definitions of variables serve to distinguish in several ways between the subjects of investigation and the peers who might causally affect them. Each outcome variable is a single raw variable pertaining to the sample of MTO participants. Our treatment variable is an index of multiple variables, each measured in terms of percentiles, and pertains to samples of census tract residents. Discretizing treatment serves to break whatever mechanical relationship might remain between the outcome and treatment variables.

9

Weights are used for two reasons. First, random assignment ratios varied both from site to site and over different time periods of sample recruitment. Randomization ratio weights are used to create samples representing the same number of people across groups within each site-period. This ensures that neighborhood effects are not conflated with time trends. Second, sampling weights must be used to account for the subsampling procedures used during the collection of the interim evaluation data.

10

We see testing the exclusion restriction that $Z$ influences neighborhood quality only through take-up as an area for future work once tests like those in Kitagawa (2015) and Huber and Mellace (2015) are generalized to the case of a multivalued discrete treatment.

11

In the full model, we assume that $(V(i), V^S(i))$ and $(V(i), V^M(i))$ are jointly normal, where $V^S(i)$ and $V^M(i)$ are unobserved variables influencing the decision of household $i$ to take up a Section 8 voucher and an MTO voucher when these are offered. We stress that these assumptions of joint normality are made only to learn about take-up of the Section 8 and Experimental MTO vouchers, not in order to identify LATEs. Appendix C presents a full explanation of why our identification of LATEs requires only the assumption that $V(i) \sim N(0,1)$.

12

Such a parametric assumption is not always invoked for identification of ordered choice models. For an example, see the nonparametric identification in Vytlacil (2006a).

13

Results in appendix A allow us to move freely between ordered choice models with different discretizations of quality. This allows us to estimate the choice model on the discretization of quality $\{q_k\}$ that is best for predicting choice and then, using the inferred continuous model, to translate those estimates to the discretization $\{q_j\}$ that is most appropriate for potential outcomes.
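
One way to see why such a translation is possible: in an ordered choice model with a continuous latent index, a coarser discretization only drops cut points, so the probability of a coarse category is the sum of the fine-category probabilities it contains. The sketch below illustrates this with a standard normal unobservable. The index value, cut points, and sign convention are made up for illustration, and the code is not the procedure of appendix A.

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def category_probs(index, cuts):
    """P(D = k) when D = k iff c_{k-1} < index + V <= c_k, with V ~ N(0, 1)
    and `cuts` the increasing interior cut points (hypothetical convention)."""
    edges = [-math.inf] + list(cuts) + [math.inf]
    return [normal_cdf(hi - index) - normal_cdf(lo - index)
            for lo, hi in zip(edges[:-1], edges[1:])]

index = -0.4                               # illustrative latent index
fine_cuts = [-1.0, -0.5, 0.0, 0.5, 1.0]    # six fine categories
coarse_cuts = [-0.5, 0.5]                  # three coarse categories

fine = category_probs(index, fine_cuts)
coarse = category_probs(index, coarse_cuts)

# Each coarse category pools two adjacent fine categories.
regrouped = [fine[0] + fine[1], fine[2] + fine[3], fine[4] + fine[5]]
print(coarse)
print(regrouped)
```

The two printed lists agree up to floating-point error, which is the sense in which estimates on one discretization carry over to another once the continuous model is fixed.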

14

This can be seen as a stronger version of the central identifying assumption in Bayer et al. (2008).

15

While our ordering of city fixed effects is different from the ordering in Shroder (2002), we believe this is primarily explained by his specification including the metropolitan vacancy rate, which is subsumed in our city fixed effects, and our specification including baseline neighborhood quality.

16

We note that the variation in the cost reduction across cities was larger for the MTO voucher than for the Section 8 voucher and also that we see large differences in cost reduction within a single city across program design. In Chicago, the cost reduction from taking up an MTO voucher was quite high, but the cost reduction from a standard Section 8 voucher was low.

17

See Chyn, Hyman, and Kapustin (2019) and Currie and Yelowitz (2000) for discussions of this policy.

18

We have also estimated average causal responses (ACRs) for subsets in which many possible moves are induced. The results are broadly consistent with our LATE estimates. Implementing Masten and Torgovitsky's (2016) strategy for average partial effects (APEs) is left for future work.

19

See Clampet-Lundquist and Massey (2008) or Quigley and Raphael (2008) for related discussions. Our effects are from changes in neighborhoods that are comparable to those in Pinto (2018), smaller than those considered in Altonji and Mansfield (2018), and in a different part of the distribution than in Galster et al. (2016).

20

Regarding sample selection, the $D_1=2, D_0=1$ compliers to which our estimates pertain comprise about 10% of our MTO population, with the population of MTO volunteers representing about 25% of eligible households living in the poorest neighborhoods in the United States.
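
Taken at face value, and assuming the estimation sample is representative of MTO volunteers, these two approximate shares imply that the compliers correspond to roughly

$$0.10 \times 0.25 = 0.025,$$

or about 2.5% of eligible households in those neighborhoods.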

## REFERENCES

Aliprantis, D., “Assessing the Evidence on Neighborhood Effects from Moving to Opportunity,” Empirical Economics 52 (2017a), 925–954.

Aliprantis, D., “Human Capital in the Inner City,” Empirical Economics 53 (2017b), 1125–1169.

Aliprantis, D., and D. Kolliner, “Neighborhood Poverty and Quality in the Moving to Opportunity Experiment,” Federal Reserve Bank of Cleveland Economic Commentary (2015).

Altonji, J., and R. Mansfield, “Estimating School and Neighborhood Effects Using Sorting on Observables to Control for Sorting on Unobservables,” American Economic Review 108 (2018), 2902–2946.

Andersson, F., J. C. Haltiwanger, M. J. Kutzbach, H. O. Pollakowski, and D. H. Weinberg, “Job Displacement and the Duration of Joblessness: The Role of Spatial Mismatch,” this review 100 (2018), 203–218.

Angrist, J. D., “The Perils of Peer Effects,” Labour Economics 30 (2014), 98–108.

Angrist, J. D., and G. W. Imbens, “Two-Stage Least Squares Estimation of Average Causal Effects in Models with Variable Treatment Intensity,” Journal of the American Statistical Association 90:430 (1995), 431–442.

Athey, S., and G. W. Imbens, “The State of Applied Econometrics: Causality and Policy Evaluation,” Journal of Economic Perspectives 31:2 (2017), 3–32.

Bargain, O., K. Orsini, and A. Peichl, “Comparing Labor Supply Elasticities in Europe and the United States: New Results,” Journal of Human Resources 49 (2014), 723–838.

Bayer, P., S. L. Ross, and G. Topa, “Place of Work and Place of Residence: Informal Hiring Networks and Labor Market Outcomes,” Journal of Political Economy 116:6 (2008), 1150–1196.

Blank, R. M., “Evaluating Welfare Reform in the United States,” Journal of Economic Literature 40 (2002), 1105–1166.

Brinch, C. N., M. Mogstad, and M. Wiswall, “Beyond LATE with a Discrete Instrument,” Journal of Political Economy 125 (2017), 985–1039.

Brock, W., and S. Durlauf, “Identification of Binary Choice Models with Social Interactions,” Journal of Econometrics 140 (2007), 52–75.

Chetty, R., N. Hendren, and L. F. Katz, “The Effects of Exposure to Better Neighborhoods on Children: New Evidence from the Moving to Opportunity Experiment,” American Economic Review 106 (2016), 855–902.

Chyn, E., J. Hyman, and M. Kapustin, “Housing Voucher Take-Up and Labor Market Impacts,” Journal of Policy Analysis and Management 38:1 (2019), 65–98.

Clampet-Lundquist, S., and D. S. Massey, “Neighborhood Effects on Economic Self-Sufficiency: A Reconsideration of the Moving to Opportunity Experiment,” American Journal of Sociology 114 (2008), 107–143.

Collinson, R., and P. Ganong, “How Do Changes in Housing Voucher Design Affect Rent and Neighborhood Quality?” American Economic Journal: Economic Policy 10:2 (2018), 62–89.

Currie, J., and A. Yelowitz, “Are Public Housing Projects Good for Kids?” Journal of Public Economics 75 (2000), 99–124.

Davis, M. A., J. Gregory, D. A. Hartley, and K. Tan, “Neighborhood Effects and Housing Vouchers,” Federal Reserve Bank of Chicago working paper 2017-02 (2020).

de Souza Briggs, X., S. J. Popkin, and J. Goering, Moving to Opportunity: The Story of an American Experiment to Fight Ghetto Poverty (New York: Oxford University Press, 2010).

Deaton, A., “Instruments, Randomization and Learning about Development,” Journal of Economic Literature 48 (2010), 424–455.

Ellen, I. G., M. Suher, and G. Torrats-Espinosa, “Neighbors and Networks: The Role of Social Interactions on the Residential Choices of Housing Choice Voucher Holders,” Journal of Housing Economics 43 (2019), 56–71.

Florens, J.-P., J. J. Heckman, C. Meghir, and E. Vytlacil, “Identification of Treatment Effects Using Control Functions in Models with Continuous, Endogenous Treatment and Heterogeneous Effects,” Econometrica 76 (2008), 1191–1206.

Galiani, S., A. Murphy, and J. Pantano, “Estimating Neighborhood Choice Models: Lessons from a Housing Assistance Experiment,” American Economic Review 105 (2015), 3385–3415.

Galster, G., A. Santiago, L. Stack, and J. Cutsinger, “Neighborhood Effects on Secondary School Performance of Latino and African American Youth: Evidence from a Natural Experiment in Denver,” Journal of Urban Economics 93 (2016), 30–48.

Galvez, M. M., J. Simington, and M. Treskon, Moving to Work and Neighborhood Opportunity: A Scan of Mobility Initiatives by Moving to Work Public Housing Authorities (Washington, DC: What Works Collaborative, 2010).

Geyer, J., and H. Sieg, “Estimating a Model of Excess Demand for Public Housing,” Quantitative Economics 4 (2013), 483–513.

Gobillon, L., and H. Selod, Spatial Mismatch, Poverty, and Vulnerable Populations (Berlin: Springer, 2014).

Goering, J., “The Impacts of New Neighborhoods on Poor Families: Evaluating the Policy Implications of the Moving to Opportunity Demonstration,” Economic Policy Review 9 (2003), 113–140.

Graham, B. S., “Identifying and Estimating Neighborhood Effects,” Journal of Economic Literature 56 (2018), 450–500.

Greene, W., M. N. Harris, B. Hollingsworth, and P. Maitra, “A Latent Class Model for Obesity,” Economics Letters 123 (2014), 1–5.

Greene, W. H., and D. A. Hensher, Modeling Ordered Choices: A Primer (Cambridge: Cambridge University Press, 2010).

Heckman, J. J., S. Urzúa, and E. Vytlacil, “Understanding Instrumental Variables in Models with Essential Heterogeneity,” this review 88 (2006), 389–432.

Holzer, H. J., and J. Reaser, “Black Applicants, Black Employees, and Urban Labor Market Policy,” Journal of Urban Economics 48 (2000), 365–387.

Huber, M., and G. Mellace, “Testing Instrument Validity for LATE Identification Based on Inequality Moment Constraints,” this review 97 (2015), 398–411.

Imbens, G. W., and J. D. Angrist, “Identification and Estimation of Local Average Treatment Effects,” Econometrica 62 (1994), 467–475.

Kitagawa, T., “A Test for Instrument Validity,” Econometrica 83 (2015), 2043–2063.

Kline, P., and C. R. Walters, “On Heckits, LATE, and Numerical Equivalence,” Econometrica 87 (2019), 677–696.

Kling, J. R., J. B. Liebman, and L. F. Katz, “Experimental Analysis of Neighborhood Effects,” Econometrica 75 (2007a), 83–119.

Kling, J. R., J. B. Liebman, and L. F. Katz, “Supplement to ‘Experimental Analysis of Neighborhood Effects’: Web Appendix,” Econometrica 75 (2007b), 83–119.

Ludwig, J., G. J. Duncan, L. A. Gennetian, L. F. Katz, R. C. Kessler, J. R. Kling, and L. Sanbonmatsu, “Long-Term Neighborhood Effects on Low-Income Families: Evidence from Moving to Opportunity,” American Economic Review 103 (2013), 226–231.

Ludwig, J., J. B. Liebman, J. R. Kling, G. J. Duncan, L. F. Katz, R. C. Kessler, and L. Sanbonmatsu, “What Can We Learn about Neighborhood Effects from the Moving to Opportunity Experiment?” American Journal of Sociology 114 (2008), 144–188.

Manning, A., and B. Petrongolo, “How Local Are Labor Markets? Evidence from a Spatial Job Search Model,” American Economic Review 107 (2017), 2877–2907.

Mansfield, R. K., “How Local Are US Labor Markets? Using an Assignment Model to Forecast the Geographic Incidence of Local Labor Demand Shocks” (2018).

Manski, C. F., “Identification of Endogenous Social Effects: The Reflection Problem,” Review of Economic Studies 60 (1993), 531–542.

Manski, C. F., “Identification of Treatment Response with Social Interactions,” Econometrics Journal 16 (2013), S1–S23.

Masten, M. A., and A. Torgovitsky, “Identification of Instrumental Variable Correlated Random Coefficients Models,” this review 98 (2016), 1001–1005.

McClure, K., “The Prospects for Guiding Housing Choice Voucher Households to High Opportunity Neighborhoods,” Cityscape 12:3 (2010), 101–122.

Mendenhall, R., S. DeLuca, and G. Duncan, “Neighborhood Resources, Racial Segregation, and Economic Mobility: Results from the Gautreaux Program,” Social Science Research 35 (2006), 892–923.

Miller, C., “When Work Moves: Job Suburbanization and Black Employment,” NBER working paper 24728 (2018).

Minnesota Population Center, National Historical Geographic Information System (Pre-release Version 0.1 ed.) (Minneapolis: University of Minnesota, 2004).

Orr, L. L., J. D. Feins, R. Jacob, E. Beecroft, L. Sanbonmatsu, L. F. Katz, J. B. Liebman, and J. R. Kling, Moving to Opportunity: Interim Impacts Evaluation (Washington, DC: US Department of Housing and Urban Development, Office of Policy Development and Research, 2003).

Phillips, D. C., “Landlords Avoid Tenants Who Pay with Vouchers,” Economics Letters 151 (2017), 48–52.

Phillips, D. C., “Do Low-Wage Employers Discriminate against Applicants with Long Commutes? Evidence from a Correspondence Experiment,” Journal of Human Resources, forthcoming.

Pinto, R., “Noncompliance as a Rational Choice: A Framework That Exploits Compromises in Social Experiments to Identify Causal Effects,” UCLA mimeograph (2018).

Polikoff, A., Waiting for Gautreaux (Evanston, IL: Northwestern University Press, 2006).

Quigley, J. M., and S. Raphael, “Neighborhoods, Economic Self-Sufficiency, and the MTO Program,” Brookings-Wharton Papers on Urban Affairs 8 (2008), 1–46.

Rosenbaum, J. E., “Changing the Geography of Opportunity by Expanding Residential Choice: Lessons from the Gautreaux Program,” Housing Policy Debate 6 (1995), 231–269.

Sampson, R. J., “Moving to Inequality: Neighborhood Effects and Experiments Meet Social Structure,” American Journal of Sociology 114 (2008), 189–231.

Sanbonmatsu, L., J. R. Kling, G. J. Duncan, and J. Brooks-Gunn, “Neighborhoods and Academic Achievement: Results from the Moving to Opportunity Experiment,” Journal of Human Resources 41 (2006), 649–691.

Sard, B., and W. Fischer, “Renter's Tax Credit Would Promote Equity and Advance Balanced Housing Policy,” Center on Budget and Policy Priorities report (2012).

Shroder, M., “Locational Constraint, Housing Counseling, and Successful Lease-Up in a Randomized Housing Voucher Experiment,” Journal of Urban Economics 51 (2002), 315–338.

Shroder, M. D., and L. L. Orr, “Moving to Opportunity: Why, How, and What Next?” Cityscape 14 (2012), 31–56.

Sobel, M. E., “What Do Randomized Studies of Housing Mobility Demonstrate? Causal Inference in the Face of Interference,” Journal of the American Statistical Association 101 (2006), 1398–1407.

Vytlacil, E., “A Note on Additive Separability and Latent Index Models of Binary Choice: Representation Results,” Oxford Bulletin of Economics and Statistics 68 (2006a), 515–518.

Vytlacil, E., “Ordered Discrete-Choice Selection Models and Local Average Treatment Effect Assumptions: Equivalence, Nonequivalence and Representation Results,” this review 88 (2006b), 578–581.

Weinberg, B. A., “Black Residential Centralization and the Spatial Mismatch Hypothesis,” Journal of Urban Economics 48 (2000), 110–134.

Weinberg, B. A., P. B. Reagan, and J. J. Yankow, “Do Neighborhoods Affect Hours Worked? Evidence from Longitudinal Data,” Journal of Labor Economics 22 (2004), 891–924.

Wilson, W. J., The Truly Disadvantaged: The Inner City, the Underclass, and Public Policy (Chicago: University of Chicago, 1987).

Zax, J. S., and J. F. Kain, “Moving to the Suburbs: Do Relocating Companies Leave Their Black Employees Behind?” Journal of Labor Economics 14 (1996), 472–504.

## Author notes

We thank Nathaniel Baum-Snow, Greg Caetano, Karim Chalak, Ben Craig, Joel Elvery, Kyle Hood, Jon James, Jeffrey Kling, Carlos Lamarche, Fabian Lange, Alvin Murphy, David Phillips, Stephen Ross, Randall Walsh, the editor Bryan Graham, and several anonymous referees for helpful comments, as well as seminar participants at Case Western, the Cleveland Fed, Fribourg, GATE/Lyon II, Pitt, Rochester, UC Riverside, the 2016 Federal Reserve System Regional conference, 2016 Midwest Econometrics Group conference, 2015 Society of Labor Economists conference, 2014 Econometric Society European Meetings, 2013 Midwest Economics Association conference, 2013 Urban Economics Association conference, and 2012 Kentucky-Cleveland Fed Economics of Education Workshop. We also thank Emily Burgen and Nelson Oliver for valuable research assistance and Paul Joice at HUD for his assistance with the data set. The views stated here are our own and are not necessarily those of the Federal Reserve Bank of Cleveland or the Board of Governors of the Federal Reserve System.

A supplemental appendix is available online at http://www.mitpressjournals.org/doi/suppl/10.1162/rest_a_00933.