Abstract

Residential segregation by race grew sharply during the early twentieth century as black migrants from the South arrived in northern cities. Using newly assembled neighborhood-level data, we provide the first systematic evidence on the impact of prewar population dynamics within cities on the emergence of the American ghetto. Leveraging exogenous changes in neighborhood racial composition, we show that white flight in response to black arrivals was quantitatively large and accelerated between 1900 and 1930. A key implication of our findings is that segregation could have arisen solely from the flight behavior of whites.

I.  Introduction

Among the most durable and salient features of American urban life, residential segregation has been implicated in a wide variety of social ills. As a result, the question of how cities came to be segregated, and how that segregation has been sustained, has received widespread attention in both economics and the social sciences more broadly. Economists tend to emphasize two classes of mechanisms that could generate segregation: collective action by whites that raises the costs to blacks of migrating into white neighborhoods, and white flight, whereby whites vacate neighborhoods experiencing black in-migration and select into higher-priced neighborhoods that blacks generally cannot afford.

The patterns of segregation that typify the American ghetto were largely established prior to World War II.1 While there is strong evidence that white flight from center cities to the suburbs was a particularly important factor in the entrenchment of these high levels of segregation during the postwar era (Boustan, 2010), there is little empirical evidence on the role of white flight prior to the war. During this period's rapid rise in segregation, which occurred prior to the opening of the suburbs, flight would necessarily have transpired at the neighborhood level as whites departed city blocks experiencing black in-migration for other areas inside the city that were expected to remain racially homogeneous.

This lack of evidence on intracity sorting sets the context for the current literature, which argues that these decentralized mechanisms were not very important in explaining the establishment of segregation in the early twentieth century. In their seminal work, Massey and Denton (1993) vividly describe coordinated house bombings of recently arrived black families and the formation of neighborhood “improvement” associations that existed solely to maintain the color line with restrictive covenants. Similarly, Cutler, Glaeser, and Vigdor (1999) point to black-white rent differentials in the 1940 census as evidence supporting the importance of institutional barriers in establishing the black ghetto. This work lends support to the mainstream view among social scientists that during early waves of the Great Migration, segregation grew out of collective action by whites who sought to restrict the location choices of blacks.

Yet no evidence precludes the possibility that white flight was significant and altered the racial geography of cities in important ways during the early twentieth century. In this paper, we provide the first empirical investigation of the impact of urban population dynamics on the emergence of racial segregation in prewar American neighborhoods. Our findings demonstrate that white flight at the neighborhood level was occurring as early as the 1910s, decades before the opening of the suburbs in most cities. Furthermore, white households were sorting away from black arrivals at a time when many formal and informal institutional alternatives for “protecting the neighborhood” were common and often legal. In contrast to the mainstream view, our results suggest that far from being a postwar phenomenon, white flight was integral to the rise of segregation in American cities. To illustrate this point, we conclude our empirical work with a counterfactual exercise demonstrating that due to the estimated magnitude of flight behavior, segregation would have emerged in American cities even if blacks had faced far fewer barriers in the housing market.

The lack of panel data covering early-twentieth-century neighborhoods largely explains the paucity of prior research on prewar population dynamics. We address this limitation by constructing a finely grained, spatially identified demographic data set covering ten of the largest northern cities in 1900, 1910, 1920, and 1930. Our empirical work begins by identifying a causal link between black in-migration and white flight. We use exogenous changes in neighborhood-level black populations that we isolate by interacting variation in the state-level outmigration rates of blacks with within-city cross-neighborhood variation in the state of origin of early black arrivals. This approach facilitates the exploration of neighborhood-level population dynamics.2

Our analysis provides clear evidence of white flight from blacks in the early twentieth century; moreover, the flight effect appears to accelerate over the three decades we study. Results from a naive OLS analysis find one black arrival in the preceding decade associated with 0.9 and 1.5 white departures during the 1910s and 1920s, respectively. Of course, these OLS results fail to account for endogeneity concerns and could, for instance, be explained solely by the one-for-one replacement of white movers by black migrants in an environment with inelastic housing supply. However, our instrumental variables analysis, which assigns estimated state-level black outflows from southern states to northern neighborhoods according to black settlement patterns prior to the Great Migration, indicates that one exogenous black arrival was associated with 1.9 white departures in the 1910s and 3.4 white departures during the 1920s. These IV results suggest that OLS estimates were biased against a finding of flight, likely due to both white and black settlement being drawn to generally growing neighborhoods.

In the second portion of our analysis, we construct a series of counterfactual exercises aimed at understanding how much of the observed increase in segregation over the 1900 to 1930 period can be attributed to white flight from black arrivals in the absence of institutional barriers constructed by whites. The most striking finding is the sharp increase in the contribution of flight in each subsequent decade. While our preferred estimates suggest that white flight was inconsequential during the 1900s, we estimate that flight can explain 34% of the increase in segregation (as measured by dissimilarity) over the 1910s and 50% of the increase over the 1920s. The impact of flight in the latter decade is particularly important given that the 1920s saw the largest decadal increase in segregation in the twentieth century.

Our finding that sorting by whites out of neighborhoods with growing black populations was a quantitatively important phenomenon in the prewar decades of the twentieth century is novel. To be clear, these results do not call into question the presence of widespread collective action by whites, about which the historical record is quite clear. They do, however, suggest that segregation would likely have arisen even without the presence of discriminatory institutions as a direct consequence of the widespread and decentralized relocation decisions of white individuals within an urban area. Whites likely would have responded to policies that reduced barriers to black settlement in their vicinity by accelerating their departure for neighborhoods within the city that were at lower risk of “encroachment.” Policies that reduce barriers faced by blacks in the housing market may thus not prevent or reverse segregation as long as white households have a desire to avoid black neighbors or concerns about the quality of public goods and amenities in neighborhoods experiencing racial turnover.

The paper proceeds as follows. Section II reviews the historical context for the black migration from the South and neighborhood population dynamics in northern cities. Section III discusses the construction of the data set used in this paper. Section IV details our empirical approaches for measuring white flight, and section V presents our results. Section VI relates our finding to the observed increase in segregation. Section VII concludes.

II.  Background on Segregation and Urbanization in the United States

A.  Historical Background on the Great Migration

Scholars have long argued that the groundwork of the black ghetto was laid during the first decades of the twentieth century as black populations in northern cities grew, leading to the sharp increase in the racial segregation of neighborhoods. African Americans' migration to northern cities began to accelerate on the eve of World War I, an event that brought European immigration to a temporary halt while simultaneously increasing demand for industrial production. These wartime developments in the northern labor market coincided with the arrival of the Mexican boll weevil in Mississippi and Alabama (1913 and 1916, respectively), which devastated cotton crops and led to a decline in demand for black tenant farmers (Grossman, 1991). This combination of push and pull factors led to unprecedented out-migration from the South: 525,000 blacks moved to the North in the 1910s and 877,000 in the 1920s (Farley & Allen, 1987).

Cities were growing at an unprecedented rate during these initial decades of the twentieth century, but black migrants from the South were just one source of urban population growth. European immigrants were numerically more important, particularly prior to the implementation of the first National Immigration Act in 1921. Segregation thus emerged against a backdrop of rapid urbanization, in contrast to the postwar era, which saw significant suburbanization and declines in urban populations. The share of the population residing in central cities grew from 14% to 33% between 1880 and 1930, leveling off subsequently.3 Cities grew through a combination of increasing density and the annexation and development of outlying areas. In our sample, the population density of the average urban neighborhood increased by 68% between 1900 and 1930 (see table 1).4

Table 1.
Summary Statistics for Hexagon Panel Data Set

                                                              1900        1910        1920        1930
Black percent                                                 2.24        2.25        2.74        4.54
                                                            (3.86)      (4.28)      (6.45)     (11.78)
White third-generation percent                               36.31       37.06       39.87       41.47
                                                           (16.65)     (16.74)     (18.22)     (18.91)
White second-generation percent                              34.00       34.09       33.09       32.42
                                                            (9.95)      (9.10)      (9.39)     (10.99)
White first-generation percent                               26.12       26.20       23.46       21.49
                                                           (10.11)     (11.64)     (11.00)     (10.55)
Population                                                   2,504       3,160       3,802       4,216
                                                           (3,857)     (4,239)     (4,343)     (3,874)
Decadal change in white population                                      650.36      590.66      282.60
                                                                    (1,147.63)  (1,259.64)  (1,741.58)
Decadal change in black population                                       20.54       48.83      118.32
                                                                       (51.62)    (172.30)    (190.35)
Decadal change in white third-generation population                     206.60      323.05      186.10
                                                                      (484.36)    (540.35)    (657.94)
Decadal change in white second-generation population                    217.03      172.87      121.90
                                                                      (470.84)    (503.40)    (696.04)
Decadal change in white first-generation population                     228.15       69.25       29.08
                                                                      (545.40)    (539.00)    (717.29)

Changes in population are measured relative to the previous decade's value. All demographic variables were created using the 100% sample of census records from Ancestry.com. Only hexagons with at least 95% coverage by enumeration districts from the respective census in each year are included in the panel. We trim the sample at the 1st and 99th percentiles of both white and black population change for each decade, and at the 99th percentile of the ratio of white population to white household heads and of black population to black household heads. The statistics presented cover the balanced panel of 1,975 hexagon neighborhoods that remain after these trims.

While our empirical analysis will focus on the urban core of our sample cities, developments in the periphery are important for understanding our results. Although some streetcar suburbs existed by 1910, white flight in this period can primarily be thought of as departures for neighborhoods farther away from the downtown but still within city boundaries. Public transit became cheaper over this period with the proliferation of electric streetcars, subways, and, toward the end of the period, the widespread adoption of the private automobile. Thus, the cost of departing neighborhoods at risk of racial turnover decreased over this period.

Of course, white homeowners who wished to live in a racially homogeneous neighborhood could also choose to fight black arrivals using a host of methods, including violence, restrictive covenants, or appeals to the city government to pass a racial zoning ordinance. The last option was invalidated by the 1917 Supreme Court case Buchanan v. Warley, which ruled that racial zoning laws interfered with the property rights of landowners.5 Restrictive covenants remained enforceable until 1948, and existing empirical work has found that these institutions were effective in constraining where blacks could live (Kucheva & Sander, 2010). Violence and related threats are difficult to study, but a large body of qualitative research has argued that such behaviors on the part of white urban residents had a profound impact on where African Americans lived. Historians have documented that in Chicago, one black home was bombed on average each month between 1917 and 1921 (Drake & Cayton, 1970). Thus, while some mechanisms used to deter black settlement became legally unavailable during the first decades of the early twentieth century, others were still very much in use. Our results can thus be thought of as examining the extent of white flight in a period when transport costs were declining and collective action by whites to maintain the color line remained commonplace.

B.  The Rise of Segregation in the United States

We begin our empirical analysis by confirming the understanding of this rise in segregation levels using our newly constructed spatial data set. We measure segregation using the two most common indices of segregation: isolation and dissimilarity. A standard isolation index measures the percent black in the neighborhood of the average black resident; we follow Cutler, Glaeser, and Vigdor (1999) and compute a modified index that controls for the fact that under the standard approach, there is a potential for the index to be highly sensitive to changes in the overall group share. Our second segregation measure is the dissimilarity index (Duncan & Duncan, 1955). This index ranges from 0 to 1, with 1 representing the highest degree of dissimilarity between where whites and blacks in a city reside. Intuitively, the index reveals what share of the black (or white) population would need to relocate in order for both races to be evenly distributed across a city.
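As a rough illustration of the two measures just described, the sketch below computes the dissimilarity index and the standard (unmodified) isolation index for a single city. It is not the authors' code: the function names and the toy counts are hypothetical, neighborhoods are assumed to contain only black and white residents, and the Cutler, Glaeser, and Vigdor (1999) adjustment to the isolation index is not reproduced here.

```python
def dissimilarity(black, white):
    """Dissimilarity index: half the sum of absolute differences between
    each neighborhood's share of the city's black and white populations."""
    total_black = sum(black)
    total_white = sum(white)
    return 0.5 * sum(
        abs(b / total_black - w / total_white) for b, w in zip(black, white)
    )


def isolation(black, white):
    """Standard isolation index: the black share of the neighborhood in
    which the average black resident lives (assumes two groups only)."""
    total_black = sum(black)
    return sum(
        (b / total_black) * (b / (b + w))
        for b, w in zip(black, white)
        if b + w > 0
    )


# Hypothetical city with three neighborhoods.
black_counts = [90, 10, 0]
white_counts = [10, 90, 100]
print(dissimilarity(black_counts, white_counts))  # approximately 0.85
print(isolation(black_counts, white_counts))      # approximately 0.82
```

In this toy city, 85% of either group would have to move to equalize the distribution, and the average black resident lives in a neighborhood that is about 82% black.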

The Cutler et al. (1999) segregation indices are presented in figure 1. They were constructed using ward-level data for censuses prior to 1940 (the year when census tract data became widely available) and tract-level data in later decades. To make the ward and tract-level data comparable, Cutler et al. (1999) estimate the relationship between tract-level and ward-level indices in 1940 and then use the estimated 1940 relationship to rescale the ward-level estimates in earlier years. Using our new enumeration district-level data (discussed in section III), we compute these same segregation measures over the 1900 to 1930 time frame at both the enumeration district and ward level and report the results in figure 2. As expected given their smaller scale, enumeration district-level segregation indices are markedly higher than those computed at the ward level (the average enumeration district had 1,400 individuals, while wards could have as many as 100,000 residents in large cities). However, the trends in ward and enumeration district segregation are nearly parallel, showing a steep increase between 1900 and 1930.

Figure 1.

Segregation Trends in the Largest Ten American Cities, 1890–2000

Data are taken from the data set used in Cutler, Glaeser, and Vigdor (1999) and show the average segregation indices across Baltimore, Boston, Brooklyn, Chicago, Cincinnati, Cleveland, Detroit, Manhattan, Philadelphia, Pittsburgh, and St. Louis. We employ their adjustment factor to make the ward-level indices from 1930 and before comparable to the 1940 and onward tract-level indices.

Figure 2.

Segregation Trends by Enumeration and Ward, 1900–1930

See figure 1 for notes on the ward and adjusted ward data from Cutler et al. (1999). The enumeration district segregation averages are computed using the universe of census records from each of the ten sample cities accessed from Ancestry.com. ED: enumeration district; CGV: Cutler, Glaeser, and Vigdor (1999).

These figures underscore how crucial these early decades were for the emergence of racial residential segregation in America.6 In the first three decades of the twentieth century, the ten northern U.S. cities we study in this paper experienced 97% of their overall twentieth-century increase in dissimilarity and 63% of their increase in isolation.7 We note that large cities, which received proportionally larger black inflows in the prewar decades, attained their peak level of residential segregation by race earlier in the twentieth century than did smaller urban areas in the North. For instance, Cutler et al. (1999) find that segregation continued to increase markedly after 1930 for their more comprehensive sample of northern cities.

III.  Enumeration District Data, 1900–1930

The analysis in this paper is based on a new enumeration district-level spatial data set spanning the years 1900 through 1930.8 There are two major components to these data: census-derived microdata retrieved from Ancestry.com and digitized enumeration district maps. The census-derived microdata cover 100% of the population of ten large cities over four census years. For the twentieth-century decades (1900, 1910, 1920, and 1930), we collected the universe of census records for Baltimore, Boston, Cincinnati, Chicago, Cleveland, Detroit, New York City (Manhattan and Brooklyn boroughs), Philadelphia, Pittsburgh, and St. Louis from the genealogy website Ancestry.com. To maximize the usefulness of the data set for our purpose, we selected northern cities that received substantial inflows of black migrants. This sample contains the ten largest northern cities in the United States in 1880 and nine of the ten largest cities in the United States in 1930. The combined population of these cities was 9.3 million in 1900 and over 18 million in 1930, which is about half of the total population in the largest 100 cities in both years.

The microdata compiled for this paper represent a significant improvement over existing sources of data on early twentieth-century urban populations. Ward-level tabulations published by the census are the smallest unit at which 100% counts were previously available for the combination of cities and years that we study. Wards, which are still in use in some cities today, are large political units used to elect city council members, while enumeration districts were small administrative units used internally by the census to coordinate enumeration activities prior to the shift to mail surveys in 1960. Each individual record in the Ancestry.com data set includes place of birth, father's place of birth, mother's place of birth, year of birth, marital status, gender, race, year of immigration (for foreign-born individuals), and relation to head of household, in addition to place of residence (city, ward, and enumeration district) at the time of the respective census.

To place these individuals in urban space, we create digitized versions of census enumeration district maps based on two types of information available from the National Archives. We first employ written descriptions of the enumeration districts that are available on microfilm from the National Archives and have been made available online due to the work of Stephen P. Morse.9 Second, we use a nearly complete set of physical enumeration district maps for our census city pairs in the maps section of the National Archives. We took digital photographs of these maps as a second source for our digitization effort. Working primarily with geocoded (GIS) historic base street maps that were developed by the Center for Population Economics (CPE) at the University of Chicago, research assistants generated GIS representations of the enumeration district maps that are consistent with the historic street grids.10 Appendix figure I provides an illustration of this process, which generated maps of more than 35,000 distinct enumeration districts. Here, the shaded regions in panel D represent the digitized enumeration districts.

Analyzing demographic change over time within neighborhoods requires neighborhood definitions that are constant across census years. Using these data to form such neighborhoods is challenging because enumeration districts were redrawn for each decadal census, and unlike the case of modern census tracts, most changes were more complex than simple combinations or bifurcations. To address this challenge, we employ a hexagon-based imputation strategy. The strategy is illustrated in appendix figure II. It involves covering the enumeration district maps (panel A) with an evenly spaced, temporally invariant grid of 800-meter hexagons (panel B) and then computing the intersection of these two sets of polygons (panel C). Hexagons were chosen because they are the most compact way to tile the plane with symmetric shapes. The chosen size yields average populations that are comparable to those of census tracts.11 The count data from the underlying enumeration districts are attached to individual hexagons based on the percentage of the enumeration district's area that lies within the individual hexagon. Panel D presents the allocation weights for a sample hexagon. In the example, four enumeration districts lie completely within the hexagon (136, 139, 140, and 144), while eleven enumeration districts are partially covered by the hexagon. For these partial enumeration districts, only fractions of their counts are attributed to the hexagon, ranging from a minimum of 0.2% (155) to a maximum of 93.6% (142).
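The area-weighted allocation described above can be sketched as follows. This is a minimal illustration rather than the authors' GIS workflow: the function name and the enumeration district populations are hypothetical, and only the three allocation weights (100%, 93.6%, and 0.2%) come from the example in the text. The approach implicitly assumes that residents are spread evenly across each enumeration district's area.

```python
def hexagon_count(ed_counts, area_shares):
    """Impute a hexagon's count by summing each enumeration district's
    count times the share of that district's area inside the hexagon."""
    return sum(ed_counts[ed] * share for ed, share in area_shares.items())


# Hypothetical populations for three of the districts named in the text.
ed_population = {136: 1000, 142: 1000, 155: 1000}

# Share of each district's area that falls inside the sample hexagon.
share_in_hexagon = {136: 1.0, 142: 0.936, 155: 0.002}

print(hexagon_count(ed_population, share_in_hexagon))  # approximately 1938
```

The same function can be applied separately to white and black counts, so that every demographic variable is carried onto the fixed hexagon grid with identical weights.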

We form a balanced panel of all hexagons that were at least 95% covered by enumeration districts from the respective census in each year from 1900 to 1930, also trimming at the 1st and 99th percentile of both white and black population change for each decade to eliminate outliers from the sample. In table 1 we provide summary statistics for the balanced sample of 1,975 hexagon neighborhoods. The neighborhoods have an average population of 3,160 individuals in 1910 and 4,216 in 1930, with the increase in density reflecting the rise in urban population density that occurred over this period. By 1930, the neighborhoods are thus roughly similar in population to census tracts today. The average white population growth is positive in all years but declined from 650 over the 1900s to 282 over the 1920s, with much of this slowdown due to declining immigration from Europe after World War I and passage of the Immigration Restriction Act of 1921. The average percent black increased from 2.2% to 4.5% over the 1900 to 1930 period.

IV.  Empirical Strategy

The objective of our empirical work is to ascertain whether black arrivals had a causal impact on white population dynamics over the 1900 to 1930 period. The primary difficulty in identifying such an effect is that minorities do not arrive in neighborhoods exogenously. For example, newly arriving blacks may choose locations that were already being abandoned by white natives for reasons unrelated to race, leading to upwardly biased estimates of white flight responses in a naive estimation framework. Conversely, blacks and whites could both be drawn to neighborhoods whose populations are growing due to factors unrelated to race, leading to a downward bias in flight response estimates. To address this concern, we utilize an instrumental variables approach, which leverages exogenous sources of variation in black population size at the neighborhood level.12

Our main estimation strategy addresses the causality of white flight by directly utilizing exogenous variation in neighborhood racial composition that arose as the result of heterogeneous state-level black outmigration shocks. Our analysis is in the spirit of the immigration shock literature (Altonji & Card, 1991; Boustan, Fishback, & Kantor, 2010; Saiz & Wachter, 2011; Cascio & Lewis, 2012).

We begin this analysis by considering a simple OLS model relating the decadal change in black populations to the change in white populations:
\[
\Delta W_{ij,\,t_1-t_0} = \beta\,\Delta B_{ij,\,t_1-t_0} + \eta_j + \varepsilon_{ij}, \tag{1}
\]
where $\Delta W_{ij,\,t_1-t_0}$ ($\Delta B_{ij,\,t_1-t_0}$) is the change in the number of whites (blacks) in neighborhood $i$ of city $j$ over a decade and $\eta_j$ is a city fixed effect. The coefficient of interest from this first differences strategy, $\beta$, relates the change in the number of blacks to the change in the number of whites in a particular neighborhood over the same decade with the city-level average captured by the fixed effect.13
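One way to estimate the slope in equation (1) is the within (demeaning) estimator, which removes the city fixed effect by subtracting city means from both variables before computing the slope. The sketch below is an assumption about implementation, not the authors' code; the function name and the toy data are hypothetical, and standard errors are omitted.

```python
from collections import defaultdict


def within_ols(delta_white, delta_black, city):
    """Slope from regressing the change in white population on the change
    in black population after demeaning both within each city."""
    groups = defaultdict(list)
    for idx, label in enumerate(city):
        groups[label].append(idx)

    numerator = 0.0
    denominator = 0.0
    for members in groups.values():
        mean_w = sum(delta_white[i] for i in members) / len(members)
        mean_b = sum(delta_black[i] for i in members) / len(members)
        for i in members:
            numerator += (delta_black[i] - mean_b) * (delta_white[i] - mean_w)
            denominator += (delta_black[i] - mean_b) ** 2
    return numerator / denominator


# Toy data: two cities with different average white growth, same slope.
city = ["A", "A", "A", "B", "B", "B"]
delta_black = [0.0, 10.0, 20.0, 5.0, 15.0, 40.0]
delta_white = [500.0, 485.0, 470.0, 92.5, 77.5, 40.0]

print(within_ols(delta_white, delta_black, city))  # approximately -1.5
```

Because the two toy cities differ only in their intercepts, the estimator recovers the common slope of -1.5 even though pooling the data without fixed effects would not.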

In recent work there has been a growing concern that inappropriate model specification can lead to biased estimates in models of native displacement (Peri & Sparber, 2011; Wright, Mark, & Michael, 1997; Wozniak & Murray, 2012). We implement a change-in-levels specification because it facilitates the implementation of our counterfactual analysis and provides the most parsimonious implementation for our IV strategy. This approach also does well in Peri and Sparber's (2011) Monte Carlo simulations of specification bias in displacement models and makes our results more directly comparable to work in the postwar period by Boustan (2010).14

While estimates associated with equation (1) are informative about general patterns in the data, a host of endogeneity concerns make it inappropriate to draw causal inferences from them. The following cases highlight a number of the potential sources of bias. First, consider the case where neighborhood choice is solely driven by unobserved neighborhood characteristics and is completely independent of race. If neighborhood-level housing supply is perfectly inelastic, then any randomly driven increase (decrease) in a neighborhood's black population must be offset one for one with a decrease (increase) in its white population. Thus, a highly inelastic housing supply will bias estimates downward toward -1 in cases where the actual causal relationship implies a value of β equal to 0.

Conversely, if the supply of housing is perfectly elastic and whites and blacks are subject to the same neighborhood-specific demand shocks, blacks and whites on average would sort into neighborhoods at the same relative rates, and we would expect β > 0. The exact relationship will be driven by both within-city relocations and in-migration. If all population changes are driven by in-migrants, β will capture the relative increase in group populations. In our sample, for the 1920 to 1930 decade, this would imply an upwardly biased estimate of β that would be approximately equal to 2 when the true causal relationship implies β equal to 0. Finally, if supply is elastic and the neighborhood-level demand shocks experienced by blacks and whites are negatively correlated, for instance, due to low-income blacks being differentially attracted to low-price neighborhoods that are being systematically vacated by higher-income whites, then the OLS estimates will be biased downward.

Supply elasticity estimates are not available for our sample neighborhoods. However, the magnitude of population growth in our fixed-border neighborhoods (in terms of both individuals and households) suggests that housing supply was quite elastic at both the core and the periphery during this period. As a result, we do not generally expect negative coefficients to arise purely as a result of supply inelasticity. Regardless, the above discussion highlights the likely problem of bias in these simple OLS regressions. Shared sorting on neighborhood characteristics will impart upward bias to OLS estimates of β (away from flight), while OLS estimates of β will be biased in a negative direction (toward flight) if black arrivals were settling in neighborhoods already being abandoned by whites due to either inelastic supply or negatively correlated tastes for other unobserved neighborhood characteristics.

To overcome this bias concern we leverage exogenous variation in contemporary state-level black out-migration rates in combination with pre-1900 patterns of black settlement in our sample of northern cities to instrument for black arrivals. Specifically, we construct an instrument for $\Delta B_{ij,\,t_1-t_0}$ from two components: estimated black outflows from each state in each decade (1900 through 1930), computed from the universe of historical census records recently digitized and made available by Ancestry.com, and the settlement patterns established by African Americans who came to the North before the Great Migration and were thus living in our sample cities by 1900.15
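The logic of the instrumental variables approach can be illustrated with a deliberately small, constructed example. In the sketch below, a hypothetical neighborhood growth shock raises both black arrivals and white population growth, which pulls the OLS slope toward zero, while an instrument that is uncorrelated with the shock recovers the true effect. All names and numbers are invented for illustration; the example ignores city fixed effects and uses the simple just-identified ratio estimator rather than the authors' estimation procedure.

```python
def demean(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def ols_slope(x, y):
    """Simple regression slope of y on x."""
    xd, yd = demean(x), demean(y)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)


def iv_slope(z, x, y):
    """Just-identified IV slope: cov(z, y) / cov(z, x)."""
    zd, xd, yd = demean(z), demean(x), demean(y)
    return sum(a * b for a, b in zip(zd, yd)) / sum(a * b for a, b in zip(zd, xd))


TRUE_BETA = -2.0  # assumed causal effect of one black arrival on whites

# The instrument and the unobserved growth shock are orthogonal by design.
instrument = [1.0, -1.0, 1.0, -1.0]
growth_shock = [1.0, 1.0, -1.0, -1.0]

# Black arrivals respond to both; whites respond to arrivals and the shock.
delta_black = [z + u for z, u in zip(instrument, growth_shock)]
delta_white = [TRUE_BETA * b + 3.0 * u for b, u in zip(delta_black, growth_shock)]

print(ols_slope(delta_black, delta_white))             # -0.5
print(iv_slope(instrument, delta_black, delta_white))  # -2.0
```

Here OLS understates flight (-0.5 against a true effect of -2.0) because growing neighborhoods attract both groups, which is the direction of bias the shared-sorting case above describes.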

To estimate the total number of black out-migrants from each state over each census decade, we exploit the 100% census microdata samples for 1900 through 1930 and count, for each state, the number of black individuals who appear outside their state of birth in each gender, state of birth, and birth cohort cell. For simplicity, we consider only individuals under the age of 60 and aggregate birth cohorts into ten-year intervals. To illustrate, for the census year 1900, we count the number of individuals of each gender observed outside each birth state in the 1840–1849, 1850–1859, 1860–1869, 1870–1879, 1880–1889, and 1890–1899 birth cohorts. The total number of out-migrants in each cell is obtained by summing over the number of out-migrants present in each state of residence. To obtain the estimated outflow at the national level by cell over a census decade, we take the difference in the number of out-migrants by the five birth cohort intervals (c), two genders (g), and 51 states of birth (s) appearing in each state:
\[
\text{black\_outflow}_{cgst_1-t_0} = \sum_{k \neq s} \text{black\_outmigrants}_{kcgst_1} - \sum_{k \neq s} \text{black\_outmigrants}_{kcgst_0}, \tag{2}
\]
where k indexes the state of residence where the individual was observed (state i=51 is the District of Columbia).
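
The computation in equation (2) can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout, the function name `outflow_by_cell`, and the toy counts are hypothetical and are not taken from the paper's data.

```python
from collections import defaultdict

def outflow_by_cell(counts_t0, counts_t1):
    """Sketch of equation (2): decade change in national out-migrant totals per cell.

    Each input maps (residence_state, cohort, gender, birth_state) -> number of
    black individuals observed in that state of residence (hypothetical layout).
    """
    def national_totals(counts):
        totals = defaultdict(int)
        for (k, c, g, s), n in counts.items():
            if k != s:  # out-migrants only: residence differs from birth state
                totals[(c, g, s)] += n
        return totals

    tot0, tot1 = national_totals(counts_t0), national_totals(counts_t1)
    cells = set(tot0) | set(tot1)
    return {cell: tot1.get(cell, 0) - tot0.get(cell, 0) for cell in cells}

# Toy example (made-up numbers): men born in Virginia, 1880-1889 cohort.
c1900 = {("PA", "1880s", "M", "VA"): 100, ("NY", "1880s", "M", "VA"): 50}
c1910 = {("PA", "1880s", "M", "VA"): 160, ("NY", "1880s", "M", "VA"): 90,
         ("VA", "1880s", "M", "VA"): 999}  # residents of their birth state are ignored
flows = outflow_by_cell(c1900, c1910)
print(flows[("1880s", "M", "VA")])  # 100
```
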
For the 1900 base year component of the instrument, we count the number of black out-migrants in each birth cohort-gender-state of birth cell present in each neighborhood of our sample in 1900 to obtain $\text{black\_basepop}_{icgs1900}$. To construct the predicted change in the number of blacks in neighborhood i in decade t1, we assign the estimated outflows according to the base year population for each cell and sum over all cells:
\[
\text{pred\_}\Delta\text{\_black}_{it_1-t_0} = \sum_{c=1}^{5} \sum_{g=1}^{2} \sum_{s=1}^{51} \frac{\text{black\_basepop}_{icgs1900}}{\text{black\_basepop}_{cgs1900}} \, \text{black\_outflow}_{cgst_1-t_0}, \tag{3}
\]
where $\text{black\_basepop}_{cgs1900}$ is the national sum of all black out-migrant individuals in the cell in 1900.16 Our instrument for $\Delta B_{ijt_1-t_0}$ is thus $\text{pred\_}\Delta\text{\_black}_{it_1-t_0}$.
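
A minimal sketch of the allocation in equation (3) follows. It assumes the cell-level outflows and the national 1900 base populations have already been computed. The function name, the neighborhood labels, and all numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def predicted_black_change(base_pop_nbhd, national_base, outflow):
    """Sketch of equation (3): allocate each cell's estimated national outflow to
    neighborhoods in proportion to their share of that cell's 1900 base population.

    base_pop_nbhd: neighborhood -> {cell: 1900 count in that neighborhood}
    national_base: cell -> national 1900 count of out-migrants in the cell
    outflow:       cell -> estimated decade outflow for the cell
    A cell is a (cohort, gender, birth_state) tuple; all names are hypothetical.
    """
    predicted = {}
    for nbhd, cells in base_pop_nbhd.items():
        predicted[nbhd] = sum(
            n * outflow.get(cell, 0) / national_base[cell]
            for cell, n in cells.items()
        )
    return predicted

# Toy example with two cells and two neighborhoods (made-up numbers).
va = ("1880s", "M", "VA")
ga = ("1880s", "M", "GA")
base = {"hex_1": {va: 30, ga: 0}, "hex_2": {va: 10, ga: 20}}
national = {va: 200, ga: 100}
cell_outflows = {va: 1000, ga: 400}
pred = predicted_black_change(base, national, cell_outflows)
print(pred)  # hex_1 receives 150 predicted arrivals, hex_2 receives 130
```

Because the shares are fixed in 1900, two neighborhoods with the same total black population can receive different predicted inflows when their residents come from different origin states.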

Our approach departs from much of the literature on the impact of immigration on local labor markets, where previous papers measure actual inflow rates across origin sources. Because there are no systematic data on internal migration flows in the United States prior to 1940, we must instead work with estimated outflows, which are inferred from data on state of birth and state of residence. However, we are able to observe a rich set of characteristics of black migrants living outside their birth state, in particular year of birth and gender, enabling a close approximation to the true size of outflows in each decade. These two approaches are thus in principle very similar. Following other work in this literature, our instrument relies on the fact that blacks departing their states of birth (primarily in the South) tended to follow a settlement pattern similar to that of blacks who had left their state in earlier decades, owing to the stability of railway routes and enduring social networks.17 We are able to use more aspects of the chain migration process than has generally been possible in previous work. In particular, we exploit the fact that migrants tended to cluster near previous arrivals from the same state of origin, generating plausibly exogenous variation in black populations at the neighborhood level. Furthermore, because of the source state variation, we can control for baseline neighborhood-level black population in our analysis.

For our instrument to have power, two types of variation are needed. First, within a given city, the distribution of blacks across neighborhoods must differ by state of origin. To illustrate the presence of variation in this dimension, appendix figure III provides city-level scatter plots showing by neighborhood the share of black men aged 20 to 29 in 1900 who were born in two exemplar pairs of sending states. Panel A shows that, for instance, neighborhoods within Boston, Brooklyn, Chicago, Cleveland, and Philadelphia all exhibit rich variation in the share of black men from this cohort originating in North Carolina as opposed to Virginia. Panel B shows the significant variation across neighborhoods in Chicago, Cincinnati, and St. Louis in the share of the black population originating in Kentucky versus Tennessee.

In addition to differential within-city sorting, we also require that variation exists across sending states over time. Appendix figure IV shows the estimated outflows from the thirteen most important sending states for black men aged 20 to 29 across each of the decades we study in this paper.18 Texas and Virginia provided relatively more out-migrants during the 1900–1910 decade, while South Carolina and Georgia were the most significant sending states by the 1920 to 1930 decade. Taken together, appendix figures III and IV suggest the potential predictive power of our instrument. The instrument is further strengthened by the fact that we compute its components separately by birth cohort and gender.19 Formal F-tests presented below confirm this suggestive evidence regarding the instrument's power.

V.  Analysis of White Flight in the Early Twentieth Century

To estimate the impact of black arrivals on white population dynamics, we begin with OLS estimation of equation (1). Results from this analysis are presented in table 2. Here we follow the literature and consider changes in population numbers while controlling for the average change in white population with city fixed effects.20 Between 1900 and 1910, we find that one black arrival has no statistically significant effect on white population dynamics. By the second decade (1910–1920), one black arrival is associated with a statistically significant .9 decline in the number of whites. This estimated relationship increases in precision and magnitude by our sample's final decade (1920–1930), with one black arrival now associated with the loss of 1.5 whites. The variation underlying the regressions for the latter two decades is shown in the scatter plots in figure 3. A linear trend line through the plot of black and white population differences indicates that the negative relationship is not driven by outliers and becomes larger in magnitude between the 1910s and 1920s. We investigate the potential for nonlinear flight effects in section VI.

Figure 3.

Black and White Population Dynamics

The scatter plots show the decadal change in white and black population in the 1,975 sample hexagons. See table 1 for details.

Table 2.
Baseline OLS and IV Results for Effect of Black Arrivals on White Departures
Dependent Variable: Change in White Population

                                        1900–1910     1910–1920     1920–1930
                                        Decade        Decade        Decade
OLS results
Change in black population              0.189         −0.908***     −1.492***
                                        (0.264)       (0.122)       (0.075)
R-squared                               0.088         0.139         0.258
IV results
Change in black population              −0.936        −1.886***     −3.389***
  LIML standard errors                  (0.577)       (0.227)       (0.246)
  Conley GMM spatial standard errors    (0.719)       (0.238)       (0.386)
Change in black population:
  Spatial subsample                     −0.871        −1.956***     −3.550***
  Bootstrapped standard errors          (1.178)       (0.368)       (0.805)
First stage
Predicted change in black population    0.918***      0.732***      0.878***
                                        (0.040)       (0.025)       (0.053)
F-test on first stage                   520.2         829.0         275.9
Observations                            1,975         1,975         1,975

See table 1 for sample and variable details. All regressions include city fixed effects. The instrumental variables regressions are estimated using limited information maximum likelihood estimation (LIML). The Conley (1999) spatial standard errors are estimated using GMM. The spatial subsample standard errors are generated using 25% spatially independent subsamples bootstrapped 100 times. ***Significant at 1%.

Given the concerns about endogeneity raised in the previous section, it would be inappropriate to directly interpret the OLS results for the later decades as evidence of flight behavior. However, they are suggestive, and the final decade coefficient estimate is of a magnitude that exceeds that which could be explained solely through the assumption of a perfectly inelastic neighborhood-level housing supply. To further consider these issues, we turn to the instrumental variables results also presented in table 2. The IV estimate is −.9 and insignificant in the 1900s but grows to −1.9 in the 1910s before reaching −3.4 in the 1920s. The latter two coefficient estimates are both highly significant, and in all three cases, F-tests demonstrate an extremely robust first stage.21 Taken together, the OLS and IV estimates suggest that whites were leaving neighborhoods in response to growing black arrivals but that this effect is masked in the OLS regressions, likely due to positive correlation between neighborhood-level demand shocks experienced by both blacks and whites. This result stands in contrast to that of Boustan (2010), who finds OLS coefficients that are negative in all years (1940–1970) and generally similar in magnitude to IV results from an estimation strategy similar to ours when measuring flight from the center city to the suburbs.
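
The masking of flight in OLS by shared demand shocks can be illustrated with a small simulation. The sketch below is not the paper's estimation code. It uses synthetic data with a flight effect of −2 chosen arbitrarily, and it computes a simple just-identified IV estimator after removing city means. With a single instrument and a single endogenous regressor, LIML and two-stage least squares give the same point estimate, so this estimator stands in for the LIML estimates reported in table 2. All variable names and parameter values are assumptions made for illustration.

```python
import numpy as np

def demean_by_group(v, groups):
    """Subtract group means, which is equivalent to including group fixed effects."""
    out = v.astype(float).copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= out[mask].mean()
    return out

def ols_and_iv(y, x, z, city):
    """Return (OLS slope, IV slope) of y on x with city fixed effects, instrumenting x with z."""
    y_t, x_t, z_t = (demean_by_group(v, city) for v in (y, x, z))
    beta_ols = (x_t @ y_t) / (x_t @ x_t)
    beta_iv = (z_t @ y_t) / (z_t @ x_t)  # just-identified IV estimator
    return beta_ols, beta_iv

# Synthetic data: the numbers below are arbitrary, not estimates from the paper.
rng = np.random.default_rng(0)
n = 5000
city = rng.integers(0, 10, size=n)
z = rng.normal(size=n)        # predicted black arrivals (instrument)
demand = rng.normal(size=n)   # unobserved demand shock shared by both groups
x = 0.8 * z + demand + rng.normal(size=n)                # actual black arrivals
y = -2.0 * x + 3.0 * demand + city + rng.normal(size=n)  # change in white population
beta_ols, beta_iv = ols_and_iv(y, x, z, city)
print(beta_ols, beta_iv)
```

In this setup the shared demand shock pushes the OLS slope toward zero, while the IV slope stays close to the simulated effect of −2, the same qualitative pattern as the gap between the OLS and IV rows of table 2.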

One potential concern with our approach is that spatial dependency across neighborhoods may cause our standard errors to be understated. Table 2 also presents standard errors computed using the GMM methodology that Conley (1999) proposed for addressing spatial clustering. The average ratio of the Conley standard error to the baseline IV standard error (estimated using LIML) is 1.57, indicating that spatial standard errors are roughly 60% larger than those estimated under the assumption of spatial independence. To further investigate the extent of spatial correlation in our data, we also run our specification on spatially independent subsamples, each comprising 25% of the overall sample. Appendix figure V presents a visualization of a subsample for Pittsburgh.22 In table 2 we report the results from 100 bootstraps of 25% spatially independent subsamples. Our coefficient estimates are essentially unchanged, and while the smaller sample size is associated with higher standard errors, they remain highly significant for the latter two decades. It is also interesting to note that if we adjust for the impact of the bootstrap sample size on standard error magnitude, both the Conley approach and the spatially independent subset approach suggest roughly the same level of attenuation in the uncorrected standard errors due to spatial dependence. Given this finding, except where noted, in the remaining analysis we report Conley standard errors.23

A second potential concern is the validity of our IV approach. The exogeneity of our instrument hinges on two critical assumptions. First, state-level black out-migration rates must not be influenced by differences in within-city cross-neighborhood pull factors that are systematically related to the origin state of early black settlers. Consider, for example, the fact that during the 1920s more blacks left Virginia than Texas. It cannot be the case that this state-level differential in out-migrants arose (at least partially) because during the 1920s levels of economic opportunity were higher in Chicago neighborhoods that received large numbers of Virginian blacks before 1900 than in Chicago neighborhoods that received large numbers of Texan blacks. Second, because by construction, our instrument will predict higher black population growth in neighborhoods that had relatively higher numbers of black residents in 1900, we need to generally assume that there are no systematic differences between these neighborhoods and low or no black neighborhoods that could potentially have a persistent confounding impact on migration patterns.

While we believe the first assumption to be quite defensible, the second is a potential concern. In 1900, even in those neighborhoods where they were most concentrated, blacks were generally a substantial minority. However, these neighborhoods were typically located in the urban core and hence may differ systematically in other potentially important dimensions. Fortunately, this concern is straightforward to address by controlling for the size of each neighborhood's 1900 black population in our IV analysis. In doing so, we ensure that we are identifying the flight effect based solely on variation in the pre-1900 source state composition of these neighborhoods' black populations, independent of the overall size of their black populations.

This concern is the first issue we address in table 3, which presents a number of robustness checks. We control for percent black in 1900 in the first set of checks and show that our results are essentially unchanged (slightly larger in magnitude). We also control for the number of blacks in 1900 in the next robustness check, but we cannot do this exercise for the 1900 to 1910 decade because the number of blacks in 1900 is used to compute change in black population. The results are reduced in magnitude somewhat but are still sizable and significant.

Table 3.
White Flight Effect Robustness Checks (IV)

                                   1900–1910     1910–1920     1920–1930
                                   Decade        Decade        Decade
Dependent Variable: Change in White Population
Change in black population         −0.936        −1.886***     −3.389***
  (baseline)                       (0.719)       (0.238)       (0.386)
Change in black population         0.703         −1.877***     −3.883***
                                   (0.939)       (0.379)       (0.554)
Percent black in 1900              −41.15***     −0.556        39.89*
                                   (15.256)      (13.901)      (23.113)
Change in black population                       −1.399*       −2.910***
                                                 (0.906)       (0.644)
Number of blacks in 1900                         −0.249        −0.343
                                                 (0.388)       (0.358)
Change in black population                       −1.889***     −3.429***
                                                 (0.314)       (0.524)
Percent black in 1900                            12.94         46.49**
                                                 (10.828)      (23.895)
Pretrend in white population                     0.373***      0.389***
                                                 (0.058)       (0.052)
Southern states IV                 −0.749        −2.605***     −3.947***
                                   (1.437)       (0.561)       (0.636)
No birth cohort IV                 8.413         −1.962***     −3.507***
                                   (10.686)      (0.260)       (0.442)
Observations                       1,975         1,975         1,975
Dependent Variable: Change in White Households
Change in black households         −0.625        −0.925***     −3.472***
                                   (0.859)       (0.178)       (0.482)
Observations                       1,975         1,975         1,975

See table 2 for sample and specification details. For the southern states IV, only black outflows from Alabama, Arkansas, Florida, Georgia, Louisiana, Mississippi, North Carolina, South Carolina, Tennessee, Texas, and Virginia are used. Spatial standard errors are reported for all specifications. ***Significant at 1%.

As a further robustness test, we also show our results with the inclusion of pretrends in white population in addition to percent black in 1900. Although the pretrend may absorb some of the true effect of white flight from black arrivals carrying over from the previous decade, our results for both the 1910 to 1920 and 1920 to 1930 decades are still significant and similar in magnitude to the baseline.24 We also present results from an alternate definition of our instrument where only southern states are used to compute black outflows (instead of all fifty states, as in our original instrument). Our results are again similar to the baseline, suggesting that as expected, migration shocks out of the South are driving our instrument. The estimates of the flight effect are also quantitatively similar if we drop birth cohort or both birth cohort and gender from the instrumental variable calculation, approaches that reflect a more general chain migration approach.

Finally, one might be concerned that black households are smaller on average than white households, leading to an exaggerated appearance of “flight” when a white family is replaced by a black family. Using the relationship to the head-of-household variable, we created an alternate data set using only heads of household in the census and replicated our analysis at the household level.25 The results from the 1920s indicate that the arrival of one black household led to the departure of 3.5 white households, strongly suggesting that differences in household composition are not driving our findings. We also show in appendix table II that the results are generally similar when the estimation is run on each city and decade separately.26

The white population in our sample cities was split relatively evenly between first-generation immigrants, second-generation immigrants, and third-or-more-generation whites (see table 1). A natural question to ask about these results concerns the subgroups engaged in white flight. Table 4 reports the results of the white flight IV regressions by white subgroup. Between .7 and 1.6 white natives left their neighborhood in response to each black arrival in all decades. The acceleration of the overall white flight effect appears to be driven in part by the emergence of such behavior by first- and second-generation immigrants. While there is no evidence of causal departures in the 1900s, the coefficient by the 1920s is close to −1 for both groups, suggesting that immigrants account for more of the flight effect over time.27

Table 4.
White Flight by Subgroup

                                   1900–1910     1910–1920     1920–1930
                                   Decade        Decade        Decade
Dependent Variable: Change in white third-generation population
Change in black population         −1.678***     −0.752***     −1.351***
                                   (0.495)       (0.172)       (0.170)
Dependent Variable: Change in second-generation population
Change in black population         −0.192        −0.579***     −1.025***
                                   (0.261)       (0.102)       (0.153)
Dependent Variable: Change in first-generation population
Change in black population         1.082***      −0.467***     −0.936***
                                   (0.351)       (0.120)       (0.132)
Observations                       1,975         1,975         1,975

See table 1 for sample and variable details. All regressions include city fixed effects. The instrumental variables regressions are estimated using limited information maximum likelihood estimation (LIML). Conley (1999) spatial standard errors are reported in all specifications. ***Significant at 1%.

Another potential source of flight is that of northern-born blacks away from black migrants from the South. Historical work emphasizes that even higher-class urban blacks were largely confined to the ghetto; however, the most economically successful blacks may have moved out to the periphery of the ghetto when new migrants arrived (Massey & Denton, 1993). Appendix table IV reports the results of a regression that relates changes in southern black population to changes in northern black population. Both the OLS and IV effects are positive, although the estimated causal effect declines from .8 to .05 across the decades we study. These results suggest that at least at the neighborhood level, northern blacks were attracted to the same neighborhoods chosen by southern blacks, although this preference attenuated over time. We find no evidence that northern blacks exhibited the same type of flight behavior as did white immigrants during this period.

Finally, a possible concern is that our results are driven by the choice of 800 meter hexagons as a neighborhood definition. Appendix table V replicates our main analysis using instead 400 meter and 1,600 meter hexagons as neighborhood definitions and demonstrates that our results are robust to choice of hexagon size. The table also presents results using instead 1940 census tract boundaries as neighborhood definitions. While generally consistent with our main findings, this approach leads to larger estimated levels of flight, with the most marked difference occurring in the first decade of our sample. One possible explanation is that census boundaries yield neighborhood definitions that are more salient than arbitrarily located hexagons in terms of white perceptions of black in-migration. Conversely, there is concern about the large heterogeneity in census tract size. The 10th and 90th percentile tracts differ in size by more than a factor of 10 (see appendix figure VI for a visualization of 800 meter hexagons and 1940 census tracts). This concern is somewhat ameliorated in the last decade of our sample, when tract sizes are closely related to population density. Here, using census tracts yields a flight estimate 17% higher than that found when using 800 meter hexagons. However, in the early decades, the link between density and tract size is much weaker. Thus, we believe our current approach to be the most conservative.

VI.  How Important Was Flight for the Rise of Segregation in U.S. Cities?

In this section we use our preferred causal estimates of white flight to construct a series of counterfactuals aimed at understanding how much of the observed increase in segregation over the 1900 to 1930 period can be attributed to sorting as opposed to discriminatory institutions. We begin with a simple exercise focusing on the 1920 to 1930 decade to demonstrate the link between our coefficient estimates and the underlying population dynamics for whites and blacks. Next, we employ a range of assumptions on the sorting behavior of newly arrived black residents in each city (representing the extent of institutional barriers constraining where black families could live) and then apply our estimates to predict neighborhood-level white population changes associated with the resulting distribution of black in-migrants. This counterfactual exercise allows us to roughly decompose the relative contributions of white flight and housing market discrimination to the growth in segregation in each decade.

A.  An Illustration for the 1920–1930 Decade

We begin with a simple exercise in table 5 to demonstrate the link between our coefficient estimates from the instrumental variables analysis and underlying population dynamics. Focusing on the 1920 to 1930 decade, we use the complete set of coefficient estimates (including the full set of city fixed effects) to predict each neighborhood's change in white population as a function of its 1900 black share and its observed change in black population between 1920 and 1930.28 These neighborhood-level predictions are aggregated to yield a sample-wide average.

Table 5.
White Flight by Neighborhood Black Share

                                                             1920 Black Share
                                               Full Sample   0–5%         5–10%        10–20%       More Than 20%
Coefficient on black difference, 1920–1930     −3.389***     −7.632***    −4.435***    −3.887***    −2.159***
Standard error                                 (0.246)       (0.935)      (1.291)      (1.143)      (0.328)
Mean white population in 1920                  3,663         3,632        3,846        3,560        4,397
Mean black population in 1920                  133           28           298          595          2,138
Mean change in black population, 1920–1930     118           51           363          485          904
Implied change in white population             283           470          −506         −731         −1,622
Implied percent change in white population     8%            13%          −13%         −21%         −37%
N                                              1,975         1,680        134          109          52

All specifications include share black in 1900 as well as city fixed effects. See table 1 for sample details. The instrumental variables regressions are estimated using limited information maximum likelihood estimation (LIML). The implied change in white population is predicted from the regression on each subsample. ***Significant at 1%.

The results for the full sample are presented in the first column of table 5. The mean white population in 1920 across the sample is 3,663, and the mean black population is 133. The predicted average change in neighborhood white population based on our simple prediction exercise is 283 individuals. This result illustrates the fact that while neighborhoods with larger numbers of black in-migrants were losing whites relative to those with few black in-migrants, on average, white populations were increasing. This relationship is captured in the city-level fixed effects. We note that we are generally seeing larger numbers of black in-migrants into neighborhoods with larger black populations. However, our baseline results do not necessarily require that the causal relationship between the number of black in-migrants and the number of white out-migrants differ across neighborhoods with differing black shares.

In particular, we may expect the relationship between black inflows and white outflows to be nonlinear in light of the literature on neighborhood “tipping” (Grodzins, 1957; Schelling, 1971; Card, Mas, & Rothstein, 2008). We proceed with a simple nonparametric approach in the remaining columns of table 5, partitioning the sample by 1920 black share and rerunning our specification for neighborhoods with 0 to 5% black share, 5% to 10% black share, 10% to 20% black share, and over 20% black share. Although the estimated white flight coefficient declines in magnitude as the 1920 black share increases, the implied average change in white population is positive (470) for only the 0 to 5% black neighborhoods. Neighborhoods in the 5% to 10% black range are predicted to lose on average 13% of their white population. For the two subsamples with the largest black shares, our model predicts even larger white population losses. In particular, the −2.2 white flight coefficient for the over 20% black share subsample implies a loss of 37% of a neighborhood's white population. Thus, although we observe evidence of white flight across the black share distribution, white outflows are relatively greatest in neighborhoods with the proportionally largest black population at the start of the decade. These findings are generally consistent with the tipping literature.
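
The percentage figures quoted from table 5 can be recovered from the table's own rows. The short sketch below does only that arithmetic: it divides the implied change in white population by the 1920 mean white population for each column. The dictionary and function names are illustrative, and the numbers are copied from table 5.

```python
# Values copied from table 5:
# (mean white population in 1920, implied change in white population).
table5 = {
    "Full sample": (3663, 283),
    "0-5%": (3632, 470),
    "5-10%": (3846, -506),
    "10-20%": (3560, -731),
    "More than 20%": (4397, -1622),
}

def implied_percent_change(mean_white_1920, implied_change):
    """Implied change as a share of the 1920 mean white population, in whole percent."""
    return round(100 * implied_change / mean_white_1920)

percent = {group: implied_percent_change(*row) for group, row in table5.items()}
print(percent)
# {'Full sample': 8, '0-5%': 13, '5-10%': -13, '10-20%': -21, 'More than 20%': -37}
```
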

B.  Assessing the Relative Importance of Institutional Barriers and White Departures

Finally, we leverage our empirical results to estimate the relative importance of white flight, as opposed to institutional barriers on the locational choices of black households, in explaining the observed rise in segregation over our study period. We focus exclusively on the dissimilarity measure of segregation because, unlike isolation measures, dissimilarity measures are not sensitive to proportional changes in relative population sizes. Furthermore, nearly all of the increase in dissimilarity in large cities occurred by 1930 (see figure 1).
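
The invariance property that motivates the choice of dissimilarity can be checked numerically. The sketch below assumes the standard textbook index of dissimilarity, half the sum over neighborhoods of the absolute difference between each neighborhood's share of the city's black population and its share of the city's white population, and a simple unadjusted isolation index. The paper's exact definitions may differ, and the four-neighborhood city is made up.

```python
def dissimilarity(black, white):
    """Standard index of dissimilarity: 0.5 * sum_i |b_i/B - w_i/W|."""
    total_b, total_w = sum(black), sum(white)
    return 0.5 * sum(abs(b / total_b - w / total_w) for b, w in zip(black, white))

def isolation(black, white):
    """Unadjusted isolation: average black share of the neighborhood where a black resident lives."""
    total_b = sum(black)
    return sum((b / total_b) * (b / (b + w)) for b, w in zip(black, white) if b + w > 0)

# Hypothetical city with four neighborhoods.
black = [90, 10, 0, 0]
white = [100, 300, 300, 300]
tripled = [3 * b for b in black]  # proportional growth in every neighborhood

d_before, d_after = dissimilarity(black, white), dissimilarity(tripled, white)
i_before, i_after = isolation(black, white), isolation(tripled, white)
print(d_before, d_after)  # unchanged by proportional growth
print(i_before, i_after)  # isolation rises with the black population share
```

Tripling the black population in every neighborhood leaves each neighborhood's share of the city's black population unchanged, so dissimilarity stays fixed while isolation increases.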

To identify the relative importance of white flight compared with institutional constraints on where blacks could live, we must first identify a counterfactual baseline estimate of what segregation levels would have been if new black migrants had sorted based solely on their own preferences. In this counterfactual world, black arrivals from the South could have sorted into neighborhoods without facing institutional barriers or triggering white flight. Because of the inherent difficulty of this exercise, we produce several sets of counterfactual estimates that we believe span the range of possible outcomes. Details on how these counterfactuals were constructed can be found in the appendix.

Focusing on our preferred model, the most striking finding is the sharp increase in the contribution of flight in each subsequent decade (presented in panel C of appendix table VI). While the counterfactual results suggest that the overall effect of flight and institutions was relatively small at the start of the twentieth century, we estimate that flight was responsible for 26.8% of their combined impact in 1910. By 1920, flight accounted for 33.9%, and by the end of the 1920s, the decade of greatest increase in segregation, white flight was responsible for 50.4% of their combined impact.

The results from this counterfactual exercise demonstrate that decentralized sorting behavior by whites had a quantitatively important and increasing impact on the rise of residential segregation between 1900 and 1930. Our findings suggest that the transition from institutional barriers to white flight as the driving force behind segregation in U.S. cities began several decades earlier than previously thought. Although the Fair Housing Act and other legislative and legal remedies have greatly reduced (without fully eliminating) the barriers faced by blacks in the housing market, white flight from black neighbors is an individual behavior that cannot be limited by local or federal government agencies. Thus, a key finding from this exercise is that segregation could have emerged even in the absence of discriminatory barriers in the housing market through the mechanism of population sorting.

VII.  Conclusion

This paper studies why racial segregation emerged in American cities, providing the first empirical analysis of white flight and its role in the emergence of the black ghetto. Leveraging a new data set, our empirical analysis identifies the residential response of white individuals to the initial influx of rural blacks into the industrial cities of the North on the eve of World War I. We ask to what extent white departures in response to black arrivals can account for the rise of segregation in American cities. Because restrictive covenants and racial zoning ordinances are no longer legal and racial violence and housing discrimination are less severe in the present day, our analysis to some extent investigates whether segregation could have emerged in the current institutional and legal environment.

Our analysis suggests that the dynamics of white populations likely played a key role in the sharp increase in racial segregation observed over the 1900 to 1930 period. We show that black arrivals caused an increasing number of white departures in each decade: by the 1920s, one black arrival was associated with the loss of more than three white individuals. The robustness of these findings and the way in which they vary across time suggest that changes in white animus were a key factor in rising racial segregation.

White flight was not simply a response to deplorable ghetto conditions developed over decades of black migration to northern cities. Instead, whites appear to have been fleeing black neighbors as soon as the migration from the South got underway, and these market decisions had important impacts on the aggregate level of racial segregation in cities. These findings add nuance to our understanding of the persistence of segregation in the United States, suggesting that even the complete elimination of racial discrimination in housing markets may fail to bring about significant racial integration so long as sizable numbers of white individuals remain willing to move to avoid having black neighbors.

An important question raised by the findings of this paper is what led to the accelerated white flight effect observed over the 1900–1930 period. Moving forward, understanding why white Americans fled black neighbors at increasing rates and where they settled subsequently is crucial to understanding why American cities became and remain sharply segregated by race. Increased awareness of future black arrivals, the failure of racial zoning ordinances, impacts of racial transition on housing prices, and improvements in urban transit infrastructure are all potential explanations that warrant further investigation.

Notes

1

For instance, the ten northern U.S. cities we study in this paper experienced 97% of their overall twentieth-century increase in dissimilarity and 63% of their increase in isolation by 1930.

2

A related issue is the question of housing price or rent dynamics during this period. Unfortunately, we are aware of no source for systematic spatially delineated data on prewar housing prices or rents.

3

This computation uses the center city status variable from IPUMS samples for 1880 to 1930.

4

Manhattan is the one exception. This borough actually lost population during the 1920s. Our neighborhood boundaries are defined to be time invariant. As a result, none of this reported growth is related to annexations or growth of city boundaries.

5

Formal adoption of redlining by the Home Owners' Loan Corporation and the FHA, with its implications for discrimination in mortgage assistance, was not a factor prior to the 1934 passage of the National Housing Act.

6

This sharp increase in northern urban segregation occurred against a backdrop of nationally rising segregation levels. Recent work using a household-level measure finds that segregation levels doubled between 1880 and 1940 (Logan & Parman, 2017).

7

Isolation peaked in 1970, having risen from .23 in 1900 to .66 in 1970. However, 63% of the overall increase had occurred by 1930. Dissimilarity peaked in 1950, with 97% of the 1900 to 1950 increase (from .64 in 1900 to .81 in 1950) occurring between 1900 and 1930.
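
The shares in this note can be translated into approximate 1930 index levels. The short calculation below is only an illustrative sketch: it uses the rounded figures quoted in the note, and the function name `implied_level` is our own, so the resulting values are implied approximations rather than figures reported in the paper.

```python
def implied_level(start, peak, share_of_increase):
    """Index level implied when a given share of the start-to-peak rise has occurred."""
    return start + share_of_increase * (peak - start)


# Rounded inputs taken from the note; the outputs are approximations only.
isolation_1930 = implied_level(0.23, 0.66, 0.63)      # isolation, 1900 to 1970 peak
dissimilarity_1930 = implied_level(0.64, 0.81, 0.97)  # dissimilarity, 1900 to 1950 peak

print(f"Implied 1930 isolation: {isolation_1930:.2f}")          # 0.50
print(f"Implied 1930 dissimilarity: {dissimilarity_1930:.2f}")  # 0.80
```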

8

A detailed description of the construction of this data can be found in Shertzer, Walsh, and Logan (2016).

10

These street files can now be found at the Union Army Project's website (www.uadata.org). For Detroit, Cleveland, and St. Louis, we used 1940 street maps produced by John Logan of Spatial Structures in the Social Sciences at Brown University.

11

Appendix figure VI provides a visual comparison between our 800 meter hexagons and 1940 census tracts. As we discuss, our results are not sensitive to the choice of hexagon size.

12

Our approach is similar to Boustan (2010) except we assign black inflows to neighborhoods instead of cities.

13

Note that because our neighborhoods (hexagons) are all of identical size, changes in population are equivalent to changes in population density.

14

One potential remaining concern is that a levels-based model will implicitly place a higher weight on more heavily populated neighborhoods. To mitigate this concern, we trim the sample at the 1st and 99th percentiles of black and white population changes. We also trim at the 1st and 99th percentiles of black and white head of household changes to facilitate the robustness check in table 3. As a further robustness check, in appendix table I, we demonstrate that our results are robust to stratification of the sample by population quartile. We also modified our specification to more closely match what Peri and Sparber (2011) recommend in the immigration context, in particular scaling both the change in blacks and change in whites by city population at the start of the decade. Our results are largely unchanged (available on request).
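
The trimming step described in this note can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the column names `d_black` and `d_white` are hypothetical, percentiles are computed over the pooled sample, and observations exactly at a cutoff are retained. The paper does not specify these details.

```python
from statistics import quantiles


def percentile_bounds(values, lower=1, upper=99):
    """Return the (lower, upper) percentile cut points of a list of numbers."""
    cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
    return cuts[lower - 1], cuts[upper - 1]


def trim_sample(rows, columns, lower=1, upper=99):
    """Keep rows lying within the percentile bounds of every listed column."""
    bounds = {
        col: percentile_bounds([row[col] for row in rows], lower, upper)
        for col in columns
    }
    return [
        row for row in rows
        if all(bounds[col][0] <= row[col] <= bounds[col][1] for col in columns)
    ]


# Hypothetical neighborhood-level changes in black and white population.
rows = [{"d_black": 100 - i, "d_white": i} for i in range(101)]
trimmed = trim_sample(rows, ["d_black", "d_white"])
print(len(rows), len(trimmed))  # 101 99
```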

15

We note that the black populations in northern cities in 1880, the next earliest year for which microdata samples are available, are generally too small to have statistical power in predicting where future black arrivals would settle.

16

We shift the cohorts for each decade so that individuals of the same age are assigned in the same proportion across time. For instance, outflows of men from Alabama who were born in the 1900–1909 decade and were thus between the ages of 21 and 30 in 1930 were assigned to neighborhoods according to the distribution of men born in Alabama aged 21 to 30 present in 1900.
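
One way to read the assignment rule in this note is as a shift-share calculation over state of birth, gender, and age cells. The sketch below is an assumption-laden illustration rather than the paper's exact instrument: the function name, the dictionary layout, and the choice to normalize shares within the set of sample neighborhoods are all ours.

```python
from collections import defaultdict


def predicted_inflows(base_counts, outflows):
    """Assign each cell's outflow to neighborhoods by base-year settlement shares.

    base_counts: {(neighborhood, state, gender, age_group): residents in 1900}
    outflows:    {(state, gender, age_group): migrants leaving the state in a
                  decade, with age measured at the end of that decade}
    """
    # Total base-year residents in each (state, gender, age_group) cell.
    cell_totals = defaultdict(float)
    for (hood, state, gender, age), count in base_counts.items():
        cell_totals[(state, gender, age)] += count

    predicted = defaultdict(float)
    for (hood, state, gender, age), count in base_counts.items():
        cell = (state, gender, age)
        if cell_totals[cell] == 0:
            continue  # no base-year residents in this cell, so no share to use
        share = count / cell_totals[cell]
        predicted[hood] += share * outflows.get(cell, 0.0)
    return dict(predicted)


# Hypothetical example: Alabama-born men aged 21 to 30 lived in hexagons A and B in 1900.
base = {("A", "Alabama", "M", "21-30"): 3, ("B", "Alabama", "M", "21-30"): 1}
flows = {("Alabama", "M", "21-30"): 100}  # men born 1900-1909 leaving by 1930
result = predicted_inflows(base, flows)
print(result)  # {'A': 75.0, 'B': 25.0}
```

Because the age groups are held fixed while the birth cohorts shift each decade, the same 1900 shares are reused for every decade's outflows.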

17

See Grossman (1991) for a discussion of the importance of rail routes for black migration to the North.

18

These thirteen states represent between 87% and 92% of total black outflows in the years we study.

19

We construct our baseline instrument using state of birth, gender, and birth cohort cells to reflect the fact that black migration to northern cities was largely based on employment, and information on jobs in particular neighborhoods would likely have been tailored to individuals of a similar age and gender. However, we show our results are largely unchanged when using a simplified instrument that uses only state of birth and gender, reflecting a more general chain migration process, in table 3. Our results are also similar if we use only state of birth to construct the instrument (available on request).

20

As discussed in section III, we drop the 1st and 99th percentiles of both black and white population changes to ensure that our results are not being driven by outliers in the data.

21

Given the strength of our instrument, one may be concerned that our instrument is so correlated with the endogenous variable that there is no scope for the IV approach to shed unwanted endogeneity from the estimation. Exploratory analysis suggests that such overfitting is not a concern. For example, the correlation between our instrument and endogenous variable in 1930 is .234 and the R² from a regression of the endogenous variable on the instrument is only .116.

22

These subsamples are constructed one city at a time by a simple select and reject algorithm. The algorithm randomly selects a candidate neighborhood for the subsample and tests for adjacency with the current elements of the subsample. If the candidate neighborhood is adjacent to a current subsample member, it is dropped. Otherwise it is added to the sample. This process is repeated until a 25% subsample has been obtained.
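
The select-and-reject procedure in this note can be sketched as below. This is an illustrative reading, not the authors' implementation: the adjacency dictionary, the function name, and the fixed random seed are assumptions, and a rejected candidate is treated as permanently discarded. If the target share cannot be reached, the function simply returns the smaller set it found.

```python
import random


def nonadjacent_subsample(neighbors, fraction=0.25, seed=0):
    """Draw a random subsample of neighborhoods in which no two are adjacent.

    neighbors: {neighborhood_id: set of adjacent neighborhood_ids} for one city
    """
    rng = random.Random(seed)
    target = int(fraction * len(neighbors))
    candidates = list(neighbors)
    rng.shuffle(candidates)  # visiting in shuffled order = random draws without replacement

    selected = set()
    for hood in candidates:
        if len(selected) >= target:
            break
        # Reject the candidate if it touches any neighborhood already selected.
        if neighbors[hood].isdisjoint(selected):
            selected.add(hood)
    return selected


# Hypothetical city: 40 neighborhoods arranged in a single row.
city = {i: {j for j in (i - 1, i + 1) if 0 <= j < 40} for i in range(40)}
sample = nonadjacent_subsample(city)
print(len(sample))  # 10
```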

23

As noted above, an additional concern with our basic approach is the potential for a small number of very populous neighborhoods to drive our coefficient estimates. This concern motivates our decision to trim the sample at the 1st and 99th percentiles of population changes. However, as a further robustness check, we reran our analysis on subsets of our sample associated with the lowest quartile, highest quartile, and interquartile range of population. These results (presented in appendix table I) show no qualitative difference between results in the three subsamples and our results for the entire sample. The largest point estimate occurs on the interquartile subsample for the 1920 to 1930 decade, allaying concerns about our results being driven by a few highly populated neighborhoods.

24

As an additional test along these lines, in appendix table III we evaluate how well this decade's black inflows predict the previous decade's change in white population. Coefficient estimates are much too small in magnitude to be driving our results.

25

The head of household data set contains some significant outliers due to a fraction of a black head of household being assigned to a neighborhood, leading to very large ratios of blacks to black heads of household in areas with very few blacks. Outliers also arise for white household heads due to a large institution containing many whites but no household heads. We trim at the 99th percentile of the ratio of white to white household heads as well as black to black household heads to remove these outliers in both the head of household data set and the main data set.

26

An exception is Cleveland over the 1920 to 1930 decade. The instrument works poorly for this city-decade pair because the black population was tiny in 1900 and located in a different part of the city from where the ghetto emerged in the 1920s (near the Central Avenue District).

27

The coefficient on the change in first-generation immigrant population is actually positive and significant in the first decade. This result could be driven by recent European immigrants being drawn to the businesses and institutions that catered to the needs of recently arrived families regardless of origin and that may have been more likely to develop in neighborhoods that experienced high rates of black in-migration.

28

We use the estimates presented in the second row of table 3 that include controls for the percent black in 1900 as we believe this to be our most robust specification. The standard errors presented in table 5 are from the baseline IV specification that assumes spatial independence because of the difficulty of obtaining spatial standard errors for the smallest subsamples.

REFERENCES

Altonji, J., and D. Card, “The Effects of Immigration on the Labor Market Outcomes of Less-Skilled Natives” (pp. 201–234), in J. Abowd and R. Freeman, eds., Immigration, Trade, and the Labor Market (Chicago: University of Chicago Press, 1991).
Boustan, Leah Platt, “Was Postwar Suburbanization ‘White Flight’? Evidence from the Black Migration,” Quarterly Journal of Economics 125:1 (2010), 417–443.
Boustan, Leah Platt, Price V. Fishback, and Shawn Kantor, “The Effect of Internal Migration on Local Labor Markets: American Cities during the Great Depression,” Journal of Labor Economics 28:4 (2010), 719–746.
Card, David, Alexandre Mas, and Jesse Rothstein, “Tipping and the Dynamics of Segregation,” Quarterly Journal of Economics 123:1 (2008), 177–218.
Cascio, Elizabeth U., and Ethan G. Lewis, “Cracks in the Melting Pot: Immigration, School Choice, and Segregation,” American Economic Journal: Economic Policy 4:3 (2012), 91–117.
Conley, Timothy G., “GMM Estimation with Cross Sectional Dependence,” Journal of Econometrics 92:1 (1999), 1–45.
Cutler, David M., Edward L. Glaeser, and Jacob L. Vigdor, “The Rise and Decline of the American Ghetto,” Journal of Political Economy 107:3 (1999), 455–506.
Drake, Saint C., and Horace R. Cayton, Black Metropolis: A Study of Negro Life in a Northern City (Chicago: University of Chicago Press, 1970).
Duncan, Otis Dudley, and Beverly Duncan, “A Methodological Analysis of Segregation Indexes,” American Sociological Review 20:2 (1955), 210–217.
Farley, Reynolds, and Walter R. Allen, The Color Line and the Quality of Life in America (New York: Russell Sage Foundation, 1987).
Grodzins, Morton, “Metropolitan Segregation,” Scientific American 197:4 (1957), 33–41.
Grossman, James R., Land of Hope: Chicago, Black Southerners and the Great Migration (Chicago: University of Chicago Press, 1991).
Kucheva, Yana, and Richard Sander, “The Misunderstood Consequences of Shelley v. Kraemer,” Social Science Research 48 (2014), 212–233.
Logan, T., and J. Parman, “The National Rise in Residential Segregation,” Journal of Economic History 77:1 (2017), 127–170.
Massey, Douglas S., and Nancy A. Denton, American Apartheid: Segregation and the Making of the Underclass (Cambridge, MA: Harvard University Press, 1993).
Peri, Giovanni, and Chad Sparber, “Assessing Inherent Model Bias: An Application to Native Displacement in Response to Immigration,” Journal of Urban Economics 69:1 (2011), 82–91.
Saiz, Albert, and Susan Wachter, “Immigration and the Neighborhood,” American Economic Journal: Economic Policy 3:2 (2011), 169–188.
Schelling, Thomas C., “Dynamic Models of Segregation,” Journal of Mathematical Sociology 1:2 (1971), 143–186.
Shertzer, Allison, Randall P. Walsh, and John R. Logan, “Segregation and Neighborhood Change in Northern Cities: New Historical GIS Data from 1900 to 1930,” Historical Methods 49:4 (2016), 187–197.
Wozniak, Abigail, and Thomas J. Murray, “Timing Is Everything: Short-Run Population Impacts of Immigration in US Cities,” Journal of Urban Economics 72:1 (2012), 60–78.
Wright, Richard, Mark Ellis, and Michael Reibel, “The Linkage between Immigration and Internal Migration in Large Metropolitan Areas in the United States,” Economic Geography 73:2 (1997), 234–254.

Author notes

Support for this research was provided by the National Science Foundation (SES-1459847). Additional support was provided by the Central Research Development Fund and the Center on Race and Social Problems at the University of Pittsburgh. We are grateful to Brian Cadena, Terra McKinnish, Elizabeth Cascio, Ethan Lewis, Leah Platt Boustan, Bob Margo, Lowell Taylor, Brian Kovak, Spencer Banzhaf, Tom Mroz, Aimee Chin, and Judith Hellerstein, and seminar audiences at the NBER Summer Institute (DAE), ASSA Meetings, Carnegie Mellon, Michigan, Georgia State, Mississippi State, Colorado, and the University of Western Australia for helpful comments. We thank John Logan for assistance with enumeration district mapping and for providing 1940 street files. We also thank David Ash and the California Center for Population Research for providing support for the microdata collection, Carlos Villarreal and the Union Army Project (www.uadata.org) for the 1930 street files, Jean Roth for her assistance with the national Ancestry.com data, and Martin Brennan and Jean-François Richard for their support of the project. We are grateful to Ancestry.com for providing access to the digitized census manuscripts. Antonio Diaz-Guy, Phil Wetzel, Jeremy Brown, Andrew O'Rourke, Aly Caito, Loleta Lee, and Zach Gozlan provided outstanding research assistance.