Abstract

This paper investigates how accountability pressures under No Child Left Behind (NCLB) may have affected students’ rate of overweight. Schools facing pressure to improve academic outcomes may reallocate their efforts in ways that have unintended consequences for children's health. To examine the impact of school accountability, we create a unique panel dataset containing school-level data on test scores and students’ weight outcomes from schools in Arkansas. We code schools as facing accountability pressures if they are on the margin of making Adequate Yearly Progress, measured by whether the school's minimum-scoring subgroup had a passing rate within 5 percentage points of the threshold. We find evidence of small effects of accountability pressures on the percent of students at a school who are overweight. This finding is little changed if we control for the school's lagged rate of overweight or use alternative ways to identify schools facing NCLB pressure.

1.  Introduction

The federal No Child Left Behind (NCLB) legislation was passed in 2002, ushering in a national era of test-based school accountability programs. Although states have some latitude in defining specific requirements (Davidson et al. 2013), all states were required to define and implement stringent accountability standards and to penalize schools that failed to meet these standards.1 A substantial amount of research into NCLB and the state-level accountability movement that preceded it has documented many ways that schools respond to accountability pressures (see the review chapter by Figlio and Ladd 2015; also Rouse et al. 2013; and Reback, Rockoff, and Schwartz 2014). Overall, test scores improve—sometimes quite substantially—after accountability is enacted (Dee and Jacob 2010).2 Some of the gains have been driven by schools’ changes in policies and practices that devote more time and/or effort to instruction in tested subjects (see, e.g., Chiang 2009; Reback, Rockoff, and Schwartz 2014; Chakrabarti 2014). Others are due to more strategic behavioral responses to the incentives, such as shifting effort toward students on the cusp of passing (Reback 2008; Neal and Schanzenbach 2009), strategically assigning students to special education or English Language Learner status (Cullen and Reback 2006; Chakrabarti 2013a), suspending low-performing students (Figlio 2006), or outright cheating (Jacob and Levitt 2003; Sass, Apperson, and Bueno 2015).

As described in Figlio and Ladd (2015), policy makers must make tradeoffs in determining how broad-based to make their school accountability systems. Educational stakeholders typically value not only math and reading achievement, but also a broader array of outcomes, including higher-order learning that cannot be easily measured on standardized tests, noncognitive skills, achievement in other subjects, and broader life outcomes (such as citizenship and health). It is difficult, however, to reliably measure such a broad set of outcomes, and even if it were possible to measure them all, it would be difficult to assign optimal weights to each outcome in an accountability framework. As such, most accountability systems under NCLB are narrowly focused on math and reading achievement, elevating the importance of these outcomes and providing incentives to narrow the scope of instruction.

In particular, because schools are not directly held accountable for student health, there is scope for NCLB incentives to inadvertently harm outcomes in this area. In this paper, we measure whether the pressures of test-based accountability inadvertently led to increases in students’ rates of obesity and overweight. Many policies and practices of schools have potential impacts on students’ calorie expenditures and food intake, and thus ultimately their body weight. For example, recess and gym classes offer children opportunities to burn calories during the day. Foods served during school meals, as well as extra foods sold as fundraisers, at snack bars and vending machines, and treats given as rewards or during celebrations, all have potential to affect dietary intake and habit formation. Schools can also influence (with homework assignments) the amount of free time outside of school that children have available. The incentives from test-based accountability may induce schools to alter their practices along any or all of these dimensions. To increase time in instruction of tested subjects, for example, schools may reduce time spent in recess and/or physical education (PE). They may increase the length of the school day, require after-school tutoring, or assign more homework. New financial pressures may induce school administrators to try to raise new funds through food-based fundraisers or outside food and beverage contracts. Schools may use food as rewards to motivate students.3 As described later in this paper, there is evidence that schools use these types of strategies, and they have potential to impact student health as measured by the rate of overweight and obesity.

To examine the potential spillover of school accountability programs onto the rate of overweight, we create a unique dataset that combines NCLB rules, test scores, and the percent of students whose body mass index (BMI) is above the cutoff for “overweight,” using data from Arkansas. NCLB required states to set thresholds, based on the percent of students achieving a “passing” mark on standardized tests, that determine whether schools in that state are making Adequate Yearly Progress (AYP) toward the ultimate goal of having all students achieve a passing mark on the tests. As described in this paper, we argue that schools near the passing threshold are most likely to face higher levels of accountability “pressure” under NCLB and are most likely to adopt new policies and practices that may spill over to students’ rate of overweight. Empirically, we find that schools in this pressured group see moderate increases in their rates of overweight, and these findings are robust to a variety of specifications, such as including the school's lagged rate of overweight. The results broaden our understanding of the many ways that schools have responded to NCLB, and add to the scant literature on the impacts of testing-based school accountability on child health.

2.  Background

Impacts of Accountability

A large literature on NCLB and the state accountability programs that preceded it has documented various ways that test-based accountability has changed aspects of how schools operate (see Figlio and Ladd 2015 for a recent review). The literature is built on a variety of research designs, some of which are more well-identified than others.

Much of what we know about the impacts of accountability has been drawn from cleanly identified, quasi-experimental approaches. For example, one approach commonly used in the literature is a “program introduction” design that compares outcomes before and after accountability was introduced. Some of these studies also use variation across states prior to NCLB, when various states were adopting their own accountability programs (e.g., Jacob 2005; Ballou and Springer 2008; Neal and Schanzenbach 2009). Such an approach is typically limited to the short-run impacts of accountability in the few years after its introduction. Another well-identified, quasi-experimental approach is to use a regression discontinuity (RD) design to isolate the response to failing to meet an annual goal (Chiang 2009; Chakrabarti 2013b, 2014; Rouse et al. 2013). In these studies, schools with passing rates just below the AYP threshold are compared to schools with passing rates just above, under the assumption that there are no important differences between schools on either side of that threshold (other than that one makes AYP and the other does not). Studies using both the program introduction and RD approaches have also successfully combined these basic designs with strategies that leverage differential incentives under the accountability policies. For example, studies have made comparisons across high- versus low-stakes grade levels and subjects, and between so-called “bubble” students (those on the cusp of passing the test) and classmates who are more solidly in passing or failing territory. To be sure, this quasi-experimental literature is extremely important, and much of what we know about the impacts of NCLB comes from these studies.

A series of thoughtful papers by Dee and Jacob (2010, 2011) documents the overall impact of NCLB. Because the policy was adopted nationwide at one point in time, it has been a challenge to find a suitable comparison group that was left untreated. After showing that private schools not subject to NCLB are not an appropriate comparison group because of changes in student-level selection into these schools, they instead compare states that adopted accountability prior to NCLB with those that did not. They find that NCLB led to improvements in math achievement as measured by the low-stakes National Assessment of Educational Progress (NAEP) test, especially among students who are young or disadvantaged. On the other hand, they do not find systematic improvements in reading achievement on the NAEP. They also document increases in nonfederal education spending, teacher compensation, the percentage of teachers with graduate degrees, and instructional time in both math and reading. The Dee and Jacob results estimating the impact of NCLB as a whole are consistent with a broader literature studying impacts of accountability policies in various states.

NCLB Pressure

There are reasons to believe that the impacts of NCLB go beyond what can be cleanly identified in research studies such as those described in the previous section. Consider the important RD literature on NCLB that compares schools on either side of the passing threshold, leveraging the different levels of pressure on either side of a strict cutoff. Although the RD literature has shown that pressure differs across that cutoff, even the “untreated” schools that have just barely passed the NCLB threshold are likely to face accountability pressure despite the fact that they have not (yet) failed. In other words, even though the schools on the failing side of the margin may be facing more pressure than those above the cutoff, the schools that have just managed to pass have also faced accountability pressure.4 Our approach here is different—we estimate the effect of schools coming under NCLB pressure by comparing schools near the passing threshold to those further away from it.

To help fix ideas about which schools will be “pressured” to change under NCLB, it is useful to consider some details of NCLB in Arkansas. In Arkansas, as in most other states, schools were held accountable for the fraction of children in a school who earn a passing score on the state standardized tests in math and literacy.5 A feature of NCLB is that states determine the details of their accountability program within parameters defined by the federal government, including which test is used for accountability, where the passing threshold is set, and what fraction of students must pass each year for a school to meet the AYP standard. For an elementary school to be deemed passing in Arkansas in 2002, approximately 30 percent of students in the school had to pass each test.6 The percent passing goal increased by about 7 percentage points each year, in order to reach the federally mandated goal of 100 percent proficiency by 2014.7 A hallmark of NCLB is that in addition to the overall percentage passing in the school, each student subgroup—as defined by race, socioeconomic status, and other educational categories—was required to meet the same percent passing rate.8 For a school to make AYP in a given year, not only does the overall passing rate of its student population for each tested grade have to be above the threshold set for that year, but the passing rate of every designated subgroup in the school with a large enough enrollment must also meet or exceed the goal threshold.9 If any one of the student subgroups fails to attain AYP, the entire school is designated as failing to meet AYP.10 Consistent with NCLB principles nationwide, if schools in Arkansas fail to meet the AYP goals for two consecutive years, they are required to implement corrective actions that increase in severity over time.11
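As a concrete illustration of this AYP logic, the sketch below checks whether every sufficiently large subgroup in a school-grade-test cell meets the year's proficiency target. The function name, the dictionary layout, and the minimum subgroup size of 40 are hypothetical placeholders for illustration, not the Arkansas statute's exact parameters.

```python
def makes_ayp(subgroup_results, threshold, min_group_size=40):
    """Return True if every sufficiently large subgroup meets the year's target.

    subgroup_results: dict mapping subgroup name -> (n_students, pct_proficient)
    threshold: percent of students required to score proficient in that year
    min_group_size: subgroups smaller than this are not counted toward the rating
    """
    for name, (n_students, pct_proficient) in subgroup_results.items():
        if n_students < min_group_size:
            continue  # subgroup too small to count toward AYP
        if pct_proficient < threshold:
            return False  # a single failing subgroup fails the whole school
    return True


# Example: the school passes overall, but a large economically disadvantaged
# subgroup misses the target, so the school does not make AYP.
example = {
    "all_students": (250, 62.0),
    "economically_disadvantaged": (120, 48.0),
    "black": (25, 40.0),  # below the minimum group size, so not counted
}
print(makes_ayp(example, threshold=52.0))  # False
```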

According to the theory of change underlying accountability policies, the threat of punishment should induce schools to alter their practices in a manner that would improve student achievement. Because the threat of punishment may have been particularly salient for schools on the margin of passing, they may be expected to have made the largest changes in response to accountability pressures.12

We expect schools facing accountability pressure—as measured by those close to, but on both sides of, the margin of passing—to respond to these pressures by adopting new practices. Although there is not a perfectly clean identification strategy to measure whether a school is exposed to accountability pressure, researchers have attempted to isolate these impacts in a variety of ways. Our approach is most similar to that used by Reback, Rockoff, and Schwartz (2014), in which they predict the likelihood that each subgroup within a school will meet the AYP requirement. They then categorize subgroups according to whether they have a “high” chance of meeting AYP, a “low” chance, or, if they are on the margin, only a “moderate” chance. Further aggregating this to the school level, they designate schools as being on the AYP margin and compare these schools with those that were more clearly in passing or failing territory, to measure how inputs and outcomes differ in the marginal schools. They find that scores on low-stakes exams improve in schools on the margin, suggesting real improvements in learning and not merely “teaching to the test.” In addition, they find increases in teaching hours by specialty teachers, reductions in time spent in whole-class instruction, less teaching of nontested subjects, and lower rates of teacher satisfaction in schools on the margin. Their results are robust to various approaches to operationalizing which schools are “pressured” under NCLB, and also to specifications that exploit differences in accountability thresholds across states. As described below, our approach is broadly similar to Reback, Rockoff, and Schwartz (2014) in that we identify schools on the margin of passing, and thus “pressured by NCLB,” and test whether they have differential outcomes compared with schools farther away from the margin. The intuition is that those schools “close” to the passing threshold are those where administrators will be compelled to make changes that may affect students’ rate of overweight. In order to identify these schools, we follow the statute, which uses (along with other factors) the passing rate among specified subgroups of students to determine whether the school is designated as making AYP. We categorize schools as on the margin of passing if the lowest-scoring subgroup of students is within plus or minus 5 percentage points of the passing threshold. As any designation of a school as “pressured by NCLB” will necessarily be ad hoc, we also examine a number of reasonable alternative definitions.

In theory, accountability policies work to raise academic performance in part because they induce schools to exert more effort with a given level of resources. In economic parlance, we would say that prior to accountability, some schools were not operating at their production possibilities frontier, and the realignment of incentives was intended to move schools toward that frontier. Of course, we do not directly observe effort and must infer changes in effort from factors that we do observe, including test scores and other elements, such as time use and, as we argue in this paper, the percent of students who are overweight. There are several potential mechanisms through which accountability pressure could lead to increased rates of overweight. For example, in response to pressure, schools may decide to reallocate time away from physical activity and toward instructional activities (including classwork, homework, extra tutoring, and so on). Of course, some reallocations of effort may not be expected to have any impact on students’ BMI. For example, if schools adopt new curricula or instructional methods that replace old approaches but do not change the time allocated to instruction, there may be no impact on calorie balance despite a great deal of effort on the part of schools.

It is tempting to try to more precisely parameterize NCLB pressure. For example, do schools with two subgroups on the margin of passing face more pressure than schools with only one? We argue that with the data available here such an endeavor is unlikely to be fruitful—the answer will depend on the underlying distribution of student ability within each of the subgroups.13 Our measure will, of course, produce an estimate of NCLB pressure that is an average across different schools, some of which will face more pressure than others, and some of which will make different types of changes than others to try to make AYP. It is important to emphasize that we do not interpret these estimates as elasticities or impulse response functions. Nevertheless, as long as our categorization of schools correctly identifies those schools more likely to be making changes in response to NCLB, one can interpret the coefficient we estimate as the average effect on students’ rates of overweight of all the different types of changes schools might make in an effort to comply with AYP rules. If our measure is primarily noise and has no relationship to being pressured by NCLB, we would expect no statistically meaningful relationship between our measure of pressure and students’ weight outcomes. If our measure of pressure is systematically picking up something else that is correlated with students’ weight, such as overall quality of the school or trends in student demographic characteristics, then that, indeed, would be a problem for interpreting the coefficient we estimate. As we describe in this paper, our methodology is intended to control for these confounding factors and relies on variation in whether a school's lowest-scoring subgroup, among otherwise similar schools, is on the margin of passing. The results indicate that NCLB pressure, as captured by our measure, consistently predicts higher rates of overweight, suggesting an unintended spillover from academic accountability to children's health.

Research on Schools and Obesity

In addition to the literature on NCLB, another related literature is on schools’ ability to impact children's calorie balance through the food or physical activity environment. There is evidence that obesity is affected by the school food environment, including school lunches, vending machines, and other competitive foods (see Anderson, Butcher, and Schanzenbach 2010, and Hoynes and Schanzenbach 2015 for literature reviews). Cawley, Frisvold, and Meyerhoefer (2013) find that increased time in PE classes because of state mandates reduces obesity among elementary school children, though the impact is concentrated among boys.14

A smaller number of studies has focused specifically on the role of accountability in obesity-related outcomes, and their results support the hypothesis that accountability pressure affects school environments in ways that would be expected to increase obesity. Figlio and Winicki (2005) find that schools facing accountability sanctions increase the number of calories offered in their school lunches during the testing period. Anderson and Butcher (2006b) find that schools in states with accountability measures are more likely to give students access to junk food (and schools more likely to give students access to junk food have students with higher BMI).15 Yin (2009) uses cross-state differences in the implementation of accountability laws (pre-NCLB) to explore the effects of accountability on obesity, finding that high school students in states with accountability laws show a significant increase in BMI and obesity rates. Exploring potential mechanisms, she finds evidence that female adolescents’ participation in PE classes declines with the introduction of accountability.16

As discussed in more detail subsequently, to explore the possibility that NCLB might have increased children's rate of overweight, we use data on test performance as well as rates of overweight and obesity for schools in Arkansas. Although our analysis is reduced-form and cannot isolate the particular mechanisms through which accountability pressure contributes to overweight, there is some evidence that principals report undertaking behaviors that could affect students’ calorie balance. To better understand the mechanisms connecting accountability and obesity rates, we conducted an online survey of Arkansas school principals, inquiring about school practices and how they changed after the adoption of NCLB.17 Almost 22 percent of responding elementary school principals in Arkansas reported cutting time from recess after NCLB was adopted.18 This is consistent with findings from the Center on Education Policy (2007) that 20 percent of school districts nationwide have decreased recess time since NCLB was enacted, with an average decrease of 50 minutes per week. In addition, a quarter of survey respondents in Arkansas reported using food as a reward for good academic performance, and 10 percent reported holding fundraisers that involved food.

3.  Methodology

As discussed in section 2, past research has used a variety of methods to identify a school's exposure to accountability pressures. Similar to Reback, Rockoff, and Schwartz (2014), our main approach focuses on schools most likely to be marginal, which we define as whether the school's minimum-scoring grade-test-subgroup was within 5 percentage points of meeting the AYP proficiency rate in a given year.19 It is important to note, though, that we are not trying to estimate a precise reaction to accountability pressure—that is, to obtain estimates implying that a school missing AYP by x percentage points would be predicted to increase the rate of student overweight by y percentage points. Rather, given past findings on the effects of accountability pressures on school behaviors, we simply want to determine if there is any evidence for these behaviors having unintended spillover effects on student rates of overweight. Thus, to investigate the role of NCLB on students’ overweight status, we build up to estimating the following model:
\mathit{overwgt}_{st} = \beta_0 + \beta_1\,\mathit{pressured}_{s,t-1} + \sum_{j=1}^{4}\left(\gamma_j\,\mathit{mathprof}_{s,t-1}^{\,j} + \lambda_j\,\mathit{litprof}_{s,t-1}^{\,j} + \theta_j\,\mathit{pctnw}_{st}^{\,j} + \phi_j\,\mathit{pctpoor}_{st}^{\,j}\right) + \mu\,\mathit{year}_t + \rho\,\mathit{overwgt}_{s,t-1} + e_{st} \qquad (1)
where overwgt is the percentage of students in school s at time t who are overweight. The variables mathprof and litprof are measures of the school's overall proficiency rate (relative to the AYP goal) in math and literacy, respectively, pctnw is the percent of the student enrollment that is nonwhite, pctpoor is the percent of student enrollment who are economically disadvantaged, and year is a linear time trend. The variable pressured is an indicator for whether, as of the prior year, the school is likely to be pressured by NCLB. In our main results this is an indicator variable equal to 1 if the minimum-scoring grade-test subgroup was within 5 percentage points of the AYP proficiency threshold in any previous year.20 Note that we also include a lagged dependent variable—that is, the prior year's rate of overweight—that serves to control for a host of fixed and slow-moving unobserved determinants of student health.

This specification is designed to eliminate (or at least greatly reduce) any correlation between pressured_{s,t-1} and e_{st}. First, we include year to account for any joint trends in the rate of overweight and the probability of being pressured.21 Second, we flexibly control for a school's poverty rate and racial composition, because these factors are known to predict both test scores and the rate of overweight. To allow for a nonlinear relationship, we include polynomials to the fourth degree in each factor. We control for measures of the school's overall proficiency rates in math and reading in a similarly flexible manner. Recall that the accountability rules assign AYP status based on the school's lowest-scoring subgroup that meets the minimum group size. As a result, two schools with similar overall performance can find themselves in different NCLB statuses based on differences in proficiency rates among their lowest-scoring subgroup. In fact, they may even have similar subgroup scores, but in one school the subgroup may not be numerically large enough to count toward the rating. Thus, because we control generally for the school's proficiency, the coefficient of interest is driven by differences in achievement across subgroups. In other words, two schools with the same overall achievement rates may be very similar, but one school was pressured by NCLB because of a struggling subgroup, whereas the other school did not face that risk.22 Our regressions essentially compare the rate of overweight across these two schools.

Finally, we also control for the prior year's rate of overweight at the school. The prior year's rate of overweight is determined in part by factors such as the school's location, the students’ genetic propensity for overweight, and demographics that we do not observe.23 To the extent that these unobserved determinants of overweight are fixed (such as school location) or move only slowly over time (such as the composition of the student population), directly including the lagged rate of overweight will control for them. As a result, the remaining potential source of bias would have to be unobserved—but transitory—determinants of a school's rate of overweight that are also correlated with our measure of NCLB pressure. Ultimately, although we cannot claim to rule out all potential sources of bias in the estimated impact of NCLB pressure on a school's rate of overweight, we can rule out the most plausible sources of correlation between pressured_{s,t-1} and e_{st}. In particular, our methodology controls for differences in observed demographics and overall school quality, and unobserved differences that would be captured in last year's rate of overweight.
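To make the specification concrete, the following sketch estimates a regression like equation (1) with statsmodels on synthetic placeholder data; the column names, data-generating numbers, and sample size are illustrative assumptions rather than the authors' data or code. Quartic controls enter through I() terms in the formula, and standard errors are clustered by school, as in the paper.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic school-by-year panel standing in for the Arkansas data (illustrative only).
rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "school_id": rng.integers(0, 120, n),
    "year": rng.integers(2005, 2011, n),
    "pressured_lag": rng.integers(0, 2, n),   # pressured as of the prior year
    "overwgt_lag": rng.normal(38, 6, n),      # lagged rate of overweight
    "mathprof": rng.normal(11, 18, n),        # proficiency relative to the AYP goal
    "litprof": rng.normal(9, 17, n),
    "pctnw": rng.uniform(0, 100, n),
    "pctpoor": rng.uniform(0, 100, n),
})
df["overwgt"] = 0.5 * df["pressured_lag"] + 0.6 * df["overwgt_lag"] + rng.normal(15, 3, n)

def quartic(var):
    # Fourth-degree polynomial terms for one control variable.
    return " + ".join(f"I({var}**{p})" for p in range(1, 5))

formula = (
    "overwgt ~ pressured_lag + overwgt_lag + year + "
    + " + ".join(quartic(v) for v in ("mathprof", "litprof", "pctnw", "pctpoor"))
)

# Cluster-robust standard errors by school.
fit = smf.ols(formula, data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["school_id"]}
)
print(fit.params["pressured_lag"], fit.bse["pressured_lag"])
```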

4.  Data from Arkansas

In order to determine whether there is a relationship between NCLB pressure and children's overweight status, we construct a unique dataset from different publicly available sources that merges school-level information on the percentage of students in a grade in a school (overall and by subgroup) who achieve a passing score on standardized math and literacy tests, the rate of overweight, and other demographic characteristics. Details of the final dataset creation can be found in Appendix A in the online appendix.

Arkansas Assessment of Childhood and Adolescent Obesity

In 2003, the state of Arkansas passed an act intended to help combat childhood and adolescent obesity.24 Although obesity has been increasing nationwide, obesity levels were particularly high in Arkansas. In 2003, about 21 percent of school-aged children in Arkansas were obese or overweight, and this figure was about 17 percent for the nation as a whole (Ogden et al. 2006). A central component of the initiative was the reporting of health risk information to parents (ACHI 2004).

The Arkansas Center for Health Improvement spearheaded the effort to collect height and weight information for each school child in the state of Arkansas. This effort included ensuring that each school had the equipment and trained personnel necessary to accurately weigh and measure each child.25 After children were weighed and measured, a personalized letter then went home to each parent describing the child's BMI, where this fit in the BMI distribution (whether the child was obese, overweight, healthy weight, or underweight), and the type of health risks that might be associated with an unhealthy BMI.26 Parents of children with an unhealthy weight were urged to consult a physician.
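A minimal sketch of this categorization follows, assuming the standard CDC BMI-for-age percentile cutpoints (underweight below the 5th percentile, overweight from the 85th, obese from the 95th) and a percentile that has already been computed from the CDC growth charts (not shown). The exact cutpoints used in the Arkansas letters are not described here, so this is an illustration rather than the state's implementation.

```python
def weight_category(bmi_percentile):
    """Map a BMI-for-age percentile (0-100) to a reporting category, using the
    standard CDC cutpoints as an assumed approximation of the categories
    described in the letters sent to parents."""
    if bmi_percentile < 5:
        return "underweight"
    elif bmi_percentile < 85:
        return "healthy weight"
    elif bmi_percentile < 95:
        return "overweight"
    return "obese"


# The school-level "overweight rate" analyzed in this paper combines the
# overweight and obese categories (all weights above normal weight).
print(weight_category(90))  # "overweight"
```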

Importantly, schools were not held accountable for their students’ BMI, but were merely the conduit through which the state could measure the information and provide it to parents. The implicit assumption of this effort was that if the state provides better information to parents, then parents could make—or help their children make—better informed, more healthful choices that could improve children's weight outcomes. An annual public report is available on the Arkansas Center for Health Improvement Web site with the percent of students who are underweight, normal weight, overweight, and obese at each public school in Arkansas.27 Thus, thanks to the Arkansas Assessment of Childhood and Adolescent Obesity, we have panel data on school-level rates of overweight from 2004 to 2010. Note that we do not have access to the information at the individual student level.

School Academic Performance Reports

One of the requirements of NCLB was to make publicly available school-level information on passing rates, both overall and for student subgroups. The Arkansas Department of Education provided us with school report card data for 2002–10, which provide information on the percent of students in each tested grade who scored “proficient” on the literacy and math tests. These percentages are reported both overall for the grade and separately for each subgroup.28

We coded schools as being “pressured” by NCLB according to the pass rate of the minimum-scoring subgroup relative to the threshold pass rate required to make AYP for that year (Appendix table A.1 in the online appendix lists the thresholds by test, grade, and year). Our preferred definition designates a school as pressured if the minimum-scoring subgroup was within plus or minus 5 percentage points of the threshold at any point in the past. There are many details involved in determining AYP status, and it is impossible for us to perfectly predict which schools in Arkansas make adequate yearly progress with the publicly available data, which are aggregated to the school by grade, subject, and subgroup level. Our approach, however, only requires that we have correctly identified the marginal schools—those close enough to the threshold that they feel it is warranted to make changes they believe will affect test scores. We check the robustness of our results to alternate definitions of “pressure,” but the definition is always based on the AYP-designated passing threshold for a school by test, grade, and year.
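A minimal sketch of this coding follows, assuming a long pandas DataFrame with one row per school-year-grade-subject-subgroup cell and hypothetical column names (school_id, year, pct_proficient, ayp_threshold). It illustrates the definition in the text rather than reproducing the authors' exact construction, and it assumes consecutive annual observations for each school.

```python
import pandas as pd

def code_pressure(scores, margin=5.0):
    """Flag school-years as 'pressured' based on the minimum-scoring subgroup."""
    scores = scores.copy()
    # Gap between each grade-test-subgroup proficiency rate and its AYP target.
    scores["gap"] = scores["pct_proficient"] - scores["ayp_threshold"]

    # Keep the minimum-scoring grade-test-subgroup gap for each school-year.
    school_year = (
        scores.groupby(["school_id", "year"], as_index=False)["gap"]
        .min()
        .rename(columns={"gap": "min_gap"})
        .sort_values(["school_id", "year"])
    )

    # Marginal this year: minimum subgroup within +/- `margin` points of the target.
    school_year["marginal"] = (school_year["min_gap"].abs() <= margin).astype(int)

    # Preferred stock measure: marginal in the current or any earlier year...
    school_year["pressured"] = school_year.groupby("school_id")["marginal"].cummax()
    # ...lagged one year, so pressure is measured as of the prior year (t - 1).
    school_year["pressured_lag"] = school_year.groupby("school_id")["pressured"].shift(1)
    return school_year
```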

School Demographics

We also control for measures of a school's demographic characteristics. These include the percent nonwhite (black or Hispanic) and percent economically disadvantaged. Beginning in 2008, the school report card data directly report the overall and subgroup sizes, which allows us to create these measures of demographics. Prior to that, we use the Common Core Data for each school to construct these measures, as described in more detail in Appendix A in the online appendix.

5.  Results

Summary Statistics

We start by examining some basic descriptive statistics in table 1, where column 1 presents the overall sample, column 2 presents the pressured sample, and column 3 presents the never pressured sample. The majority of observations—about 55 percent of school-by-year cells in our data—have faced our definition of NCLB pressure in the past because their minimum-scoring subgroup has scored within 5 points of the AYP threshold. This measure of pressure splits the full sample into the stock of schools that have been pressured and the stock of schools that have never been pressured. To get a better understanding of the characteristics of this stock of nonpressured schools, we additionally create a flow measure of whether the lowest scoring grade-test-subgroup last year was more than 5 points below the threshold (below the margin) or more than 5 points above the threshold (above the margin).29 Among school-by-year observations in the full sample in column 1, about 24 percent are coded below the margin by this definition, and another 20 percent are above the AYP margin. Also in column 1, the mean percent of students who are overweight is 38.4, the mean percent who are economically disadvantaged is 57.2, and the mean percent nonwhite is 30. Finally, on average, the measure of schools’ overall passing rates (that is, not disaggregating by subgroup) exceeded the target English proficiency rate by 9.3 percentage points and the target math proficiency rate by 11.5 percentage points in the prior year.

Table 1. 
Summary Statistics: Analysis Sample
 | Full Sample (1) | Pressured in Past (2) | Never Pressured (3)
Pressured in past | 0.553 (0.497) | — | —
Below pressured margin | 0.239 (0.427) | — | 0.535 (0.499)
Above pressured margin | 0.207 (0.406) | — | 0.465 (0.499)
Overweight rate | 38.44 (6.597) | 38.92 (6.281) | 37.85 (6.925)
Percent nonwhite | 29.99 (28.74) | 24.45 (25.24) | 36.85 (31.24)
Percent economically disadvantaged | 57.15 (19.78) | 56.48 (17.57) | 57.98 (22.20)
English proficiency rate relative to AYP threshold (previous year) | 9.319 (17.27) | 9.513 (15.10) | 9.079 (19.63)
Math proficiency rate relative to AYP threshold (previous year) | 11.48 (18.14) | 12.38 (14.70) | 10.36 (21.60)
Overweight rate (previous year) | 38.27 (6.264) | 38.69 (5.979) | 37.76 (6.565)
Observations | 4,588 | 2,539 | 2,049

Notes: Pressured in past means that the minimum-scoring subgroup had a proficiency rate within 5 points of the AYP target for some year in the past. Above AYP margin implies that the minimum-scoring subgroup has never had a proficiency rate within 5 points of the AYP target, and had a proficiency rate more than 5 points above the AYP target last year. Below AYP margin implies that the minimum-scoring subgroup has never had a proficiency rate within 5 points of the AYP target, and had a proficiency rate more than 5 points below the AYP target last year. Overweight rate includes all weights above normal weight. Standard deviations in parentheses.

Comparing column 2 with column 3 shows that pressured schools have an average rate of overweight that is slightly higher than that of never pressured schools (38.92 versus 37.85), but this does not appear to be driven by demographic differences at these schools. Pressured schools have a very slightly lower average rate of economically disadvantaged students, and a much lower average percentage of nonwhite students. In terms of overall school performance, both pressured and nonpressured schools have similar average overall passing rates in English, and pressured schools score higher on average in math. Given the generally higher rates of obesity for poor and minority children in the United States, all else equal, we would expect the schools in column 3 to have higher rates of overweight. In fact, though, the raw means show a higher average rate of overweight for the pressured schools in column 2.30

Table 2 presents the results of building up model 1 by sequentially adding control variables. Column 1 is a basic regression that controls only for a time trend, column 2 adds controls for the fourth-order polynomials in the measures of overall math and literacy proficiency rate, and column 3 additionally adds fourth-order polynomials in percent nonwhite and economically disadvantaged in the school-year cell.31 Standard errors are clustered by school throughout. The results are stable across these specifications, with the rate of overweight about 1 percentage point higher for those schools labeled as pressured by NCLB. Although adding controls does add to the explanatory power of the model, there is very little impact on the key coefficient. Our interpretation of this result is that although the school's overall proficiency rates and demographics are important correlates of the overall rate of overweight, none of these variables is strongly correlated with our measure of NCLB pressure. In other words, two schools that are otherwise very similar in terms of overall achievement and demographic makeup may face starkly different pressures under NCLB. That differential pressure may lead to policies and practices being adopted that increase students’ rate of overweight, as suggested by the positive estimated coefficient on our measure of NCLB pressure.

Table 2. 
Effects of Accountability Pressures on School Rates of Overweight Students
 | (1) | (2) | (3) | (4)
Pressured in past | 1.033*** (0.351) | 1.220*** (0.313) | 1.205*** (0.286) | 0.522*** (0.151)
Overweight rate (previous year) | — | — | — | 0.608*** (0.0153)
Overall proficiency rate | NO | YES | YES | YES
Demographic controls | NO | NO | YES | YES
Observations | 4,588 | 4,588 | 4,588 | 4,588
R2 | 0.007 | 0.125 | 0.248 | 0.496

Notes: Pressured in past is defined as whether a school's minimum-scoring subgroup had a proficiency rate within 5 points of the AYP target for some year in the past. Overweight rate includes all weights above normal weight. Demographic controls are a quartic in percent nonwhite and a quartic in percent economically disadvantaged. Overall proficiency rate controls are a quartic in the standardized overall literacy proficiency rate and a quartic in the standardized overall math proficiency rate. All models include an annual trend. Standard errors, which are robust to heteroskedasticity and within-school correlation, are in parentheses.

*p < 0.1; **p < 0.05; ***p < 0.01.

The possibility remains, of course, that there are unobserved determinants of a school's rate of overweight that are correlated with our measure of pressure. In addition to school policies, the rate of overweight will be determined by factors such as the school's location, students’ genetic predispositions to being overweight, local norms around food consumption and exercise, and other characteristics that we cannot observe. To the extent that these unobserved determinants are either fixed or change relatively slowly, though, we can control for them by including the school's lagged rate of overweight, because the lagged rate was itself determined by these unobserved characteristics.32 Column 4 incorporates the lagged rate of overweight along with a trend and polynomials in the school's overall proficiency rate and demographic characteristics, and is our preferred specification. After controlling for the lagged rate of overweight, the only potential source of omitted variables bias would be due to unobserved, but transitory, determinants of a school's rate of overweight that are also correlated with our measure of NCLB pressure. Note also that this saturated specification leaves the model essentially trying to explain a one-year change in the rate of overweight, a potentially much more difficult task than explaining its level. As the results in column 4 indicate, there is a strong positive relationship across years in the rate of overweight, with a coefficient of 0.61 on the lagged rate. Nonetheless, even after accounting for the lagged rate of overweight and thus all of the otherwise unobservable factors for which that controls, there is still an effect of NCLB pressure that is positive and statistically significantly different from zero. The estimated coefficient is 0.522, indicating that schools facing NCLB pressure see about a one-half percentage point increase in the rate of overweight relative to the previous year. Simulations in Anderson, Butcher, and Schanzenbach (2010) suggest that an increase of this magnitude could be obtained with just over a twelve-minute per week reduction in moderate activity (such as recess) over the school year, or well under an additional 150 calories consumed per week.33 Thus, these results from our preferred model that controls for unobserved (but slow-moving or fixed) determinants of overweight are well within the range of what is plausible for an annual change in the rate of overweight from the types of policy changes schools adopt in response to NCLB pressure.

Robustness Checks

The results of table 2 are consistent with the idea that NCLB pressures lead schools to make changes to policies and practices that are harmful to students’ weight outcomes. In that table, we define NCLB pressure according to whether a school's minimum-scoring subgroup had a passing rate within 5 percentage points of the AYP threshold in any past year. This is our preferred specification because, by statute, a school's AYP status is determined by the minimum-scoring subgroup. If all subgroups but one were easily meeting the proficiency targets, and the minimum-scoring subgroup is close to the passing threshold, then the school would achieve AYP status if it could modestly improve only one subgroup's performance. Schools in this category may have been especially likely to make the types of marginal changes that could affect students’ rates of overweight. We believe these schools are unlikely to immediately undo these changes, even if they meet AYP the next year. Recall that the AYP thresholds are increasing over time, making it important for schools to maintain their effort levels. Thus, we prefer this measure of pressure that turns on once, and remains on. However, by focusing on the minimum-scoring subgroup having been close to the target at any time in the past, our treatment group will grow over time, with our control group potentially becoming more extreme. That is, over time, schools that have never been close to the AYP threshold may be more likely to have been consistently missing or making AYP by more than 5 points. Table 3 investigates the importance of the changing stock of pressured schools by showing our preferred specification for a rolling sample, starting with just the first two years of data in column 1, up to our main sample in column 5, which simply reproduces column 4 of table 2. The estimates are always statistically significant, and quite stable, remaining in the range of 0.4 to 0.5.

Table 3. 
Effects of Accountability Pressures on School Rates of Overweight Students: Alternative Sample Periods
 | 2005–06 (1) | 2005–07 (2) | 2005–08 (3) | 2005–09 (4) | 2005–10 (5)
Min. subgroup pressured in past | 0.427** (0.214) | 0.481*** (0.165) | 0.375** (0.153) | 0.493*** (0.155) | 0.522*** (0.151)
Observations | 1,388 | 2,127 | 2,941 | 3,766 | 4,588
R2 | 0.646 | 0.644 | 0.582 | 0.520 | 0.496

Notes: Minimum subgroup pressured in past is defined as whether the lowest scoring subgroup in a school had a proficiency rate within 5 points of the AYP target for some year in the past. All models include the lagged rate of overweight, a quartic in percent nonwhite, a quartic in percent economically disadvantaged, a quartic in the standardized overall literacy proficiency rate, a quartic in the standardized overall math proficiency rate, and an annual trend. Standard errors, which are robust to heteroskedasticity and within-school correlation, are in parentheses.

**p < 0.05; ***p < 0.01.

Although this stock measure of being pressured based on the minimum-scoring subgroup is our preferred measure, there are other ways to operationalize the concept of NCLB pressure. First, the assumption that schools would not immediately reverse the changes in practices that they made in response to NCLB pressure may be incorrect, meaning schools’ behaviors may only be influenced by the prior year's proficiency results. Additionally, although overall AYP status is determined by the lowest-performing subgroup, the passing rate of each group is publicly disclosed. This gives schools the incentive to get as many subgroups as possible up to the proficiency targets. As a result, having any subgroup on the margin of the passing threshold could trigger the changes in policies and practices that increase students’ rate of overweight. Finally, it may be the case that all schools that are not meeting AYP will implement these types of policies, even if on their own such marginal changes are unlikely to increase scores sufficiently to reach the thresholds. In such a case, some of our control schools are, in fact, being pressured by NCLB, and only schools clearly meeting AYP should be considered unpressured. Recall, however, that only when using a flow measure of pressure (i.e., pressured just last year) is there an obvious way to assign schools as being above or below the pressure point. For our preferred stock measure of pressure (i.e., pressured at any time in the past), we simply assign nonpressured schools a flow measure of being below the margin if the lowest-scoring subgroup last year missed AYP by more than 5 points.
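Continuing the hypothetical school-year layout from the earlier coding sketch (one row per school-year-subgroup cell, with gap = pct_proficient − ayp_threshold), the sketch below illustrates how these alternative definitions could be operationalized. It is an illustration of the definitions in the text, not the authors' code, and again assumes consecutive annual observations per school.

```python
import pandas as pd

def alternative_pressure_measures(scores, margin=5.0):
    """Construct minimum-subgroup and any-subgroup pressure measures, in both
    stock (any past year) and flow (last year only) versions."""
    grouped = scores.groupby(["school_id", "year"])["gap"]
    sy = pd.DataFrame({
        "min_gap": grouped.min(),  # gap for the minimum-scoring subgroup
        "any_marginal": grouped.apply(lambda g: int((g.abs() <= margin).any())),
    }).reset_index().sort_values(["school_id", "year"])
    sy["min_marginal"] = (sy["min_gap"].abs() <= margin).astype(int)

    # Stock versions: marginal in the current or any earlier year.
    sy["min_stock"] = sy.groupby("school_id")["min_marginal"].cummax()
    sy["any_stock"] = sy.groupby("school_id")["any_marginal"].cummax()

    # Lag each measure one year so pressure is dated as of the prior year:
    # min_stock_lag and any_stock_lag are the "any past pressure" definitions,
    # min_marginal_lag and any_marginal_lag are the "pressure last year" ones.
    cols = ["min_stock", "any_stock", "min_marginal", "any_marginal"]
    lagged = sy.groupby("school_id")[cols].shift(1).add_suffix("_lag")
    return pd.concat([sy, lagged], axis=1)
```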

Table 4, then, implements a range of alternative definitions of NCLB pressure. Each column represents a different definition of NCLB pressure, with the bottom panel restricting the control group to schools coded as above the AYP margin. The first column uses our preferred definition of pressured, so the first column of the top panel reproduces column 4 from table 2. The second column continues to define NCLB pressure by the minimum-scoring subgroup, but only codes a school as facing pressure if it was on the AYP margin the prior year. In this column, the effect is positive but not significantly different from zero.34 A potential explanation for this lack of significance is that our initial assumption is correct, and schools do not reverse their policy changes if they move out of marginal status, and thus the comparison group is increasingly populated by schools that were pressured in the past. In other words, over time the schools in the control group undertake the same behaviors as those in the treatment group, having implemented them in the past when they were first pressured.

Table 4. 
Effects of Accountability Pressures on School Rates of Overweight Students: Alternative Definitions of Pressure and Comparison Groups
Pressure Definition
 | Minimum Subgroup, Any Past Pressure (1) | Minimum Subgroup, Pressure Last Year (2) | Any Subgroup, Any Past Pressure (3) | Any Subgroup, Pressure Last Year (4)
Panel A: Comparing Marginal with Non-marginal Schools
Pressure indicator | 0.522*** (0.151) | 0.170 (0.189) | 1.129*** (0.195) | 0.650*** (0.179)
R2 | 0.496 | 0.495 | 0.498 | 0.496
Panel B: Comparing High, Marginal, and Low-Scoring Schools
Pressure indicator | 1.052*** (0.202) | 0.550** (0.231) | 1.169*** (0.210) | 0.692*** (0.220)
Below pressure margin | 1.043*** (0.268) | 0.749*** (0.267) | 0.349 (0.418) | 0.102 (0.318)
R2 | 0.497 | 0.495 | 0.498 | 0.496

Notes: Sample size is 4,588. The columns differ in how NCLB pressure is defined. Minimum subgroup, any past pressure (column 1) defines the pressured group by whether a school's lowest scoring subgroup had a proficiency rate within 5 points of the AYP target for some year in the past. Minimum subgroup, pressure last year (column 2) defines it based on the school's lowest-scoring subgroup score the previous year only. Any subgroup, any past pressure (column 3) is an indicator for whether any accountability subgroup in the school had a proficiency rate within 5 points of the AYP target in any prior year. Any subgroup, pressure last year (column 4) indicates whether any subgroup had a marginal proficiency rate the prior year. All models include the lagged rate of overweight, a quartic in percent nonwhite, a quartic in percent economically disadvantaged, a quartic in the standardized overall literacy proficiency rate, a quartic in the standardized overall math proficiency rate, and an annual trend. Panel A compares the pressured group to nonpressured schools, and Panel B includes indicators for the pressured group and for being below the pressured margin (the omitted group is above the pressured margin—see text for definitions). Standard errors, which are robust to heteroskedasticity and within-school correlation, are in parentheses.

**p < 0.05; ***p < 0.01.

The third and fourth columns allow for the possibility that schools are focused on getting as many subgroups to the target proficiency level as possible. Thus, a school is defined as pressured if any subgroup is within the margin of passing, either at any time in the past (column 3) or just last year (column 4). Note that in columns 1 and 2, all of the other subgroups in the pressured schools were either easily meeting their proficiency targets or were just slightly above the minimum-scoring group and also within the 5-point region. By contrast, in columns 3 and 4 some of the school's other subgroups may be more than 5 points below the target threshold, but the school is defined as facing NCLB pressure as long as at least one subgroup is close to passing. Turning to column 3, the estimated coefficient on “pressured” is 1.13—about twice as large as the coefficient from our main specification. Finally, in column 4 we find an estimate of 0.65, which is well within the 95 percent confidence interval of the coefficient from our main specification.

Turning to the first column in the bottom panel, we see that using our preferred definition of “pressured” at some time in the past, but comparing this group only to those schools where the minimum-scoring subgroup was more than 5 points above the AYP threshold last year, results in a much larger estimate of 1.05. Note that this is very similar to the coefficient in the following row, which compares schools for which the minimum-scoring subgroup last year was more than 5 points below the threshold with those for which it was more than 5 points above.35 In the second column, we use a consistent flow measure, comparing schools whose minimum-scoring subgroup last year was within 5 points of the AYP threshold with those whose minimum-scoring subgroup was more than 5 points above it. In this case, the contemporaneous pressure measure is significantly different from zero, at 0.55. At the same time, the coefficient for schools missing by more than 5 points is even larger (though not significantly different from the pressure coefficient) at 0.75. These results are consistent with the idea described earlier that the “control” group under the contemporaneous measure is contaminated with schools that have also felt pressure in the past and have made policy changes that affect rates of overweight. In columns 3 and 4 the comparison group is made up of schools for which no subgroup scored less than 5 points above the AYP threshold last year, and the second row is schools for which some subgroup scored more than 5 points below the threshold last year.36 Both of these alternative definitions find positive and significant coefficients on pressure, but positive and insignificant coefficients on being below the margin. The estimated coefficients of 1.17 and 0.69 on the pressure indicator are very similar to those from the top panel.

In sum, using alternate definitions of pressure and a variety of time periods does little to change our overall finding that schools facing NCLB pressure see increases in their students’ growth rate of overweight. Using our preferred specification, which includes the school's lagged rate of overweight and defines NCLB pressure by whether the lowest-performing subgroup has ever been on the margin of the passing threshold, we find that facing NCLB pressure implies about a 0.5 percentage-point increase in a pressured school's growth rate of overweight. This result is fairly stable over time. Only when we change our definition to a contemporaneous measure, in which a school is categorized as pressured only if the minimum-scoring subgroup was within 5 points in the prior year, do we find a statistically insignificant, but still positive, effect. If this measure of pressure is compared only to schools whose minimum-scoring subgroup scored more than 5 points above the threshold last year (column 2, panel B), however, the estimate is again statistically significant. This suggests that schools whose minimum-scoring subgroup scored more than 5 points below the threshold last year are also likely to have been pressured by NCLB in past years. Overall, our alternate approaches find significant estimates of between about 0.4 and 1.2, all of which are consistent with the idea that policy changes due to NCLB have contributed to increasing rates of overweight among pressured schools’ students.

Event Study

The timing of increases in the rate of overweight can be estimated directly using an event study analysis. Specifically, we fit the following equation to our analysis sample:
\mathit{overwgt}_{st} = \beta_0 + \sum_{\tau \neq 0}\delta_\tau\,\mathbf{1}[\tau_{st} = \tau] + \sum_{j=1}^{4}\left(\gamma_j\,\mathit{mathprof}_{s,t-1}^{\,j} + \lambda_j\,\mathit{litprof}_{s,t-1}^{\,j} + \theta_j\,\mathit{pctnw}_{st}^{\,j} + \phi_j\,\mathit{pctpoor}_{st}^{\,j}\right) + \mu\,\mathit{year}_t + e_{st} \qquad (2)
where overwgt is the percentage of students in school s at time t who are overweight or obese, and mathprof, litprof, pctnw, pctpoor, and year are as above. Note that the event study does not include the lagged dependent variable. The key explanatory variables are the event-year dummies, with event time τ_{st} defined so that τ = 0 in the first year that a school's minimum subgroup scores within 5 points of the passing threshold (i.e., the first year a school is pressured by our preferred measure), τ = 1 denotes the first year after a school comes under pressure, and so on. In the years in which τ < 0, the school was not yet under pressure. The coefficients are measured relative to the first year of pressure, that is, τ = 0. Schools that are never pressured remain in the dataset, but they help identify only the relationship between the covariates and overweight, because their event-year dummies are all zero. Because we have a relatively short panel, the event study analysis is unbalanced, meaning that not all schools facing NCLB pressure are observed in all event-time periods.
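A minimal sketch of constructing the event-time variable and dummies follows, assuming a school-year DataFrame with hypothetical columns school_id, year, and marginal (equal to 1 in years when the minimum subgroup is within 5 points of the target), such as the output of the earlier coding sketch. These dummies would then replace the pressure indicator in a regression like equation (1) without the lagged dependent variable.

```python
import pandas as pd

def add_event_time(df):
    """Attach event time tau relative to the first pressured year for each school."""
    df = df.sort_values(["school_id", "year"]).copy()
    # First year each school's minimum subgroup is within the margin (missing if never).
    first_year = (
        df.loc[df["marginal"] == 1]
        .groupby("school_id", as_index=False)["year"]
        .min()
        .rename(columns={"year": "first_pressured_year"})
    )
    df = df.merge(first_year, on="school_id", how="left")
    df["tau"] = df["year"] - df["first_pressured_year"]

    # Event-year dummies relative to tau = 0 (the omitted category). Never-pressured
    # schools have tau = NaN, so all of their dummies are zero and they help
    # identify only the covariate coefficients.
    for t in sorted(df["tau"].dropna().unique()):
        if t == 0:
            continue  # omitted reference period
        df[f"tau_{int(t)}"] = (df["tau"] == t).astype(int)
    return df
```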

Figure 1 plots the event-year coefficients from estimating equation 2 on our sample. In the periods before the school's minimum subgroup first scores within 5 points of the AYP threshold, the estimated coefficients are not significantly different from zero; that is, the rate of overweight is indistinguishable from its level in the omitted period (τ = 0, the first year the school faces NCLB pressure). The year after first becoming pressured, however, the school's rate of overweight increases by a statistically significant 1 percentage point. Simulations in Anderson, Butcher, and Schanzenbach (2010) suggest that a one percentage-point increase in the rate of overweight could occur after a school year in which moderate activity levels were cut by twenty-four minutes per week. Alternatively, a slightly larger change in the rate of overweight is predicted if children consumed 150 additional calories on average per week. To the extent that schools responded to NCLB pressure by cutting as little as five minutes per day of recess or adding vending machines that resulted in students eating an extra pack of chips per week, the results from this event study conform well to the hypothesis that accountability pressures under NCLB may have caused increases in the rate of overweight.

Figure 1. Estimated Impact of Being Pressured under NCLB on Overweight Rate, by Year.

6.  Conclusions

Through the No Child Left Behind Act, schools faced increasing incentives to improve performance on standardized tests, and past research clearly documents that a wide variety of schools’ policies and practices are affected by accountability pressures. Because schools are graded based primarily on standardized test scores (but not on other student outcomes such as children's health), schools facing accountability pressure may make decisions designed to increase test scores but that have unintended negative consequences for children's weight.

This paper adds to the small amount of evidence on the effect of school accountability on child health. We find that schools on the margin of making AYP under NCLB rules, and thus presumably under strong pressure to improve test scores, have about a 0.5 percentage point higher growth rate of overweight among their students. This result is not identified by comparing poorly performing schools with better performing schools. In fact, the schools we identify as facing NCLB pressure are generally in the middle of the socioeconomic spectrum, whereas the comparison group includes schools that perform both very well and very poorly. Note that this effect is estimated in models that include flexible measures of overall school test performance and demographic characteristics, as well as the school's lagged rate of overweight. This specification helps ensure that the coefficient of interest is not merely picking up other differences, correlated with student overweight, across schools on the margin of making AYP. Finally, the result is robust across a variety of specifications.

These results present the first direct evidence that the NCLB accountability rules may have unintended adverse consequences for students’ weight outcomes. As we attempt to improve schools along a given dimension, it seems logical that other areas for which schools are not explicitly held accountable may suffer. As a result, parents, school administrators, and policy makers should keep in mind the potential for impacts on children's health as they consider how to devise incentives and reallocate school resources in pursuit of test score gains.

Notes

1 

In 2011, the U.S. Department of Education began allowing waivers for states to implement alternate plans. We focus only on the years when NCLB was in full effect.

2 

See also Carnoy and Loeb 2002; Jacob 2005; Figlio and Rouse 2006; Rockoff and Turner 2010; Dee and Jacob 2011; and Wong, Steiner, and Cook 2013.

3 

Although there is dispute in the literature as to whether cortisol contributes to obesity, if it does, there is the possibility that testing pressures cause stress in children, which may increase cortisol secretions (and thus perhaps lead to weight gain). Reback, Rockoff, and Schwartz (2014) find that attending a school on the margin of passing does not cause students to report being more anxious about standardized tests.

4 

We investigated the RD strategy in this paper but abandoned it when we found that, in the case of Arkansas, there were discontinuities at the passing threshold in background characteristics such as percent black and percent low-income, thus invalidating the required assumptions for RD.

5 

The passing threshold on the Arkansas state test is lower than the threshold on the National Assessment of Educational Progress (NAEP) test. In particular, 62 (61) percent of students passed the fourth grade state test in literacy (math), and 28 (26) percent of fourth graders passed the NAEP test. This 34–35 percentage-point difference in pass rates across tests is in line with the U.S. average of 32–37 points (Education Week 2006).

6 

It is common to refer to "failing" schools or "passing" schools. Under the official nomenclature, however, "failing" schools are in "School Improvement Status."

7 

Annual AYP percent passing goals by grade and subject are listed in Appendix table A.1 (available in a separate online appendix that can be accessed on Education Finance and Policy's Web site at http://www.mitpressjournals.org/doi/suppl/10.1162/EDFP_a_00201). The starting points were slightly lower for higher grades, so the annual increases in the goals were slightly larger in order to reach 100 percent proficiency by 2014. Note that in 2006, Arkansas revised the AYP goals downward, requiring larger annual increases going forward. Like most other states, Arkansas received a waiver in 2011 (after our data conclude) and was no longer required to hold schools accountable to these goals.

8 

Additional details that matter for the data work but add little to the intuition of the program, such as minimum subgroup size rules, the safe harbor provision, and the ability of schools to use a three-year average percent passing instead of the current pass rate, are described in more detail in Appendix B in the online appendix.

9 

“Large enough” is defined in Arkansas as 40 students or 5 percent of enrollment (whichever is larger). Student subgroups for accountability under NCLB are defined by race (whites, African Americans, Hispanics, etc.), and for low socioeconomic status students. We omit the English language learners and students with disabilities subgroups because of lack of consistent data on group size.

10 

Although the basic AYP rules are straightforward, in practice a school could be deemed to meet or fail AYP for several other reasons. For example, even if a school (or subgroup) has a lower fraction of students passing than the standard requires, it still might meet AYP through the "Safe Harbor" provision, which deems a school passing if the percentage of failing students (within subject and subgroup) declined by 10 percent relative to the prior year. On the other hand, a school would be deemed failing regardless of its pass rate if too low a fraction of its students participated in the test, or if attendance or graduation rates were below the target thresholds.
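A stylized sketch of this pass/fail logic for a single subgroup-by-subject cell is given below. It illustrates the rules as described in notes 9 and 10, not the official Arkansas calculation (which has further provisions; see Appendix B). The function and argument names are ours, and we use 95 percent as the participation floor, the usual NCLB requirement.

```python
def subgroup_counts_toward_ayp(n_subgroup, enrollment):
    # Arkansas minimum subgroup size: 40 students or 5 percent of enrollment,
    # whichever is larger (note 9).
    return n_subgroup >= max(40, 0.05 * enrollment)


def meets_ayp(pass_rate, goal, prior_fail_rate=None,
              participation=1.0, other_indicators_ok=True):
    """Stylized AYP check for one subgroup-subject cell (illustrative only).

    pass_rate, goal, prior_fail_rate, and participation are fractions in [0, 1];
    other_indicators_ok stands in for the attendance/graduation targets.
    """
    # A school fails regardless of its pass rate if participation is too low
    # (95 percent under NCLB) or the other indicator misses its target.
    if participation < 0.95 or not other_indicators_ok:
        return False
    # Meets the annual percent-passing goal outright.
    if pass_rate >= goal:
        return True
    # Safe Harbor: the failing share fell by at least 10 percent from last year.
    if prior_fail_rate is not None and (1 - pass_rate) <= 0.9 * prior_fail_rate:
        return True
    return False


# Example: 55 percent passing against a 60 percent goal, but the failing share
# dropped from 50 percent to 45 percent (a 10 percent decline) -> Safe Harbor.
print(meets_ayp(pass_rate=0.55, goal=0.60, prior_fail_rate=0.50))  # True
```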

11 

These corrective actions ranged from allowing students to transfer to a different non-failing school in the district in year 1, to being required to offer supplemental instruction to students in year 2, to more extreme measures such as school restructuring in year 5.

12 

In the first few years of NCLB, approximately 25 percent of Arkansas schools were out of compliance with AYP (Blankenship and Barnet 2006); 46 percent were failing in 2009.

13 

For example, consider a school where two groups are within 5 points of the threshold but the failing students are very close to the passing score. It may require less effort to move that school to passing than a school with only one group on the margin but whose highest-scoring failing student must improve by 30 or more points. We do not have the underlying data that would allow us to categorize schools according to the improvement individual students need to make to achieve a passing score. Even such detailed data would not necessarily solve the problem, however. A school might be on the margin of passing because of a negative shock to some or all of its students (a barking dog or a bad cold on test day), and schools might expect mean reversion to improve scores with little or no effort on their part.

14 

On the other hand, Cawley, Meyerhoefer, and Newhouse (2007) find no evidence among high school students that an increase in time in PE class reduces students’ body weight or likelihood of being overweight.

15 

This paper uses a two-sample two-stage least squares estimation strategy; whether the school is in a state with an accountability rule is one of the factors used in the first stage, which predicts the fraction of schools in a county that give students access to junk food.

16 

Indicating the potential for school accountability to spill over to children's health more generally, Bokhari and Schneider (2011) find that state accountability policies lead to more children being diagnosed with attention deficit hyperactivity disorder and prescribed psychostimulant drugs for its treatment.

17 

The sample size of the principal survey is 191, approximately a 20 percent response rate. Responding schools were positively selected based on test scores, socioeconomic status, and whether they were passing under NCLB. See Anderson, Butcher, and Schanzenbach (2011) for more details.

18 

Only 1 percent of the elementary school sample reported reducing time spent in PE class, though 4 percent reported pulling poor-performing students in from recess or PE class to work further on tested subjects.

19 

As described earlier, there is no theoretically “clean” cutoff for where in the distribution NCLB pressure occurs, and 5 percentage points is an ad hoc cutoff. Using plus or minus 1, 3, …, 15 percentage points as the cutoffs results in coefficients that are within the 95 percent confidence interval of our main result (see Appendix figure B.1 in the online appendix).

20 

In a series of robustness checks, we define “pressured” in a few different ways, as described later in this paper.

21 

We also experiment with allowing the linear trend to differ by the school's demographics or past rate of overweight. This controls for differential trends in overweight across schools that differ by demographics or initial weight. Including these controls for differential trends has no impact on the role of pressure.

22 

One might be interested in whether there is a particularly strong impact on the rate of overweight among students in the subgroups that were actually marginal. Unfortunately, we do not have access to data on weight outcomes that are sufficiently disaggregated to estimate this.

23 

Location could be important for multiple reasons. For example, Currie et al. (2010) find that being located close to fast food restaurants increases a school's students’ probability of being obese, and Anderson and Butcher (2006a) point to the potential importance of being able to walk or bike to school.

24 

This section draws heavily from the yearly reports on the Arkansas Assessment of Childhood and Adolescent Obesity released by the Arkansas Center for Health Improvement (ACHI). Reports are available online at: www.achi.net.

25 

Training included taking each measure a number of times to ensure accuracy.

26 

Weight categories are based on age-by-gender percentiles from a fixed population, where underweight is below the 5th percentile, overweight is above the 85th percentile, and obese is above the 95th percentile.
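A sketch of this classification written as a small function, with thresholds taken directly from the note. The "healthy weight" label and the decision to treat overweight and obese as mutually exclusive bands are our own simplifications; the analysis itself uses the combined overweight-or-obese rate (see note 27).

```python
def weight_category(bmi_percentile):
    # Age-by-gender BMI percentile relative to the fixed reference population.
    if bmi_percentile < 5:
        return "underweight"
    if bmi_percentile > 95:
        return "obese"
    if bmi_percentile > 85:
        return "overweight"
    return "healthy weight"


print(weight_category(92))  # "overweight"
```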

27 

Note that for some schools, overweight and obesity rates are not broken out separately but are reported combined, so we use the combined measure of overweight or obese in our models.

28 

The subgroups are white, African American, Hispanic, and low socioeconomic status students. Pass rates are listed as long as there are at least ten students in a subgroup, but they only count toward AYP if there are forty individuals or a number equaling at least 5 percent of school enrollment, whichever is greater. We omit the English language learners and students with disabilities subgroups because of a lack of consistent data on group size.

29 

It is important to realize that although the control group of never pressured schools includes both schools clearly failing last year and clearly passing last year, this should not be interpreted as representing schools that always fail or always pass. Of schools clearly failing in one year, 10 percent are clearly passing the next, and of schools clearly passing in one year, 13 percent are clearly failing the next.

30 

Appendix table A.2 in the online appendix provides summary statistics separately for schools above and below the pressured margin.

31 

We experimented with allowing the trend to vary across demographic groups and/or past overweight rate. In all cases the coefficient on pressured was significant and between 0.50 and 0.52, so for simplicity we use only a simple trend. We also investigated allowing the impact of pressure to vary by whether the school just missed (versus just passed) the threshold, but the interaction term was close to zero and insignificant.

32 

An alternative approach is to include school fixed effects to control for unobserved (and observed) determinants of the rate of overweight that are fixed over time. When we control for these other potential determinants of the rate of overweight using fixed effects, we estimate a coefficient on pressured that is positive, statistically significantly different from zero, and not statistically significantly different from the estimate in our preferred specification (see Appendix table A.3 in the online appendix).

33 

Although the published tables only present calorie changes in 150-calorie increments, unpublished results indicate that it would actually take under 50 extra calories per week to increase a school's rate of overweight by 0.522 percentage points.

34 

Using this measure with different samples (as in table 3) reveals that the impact of "pressured last year" declines monotonically across the samples but remains significant, at around p = 0.05, for the first two samples.

35 

It is important to remember that these “tails” are defined on a flow basis (the performance of the lowest-scoring subgroup last year), and the “pressure” measure is defined on a stock basis (the performance of the lowest-scoring subgroup any time in the past).

36 

Defining the "tails" here is even more complicated because any subgroup can define pressured status. Even focusing on just last year, one subgroup may be within 5 points of the threshold while another is more than 5 points below it and yet another more than 5 points above it. Thus, if any subgroup missed AYP by more than 5 points last year, we assign the school to that group, making the control group those schools where no subgroup was less than 5 points above AYP last year.

Acknowledgments

We thank Jannine Riggs, Denise Airola, and Jim Boardman of the Arkansas Department of Education for helpful discussions about the Arkansas education data and accountability rules. We received generous financial support from the Robert Wood Johnson Foundation (grant 57922), a Rockefeller Center at Dartmouth College Reiss Family Faculty Research Grant, and a Wellesley College Faculty Award. Elora Ditton, Brian Dunne, A. J. Felkey, Brenna Jenny, and Alan Kwan provided excellent research assistance. We thank Eric Edmonds; Jonathan Guryan; seminar participants at UC-Davis, Boston College, Louisiana State University, the University of Toulouse; and conference participants at the American Economic Association 2010 annual meetings and the Rockefeller Center Health Policy Workshop for helpful comments.

REFERENCES

Anderson, Patricia M., and Kristin F. Butcher. 2006a. Childhood obesity: Trends and potential causes. Future of Children: Child Overweight and Obesity 16 (1): 19–45. doi:10.1353/foc.2006.0001.

Anderson, Patricia M., and Kristin F. Butcher. 2006b. Reading, writing, and refreshments: Do school finances contribute to childhood obesity? Journal of Human Resources 41 (3): 467–494. doi:10.3368/jhr.XLI.3.467.

Anderson, Patricia M., Kristin F. Butcher, and Diane Whitmore Schanzenbach. 2010. School policies and children's obesity. In Current issues in health economics, edited by Daniel Slottje and Rusty Tchernis, pp. 1–16. Bingley, UK: Emerald Group Publishing Limited. doi:10.1108/S0573-8555(2010)0000290004.

Anderson, Patricia M., Kristin F. Butcher, and Diane Whitmore Schanzenbach. 2011. Adequate (or adipose?) yearly progress: Assessing the effect of No Child Left Behind on children's obesity. NBER Working Paper No. 16873.

Arkansas Center for Health Improvement (ACHI). 2004. The Arkansas assessment of childhood and adolescent obesity. Little Rock: ACHI.

Ballou, Dale, and Matthew G. Springer. 2008. Achievement trade-offs and No Child Left Behind. Unpublished paper, Vanderbilt University.

Blankenship, Virginia H., and Joshua H. Barnet. 2006. AYP in Arkansas: Who's on track? Arkansas Education Report 3 (2): 1–13.

Bokhari, Farasat A. S., and Helen Schneider. 2011. School accountability laws and the consumption of psychostimulants. Journal of Health Economics 30 (2): 355–372. doi:10.1016/j.jhealeco.2011.01.007.

Carnoy, Martin, and Susanna Loeb. 2002. Does external accountability affect student outcomes? A cross-state analysis. Educational Evaluation and Policy Analysis 24 (4): 305–331. doi:10.3102/01623737024004305.

Cawley, John, Chad Meyerhoefer, and David Newhouse. 2007. The impact of state physical education requirements on youth physical activity and overweight. Health Economics 16 (12): 1287–1301. doi:10.1002/hec.1218.

Cawley, John, David Frisvold, and Chad Meyerhoefer. 2013. The impact of physical education on obesity among elementary school children. Journal of Health Economics 32 (4): 743–755. doi:10.1016/j.jhealeco.2013.04.006.

Center on Education Policy. 2007. Choices, changes, and challenges: Curriculum and instruction in the NCLB era. Washington, DC: Center on Education Policy.

Chakrabarti, Rajashri. 2013a. Accountability with voucher threats, responses, and the test-taking population: Regression discontinuity evidence from Florida. Education Finance and Policy 8 (2): 121–167. doi:10.1162/EDFP_a_00088.

Chakrabarti, Rajashri. 2013b. Vouchers, public school response, and the role of incentives: Evidence from Florida. Economic Inquiry 51 (1): 500–526. doi:10.1111/j.1465-7295.2012.00455.x.

Chakrabarti, Rajashri. 2014. Incentives and responses under No Child Left Behind: Credible threats and the role of competition. Journal of Public Economics 110: 124–146. doi:10.1016/j.jpubeco.2013.08.005.

Chiang, Hanley. 2009. How accountability pressure on failing schools affects student achievement. Journal of Public Economics 93 (9): 1045–1057. doi:10.1016/j.jpubeco.2009.06.002.

Cullen, Julie Berry, and Randall Reback. 2006. Tinkering toward accolades: School gaming under performance accountability system. In Improving school accountability: Check-ups or choice, advances in applied microeconomics, vol. 14, edited by Timothy J. Gronberg and Dennis W. Jansen, pp. 1–34. Amsterdam: Elsevier Science. doi:10.1016/S0278-0984(06)14001-8.

Currie, Janet, Stefano Della Vigna, Enrico Moretti, and Vikram Pathania. 2010. The effect of fast food restaurants on obesity and weight gain. American Economic Journal: Economic Policy 2 (3): 32–63. doi:10.1257/pol.2.3.32.

Davidson, Elizabeth, Randall Reback, Jonah E. Rockoff, and Heather L. Schwartz. 2013. Fifty ways to leave a child behind: Idiosyncrasies and discrepancies in states' implementation of NCLB. NBER Working Paper No. 18988.

Dee, Thomas S., and Brian A. Jacob. 2010. The impact of No Child Left Behind on students, teachers, and schools. Brookings Papers on Economic Activity (Fall): 149–207. doi:10.1353/eca.2010.0014.

Dee, Thomas S., and Brian A. Jacob. 2011. The impact of No Child Left Behind on student achievement. Journal of Policy Analysis and Management 30 (3): 418–446. doi:10.1002/pam.20586.

Education Week. 2006. Quality counts at 10: A decade of standard-based education. Available at www.edweek.org/ew/toc/2006/01/05/index.html. Accessed 13 November 2012.

Figlio, David. 2006. Testing, crime and punishment. Journal of Public Economics 90 (4–5): 837–851. doi:10.1016/j.jpubeco.2005.01.003.

Figlio, David, and Helen Ladd. 2015. School accountability and student achievement. In Handbook of research in education finance and policy, edited by Helen F. Ladd and Margaret E. Goertz, pp. 166–182. New York: Routledge.

Figlio, David, and Cecilia Elena Rouse. 2006. Do accountability and voucher threats improve low-performing schools? Journal of Public Economics 90 (1–2): 239–255. doi:10.1016/j.jpubeco.2005.08.005.

Figlio, David, and Joshua Winicki. 2005. Food for thought: The effects of school accountability plans on school nutrition. Journal of Public Economics 89 (2–3): 381–394. doi:10.1016/j.jpubeco.2003.10.007.

Hoynes, Hilary W., and Diane Whitmore Schanzenbach. 2015. U.S. food and nutrition programs. NBER Working Paper No. 21057.

Jacob, Brian. 2005. Accountability, incentives and behavior: The impact of high-stakes testing in Chicago public schools. Journal of Public Economics 89 (5–6): 761–796. doi:10.1016/j.jpubeco.2004.08.004.

Jacob, Brian A., and Steven Levitt. 2003. Rotten apples: An investigation of the prevalence and predictors of teacher cheating. Quarterly Journal of Economics 118 (3): 843–877. doi:10.1162/00335530360698441.

Neal, Derek, and Diane Whitmore Schanzenbach. 2009. Left behind by design: Proficiency counts and test-based accountability. Review of Economics and Statistics 92 (2): 263–283. doi:10.1162/rest.2010.12318.

Ogden, Cynthia L., Margaret D. Carroll, Lester R. Curtin, Margaret A. McDowell, Carolyn J. Tabak, and Katherine M. Flegal. 2006. Prevalence of overweight and obesity in the United States, 1999–2004. Journal of the American Medical Association 295 (13): 1549–1555. doi:10.1001/jama.295.13.1549.

Reback, Randall. 2008. Teaching to the rating: School accountability and distribution of student achievement. Journal of Public Economics 92 (5): 1394–1415. doi:10.1016/j.jpubeco.2007.05.003.

Reback, Randall, Jonah Rockoff, and Heather L. Schwartz. 2014. Under pressure: Job security, resource allocation, and productivity in schools under No Child Left Behind. American Economic Journal: Economic Policy 6 (3): 207–241. doi:10.1257/pol.6.3.207.

Rockoff, Jonah, and Lesley J. Turner. 2010. Short-run impacts of accountability on school quality. American Economic Journal: Economic Policy 2 (4): 119–147. doi:10.1257/pol.2.4.119.

Rouse, Cecelia Elena, Jane Hannaway, Dan Goldhaber, and David Figlio. 2013. Feeling the Florida heat? How low-performing schools respond to voucher and accountability pressure. American Economic Journal: Economic Policy 5 (2): 251–281. doi:10.1257/pol.5.2.251.

Sass, Tim R., Jarod Apperson, and Carycruz Bueno. 2015. The long-run effects of teacher cheating on student outcomes: A report for the Atlanta public schools. Available at www.atlantapublicschools.us/crctreport. Accessed 1 October 2015.

Wong, Vivian C., Peter M. Steiner, and Thomas D. Cook. 2013. Analyzing regression-discontinuity designs with multiple assignment variables: A comparative study of four estimation methods. Journal of Educational and Behavioral Statistics 38 (2): 107–141. doi:10.3102/1076998611432172.

Yin, Lu. 2009. Are school accountability systems contributing to adolescent obesity? Unpublished paper, University of Florida.
