Abstract

Student absences are a potentially important, yet understudied, input in the educational process. Using longitudinal data from a nationally representative survey and rich administrative records from North Carolina, we investigate the relationship between student absences and academic performance. Generally, student absences are associated with modest but statistically significant decreases in academic achievement. The harmful effects of absences are approximately linear, and are two to three times larger among fourth and fifth graders in North Carolina than among kindergarten and first-grade students in the nationally representative Early Childhood Longitudinal Study. In both datasets, absences similarly reduce achievement in urban, rural, and suburban schools. In North Carolina, the harm associated with student absences is greater among both low-income students and English language learners, particularly for reading achievement. Also, in North Carolina, unexcused absences are twice as harmful as excused absences. Policy implications and directions for future research are discussed.

1.  Introduction

The achievement gap between students from different socioeconomic backgrounds has grown over the past several decades despite substantial efforts to close such gaps (Reardon 2011). Understanding the source(s) of the achievement gap is crucial to devising an appropriate policy response (Fryer and Levitt 2004). Student attendance is a potentially important, yet relatively understudied, input in the educational process: Absences disrupt learning, weaken schools’ and classrooms’ sense of community, and reduce students’ exposure to classroom instruction. By reducing student instructional time, student absences also undermine investments in school and teacher quality. Indeed, quasi-experimental research that exploits various sources of exogenous variation in instructional time consistently finds evidence of a significant, arguably causal relationship between instructional time and academic achievement (e.g., Pischke 2007; Marcotte and Hemelt 2008; Sims 2008; Marcotte and Hansen 2010; Fitzpatrick, Grissmer, and Hastedt 2011; Hansen 2011).

Accordingly, student absences potentially contribute to the achievement gap in two ways. First, absence rates are higher among socioeconomically disadvantaged students (Ready 2010; Morrissey, Hutchison, and Winsler 2014), so such students are exposed to the potentially harmful effects of absences more often. Second, absences may cause greater harm to students who reside in socioeconomically disadvantaged households because such households may be less able to compensate for lost instructional time than their more advantaged counterparts (Chang and Romero 2008). It is particularly important that policy makers and educators understand the consequences of primary school student absences, as children's sociobehavioral (i.e., noncognitive) skills are affected by their early environment (Heckman, Stixrud, and Urzua 2006) and problems of chronic absence and school disengagement manifest as early as first grade (Alexander, Entwisle, and Kabbani 2001; Schoeneberger 2012). Understanding the extent to which absences harm student achievement, how such effects vary across students and schools, and how absences contribute to the persistence of achievement gaps, will inform policy makers’ understanding of why school-based interventions have not closed the achievement gap. This will also help us to understand the likely benefits of interventions that aim to either reduce student absenteeism or assist students in “catching up” following an absence spell.

The relationship between student attendance and academic achievement is relatively understudied, particularly at the primary level (Ready 2010). Much of the existing literature on the relationship between attendance and academic performance is correlational and the few studies that have attempted to identify causal effects of absences have limited external validity because they focus on single urban districts (e.g., Gottfried 2009, 2011). There are two recent exceptions to this critique. First, using snowfall as an instrumental variable (IV) for absences, Goodman (2014) finds that absences are associated with relatively large, statistically significant reductions in the math achievement of third- through eighth-grade public school students in Massachusetts. Goodman also presents student fixed effects (FE) estimates of the effect of student absences on academic achievement, which are smaller in magnitude than the IV estimates, yet remain negative and statistically significant. Second, Aucejo and Romano (2016) use administrative data from North Carolina to identify the effect of absences on achievement by estimating three-way student, teacher, and school FE models. The authors find significant, negative effects of absences that are robust to conditioning on family-by-year FE and to instrumenting for student absences using data on county-level influenza cases.

We contribute to this emerging literature by investigating the influence of primary-school student absences on academic achievement using survey data from the nationally representative Early Childhood Longitudinal Study-Kindergarten Cohort (ECLS-K) and longitudinal student-level administrative data on the population of primary-school students in North Carolina's public schools. Our empirical strategy is to include student absences as a current input in value-added models (VAMs) of the education production function that condition on classroom FE. The inclusion of classroom FE is a potentially important methodological innovation of the current study, as they control for the nonrandom sorting of teachers across schools and classrooms, and classroom-specific shocks that jointly influence both absences and achievement (e.g., a flu epidemic or a particularly effective teacher) (Monk and Ibrahim 1984; Gershenson 2015). As a result, our estimates of absences’ effects on performance rely on within-classroom variation in student absences, holding past achievement constant.1 Additionally, we examine the functional form of the relationship between student absences and academic achievement, test for heterogeneity across students and absence type in the relationship between student absences and academic achievement, and provide the first formal analysis of whether the effect of absences varies between urban, rural, and suburban settings.
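To make the within-classroom logic concrete, the following is a minimal Python sketch and not the authors' specification. It assumes a lagged-score value-added regression and removes classroom FE by subtracting classroom means from every variable before running OLS. The function names, the toy data, and the "true" coefficients 0.6 and -0.01 are all hypothetical.

```python
from collections import defaultdict


def demean_within(groups, values):
    """Subtract each group's mean from the values of its members."""
    sums, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(groups, values):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for g, v in zip(groups, values)]


def within_classroom_ols(classroom, score, lagged_score, absences):
    """OLS of score on lagged score and absences after classroom demeaning.

    Demeaning by classroom is numerically equivalent to including a full set
    of classroom indicators. Returns (coef_lagged_score, coef_absences).
    """
    y = demean_within(classroom, score)
    x1 = demean_within(classroom, lagged_score)
    x2 = demean_within(classroom, absences)
    # Solve the 2x2 normal equations by Cramer's rule.
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


# Hypothetical toy data: two classrooms with different classroom-level shocks.
classroom = ["A", "A", "A", "B", "B", "B"]
lagged_score = [0.1, -0.4, 0.8, 0.0, 0.5, -0.7]
absences = [2, 10, 4, 6, 1, 12]
classroom_effect = {"A": 0.3, "B": -0.2}  # assumed, unobserved in practice
score = [
    0.6 * lag - 0.01 * a + classroom_effect[c]
    for c, lag, a in zip(classroom, lagged_score, absences)
]

b_lag, b_abs = within_classroom_ols(classroom, score, lagged_score, absences)
print(f"lagged score: {b_lag:.3f}, absences: {b_abs:.3f}")
```

Because the toy scores contain no noise, the sketch recovers the assumed coefficients even though the two classrooms differ in their unobserved shocks, which is the sense in which identification comes from within-classroom variation only.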

The ECLS-K and North Carolina data are complementary in that they each have unique strengths and weaknesses, which we discuss below. Thus, it is reassuring that the two datasets provide largely similar results, which suggests that the results generalize beyond the state of North Carolina.2 Namely, the harmful effects of student absences are statistically significant for math and reading and modest in size: A one standard deviation (SD) increase in absences is associated with decreases in reading and math achievement of 0.02 and 0.04 test-score SD, respectively. These effects are arguably practically significant, as they are similar in magnitude to the effect of a one SD increase in teacher absences (Clotfelter, Ladd, and Vigdor 2009; Herrmann and Rockoff 2012) and constitute about one third of the effect of a one SD increase in teacher effectiveness (Kane, Rockoff, and Staiger 2008; Hanushek and Rivkin 2010).

The paper proceeds as follows: Section 2 reviews the existing literature on student absences. Sections 3 and 4 describe the data and methods used in the current study, respectively. The results are reported in section 5. Section 6 concludes with a discussion of policy implications and directions for future research.

2.  Literature Review

If student absences harm achievement and disadvantaged students are absent at higher rates than their more advantaged counterparts, differential rates of student attendance may contribute to achievement gaps. A small number of studies have investigated the household-level correlates of primary-school student absences in the United States (e.g., Ready 2010; Morrissey, Hutchison, and Winsler 2014). Ready (2010) shows that household socioeconomic status (SES), as measured by an index composed of parents’ income, educational attainment, and occupational prestige, is strongly negatively correlated with student absences in the nationally representative ECLS-K. Romero and Lee (2008) note that children of young mothers are more likely to be chronically absent and a National Center for Education Statistics report (USDOE 2006) indicates that poor children are about 25 percent more likely than their wealthier peers to be absent three or more times per month.3

Gottfried (2009) provides a more nuanced analysis of the predictors of second through fourth graders’ absences in Philadelphia Public Schools by distinguishing between excused and unexcused absences. Perhaps unsurprisingly, Gottfried finds that as students’ total absences increase, so too does the percentage of absences that are unexcused. Similarly, Gottfried finds that students who either have behavioral problems or are eligible for reduced-price lunch programs experience significantly more total absences and unexcused absences, yet fewer excused absences. Together, these findings suggest that a sizable percentage of chronically absent students’ absences are discretionary and thus potentially avoidable, at least in urban, high-poverty districts such as Philadelphia.

From the standpoint of education policy, the importance of student absences depends upon the causal relationship between student absences and children's cognitive and social development. The early literature on the relationship between student attendance and academic performance largely focused on high school students. For example, Monk and Ibrahim's (1984) analysis of student-level data from one school in upstate New York generally found student absences to be negatively associated with performance on ninth-grade algebra exams. Absences may affect the educational achievement of older students differently than they affect elementary school students, however, for at least three reasons. First, parents may be less able to help older students make up more advanced work. Second, the underlying causes of absences may be different for older students. Third, elementary school students in self-contained classrooms may have an easier time making up missed work, as doing so requires coordinating with only one classroom teacher. Early empirical studies of the relationship between primary school attendance and academic performance used cross-sectional school-level data to show a negative and statistically significant correlation between schools’ average daily attendance and performance on standardized tests (e.g., Caldas 1993; Roby 2004).

More recently, scholars have recognized the benefits of using student-level longitudinal data to investigate the relationship between absences and academic performance among primary school students, because school-level analyses ignore potentially substantial within-school and within-student variation in absence rates (Ready 2010). Two of these studies were conducted by Gottfried (2009, 2011), who estimated lagged test score VAMs of the education production function that included student absences as contemporaneous inputs using data on second through fourth graders in Philadelphia Public Schools. Gottfried found that a one SD increase in absences lowered test scores by about one tenth of a test-score SD, students with higher ratios of excused to unexcused absences performed better, and conditioning on family FE slightly increased the estimated magnitude of absences’ effects on academic performance. Similarly, Noell et al. (2008) controlled for student absences in value-added analyses of teacher preparation programs in Louisiana and found a statistically significant negative coefficient on absences. Using the ECLS-K, Ready (2010) estimated growth-curve models of students’ academic performance in kindergarten and first grade, paying particular attention to the effects of absences, an SES index, and SES-absence interactions, and found a statistically significant negative relationship between absences and literacy development during kindergarten and first grade that was stronger among low-SES students.

Two recent studies provide the most convincing evidence to date of a causal relationship between student absences and academic achievement. Using administrative data from Massachusetts, Goodman (2014) begins by conditioning on both student and grade-by-year FE in linear regression models. The resulting estimates suggest that an additional absence lowers math and reading achievement by about 0.008 test-score SD. A limitation of this approach is that teachers cannot be linked to students in the Massachusetts data, so the student-FE estimates are potentially biased by omitted teacher effects that jointly affect absences and achievement. Thus, the author uses an IV strategy that exploits geographic and temporal variation in snowfall as a source of exogenous variation in student absences. The IV estimates are substantially larger, suggesting that an additional absence decreases math achievement by 0.05 test-score SD. The study by Aucejo and Romano (2016), which is most closely related to the current study, similarly uses state administrative data from North Carolina to generate both FE and IV estimates of the effect of student absences on achievement. Regarding the former, the authors estimate a three-way FE model that includes student, teacher, and school FE. Estimates of this, their preferred specification, suggest that a reduction of ten absences would improve math test scores by about 0.05 SD. Importantly, the authors show that this result is robust to either controlling for time-varying family FE or instrumenting for absences with flu data from North Carolina's Disease Event Tracking and Epidemiologic Collection Tool.
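The estimates quoted in this paragraph are stated in different units (per absence versus per ten absences). The short Python sketch below puts them on a common per-absence scale. It assumes the effect of absences is linear, so that a ten-absence effect can simply be divided by ten; the dictionary keys are illustrative labels, not the original authors' terminology.

```python
# Effects in test-score SD per additional absence, as quoted in the text.
# The Aucejo and Romano figure is reported per ten absences; dividing by ten
# assumes linearity, which is an assumption of this sketch.
per_absence_sd = {
    "Goodman (2014), student FE": 0.008,
    "Goodman (2014), snowfall IV, math": 0.05,
    "Aucejo and Romano (2016), three-way FE, math": 0.05 / 10,
}

for label, effect in per_absence_sd.items():
    print(f"{label}: {effect:.3f} SD per absence")

# How much larger is Goodman's IV estimate than his FE estimate?
iv_to_fe_ratio = (
    per_absence_sd["Goodman (2014), snowfall IV, math"]
    / per_absence_sd["Goodman (2014), student FE"]
)
print(f"IV / FE ratio: {iv_to_fe_ratio:.2f}")
```

On this scale the two fixed-effects estimates are of the same order of magnitude, while the snowfall IV estimate is several times larger.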

The current study contributes in several ways to the existing literature on the relationship between student absences and academic achievement in U.S. primary schools. First, we are the first to condition on classroom FE in addition to lagged achievement and unobserved student heterogeneity. Controlling for classroom FE is an important extension of the recent studies mentioned above, because there may be any number of unobservable classroom-level shocks that jointly predict student absences and student achievement (e.g., a particular mix of peers, an outbreak of a contagious disease, a particularly disruptive student, a particularly good match between teacher and students). Classroom FE also account for potential differences in how absences and tardies are reported across classrooms. Second, we are the first to simultaneously analyze rich longitudinal administrative data alongside nationally representative survey data to make inferences about the generalizability of the former. Third, we investigate several dimensions of potential heterogeneity in the effect of absences, such as by absence type and by students’ gender, grade level, English language proficiency, poverty status, special education classification, and prior achievement. We also test for heterogeneity by school locale, to determine whether absences harm student achievement in all school settings or only in urban districts. Finally, we investigate the functional form of the relationship between absences and performance by allowing for a nonparametric, nonlinear relationship between student absences and achievement.

3.  Data

The current study investigates the relationship between student absences and academic achievement using two complementary datasets, each with its own strengths and limitations. In this section we describe each in turn, and conclude by comparing the two.

ECLS-K Data

The ECLS-K is a longitudinal dataset collected by the National Center for Education Statistics (NCES). The original sample of approximately 21,400 children from about 1,000 schools was designed to be nationally representative of kindergartners during the 1998–99 academic year. Because some demographic groups were intentionally oversampled, we weight all subsequent analyses of the ECLS-K data using sampling weights provided by the NCES.4 The ECLS-K data include information collected from children, parents, teachers, and school administrators during the fall and spring of the kindergarten and first-grade academic years as well as the spring of third, fifth, and eighth grades. The primary analyses use both kindergarten waves and the spring first-grade wave of data, as test scores are available for the full sample of children around the beginning and end of kindergarten and the end of first grade. Students who experienced a mid-year classroom change, repeated a grade, or changed schools during either kindergarten or first grade are excluded from the analysis, as are students who are missing demographic, total absence, or test-score data. These exclusions result in a baseline analytic sample of 11,600 student-year observations.5

Importantly for the current study, the majority of schools surveyed by the ECLS-K reported administrative student-level attendance and school lateness (tardy) records in the spring survey waves and 7,500 student records in the baseline analytic sample (about 65 percent) distinguish between excused and unexcused absences. Missing data on excused versus unexcused absences in the ECLS-K is generally a school-level phenomenon. The absences survey instrument specifically asks that the student record form be completed after the last day of school, so ECLS-K attendance records contain students’ total absences for the entire school year. Unfortunately, the dates of specific absences are unobserved, which prevents restricting the analysis to absences that occurred prior to year-end tests or before the kindergarten fall assessment.6

The ECLS-K directly measured cognitive development by administering age-appropriate reading and mathematics tests in each wave of the survey. In kindergarten and first grade, math examinations tested children's abilities on the following subjects: numbers and shapes, relative size, ordinality and sequence, addition and subtraction, and multiplication and division. The reading examinations tested kindergartners and first-graders on letter recognition, beginning sounds, ending sounds, sight words, and words in context. Because achievement tests used a two-stage assessment approach, not all children took the same exam. Hence, the ECLS-K computed scaled-test scores based on the full set of test items using Item Response Theory (NCES 2002).

Beyond being nationally representative of the 1998–99 U.S. kindergarten cohort, the ECLS-K has a second advantage over state or district administrative data: it provides detailed information on the composition and characteristics of students’ households over and above what is typically found in administrative data. In addition to information on race/ethnicity, gender, poverty status, mother's education, the child's kindergarten redshirt status, urbanicity, whether the child spoke English at home, and whether the child had an individualized education program (IEP), the ECLS-K contains information on three household characteristics that may jointly predict academic achievement and school attendance: the number of adults living in the student's household, mother's employment status, and mother's marital status.7 For example, the presence of multiple household adults might increase achievement by providing additional tutoring support at home and increase absences by ensuring that an adult is available to care for children who do not attend school. Alternatively, the presence of multiple household adults may decrease absences by increasing the likelihood that someone is available to facilitate attendance. Similar arguments apply to mother's employment.

North Carolina Data

A limitation of the ECLS-K is that only a small number of students were sampled in most classrooms, which results in limited within-classroom variation in student absences with which to identify the relationship between student absences and student achievement. Accordingly, we augment analyses of the ECLS-K data with similar analyses of longitudinal administrative data on the population of third through fifth graders who attended North Carolina's public schools between the 2005–06 and 2009–10 school years. These student-level data are maintained and provided by the North Carolina Education Research Data Center (NCERDC).8 The NCERDC data contain administrative records on students’ race, gender, poverty status, urbanicity, limited English proficiency (LEP) status, whether the student had administratively classified math or reading learning disabilities, total absences, whether the absence was excused or unexcused, total tardies, student–classroom links, and end-of-grade math and reading test scores.9 The baseline analytic sample comprises fourth and fifth graders.

Students who experienced a mid-year classroom change, repeated a grade, or changed schools are excluded from the analysis, as are students who are missing total absence, test score, test date, or demographic data. These exclusions result in a sample of 903,314 student-year observations, which we subsequently refer to as the full sample. As in the ECLS-K, however, only about two thirds (634,013) of these student-year records distinguish between excused and unexcused absences, and again data on absence type are generally missing at the school level. Data on tardies are frequently missing as well, mostly at the school level for the 2008–09 and 2009–10 academic years, and are only available for 587,919 student-year observations. The distinction between excused and unexcused absences is made for all students for whom tardies are observed. Accordingly, we treat students for whom test-score, background characteristics, classroom identifiers, and absence and tardy data are observed as the baseline analytic sample, because models that exclude tardies will yield biased estimates if tardies are correlated with absences and influence achievement.10

Sample Characteristics

Table 1 provides summary statistics for the ECLS-K and North Carolina analytic samples. In this and all subsequent analyses, test scores in both datasets are standardized by subject, grade, and year to have a mean of zero and an SD of one. Standardized test-score means and SD are not precisely zero and one in the analytic samples because they were standardized using all available test scores.
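The standardization described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the record layout, the scale scores, and the use of the population SD (`pstdev`) are all hypothetical choices. The sketch also shows why an analytic subsample need not have a mean of exactly zero.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_cell(records):
    """Add a z-score computed within each (subject, grade, year) cell."""
    cells = defaultdict(list)
    for r in records:
        cells[(r["subject"], r["grade"], r["year"])].append(r["score"])
    # Each cell must contain some variation, otherwise its SD is zero.
    stats = {key: (mean(v), pstdev(v)) for key, v in cells.items()}
    out = []
    for r in records:
        m, s = stats[(r["subject"], r["grade"], r["year"])]
        out.append({**r, "z": (r["score"] - m) / s})
    return out


# Hypothetical scale scores for all tested students.
records = [
    {"id": 1, "subject": "math", "grade": 4, "year": 2006, "score": 340.0},
    {"id": 2, "subject": "math", "grade": 4, "year": 2006, "score": 350.0},
    {"id": 3, "subject": "math", "grade": 4, "year": 2006, "score": 360.0},
    {"id": 4, "subject": "math", "grade": 4, "year": 2006, "score": 370.0},
    {"id": 5, "subject": "math", "grade": 5, "year": 2006, "score": 350.0},
    {"id": 6, "subject": "math", "grade": 5, "year": 2006, "score": 355.0},
    {"id": 7, "subject": "math", "grade": 5, "year": 2006, "score": 365.0},
]
standardized = standardize_by_cell(records)
full_mean = mean(r["z"] for r in standardized)

# Suppose student 1 is dropped from the analytic sample (e.g., missing data).
analytic = [r for r in standardized if r["id"] != 1]
analytic_mean = mean(r["z"] for r in analytic)

print(f"full-sample mean z: {full_mean:.3f}")
print(f"analytic-sample mean z: {analytic_mean:.3f}")
```

Dropping a low-scoring student after standardizing shifts the analytic-sample mean above zero, which mirrors the positive test-score means reported for the analytic samples in table 1.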

Table 1. 
Descriptive Statistics of Analytic Samples
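| Variable | ECLS-K Mean | ECLS-K SD | ECLS-K N | North Carolina Mean | North Carolina SD | North Carolina N |
| --- | --- | --- | --- | --- | --- | --- |
| Standardized test scores | | | | | | |
| Math | 0.20 | 0.89 | 11,600 | 0.06 | 0.98 | 903,314 |
| Reading | 0.22 | 0.79 | 11,600 | 0.05 | 0.98 | 903,314 |
| Absences and tardies | | | | | | |
| Total absences | 7.98 | 9.54 | 11,600 | 6.22 | 5.66 | 903,314 |
| Within-classroom SD | | 6.88 | | | 5.42 | |
| Within-student SD | | 5.39 | | | 1.95 | |
| Total tardies | 3.33 | 7.17 | 11,600 | 1.94 | 5.37 | 587,919 |
| Excused absences | 6.63 | 6.43 | 7,500 | 3.40 | 4.23 | 634,013 |
| Unexcused absences | 1.80 | 7.95 | 7,500 | 2.35 | 3.32 | 634,013 |
| Chronic absence (18+) | 0.08 | | 11,600 | 0.04 | | 903,314 |
| First grade | 45.3% | | 11,600 | | | |
| Fifth grade | | | | 50.1% | | 903,314 |
| Child race/ethnicity | | | | | | |
| Non-Hispanic white | 70.7% | | 11,600 | 56.7% | | 903,314 |
| Non-Hispanic black | 12.6% | | 11,600 | 26.0% | | 903,314 |
| Hispanic | 10.2% | | 11,600 | 9.8% | | 903,314 |
| Other | 6.5% | | 11,600 | 7.5% | | 903,314 |
| Female | 50.8% | | 11,600 | 50.0% | | 903,314 |
| Below poverty level | 12.8% | | 11,600 | 47.2% | | 903,314 |
| No English at home/LEP | 4.7% | | 11,600 | 1.3% | | 903,314 |
| Student has an IEP | 5.8% | | 11,600 | | | |
| Math disability | | | | 1.5% | | 903,314 |
| Reading disability | | | | 3.0% | | 903,314 |
| Any learning disability | | | | 3.5% | | 903,314 |
| Kindergarten redshirt | 7.1% | | 11,600 | | | |
| Mother's education | | | | | | |
| No HS diploma | 8.0% | | 11,600 | | | |
| HS graduate | 30.0% | | 11,600 | | | |
| Some college | 35.3% | | 11,600 | | | |
| Bachelor's or more | 26.8% | | 11,600 | | | |
| Urban school | 31.0% | | 11,600 | 33.2% | | 903,314 |
| Rural school | 16.0% | | 11,600 | 45.8% | | 903,314 |
| Suburban school | 53.0% | | 11,600 | 21.0% | | 903,314 |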

Notes: ECLS-K means and standard deviations (SDs) are weighted by ECLS-K provided sampling weight C#CW0. Kindergarten and fourth grade are the omitted grade categories in the ECLS-K and North Carolina (NC) data, respectively. The ECLS-K asks whether English is spoken in the student's home. The NC data classifies children as having limited English proficiency (LEP). Individualized education plans (IEPs) identify students who have learning disabilities in the ECLS-K. The redshirt variable indicates whether the family of a kindergarten-aged child delayed entry into kindergarten. ECLS-K sample sizes are rounded to the nearest 50 to conform to NCES regulations.

The average student was absent six times in the North Carolina data and eight times in the ECLS-K, which is in line with average student absence rates in Massachusetts (Goodman 2014) and in southern Florida (Morrissey, Hutchison, and Winsler 2014), but notably smaller than the average of twelve absences per year in the predominantly black, low-income Philadelphia School District (Gottfried 2011). There is a sizable amount of variation in absences in both datasets, particularly in the ECLS-K, as seen in the estimated SD of about 9.5 and 5.7 in the ECLS-K and North Carolina data, respectively. To better understand the sources of this variation we also computed within-classroom and within-student SD.11 The within-classroom SDs indicate that about 72 percent and 95 percent of the variation in student absences in the ECLS-K and in North Carolina data, respectively, occur within, as opposed to between, classrooms. This is the variation in absences exploited by the preferred classroom fixed effects estimators. The within-student SDs indicate that only one half to one third of the total variation in student absences is the result of year-to-year changes in student attendance. This suggests that student attendance is somewhat “sticky,” a point to which we return when discussing the econometric model and threats to validity.
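The within-group SD used in this paragraph can be illustrated with a short Python sketch. It is one plausible way to compute the statistic (the SD of deviations from group means, with no degrees-of-freedom correction) and is not necessarily the authors' exact definition; the toy data and function name are hypothetical. The second half simply takes ratios of the rounded SDs reported in table 1, which come out close to the shares quoted above.

```python
from collections import defaultdict
from statistics import pstdev


def within_group_sd(groups, values):
    """SD of each value's deviation from its own group's mean."""
    sums, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(groups, values):
        sums[g] += v
        counts[g] += 1
    deviations = [v - sums[g] / counts[g] for g, v in zip(groups, values)]
    return pstdev(deviations)


# Hypothetical toy data: absences for four students in two classrooms.
toy_classrooms = ["a", "a", "b", "b"]
toy_absences = [1, 3, 10, 14]
toy_within = within_group_sd(toy_classrooms, toy_absences)
toy_total = pstdev(toy_absences)
print(f"toy within-classroom SD: {toy_within:.2f} of total {toy_total:.2f}")

# Rounded SDs of total absences as reported in table 1.
table1_sd = {
    "ECLS-K": {"total": 9.54, "within_classroom": 6.88, "within_student": 5.39},
    "North Carolina": {"total": 5.66, "within_classroom": 5.42, "within_student": 1.95},
}
ratios = {
    name: {
        "within_classroom": sd["within_classroom"] / sd["total"],
        "within_student": sd["within_student"] / sd["total"],
    }
    for name, sd in table1_sd.items()
}
for name, r in ratios.items():
    print(
        f"{name}: within-classroom {r['within_classroom']:.2f}, "
        f"within-student {r['within_student']:.2f}"
    )
```

Note that these are ratios of standard deviations rather than shares of variance; a variance decomposition would square each ratio and give smaller numbers.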

When information on the type of absence was collected, which is an important caveat, excused absences are more common in both datasets. Tardies are less frequent than absences in both datasets. Chronic absence, which we characterize as being absent eighteen or more times in a given school year, is twice as prevalent in the ECLS-K as in the North Carolina data.12

Table 1 also shows that the North Carolina data contain larger shares of black and low-income students than the nationally representative ECLS-K, which is to be expected given North Carolina's demographics. The ECLS-K and North Carolina analytic samples are approximately evenly split between kindergarteners and first graders, and fourth and fifth graders, respectively. About 7 percent of sampled kindergarteners are “redshirts” who delayed kindergarten entrance by one year. Boys and girls are equally represented in both datasets. About 5 percent of children in the ECLS-K reported not speaking English at home and about 1 percent of children in North Carolina were administratively classified as LEP. Intuitively, the latter rate is likely lower because some children of first-generation immigrants speak English proficiently but speak their parents’ native language at home. About 6 percent of students had an IEP in the ECLS-K and 3.5 percent of students were categorized as having a learning disability in either math or reading in North Carolina. Years are equivalent to grade levels in the ECLS-K, and are thus not reported in table 1, because the survey follows one cohort of children over time and grade skippers and repeaters are excluded from the analytic sample.

As discussed in the Introduction, differential rates of student absences may contribute to achievement gaps even if absences uniformly harm all students’ achievement. Tables 2 and 3 examine differences in student absence and tardy rates by grade level, gender, poverty status, English language proficiency, and learning disabilities in the ECLS-K and North Carolina data, respectively. Table 2 shows a small but statistically significant difference of about one additional absence for children who do not speak English at home and children who have an IEP, but no difference between boys and girls. The most striking difference in table 2 is by poverty status, as students living in households below the poverty line experienced nearly five more absences than their counterparts in households at or above the poverty line as defined by the ECLS-K. This is an arguably practically significant difference, which corresponds to about half the sample standard deviation reported in table 1. Similar patterns are observed for tardies, excused absences, and unexcused absences. It is worth noting that the distinction between excused and unexcused absences by SES may not be meaningful. Specifically, SES might be correlated with the probability that parents take the time to contact the school to officially excuse an absence. In table 3, the differences in total absences in the North Carolina data are all strongly statistically significant, which is at least partly due to the large sample size. However, the practical importance of these differences is limited, as the largest differences are about one absence per year.

Table 2. 
Conditional Descriptive Statistics of ECLS-K Analytic Sample
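| Group | Statistic | Total Absences | Total Tardies | Excused Absences | Unexcused Absences | Chronically Absent |
| --- | --- | --- | --- | --- | --- | --- |
| Kindergarten | Mean | 8.4 | 3.1 | 6.5 | 1.9 | 0.09*** |
| | SD | (11.0)*** | (7.1)*** | (6.4)** | (9.1) | |
| | N | 6,300 | 6,300 | 5,400 | 5,400 | 6,300 |
| First grade | Mean | 7.4 | 3.6 | 6.9 | 1.7 | 0.07 |
| | SD | (7.4) | (7.2) | (6.4) | (4.0) | |
| | N | 5,300 | 5,300 | 2,250 | 2,250 | 5,300 |
| Male | Mean | 7.9 | 3.2 | 6.5 | 1.8 | 0.07** |
| | SD | (9.6) | (6.8)* | (6.3)** | (8.6) | |
| | N | 5,650 | 5,650 | 3,750 | 3,750 | 5,650 |
| Female | Mean | 8.1 | 3.5 | 6.8 | 1.8 | 0.09 |
| | SD | (9.4) | (7.5) | (6.6) | (7.3) | |
| | N | 5,950 | 5,950 | 3,850 | 3,850 | 5,950 |
| At or above poverty level | Mean | 7.4 | 3.1 | 6.4 | 1.4 | 0.06*** |
| | SD | (8.0)*** | (6.8)*** | (6.0)*** | (6.1)*** | |
| | N | 10,200 | 10,200 | 6,600 | 6,600 | 10,200 |
| Below poverty level | Mean | 11.9 | 5.0 | 8.3 | 4.3 | 0.19 |
| | SD | (15.9) | (9.2) | (8.2) | (14.6) | |
| | N | 1,400 | 1,400 | 1,000 | 1,000 | 1,400 |
| Speaks English at home | Mean | 7.9 | 3.3 | 6.6 | 1.8 | 0.08** |
| | SD | (9.4)*** | (7.1) | (6.4)*** | (7.8) | |
| | N | 10,900 | 10,900 | 7,150 | 7,150 | 10,900 |
| No English at home | Mean | 9.2 | 3.9 | 8.1 | 2.3 | 0.12 |
| | SD | (12.3) | (8.3) | (7.3) | (10.8) | |
| | N | 700 | 700 | 450 | 450 | 700 |
| No IEP | Mean | 7.9 | 3.3 | 6.6 | 1.8 | 0.08*** |
| | SD | (9.6)** | (7.2) | (6.4) | (8.1) | |
| | N | 11,000 | 11,050 | 7,200 | 7,200 | 11,000 |
| IEP | Mean | 8.9 | 3.4 | 7.1 | 1.9 | 0.12 |
| | SD | (7.9) | (6.2) | (6.4) | (4.9) | |
| | N | 600 | 600 | 400 | 400 | 600 |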

Notes: Means and standard deviations (SDs) are weighted by ECLS-K provided sampling weight C#CW0. Sample sizes are rounded to the nearest 50 to conform to NCES regulations. Mean difference t-tests were performed to compare kindergartners and first graders, males and females, students above and below poverty line, students who do and do not speak English at home, and students without and with individualized education plans (IEPs). Chronically absent is defined as 18 or more annual absences.

*p < 0.1; **p < 0.05; ***p < 0.01.

Table 3. 
Conditional Descriptive Statistics of North Carolina Analytic Sample
| | | Total Absences | Total Tardies | Excused Absences | Unexcused Absences | Chronically Absent |
| --- | --- | --- | --- | --- | --- | --- |
| Fourth grade | Mean | 6.2 | 2.0 | 3.4 | 2.3 | 0.04*** |
| | SD | (5.6)*** | (5.5)* | (4.2)** | (3.3)*** | |
| | N | 450,714 | 292,838 | 315,753 | 315,753 | 450,714 |
| Fifth grade | Mean | 6.3 | 1.9 | 3.4 | 2.4 | 0.04 |
| | SD | (5.8) | (5.2) | (4.3) | (3.4) | |
| | N | 452,600 | 295,081 | 318,260 | 318,260 | 452,600 |
| Male | Mean | 6.3 | 1.9 | 3.4 | 2.4 | 0.04*** |
| | SD | (5.7)*** | (5.3) | (4.3)*** | (3.4)*** | |
| | N | 451,895 | 293,735 | 317,042 | 317,042 | 451,895 |
| Female | Mean | 6.1 | 1.9 | 3.4 | 2.3 | 0.04 |
| | SD | (5.6) | (5.4) | (4.2) | (3.2) | |
| | N | 451,419 | 294,184 | 316,971 | 316,971 | 451,419 |
| At or above poverty level | Mean | 5.7 | 1.8 | 3.4 | 1.8 | 0.03*** |
| | SD | (5.0)*** | (5.1)** | (4.1) | (2.7)*** | |
| | N | 476,775 | 313,601 | 346,220 | 346,220 | 476,775 |
| Below poverty level | Mean | 6.8 | 2.1 | 3.4 | 3.0 | 0.06 |
| | SD | (6.3) | (5.7) | (4.4) | (3.9) | |
| | N | 426,539 | 274,318 | 287,793 | 287,793 | 426,539 |
| Not LEP | Mean | 6.2 | 2.0 | 3.4 | 2.3 | 0.04*** |
| | SD | (5.7)*** | (5.4)*** | (4.2)*** | (3.3)*** | |
| | N | 891,125 | 578,207 | 622,961 | 622,961 | 891,125 |
| LEP | Mean | 5.3 | 1.3 | 2.5 | 2.6 | 0.03 |
| | SD | (5.0) | (3.9) | (3.4) | (3.5) | |
| | N | 12,189 | 9,712 | 11,052 | 11,052 | 12,189 |
| Learning disability | Mean | 7.3 | 2.2 | 3.8 | 2.9 | 0.07*** |
| | SD | (6.4)*** | (6.0)** | (4.6)*** | (3.9)*** | |
| | N | 31,807 | 22,017 | 24,544 | 24,544 | 31,807 |
| No learning disabilities | Mean | 6.2 | 1.9 | 3.4 | 2.3 | 0.04*** |
| | SD | (5.6) | (5.3) | (4.2) | (3.3) | |
| | N | 871,507 | 565,902 | 609,469 | 609,469 | 871,507 |

Notes: Mean difference t-tests were performed to compare fourth and fifth graders, males and females, students above and below poverty line, students without and with limited English proficiency (LEP), and students with and without learning disabilities. Chronically absent is defined as 18 or more annual absences.

*p < 0.1; **p < 0.05; ***p < 0.01.

4.  Empirical Strategy

We investigate the relationship between student absences and academic achievement by including absences as a contemporaneous input in VAMs of the education production function. Intuitively, VAMs exploit longitudinal student data by using lagged test scores to proxy for the unobserved histories of educational and familial inputs received by each child. Todd and Wolpin (2003), Harris, Sass, and Semykina (2014), and Guarino, Reckase, and Wooldridge (2015) provide thorough discussions of the empirical difficulties created by a lack of data on historical inputs, derivations of lagged test-score VAM specifications from a structural education production function, and the assumptions required for consistent estimation of various VAM specifications. Following Guarino, Reckase, and Wooldridge (2015), our baseline model of the spring test score (y) of student i, in classroom j, in time period t is
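$$y_{ijt} = \alpha\, y_{i,t-1} + f(A_{it}) + \mathbf{x}_{it}\boldsymbol{\beta} + \eta_{j} + u_{ijt} \qquad (1)$$
where y_{i,t−1} is the lagged test score, with coefficient α; f(A) is a general function of absences; x is a vector of student and household characteristics summarized in table 1, some of which vary over time, with coefficient vector β; η is a classroom FE; and u is a composite error term that contains student i's time-invariant unobserved ability and idiosyncratic shocks to achievement.13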

The year, grade, teacher, and school FE commonly included in VAMs are subsumed by the classroom FE, which are crucial to our identification strategy. Specifically, classroom FE control for nonrandom sorting of teachers across schools and classrooms, classroom-specific shocks that jointly influence both absences and achievement (e.g., a flu epidemic or a particularly effective teacher; see Monk and Ibrahim 1984 or Gershenson 2015), and potential differences across classrooms in how absences and tardies are coded. As a result, our estimates of absences’ effects on performance rely on within-classroom variation in student absences, holding past achievement constant. Standard errors are clustered by school district. Because schools are nested within districts and the analytic sample is restricted to students who did not change schools during the study's time period, this makes statistical inference robust to arbitrary heteroskedasticity and arbitrary serial correlation within districts, schools, and students over time (Angrist and Pischke 2009).14 Ordinary least squares (OLS) is our preferred estimator of equation 1 for two reasons: Guarino, Reckase, and Wooldridge (2015) find this approach the most robust to a variety of potential nonrandom student–teacher assignment scenarios, and recent research suggests that simply conditioning on lagged achievement adequately controls for the sorting of students into classrooms (e.g., Chetty, Friedman, and Rockoff 2014). In the sensitivity analysis, however, we also consider first-differenced (FD) estimates of equation 1 that remove unobserved, time-invariant student heterogeneity from the model.
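
The within-classroom logic described above can be illustrated with a minimal sketch. It is not the authors' code: the data are simulated, the function name `within_classroom_ols` and all parameter values are hypothetical, and clustered standard errors are omitted. Demeaning every variable within classroom is numerically equivalent to including a full set of classroom indicators, so the slope on absences is identified only from variation among classmates.

```python
import numpy as np

def within_classroom_ols(y, X, classroom):
    """OLS with classroom fixed effects, computed by demeaning within classroom."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    _, idx = np.unique(classroom, return_inverse=True)
    counts = np.bincount(idx)

    def demean(v):
        # Subtract each classroom's mean from its members' values.
        means = np.bincount(idx, weights=v) / counts
        return v - means[idx]

    y_w = demean(y)
    X_w = np.column_stack([demean(X[:, k]) for k in range(X.shape[1])])
    coef, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return coef

# Simulated illustration (all values are assumptions, not estimates from the paper).
rng = np.random.default_rng(0)
n_classrooms, class_size = 1000, 25
n = n_classrooms * class_size
classroom = np.repeat(np.arange(n_classrooms), class_size)
classroom_effect = rng.normal(0.0, 0.5, n_classrooms)[classroom]
lag_score = rng.normal(0.0, 1.0, n)
# Classroom-level shocks move absences and achievement together,
# which is the confounding that classroom FE are meant to absorb.
absences = rng.poisson(np.exp(1.7 - 0.3 * classroom_effect))
true_alpha, true_delta = 0.7, -0.007
score = (true_alpha * lag_score + true_delta * absences
         + classroom_effect + rng.normal(0.0, 0.3, n))

alpha_hat, delta_hat = within_classroom_ols(
    score, np.column_stack([lag_score, absences]), classroom
)
print(f"lag score: {alpha_hat:.3f}, absences: {delta_hat:.4f}")
```

In this simulation the classroom shock is correlated with absences by construction, so a pooled regression without classroom FE would attribute part of the classroom effect to absences; the demeaned regression does not.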

With the VAM specification and estimator chosen, a related question concerns the functional form of the relationship between absences and achievement. For example, the effect of absences may be nonlinear either because absences below some minimal threshold are relatively harmless or because the effect is cumulative. Similarly, the effect of absences may vary by absence type (Gottfried 2009) or by observed student characteristics. Households likely vary by SES in their ability to support “catch up” following an absence spell (Chang and Romero 2008; Ready 2010), girls may have stronger noncognitive skills (Jacob 2002; Bertrand and Pan 2013), and teachers may struggle to assist exceptional students (i.e., students with disabilities and English language learners) in catching up following absence spells (Jones, Buzick, and Turkan 2013). Finally, we allow for the relationship between absences and student achievement to vary by school locale, as schools might vary in their ability to facilitate catch up following an absence spell. Incorrectly assuming that f(A) is linear or failing to properly model heterogeneity in the effect of student absences on achievement may obfuscate the empirical relationship between student absences and achievement. Accordingly, we test for potential nonlinearities and heterogeneities by considering quadratic and nonparametric specifications of f(A), and by interacting A with the subset of x described above.
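
For concreteness, the alternative forms of f(A) named in this paragraph can be written out as follows. The coefficient symbols (δ, γ), the top-code K, and the label z for the interacted subset of x are notation assumed here for illustration; they are not taken from the original text.

```latex
\begin{align*}
\text{Linear:}        \quad & f(A_{it}) = \delta A_{it} \\
\text{Quadratic:}     \quad & f(A_{it}) = \delta_1 A_{it} + \delta_2 A_{it}^{2} \\
\text{Nonparametric:} \quad & f(A_{it}) = \sum_{k=1}^{K-1} \delta_k \,\mathbf{1}\{A_{it} = k\}
                              + \delta_K \,\mathbf{1}\{A_{it} \ge K\} \\
\text{Interacted:}    \quad & f(A_{it}) = \delta A_{it} + A_{it}\,\mathbf{z}_{it}\boldsymbol{\gamma}
\end{align*}
```

In the nonparametric form, zero absences is the omitted category, so each δ_k measures achievement relative to a student with no absences. In the interacted form, δ is the effect for a student whose elements of z are all zero, and each element of γ shifts that effect for the corresponding group.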

5.  Results

Main Results

Table 4 reports baseline estimates of the effect of student absences on math and reading achievement that use the preferred linear specification of f(A). The first four columns of table 4 report estimates for the kindergarteners and first graders in the ECLS-K. Columns 1 and 2 suggest that an additional student absence is associated with statistically significant 0.002 test-score SD reductions in both math and reading achievement. These estimates are smaller than those found in many state- and district-level administrative datasets (e.g., Gottfried 2011; Aucejo and Romano 2016; Goodman 2014), and they are also less precisely estimated, perhaps because of the substantially smaller sample size. The estimated effects of tardies are similar in magnitude to, and statistically indistinguishable from, the estimated effects of absences. Columns 3 and 4 of table 4 examine whether excused and unexcused absences differentially affect student achievement using the subsample of ECLS-K students for whom this information is available. Somewhat surprisingly, the point estimates on excused absences are larger than those on unexcused absences for both math and reading achievement, though neither difference is statistically significant at conventional levels. The estimates by type of absence are less precise than those from the model that assumes a homogeneous effect across all types of absences, with estimated standard errors that are sometimes twice as large. This is likely because of the 30 percent smaller sample of students for whom excused and unexcused absences are identified.

Table 4. 
Baseline Estimates of Absences’ Effect on Student Achievement
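| | ECLS-K: Math (1) | ECLS-K: Reading (2) | ECLS-K: Math (3) | ECLS-K: Reading (4) | North Carolina: Math (5) | North Carolina: Reading (6) | North Carolina: Math (7) | North Carolina: Reading (8) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Total absences | −0.002 | −0.002 | | | −0.007 | −0.004 | | |
| | (0.001)** | (0.001)** | | | (0.0002)*** | (0.0001)*** | | |
| Total tardies | −0.001 | −0.003 | −0.001 | −0.002 | −0.001 | −0.001 | −0.001 | −0.001 |
| | (0.001) | (0.001)*** | (0.001) | (0.001)** | (0.0001)*** | (0.0001)*** | (0.0001)*** | (0.0002)*** |
| Differential effect t-test | p = 0.40 | p = 0.30 | | | p < 0.0001 | p < 0.0001 | | |
| Excused absences | | | −0.003 | −0.002 | | | −0.005 | −0.002 |
| | | | (0.002)* | (0.001)** | | | (0.0002)*** | (0.0002)*** |
| Unexcused absences | | | −0.002 | −0.001 | | | −0.010 | −0.006 |
| | | | (0.001)* | (0.001) | | | (0.0003)*** | (0.0003)*** |
| Differential effect t-test | | | p = 0.68 | p = 0.36 | | | p < 0.0001 | p < 0.0001 |
| Adjusted R2 | 0.44 | 0.47 | 0.46 | 0.49 | 0.66 | 0.60 | 0.66 | 0.60 |
| N | 11,600 | 11,600 | 7,500 | 7,500 | 587,919 | 587,919 | 587,919 | 587,919 |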

Notes: Columns 1–4 are weighted by ECLS-K provided weight, C#CW0. ECLS-K sample sizes are rounded to the nearest 50. Each model controls for lagged achievement, classroom fixed effects, child race/ethnicity, child gender, poverty status, English speaking status, and individualized education plans (IEPs). Columns 1–4 control for ECLS-K test dates, child redshirt status, and maternal education. Standard errors are robust to clustering at the school level.

*p < 0.1; **p < 0.05; ***p < 0.01.

Columns 5 through 8 of table 4 report similar estimates for fourth and fifth graders in North Carolina. Columns 5 and 6 suggest that an additional student absence is associated with statistically significant 0.007 and 0.004 test-score SD reductions in math and reading achievement, respectively. These estimates are similar in magnitude to Aucejo and Romano's (2016) preferred estimates of three-way FE models using the same North Carolina data, as well as to Goodman's (2014) student-FE estimates in Massachusetts. The math estimate is only about half as large as Gottfried's (2011) school- and family-FE estimates in Philadelphia, however, perhaps because the harm of absences is greater among the low-income and racial minority students who constitute the majority of Philadelphia's public school enrollments. We test for the presence of such heterogeneous effects in both the ECLS-K and North Carolina (see below).

Columns 7 and 8 of table 4 show that in North Carolina, unexcused absences are two to three times more harmful than excused absences, and these differences are strongly statistically significant. Although the North Carolina estimates are larger and more precisely estimated than in the ECLS-K, both datasets provide compelling evidence that student absences are associated with lower levels of academic performance.

Nonlinearities in the Relationship between Student Absences and Achievement

The estimates reported in table 4 are potentially misleading if the true relationship between student absences and academic achievement is nonlinear. We investigate this possibility by plotting the conditional relationships between achievement and absences in North Carolina generated by three specifications of f(A): the linear specification used in table 4 and two nonlinear specifications.15 The linear specification, indicated by the dotted lines in panels A and B of figure 1, yields straight lines whose slopes equal the estimated absence coefficients in columns 5 and 6 of table 4. The nonlinear specifications include a parametric quadratic function and a nonparametric step function that omits zero absences as the benchmark, includes a unique indicator for each integer of absences between 1 and 25, and is top-coded at 26 or more absences.16 Points along each of the three lines can be interpreted as the reduction in achievement attributable to x absences, relative to having zero absences.
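
A minimal sketch of how the regressors for the two nonlinear specifications could be constructed is given below. The function names are hypothetical, and the only details taken from the text are the omitted zero-absence category, the integer indicators, and the top-code at 26.

```python
import numpy as np

def absence_step_indicators(absences, top_code=26):
    """One indicator per absence count 1..top_code; zero absences is the
    omitted reference category and counts above top_code are top-coded."""
    a = np.minimum(np.asarray(absences, dtype=int), top_code)
    levels = np.arange(1, top_code + 1)
    return (a[:, None] == levels[None, :]).astype(float)

def quadratic_terms(absences):
    """Columns A and A^2 for the parametric quadratic specification."""
    a = np.asarray(absences, dtype=float)
    return np.column_stack([a, a ** 2])

# Students with 0, 1, 18, 26, and 40 absences.
D = absence_step_indicators([0, 1, 18, 26, 40])
Q = quadratic_terms([2, 3])
print(D.shape)        # one row per student, 26 indicator columns
print(D.sum(axis=1))  # the zero-absence student has no indicator switched on
```

Either matrix would replace the single absences column in the baseline regression. The coefficient on the k-th indicator is then read as the achievement difference between students with k absences and students with none.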

Figure 1.

A. Effect of Absences on Math Achievement in North Carolina. B. Effect of Absences on Reading Achievement in North Carolina.

Notes: N = 587,919. Estimates come from baseline models that condition on lagged achievement, tardies, classroom fixed effects, and observed student characteristics. The quadratic terms are jointly statistically significant in both subjects, but the squared term is only individually statistically significant for math. The 26 nonparametric indicators are jointly and individually statistically significant. In the nonparametric specifications 0 absences is the omitted reference category and “26 absences” is top-coded to include students with 26 or more absences.

Panels A and B of figure 1 present these results for math and reading achievement in North Carolina, respectively. Interestingly, for both subjects, the nonlinear specifications closely follow the linear specification. This suggests that the relationship between student absences and academic achievement is approximately linear over the observed range of absences. Moreover, the nonparametric estimates provide no evidence of a discontinuity in the relationship between absences and achievement at eighteen absences, which is the most commonly used definition of chronic absence (e.g., Balfanz and Byrnes 2012). Thus, although chronically absent students score significantly lower than their peers who are rarely absent, the linear, cumulative effect of absences implies only marginal differences in achievement between students just below and just above the eighteen-absence threshold (e.g., between students with sixteen or seventeen absences and students with nineteen or twenty absences). Nonetheless, high levels of absences are associated with significantly lower levels of achievement and likely contribute to the achievement gap. As tables 2 and 3 show, low-income students are two to three times more likely to be chronically absent than their non-poor counterparts.
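
The point about the chronic-absence threshold can be made concrete with a short back-of-the-envelope calculation. It assumes the linear North Carolina math coefficient from column 5 of table 4 (−0.007 test-score SD per absence) applies at every absence count; the function name is hypothetical.

```python
MATH_EFFECT_PER_ABSENCE = -0.007  # table 4, column 5 (test-score SD per absence)

def linear_effect(absences, per_absence=MATH_EFFECT_PER_ABSENCE):
    """Implied achievement difference relative to zero absences under a linear f(A)."""
    return per_absence * absences

# Crossing the eighteen-absence threshold: 19 versus 17 absences.
gap_near_threshold = linear_effect(19) - linear_effect(17)
# Being at the threshold versus never being absent.
gap_vs_zero = linear_effect(18)

print(f"19 vs. 17 absences: {gap_near_threshold:.3f} SD")
print(f"18 vs. 0 absences:  {gap_vs_zero:.3f} SD")
```

Under this assumption, moving across the threshold changes predicted math achievement by about 0.014 SD, whereas a student with eighteen absences is predicted to score about 0.126 SD below an otherwise similar student with none.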

Heterogeneous Effects of Student Absences

Table 5 tests for heterogeneity in the relationship between student absences and academic achievement by observable student characteristics and school locale. Specifically, table 5 reports estimates of augmented versions of the baseline linear specification that interact student absences with six observed student characteristics and two geographic locale indicators: grade level, poverty status, gender, an English as a second language indicator, a learning disability indicator, lagged achievement, and rural and urban school indicators (suburban is the omitted reference category). Columns 1 and 2 report estimates for math and reading achievement, respectively, in the ECLS-K. The IEP interaction term in column 1 is negative and relatively large in magnitude, but imprecisely estimated. The lagged achievement interaction terms are positive and statistically significant for both subjects, suggesting that current absences are less harmful for high-achieving students. Finally, the school locale interaction terms are statistically indistinguishable from zero, suggesting that the average relationship between student absences and achievement is similar in urban, rural, and suburban schools.17

Table 5. 
Heterogeneity in Absences’ Effect on Student Achievement
| | ECLS-K: Math (1) | ECLS-K: Reading (2) | North Carolina: Math (3) | North Carolina: Reading (4) |
| --- | --- | --- | --- | --- |
| Total absences (TA) | −0.003* | −0.003** | −0.007 | −0.004 |
| | (0.002) | (0.001) | (0.000)*** | (0.0004)*** |
| First (fifth) grade × TA | −0.001 | 0.002 | −0.0003 | 0.001 |
| | (0.002) | (0.002) | (0.0003) | (0.0004)** |
| Female × TA | 0.001 | −0.000 | 0.0004 | 0.0005 |
| | (0.001) | (0.001) | (0.0003) | (0.0003) |
| Poverty × TA | 0.001 | −0.001 | −0.0006 | −0.001 |
| | (0.002) | (0.001) | (0.0003)* | (0.0003)*** |
| Does not speak English × TA | 0.001 | 0.001 | −0.002 | −0.004 |
| | (0.002) | (0.002) | (0.001) | (0.002)*** |
| Student has an IEP × TA | −0.004 | 0.003 | 0.002 | 0.001 |
| | (0.004) | (0.003) | (0.001)* | (0.001) |
| Lagged score × TA | 0.002* | 0.004*** | −0.001 | −0.001 |
| | (0.001) | (0.001) | (0.0002)*** | (0.0002)*** |
| Urban school × TA | 0.002 | 0.001 | −0.0003 | −0.0001 |
| | (0.002) | (0.002) | (0.0003) | (0.0004) |
| Rural school × TA | −0.004 | −0.002 | 0.0003 | 0.0005 |
| | (0.003) | (0.002) | (0.0004) | (0.0003) |
| Joint significance of interactions | p = 0.16 | p < 0.001 | p < 0.001 | p < 0.001 |
| Adjusted R2 | 0.44 | 0.47 | 0.67 | 0.60 |
| N | 11,600 | 11,600 | 587,919 | 587,919 |

Notes: Columns 1 and 2 are weighted by ECLS-K provided weight, C#CW0. Each model controls for classroom fixed effects, child race/ethnicity, child gender, poverty status, English speaking status, and individualized education plans (IEPs). Columns 1 and 2 control for ECLS-K test dates, tardies, child redshirt status, and maternal education. Standard errors are robust to clustering at the school level.

*p < 0.1; **p < 0.05; ***p < 0.01.

Columns 3 and 4 of table 5 report the corresponding estimates for North Carolina, where several of the interaction terms are individually statistically significant. Notably, the poverty interactions are negative and statistically significant for both math and reading achievement. Specifically, absences are about 25 percent more harmful to the reading achievement of low-income students than they are to more affluent students. That the differential effect is larger for reading is consistent with the hypothesis that although reading skills are primarily developed at home (Currie and Thomas 2001), they are more effectively developed in high-SES households that are able to invest more time reading to children (Baydar and Brooks-Gunn 1991; Guryan, Hurst, and Kearney 2008), whereas low-income households struggle to compensate for the lost instructional time caused by student absences. Intuitively, the harmful effect of absences on LEP students’ reading achievement is even stronger, as the interaction effect of –0.004 is strongly statistically significant and suggests that the effect of absences on LEP students’ reading achievement is more than twice as large as the effect of absences on native speakers’ reading achievement. The lagged-score interaction terms are negative and statistically significant, suggesting that absences are marginally more harmful to previously high-achieving students. This is one of the few instances in which the ECLS-K and North Carolina data yield contradictory results. Finally, as in the ECLS-K analysis, the school-locale interaction terms are small and statistically insignificant, suggesting that absences reduce student achievement in all school settings. Together, the results presented in table 5 confirm the general finding of table 4 that, on average, there is a negative, statistically significant effect of student absences on achievement. Moreover, these results suggest that absences are particularly harmful to two subsets of vulnerable students: low-income and LEP students.
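
The relative magnitudes quoted above follow from adding each interaction coefficient to the main effect. The sketch below uses the rounded North Carolina reading coefficients printed in column 4 of table 5, so the ratios are approximate; the function name is hypothetical, and the comparison holds the other interacted characteristics at their reference values.

```python
# Rounded coefficients from table 5, column 4 (North Carolina, reading).
BASE_EFFECT = -0.004          # total absences (TA)
POVERTY_INTERACTION = -0.001  # poverty x TA
LEP_INTERACTION = -0.004      # does not speak English x TA

def relative_harm(base, interaction):
    """Ratio of a subgroup's implied per-absence effect to the reference effect."""
    return (base + interaction) / base

poverty_ratio = relative_harm(BASE_EFFECT, POVERTY_INTERACTION)
lep_ratio = relative_harm(BASE_EFFECT, LEP_INTERACTION)

print(f"low-income vs. reference: {poverty_ratio:.2f}x")
print(f"LEP vs. reference:        {lep_ratio:.2f}x")
```

With these rounded inputs the low-income ratio is 1.25, matching the 25 percent figure, and the LEP ratio is about 2. Unrounded coefficients would be needed to reproduce the exact ratios.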

The models estimated in table 5 restrict the sources of student-level heterogeneity to be the same in all types of schools. We relax this assumption in table 6 by estimating the baseline interactions model separately by school locale. Doing so furthers our understanding of the potentially nuanced relationship between student absences and academic achievement, and highlights the usefulness of the nationally representative survey and state administrative data analyzed in the current study. Consistent with the results presented in table 5, table 6 shows that the average effect of a student absence is similar in size across geographic locales. Again, this suggests that student absences harm achievement in all schools, not only the disadvantaged urban districts that were the focus of much previous research on student absences. Interestingly, column 2 shows that in the nationally representative ECLS-K sample, absences are particularly harmful to low-income students in rural districts, whereas columns 4 and 5 show that this is true in both urban and rural districts in North Carolina. This suggests that low-income students are disproportionately harmed by absences in multiple school contexts, and not just in urban settings.

Table 6. 
Heterogeneity in Absences’ Effect on Student Achievement by School Type
| | ECLS-K: Urban (1) | ECLS-K: Rural (2) | ECLS-K: Suburban (3) | North Carolina: Urban (4) | North Carolina: Rural (5) | North Carolina: Suburban (6) |
| --- | --- | --- | --- | --- | --- | --- |
| A. Math | | | | | | |
| Total absences (TA) | −0.003 | −0.004 | −0.004** | −0.006 | −0.006 | −0.007 |
| | (0.002) | (0.003) | (0.002) | (0.001)*** | (0.000)*** | (0.001)*** |
| 1st (5th) grade × TA | 0.000 | −0.003 | −0.001 | −0.001 | 0.0001 | −0.001 |
| | (0.003) | (0.005) | (0.003) | (0.001) | (0.0004) | (0.001) |
| Female × TA | 0.004* | 0.000 | 0.002 | 0.001 | 0.001 | 0.0002 |
| | (0.002) | (0.001) | (0.002) | (0.001) | (0.0004) | (0.0005) |
| Poverty × TA | 0.001 | −0.005** | 0.006* | −0.001 | −0.0004 | −0.0001 |
| | (0.003) | (0.002) | (0.003) | (0.001)** | (0.0004) | (0.001) |
| LEP × TA | 0.005* | −0.005* | −0.002 | −0.002 | −0.002 | −0.002 |
| | (0.003) | (0.003) | (0.004) | (0.002) | (0.002) | (0.003) |
| IEP × TA | −0.012 | −0.011 | −0.002 | 0.006 | −0.0004 | −0.001 |
| | (0.009) | (0.008) | (0.006) | (0.001)*** | (0.002) | (0.003) |
| Lagged score × TA | 0.001 | −0.000 | 0.003** | −0.001 | −0.001 | −0.001 |
| | (0.002) | (0.003) | (0.002) | (0.0003)*** | (0.0003)*** | (0.0003)** |
| B. Reading | | | | | | |
| Total absences (TA) | −0.002 | −0.004 | −0.004** | −0.004 | −0.003 | −0.003 |
| | (0.002) | (0.003) | (0.002) | (0.001)*** | (0.0001)*** | (0.001)*** |
| 1st (5th) grade × TA | −0.000 | 0.003 | 0.003 | 0.0003 | 0.001 | 0.0004 |
| | (0.003) | (0.004) | (0.002) | (0.001) | (0.001)** | (0.001) |
| Female × TA | 0.001 | −0.001 | −0.000 | 0.001 | 0.001 | −0.0002 |
| | (0.002) | (0.001) | (0.002) | (0.001)* | (0.0004) | (0.001) |
| Poverty × TA | −0.002 | −0.003** | 0.001 | −0.002 | −0.001 | −0.001 |
| | (0.003) | (0.001) | (0.002) | (0.001)** | (0.0004)*** | (0.001) |
| LEP × TA | 0.004 | −0.001 | 0.000 | −0.003 | −0.004 | −0.009 |
| | (0.005) | (0.002) | (0.003) | (0.003) | (0.002)** | (0.004)** |
| IEP × TA | 0.001 | −0.006 | 0.006* | 0.003 | 0.001 | −0.001 |
| | (0.007) | (0.007) | (0.003) | (0.002) | (0.001) | (0.001) |
| Lagged score × TA | 0.004** | 0.003 | 0.004*** | −0.001 | −0.001 | −0.001 |
| | (0.002) | (0.002) | (0.001) | (0.0003) | (0.0003)** | (0.0003)** |
| N | 4,100 | 1,750 | 5,800 | 176,376 | 264,564 | 146,979 |

Notes: Columns 1, 2, and 3 are weighted by ECLS-K provided weight, C#CW0. Each model controls for classroom fixed effects, child race/ethnicity, child redshirt status, child gender, poverty status, English speaking status, and individualized education plans (IEPs). Columns 1–3 control for ECLS-K test dates, child redshirt status, and maternal education. Standard errors are robust to clustering at the school level.

*p < 0.1; **p < 0.05; ***p < 0.01.

Sensitivity Analysis

OLS estimates of equation 1 are potentially biased for two reasons. First, time-invariant unobserved student heterogeneity in the composite error term of equation 1 may jointly predict both achievement and absences, even after conditioning on lagged achievement. Second, even after conditioning on time-invariant student heterogeneity, the possibility remains that time-varying student-specific shocks jointly determine absences and achievement. Table 7 presents some alternative estimators that condition on unobserved student heterogeneity and examine the robustness of the main results more generally. Columns 1 through 4 do so for the ECLS-K data. Column 1 reproduces the baseline estimates of columns 1 and 2 in table 4 to facilitate comparisons. In column 2 we show that the baseline estimates are robust to not weighting the regressions, as suggested by Solon, Haider, and Wooldridge (2015). Column 3 contains estimates of an extensive specification that, in addition to the baseline student characteristics, conditions on mothers’ employment status, mothers’ marital status, and the number of adults residing in the household. The estimated effects of absences on both math and reading achievement remain the same. Moreover, both estimates remain strongly statistically significant, suggesting that the baseline estimates are not biased by changes over time in household structure that jointly determine absences and achievement.

Table 7. 
Sensitivity Analyses
| | ECLS-K: Baseline (1) | ECLS-K: Unweighted (2) | ECLS-K: Statistical controls (3) | ECLS-K: First differenced (4) | North Carolina: Baseline (5) | North Carolina: Full sample (6) | North Carolina: Analytic sample (7) | North Carolina: First differenced (8) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A. Math Achievement | | | | | | | | |
| Lag score | 0.701*** | 0.701 | 0.699 | 0.748 | 0.782 | 0.782 | 0.782 | 0.128 |
| | (0.011) | (0.009)*** | (0.012)*** | (0.113)*** | (0.003)*** | (0.002)*** | (0.003)*** | (0.015)*** |
| Total absences | −0.002** | −0.003 | −0.002 | −0.008 | −0.007 | −0.007 | −0.007 | −0.005 |
| | (0.001) | (0.001)*** | (0.001)** | (0.003)** | (0.0002)*** | (0.0001)*** | (0.0001)*** | (0.0004)*** |
| Total tardies | −0.001 | −0.001 | −0.001 | −0.005 | −0.001 | | | −0.001 |
| | (0.001) | (0.001) | (0.001) | (0.004) | (0.0001)*** | | | (0.0003) |
| Adjusted R2 | 0.44 | 0.44 | 0.44 | −1.64 | 0.66 | 0.66 | 0.66 | −0.22 |
| B. Reading Achievement | | | | | | | | |
| Lag score | 0.716*** | 0.716 | 0.711 | 0.973 | 0.744 | 0.751 | 0.744 | 0.052 |
| | (0.010) | (0.008)*** | (0.010)*** | (0.148)*** | (0.003)*** | (0.002)*** | (0.003)*** | (0.012)*** |
| Total absences | −0.002** | −0.002 | −0.002 | −0.001 | −0.004 | −0.004 | −0.004 | −0.003 |
| | (0.001) | (0.001)* | (0.001)*** | (0.004) | (0.0001)*** | (0.0001)*** | (0.0001)*** | (0.0004)*** |
| Total tardies | −0.003*** | −0.003 | −0.003 | −0.004 | −0.001 | | | −0.0005 |
| | (0.001) | (0.001)*** | (0.001)*** | (0.003) | (0.0001)*** | | | (0.0003) |
| Adjusted R2 | 0.47 | 0.47 | 0.46 | −1.93 | 0.60 | 0.60 | 0.61 | −0.13 |
| N | 11,600 | 11,600 | 11,200 | 2,850 | 587,919 | 903,314 | 587,919 | 157,813 |

Notes: Columns 1, 3, and 4 are weighted by ECLS-K provided weight, C#CW0. Each model controls for lagged achievement, classroom fixed effects, and observed student characteristics. Column 3 additionally controls for mother's employment status, mother's marital status, and the number of household adults. Columns 1–3 control for ECLS-K test dates, child redshirt status, and maternal education. Standard errors are robust to clustering at the school level. The first-differenced estimates in columns 4 and 8 use twice-lagged test scores as instrumental variables for the lagged gain scores.

*p < 0.1; **p < 0.05; ***p < 0.01.

Unobserved time-invariant student heterogeneity is another potential source of endogeneity. This is easily removed from the baseline specification by first-differencing (FD) equation 1. Because OLS estimates of the resulting FD equation are biased (Nickell 1981), we apply the IV procedure proposed by Anderson and Hsiao (1982), in which twice-lagged achievement serves as an instrument for the first-differenced lag score. These FD estimates are reported in column 4 of table 7 (see note 18). The FD estimate of the effect of absences on math achievement is actually larger than the corresponding OLS estimate and remains statistically significant at the 5 percent level. This finding is consistent with Gottfried's (2011) finding that conditioning on family FE yields larger estimates of the effect of absences on achievement and indicates the baseline math results are not driven by unobserved student heterogeneity. The corresponding FD estimate for reading is imprecisely estimated. It is difficult to interpret the reading estimate, however, as the estimated coefficient on the lag score (α) is close to one, suggesting a weak-IV problem (Wooldridge 2010, p. 374).
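The logic of the first-difference IV approach can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a stripped-down version of the model with only a lagged score, absences, and a student effect (no classroom FE, weights, or other controls), and all parameter values and function names (`simulate_panel`, `solve_moments`, `fd_estimates`) are hypothetical choices made for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(u, v))


def simulate_panel(n, alpha, beta, seed=0):
    """Simulate three test waves per student (made-up data, not the paper's)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        mu = rng.gauss(0, 1)                 # time-invariant student effect
        y0 = mu + rng.gauss(0, 1)            # earliest (twice-lagged) score
        a1 = 6 - 1.5 * mu + rng.gauss(0, 3)  # absences, correlated with mu
        a2 = 6 - 1.5 * mu + rng.gauss(0, 3)
        y1 = alpha * y0 + beta * a1 + mu + rng.gauss(0, 0.5)
        y2 = alpha * y1 + beta * a2 + mu + rng.gauss(0, 0.5)
        rows.append((y0, y1, y2, a1, a2))
    return rows


def solve_moments(z1, z2, x1, x2, y):
    """Solve sum(z * (y - a*x1 - b*x2)) = 0 for (a, b) with instruments z1, z2."""
    a11, a12, b1 = dot(z1, x1), dot(z1, x2), dot(z1, y)
    a21, a22, b2 = dot(z2, x1), dot(z2, x2), dot(z2, y)
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


def fd_estimates(rows):
    """Return (OLS, IV) estimates of (alpha, beta) from the differenced model."""
    y0 = [r[0] for r in rows]
    dy1 = [r[1] - r[0] for r in rows]  # first-differenced lag score
    dy2 = [r[2] - r[1] for r in rows]  # first-differenced outcome
    da2 = [r[4] - r[3] for r in rows]  # first-differenced absences
    ols = solve_moments(dy1, da2, dy1, da2, dy2)  # regressors as own instruments
    iv = solve_moments(y0, da2, dy1, da2, dy2)    # y0 instruments dy1
    return ols, iv


rows = simulate_panel(100_000, alpha=0.7, beta=-0.02)
ols, iv = fd_estimates(rows)
print("FD-OLS (alpha, beta):", ols)
print("FD-IV  (alpha, beta):", iv)
```

In this setup the differenced lag score contains the period-1 error, which also enters the differenced error term with a negative sign, so FD-OLS understates the lag coefficient. The twice-lagged level is unrelated to the differenced error by construction, which is why using it as the instrument recovers values close to the simulated ones.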

Columns 5 through 8 of table 7 present somewhat similar sensitivity analyses of the North Carolina data. Column 5 reproduces the baseline estimates of columns 5 and 6 in table 4 to facilitate comparisons. In column 6 we report estimates of the baseline specification, excluding tardies, for the full sample of 903,314 student-years for which all relevant variables except tardies are observed. The point estimates on absences are unchanged, suggesting that the results are not biased by omitting tardies from the model or by restricting the sample to observations for which tardies are observed. Column 7 again excludes tardies from the model, but now uses only the baseline analytic sample for which tardies are observed. Again, the point estimates on student absences remain unchanged, which suggests that the main results are not biased by nonrandomly missing data on student tardies.

Finally, column 8 of table 7 reports FD estimates analogous to those reported in column 4. The FD estimates, which remove the student effect from equation 1, are slightly smaller but similar in magnitude to the baseline OLS estimates and remain statistically significant at the 1 percent level. This indicates that the baseline estimates were not driven by unobserved student heterogeneity. Interestingly, however, the estimated tardy coefficients lose their statistical significance. Taken as a whole, the sensitivity analyses of both datasets reported in table 7 suggest that the main finding of a significant negative relationship between student absences and academic achievement is robust to a number of modeling and sample restriction decisions, as well as conditioning on unobserved student heterogeneity.

6.  Discussion

The current study uses two longitudinal datasets: the ECLS-K (a nationally representative survey of the 1998–99 cohort of U.S. kindergarteners), and administrative data on the population of third through fifth graders who attended North Carolina's public schools between 2005–06 and 2009–10. We investigate the relationship between student absences and academic achievement by estimating VAMs that exploit within-classroom and within-student variation in absences. Both datasets provide evidence of modest but statistically significant negative relationships between student absences and academic achievement in urban, rural, and suburban schools: A one SD increase in absences is associated with decreases in achievement of 0.02 to 0.04 test-score SD. That the harmful effects of student absences are generally stronger on math achievement than on reading achievement is consistent with the general finding that educational inputs and policies have relatively greater impacts on math achievement (e.g., Rockoff 2004; Jacob 2005; Rivkin, Hanushek, and Kain 2005; Hanushek and Rivkin 2010), perhaps because children are more apt to learn and develop reading skills at home (Currie and Thomas 2001).

The practical significance and policy relevance of these results are most easily observed by comparing these effects to those of other educational inputs that are considered to be practically significant. Specifically, these results suggest that a one SD increase in absences is roughly equivalent to between one third and one quarter of the effect of a one SD increase in teacher effectiveness (Kane, Rockoff, and Staiger 2008; Hanushek and Rivkin 2010). Other useful benchmarks for contextualizing the marginal effect of a student absence are the marginal effects of teacher absences and additional school days. Regarding the former, studies by Herrmann and Rockoff (2012) and Clotfelter, Ladd, and Vigdor (2009) find that a one SD increase in teacher absences similarly reduces student achievement by 0.02 to 0.04 test-score SD. Regarding the latter, a small literature is emerging that attempts to estimate the effect of school days on student achievement by exploiting plausibly random variation in school days caused by either inclement weather or changes in test dates. Marcotte and Hansen (2010) review the literature on snow days, which tends to find that each instructional day lost to snow decreases achievement by about 0.02 to 0.04 test-score SD. This effect is ten times larger than the harm associated with one student absence, perhaps because the average student is able to catch up following an absence, whereas snow days simply eliminate a day of learning that cannot be made up until after the spring test. Fitzpatrick, Grissmer, and Hastedt's (2011) estimates of the effect of a school day are more in line, and perhaps more comparable, with our estimates because the authors exploit the quasi-randomness of test dates in the ECLS-K. Specifically, Fitzpatrick, Grissmer, and Hastedt (2011) estimate that each day in school is associated with an increase of 0.005 to 0.007 test-score SD. 
Exploiting a series of state-mandated changes in administration of Minnesota's end-of-year assessments, Hansen (2011) comes to a similar conclusion regarding the causal relationship between time in school and student performance. Still, these estimates are slightly larger than our ECLS-K estimates of the marginal effect of an absence. Again, this may be due to the fact that there is a mechanism in place to help students catch up following an absence, and the average student is able to do so, to some extent.

Heterogeneity in the relationship between student absences and achievement is the one area in which the ECLS-K and North Carolina analyses yield moderately different results, though this might partly be driven by the relative lack of power in the smaller ECLS-K sample.

The harmful effect of absences on reading achievement is significantly stronger among low-income students in North Carolina, perhaps because reading skills are more effectively developed in high-SES households that are able to invest more time reading to children (Baydar and Brooks-Gunn 1991; Guryan, Hurst, and Kearney 2008) and are thus better able to help children “catch up” following an absence spell. In North Carolina, there is an even larger difference between the effect of absences on the math and reading achievement of LEP and non-LEP students. This difference is particularly large for reading achievement, as the harmful effect of absences on reading achievement for LEP students is more than twice that for non-LEP students. Again, this may be partly due to LEP students’ parents being less able to help with reading assignments.

The last notable difference between the two datasets is that no statistically significant difference between the effects of excused and unexcused absences was found in the ECLS-K, whereas unexcused absences were found to be twice as harmful as excused absences in North Carolina. Again, the lack of differential effects by absence type in the ECLS-K analysis could be because of a lack of statistical power in the substantially smaller ECLS-K analytic sample. Alternatively, the different results could be driven by differences in the grade levels contained in the two datasets or perhaps differences in the compositions of the ECLS-K and North Carolina analytic samples. We further investigate the grade-level question by estimating the ECLS-K interaction specifications reported in table 5 on a balanced panel of kindergarten through fifth-grade students using the third- and fifth-grade waves of the ECLS-K. The first-, third-, and fifth-grade interactions reported in Appendix table A.2 are neither individually nor jointly statistically significant, suggesting that the relationship between absences and achievement is approximately constant between kindergarten and fifth grade of the ECLS-K sample. Nonetheless, this result should be interpreted with caution, as the third- and fifth-grade estimates use first- and third-grade lag scores, respectively (see note 19).

The empirical finding of a practically significant and robust negative relationship between student absences and student achievement is worrisome and has implications for education and social policy. These results suggest that student learning can be increased by reducing either the frequency or the deleterious effects of student absences. The former suggests the importance of future research that examines how household and neighborhood characteristics, as well as school and classroom policies, influence student attendance. Outreach to the parents of frequently absent students might yield useful information regarding the challenges that the household or student is facing outside of school. Low-cost policies that nudge parents to facilitate regular attendance may be especially cost effective. The latter suggests the potential benefits of programs that assist students who are frequently absent to “catch up” through some combination of compensating for lost instructional time and ensuring that absent students receive prompt and complete information on missed lessons and assignments. For example, in-school and after-school tutoring, counseling, and related support programs might be targeted to students who are absent more frequently than in previous school years.

The results of the current study also have implications for value-added estimates of teacher effectiveness, as student attendance is an educational input that is at least partially outside of teachers’ control. Accordingly, failing to control for student absences in VAMs may yield biased estimates of teacher effects. If teachers influence attendance, however, controlling for student absences in VAMs will effectively penalize teachers who indirectly increase student performance by increasing student attendance. These issues, and the extent to which teachers affect student attendance more generally, are further investigated in Gershenson (2015, 2016).

Finally, we consider the ability of absences to explain the achievement gap between students of different socioeconomic statuses. Simple comparisons of means show unconditional math achievement gaps between students below and above the poverty line of about 0.6 and 0.7 test-score SD in the ECLS-K and North Carolina data, respectively. Because the harmful effects of absences on math achievement were only marginally stronger among low-income students, the current back-of-the-envelope analysis considers only how differences in the frequency of absences are likely to contribute to the achievement gap, which provides conservative estimates of absences’ contributions to achievement gaps. The average differences reported in tables 2 and 3 suggest that only about 1 percent of the achievement gap is attributable to differential rates of student absences. Nevertheless, the baseline results indicate that reducing low-income students’ absences by ten absences, relative to non-poor students, would reduce the achievement gap by 5 to 10 percent.
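The back-of-the-envelope calculation above amounts to multiplying the per-absence effect by the number of absences avoided and dividing by the size of the gap. The sketch below works through that arithmetic under stated assumptions: the inputs are the rounded math coefficients on total absences from table 7 and the gaps of 0.6 and 0.7 SD quoted above, and the function name `share_of_gap_closed` is hypothetical. Which coefficients underlie the range quoted in the text is an assumption, and because the published coefficients are rounded to three decimals the lower end is sensitive to rounding.

```python
def share_of_gap_closed(effect_per_absence, absences_reduced, gap_sd):
    """Fraction of an achievement gap closed by avoiding a number of absences."""
    return abs(effect_per_absence) * absences_reduced / gap_sd


# (rounded math coefficient on total absences, unconditional gap in test-score SD)
scenarios = {
    "ECLS-K, weighted baseline": (-0.002, 0.6),
    "ECLS-K, unweighted": (-0.003, 0.6),
    "North Carolina, baseline": (-0.007, 0.7),
}

for label, (coef, gap) in scenarios.items():
    share = share_of_gap_closed(coef, absences_reduced=10, gap_sd=gap)
    print(f"{label}: {share:.1%} of the gap")
```

With these inputs, ten fewer absences correspond to roughly 3, 5, and 10 percent of the respective gaps.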

Notes

1 

In the context of self-contained primary school classrooms, classroom FE are the same as teacher-by-year FE. Classroom FE could be similarly applied in secondary or tertiary school contexts in which teachers teach specific subjects and classroom FE are equivalent to teacher-by-year-by-subject FE. For example, Fairlie, Hoffmann, and Oreopoulos (2014) utilize classroom FE for similar reasons to identify the effect of instructor–student racial mismatch on student outcomes in the community college context.

2 

Our results are also consistent with recent studies that apply alternative identification strategies (e.g., Aucejo and Romano 2016; Goodman 2014).

3 

Definitions of “chronically absent” vary across states and districts, but the modal definition is being absent on at least 10 percent of school days (eighteen absences per year, or two to three absences per month) (Bruner, Discher, and Chang 2011; Balfanz and Byrnes 2012).

4 

Specifically, we use the C#CW0 longitudinal weight, where # is wave number.

5 

All ECLS-K sample sizes are rounded to the nearest 50 in accordance with NCES guidelines for restricted data.

6 

This also raises the issue of systematic measurement error in the ECLS-K data, as annual absences are systematically larger than the ideal measure of absences, which is the number of absences that occurred between fall and spring tests in kindergarten and before the spring test in first grade. Such measurement error is unlikely to be hugely problematic from a practical standpoint in the ECLS-K for at least three reasons. First, the degree to which this measurement error biases the estimated coefficient on absences downward is a function of the percentage of absences that occurred outside of the fall and spring tests in kindergarten and after the test in first grade. Although we do not know the dates of absences, we do know the exact dates of the ECLS-K tests, which can be used to estimate the percentage of school days that occur between the fall and spring tests and after the spring tests. For example, on average, 12 percent of school days are after the spring first grade assessment, and the interquartile range is 8 percent to 16 percent. Assuming that absences are uniformly distributed throughout the year, which is arguably a conservative assumption since the distribution of absences is likely centered in the winter during flu season, suggests that the baseline estimates are attenuated by about 12 percent. This is not large enough to change the practical interpretation of the results (i.e., the baseline estimate still rounds to 0.002). Second, although all ECLS-K models explicitly control for the number of school days between tests in kindergarten and that occur before the spring test in first grade, failing to do so does not appreciably change the baseline estimates. This is unsurprising, as there is relatively little within-classroom variation in assessment dates and the classroom fixed effects control for much of the variation in test dates. 
Finally, Fitzpatrick, Grissmer, and Hastedt (2011) find that ECLS-K test dates are essentially randomly distributed across students, which suggests that students who are prone to absences or who are disproportionately harmed by absences are neither more nor less likely to have a later test date. Thus, only the mechanical type of measurement error bias, which is a simple function of the percentage of school days that occur outside of the fall and spring tests in kindergarten and after the test in first grade, is likely present.
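The rounding claim in this note can be checked with a few lines of arithmetic. The sketch below assumes, as the note does, that absences are spread uniformly over the school year, so that an estimate attenuated by a share s of out-of-window school days is rescaled by 1/(1 − s). The function name `undo_attenuation` is hypothetical, and the shares are the mean and interquartile range quoted above for first grade.

```python
def undo_attenuation(estimate, share_outside_window):
    """Rescale an estimate attenuated by the given share of out-of-window days."""
    return estimate / (1.0 - share_outside_window)


for share in (0.08, 0.12, 0.16):
    corrected = undo_attenuation(-0.002, share)
    print(f"{share:.0%} outside window: {corrected:.4f}, "
          f"rounds to {round(corrected, 3)}")
```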

7 

IEPs are an important part of the 2004 Individuals with Disabilities Education Act (IDEA). Specifically, IEPs document the goals and support systems in place for children with learning disabilities. Parents and educators work together to develop an appropriate IEP.

8 

See www.childandfamilypolicy.duke.edu/project_detail.php?id=35 for additional information.

9 

North Carolina's end-of-grade tests are state mandated, criterion-referenced, vertically aligned, and are given to all students in the spring of third, fourth, and fifth grades. The tests in 2006–09 were administered during the last three weeks of the academic year (i.e., the last two weeks of May or the first week of June). In 2010 this changed to the last twenty-two days of the academic year, making it possible for tests to occur slightly earlier in May.

10 

However, we conduct sensitivity analyses using the full sample of 903,314 student-years for which absences are observed and, to assuage concerns that the results are influenced by endogenously missing data on student tardies, show in Appendix table A.1 that the average characteristics of students for whom tardies are observed are similar in magnitude to those for whom tardies are unobserved.

11 

This was done by taking the SD of the ordinary least squares residuals from regressions of absences on full sets of classroom and student fixed effects, respectively.

12 

Chronic absence is typically defined as being absent on 10 percent or more of school days (Bruner, Discher, and Chang 2011; Balfanz and Byrnes 2012). Most states have about 180 school days. In 2011 North Carolina increased the legal minimum to 185 (see www.ncpublicschools.org/fbs/accounting/calendar/).

13 

We refer to time periods rather than years because for kindergarteners in the ECLS-K, $y_{i,t-1}$ is the student's score on the fall kindergarten assessment. For first graders in the ECLS-K and for students in North Carolina, $y_{i,t-1}$ is the student's score on the previous spring's assessment. The ECLS-K regression models also control for the number of days between fall and spring tests in kindergarten and the number of days prior to the spring test in first grade, though failing to do so does not appreciably change the estimated effect of absences, which is unsurprising given that variation in test dates is random and there is little within-classroom variation in test dates. The classroom FE directly control for test timing in the North Carolina models, because all students in a classroom (school) take the end-of-grade test on the same day.

14 

It is worth noting that the main results are robust to including “school changers” in the analytic sample and including a school-changer indicator in the vector of student controls. Nonetheless, we exclude such students from the baseline sample to avoid conflating the effect of changing schools with the effect of absences, as the unobserved shock that led to a school change may also affect a student's attendance patterns.

15 

Corresponding results for the ECLS-K are reported in Appendix figures B.1 (math) and B.2 (reading). Although these figures paint a qualitatively similar picture, the smaller sample size of the ECLS-K results in imprecise estimates of the nonparametric specification of f(A). Specifically, only five and two of the 26 absence indicators are individually statistically significant at the 5 percent level in the math and reading ECLS-K regressions, respectively, compared with all 26 in the North Carolina regressions.

16 

The analytic sample's 99th percentile is about 26 absences. Formally, the quadratic specification of f(A) is and the nonparametric specification of f(A) is where 1{·} is the indicator function.

17 

The first-grade, poverty, female, English, and locale interaction terms remain statistically insignificant when each is added individually to the baseline specification.

18 

In the ECLS-K data, the instrument is twice-lagged achievement, which is the test score from the autumn of kindergarten.

19 

Similarly, we do not control for days between tests in these regressions.

Acknowledgments

This research was supported in part by grants from the Spencer Foundation and the American Educational Research Association (AERA). AERA receives funds for its AERA Grants Program from the National Science Foundation under NSF grant DRL-0941014. Opinions reflect those of the authors and do not necessarily reflect those of the granting agencies. The authors thank Quentin Brummet, Dan Goldhaber, Steven Haider, Joe Sabia, two anonymous referees, and participants at the 2013 Southern Economic Association Annual Meeting for providing helpful comments on an earlier draft. Any errors are our own.

REFERENCES

Alexander
,
Karl
,
Doris
Entwisle
, and
Nader
Kabbani
.
2001
.
The dropout process in life course perspective: Early risk factors at home and school
.
Teachers College Record
103
(
5
):
760
822
. doi:10.1111/0161-4681.00134.
Anderson
,
T. W.
, and
Cheng
Hsiao
.
1982
.
Formulation and estimation of dynamic models using panel data
.
Journal of Econometrics
18
(
1
):
47
82
. doi:10.1016/0304-4076(82)90095-1.
Angrist
,
Joshua
, and
Jörn-Steffen
Pischke
.
2009
.
Mostly harmless econometrics: An empiricists’ companion
.
Princeton, NJ
:
Princeton University Press
.
Aucejo
,
Esteban M.
, and
Teresa Foy
Romano
.
2016
.
Assessing the effect of school days and absences on test score performance
.
Economics of Education Review
55
:
70
87
. doi:10.1016/j.econedurev.2016.08.007.
Balfanz
,
Robert
, and
Vaughan
Byrnes
.
2012
.
Chronic absenteeism: Summarizing what we know from nationally available data
.
Baltimore, MD
:
Johns Hopkins University Center for Social Organization of Schools
.
Baydar
,
Nazli
, and
Jeanne
Brooks-Gunn
.
1991
.
Effects of maternal employment and child-care arrangements on preschoolers’ cognitive and behavioral outcomes: Evidence from the Children of the National Longitudinal Survey of Youth
.
Developmental Psychology
27
(
6
):
932
945
. doi:10.1037/0012-1649.27.6.932.
Bertrand
,
Marianne
, and
Jessica
Pan
.
2013
.
The trouble with boys: Social influences and the gender gap in behavior
.
American Economic Journal. Applied Economics
5
(
1
):
32
64
. doi:10.1257/app.5.1.32.
Bruner
,
Charles
,
Anne
Discher
, and
Hedy
Chang
.
2011
.
Chronic elementary absenteeism: A problem hidden in plain sight
. Available www.edweek.org/media/chronicabsence-15chang.pdf.
Accessed 20 January 2014
.
Caldas
,
Stephen J.
1993
.
Reexamination of input and process factor effects on public school achievement
.
Journal of Educational Research
86
(
4
):
206
214
. doi:10.1080/00220671.1993.9941832.
Chang
,
Hedy N.
, and
Mariajosé
Romero
.
2008
.
Present, engaged, and accounted for: The critical importance of addressing chronic absence in the early grades
.
New York
:
National Center for Children in Poverty
.
Chetty
,
Raj
,
John N.
Friedman
, and
Jonah E.
Rockoff
.
2014
.
Measuring the impacts of teachers I: Evaluating bias in teacher value-added estimates
.
American Economic Review
104
(
9
):
2593
2632
. doi:10.1257/aer.104.9.2593.
Clotfelter
,
Charles
,
Helen
Ladd
, and
Jacob L.
Vigdor
.
2009
.
Are teacher absences worth worrying about in the United States
?
Education Finance and Policy
4
(
2
):
115
149
. doi:10.1162/edfp.2009.4.2.115.
Currie
,
Janet
, and
Duncan
Thomas
.
2001
.
Early test scores, school quality, and SES: Longrun effects on wage and employment outcomes
.
Research in Labor Economics: Worker Wellbeing in a Changing Labor Market
20
:
103
132
. doi:10.1016/S0147-9121(01)20039-9.
Fairlie
,
Robert W.
,
Florian
Hoffmann
, and
Philip
Oreopoulos
.
2014
.
A community college instructor like me: Race and ethnicity interactions in the classroom
.
American Economic Review
104
(
8
):
2567
2591
. doi:10.1257/aer.104.8.2567.
Fitzpatrick
,
Maria D.
,
David
Grissmer
, and
Sarah
Hastedt
.
2011
.
What a difference a day makes: Estimating daily learning gains during kindergarten and first grade using a natural experiment
.
Economics of Education Review
30
(
2
):
269
279
. doi:10.1016/j.econedurev.2010.09.004.
Fryer
,
Roland G.
, Jr.
, and
Steven D.
Levitt
.
2004
.
Understanding the black-white test score gap in the first two years of school
.
Review of Economics and Statistics
86
(
2
):
447
464
. doi:10.1162/003465304323031049.
Gershenson
,
Seth
.
2015
.
Linking teacher quality, student attendance, and student achievement
.
Education Finance and Policy
11
(
2
):
125
149
. doi:10.1162/EDFP_a_00180.
Gershenson
,
Seth
.
2016
.
Should value-added models control for student absences
?
Teachers College Record
.
September, ID No. 21629
.
Goodman
,
Joshua
.
2014
.
Flaking out: Student absences and snow days as disruptions of instructional time
.
NBER Working Paper No. w20221
.
Gottfried
,
Michael A.
2009
.
Excused versus unexcused: How student absences in elementary school affect academic achievement
.
Educational Evaluation and Policy Analysis
31
(
4
):
392
419
.
Gottfried
,
Michael A.
2011
.
The detrimental effects of missing school: Evidence from urban siblings
.
American Journal of Education
117
(
2
):
147
182
. doi:10.1086/657886.
Guarino
,
Cassandra M.
,
Mark D.
Reckase
, and
Jeffrey M.
Wooldridge
.
2015
.
Can value-added measures of teacher performance be trusted
?
Education Finance and Policy
10
(
1
):
117
156
. doi:10.1162/EDFP_a_00153.
Guryan
,
Jonathan
,
Erik
Hurst
, and
Melissa
Kearney
.
2008
.
Parental education and parental time with children
.
Journal of Economic Perspectives
22
(
3
):
23
46
. doi:10.1257/jep.22.3.23.
Hansen
,
Benjamin
.
2011
.
School year length and student performance: Quasi-experimental evidence
. Available http://dx.doi.org/10.2139/ssrn.2269846.
Accessed 11 July 2016
.
Hanushek
,
Eric A.
, and
Steven G.
Rivkin
.
2010
.
Generalizations about using value-added measures of teacher quality
.
American Economic Review
100
(
2
):
267
271
. doi:10.1257/aer.100.2.267.
Harris
,
Douglas N.
,
Tim R.
Sass
, and
Anastasia
Semykina
.
2014
.
Value-added models and the measurement of teacher productivity
.
Economics of Education Review
38
(
1
):
9
23
.
Heckman
,
James J.
,
Jora
Stixrud
, and
Sergio
Urzua
.
2006
.
The effects of cognitive and noncognitive abilities on labor market outcomes and social behavior
.
Journal of Labor Economics
24
(
3
):
411
482
. doi:10.1086/504455.
Herrmann
,
Mariesa A.
, and
Jonah E.
Rockoff
.
2012
.
Worker absence and productivity: Evidence from teaching
.
Journal of Labor Economics
30
(
4
):
749
782
. doi:10.1086/666537.
Jacob
,
Brian A.
2002
.
Where the boys aren’t: Non-cognitive skills, returns to school and the gender gap in higher education
.
Economics of Education Review
21
(
6
):
589
598
. doi:10.1016/S0272-7757(01)00051-6.
Jacob
,
Brian A.
2005
.
Accountability, incentives and behavior: The impact of high-stakes testing in the Chicago public schools
.
Journal of Public Economics
89
(
5–6
):
761
796
. doi:10.1016/j.jpubeco.2004.08.004.
Jones
,
Nathan D.
,
Heather M.
Buzick
, and
Sultan
Turkan
.
2013
.
Including students with disabilities and English learners in measures of educator effectiveness
.
Educational Researcher
42
(
4
):
234
241
. doi:10.3102/0013189x12468211.
Kane
,
Thomas J.
,
Jonah E.
Rockoff
, and
Douglas O.
Staiger
.
2008
.
What does certification tell us about teacher effectiveness? Evidence from New York City
.
Economics of Education Review
27
(
6
):
615
631
. doi:10.1016/j.econedurev.2007.05.005.
Marcotte
,
Dave E.
, and
B.
Hansen
.
2010
.
Time for school
.
Education Next
10
(
1
):
52
59
.
Marcotte
,
Dave E.
, and
Steven W.
Hemelt
.
2008
.
Unscheduled school closings and student performance
.
Education Finance and Policy
3
(
3
):
316
338
. doi:10.1162/edfp.2008.3.3.316.
Monk
,
David H.
, and
Mohd A.
Ibrahim
.
1984
.
Patterns of absence and pupil achievement
.
American Educational Research Journal
21
(
2
):
295
310
. doi:10.3102/00028312021002295.
Morrissey
,
Taryn W.
,
Lindsey
Hutchison
, and
Adam
Winsler
.
2014
.
Family income, school attendance, and academic achievement in elementary school
.
Developmental Psychology
50
(
3
):
741
753
. doi:10.1037/a0033848.
National Center for Education Statistics (NCES)
.
2002
.
User's guide to the kindergarten-first grade public use data file
(
NCES-2002–149
).
Washington, DC
:
NCES, U.S. Department of Education
.
Nickell
,
Stephen
.
1981
.
Biases in dynamic models with fixed effects
.
Econometrica
49
(
6
):
1417
1426
. doi:10.2307/1911408.
Noell
,
George H.
,
Bethany A.
Porter
,
R.
Maria Patt
, and
Amanda
Dahir
.
2008
.
Value added assessment of teacher preparation in Louisiana: 2004–2005 to 2006–2007
(
Technical Report
). Available www.regents.la.gov/assets/docs/2013/09/Final-Value-Added-Report-12.02.08-Yr5.pdf.
Accessed 5 July 2016
.
Pischke
,
Jörn-Steffen
.
2007
.
The impact of length of the school year on student performance and earnings: Evidence from the German short school years
.
Economic Journal (Oxford)
117
(
523
):
1216
1242
. doi:10.1111/j.1468-0297.2007.02080.x.
Ready
,
Douglas D.
2010
.
Socioeconomic disadvantage, school attendance, and early cognitive development: The differential effects of school exposure
.
Sociology of Education
83
(
4
):
271
286
. doi:10.1177/0038040710383520.
Reardon
,
Sean F.
2011
.
The widening academic achievement gap between the rich and the poor: New evidence and possible explanations
. In
Whither opportunity? Rising inequality, schools, and children's life chances
, edited by
Greg J.
Duncan
and
Richard J.
Murnane
, pp.
91
116
.
New York
:
Russell Sage Foundation
.
Rivkin
,
Steven G.
,
E.
Hanushek
, and
J. F.
Kain
.
2005
.
Teachers, schools, and academic achievement
.
Econometrica
73
(
2
):
417
458
. doi:10.1111/j.1468-0262.2005.00584.x.
Roby
,
Douglas E.
2004
.
Research on school attendance and student achievement: A study of Ohio schools
.
Educational Research Quarterly
28
(
1
):
3
14
.
Rockoff
,
Jonah E.
2004
.
The impact of individual teachers on student achievement: Evidence from panel data
.
American Economic Review
94
(
2
):
247
252
. doi:10.1257/0002828041302244.
Romero
,
Mariajose
, and
Young-Sun
Lee
.
2008
.
The influence of maternal and family risk on chronic absenteeism in early schooling
.
New York
:
National Center for Children in Poverty
.
Schoeneberger
,
Jason A.
2012
.
Longitudinal attendance patterns: Developing high school dropouts
.
Clearing House: A Journal of Educational Strategies, Issues and Ideas
85
(
1
):
7
14
. doi:10.1080/00098655.2011.603766.
Sims
,
David P.
2008
.
Strategic responses to school accountability measures: It's all in the timing
.
Economics of Education Review
27
(
1
):
58
68
. doi:10.1016/j.econedurev.2006.05.003.
Solon
,
Gary
,
Steven J.
Haider
, and
Jeffrey M.
Wooldridge
.
2015
.
What are we weighting for
?
Journal of Human Resources
50
(
2
):
301
316
. doi:10.3368/jhr.50.2.301.
Todd
,
Petra E.
, and
Kenneth I.
Wolpin
.
2003
.
On the specification and estimation of the production function for cognitive achievement
.
Economic Journal (Oxford)
113
(
485
):
F3
F33
. doi:10.1111/1468-0297.00097.
U.S. Department of Education (USDOE). 2006. The condition of education 2006 (NCES 2006–071). Washington, DC: U.S. Government Printing Office, NCES.
Wooldridge, Jeffrey M. 2010. Econometric analysis of cross section and panel data, 2nd ed. Cambridge, MA: MIT Press.

Appendix A:  Additional Data

Table A.1. Descriptive Statistics by Tardy Status in North Carolina

| Variable | Tardies Observed (Mean) | Tardies Missing (Mean) |
| --- | --- | --- |
| Standardized math score | 0.05 | 0.08 |
| Standardized reading score | 0.04 | 0.06 |
| Math lag score | 0.07 | 0.09 |
| Reading lag score | 0.06 | 0.08 |
| Total absences | 6.27 | 6.12 |
| Non-Hispanic white | 56.3% | 57.6% |
| Non-Hispanic black | 26.7% | 24.6% |
| Hispanic | 9.6% | 10.1% |
| Non-Hispanic other | 7.4% | 7.7% |
| Fifth grade | 50.2% | 49.9% |
| Female | 50.0% | 49.9% |
| Below poverty level | 46.7% | 48.3% |
| Limited English proficiency (LEP) | 1.7% | 0.8% |
| Learning disability – math | 1.5% | 1.4% |
| Learning disability – reading | 3.3% | 2.5% |
| 2006 | 25.1% | 9.9% |
| 2007 | 27.6% | 4.3% |
| 2008 | 27.8% | 5.3% |
| 2009 | 13.2% | 30.8% |
| 2010 | 6.3% | 49.7% |
| N | 587,919 | 315,395 |

Notes: Excused and unexcused absences are observed for only 46,094 of the cases in which data on tardies are missing. All differences are statistically significant at the 1% level.

Table A.2. Results from Balanced ECLS-K Kindergarten through Fifth Grades Regressions Including Interaction Terms

| | Math Achievement (1) | Math Achievement (2) | Reading Achievement (3) | Reading Achievement (4) |
| --- | --- | --- | --- | --- |
| Lag score | 0.446 | 0.433 | 0.399 | 0.399 |
| | (0.016)*** | (0.020)*** | (0.019)*** | (0.025)*** |
| Total absences (TA) | −0.005 | −0.006 | 0.0003 | 0.0003 |
| | (0.007) | (0.007) | (0.005) | (0.005) |
| Total tardies | −0.003 | −0.003 | −0.004 | −0.004 |
| | (0.002) | (0.002) | (0.002)** | (0.002)** |
| Female | −0.139 | −0.134 | 0.066 | 0.066 |
| | (0.042)*** | (0.043)*** | (0.039)* | (0.039)* |
| Below poverty level | −0.232 | −0.240 | −0.275 | −0.275 |
| | (0.076)*** | (0.075)*** | (0.072)*** | (0.072)*** |
| Does not speak English at home | 0.159 | 0.160 | 0.006 | 0.006 |
| | (0.090)* | (0.089)* | (0.126) | (0.126) |
| Individualized Education Plan (IEP) | −0.234 | −0.232 | −0.434 | −0.434 |
| | (0.120)* | (0.120)* | (0.135)*** | (0.135)*** |
| First grade*TA | 0.002 | 0.003 | 0.003 | 0.003 |
| | (0.008) | (0.008) | (0.005) | (0.006) |
| Third grade*TA | 0.002 | 0.003 | 0.001 | 0.001 |
| | (0.007) | (0.007) | (0.005) | (0.005) |
| Fifth grade*TA | 0.003 | 0.005 | 0.003 | 0.003 |
| | (0.006) | (0.006) | (0.004) | (0.004) |
| Poverty*TA | 0.006 | 0.007 | 0.002 | 0.002 |
| | (0.006) | (0.006) | (0.005) | (0.005) |
| Female*TA | 0.0002 | −0.001 | −0.001 | −0.001 |
| | (0.004) | (0.004) | (0.003) | (0.004) |
| No English at home*TA | −0.033 | −0.033 | −0.008 | −0.008 |
| | (0.007)*** | (0.007)*** | (0.024) | (0.023) |
| Student has an IEP*TA | −0.003 | −0.003 | 0.005 | 0.005 |
| | (0.007) | (0.007) | (0.014) | (0.013) |
| Lagged score*TA | | 0.002 | | −0.0001 |
| | | (0.002) | | (0.003) |
| Joint significance of interactions | p < 0.01 | p < 0.01 | p = 0.98 | p = 0.99 |
| Adjusted R² | 0.03 | 0.03 | −0.04 | −0.04 |

Notes: Regressions are weighted by ECLS-K provided weight, C#CW0. N = 4,800, which is rounded to the nearest 50 to conform to NCES regulations. Each model controls for classroom fixed effects, child race/ethnicity, maternal education, and child redshirt status. Standard errors are robust to clustering at the school level.

*p < 0.1; **p < 0.05; ***p < 0.01.
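
As a reading aid for Table A.2, the following sketch (not part of the original analysis) shows how the interaction terms combine: the implied change in math achievement per additional absence is the Total absences (TA) coefficient plus the coefficients on whichever interactions apply to a given student. It uses the column (1) point estimates only and assumes kindergarten is the omitted grade, since only first-, third-, and fifth-grade interactions are reported. The names `BASE_TA`, `INTERACTIONS`, and `marginal_effect` are illustrative, and the calculation ignores standard errors, so it says nothing about statistical significance.

```python
# Point estimates copied from Table A.2, column (1): math achievement.
# All names here are illustrative; kindergarten is ASSUMED to be the
# omitted grade because no kindergarten*TA interaction is reported.
BASE_TA = -0.005  # coefficient on Total absences (TA)

INTERACTIONS = {
    "first_grade": 0.002,
    "third_grade": 0.002,
    "fifth_grade": 0.003,
    "poverty": 0.006,
    "female": 0.0002,
    "no_english_at_home": -0.033,
    "iep": -0.003,
}


def marginal_effect(*traits: str) -> float:
    """Implied change in math achievement per additional absence.

    Adds the base TA coefficient to the interaction coefficient of
    every trait passed in. Point estimates only; no inference.
    """
    return BASE_TA + sum(INTERACTIONS[t] for t in traits)


if __name__ == "__main__":
    # Baseline student (no listed traits, assumed kindergarten).
    print(round(marginal_effect(), 4))
    # Student who does not speak English at home.
    print(round(marginal_effect("no_english_at_home"), 4))
    # Fifth grader below the poverty level.
    print(round(marginal_effect("fifth_grade", "poverty"), 4))
```

For instance, `marginal_effect("no_english_at_home")` adds −0.005 and −0.033, giving −0.038 per absence, which reflects the one interaction in column (1) that is individually significant.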

Appendix B:  Effect of Absences

Figure B.1. Effect of Absences on Math Achievement in the ECLS-K.

Figure B.2. Effect of Absences on Reading Achievement in the ECLS-K.