## Abstract

Combining data on locations with career and educational histories of mathematicians, we study how distance and ties affect citation patterns. The ties considered include coauthorship, past colocation, and relationships mediated by advisers and the alma mater. With fixed effects capturing subject similarity and article quality, we find linkages are strongly associated with citation. Controlling for ties generally halves the negative impact of geographic barriers on citations. Ties matter more for less prominent and more recent papers and have retained their quantitative importance in recent years. The impact of distance, controlling for ties, has fallen and is statistically insignificant after 2004.

## I. Introduction

Mounting evidence points to the importance of geographic barriers to knowledge flows. Arguing that citations provide the paper trail for knowledge flows, Jaffe, Trajtenberg, and Henderson (1993) establish that cites to patents are geographically localized.^{1} Keller (2002) shows that research spillovers on productivity decay with distance, and Comin, Dmitriev, and Rossi-Hansberg (2012) find the likelihood of adopting new technologies declines with distance to the origin of the invention. Ellison, Glaeser, and Kerr (2010) show that industries that share ideas (proxied by R&D and patent citation flows) have a stronger tendency to coagglomerate in space. While much of the literature focuses on technology diffusion, spatial separation impedes the spread of many other types of information. For example, information frictions account for half of the distance effect in Allen's (2014) study of differences in rice prices between Philippine islands. Urbanization continues to increase despite rising land prices and congestion, a fact Glaeser (2011) attributes to the spread of innovations “from person to person across crowded city streets.”

All this evidence notwithstanding, the notion that borders or distance could prove to be practical obstacles to flows of knowledge seems hard to square with the fact that information can move anywhere without incurring either tariffs or freight costs. As Keller and Yeaple (2013) put it, “Knowledge, as an intangible, seems ideally suited to overcoming spatial frictions.” Especially in the age of Google, whose self-described mission is to “organize the world's information and make it universally accessible and useful,” the microfoundations for geographic knowledge frictions are far from obvious. To the extent there is a standard explanation, it is that tacit knowledge is easier to communicate face to face. However, one study shows that even the transmission of highly codified information benefits from proximity. Lissoni (2001) examined a cluster of mechanical firms in Brescia, Italy, and found they engaged primarily in the transfer of CAD-encoded designs.

In this paper, we hypothesize that the impact of distance on knowledge arises in large part due to spatially concentrated personal ties. Proximity facilitates tie formation, and those ties foster knowledge flows. The general mechanism we envision is that an agent trying to solve a problem becomes aware of potential solutions by tapping the knowledge residing in their network of personal relationships. This hypothesis can be tested only in a specific context where interpersonal ties, geography, and knowledge flows can all be tracked in a systematic way. We argue that the rich data available on mathematicians make them, despite their idiosyncrasies, an insightful group to study for this purpose. Our first key finding is that adding controls for a comprehensive set of career and educational linkages between authors of mathematics papers leads to a halving of estimated geography effects. The role of ties in attenuating the negative effect of distance on citations echoes Keller's (2001) finding that including trade flows and FDI in the equation for technological knowledge spillovers shrinks the estimated negative effect of distance. The paper proceeds to combine additional results to establish the microfoundations for why ties matter so much.

Prior work on patent citation has already pointed toward ties as an important determinant of knowledge flows. Invoking the idea of social proximity, Agrawal, Kapur, and McHale (2008) and Kerr (2008) show that inventors have a higher propensity to cite patents by those who share their ethnic origins (as revealed by their surnames). While social connections are known to be richer within ethnic groups, sharing surnames with the same ethnic origin does not imply a personal connection between citing and cited inventors. Coethnicity can reflect cultural similarities between inventors who do not know each other personally. In order to capture the effect of person-to-person ties on knowledge flows, we need data sources from which we can extract the histories of personal relationships. Patent applications provide enough information to determine past collaboration; Singh (2005) and Breschi and Lissoni (2009) find that this type of tie increases citation. Agrawal, Cockburn, and McHale (2006) investigate a second tie, past colocation. They find that inventors who change institutions are still disproportionately cited in patent applications by their former colleagues.

To capture a richer set of social ties between individuals who potentially transmit knowledge to each other, we believe it is useful to consider academics, for whom it is possible to identify ties based on educational histories. We take advantage of the fact that in mathematics, PhD institutions and advisors have been tracked globally for a long time by the Mathematics Genealogy Project (MGP).^{2} There is strong evidence from Waldinger (2010) that the quality of mathematics faculty causally increases the subsequent academic success of their doctoral students. The MGP allows us to trace the patterns of citation between advisers and advisees, classmates, and the academic extended family.

The process through which mathematicians (or anyone else) form ties is not, of course, entirely random. A concern for estimating the effect of ties is that the same unobservables that prompt scholars to form ties with each other also affect the likelihood of citing each other. The educational ties we focus on have the advantage of being predetermined with respect to the citation process, since it is rare for an academic to cite or be cited prior to obtaining doctoral education. Unlike colocation and collaboration, educational linkages do not change over time in response to shocks to the interests of citing authors. While there is substantial randomness involved in determining classmates, the matching between advisers and advisees is likely to be shaped by common interests. The worry is that author A may be more likely to cite a paper by a tied author B than author C who has no tie with B because A and B write on the same topics. The way we respond to this concern is to compare citation probabilities only between authors A and C who have written papers in the same three-digit field of mathematics. We show that this control for article subject is essential. Without it, estimates of ties are substantially inflated. When three- or five-digit fields or even keywords are controlled for, ties have reduced, but still large, estimated effects on citation. With the controls that give the lowest magnitude (five-digit subject and a cocitation indicator), a single tie on average boosts the odds of citation by 46%.

In addition to the strength of its academic genealogy data, mathematics offers two additional advantages relative to other academic fields. First, mathematics employs a common language of communication. This suggests that transmission of mathematics knowledge would be less influenced by linguistic and cultural factors. In many social sciences and humanities fields, there are journals that focus on certain regions or countries. For example, in the fields of history and literature, there are obvious reasons to expect national borders and language to influence citation patterns. A second advantage of studying mathematics comes from the citation norms of the discipline. New theorems build on previous theorems, which must be cited. There also appears to be a norm against gratuitous citation, as evidenced by the relatively low number of references in each paper. Althouse et al. (2009) report that math papers cite 18 papers on average, compared to 30 in economics, and 45 to 51 in sociology, psychology, business, and marketing.

Our first set of results establishes that ties are an important mechanism underlying estimated geography effects on citations. But what is the mechanism underlying the importance of ties? We present two lines of evidence to argue that ties matter because they transmit information. The first follows from the idea of Arrow (1969) that knowledge flows can be generally thought of as interactions between a teacher (sender) and a student (receiver). We find evidence that citations are stronger to the authors who are more likely to be senders of information. The odds of citation are seven times higher if a paper is written by the adviser of the citing author. The impact of the author being a former advisee is weaker, albeit still very large. Moving one step further apart in the adviser network, we find advisers of advisers have three times the normal odds of being cited, but there are no significant differences in their propensity to cite their advisees' advisees. The second line of evidence is that ties matter more for the types of papers where information is harder to acquire. Our estimates show that ties (and geographic separation) have stronger impacts for papers that were only recently published, or not heavily cited, or just in a different field.

The role of distance, after controlling for ties, has even become statistically insignificant in recent years. This finding of declining geographic barriers extends the results of two earlier studies using very different methodologies. Keller (2002) estimates the rate of distance decay in the benefits that one country receives from R&D conducted in another country. He finds that the distance decay rate fell by two-thirds from the period 1970 to 1982, to 1983 to 1995. Griffith, Lee, and Van Reenen (2011) analyze the number of days until the first citation of a newly granted patent. They find home inventors take fewer days on average than foreign-based inventors to be the first to cite home-invented patents. This home bias declined substantially between 1975–1989 and 1990–1999. Our study shows that distance effects fell by two-thirds from the early 1990s to the late 2000s. This extends the evidence from the previous literature to the decade in which Internet use becomes pervasive. Our investigation of time-varying coefficients also reveals that despite the advances in a scholar's ability to search for information over the Internet, the impact of personal ties remains as strong as ever.

While we do not wish to draw conclusions that stray too far from the context of our estimation, the whole rationale for studying citations in mathematics is to obtain insights with broader applicability to knowledge flows. The extent to which ties facilitate transfer of valuable knowledge in one context (math) provides a prima facie case for their potential importance in all cumulative, collaborative discovery processes. Collaboration in mathematics often takes the form of tied researchers suggesting previously proven theorems that could help prove new theorems. In other research contexts, from drug invention to financial engineering, there would be analogous ways that lessons one person learns could help a tied person solve a new problem.

Going beyond research, a wealth of evidence suggests that entrepreneurs learn about potential business opportunities from their web of connections. For example, Kerr and Mandorff (2015) explain the remarkable concentration of ethnic groups in certain occupations (Gujarati-speaking Indians are overrepresented in the motel industry by a factor of 108) by invoking knowledge acquired through social interactions. Learning from ties might also explain the robust empirical association between bilateral immigration stocks and trade flows.^{3} Such work generally lacks individual-level evidence on the relevant social ties. Using our person-to-person measures of ties provides insight into the processes underlying the patterns seen in aggregated data.

The remainder of the paper is organized as follows. Section II posits a simple citation model to serve as the estimating framework for relating a paper-to-paper citation indicator to the ties and geography variables measured at the author level. Section III describes our data on citations, geography, and ties and explains how we construct the estimating sample. Section IV presents the results of our regressions. In the final section, we reinterpret other research findings in light of our results. We also suggest the potential policy implications.

## II. Specification of Citation Probability Equation

To guide estimation and interpretation, we provide a simple model of the citation process, leading to a reduced-form estimating equation for the probability of one article citing another. We then specify the observed determinants of citation and a method for controlling for key unobservables.

At the article level, citation is a binary choice, and we therefore follow the standard approach of defining a latent variable $C_{id}^{*}$, which leads to a realized citation, $C_{id}=1$, of paper $d$ by paper $i$ when a threshold $\kappa$ is exceeded. Thus, the probability of citation is $P(C_{id}^{*}>\kappa)$. Articles should cite the relevant preceding work. However, author teams can cite papers only if they are aware of them. These truisms suggest that citation probabilities should be increasing in the product of relevance and awareness. We therefore model $C_{id}^{*}=A_{id}R_{id}$, where $A_{id}$ denotes the level of awareness of citing team $i$ of paper $d$ and $R_{id}$ scores the relevance of the content of paper $d$ for paper $i$. The marginal effect of awareness is 0 for irrelevant ($R_{id}=0$) papers and the marginal effect of relevance is 0 under the condition of ignorance ($A_{id}=0$).

We model awareness as an exponential function of a vector of indicators of geographic separation, $G_{id}$, and of the educational and career linkages, $L_{id}$, between members of the two author teams. Geographic proximity matters because it increases the frequency of face-to-face interactions (from water-cooler conversations to conference meetings). Information flows can overcome geographic barriers if authors of papers $i$ and $d$ are connected by overlapping career or educational histories, or both. Past colocation or just indirect linkages such as having the same adviser at different times create a kind of connective tissue that facilitates knowledge flows. In summary, we hypothesize that $\partial A/\partial G^{(k)}<0$ for all $k$ elements of geographic separation and $\partial A/\partial L^{(k)}>0$ for all $k$ indicators of ties between author teams.

where $\Lambda(x)=(1+\exp(-x))^{-1}$.
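The logistic function $\Lambda$ enters through the citation probability. One form consistent with the surrounding definitions can be sketched as follows. It assumes, as an illustration rather than a statement of the exact specification, that awareness is log-linear in geography and ties with coefficient vectors $\gamma$ and $\lambda$, and that log relevance net of the threshold is a triad fixed effect plus a logistically distributed error $\epsilon_{id}$:

$$
\begin{aligned}
C_{id}^{*} &= A_{id}R_{id}, \qquad A_{id}=\exp\left(G_{id}'\gamma+L_{id}'\lambda\right), \\
\ln R_{id}-\ln\kappa &= \alpha_{s(i)t(i)d}+\epsilon_{id}, \\
P(C_{id}=1) &= P\left(\ln C_{id}^{*}>\ln\kappa\right)=\Lambda\left(\alpha_{s(i)t(i)d}+G_{id}'\gamma+L_{id}'\lambda\right).
\end{aligned}
$$

Under these assumptions the threshold $\kappa$ is absorbed into the fixed effect, and the last equality follows from the symmetry of the logistic distribution.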

We use logit as the primary estimator (and discuss linear probability model results in the online appendix, robustness section D) since it constrains predicted citation probabilities to be nonnegative. Logit coefficients provide the marginal effect on changes in the log odds. In the context of rare events such as citations, marginal effects on probabilities can be tiny. Singh (2005) multiplies his marginal effects by 1 million for reporting purposes. We find odds ratios are more intuitive, but as with rare diseases, one must keep in mind that a large odds ratio does not imply a large change in the probability of a positive outcome.
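The gap between a large odds ratio and a small change in probability can be made concrete with a short calculation. The sketch below is illustrative only: the helper `prob_after_odds_shift` is a hypothetical name, the baseline of 9 cites per 100,000 potential cites is the raw Web of Science rate reported in section III rather than a rate from the estimating sample, and the coefficient is the "adviser cited" estimate from table 1.

```python
import math


def prob_after_odds_shift(p0: float, coef: float) -> float:
    """Probability implied by adding `coef` to the log odds of a baseline p0."""
    odds0 = p0 / (1.0 - p0)
    odds1 = odds0 * math.exp(coef)
    return odds1 / (1.0 + odds1)


# Illustrative inputs (assumptions, see the lead-in above).
baseline = 9e-5   # roughly 9 realized cites per 100,000 potential cites
coef = 1.377      # a logit coefficient, i.e. a shift in the log odds

odds_ratio = math.exp(coef)
p1 = prob_after_odds_shift(baseline, coef)

print(f"odds ratio: {odds_ratio:.2f}")
print(f"probability moves from {baseline:.6f} to {p1:.6f}")
```

The odds roughly quadruple, yet the implied probability remains well below one in a thousand.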

The $\alpha_{s(i)t(i)d}$ fixed effects are a critical part of our estimation strategy since there is no reason to expect the geography and ties variables to be orthogonal to the triadic relevance term. Indeed, it is likely that authors of more important articles would be better connected. Moreover, authors who tend to work on similar subjects are more likely to be connected. That is, intellectual separation between $s(i)$ and article $d$ may be negatively related to $L_{id}$. We therefore estimate our model controlling for $\alpha_{s(i)t(i)d}$, the triad of subject of $i$, year of $i$, and article $d$. While we have modeled awareness as a function of geography and ties only, we could easily introduce $s(i)t(i)d$ effects and random article-pair effects. They would simply be incorporated into $\alpha$ and $\epsilon$. This means, for example, that we allow for a completely general pattern of diffusion of awareness of article $d$ on different subjects $s$.

Estimating $\alpha_{s(i)t(i)d}$ with a large number of articles is computationally difficult and raises concerns over the incidental parameters problem. Instead we take advantage of the logit feature that the total number of cites received by each triad is a sufficient statistic for $\alpha_{s(i)t(i)d}$. This permits estimation in terms of a conditional density to obtain consistent estimators of the $\gamma$ and $\lambda$ parameters. Prior work has included fixed effects for time lags (Singh, 2005), cited patents (Thompson, 2006), and cited institutions (Belenzon & Schankerman, 2013). This is the first study to control for the triad of citing article subject, citing article publication year, and cited article.
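The sufficiency property can be illustrated with a small numerical sketch. The function below computes the textbook conditional logit log-likelihood for a single group by brute-force enumeration, conditioning on the group's total number of realized citations. The function name, the toy data, and the coefficient values are assumptions made for illustration. This is not the estimation code used in the paper, and enumeration is only feasible for tiny groups.

```python
from itertools import combinations
import math


def conditional_loglik(x, y, beta, alpha=0.0):
    """Conditional logit log-likelihood for one fixed-effect group.

    x: list of covariate rows, y: list of 0/1 outcomes,
    beta: coefficient list, alpha: the group's fixed effect.
    """
    index = [alpha + sum(b * v for b, v in zip(beta, row)) for row in x]
    k = sum(y)  # number of realized citations in the group
    numerator = sum(idx for idx, cited in zip(index, y) if cited == 1)
    # Sum over every way of assigning k citations within the group.
    denominator = sum(
        math.exp(sum(index[j] for j in subset))
        for subset in combinations(range(len(y)), k)
    )
    return numerator - math.log(denominator)


# Toy group: four potential citing papers, two covariates (hypothetical data).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
y = [1, 0, 1, 0]
beta = [-0.5, 1.2]

ll_without_fe = conditional_loglik(x, y, beta, alpha=0.0)
ll_with_fe = conditional_loglik(x, y, beta, alpha=-7.0)
print(ll_without_fe, ll_with_fe)  # the two values agree up to rounding
```

Because the fixed effect multiplies the numerator and every term of the denominator by the same factor, it cancels, so the slope coefficients can be estimated without estimating any $\alpha_{s(i)t(i)d}$. A group with no realized citations contributes a log-likelihood of zero and is therefore uninformative.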

The unit of observation for citations is the article pair. However, the geography and ties variables underlying $G_{id}$ and $L_{id}$ are measured at the author-pair level. For multiple-author article pairs, we aggregate geography and ties of coauthors in the citing and cited author teams under the assumption of perfect information flow within teams. Specifically, the distance between an article pair is the minimum of the author-pair distances, and the ties between an article pair are the maximum of the author-pair ties. Appendix A.5 provides greater detail and appendix D shows that averaging geographic barriers and ties leads to similar results.
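The min/max aggregation rule can be sketched in a few lines. The data layout (a list of author-pair records with a `distance_km` field and a dictionary of 0/1 tie indicators) and the function name are assumptions made for this illustration.

```python
def aggregate_article_pair(author_pairs):
    """Collapse author-pair records to one article-pair record.

    Distance is the minimum over author pairs; each tie indicator is
    the maximum over author pairs (1 if any author pair has the tie).
    """
    distance = min(p["distance_km"] for p in author_pairs)
    tie_names = sorted({name for p in author_pairs for name in p["ties"]})
    ties = {
        name: max(p["ties"].get(name, 0) for p in author_pairs)
        for name in tie_names
    }
    return distance, ties


# Hypothetical example: one citing author, two cited coauthors.
pairs = [
    {"distance_km": 5200.0, "ties": {"coauthors": 0, "share_phd": 1}},
    {"distance_km": 0.0, "ties": {"coauthors": 0, "share_phd": 0}},
]
distance, ties = aggregate_article_pair(pairs)
print(distance, ties)
```

In this example the article pair is treated as colocated and as sharing a PhD program, because one author pair satisfies each condition.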

## III. Data

Our data set combines four main sources. The Web of Science (WOS) provides citations, author affiliations, and keywords. The Mathematics Genealogy Project (MGP) tracks the place and time of doctoral education, as well as the names of the dissertation supervisor(s). Zentralblatt MATH classifies mathematical articles at the five-digit level. Finally, the longitudes and latitudes for 1,000 mathematics institutions used to calculate distance data between citing and cited author teams come from Google Maps. Further information on data sources is provided in appendix A.1.

### A. Citations and Distance Data

Figure 1 provides a first look at the patterns in the WOS citation data and how they relate to geographic distance between authors. It graphs survival functions for citation flows as a function of distance between the authors of the citing and cited papers. $S^{c}(D)$ is the share of all cites that occur with distance $\geq D$. Citations from one nation to another are calculated by summing the citations from papers written by authors affiliated with institutions in country $j$ to papers written by authors in country $n$.^{4}

The benchmark for cites is a dartboard model that takes as given each country's outward citations, $C_{j}\equiv\sum_{n}C_{jn}$, and inward citations, $C_{n}\equiv\sum_{j}C_{jn}$. The international allocation of these citations is completely random; that is, each paper is equally likely to cite any other paper regardless of distance. Randomness implies that outgoing cites from $j$ go to country $n$ with probability given by $n$'s share of all received cites. Thus, the aggregate flow of benchmark cites from $j$ to $n$ is given by $F^{c}_{jn}\equiv C_{j}(C_{n}/C_{w})$, where $C_{w}$ sums all cites in the world. $F^{c}_{jn}$ can be thought of as the frictionless flow of citations from $j$ to $n$. The survival curve for the benchmark is $\bar{S}^{c}(D)=\sum_{\text{dist}_{jn}\geq D}F^{c}_{jn}/C_{w}$. Figure 1 displays $S^{c}(D)$ and $\bar{S}^{c}(D)$ using solid and dashed black lines. The vertical gap between $\bar{S}^{c}(D)$ and $S^{c}(D)$ measures the frictions that divert citations away from the dartboard benchmark.
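The dartboard benchmark and the two survival curves can be computed directly from a matrix of bilateral citation counts. The sketch below uses NumPy and made-up numbers for two countries. The function name and the inputs are hypothetical and are not taken from the paper's data.

```python
import numpy as np


def survival_curves(cites, dist, thresholds):
    """Actual and dartboard-benchmark survival functions of citation flows.

    cites[j, n]: cites from country j to country n.
    dist[j, n]:  distance between j and n (internal distance when j == n).
    """
    cites = np.asarray(cites, dtype=float)
    dist = np.asarray(dist, dtype=float)
    total = cites.sum()                              # C_w
    outward = cites.sum(axis=1)                      # C_j
    inward = cites.sum(axis=0)                       # C_n
    benchmark = np.outer(outward, inward) / total    # F_jn = C_j * C_n / C_w
    actual = np.array([cites[dist >= d].sum() / total for d in thresholds])
    bench = np.array([benchmark[dist >= d].sum() / total for d in thresholds])
    return actual, bench


# Made-up flows and distances (km) for two countries.
cites = [[80, 20], [10, 90]]
dist = [[100, 5000], [5000, 150]]
actual, bench = survival_curves(cites, dist, thresholds=[0, 1000])
print(actual, bench)
```

In this toy example, 15% of actual cites travel at least 1,000 km, against 50% under the dartboard benchmark. The gap is the friction the text describes.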

The blue lines in figure 1 permit comparison with actual and benchmark flows of trade in goods. Research using gravity equations has established that distance is a major friction impeding trade in goods.^{5} To facilitate comparisons with the citation data, we employ trade data sets that measure each origin's aggregate flows, including those that remain within that origin. Thus, trade flows to self are value-added minus exports of value-added to the rest of the world.

Figure 1a displays flows of manufacturing value-added between and within 63 countries derived from the Trade in Value Added (TiVA) data set made available through a joint effort of the World Trade Organization (WTO) and the Organization for Economic Cooperation and Development (OECD). One prominent aspect of figure 1a is that a very large share of trade takes place within countries. The precipitous drops in the survival functions for both cites and goods seen at 1,854 kilometers correspond to the CEPII internal distance of the United States (the average distance between 20 major cities).

To see what is happening within this important set of intranational flows, we display the survival functions for state-to-state citations and trade in figure 1b. Citations are aggregated up to the state level just as they were for countries in figure 1a. The value of goods transported in 2007 between and within the fifty states and Washington, DC, comes from the Freight Analysis Framework (FAF) database. We display distances up to 5,000 kilometers (excluding some Hawaii and Alaska dyads) because by that distance, both benchmarks and actual flows are indistinguishable visually from 0.

What we learn from figure 1 is that distance attenuates knowledge flows in mathematics, leading them to occur over shorter ranges than one would expect in a frictionless world. This is true at international scale and also true within the United States. The gap between actual and benchmark citation flows is much smaller than what we observe for goods flows. This is consistent with the hypothesis that trade flows are attenuated by both transport costs and information decay. Moreover, distance decay effects in commercial activities may be larger than those that apply to researchers.

### B. Author Ties Based on Career and Educational Histories

The WOS contributes three indicators of ties based on past coauthors and past affiliations. Each tie variable is based on actions taken prior to the publication year of the relevant citing article. “Coauthors” indicates whether author pairs have collaborated on a paper published in one of the 255 math journals included in WOS since 1975. “Coincided past” requires colocation at the same institution in the same year but the authors no longer work at the same place. “Worked same place” indicates that two authors worked at the same institution in different years in the past.

The MGP data allow us to construct eleven additional binary ties based on three types of relationships. “Share Ph.D.” denotes author pairs who graduated from the same PhD program within a five-year period and are therefore assumed to have overlapped. There are four types of academic “relatives.” The first type are academic parents: “adviser citing,” which takes the value of 1 if the author of the citing article was the PhD adviser of the author of the cited article. For “adviser cited,” the citing author was the advisee. Academic siblings were both supervised by the same professor. Academic grandparents are the advisers of the citing or cited authors' advisers. Academic cousins are authors who share a grandparent. Academic uncles are the advisees of one's academic grandparent. The final category of educational links is with the alma mater. These indicate when the citing or cited author is affiliated with the institution where the other author received her PhD. For example, “alma mater cited” takes a value of 1 when a Princeton alumnus cites a professor currently affiliated with Princeton. The tie dummy variables are additive: if author $i$ has coauthored with author $d$ who is also $i$'s PhD adviser, there would be 1s for both coauthor and adviser cited.
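The adviser-based "relatives" can be derived mechanically from a map of each author to his or her advisers. The sketch below follows the verbal definitions literally. The function name, the data layout, and the choice to exclude an author's own adviser from the uncle indicators are assumptions made for illustration, and the paper's exact coding rules (for instance, whether siblings are also counted as cousins) may differ. The two authors are assumed to be distinct.

```python
def education_ties(citing, cited, adviser_of):
    """Adviser-network tie indicators between a citing and a cited author.

    adviser_of maps each author to the set of his or her PhD advisers.
    """
    def advisers(author):
        return adviser_of.get(author, set())

    def grandparents(author):
        return {g for a in advisers(author) for g in advisers(a)}

    return {
        "adviser_citing": citing in advisers(cited),
        "adviser_cited": cited in advisers(citing),
        "siblings": bool(advisers(citing) & advisers(cited)),
        "grandparent_citing": citing in grandparents(cited),
        "grandparent_cited": cited in grandparents(citing),
        "cousins": bool(grandparents(citing) & grandparents(cited)),
        # Assumption: an uncle is an advisee of one's grandparent
        # other than one's own adviser.
        "uncle_citing": bool(advisers(citing) & grandparents(cited))
        and citing not in advisers(cited),
        "uncle_cited": bool(advisers(cited) & grandparents(citing))
        and cited not in advisers(citing),
    }


# Hypothetical genealogy: G advised P and U; P advised A; U advised B.
genealogy = {"P": {"G"}, "U": {"G"}, "A": {"P"}, "B": {"U"}}
print(education_ties("A", "P", genealogy)["adviser_cited"])  # A cites own adviser
print(education_ties("A", "U", genealogy)["uncle_cited"])    # A cites an uncle
print(education_ties("A", "B", genealogy)["cousins"])        # A cites a cousin
```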

### C. Realized Citations and Control Citations

The MGP sample we use in most estimations has 29,404 realized citations. The complete set of steps leading to this sample is described in appendix A.2. To estimate the regressions, we need to combine the realized-citations set with a nonrealized-citation set. A standard exogenous sampling approach would entail picking a set of citing articles and constructing the universe of papers they might cite and predicting which potential cites are actually realized. Applying such an approach in the case of citations creates both conceptual and practical problems. First, it is hard to determine the appropriate universe. Should we consider the applied math papers that might have cited a given paper, the physics papers, the economics papers? The data-gathering challenge for a true universe of potential citing papers would be formidable. There would also be computational difficulties with incorporating so many noncitation observations. Citations are an example of a rare event problem. In the Web of Science sample (before imposing the requirement of MGP data on all authors), there are approximately 3 billion potential cites and about 269,000 realized cites. Thus, the rate of citation is only 9 per 100,000. In response to this problem, the patent citation literature has generally adopted a choice-based sampling approach following the matching methodology of Jaffe et al. (1993). For each realized citation (case), a single nonrealized citation (control) is selected at random from a larger set of matched potential controls.^{6}

We adopt the one-control-per-case approach when using the whole WOS sample. However, the sample featuring our full set of ties has a small enough number of realized citations that we can incorporate all potentially citing papers that meet certain criteria. Our baseline matching criterion is that controls be published in the same year and the same three-digit field as the original citing paper (case). The union of the realized citations and the control group constitutes the sample that is used in the econometric analysis.^{7} The presence of triadic fixed effects means that we have effectively the full set of control observations. To see this, imagine another field $A$ in which none of the papers cite a given paper $d$. Then the $A$-$d$ part of the triadic fixed effect would be a perfect predictor for noncitation so all such observations would be automatically dropped from the fixed-effects logit estimation. Appendix A.2 shows that the differences between the realized citation set and the control set are in line with our expectations.
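The baseline matching rule can be sketched as follows: for every realized citation of article $d$, all papers published in the same year and three-digit field as the citing paper enter the sample, flagged by whether they cite $d$. The function name, identifiers, and data layout are hypothetical, and the additional sample restrictions described in appendix A.2 are not reproduced here.

```python
def build_sample(papers, realized):
    """Combine realized citations with matched nonrealized citations.

    papers:   dict mapping paper id -> (publication year, three-digit field)
    realized: set of (citing id, cited id) pairs
    Returns rows of (potential citing id, cited id, citation indicator).
    """
    # Each realized citation defines a (year, field, cited article) triad.
    triads = {(papers[i][0], papers[i][1], d) for i, d in realized}
    rows = []
    for year, field, cited in sorted(triads):
        for pid, (p_year, p_field) in sorted(papers.items()):
            if (p_year, p_field) == (year, field) and pid != cited:
                rows.append((pid, cited, int((pid, cited) in realized)))
    return rows


# Hypothetical papers: i1 cites d1; i2 shares i1's year and field.
papers = {
    "i1": (2005, "35J"),
    "i2": (2005, "35J"),
    "i3": (2005, "11M"),
    "i4": (2006, "35J"),
    "d1": (2000, "35J"),
}
sample = build_sample(papers, realized={("i1", "d1")})
print(sample)
```

Papers `i3` and `i4` are excluded because no paper in their field-year cites `d1`. Their triads would be dropped by the fixed-effects logit in any case.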

## IV. Regression Results

This section presents the main results regarding the effect of geography and ties on knowledge flows. All regressions are logits with fixed effects for each group defined by citing field (three-digit subject codes), citing year, and cited article. Conditional on these fixed effects, variation in geography and ties is assumed to be random, allowing for a causal interpretation of the estimates. We recognize this is a strong assumption. Appendix B provides evidence that the three-digit subject controls are effective at reducing bias due to endogenous ties. A large, highly significant association between ties and citations holds up with even the most stringent measure of subject (using the same keywords).

There are four key findings. First, the effects of distance, borders, and language differences are about half as strong once educational and career links are taken into account. Second, thirteen of the fourteen measures of ties have positive effects that are significant at the 5% level in our final specification. On average, the effect of adding a tie raises the odds of citation by 80%, with some ties having much bigger effects. Third, ties and geography affect different types of papers differently. In particular, less prominent and more recently published papers exhibit stronger effects. Finally, while the importance of distance has declined to the point of statistical insignificance in recent years, ties remain as valuable as ever. Appendix D shows the robustness of our main results to using alternative subsamples and specifications.

### A. Baseline

Table 1 reports the result of baseline logit coefficients, which have the interpretation of marginal effects on the log odds. Statistical significance is calculated using standard errors that are clustered at the cited article level to allow for correlations in the errors across potentially citing articles for the same cited article.^{8}

| Sample | (1) WOS | (2) WOS | (3) MGP | (4) MGP | (5) MGP | (6) MGP |
|---|---|---|---|---|---|---|
| Distance $>0$ | −1.008^{***} | −0.936^{***} | −1.243^{***} | | −0.571^{***} | |
| ln Dist $\mid$ Dist $>0$ | −0.073^{***} | −0.052^{***} | −0.068^{***} | Figure 2 | −0.037^{***} | Figure 2 |
| Different country | −0.198^{***} | −0.140^{***} | −0.232^{***} | −0.270^{*} | −0.090^{***} | −0.103^{***} |
| Different language | −0.104^{***} | −0.066^{***} | −0.082^{***} | −0.079^{*} | −0.025 | −0.025 |
| Co-authors | | 1.672^{***} | | | 1.572^{***} | 1.581^{***} |
| Coincided past | | 0.712^{***} | | | 0.378^{***} | 0.378^{***} |
| Worked same place | | 0.478^{***} | | | 0.342^{***} | 0.339^{***} |
| Share PhD (5 years) | | | | | 0.463^{***} | 0.457^{***} |
| PhD siblings | | | | | 0.663^{***} | 0.666^{***} |
| PhD cousins | | | | | 0.365^{***} | 0.362^{***} |
| Adviser citing | | | | | 1.090^{***} | 1.079^{***} |
| Adviser cited | | | | | 1.377^{***} | 1.375^{***} |
| Academic grandparent citing | | | | | −0.284 | −0.254 |
| Academic grandparent cited | | | | | 1.028^{***} | 1.023^{***} |
| Academic uncle citing | | | | | 0.227^{*} | 0.236^{**} |
| Academic uncle cited | | | | | 0.616^{*} | 0.619^{***} |
| Alma mater citing | | | | | 0.239^{*} | 0.233^{*} |
| Alma mater cited | | | | | 0.120^{**} | 0.119^{**} |
| Observations | 537,054 | 537,054 | 441,792 | 441,792 | 441,792 | 441,792 |
| Pseudo-$R^{2}$ | 0.044 | 0.085 | 0.033 | 0.034 | 0.091 | 0.091 |

Significance levels (based on standard errors clustered by cited article): $***$1%, $**$5%, and $*$10%.

The first specification has only the four geographic explanatory variables: an indicator for distance greater than 0 (not being at the same institution), log distance (interacted with the positive distance indicator), and indicators for residing in different countries and in countries that have different official languages. The two-part distance function is necessary because there is no good way to directly measure the distance between two scholars at the same institution. The first of the two parts implicitly estimates this distance. The indicator for distance greater than 0 is equivalent to a “different university” dummy. The two-part formulation has a jump from 0 to positive distances, but thereafter, the elasticity of citation odds with respect to distance is constant. While a constant elasticity of distance in trade equations is the standard assumption underlying gravity equations, there is little a priori reason to expect this relationship to carry over to citations. Therefore, we reestimate specifications in columns 3 and 5 with distance-interval step functions in columns 4 and 6.
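The two-part distance function can be written out explicitly. In the sketch below, the function names are hypothetical, distance is assumed to be measured in kilometers, and the coefficients are the column 1 estimates from table 1, used purely for illustration.

```python
import math


def distance_terms(distance_km: float):
    """Regressors of the two-part distance function.

    Returns (indicator for distance > 0, log distance interacted
    with that indicator).
    """
    positive = 1 if distance_km > 0 else 0
    log_dist = math.log(distance_km) if positive else 0.0
    return positive, log_dist


def distance_log_odds(distance_km: float, b_positive: float, b_log: float) -> float:
    """Contribution of distance to the log odds of citation."""
    positive, log_dist = distance_terms(distance_km)
    return b_positive * positive + b_log * log_dist


# Illustrative coefficients: column 1 of table 1.
B_POSITIVE, B_LOG = -1.008, -0.073

same_institution = distance_log_odds(0.0, B_POSITIVE, B_LOG)
at_1000_km = distance_log_odds(1000.0, B_POSITIVE, B_LOG)
at_2000_km = distance_log_odds(2000.0, B_POSITIVE, B_LOG)
print(same_institution, at_1000_km, at_2000_km)
```

Leaving the same institution shifts the log odds by the indicator coefficient plus the log-distance term. Beyond that jump, each doubling of distance lowers the log odds by the same amount, the elasticity times $\ln 2$.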

The second specification adds ties constructed from the WOS database. The third to sixth specifications restrict the sample to articles with full information from the MGP database. The overall estimating sample does not decline much because the MGP sample uses all available controls (noncitations in the same subject-year), whereas the WOS sample has just one control per case. As in the first two columns, we show the effects of geography without ties (columns 3 and 4) and then add the full set of ties available in the MGP data (columns 5 and 6).

Specification 1 presents significantly negative coefficients on distance and borders, suggesting that physical distance and borders indeed impede knowledge flows. We estimate smaller border and distance effects than those obtained by Singh and Marx (2013) using citations of U.S. patents. Whereas we find that crossing a national border reduces citation odds by 18% ($\exp(-0.198)-1=-0.18$), they find a 41% reduction (specification 6 of Singh and Marx's table 5). Our distance elasticity is $-0.073$, whereas theirs is $-0.137$. While it is tempting to attribute this halving of geography effects to differences between academic and commercial diffusion of ideas, other evidence from patents yields magnitudes similar to those in our column 1. As the coefficients show the marginal effects on the log odds of citation and citation is rare, the dependent variable approximates the log probability and should therefore be proportional to the log citation flow in aggregated data. This means we can compare our estimates directly to the results from the gravity-type regressions on patent citations that Peri (2005) and Li (2014) estimated. The different country (border) effect we estimate is $-0.198$, whereas the baseline estimate of Peri (2005) is $-0.19$. Li (2014), also estimating a patent citation gravity equation, reports distance elasticities (after controlling for subnational borders) from $-0.03$ to $-0.067$, which are slightly weaker than those reported in our column 1.

All of these results support the conclusion that border and distance decay of citations are considerably smaller than the effects typically estimated for trade in goods. Nevertheless, it may be surprising to many that geography has a significant impact on academic citations at all. We now show that the estimated effects are substantially reduced by controlling for ties.

The second specification shows that the three measures of career ties (past coauthorship, past colocation, and past work at the same institution) all have strong, positive associations with citation. As exponentiating the coefficients in a logit expresses the effects in terms of the change in citation odds ratios, the 0.712 coefficient on past colocation implies that even after colleagues have moved to separate institutions, they have 104% higher odds of citing each other ($\exp(0.712)-1=1.04$). Prior coauthors are even more likely to cite each other. We also see that the inclusion of career ties lowers geography effects somewhat.

Comparing columns 1 and 3, we see that estimating the same specification on the MGP-restricted sample does not change the geography coefficients by more than one would expect given the standard errors.^{9} Comparing columns 3 and 5, we see one of the headline results of this paper: controlling for ties shrinks the negative effects of geographic separation by about 50%. The ratios of the four geography coefficients in column 5 to the corresponding coefficients of column 3 are 0.46, 0.54, 0.39, and 0.30. The omitted variable bias formula tells us that this means that ties and geography are correlated and that the pure partial effect of being far away or in a foreign country is overestimated in regressions that omit controls for ties.

Columns 5 and 6 in table 1 show that with just one exception, ties have systematically positive effects on citation probability. All of the estimates are statistically significant at the 5% level except grandparent citing, which has an imprecisely measured negative effect, and uncle citing, which has a borderline significant result in column 5. The average over all fourteen ties' coefficients is 0.59, implying that the average tie raises the odds of citation by 80%. The addition of the full set of ties in column 5 dramatically increases the fit of the logit to the data: the pseudo-$R^2$ nearly triples from 0.033 to 0.091.^{10}

There are three tie relationships where one can identify the more senior of the two authors: advisers, uncles, and grandparents. In each case, we observe the author in the teaching role is more likely to be cited than to cite. Advisees massively overcite their adviser's papers by a factor of 4 (the second largest impact of the fourteen types of ties). In the reverse direction, we find that advisers overcite their advisees' articles by a factor of 3. The academic nephew overcites his uncle by 86%, but the reverse direction features a bias of just 26%. The most pronounced asymmetry emerges when we skip a generation. Authors overcite their adviser's advisers (academic grandparents) by a factor of 3. Yet this intergenerational flow is not reciprocated: the grandparents' propensity to cite advisees of their advisees is not significantly different from 0. These vertical patterns support the hypothesis that citations transmit knowledge.
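The odds conversions used in this discussion all follow from exponentiating a logit coefficient. The short Python sketch below reproduces several of the figures quoted above; the function names are illustrative, and the coefficients are copied from the text and from the column 6 entries of table 1.

```python
import math


def odds_factor(coef: float) -> float:
    """Multiplicative change in citation odds implied by a logit coefficient."""
    return math.exp(coef)


def pct_change_in_odds(coef: float) -> float:
    """Percent change in citation odds implied by a logit coefficient."""
    return 100.0 * (math.exp(coef) - 1.0)


# Coefficients quoted in the text or listed in table 1 (column 6 for the MGP ties).
coefficients = {
    "different country": -0.198,
    "past colocation": 0.712,
    "average tie": 0.59,
    "adviser cited": 1.375,
    "adviser citing": 1.079,
    "academic uncle cited": 0.619,
}

for label, coef in coefficients.items():
    print(
        f"{label:22s} coef={coef:+.3f} "
        f"odds factor={odds_factor(coef):.2f} "
        f"change={pct_change_in_odds(coef):+.0f}%"
    )
```

Because citations are rare events, these changes in odds are close to the corresponding changes in citation probability.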

Figure 2 illustrates the coefficients on each of the twelve steps in the nonparametric estimation of distance effects conducted in specification 6 of table 1, represented with black circles. The vertical axis depicts the reduction in the log odds of citation associated with each step relative to working at the same institution. We also show with blue squares the corresponding estimates for the twelve-step specification omitting ties. For each set of steps, we overlay the implied reduction in the log odds of citation based on the two-part coefficients from specifications 3 and 5. The key finding illustrated in the figure is that after the dramatic fall associated with positive distance, the subsequent declines are consistent with a constant elasticity decay rate. Controlling for ties moves the decay function up (lower effect of being at different institutions) and flattens it. After controlling for ties, the two-part prediction lies within 2 standard errors for eleven of twelve steps.^{11} Clearly there is a big discontinuity between 0 and positive distances corresponding to a same-university effect. Conditional on positive distance, the figure shows that it is hard to distinguish empirically between a decay function that is flat after 1,000 kilometers and one that exhibits regular decay with a constant elasticity of $-0.037$. Since the two-part approach adequately captures distance effects, we use it for all the subsequent estimations.
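To make the two-part distance function concrete, the sketch below evaluates the implied change in the log odds of citation at a few distances. It is only an illustration: the coefficients are the no-interaction estimates listed in column 1 of table 2 ($-0.571$ on the positive-distance indicator and $-0.037$ on log distance), the function name is hypothetical, and measuring distance in kilometers is an assumption.

```python
import math

B_POSITIVE = -0.571  # coefficient on the indicator for distance > 0
B_LOG_DIST = -0.037  # coefficient on ln(distance), applied only when distance > 0


def two_part_log_odds(distance_km: float) -> float:
    """Change in log odds of citation relative to authors at the same institution."""
    if distance_km < 0:
        raise ValueError("distance must be nonnegative")
    if distance_km == 0:
        return 0.0  # same institution: the reference category
    return B_POSITIVE + B_LOG_DIST * math.log(distance_km)


for km in (0, 10, 100, 1000, 5000, 10000):
    print(f"{km:>6} km: {two_part_log_odds(km):+.3f}")
```

Most of the penalty comes from the jump at positive distance; moving from 1,000 to 5,000 kilometers lowers the log odds by only about 0.06 more.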

The negative effect of geographic barriers on citation probabilities is presumed to arise because these barriers reduce the frequency of face-to-face interactions. In academia (as well as other areas), coattendance at conferences provides one of the most important opportunities to meet in person with scholars doing related work. We collected data on papers presented between 1990 and 2009 at one of the most important conferences, the Joint Mathematics Meetings (JMM). The meetings are held annually in the United States, with an average of 1,459 participants presenting 1,037 papers.

The first exercise we conduct, reported in table C.1, is to show a strong and precisely estimated negative effect of distance on the probability of attending a conference. Because the conference venue moves each year, the data exhibit substantial variation in distance for a given scholar. This permits estimation of the logit with author-specific fixed effects. The distance elasticity in this specification is $-0.136$ with a 0.016 standard error (clustered by author).^{12}

The second exercise, reported in table C.3, uses the conference data to show the impact of attendance on citation. Using the table 1, column 5 specification, we add indicators of coinciding at the same conference (as presenters or session organizers). While just coinciding has a negligible effect on citation, coinciding when the (potentially) cited paper is presented increases the odds of citation by a factor of 8.3. Table C.3 also shows a positive effect of presenting at the same session, regardless of whether it was the citing or cited paper. Contrary to our own observation of presenters being encouraged to cite the work of coattendees at a session, we find no significant evidence of a citing paper effect in these data. While these two exercises are confined to the one conference for which we could obtain long-term conference participation data, they illustrate a broader mechanism that we view as underlying distance effects on citation. Proximate authors are more likely to present at the same conferences, and when they do so, this makes the citing authors aware of relevant new research that they build on in their own work.^{13}

Why does controlling for academic linkages lead to the large reduction in distance effects shown in table 1? It must be that ties are negatively correlated with geographic barriers. We illustrate this in figure 3a, which shows that linked authors tend to be closer to each other than authors who have no ties. For example, about 33% of tied authors are more than 5,000 kilometers apart, compared to almost 60% of nontied authors. Similarly, tied authors are much more likely than nontied authors to reside in the same country (51% versus 16%) or countries that share a common language (65% versus 32%).

Figure 3b reveals the phenomenon that helps to understand our baseline results: mathematicians tend to remain close to the university where they obtained their doctorates. Thirty percent either do not leave or have returned, and only 18% move more than 5,000 kilometers away.^{14} Proximity to the alma mater is likely to beget proximity to one's adviser (and his or her adviser), former classmates, and others. The story underlying our baseline results is a simple but important illustration of omitted variable bias. Ties are very important for citation, but they are negatively correlated with distance. Thus, a failure to control for ties leads to the inference that distance has a greater direct impact on knowledge flows than is truly the case. Authors are unlikely to cite papers written by faraway authors partly because they are less likely to have interacted at conferences; an equally important factor is that they are less likely to have an academic or career tie with each other.
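The omitted variable logic can be illustrated with a small simulation. The sketch below is not the paper's estimator: it uses simulated data, a plain logit without fixed effects, a single binary tie, and hypothetical parameter values (`TRUE_DIST`, `TRUE_TIE`). It shows only that when ties raise citation odds and become rarer with distance, a regression omitting ties overstates the distance effect.

```python
import math
import random


def solve(a, b):
    """Solve a small linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_logit(rows, y, iters=15):
    """Newton-Raphson for a plain logit; each row must already include a constant."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


random.seed(0)
N = 20000
TRUE_DIST, TRUE_TIE = -0.2, 1.0  # hypothetical "true" effects on the log odds

dist, tie, cite = [], [], []
for _ in range(N):
    d = random.gauss(0.0, 1.0)  # standardized log distance
    # Ties are more likely between nearby authors (negative correlation with distance).
    t = 1 if random.random() < 1.0 / (1.0 + math.exp(d)) else 0
    p = 1.0 / (1.0 + math.exp(-(-1.0 + TRUE_DIST * d + TRUE_TIE * t)))
    dist.append(d)
    tie.append(t)
    cite.append(1 if random.random() < p else 0)

short = fit_logit([[1.0, d] for d in dist], cite)  # omits ties
long_ = fit_logit([[1.0, d, t] for d, t in zip(dist, tie)], cite)  # controls for ties

print(f"distance coefficient without ties: {short[1]:+.3f}")
print(f"distance coefficient with ties:    {long_[1]:+.3f}")
```

In this setup the distance coefficient from the regression without ties is markedly more negative than the one that controls for ties, which is the same qualitative pattern as the move from column 3 to column 5 of table 1.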

### B. Evidence for Information Mechanisms

The results we have obtained so far point to an important role for educational and career ties in fostering citations. The underlying mechanism we imagine is one of communication along the network of ties that causes one set of authors to become aware of useful theorems and conjectures provided by other authors. This information transfer mechanism predicts that the presence of ties should matter more for certain types of papers than others. Specifically, we conjecture that authors rely more on their ties to find out about work that is less widely known, more recently written, and further from the author's expertise. To the extent that face-to-face interactions matter more for such papers, geographic barriers should be stronger as well.

We develop three proxies for papers that researchers are less likely to know about. First, we categorize papers as “obscure” if they receive less than or equal to the median number of cites (three). Our second proxy for low awareness is the gap in time between when the citing and potentially cited papers were published. A paper is “recent” if the gap is less than or equal to the median gap in our data (nine years). The third awareness measure follows from the observation that authors are more familiar with work in their own fields than in other subject areas. We classify papers as in a different field if their two-digit mathematical subject classifications (MSC) differ (e.g., 11 [number theory] versus 14 [algebraic geometry]). As we show in table D.4 in the appendix, these specific rules for categorizing obscure, recent, and different field are not critical for the results.
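The three classification rules can be written down compactly. The following Python sketch is one way to encode them; the `Pair` record and its field names are hypothetical, while the thresholds (three cites, nine years) and the two-digit MSC comparison come from the definitions above.

```python
from dataclasses import dataclass

MEDIAN_CITES = 3  # median number of citations received, as stated in the text
MEDIAN_LAG_YEARS = 9  # median gap between citing and cited publication years


@dataclass
class Pair:
    """A citing paper matched with a potentially cited paper (hypothetical layout)."""

    cited_total_cites: int
    citing_year: int
    cited_year: int
    citing_msc: str  # full MSC code, for example "11G05"
    cited_msc: str


def awareness_flags(pair: Pair) -> dict:
    """Return the three low-awareness indicators for one pair."""
    return {
        "obscure": pair.cited_total_cites <= MEDIAN_CITES,
        "recent": pair.citing_year - pair.cited_year <= MEDIAN_LAG_YEARS,
        # Fields differ when the first two digits of the MSC codes differ.
        "different_field": pair.citing_msc[:2] != pair.cited_msc[:2],
    }


example = Pair(cited_total_cites=2, citing_year=2005, cited_year=1990,
               citing_msc="11G05", cited_msc="14H52")
print(awareness_flags(example))
```

In the example, the cited paper is obscure (two cites), not recent (a fifteen-year gap), and in a different field (11 versus 14).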

Table E.2 in the appendix provides summary statistics on these variables. Not surprisingly, there are lower average numbers of cites for obscure papers and recent papers. We see approximate balance between the average number of cites to the same and to different fields. There are more observations in total featuring cites within the same field, so this suggests that cross-field citations go mainly to more prominent papers. In terms of ties, on average the differences between obscure and recent papers are small. The fact that ties are higher for same-field papers probably reflects greater ties within the same field. This is an important reason why our fixed effects control for the citing paper's three-digit subject code.

Table 2 reports the detailed results for the three awareness proxies. To reduce the number of parameters to be displayed and discussed, we report the average of thirteen ties indicators, followed in the next column by the averages over thirteen interaction terms.^{15} Column 1 reports the corresponding regression without interactions for comparison purposes.

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| Interaction | | Obscure | Obscure | Recent | Recent | Different Field | Different Field |
| Specification | | Base | Interact | Base | Interact | Base | Interact |
| Geography | | | | | | | |
| Distance $>0$ | −0.571^{**} | −0.541^{**} | −0.011 | −0.321^{***} | −0.339^{**} | −0.804^{***} | 0.147 |
| | (0.073) | (0.083) | (0.166) | (0.107) | (0.137) | (0.136) | (0.198) |
| ln Dist $\mid$ Dist $>0$ | −0.037^{***} | −0.031^{***} | −0.037^{*} | −0.021^{*} | −0.028^{*} | −0.026^{*} | −0.004 |
| | (0.008) | (0.009) | (0.019) | (0.011) | (0.015) | (0.014) | (0.021) |
| Different country | −0.090^{***} | −0.082^{**} | −0.061 | −0.095^{**} | 0.002 | −0.072 | 0.022 |
| | (0.031) | (0.034) | (0.079) | (0.044) | (0.060) | (0.056) | (0.088) |
| Different language | −0.025 | −0.028 | 0.014 | −0.017 | −0.015 | −0.037 | −0.075 |
| | (0.026) | (0.029) | (0.064) | (0.037) | (0.048) | (0.047) | (0.072) |
| Ties | | | | | | | |
| Average effect of ties | 0.652^{***} | 0.619^{***} | 0.135^{***} | 0.543^{***} | 0.176^{***} | 0.572^{***} | 0.223^{***} |
| | (0.018) | (0.020) | (0.059) | (0.027) | (0.036) | (0.036) | (0.057) |
| Observations | 441,792 | 441,792 | | 441,792 | | 225,768 | |
| Pseudo-$R^2$ | 0.091 | 0.092 | | 0.093 | | 0.100 | |


Robust standard errors clustered by cited article in parentheses. Significance: $***$1%, $**$5%, and $*$10%. Average effect of ties is the mean of the base and interaction coefficients of thirteen ties (three WOS and ten MGP). “Obscure” indicates that total citations received for this article are less than or equal to the median number of citations received among all articles, “recent” corresponds to citation lags less than or equal to the median, and “different field” equals 1 if citing article and cited article belong to different two-digit MSCs.

The first set of interactions in table 2 shows the results of interacting geography and ties with an indicator for obscure papers. Column 2 shows the base effects corresponding to nonobscure papers, and column 3 shows the coefficient on each corresponding interaction. We find that the more prominent papers (more than three cites) have an 18% ($=0.135/(0.619+0.135)$) smaller coefficient on the average of ties than the lesser-known papers. This is consistent with the interpretation that ties facilitate awareness. Papers that are big successes require less help from networks to promote transmission. The coefficient on log distance is about 2.2 times as large for obscure papers ($-0.031-0.037=-0.068$ versus $-0.031$).

When the interaction is changed to distinguish recent versus older papers, the results are similar, as shown in columns 4 and 5. The coefficient on the average effect of ties is 24% ($=0.176/(0.543+0.176)$) smaller for older papers than for recent ones. Distance decays are estimated at $-0.021-0.028=-0.049$ for papers in their first nine years after publication (the median age of papers in our sample) and $-0.021$ thereafter. These numbers are remarkably similar to those reported by Li (2014) in a gravity-style study of intercity patent citation flows. She finds that the distance elasticity declines monotonically with age from $-0.028$ in the first five years to $-0.014$ for patents granted twenty or more years before. These combined findings of significantly higher geographic concentration of new knowledge are intuitively appealing and provide some guidance for models of knowledge diffusion.^{16}

Ties also have larger impacts for papers in different fields: the interaction reported in column 7 implies that the coefficient for same-field papers is 28% ($=0.223/(0.572+0.223)$) smaller than for different-field papers. None of the different-field geographic interactions are statistically significant, suggesting that face-to-face communication matters more for obscure and recent papers than for different fields.
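The comparisons drawn from table 2 are simple functions of the base and interaction coefficients. The sketch below recomputes them; the dictionary layout and function names are illustrative, and the numbers are copied from the table. It reports both the interaction as a share of the total coefficient, which is the convention behind the 18%, 24%, and 28% figures, and the interaction relative to the base coefficient.

```python
# (base, interaction) coefficient pairs copied from table 2.
TABLE2 = {
    "obscure": {"ties": (0.619, 0.135), "ln_dist": (-0.031, -0.037)},
    "recent": {"ties": (0.543, 0.176), "ln_dist": (-0.021, -0.028)},
    "different_field": {"ties": (0.572, 0.223), "ln_dist": (-0.026, -0.004)},
}


def implied_total(base: float, interact: float) -> float:
    """Coefficient for the flagged group (obscure, recent, or different field)."""
    return base + interact


def interaction_share(base: float, interact: float) -> float:
    """Interaction as a share of the flagged group's total coefficient."""
    return interact / (base + interact)


def relative_to_base(base: float, interact: float) -> float:
    """Interaction as a proportion of the base-group coefficient."""
    return interact / base


for proxy, rows in TABLE2.items():
    t_base, t_int = rows["ties"]
    d_base, d_int = rows["ln_dist"]
    print(
        f"{proxy:16s} ties total={implied_total(t_base, t_int):.3f} "
        f"share={interaction_share(t_base, t_int):.2f} "
        f"vs base={relative_to_base(t_base, t_int):.2f} "
        f"distance elasticity={implied_total(d_base, d_int):.3f}"
    )
```

Measured against the base group instead, the ties coefficient is roughly 22%, 32%, and 39% larger for obscure, recent, and different-field papers, respectively.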

All three sets of interactions therefore support the premise that scholars draw more heavily on their connections when obtaining less familiar information. The positive interactions between ties and information proxies have similar magnitudes and strong statistical significance in five alternative specifications described in appendix section D. These robustness regressions also find statistically significant (10% or better) negative effects for the geography-information interactions in eleven of thirty estimates. The remaining estimates are mainly negative but not statistically different from 0.

Tables 2 and D.4 show strong and robust evidence that ties matter more for three types of papers where awareness poses a more serious challenge. We also find that geographic barriers pose a greater impediment to citation for recent papers in nearly every specification. This evidence supports the interpretation of ties as facilitating information transfer rather than an alternative mechanism involving “citation cliques.” Under this alternative, scholars have perfect awareness of the relevant research in their field but choose to cite specific prior work because it was written by scholars with whom they have some kind of social affiliation. If ties are just proxies for intragroup loyalties, it is not obvious why such forces should be relatively more important specifically for the types of papers where the awareness gap is predictably larger.^{17}

A third mechanism, combining elements of information and affiliation, is also consistent with our results. In this story, mathematicians are aware of relevant work but uncertain of whether the proofs those papers contain are all correct. Since the validity of one's own results hinges on the correctness of the proofs of the cited theorems, the mathematicians we have spoken to claim to check all proofs, regardless of the author. In practice, this may not always occur. There could be cases where, for example, an author would cite her adviser's papers because she knows those proofs have always stood up to scrutiny. This trust mechanism would likely be stronger for lesser-known and more recent papers because they are less likely to have been thoroughly checked by others. Trust could also matter more for papers outside one's field because those involve unfamiliar techniques that make it difficult for an outsider to verify the proof.

We see the awareness and trust mechanisms as emphasizing ties as conduits of information. In the first case, the information is about the existence of a useful theorem; in the second case, the information is about the reliability of the theorem. This echoes the situation in international trade, where Rauch (2001) summarizes a number of studies showing that “transnational business and social networks promote international trade by alleviating problems of contract enforcement and providing information about trading opportunities.” Thus, networks help exporters by making them aware of the specific needs of foreign buyers, while also promoting trust that buyer and seller will comply with the terms of their contract with each other.

### C. Time-Varying Effects of Distance and Ties

The estimates presented so far pool citations made from 1980 to 2009. This section investigates whether the effects of distance and ties on more recent citations differ from the past. Keller (2002) and Griffith et al. (2011) show a decline in the importance of geographic separation between the 1980s and late 1990s. We extend the investigation these authors initiated by including more recent data and also estimating the time-varying effect of ties. We examine changes since 1990 because our 1980s citation data are too sparse. Estimation of time-varying coefficients from 1990 to 2009 is of great interest given the many relevant advances observed over this period.

To investigate whether the impact of distance and ties has been changing, we estimate regressions based on a moving sample window. We construct the estimation windows by first restricting the citing papers to be published within a five-year period centered around year $t$. This implies citing years, $t_c$, in the interval $t+2 \geq t_c \geq t-2$. To make the sample size in later years comparable to that of earlier years, we impose a fixed maximum citation lag $L$ set equal to five or ten years. This implies cited years, $t_d$, in the interval $t_c \geq t_d \geq t_c-L$. The first midyear $t$ we use is 1990, and the last is 2007 (since our data set runs to 2009).
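The window construction can be expressed as a simple filter on publication years. The sketch below is a minimal illustration of the two interval conditions; the function names are hypothetical, and the pairs are represented only by their citing and cited years.

```python
FIRST_MIDYEAR, LAST_MIDYEAR = 1990, 2007
MIDYEARS = range(FIRST_MIDYEAR, LAST_MIDYEAR + 1)


def citing_years(t: int) -> range:
    """Citing years t_c in the five-year window centered on midyear t."""
    return range(t - 2, t + 3)


def in_window(t: int, citing_year: int, cited_year: int, max_lag: int) -> bool:
    """True if t-2 <= t_c <= t+2 and t_c - L <= t_d <= t_c."""
    citing_ok = t - 2 <= citing_year <= t + 2
    cited_ok = citing_year - max_lag <= cited_year <= citing_year
    return citing_ok and cited_ok


# Example: the last window (midyear 2007) with a ten-year maximum citation lag.
pairs = [(2009, 1999), (2009, 1998), (2005, 2005), (2004, 2000)]
kept = [p for p in pairs if in_window(2007, p[0], p[1], max_lag=10)]
print(list(citing_years(2007)), kept)
```

With data ending in 2009, the window centered on 2007 is the last one that covers a full five citing years, which is why the midyears stop there.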

Figure 4 shows the effects of distance in panel a and ties in panel b. In both panels, we use blue squares to depict the point estimates for ten-year maximum citation lags. A solid LOWESS smoother passes through the point estimates. The dashed smoother line depicts the results for a five-year citation lag and the dot-dash line corresponds to an estimation with no restriction on citation lag (all years). The points in panel a are estimated distance elasticities, that is, the marginal effect on the log odds of citation of increasing log distance between citing and cited authors. The time pattern of distance effects depends on whether the regression controls for ties. We depict these differing results using blue for estimates that control for the sum of fourteen ties and red for those that do not. Ninety-five percent confidence intervals (as before, standard errors are clustered at the cited article level) are shaded blue and red for estimates that do and do not (respectively) control for the sum of ties.^{18}

All the specifications plotted in figure 4a show absolute distance elasticities becoming much smaller over time since the early 1990s. In the geography-only specification shown in red, distance remains a statistically significant impediment to citation up to and including the final interval, 2005–2009, when its elasticity is $-0.06$ (standard error: 0.013). However, the magnitude falls by two-thirds from its 1990 value of $-0.18$. The confidence intervals also shrink over time, since increasing numbers of digitized articles raise $N$ in the standard error calculation. Controlling for the sum of ties, we see that the absolute elasticities are uniformly smaller in all periods, with the largest gap between the smoother lines appearing in the last estimation windows. Starting around 2005, the confidence intervals mainly include 0. The final estimated distance elasticity controlling for ties is $-0.017$ (standard error: 0.012).
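The significance statements for the final window can be checked directly from the reported point estimates and standard errors. The sketch below assumes a normal approximation with a 1.96 critical value for the 95% interval; the function name is illustrative.

```python
def wald_summary(coef: float, se: float, z_crit: float = 1.96):
    """Return the z-statistic, the confidence interval, and whether it excludes 0."""
    z = coef / se
    low, high = coef - z_crit * se, coef + z_crit * se
    significant = not (low <= 0.0 <= high)
    return z, (low, high), significant


# Final estimation window (2005-2009), as reported in the text.
estimates = {
    "geography only": (-0.06, 0.013),
    "controlling for ties": (-0.017, 0.012),
}

for label, (coef, se) in estimates.items():
    z, (low, high), significant = wald_summary(coef, se)
    print(f"{label:22s} z={z:+.2f} CI=[{low:+.3f}, {high:+.3f}] significant={significant}")

# Proportional decline in the geography-only elasticity since 1990.
decline = 1.0 - 0.06 / 0.18
print(f"decline since 1990: {decline:.0%}")
```

Under these assumptions the geography-only elasticity is clearly distinguishable from 0, while the interval for the estimate that controls for ties spans 0.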

Figure 4b shows the evolution of the coefficient on the sum of ties. The impact of ties on citation has been mainly rising over the 1990s and 2000s. The increase in citation odds from adding a tie rises from 72% in 1990 to 94% in 2007.^{19}

In all the results presented to this point, we have used a worldwide sample. This contrasts with much of the work we cited in section I on the geography of knowledge flows that uses citations within the United States. It is therefore worth investigating whether the patterns shown in figure 4 reflect global phenomena or whether the United States is special.

Figures 4c and 4d graph the results for a moving-window specification similar to those depicted in figures 4a and 4b, except they estimate separate distance $>0$ and (sum of) ties coefficients for pairs where both $i$ and $d$ are U.S. residents and for all others ($1-\text{both US}_{id}$). Figure 4c reveals that the shrinking distance effects depicted in figure 4a derive from author pairs where at least one author team is not U.S. based. Distance effects between U.S. pairs have not been significantly different from 0 throughout the period of study. Another difference between U.S. pairs and others is that the latter exhibit rising effects of ties, becoming significantly larger than those between U.S. pairs since the 2000s.

It is obviously tempting to try to explain the temporal patterns in the coefficients with reference to the technological advances we have observed since the 1990s. However, it is not possible to identify one cause or the other with so many trends at work during this period. Advances affecting information flows include the rise of web browsers in the mid-1990s, and the introduction of the Google search engine in 1998 and Google Scholar in 2004. Of particular importance to scientists was the creation of arXiv.org, a repository of preprints, which has included mathematics since 1992.

Figure 5a plots the growth of the number of arXiv papers in mathematics over time and compares it (on a second scale) with the spectacular increase in Google searches in the 2000s. Panel a also depicts the introductions of Skype and Google Scholar. The combination of all these technologies would be expected to have reduced the importance of face-to-face interactions, implying declining geographic separation effects since 1990.

The smoother line for the distance elasticity in figure 4a begins to trend up in the late 1990s, coinciding with the rise of arXiv shown in figure 5a. However, the stable importance of ties between U.S. authors and the growing role of ties elsewhere shown in figure 4d is not consistent with the view that arXiv and Google searches have been making all information universally accessible. Furthermore, the rise of Internet article depositories and search engines cannot explain why distance effects between U.S. author teams have been insignificantly different from 0 during the whole period.

While Internet advances capture the most attention, other contemporaneous changes could reasonably affect the importance of geography and ties in knowledge transmission. Figure 5b shows the dramatic decline in the costs of making international calls to and from the United States. In real terms, the cost of international calls fell by 95% between 1990 and 2007, compared to a 75% decline over the same period for interstate calls. Because we cannot find comparable series on domestic and international air fares, we use data on the volume of travel as a proxy. Figure 5c shows that air travel has been rising relative to population size.^{20} The rate of growth outside the United States has been much larger, partly because the United States started from a much higher base. In 1990, the United States had 1.3 air passengers per capita compared to 0.11 for the nine other countries that comprise the top ten countries in mathematics (measured by number of citing authors in 2009). Two decades later, the U.S. ratio had risen by 20%, whereas the ratio for the other countries had risen by 140%.

The data shown in figures 5b and 5c suggest an alternative interpretation of advances since the 1990s. Perhaps cheaper phone calls and improved air travel make it easier for scholars to stay in touch with their ties. Improved contact allows them to share the kind of complex knowledge that is hard to procure via Google searches. Thus, communication cost reductions lower the need for face-to-face interactions but raise the opportunities for drawing on one's ties.^{21} Similarly, lower costs for flying to conferences or visiting collaborators could also contribute to the explanation of why distance matters less but ties matter more.

The greater drop in the effect of distance and the larger increase in the effect of ties for non-U.S.-based authors are in line with the big decline in call costs between the United States and other countries shown in figure 5b and the rise of air travelers per capita in other countries relative to the United States shown in figure 5c. While we find this story linking the coefficient patterns in figure 4 to the trends in figure 5 to be plausible, future work with different identification strategies would be needed to confirm it.

## V. Conclusion

Our results add further evidence to the diverse strands of the literature finding that geographic separation impedes knowledge flows. Geography matters in large part because of its role in shaping the personal ties between citing and cited scholars. In the full sample, including fourteen linkages based on career and educational histories as controls cuts geography coefficients approximately in half. For the subsample where both citing and cited authors reside in the United States, a region where communication and travel costs have long been relatively low, the marginal effect of greater distance between institutions is insignificantly different from 0. The distance effect also disappears in the most recent five years of the worldwide sample. These 0 partial effects of distance are obtained only in regressions that control for ties.

Despite the increase in global access to knowledge provided by the Internet, the strength of the impact of ties on citation probabilities has not been declining. Because ties matter most for papers where awareness gaps are most acute (recent, obscure, and different-field articles), we infer that ties matter because connected scholars transmit knowledge to each other. This view is further supported by the finding that scholars whose formal role is to impart knowledge (advisers and the academic parents and siblings of advisers) have larger impacts on subsequent academic generations than vice versa. In sum, the evidence suggests that what you know depends a great deal on whom you know. It is increasingly unrelated to where you work—except insofar as where you work influences whom you know.

To the extent that we can generalize from the study of mathematicians, our study suggests novel interpretations of existing empirical findings. Cities may be valuable not just because of daily face-to-face interactions but also because they are good places to build networks. Such a view points to a different interpretation of De La Roca and Puga's (2017) finding that wages rise with experience in big cities but retain much of this growth even when the individual returns to a smaller city. While the authors attribute the wage premium to increased ability, our framework suggests it might also have arisen via an expanded set of professional ties. Since ties created at short distances can be maintained over longer distances, the ties explanation is also consistent with the continued prosperity of city leavers.

In trade, Feyrer (2009) estimates that changes in distance caused by the Suez Canal closure have much lower impacts than cross-sectional differences in distance. Our interpretation would be that the lengthening of the shipping route has no impact on the ties between importer and exporter that predated the closure. Head and Mayer (2013) calculate that observable barriers such as tariffs and freight charges can explain less than half the estimated magnitudes of border and distance effects. This puzzle has a simple resolution in light of our results: traders depend on their networks, and those networks are nationally and spatially biased.

It bears repeating that the broader lessons we draw from observing mathematicians are necessarily tentative; they beg for corroboration in other contexts. This is especially true when it comes to policy implications. However, a ties-centered view of knowledge flows does suggest that certain types of government actions could be fruitful. To promote more geographically dispersed networks, universities could be strongly discouraged from hiring their own students straight out of graduate school. Another policy to broaden ties of researchers is for the government to fund and promote doctoral study abroad. Invitations to faculty in other countries for short- and long-term visits often lead to the formation of new collaborative ties. Analogous versions of these policies can expand the networks for nonacademics. For example, easing visa requirements to facilitate medium-run stays by employees of multinationals should thicken the set of connections between foreign and domestic knowledge creators.

## Notes

^{2}

Borjas and Doran (2012) use the MGP to identify immigrant mathematicians who received the Ph.D. from Soviet institutions.

^{4}

For papers with multiple authors from different countries, citations are allocated fractionally. Thus, a citation from a paper coauthored by two scholars from countries A and B to a paper written by two other authors from countries C and D would generate four international citation flows of 0.25 each. This fractional accounting of citations ensures that the sum of all citations in the world, $C_w$, is the same regardless of whether one sums across paper dyads or country dyads; that is, $C_w=\sum_{jn}C_{jn}=\sum_{id}C_{id}$.
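
The allocation rule can be illustrated with a minimal sketch. It assumes that each author contributes one country entry and that every citing-author/cited-author pair receives equal weight; the function name `fractional_flows` is hypothetical and is not part of the data construction described in the paper.

```python
from collections import defaultdict
from itertools import product


def fractional_flows(citing_countries, cited_countries):
    """Split one paper-to-paper citation evenly across author-country pairs.

    Each list holds one country entry per author (an assumption of this sketch).
    """
    weight = 1.0 / (len(citing_countries) * len(cited_countries))
    flows = defaultdict(float)
    for origin, destination in product(citing_countries, cited_countries):
        # Repeated country pairs accumulate, so the flows always sum to 1.
        flows[(origin, destination)] += weight
    return dict(flows)


# The example from the note: authors in A and B cite a paper by authors in C and D.
flows = fractional_flows(["A", "B"], ["C", "D"])
print(flows)
print(sum(flows.values()))
```

Because every citation contributes a total weight of one, summing the flows over country dyads gives the same world total as counting citations over paper dyads.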

^{6}

Singh (2005) uses five controls per realized citation in his weighted estimator.

^{7}

Kerr and Kominers (2015) use an alternative method that randomly samples patent distances to calculate expected citations within a fixed ring.

^{8}

The table is replicated with standard errors in parentheses in appendix table E.1. The entire table is reestimated using a linear probability model in appendix table D.3, with the results compared in appendix section D.

^{9}

Additional investigation of the possibility of MGP sample selection bias is reported in the robustness section D in the appendix.

^{10}

Pseudo-$R^2$ is measured as $1-L_1/L_0$, where $L_0$ is the likelihood of the constant-only model. Hence, it rises with the number of estimated parameters. It is therefore worth noting that the inclusion of ties reduces the Akaike information criterion (AIC) by 7,995 points compared to column 3, indicating that the rise in the likelihood from adding ties is large enough to offset the penalty AIC imposes for adding 14 parameters.
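
As a rough check on this statement, assume the standard definition $\mathrm{AIC}=2k-2\ln L$, with $k$ the number of estimated parameters (the definition is not spelled out in this note, so it is an assumption). The reported change then implies a log-likelihood gain of roughly 4,011, far larger than the penalty of 28 for the additional 14 parameters:

$$
\Delta \mathrm{AIC} = 2\,\Delta k - 2\,\Delta \ln L = -7{,}995
\quad\Longrightarrow\quad
\Delta \ln L = \frac{2(14) + 7{,}995}{2} \approx 4{,}011.5
$$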

^{11}

The exceptional case is the 25 to 50 kilometer bin, which is driven by the dyad Rutgers-CUNY (45 kilometers apart). Both of these math departments are very active in the set theory three-digit code, but they do not cite each other's papers. The apparent cause is that while Rutgers' papers span the field, CUNY authors specialize in two subfields, Consistency and Independence Results and Large Cardinals, which comprise 52 of CUNY's 58 papers.

^{12}

This estimation includes only authors who attended at least one meeting but not every one (no perfect predictors). Appendix C also presents an estimation without author fixed effects that includes all potential attendees. The distance effect in this estimation is not as strong ($-0.05$), but negative and significant border and language effects show up in this specification.

^{13}

Our results align with the finding of Iaria, Schwarz, and Waldinger (2018) that the ban that barred Central scientists from international conferences during and after World War I was associated with a drop in citations between Allied and Central scientists.

^{14}

There is substantial heterogeneity in the tendency to work at the PhD-granting institution, with just 20% of U.S.-educated authors staying or returning compared to 46% in Spain. The sample comprises 2,213 MGP authors who published in pure mathematics journals in 2009.

^{15}

We drop grandparent citing in this table because of a logit perfect predictor problem. In the different-field specification, there were only nine grandparent citing instances, and all of them were for control observations rather than realized cites. We reinstate grandparent citing in a robustness check (table D.4), where it is a component in a sum of ties variable.

^{16}

A recent paper studying patents finds corroborating results. Packalen and Bhattacharya (2015) show that denser cities are responsible for patents that make use of newer knowledge, as measured by textual analysis of the patent applications.

^{17}

The fixed effects control for the overall tendency to cite each article $d$ so the interactions measure how ties boost the relative tendency to cite specific types of papers.

^{18}

The purple area corresponds to the intersection of the two intervals.

^{19}

Exponentiate the ten-year lag coefficients shown in Figure 4, and subtract 1 to obtain these amounts.
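
The conversion described in this note can be sketched as follows. The sketch assumes the coefficients are logit (log-odds) coefficients, so that $e^{\beta}-1$ is the proportional change in the odds of citation. The function name and the coefficient values are hypothetical and are not read from Figure 4.

```python
import math


def proportional_change(beta):
    """Convert a log-odds coefficient into a proportional change: exp(beta) - 1."""
    # math.expm1 computes exp(beta) - 1 accurately even for small beta.
    return math.expm1(beta)


# Hypothetical coefficient values, for illustration only.
for beta in (0.10, 0.50, 1.00):
    print(f"beta = {beta:.2f} -> change of {proportional_change(beta):.1%}")
```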

^{20}

Data from World Development Indicators series “Air Transport, Passengers Carried.”

^{21}

This story is consistent with the model of complementarity between proximity and communication technology in Gaspar and Glaeser (1998).

## Author notes

We thank Mitch Keller at the Mathematics Genealogy Project and Nicolas Roy from zentralblatt-math.org for data. Yao Amber Li acknowledges financial support from the Research Grants Council of Hong Kong, China (General Research Funds Project 643311), and Asier Minondo from the Spanish Ministry of Economy and Competitiveness (MINECO ECO2016-79650-P cofinanced with FEDER), the Spanish Ministry of Science, Innovation and Universities (RTI2018-100899-B-100, cofinanced with FEDER), and the Basque Government Department of Education, Language Policy and Culture (IT885-16). Seminar participants at Dartmouth, LSE, Oxford, UBC, Glasgow, Birmingham, and Nottingham made helpful suggestions. We also thank Andrew Bernard, Teresa Fort, Joshua Gottlieb, Bob Staiger, Bronwyn Hall, Wolfgang Keller, Anthony J. Venables, Quoc-Anh Do, Edwin Lai, Jim MacGee, Andrés Rodríguez-Clare, Daniel Sturm, Dave Donaldson, Ben Faber, and Pablo Fajgelbaum for valuable discussions. We thank Michal Fabinger for alerting us to the importance of arXiv.org for knowledge dissemination over the Internet. We are particularly grateful to Andrei Levchenko for raising issues that led to the results in section IVB. Ho Yin Tsoi, Bo Jiang, Yiye Cui, and Song Liu provided excellent research assistance during this project. Finally, we thank three anonymous referees and Amit Khandelwal for very helpful suggestions.

A supplemental appendix is available online at http://www.mitpressjournals.org/doi/suppl/10.1162/rest_a_00771.