Abstract

Debate about the epistemic prowess of historical science has focused on local underdetermination problems generated by a lack of historical data: the prevalence of information loss over geological time, and the capacities of scientists to mitigate it. Drawing on Leonelli’s recent distinction between ‘phenomena-time’ and ‘data-time’, I argue that factors like data generation, curation and management significantly complexify and undermine this focus: underdetermination is a bad way of framing the challenges historical scientists face. In doing so, I identify circumstances of epistemic scarcity where underdetermination problems are particularly salient, and discuss cases where legacy data—data generated using differing technologies and systems of practice—are drawn upon to overcome underdetermination. This suggests that one source of overcoming underdetermination is our knowledge of science’s past. Further, data-time makes agnostic positions about the epistemic fortunes of scientists working under epistemic scarcity more plausible. But agnosticism seems to leave philosophers without much normative grip. So, I sketch an alternative approach: focusing on the strategies scientists adopt to maximize their epistemic power in light of the resources available to them.

1. Data-Time and Information Destruction

Disputes about the legitimacy and power of scientists’ capacity to gain knowledge of the deep past are often framed using underdetermination. I’ll argue that if we take scientific practice seriously, then underdetermination is a bad way of framing many such investigations. Happily, there are alternative framings. To show this, I want to focus on a circumstance historical scientists often find themselves in—epistemic scarcity—and explore the consequences of a distinction Sabina Leonelli has recently drawn between phenomena-time and data-time. Let’s start with that distinction.

The historical sciences, unsurprisingly, are intimately concerned with time. Cosmologists, archaeologists, geologists and paleontologists provide wide-scale temporal sequences; ages, epochs, eons. Across these sequences they stretch narratives: the formation of galaxies, mountain-ranges, societies and species; tales of transformation, destruction, and extinction. The inferences underwriting these narratives rely on the remnants of those past events. These temporal dimensions concern what Leonelli has called phenomena-time, the lifetimes of the natural processes, entities and events that scientists seek to understand. She distinguishes this from data-time: “the temporal dimension of data practices used to prepare and manage data so that they can be subjected to inferential reasoning” (Leonelli 2018a, pp. 742–43). Paleontologists, for instance, must extract, prepare, analyze and store fossils; their access to the deep past depends in part on practices of data generation, management and analysis. Philosophers have insufficiently attended to the epistemic consequences of data-time.

Phenomena-time draws our attention to information-loss due to natural historical processes1, such as the patchiness of the fossil record. By contrast, data-time highlights information-destroying processes within science itself. Not every fossil is extracted, and extraction often involves mistakes, breakages, and so forth. Cataloguing inevitably highlights some details (often phylogenetic and temporal) while backgrounding others, and errors are frequent. Storage—especially for larger fossils—is expensive and often inadequate, and specimens are lost or mislabeled, or they degrade. Fossil preparation, particularly in vertebrate paleontology, is a careful art which can involve as much information-loss as extraction itself (Wylie 2015, 2019).

My aim in this paper is to connect data-time with philosophical discussion of underdetermination. In brief, many (perhaps most, perhaps all) scientific hypotheses are underdetermined by their evidence. That is, the available evidence is insufficient to support going hypotheses over rivals. Views on the future success of science, then, partly turn on our confidence in the capacity of scientists to overcome underdetermination. Pessimists point to information destruction; optimists to the creativity and tenacity of scientists, as well as technology’s transformative capacities. Data-time complexifies these discussions by highlighting both new sources of underdetermination and new routes to overcoming it.

I’ll focus on legacy data, data extracted in historical contexts using different technologies, systems of practice, and so forth. As we’ll see, scientific progress in the face of epistemic scarcity doesn’t simply turn on new knowledge and technology pertaining to the science’s subject (the fossil record, say) but also on our knowledge of the history of science itself. That is, data-time is just as crucial as phenomena-time for understanding scientific underdetermination and whether it can be overcome. I’ll argue this puts pressure on the idea that underdetermination itself is a useful framing for understanding science under epistemic scarcity.

The central problem is that underdetermination framings lead to bets about whether particular sets of knowledge will be had, while the strategies adopted under epistemic scarcity are typically open: as opposed to tackling a particular piece of knowledge or hypothesis, scientists explore what can be done with the resources they have. I’ll suggest, then, an alternative framing of science under epistemic scarcity that focuses on resources, that is, the technologies, data, funding, and so forth, that afford scientific work; and strategies, that is, the approaches and methods that scientists adopt to do work with those resources. My hope is that this retools debate into more fruitful territory.

In the first two sections I’ll provide conceptual grounding: analyzing epistemic scarcity (section 2) and then underdetermination (section 3). In the two sections following, I’ll argue that knowledge of science’s history is a way of overcoming underdetermination despite epistemic scarcity (section 4), especially when used in combination with new technologies (section 5). In section 6, I’ll use the preceding discussions to argue that underdetermination is a bad way of framing science under epistemic scarcity and sketch my alternative.

Of course, that legacy data and technological capacities matter for understanding scientific progress is no surprise to historians and sociologists of science, nor indeed scientists themselves.2 My aim, in a sense, is to show how lessons from scientific practice and its history are transformative of philosophical approaches. In my small way, then, I’m attempting to bring science’s history and sociology into further contact with its philosophy. Throughout these discussions I’ll be returning to work on the Paluxy River Trackways. I’ve chosen this case study as an excellent illustration of epistemic scarcity, as well as how legacy data and technological progress together often provide routes to progress in the face of it. Thus, future scientific success—stepping forwards—can be founded in understanding the practices and data of past science—looking back.

2. Epistemic Scarcity

Although some discussions of historical science attempt general treatments (e.g., Cleland 2001, 2002; Kleinhans, Buskes, and de Regt 2005; Turner 2007), this is a mug’s game: epistemic fortune differs too widely across sciences concerned with the deep past to support generality about method or epistemic fortune. Sometimes our access to the past is impoverished, but historical data can be plentiful, our understanding of some historical processes is extraordinarily detailed, our technological and analytic resources are sometimes powerful, and some historical phenomena leave unambiguous, dramatic traces. Part of my aim is to re-jig philosophical analysis from general disciplinary categories like ‘historical science’ towards more epistemically relevant categories. We can still have systematic analysis of science if we can identify the right domains. My hunch is that epistemic scarcity is one such domain, which makes sense of philosophical argument concerning underdetermination. So, although our guiding case study—the Paluxy River Trackways—is from vertebrate ichnology, I don’t mean to claim that it is representative of historical science: it is instead representative of epistemic scarcity. Let’s turn to the case.

A decent chunk of Texas, stretching from Dallas westward beyond Fort Worth, provides a remarkable inroad to twenty million years of Cretaceous fossils (Farlow et al. 2015). Although there are bountiful body fossil deposits to be had, we’ll focus on ichnology, or the fossilized remains of animal activity: burrows, eggs and trackways. Regarding trackways, the Paluxy River is an important site. The river is hard to work: it is subject to frequent flooding, underwater trackways are tricky to extract or photograph, and dry spells frequently turn river to mud. Despite this, since the early twentieth century it has been a rich source of ichnological finds, particularly of sauropod dinosaurs.

Most well-known is the so-called ‘chase sequence’, which records tracks from multiple sauropod individuals and at least one theropod. In the 1930s and ’40s, Roland T. Bird interpreted the sauropod and theropod tracks in terms of predatory behavior and avoidance, painting a 113-million-year-old story.

… one set of footprints, from a two-footed carnivorous dinosaur … ran parallel to the trail left by an even larger, four-footed herbivore … which was apparently traveling in a herd. And he later noticed that the carnivorous dinosaur seemed at one point to have taken a strange skipping stride, leaving two consecutive right footprints in the mud. (Thomas and Farlow 1997, p. 76)

Here’s a tempting interpretation: the theropod was chasing the sauropod and that strange skipping stride captures the moment where the predator lunges. What are the interpretation’s epistemic credentials? That is, what reason do we have to think the prints were in fact made at the same time? Or that something as subtle as a change in gait signals a predatory strike? Considering information loss over data-time, these questions are daunting.

In 1940, Bird was dispatched by the American Museum of Natural History on a “herculean” task (Farlow et al. 2015, p. 14): to extract the main chase sequence for analysis and display. The trackways, over nine meters in length, couldn’t be taken in one piece. Bird extracted three connected segments, and these were distributed to various museums: one went to his employers, another went to the Texas Memorial Museum, and the third has been lost. In addition to the lost part of the trackways, the Texas Memorial Museum’s specimen has since degraded while on display. The splitting, deterioration, and loss of the trackways together amount to a significant epistemic loss. Body fossils and trace fossils record different kinds of information, and tracks from multiple individuals across species are rare.

“Tracks, traces and footprints can offer us insights that are unlikely, or even impossible, to preserve in the osteological fossil record. Information about trackmaker anatomy, behavior, motions and ecology is tied up in the three dimensional morphology that we ultimately call a track” (Falkingham et al. 2018, p. 470). The breaking up, degradation, and loss of the Paluxy River Trackways themselves seem to mean that we cannot re-analyze Bird’s primary data. Considering the epistemic scarcity of vertebrate paleontology highlights this. Epistemic scarcity concerns data, so I’ll begin by drawing on Leonelli’s account of that notion.

Where data is often discussed interchangeably with evidence, Leonelli has convincingly distinguished them (Leonelli 2016). Her account is relational: in short, we can think of data as potential evidence. It is generated by various scientific procedures, produced by an experiment and recorded, say, or extracted from a site and prepared. The data is subsequently deployed as evidence downstream. In the context of paleontology, data refers to (among other things) both fossil specimens3 and to various representations of fossils. These representations, and the fossils, take varied forms and are transported—undertake journeys—across contexts. Data-journeys involve “the movement of data from their production site to many other sites in which they are processed, mobilized and repurposed” (Leonelli 2020, p. 17). Leonelli emphasizes how these recording and journeying practices involve decontextualizing data from the context of its generation, thus allowing it to be catalogued and mixed. Employing data as evidence then involves its recontextualization. Meta-data records information about the data’s generation and subsequent journeying, and thus affords scientists the capacity to recontextualize. Nora Boyd makes a similar point, in terms of data enrichment: the process of combining data with its meta-data and surrounding knowledge to form an evidential claim. As she puts it, “data can never be permanently decoupled from their associated enriching information and retain epistemic utility” (Boyd 2018, p. 413).

So, we can consider the Paluxy trackways as paleontological data. Processes of extraction, preparation, cataloguing and management decontextualized the trackways. To be used as evidence, say, in inferring from the trackways to past animal behavior, the data must be enriched. Paleontologists recontextualize the trackways with details of their data-journey and with relevant knowledge about processes of fossilization and animal behavior. But fossils themselves are not the only relevant paleontological data: so too are sketches, photographs, descriptions and other ways of representing them. As we’ll see, Bird made sketches, photographs and a film of the trackways, and these are put to use by modern paleontologists. So, we can understand paleontological data, such as the fossils themselves or recordings of them, as potential evidence which, if suitably enriched, can be employed to support hypotheses. With our relational account of data in hand, we can turn to epistemic scarcity.

Epistemic scarcity is a feature of a domain’s data, not its evidence, and includes four features. Scarcity occurs when data is rare, non-renewable, fragile and demanding. The first two are features of phenomena-time, the second two of data-time. Recall that phenomena-time concerns the processes affecting data-sources prior to their becoming data, while data-time concerns scientific activities of data-preparation, management and journeys. Phenomena-time refers to data’s sources (occurring naturally or being generated in an experiment) while data-time refers to the data’s life. The trackways decaying due to geological subduction or water damage happens in phenomena-time; the trackways being split into three parts happens in data-time. The distinction is no doubt blurry at times, and indeed whether a process will count as occurring in phenomena-time or data-time will be sensitive to what phenomena we’re interested in, but I think it nonetheless extremely useful for picking apart various aspects of epistemic scarcity.

Rarity refers to the likelihood of the data’s occurrence. Both paleontology and particle physics often deal with rare data. The conditions of fossilization, and the conditions under which fundamental particles might be detected, are specific and peculiar, meaning data is rare. Data being non-renewable refers to scientists’ capacity to generate relevant data. Where particle physicists can bring about the conditions where collisions might be detected, data from the fossil record cannot (on the face of it) be similarly generated. Thus, paleontological data is often non-renewable while some physical data is not (assuming that the knowhow, resources and desire to build particle accelerators remains). By contrast, some (but only some!) cosmological data—background radiation from the big bang for example—is common, but not renewable.

Rarity and being non-renewable are not independent: our capacity to generate data makes a difference to its rarity. However, they’re worth pulling apart because they refer to pertinent differences between phenomena. Some phenomena do not often naturally occur but can be generated—as we see in particle physics—while other phenomena are both rare and are also not amenable to being generated, for instance the behaviors of extinct critters. Non-renewability is often emphasized in vertebrate paleontology. Consider for instance a recent letter by the Society of Vertebrate Paleontology:

These fossils, remains and traces of prehistoric life, are inherently non-renewable. This means that reproducibility of paleontological research rests on the premise of permanency and accessibility of examined fossil specimens permanently accessioned and deposited in stable repositories within the public trust, each with a unique permanent catalogue number. (2020, p. 2)

Where rarity and non-renewability concern phenomena-time, fragility and being demanding are features of data-time. Fragility refers to the likelihood of information being lost on data-journeys. Although many vertebrate fossils are solid rock, their extraction and preparation are often risky procedures with breakage and distortion being common. They are also demanding: data-management is an enormous challenge. Vertebrate fossils are often extremely heavy, require a lot of space, and are subject to on-going recategorization as ideas change. As with rarity and being non-renewable, fragility and demandingness are not independent. Clearly, the fragility of fossils is part of what makes them so tricky to manage as data. However, again, these do not strictly overlap. In addition to challenging data management, fragility makes data generation (extraction and preparation) particularly risky.

The Paluxy River trackways are a paradigm example of epistemic scarcity (Table 1). First, trackways are rare: the conditions under which footprints might fossilize are highly specific, and the chances of any particular set of tracks surviving over geological time and then being found are vanishingly small. Second, trackways are non-renewable: although experimental ichnology can generate partial simulacra of past trackways, we cannot literally make more trackways in the fossil record. Third, trackways are fragile: over data-time the Paluxy River trackways were broken up, lost and degraded. Fourth, trackways are demanding: they are expensive to store, hard to categorize and so forth.

Table 1. Epistemic Scarcity & Trackways

| Feature | Temporal dimension | Definition | E.g., |
| --- | --- | --- | --- |
| Rarity | Phenomena-time | The likelihood of data’s occurrence. | Trackways are rare because the likelihood of their being preserved and then surviving geological time is minuscule. |
| Renewability | Phenomena-time | The capacity of scientists to generate data. | Trackways are non-renewable because time constraints mean they cannot be artificially constructed. |
| Fragility | Data-time | The likelihood of data being lost on journeys. | Trackways are fragile because extraction requires decisions about what information to preserve and pieces are often broken. |
| Demandingness | Data-time | The challenges of managing the data. | Trackways are often large, heavy pieces of rock. |

There’s a lot to be said about conditions of epistemic scarcity, far more than I can manage in this paper. For one thing, various practices of data management amongst paleontologists can be understood as means of mitigating the fragility and demandingness of their data. Increasing use of 3D printing, non-destructive ways of getting at fossil structure such as CT scans, as well as historical practices of fossil illustration are all ways of producing less demanding, robust data. For another, data is often intended for multiple uses, and this can be a source of its demandingness. These uses sometimes include partially clashing scientific aims. For instance, using a fossil for phylogenetic analysis versus using it to order strata likely benefit from different cataloguing practices. But non-scientific aims are also a frequent concern: fossil preparation is often carried out with an eye to presentation in museums.

We should further distinguish epistemic scarcity from Latour’s “tragically local” data (1999). Given the variability and contingency of historical processes, the kinds of evidential claims historical scientists can produce are often quite limited in scope, tied to, say, particular cultures, technologies, lineages or clades4. The projectability of evidence across the cases of interest, and epistemic scarcity, can come apart. It doesn’t follow from epistemic scarcity that the resultant hypothesis cannot be projected across domains, and indeed highly contingent, heterogenous systems might nonetheless have common, renewable, robust and manageable data. Some questions concerning the Paluxy trackways, then, might be doubly tricky: involving both epistemic scarcity and non-projectability.

The important aspect for our purposes is the relationship between underdetermination and epistemic scarcity. I’ll now suggest that when philosophers have framed historical science in terms of underdetermination they’ve had something like epistemic scarcity in mind.

3. Underdetermination Thus Far

The notion of underdetermination underwrites various philosophical disputes about the status of scientific knowledge, especially in recent discussion of historical science (better understood, I’ll suggest, as science under conditions of epistemic scarcity). Let’s start with definitions.

Underdetermination is a relationship between a set of hypotheses and a set of evidence. It holds when the evidence-set is epistemically insufficient to decide between the hypotheses. By identifying different sets, different kinds of underdetermination are generated. First, we have in-principle forms of underdetermination. These consider all possible hypotheses and all possible evidence, or perhaps current hypotheses against all possible evidence. It might be that, for any particular hypothesis, there are necessarily other possible hypotheses which accommodate the relevant evidence just as well; that is, any hypothesis has empirically equivalent rivals. Views on the pervasiveness of this kind of underdetermination underwrite various challenges to the possibility of scientific knowledge (e.g., Goodman 1955; Kukla 1996). For instance, if current hypotheses pertaining to the nature of unobservables are in principle underdetermined by any evidence we could have, then we might deny that scientists can establish the truth (or justification) of hypotheses about those unobservables (e.g., Clendinnen 1989; Quine 1975; Van Fraassen 1980). The thought that any given theory in principle will have equally well supported possible rivals has, in Laudan and Leplin’s terms:

… wreaked havoc throughout twentieth-century philosophy. It motivates many forms of relativism, both ontological and epistemological, by supplying apparently irremediable pluralism of belief and practice. It animates epistemic skepticism by apparently underwriting the thesis of underdetermination. In general, the supposed ability to supply an empirically equivalent rival to any theory, however well supported or tested, has been assumed sufficient to undermine our confidence in that theory and to reduce our preference for it to a status epistemically weaker than warranted assent. (Laudan and Leplin 1991, p. 450)

The thinking is straightforward. We should prefer a scientific theory insofar as it outperforms its rivals on evidential grounds. But if for any theory there are in principle other theories which are equally well supported by the same evidence, then we haven’t grounds for preferring it (at least vis-à-vis those rivals). As with any long-standing philosophical dispute, there are multiple routes to avoiding these skeptical conclusions (e.g., Churchland 1985; Devitt 2002; Psillos 1999). Laudan and Leplin’s approach is relevant for our discussion, so is worth sketching (although see Kukla 1993; Okasha 1997; Sarkar 2000; Stanford 2006).

At base, Laudan and Leplin emphasize how scientific knowledge evolves and develops. What counts as evidence is a shifting target, depending as it does on technological capacities and background theory. That is, evidential equivalence is not a simple relationship between a set of evidence and a set of hypotheses. It relies on (1) the stability of that set of evidence, as what was once unobservable sometimes becomes observable as technology and understanding evolves; (2) the stability of auxiliary hypotheses which connect observations to the theories at hand. The upshot for in-principle treatments of underdetermination is that the rubber doesn’t hit the epistemic road: we “… must defer, in part, to scientific practice. It undercuts any formalistic program to delimit the scope of scientific knowledge by reason of empirical equivalence, thereby defeating the epistemically otiose morals that empirical equivalence has been made to serve” (p. 454; see Turner 2016 for similar points).

Following arguments like Laudan and Leplin’s, as well as general turns towards practice-oriented, contextual analyses, recent discussion of underdetermination has focused on local variants. Here, a set of current hypotheses is underdetermined by the available evidence. That is, none of the hypotheses which are currently, as it were, on the scientific table is preferred (or sufficiently preferred) on the basis of the evidence scientists have produced and are currently producing (Currie 2018; Godfrey-Smith 2008; Laudan 1990; Sklar 1977; Stanford 2009). In contrast with in-principle underdetermination, cases of local underdetermination are more-or-less robust. Some might be easily resolvable, if only the right evidence were found. Others are more intransigent: we might have reason to think the right evidence is unlikely to show up. In the philosophy of historical science, a debate about the epistemic power of our access to the past turns on the resilience or otherwise of local underdetermination (Currie 2015, 2017, 2018; Forber 2009; Kleinhans, Buskes, and de Regt 2005; O’Malley 2016; Tucker 2011; Turner 2005, 2007, 2009a, 2009b, 2016; Wylie 2019).

Derek Turner claims that intransigent local underdetermination is rife in the historical sciences (2005, 2007). First, he argues that historical processes are typically information destroying. For instance, we know the conditions of trackway fossilization are highly specific, and that trackways are likely to degrade or be destroyed over time. As such, underdetermination which would be resolved by discovering new trackways is unlikely to be resolved. Second, he compares the capacities of experimental science with those of historical science. Where for historical science primary evidence is from the historical record, in experimental science new primary data may be generated by, well, experiment. To the extent that historical records are degraded and incomplete, then, our hands are tied vis-à-vis improving our epistemic situation.

Thus, we have an argument that in historical science local underdetermination will be intransigent. We should, then, be pessimistic about scientific progress in such fields.5 I think it clear that Turner has (or should have!) something like epistemic scarcity in mind: there are plenty of examples across the historical sciences of robust, abundant, renewable, easily managed data and here his arguments are weaker. We can, then, construe the debate about underdetermination as being not about historical science per se, but about science under conditions of epistemic scarcity (which may include non-historical science…). In response, more optimistic philosophers have pointed to various sources of overcoming underdetermination.

First, evidence might not be as rare as Turner argues. Our expectations of making new finds are tempered by our knowledge of the processes by which information is preserved or lost. And, as Laudan and Leplin emphasized, this knowledge evolves: information which we once thought destroyed turns out to not be (Currie 2018; Jeffares 2010). The field of ancient DNA is a case in point. Where previously it was thought that molecular structures are unrecoverable, we now know that molecular signals do leave traces, and the temporal window by which they might be detected is increasingly pushed back. Second, and relatedly, our technological capacities continually evolve, and this often means that new, often unexpected information can be extracted from traces (Novick et al. 2020; Tamborini 2020; we’ll return to technology below). Third, the non-renewable nature of historical data may be challenged. Our capacity to create simulacra of past objects and processes, it has been argued, can mitigate information destruction over phenomena-time (Currie 2018 chap. 9; Jeffares 2008).

So, whether we think local underdetermination will be overcome is the product of how likely it is that (1) our knowledge of historical processes will change (at least in ways that grant access to new data-sources), (2) our technological capacities to analyze historical processes will change, and (3) our capacity to generate simulacra will increase. And these three sources are not independent, often bolstering one another. This leads to a debate about, on the one hand, how information destroying or preserving we should think the processes shaping the historical record over phenomena-time are, and on the other hand, the capacities of scientists to develop new ways of extracting information from the historical record and to generate new evidence (e.g., Currie 2019b; Finkelman 2019; Havstad 2019).

As the number of clashing considerations in favor of optimism and pessimism increases, agnosticism becomes plausible. Joyce Havstad has argued that we ought to be agnostic about optimism and pessimism, because we simply lack the epistemic resources for reliable bets:

… the agnostic position stems from a lack of confidence: it is the inability to say whether historical processes are disposed to preserve or to destroy, and what sort of evidential position historical scientists are thereby in. (Havstad 2019, p. 5)

Havstad’s argument focuses on phenomena-time: it is unclear how to ascertain just what information has been preserved and what information has been destroyed as the historical record has formed. Patrick Forber takes a similar position:

We cannot be in an epistemic position now to assess whether two incompatible rivals are empirical equivalent relative to all present and future evidence … we can make no reliable inferences about how our epistemic position will persist into the future. (Forber 2009, p. 255)

Turner has characterized agnosticism as “adopt[ing] a policy of abstaining from any further predictions about which questions scientists will or will not be able to answer in the future” (2016, p. 64). He provides two arguments against agnosticism. First, just as general optimism or pessimism across historical science is an odd position, so too is general agnosticism. For we do have some grip on which kinds of historical knowledge are easier to come by than others. Second, “scientists routinely draw inferences about the future of their own fields” (2016, p. 65). The necessity of scientists making bets undermines the thought that such bets shouldn’t be made. As we’ll see, taking data-time more seriously increases the complexity of the considerations here, and potentially bolsters agnosticism; however, I’ll suggest a different framing of these debates. To get there, let’s turn to an example of how scientists can overcome underdetermination despite epistemic scarcity.

4. Legacy Data

The loss of the Paluxy River trackways was not total. In addition to his extraction efforts, Bird used various media to record the trackways as they originally appeared: he made field sketches, took multiple photographs and shot a film recording. These historical documents, legacy data, potentially allow us to revisit the site as it was prior to extraction.

As we’ve seen, data is well understood as potential evidence which is generated in a particular context and then journeys through various processes of cleaning, cataloguing, and so forth, before being ‘enriched’ to support evidential claims. Data, then, form lineages: “historical entities which—much like organic beings—evolve and change as their life unfolds and merges with elements of their environment” (Leonelli 2020, p. 16). Across the life of a discipline, different best-practices, techniques, and media are used both in data generation and in the recording of metadata. An upshot of this is that data from different times—produced using different practices—might well be unamenable to employment now. Legacy data, then, refers to data produced in historical contexts.6

The epistemic problems arising from the use of legacy data are not conceptually distinct from data generally, but in legacy contexts some challenges are particularly acute. The context of data’s generation and subsequent journeying makes a difference to what kinds of claims it can underwrite and how it relates to other data. Metadata can be insufficient; data can come from profoundly different sources; we might lack sufficient understanding of past practices. Because data-journeys always matter for data’s employment as evidence, there is a sense in which all data is legacy data. All data has been produced in the past, and that past matters for what evidential uses it can be put to (Currie 2019a; Leonelli 2018b). However, in cross-disciplinary cases, for instance, or when practices are not standardized, or when—our focus here—data has been produced in the past using different techniques, technologies and best practices, these challenges are amplified. So, although legacy data is not special conceptually, it is a useful focus for considering epistemic challenges arising from underdetermination.

Legacy data is critically important under epistemic scarcity. Because data is rare and non-renewable, scientists haven’t the luxury of simply finding or generating new specimens. Specimens like the Paluxy trackways are thus precious, and there is good reason to try to overcome the challenges involved with integrating and reanalyzing data produced under different contexts using different practices. This reanalysis often involves reconstructing the past conditions of science itself: in a sense, our phenomenon becomes past scientific work. As Alison Wylie puts it:

In the case of legacy data this appraisal sometimes involves quite literally retracing the steps of those who originally recovered a sample back into the field or to the repositories and laboratories to which finds and records were dispersed, reconstructing a record of the context from which it was drawn and the processes by which it was transformed… (Wylie 2020, p. 271)

A common feature of legacy data is the role of luck: often highly contingent events lead to recovery, perhaps analogous to the rarity of fossilization over phenomena-time. For the Paluxy trackways, a series of contingent events led to the rediscovery of Bird’s recordings. As Thomas and Farlow report, in the 1980s one of them (Farlow) acted as referee for a biography of Bird, and was surprised to find references to various charts, photographs and even a film of the excavation.

Interviews with Bird’s wife and sister revealed that he had stashed away quite a bit of unpublished information about the Paluxy River trackway. Bird’s nephew soon discovered a canister with the lost film of the excavation; it was neatly stored in a basement refrigerator. A box in Bird’s attic provided countless notes, along with some large charts of the footprints in question. (Thomas and Farlow 1997, p. 76)

Working with Bird’s rediscovered reference material, Farlow was able to closely analyze the footfalls and argue that the predator’s movements closely paralleled those of the sauropod. Comparisons with the behavior of mammalian predators such as lions found further parallels in the trackways’ skipping stride.

… it now appears perfectly clear that about 100 million years ago, on a limey mudflat in what is now Texas, at least one swift carnivore singled out and possibly attacked a huge, lumbering herbivore. It seems that Bird was not only lucky enough to find remarkable evidence of this incident of natural history but that he was also wise enough to recognize, document and excavate part of the record of this ancient hunt left on a sodden plain, now turned to stone. (Thomas and Farlow 1997, p. 79)

So, analysis of Bird’s own reference material provides new evidence supporting his original hypothesis. We might say that Bird’s photos are precious similarly to the trackways: if they had been lost or degraded over data-time, then reconstructing the trackways in situ would be a much more daunting prospect. Our story of the Paluxy River trackways is a classic case of putting legacy data—Bird’s photographs—to new evidential use.

However, just as the fidelity of information over phenomena-time crucially matters for the epistemic support of hypotheses, so too does the trustworthiness of information over data-time. That is, should we trust Bird’s material? How well did he record the relevant information? To what extent did his own interpretation influence how he recorded data? Here, technology interweaves with legacy data to provide new inroads into the past.

5. Technology & Legacy

As we’ve seen, the recovery of Bird’s legacy data enabled new analysis of the trackways despite the loss and degradation of the fossils themselves. However, open questions remained as to the trustworthiness of that legacy data. Recently, technological developments have allowed scientists to further reconstruct the trackways and gain insight into exactly that question.

Ichnology has made technological strides in the last decade, particularly as various scanning and photogrammetry techniques have become affordable and portable. This has led to a concerted effort to document the Paluxy River site in situ (Farlow et al. 2012). Building on this, Falkingham et al. use modern technology and Bird’s photographs to reconstruct the trackways as they were in 1940, which in turn serve as data to reconstruct the trackways as they were in the Cretaceous, along the way providing inroads into the trustworthiness of Bird’s data.

Photogrammetry is a way of drawing a three-dimensional topography from two-dimensional photographs. Basically, photogrammetric techniques map points on a two-dimensional plane and represent them in three dimensions. Photogrammetry is used, for instance, to represent a landscape in three dimensions on the basis of satellite photographs. Using seventeen of Bird’s photographs, Falkingham et al. applied photogrammetry to the trackways. The resultant 3D representation is imperfect: the photographs are grainy, tools and other objects on the site obscure some of the tracks, and the shots are mostly from the same orientation. However, the photogrammetry results can be compared both to Bird’s original sketches and to the still-existing parts of the original trackways. They conclude, “the entire sequence observable from the photographs was reconstructed in 3D, measuring over 45 m in length. Although the tracks lack fine detail, their locations are obvious on the textured model, and as such an aerial-view of the site, as it was prior to excavation, can be produced” (Falkingham et al. 2014, p. 3).

Falkingham et al. then compared their results with Bird’s two maps of the site. This allowed them to identify one as being more accurate, as well as to make various inferences about how Bird went about measurements and extraction by contrasting distances in the reconstructed site with Bird’s tabulated data (they suspect he used string, and took individual, independent measurements rather than using a grid). Most impressively, they appear to detect theropod trackways which Bird’s sketches missed. The work gives us some grip on the epistemic credentials of Bird’s original recordings of the trackways. Further, the new trackways might suggest the predator was not acting alone. The techniques, then, allowed an inference to what the trackways were like prior to excavation while providing insights into the practices and techniques Bird used.

It is an exciting prospect to think that many palaeontological or archaeological specimens that have been lost to science, or suffered irreparable damage, may be digitally reconstructed in 3D using free software and a desktop computer. We envisage that historical photogrammetry will become a powerful, common, tool in the future, particularly as advances in photogrammetric techniques enable software to compensate for the difficulties inherent in using old photographs. (Falkingham et al. 2014, p. 5)

So, it seems that the Paluxy River trackways have not been lost after all; they’ve been reanimated via a combination of new technology and old recovered records (see Lallensack et al. 2015 for another example). The epistemic license is iterative: Falkingham et al. navigate between the photogrammetric techniques, the photos, Bird’s sketches, the surviving trackways, and Bird’s tabulated data.

The role of technological progress is a common theme in discussion of local underdetermination (Chapman and Wylie 2016, chap 4; Jeffares 2008; Wylie 2019). My own arguments for optimism appeal to a strategy called ‘methodological omnivory’, at base the idea that historical scientists adopt, co-opt and construct multiple imperfect techniques for recovering the past (Currie 2017, 2018). I’ve emphasized the local and targeted aspects of these techniques: as opposed to generally applicable technologies, these are built specifically to work with the particular materials, background knowledge and questions the scientists are faced with. Falkingham et al.’s use of photogrammetry techniques is tailored specifically towards probing the original conditions of Bird’s extraction efforts, and their conclusions vis-à-vis Bird’s legacy data are based on their understanding of the limitations of the technology they use and the data they’re working with. They no doubt emphasize the possibility of such technology being adopted and applied across various ichnological contexts (as we’ll see, they draw on this kind of work to argue for a new system of best practice), but their particular adoption and deployment of photogrammetry is tailored towards their local context: it is an example of methodological omnivory.

Marco Tamborini takes things further, arguing that the historical sciences are best understood in technoscientific terms. These are disciplines where “science and technology cannot be easily separated” (2020, p. 57). Tamborini’s point, as I see it, is that historical scientists shouldn’t be understood as primarily in the business of veridical representation of the past; rather, they are interested in finding ways to probe and represent their data:

[Their] main aim was not to correctly represent what it actually was, but rather to technically manage, experiment with, visualize, produce, and efficiently work with the natural historical phenomena they struggled to understand. (Tamborini 2020, p. 58)

On this view, instead of trying to answer, say, whether a theropod dinosaur lunged at a sauropod millions of years ago, or even whether Bird took reliable data, Falkingham et al. are primarily interested in finding ways to work with their data or natural phenomena. On the face of it, this claim is somewhat mysterious: it is unclear how we should understand ‘phenomena’ in Tamborini’s terms. First, if we understand ‘phenomena’ as including events in the past, then it is unclear how we can contrast this with representing what actually was, as these are—we might think—the same thing. Second, if we understand ‘phenomena’ in terms of scientists’ representations of data (data models, etc.) this seems to fly in the face of what the scientists we’ve been discussing were clearly trying to do: use data to make inferences about what happened in the past. Below, I’ll suggest this can be resolved by taking a more relaxed conception of scientific aims or goals.

Regardless, Tamborini and I both emphasize the role of technological progress in investigating the deep past. The important point for current purposes is how technological progress works in data-time: new technology doesn’t simply allow new information to be drawn from new data, but also from old data. Legacy data that was once irretrievable is now retrievable. Falkingham et al. managed the recovery work they did because Bird recorded his data, because those records were then discovered by folks with sufficient understanding of past scientific practices to put them to work, and because photogrammetry techniques were developed and became available. Thus, in the face of epistemic scarcity, scientists went forwards by looking back. This interweaving of legacy data, reanalysis, and the use of new technology further complexifies discussion of underdetermination under epistemic scarcity.

6. Beyond Underdetermination

Philosophers often frame questions about the epistemic credentials of historical science (or scientists facing epistemic scarcity) in terms of local underdetermination problems: given that current evidence does not decide between conflicting hypotheses, should we bet that underdetermination will be overcome or not? We’ve seen that increasing focus on scientific practice has led to such questions becoming increasingly complex, making agnostic positions more attractive. My aims in this section are threefold. First, I’ll articulate how a focus on data-time, legacy data and technological progress makes these questions yet more intractable. Second, I’ll suggest that instead of adopting agnosticism, or arguing for optimism or pessimism about particular kinds of knowledge, we should adopt a different framing: one in terms of resources and strategies. Third, I’ll use these considerations to argue against the idea that scientists under epistemic scarcity are passive: they adopt active strategies geared towards mitigating epistemic scarcity.

6.1. Data-Time and Underdetermination

As I’ve mentioned, epistemic scarcity refers to data, not evidence. This is because we shouldn’t infer directly from epistemic scarcity to evidential impoverishment. My having fragile, demanding, rare and non-renewable data doesn’t entail that evidential claims made on that basis will be locally underdetermined, unfounded or weak. The historical sciences are replete with examples of powerful evidential claims in the face of epistemic scarcity. However, conditions of epistemic scarcity do make underdetermination problems particularly pertinent.

Whether we can resolve debates between pessimism and optimism turns on how opaque science’s epistemic fortunes are. Considerations of information-loss and recovery over data-time don’t simply make the debate more complex by increasing the number of sources of underdetermination and resources for overcoming it; they also increase the opaqueness of future scientific success and failure. Insofar as future epistemic fortune is indeed opaque, we philosophers face an impasse. This opacity only increases as we incorporate data-time.

Caitlin Wylie has made steps in this direction. She notes that underdetermination occurs in data. She focuses on fossil preparation, the highly skilled business of generating a scientifically useable specimen from extracted rock. There is no one best or determined output in this process, results turning on differing aims, the skills of the preparator, and luck. “Potential data from fossils, therefore, are underdetermined by the surrounding rock” (Wylie 2019, p. 24). Thus, in addition to underdetermination between hypotheses and evidence, we should also consider underdetermination occurring between data and its sources. This is one step toward making more complex how philosophers think about underdetermination. The next step is to explicitly incorporate data-time.

Turner’s arguments for pessimism focus on rarity and non-renewability, that is, phenomena-time. Such arguments could be bolstered by pointing to the fragility and demandingness of historical data. Although the Paluxy trackways were a remarkable example of survival over phenomena-time, as we’ve seen they fared badly over data-time: extraction was rough, the trackways were split in three, one part became significantly degraded, another was lost. The recovery of Bird’s records was serendipitous, a happy happenstance contingent on Bird making the recordings, their being kept, then Farlow refereeing Bird’s biography. Their survival over data-time is as remarkable as the trackways’ survival over phenomena-time. As such, Turner could expand the sources of underdetermination he draws on: not only is information-destruction a common feature of historical processes, but the fragility and demandingness of data under epistemic scarcity makes information-destruction ubiquitous over data-time.

Due to epistemic scarcity, then, the sources of underdetermination are much richer and more complex than philosophers have recognized. And yet, in our case-study we’ve seen apparent information-loss over data-time be partly overcome. By drawing on photogrammetry techniques and exploiting records of the tracks prior to extraction, Falkingham et al. recovered much of the Paluxy trackways thought lost, as well as the practices and techniques used in that past context. Just as focusing on data-time increases the sources of underdetermination, it also increases our means of overcoming it. I’ve emphasized how legacy data, especially in light of new technological capacities, can be re-interpreted, re-analyzed and re-deployed. The strength of our inferences to the Paluxy River’s past involved going forwards by looking back. So, data-time also reveals new sources of overcoming underdetermination.

Consideration of data-time strengthens agnosticism. Where Havstad focused on our ignorance concerning information preservation over phenomena-time, we can now also emphasize our ignorance concerning information preservation over data-time. Whether historical processes preserve or destroy information is not the only thing that matters; so too does whether scientific practices of data collation, management and deployment preserve or destroy information. As the range of factors determining investigative success becomes increasingly heterogeneous and complex, epistemic fortunes become increasingly opaque, and thus bets on success or failure become increasingly risky.

Taking data-time seriously stretches the usefulness of framing the debate in terms of underdetermination to breaking point. Agnosticism becomes more attractive, but Turner has made convincing points against agnosticism: sometimes surely we do have a good grip on what kinds of things we’ll find out about the past, and—more pressingly—scientists themselves have to make judgments about which research strategies they adopt. Agnosticism, then, on the face of it, leaves philosophical analysis unable to inform what scientists should do (although see Currie 2019a). Let’s sketch a preliminary alternative.

6.2. Retooling the Debate

Happily, abandoning underdetermination as a way of framing epistemic questions concerning epistemic scarcity needn’t leave us rudderless, nor unable to provide normative analyses of science’s epistemic standing. My aim here is to illustrate the kind of alternative I have in mind. The framework is not intended to be fully realized and has similarities with Chang’s notion of ‘systems of scientific practice’ (2013), Leonelli and Ankeny’s ‘repertoires’ (2015) and my ‘epistemic situations’ (2018). The point of the sketch is to show that moving beyond underdetermination needn’t undermine epistemic analysis. Instead of asking whether some open question about the past will be resolved or not, we should ask whether the strategy scientists adopt in light of it is likely to be productive given the available resources. Let’s get clear on these notions.

At base, the resources available to scientists are what they have to work with. I understand resources extremely broadly, including their available technologies, data and background theory but also their economic opportunities, social standing and so on. Epistemic scarcity focuses on a set of resources: namely, data. As we’ve seen, sometimes scientists operate with non-renewable, rare, demanding and fragile data. But this does not exhaust their resources. They also have myriad technological affordances available to record, store, analyze, and interpret that data. These range from expert analysis of fossils, to statistical techniques, to databases, to fossil storage facilities and best-practices, to various kinds of data analysis and transformation such as photogrammetry. But they also have socio-economic resources, such as placement in universities and museums, and various social structures provisioning funding and interest. Elizabeth Jones, for instance, has explored how appeal to celebrity provided a crucial set of complex cultural resources for the development of ancient genomics (Jones 2019).

In light of available resources, scientists adopt differing strategies. A strategy, at base, is a means of either knowledge production or increasing the resource base with which scientists work. In light of epistemic scarcity, scientists like Farlow and Falkingham et al. adopt strategies which integrate re-analysis of legacy data with new technological development. Why they do this as opposed to some other kind of study is made sense of in light of their available resources. If, say, trackways were not so scarce then spending time developing techniques which allow us to work with legacy data would be odd, as legacy data would not be epistemically precious. Similarly, the use of celebrity in paleontology is often geared towards garnering attention and funding for research, namely, increasing the available resource-base.

Such questions can be asked across multiple scales. We might consider disciplinary or even cross-disciplinary resources and strategies. For instance, paleontology generally lacks a clear institutional home, spread as it is across museums, as well as geology, biology and medical departments in universities. On the one hand, this is challenging as it can lead to disciplinary fragmentation. But on the other, it can leave paleontologists well placed to interact with a wide variety of different disciplines and their resources. On smaller scales, we might consider the resources and strategies available to particular labs, or even individual researchers. A paleontologist with access to a CT scanner has different affordances than one without, and this difference in resource will make a difference to what strategies they should follow.

So, I suggest philosophers ask whether a scientific investigation is a good strategy given available resources. Where underdetermination leads us to make bets about whether some piece of knowledge or other will be had by scientists—and potentially leads to an inert agnosticism—analyzing resources and strategies allows philosophers to both make sense of why scientists conduct investigations as they do and potentially make judgements as to whether they ought to adopt the strategies they have.

It is tempting to add a third element to this initial sketch: scientists’ aims. Here, at least when considering epistemic scarcity, I’m unsure whether a focus on aims is particularly productive. First, we might worry about how to actually identify scientists’ aims—a tricky discussion I’ll leave for later work. Second, recall my earlier worries about Tamborini’s view, namely whether we should understand historical scientists as attempting to reconstruct the past or develop technoscientific methods to explore phenomena. It seems to me that the strategies scientists adopt in light of epistemic scarcity are too open-ended for identifying a particular goal or set of goals. Falkingham et al. are trying to correctly (if approximately) represent the Paluxy trackways prior to excavation, as well as Bird’s extraction and data recording techniques, as well as test hypotheses about what in fact happened millions of years ago, as well as develop new ways of representing and exploring phenomena. It may be that in some contexts the particular aims of investigation are crucial for understanding scientific strategies, but the very opportunism and openness of scientists operating under epistemic scarcity undermines this as a general rule.

Further, the notions of resource and strategy can make some headway on a pressing objection to my argument against the underdetermination framing. Surely, often we at least care—and it is often critically important—that we do get the right answer to scientific questions. That is, some knowledge matters for how we should respond to threats, organize our society, and so forth. In those contexts we certainly want to overcome underdetermination. No doubt: in the right context underdetermination looms large. In fact, we could view the (let’s call it) societal value of getting to the truth in terms of resources and strategies. I tend to view the relative unimportance of getting things right in some areas of historical science (does it really matter whether or not a sauropod was attacked by a theropod millions of years ago?) as a resource: it partly enables the open-ended and opportunistic strategies such scientists adopt. Of course, we can also view societal value as a resource: researching pressing questions (say, the efficacy of public health interventions) enables scientists to gather funding, interest, and passion. As such, the importance of underdetermination can be captured within the kind of framework I’ve suggested and—indeed—what counts as a resource will often be a complex question requiring intimate examination of historical and social context.

So, it is my contention—no doubt a sketchy one—that by focusing on resources and strategies philosophers are better placed to understand and inform scientific practice than by focusing on underdetermination. To close, I want to illustrate the importance of data-time in one more way, by pushing against a hitherto little-remarked feature of how some philosophers think about historical science: passivity.

6.3. The Passivity of Historical Science

Turner’s original argument for pessimism turned in part on a distinction between limiting and promoting background theory. The latter, which he thinks is common in experimental science, tells us how to generate new data; the former, common in historical science, tells us how our data sources will degrade. In the parlance of this paper, experimental data is renewable and historical data is not. This is an example of what I’ll call the passivity of historical science.7 Where experimenters control their epistemic fates, historical scientists must work with what time’s information-destroying processes have left. Carol Cleland captures the idea evocatively:

… [in historical science] one is at the mercy of what nature just happens to leave in her wake; sometimes she is generous and sometimes she is stingy, but the bottom line is that you can’t fool with her. (Cleland 2002, p. 485)

Focusing on phenomena-time, the motivation is clear: the rarity and non-renewability of data under epistemic scarcity seems to fundamentally limit what can be known. If the trackways were erased over geological time, so be it: they’re lost to us. And one might insist that the epistemic fate of historical reconstruction is beholden to what information traces hold (e.g., Havstad 2019). But turning to data-time, things look different.

In conditions of epistemic scarcity, data is fragile and demanding. But as we’ve seen, various strategies can and do mitigate this. A common aspect of data journeys is their being transformed into more robust and easily managed forms. Paleontologists take casts of trackways, perform measurements, photograph them, and model them in digital environments. Indeed, consideration of legacy data can matter for science’s future success by providing lessons about how data should be managed now. Falkingham et al.’s reconstruction of the trackways in situ turned in part on a happy happenstance. But this is not always the case: careful data storage and management can increase later scientists’ capacities to reuse and breathe new life into old data. For instance, new best-practices for managing ichnological data are being developed. Falkingham et al. complain that in ichnology

… communication has been almost exclusively limited to printed papers and books. This 2D medium restricted the recording of tracks to sketches and lithographs, and later with the rise of the camera, photographs. Most ichnological literature, perhaps until only a few years ago, continued to rely solely on photos and drawings. Workers have thus spent the majority of their time reporting linear measurements in the horizontal plane (e.g., length, width and interdigital angle (IDA, or digit divarication); occasionally supplementing such metrics with a single measure of depth. (Falkingham et al. 2018, p. 470; references removed)

These older practices don’t capture the critical three-dimensional aspects of tracks. Tracks are not molds of their maker’s feet: the medium, impact of footfall, gait and so forth, must be understood (not to mention the distortions introduced over geological time). In light of modern scanning and photogrammetry techniques (and their relative cheapness), Falkingham et al. recommend “all reports of traces include 3D data collection, especially when new ichnotaxa are being erected” (2018, p. 472). Echoing the sorry state of the Paluxy trackways over data-time they argue that “where possible, digitization should be carried out prior to any physical replication (e.g., molding or casting) as the physical replication process may alter the fossil either physically or chemically” (2018, p. 472; reference removed). They further give advice for how data should be presented and stored. The point of this standardization is to make the data amenable for travel, thus mitigating its demandingness and fragility. They, then, recommend a strategy aimed at increasing the resources available to ichnology.

The development of new practices, standards, technologies and theories of data generation and journeying is a fundamentally active process (or we might say in Turner’s parlance, they are promoting rather than limiting). Consider the data-time analogue of information loss over phenomena-time. Where for the latter we point to, say, the information destruction of fossilization as revealed by our background theories, for the former we point to our background theories about current and past data generation, recording, and management practices. For the latter, scientists are at nature’s mercy, but for the former, information loss turns on what scientists do: the decisions they make, practices they adopt, and so forth. And these practices are anything but passive.

Now, recall my suggestions for re-tooling the debate. Instead of considering epistemic fate simpliciter, we should ask what strategies are likely to contribute to success given particular resources (and, perhaps, aims). Given epistemic scarcity, scientists develop and adopt practices aimed at preserving the scarce remains of the past. This can mean standardizing digital data recording techniques, storage, and dissemination as well as developing new technologies aimed at further analysis of legacy data. By considering the connection between available resources and strategies, philosophers can explain scientific success and failure in a way sensitive to local conditions. No doubt this example is rather shallow: obviously good data management practices are critical for scientific success. But notice how turning from questions about underdetermination to instead considering resources and strategies reveals a hitherto unnoticed philosophical mistake—the apparent passivity of historical science—and opens further questions. For example, we might ask to what extent Falkingham et al.’s suggestions could restrict ichnology. For one thing, if the standard were adopted as a requirement for publication, only scientists with access to the relevant technologies, know-how and time would be able to meaningfully work in the area (see Leonelli 2018b for this kind of discussion). For another thing, emphasizing some aspects of retrievable information could potentially come at the cost of other aspects. So, alternative framing along these lines allows for productive, normative philosophical analysis of scientific practice.

7. Conclusion

No scientist would be surprised to hear that data extraction and management matter, fundamental as they are to their practices and success. But philosophers have often missed this, likely due to our focus on abstract considerations of hypotheses, evidence and information. As we’ve seen, however, considering relational notions of data, tracking data journeys, and generally understanding evidential reasoning as a complex temporal process can transform philosophical debates. Regarding the obstinacy or otherwise of local underdetermination, we see that there are no simple answers or general arguments to be had. Regarding debates about the epistemic prowess of science, we see that categories like epistemic scarcity are more useful than others, such as historical science. Regarding the notion that historical scientists can only work with what nature has provided, we see that this relies on abstracting from data-time: once that abstraction is set aside, they are just as active, and just as in control of their fates, as scientists in any other discipline. This matters especially under conditions of epistemic scarcity, where data are precious, and here we’ve seen how both legacy data and our understanding of the practices that generated them can be a rich source of learning about the deep past.

Notes

1. The notion of information-destroying processes is owed to Sober (1991).

2. For recent historical and sociological discussion of data production and management in the historical sciences see Jones (2019), Nieuwland (2019), Rieppel (2012), and Sepkoski and Tamborini (2018).

3. You might question whether fossils themselves (as opposed to representations of them) should be considered data. By my lights they clearly are: fossils are extracted, processed, and often directly examined in forming paleontological claims. They go on data-journeys just as much as their representations.

4. See, for instance, Matthewson’s discussion of heterogeneity (2011), as well as the developments of it by Elliott-Graves (2016) and by Currie and Walsh (2018). See also my discussions of local knowledge (Currie 2019a, Currie forthcoming).

5. Philosophers of science often connect these pessimistic arguments to questions of scientific realism: if underdetermination is rife and intransigent, then we lack sufficient reason to think that the hypotheses in question are true. In this paper I’ll be focusing on pessimism and optimism rather than realism.

6. Most discussions of legacy data are in the context of computational data management (e.g., Roth and Schwarz 1997; Turau 1999). My usage is more general than these.

7. Passive views of historical science are not held by all philosophers: Ben Jeffares, Alison Wylie and I, as well as Turner in his recent publications (e.g., Turner 2009a, 2009b, 2016), are certainly exceptions.

References

Boyd, N. M. 2018. “Evidence Enriched.” Philosophy of Science 85 (3): 403–421.

Chang, H. 2013. Water is not H2O: Evidence, Realism and Pluralism. Springer.

Chapman, R., and A. Wylie. 2016. Evidential Reasoning in Archaeology. Bloomsbury Publishing.

Churchland, Paul M. 1985. The Ontological Status of Observables: In Praise of the Superempirical Virtues. Pp. 35–47 in Images of Science: Essays on Realism and Empiricism. Edited by Paul M. Churchland and Clifford A. Hooker. Chicago: University of Chicago Press.

Cleland, C. E. 2001. “Historical Science, Experimental Science, and the Scientific Method.” Geology 29 (11): 987–990.

Cleland, C. E. 2002. “Methodological and Epistemic Differences between Historical Science and Experimental Science.” Philosophy of Science 69 (3): 447–451.

Clendinnen, F. J. 1989. “Realism and the Underdetermination of Theory.” Synthese 81: 63–90.

Currie, A. M. 2015. “Marsupial Lions and Methodological Omnivory: Function, Success and Reconstruction in Paleobiology.” Biology & Philosophy 30: 187–209.

Currie, A. M. 2017. “Hot-Blooded Gluttons: Dependency, Coherence, and Method in the Historical Sciences.” The British Journal for the Philosophy of Science 68 (4): 929–952.

Currie, A. M. 2018. Rock, Bone, and Ruin: An Optimist’s Guide to the Historical Sciences. Cambridge, Mass.: MIT Press.

Currie, A. M. 2019a. Scientific Knowledge and the Deep Past: History Matters. Cambridge University Press.

Currie, A. M. 2019b. “Epistemic Optimism, Speculation, and the Historical Sciences.” Philosophy, Theory, and Practice in Biology 11: 7.

Currie, A. M. (forthcoming). Comparative Thinking in Biology. Cambridge: Cambridge University Press.

Currie, A., and K. Walsh. 2018. “Newton on Islandworld: Ontic-Driven Explanations of Scientific Method.” Perspectives on Science 26 (1): 119–156.

Devitt, M. 2002. “Underdetermination and Realism.” Philosophical Issues 12: 26–50.

Elliott-Graves, A. 2016. “The Problem of Prediction in Invasion Biology.” Biology & Philosophy 31 (3): 373–393.

Falkingham, P. L., K. T. Bates, M. Avanzini, M. Bennett, E. M. Bordy, B. H. Breithaupt, … and A. R. Fiorillo. 2018. “A Standard Protocol for Documenting Modern and Fossil Ichnological Data.” Palaeontology 61 (4): 469–480.

Falkingham, P. L., K. T. Bates, and J. O. Farlow. 2014. “Historical Photogrammetry: Bird’s Paluxy River Dinosaur Chase Sequence Digitally Reconstructed as It Was Prior to Excavation 70 Years Ago.” PLoS One 9 (4): e93247.

Farlow, J. O., K. T. Bates, R. M. Bonem, B. F. Dattilo, P. L. Falkingham, R. Gildner, … and J. Whitcraft. 2015. Early- and Mid-Cretaceous Archosaur Localities of North-Central Texas. Guidebook for the Field Trip Held October 13, 2015 in Conjunction with the 75th Annual Meeting of the Society of Vertebrate Paleontology in Dallas, Texas. https://www.researchgate.net/publication/283711331_Early-_and_Mid-Cretaceous_Archosaur_Localities_of_North_Central_Texas

Farlow, J. O., M. O’Brien, G. J. Kuban, B. F. Dattilo, K. T. Bates, et al. 2012. “Dinosaur Tracksites of the Paluxy River Valley (Glen Rose Formation, Lower Cretaceous), Dinosaur Valley State Park, Somervell County, Texas.” Proceedings of the V International Symposium about Dinosaur Palaeontology and their Environment: 41–69.

Finkelman, L. 2019. “Betting & Hierarchy in Paleontology.” Philosophy, Theory, and Practice in Biology 11: 9.

Forber, P. 2009. “Spandrels and a Pervasive Problem of Evidence.” Biology & Philosophy 24 (2): 247.

Godfrey-Smith, P. 2008. “Recurrent, Transient Underdetermination and the Glass Half-Full.” Philosophical Studies 137: 141–148.

Goodman, N. 1955. Fact, Fiction, and Forecast. Indianapolis: Bobbs-Merrill.

Havstad, J. 2019. “Metaphorical Ripples.” Philosophy, Theory, and Practice in Biology 11: 10.

Jeffares, B. 2008. “Testing Times: Regularities in the Historical Sciences.” Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 39 (4): 469–475.

Jeffares, B. 2010. “Guessing the Future of the Past.” Biology & Philosophy 25 (1): 125–142.

Jones, E. D. 2019. “Ancient Genetics to Ancient Genomics: Celebrity and Credibility in Data-Driven Practice.” Biology & Philosophy 32 (4): 27.

Kleinhans, M. G., C. J. Buskes, and W. W. de Regt. 2005. “Terra Incognita: Explanation and Reduction in Earth Science.” International Studies in the Philosophy of Science 19 (3): 289–317.

Kukla, A. 1993. “Laudan, Leplin, Empirical Equivalence and Underdetermination.” Analysis 53 (1): 1–7.

Kukla, A. 1996. “Does Every Theory Have Empirically Equivalent Rivals?” Erkenntnis 44: 137–166.

Lallensack, J. N., P. M. Sander, N. Knötschke, and O. Wings. 2015. “Dinosaur Tracks from the Langenberg Quarry (Late Jurassic, Germany) Reconstructed with Historical Photogrammetry: Evidence for Large Theropods soon after Insular Dwarfism.” Palaeontologia Electronica 18 (2): 31A.

Latour, Bruno. 1999. Circulating reference: Sampling the soil in the Amazon forest. Pp. 24–79 in Pandora’s Hope. Cambridge, MA: Harvard University Press.

Laudan, L. 1990. “Demystifying underdetermination.” Pp. 267–297 in Scientific Theories. Edited by C. W. Savage. Minneapolis: University of Minnesota Press.

Laudan, L., and J. Leplin. 1991. “Empirical Equivalence and Underdetermination.” The Journal of Philosophy 88 (9): 449–472.

Leonelli, S. 2016. Data-Centric Biology: A Philosophical Study. Chicago: University of Chicago Press.

Leonelli, S. 2018a. “The Time of Data: Timescales of Data Use in the Life Sciences.” Philosophy of Science 85 (5): 741–754.

Leonelli, S. 2018b. “Global Data Quality Assessment and the Situated Nature of ‘Best’ Research Practices in Biology.” Data Science 16 (32): 1–11.

Leonelli, S. 2020. Learning from Data Journeys. Pp. 11–31 in Data Journeys in the Sciences. Edited by S. Leonelli and N. Tempini. Berlin: Springer.

Leonelli, S., and R. A. Ankeny. 2015. “Repertoires: How to Transform a Project into a Research Community.” BioScience 65 (7): 701–708.

Matthewson, J. 2011. “Trade-Offs in Model-Building: A More Target-Oriented Approach.” Studies in History and Philosophy of Science Part A 42 (2): 324–333.

Nieuwland, I. 2019. American Dinosaur Abroad: A Cultural History of Carnegie’s Plaster Diplodocus. Pittsburgh: University of Pittsburgh Press.

Novick, A., A. M. Currie, E. W. McQueen, and N. L. Brouwer. 2020. “Kon-Tiki Experiments.” Philosophy of Science 87 (2): 213–236.

Okasha, S. 1997. “Laudan and Leplin on Empirical Equivalence.” The British Journal for the Philosophy of Science 48 (2): 251–256.

O’Malley, M. A. 2016. “Histories of Molecules: Reconciling the Past.” Studies in History and Philosophy of Science Part A 55: 69–83.

Psillos, Stathis. 1999. Scientific Realism: How Science Tracks Truth. New York: Routledge.

Quine, W. V. 1975. “On Empirically Equivalent Systems of the World.” Erkenntnis 9: 313–328.

Rieppel, L. 2012. “Bringing Dinosaurs Back to Life: Exhibiting Prehistory at the American Museum of Natural History.” Isis 102: 460–490.

Roth, M. T., and P. M. Schwarz. 1997. “Don’t Scrap It, Wrap It! A Wrapper Architecture for Legacy Data Sources.” In VLDB Endowment 97: 25–29.

Sarkar, H. 2000. “Empirical Equivalence and Underdetermination.” International Studies in the Philosophy of Science 14 (2): 187–197.

Sepkoski, D., and M. Tamborini. 2018. “‘An Image of Science’: Cameralism, Statistics, and the Visual Language of Natural History in the Nineteenth Century.” Historical Studies in the Natural Sciences 48: 56–109. DOI: https://doi.org/10.1525/hsns.2018.48.1.56

Sober, E. 1991. Reconstructing the Past: Parsimony, Evolution, and Inference. Cambridge, Mass.: MIT Press.

Sklar, L. 1977. “What Might Be Right about the Causal Theory of Time.” Synthese 35 (2): 155–171.

Stanford, P. K. 2006. Exceeding Our Grasp: Science, History, and the Problem of Unconceived Alternatives (Vol. 1). Oxford: Oxford University Press.

Stanford, P. K. 2009. Underdetermination of Scientific Theory. The Stanford Encyclopedia of Philosophy. Edited by E. N. Zalta. https://plato.stanford.edu/entries/scientific-underdetermination/

Tamborini, M. 2020. “Technoscientific Approaches to Deep Time.” Studies in History and Philosophy of Science Part A 79: 57–67.

Thomas, D. A., and J. O. Farlow. 1997. “Tracking a Dinosaur Attack.” Scientific American 277 (6): 74–79.

Tucker, A. 2011. “Historical Science, Over-And Underdetermined: A Study of Darwin’s Inference of Origins.” The British Journal for the Philosophy of Science 62 (4): 805–829.

Turner, D. 2005. “Local Underdetermination in Historical Science.” Philosophy of Science 72 (1): 209–230.

Turner, D. 2007. Making Prehistory: Historical Science and the Scientific Realism Debate. Cambridge: Cambridge University Press.

Turner, D. D. 2009a. “How Much Can We Know about the Causes of Evolutionary Trends?” Biology & Philosophy 24 (3): 341.

Turner, D. 2009b. Beyond Detective Work: Empirical Testing in Paleontology. Pp. 201–214 in The Paleobiological Revolution: Essays on the Growth of Modern Paleontology. Edited by David Sepkoski and Michael Ruse. Chicago: The University of Chicago Press.

Turner, D. D. 2016. “A Second Look at the Colors of the Dinosaurs.” Studies in History and Philosophy of Science Part A 55: 60–68.

Van Fraassen, Bas C. 1980. The Scientific Image. Oxford: The Clarendon Press.

Wylie, C. D. 2015. “‘The Artist’s Piece Is Already in the Stone’: Constructing Creativity in Paleontology Laboratories.” Social Studies of Science 45 (1): 31–55.

Wylie, A. 2020. Radiocarbon Dating in Archaeology: Triangulation and Traceability. Pp. 263–279 in Data Journeys in the Sciences. Edited by S. Leonelli and N. Tempini. Berlin: Springer.

Wylie, C. D. 2019. “Overcoming the Underdetermination of Specimens.” Biology & Philosophy 34 (2): 24.

Author notes

I’m grateful to Nora Boyd, Sabina Leonelli and two anonymous reviewers for useful feedback on drafts. I’d also like to thank Kirsten Walsh, Caitlin Wylie, Lucy Osler, readers of the Extinct blog, and attendees of the University of Exeter’s ‘Data Crunch’ meetings for discussion.