The role of the expert witness in trials is a paradox. Judges and jurors need help with matters beyond their understanding, and judges are expected to act as gatekeepers to ensure that jurors are not fooled by misleading expert testimony. Yet, as gatekeepers, judges might not effectively distinguish sound from unsound expert testimony. As factfinders, judges and jurors both might have difficulty comprehending expert evidence, intelligently resolving conflicts between experts, and applying the scientific and technological evidence they hear to the larger dispute before them. This essay explores those problems and a variety of possible solutions, ranging from more effective ways parties might present technical information at trial, to educational interventions supervised by the court, to making juries more effective in performing their task, to more controversial measures, such as replacing conventional juries with special juries and replacing generalist judges with expert judges.

The fundamental paradox of the use of expert evidence in litigation is that those with the power and duty to evaluate expert testimony possess less knowledge of the specialized subject matter at issue than do the experts whose testimony they are evaluating. Judges experience this paradox not only when they are performing as factfinders in bench trials, but also when they are acting as gatekeepers of expert testimony. As one prominent judge observed:

Though we are largely untrained in science and certainly no match for any of the witnesses whose testimony we are reviewing, it is our responsibility to … resolve disputes among respected, well-credentialed scientists about matters squarely within their expertise, in areas where there is no scientific consensus.1

The paradox also exists for juries. As Judge Learned Hand asked in 1901, “How can the jury judge between two statements each founded upon an experience confessedly foreign in kind to their own? It is just because they are incompetent for such a task that the expert is necessary at all.”2

Despite this central paradox, trials by generalist judges and representative juries have much to recommend them as vehicles for the rational resolution of factual disputes involving scientific and technical issues. In other fact-finding settings, decision makers often have strong preferences or prior commitments, and even if they do not, they might be subjected to an array of pressures applied by interested parties. Consider government and industry review panels; or think about the situation facing legislators attempting to integrate into the laws they draft diverse interpretations of scientific facts pressed upon them by constituents or lobbyists. Contrast these settings with trials, where jurors and judges are expected to have no biases regarding which party prevails and what facts are found to be true. Prospective jurors are ideally to be excluded before trial if they hold beliefs or attitudes that favor one party over another, or if their own interests are linked to a side in the case. Judges are expected to recuse themselves from cases, allowing other judges to preside, if they have or might reasonably be perceived as having close ties to a party or an attorney who would appear before them or a financial or other interest in the outcome of a case.

During trials, the system uses tools for informing decision makers about relevant facts that are, by design, fundamentally concerned with guaranteeing the relevance and reliability of information. To this end, the architecture of the adversary system promises the opportunity to make counterarguments for every important claim made by an opposing advocate. Ideally, the judge and the jury hear the parties' accounts, consider the competing factual claims and interpretations urged upon them, and then do their best to reach the verdict that best fits the facts they deem most likely correct. Compared to many other settings for fact-based dispute resolution, including those involving scientific facts, courtroom trials, notwithstanding their imperfections, are among the most rationally constructed.

In trials where expert scientific evidence bears on the heart of a dispute, the key problem is not the absence of factfinder neutrality, but rather that the decision makers arrive at their task without the knowledge, and perhaps without the intellectual skills, needed to complete their assignment effectively. Thousands of trials take place in federal and state courtrooms nationwide each year, often deciding significant cases with far-reaching implications. If the trial process is to serve the parties and the larger society well, the law must find means to overcome the inherent limitations that arise when scientific expertise is needed to resolve disputes. In this essay, we offer a range of suggestions for how judge and jury fact-finding in trials with scientific evidence might be improved.

Before discussing how trials might be made to work better, it is worth illustrating challenges likely to arise. Judges have long been the gatekeepers of evidence, screening proffered testimony under rules that evolved to prevent false or misleading evidence, including expert evidence, from leading jurors astray. The admissibility decision is key: if plaintiffs cannot use scientific evidence to make their case, the case may be resolved through summary judgment or collapse on its own. Yet this arrangement applied to experts is paradoxical at its core: expert evidence must be prescreened for nonexpert jurors by nonexpert judges.

Because jurors typically (though not invariably) are laypersons lacking the expertise to evaluate scientific and other technical evidence, they are offered the guidance of experts. On occasion, courts appoint neutral expert witnesses for this purpose.3 But, typically, experts are provided and employed (literally) by parties who wish to lead the jurors to particular conclusions. The filter interposed to protect jurors from being misled by invalid or misleading expert testimony consists of another nonscientist, the judge, who is generally not much better situated than the jurors to decide whether what is being received is sound or not. Some judges might have the benefit of experience with similar scientific evidence or can draw on their clerks' knowledge. Nonetheless, as we suggest below, in some circumstances, judges may lack strengths jurors have in evaluating scientific evidence.

In recent years, DNA exonerations of innocent defendants have called attention to the long-standing and consequential failures of judges as gatekeepers in relation to various forensic sciences.4 For more than a century, judges have assessed the proffered testimony of witnesses who claimed to be able to identify the source of fingerprints, bite marks, hair, handwriting, footprints, tool marks, and the like found at crime scenes. These witnesses were typically allowed to testify with little or no vetting, and they have been extraordinarily persuasive in both bench and jury trials. Consider the case of Cameron Todd Willingham.5 The state's arson experts concluded that “arson indicators” established that a fire was intentionally set, which made the deaths of Willingham's children in the fire murders. The court admitted the expert testimony about what their so-called arson indicators implied even though no empirical tests had shown that these indicators could distinguish accidental from purposefully set fires. The jurors accepted as sound the expert claims that had passed judicial muster. They convicted Willingham and sentenced him to death; he was subsequently executed.6

A month after Willingham's conviction, a major publication of the leading fire and arson investigation organization summarized ongoing empirical testing that found that the “indicators” relied on in the trial were unable to distinguish arson fires from accidental ones.7 Over the next twelve years, until Willingham's execution in 2004, in the course of numerous appeals, no court was ever asked to reconsider the (in)validity of the expert testimony that had been offered at trial.8

Courts have rarely excluded the findings and testimony of expert forensic scientists, but in recent years, interdisciplinary bodies of scientists have reviewed those forensic offerings and declared some of them, like the arson indicators, not only to be largely or completely lacking in empirical validation, but also to be almost certainly invalid.9 The National Research Council, the research arm of the National Academy of Sciences, which established a subcommittee to review the forensic sciences, concluded:

The bottom line is simple: In a number of forensic science disciplines, forensic science professionals have yet to establish either the validity of their approach or the accuracy of their conclusions, and the courts have been utterly ineffective in addressing this problem.10

Although it came too late to help Mr. Willingham, the field of fire and arson examination removed nearly two dozen “arson indicators” from its corpus of supposed knowledge when they were tested empirically and found unable to distinguish arson fires from accidental blazes.11 Two other forensic disciplines (voiceprint identification and comparative bullet lead analysis) closed up shop after being found by scientific review bodies – but not by the courts – to lack sound bases for their claims.12 A fourth technique, bite mark identification, seems to be next in line to be discredited, though, to date, no court has ever found it inadmissible.13 It is unlikely to be the last forensic discipline to be shelved for failing the test of empirical validation.

Judicial gatekeepers have been unable to distinguish pseudoscience from science even after the U.S. Supreme Court, in Daubert v. Merrell Dow Pharmaceuticals, clarified the test of admissibility to emphasize that the touchstone for the admissibility of scientific claims is demonstrated validity.14 These are hints that knowledge is not enough: if judges are unwilling to follow the evidence where it leads when it leads to unfamiliar destinations or unwelcome acquittals, then nonjudicial institutions will have to come to the rescue.15 Until they become better informed about a subject, neither average judges nor average citizens are likely to have more than a limited understanding and stereotypical impressions of the multitude of scientific and technical fields, and little ability to critically evaluate those fields' claims.16

When judges decide to admit scientific evidence, they risk putting an unintended thumb on the scale. Consider psychologists N. J. Schweitzer and Saks's research finding evidence of a “gatekeeper effect.”17 Participants evaluated expert evidence presented within or outside of a trial context. Those who reviewed evidence they believed had successfully passed through a judicial filter regarded the evidence as being of higher quality and more persuasive than participants who evaluated evidence presented outside the trial context. Apparently, participants assumed that evidence that survives the law's seemingly rigorous gatekeeping can be regarded as sound science.

Of course, even if a judge conscientiously and correctly admits only acceptably sound science, problems can remain, for some scientific issues are legitimately disputed between equally knowledgeable and sincere experts. How is the jury to referee such a dispute? Making matters even more difficult, because the great majority of cases are disposed of before trial, and because pretrial settlement tends to remove the clearest and easiest cases, what lands in court are the cases that the parties and their lawyers were unable to resolve, sometimes because of profound disputes over the facts. Thus, what the legal process delivers to judges and juries tends to be the most unclear, ambiguous, and challenging of the mass of cases initially filed.

Research indicates that when people are motivated and able to do so, they engage in central, or “System 2,” processing: that is, they process information thoughtfully in an effort to solve the problem confronting them.18 But when they are unmotivated or unable, perhaps due to lack of ability or information overload, they tend to engage in peripheral, or “System 1,” processing, relying on superficial features of the information before them, such as the number of arguments or the characteristics of the witnesses and attorneys.19 Thus, in trials in which jurors (or judges) might be overwhelmed by unfamiliar scientific evidence and confused or frustrated by testimony beyond their comprehension, shallow System 1 thinking may seriously endanger sound fact-finding.

Recent research suggests that even when expert testimony is presented in a relatively straightforward fashion, laypeople may be insensitive to the empirical support for a proposition (or lack thereof), although scientists see empirical tests as the touchstone for resolving scientific disputes. Instead, laypeople may rely more on the background and experience of the witness presenting the evidence as a measure of the testimony's value. Although credentials can be informative, lawyers for both parties may seek out and succeed in hiring expert witnesses with similarly impressive credentials. If they do, the evaluation of experts' credentials will supply an even less reliable means of determining which opposing expert is the more competent.

Quantitative, statistical, and probability evidence can be especially confusing and potentially misleading. For example, students in a college economics class were upset because their grades averaged 72 (out of a maximum 100 points), even though they were graded on a curve and the distribution of A's, B's, and C's was a predetermined constant. On the next exam, the professor employed a raw scale maximum of 137, on which the average score was now 96 (actually implying poorer performance by the class as a whole). Again, earned grades reflected the students' relative position in the class and the same number of A's, B's, and C's were given as before. This time, however, the students were much happier. A class average of 96 felt better than 72. The students were influenced emotionally by the superficial impression made by the raw scores, even though they understood cognitively that what mattered was their relative rank on the raw scale, whatever the scale happened to be.20
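The parenthetical claim that the second average implies poorer performance can be checked with a line of arithmetic. The short Python sketch below uses only the four figures given in the example; the variable names are ours.

```python
# Class averages from the example, each expressed as a share of the maximum score.
first_exam_avg, first_exam_max = 72, 100
second_exam_avg, second_exam_max = 96, 137

first_share = first_exam_avg / first_exam_max
second_share = second_exam_avg / second_exam_max

print(f"First exam:  {first_share:.1%} of the maximum")
print(f"Second exam: {second_share:.1%} of the maximum")
```

On a common scale, the second average (roughly 70 percent of the maximum) is slightly lower than the first (72 percent), even though the raw number is larger.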

Jurors try to fit the evidence they hear into stories, narrative accounts that make sense of the facts of a case and imply particular case outcomes. Like most of us, they struggle hard to understand statistical and probability evidence and to infer its implications for a case. Typically, people underutilize such evidence in their decision-making and are more influenced by clinical evidence than they are by more diagnostic actuarial evidence.21 In trials, there is reason to think the problem is especially acute. Expert evidence, especially of the statistical kind, is difficult to incorporate into a story of the case, thus inviting undervaluation in comparison with other, more case-specific, narrative kinds of testimony.

Even when people understand the relevance of probability evidence, they can make “misaggregation errors,” causing them to underutilize the evidence. A misaggregation error occurs “when a person's subjective belief in the validity of a hypothesis [e.g., the defendant is guilty] is not updated to the extent that is logically warranted based on prior beliefs and the probative value of a new piece of probabilistic evidence.”22 Relatedly, people under-adjust for laboratory error rates when assessing the meaning of a forensic test's results.23
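The "logically warranted" update in this definition is commonly formalized with Bayes' rule in odds form: posterior odds equal prior odds multiplied by the likelihood ratio of the new evidence. The Python sketch below illustrates that benchmark. The prior probability and likelihood ratio are hypothetical values chosen for illustration, not figures from the studies cited.

```python
def updated_probability(prior_probability, likelihood_ratio):
    """Apply Bayes' rule in odds form and return the posterior probability."""
    prior_odds = prior_probability / (1 - prior_probability)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Hypothetical inputs: a factfinder who starts at 20 percent belief in the
# hypothesis hears evidence that is 100 times more likely if it is true.
prior = 0.20
likelihood_ratio = 100
posterior = updated_probability(prior, likelihood_ratio)

print(f"Prior belief:      {prior:.0%}")
print(f"Warranted belief:  {posterior:.1%}")
```

Under these assumed inputs the benchmark belief rises to about 96 percent. A factfinder who moved only to, say, 50 percent would be committing the kind of under-adjustment the passage describes.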

In some contexts, however, probability data can be overweighted. A well-known example is the “prosecutor's fallacy,” which confuses the frequency of a trait in the population (for example, one person in a million has DNA that matches the crime scene DNA) with the probability that someone other than the defendant left the evidence showing that trait (that there is only one chance in a million that crime scene DNA came from someone other than the defendant). Further illustrating the confusion that probabilistic evidence can cause, if the same data are presented as frequencies rather than as probabilities (such as one out of every million people has DNA that would match the crime scene DNA), this can produce the opposite effect: undervaluing the probative value of the evidence given the other evidence in the case.24
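A small calculation shows why the two quantities in the prosecutor's fallacy differ. The Python sketch below rests on simplifying assumptions that are ours, not the essay's: a hypothetical pool of ten million other people who could have left the sample, each equally likely beforehand to be the source, no other evidence in the case, and no laboratory error.

```python
def prob_someone_else_is_source(trait_frequency, other_possible_sources):
    """Probability that the matching sample came from someone other than the
    defendant, assuming every person in the pool (defendant included) was
    equally likely to be the source before the match was known."""
    expected_other_matches = trait_frequency * other_possible_sources
    return expected_other_matches / (1 + expected_other_matches)

trait_frequency = 1 / 1_000_000       # frequency given in the text
other_possible_sources = 10_000_000   # hypothetical pool size (assumption)

p_other = prob_someone_else_is_source(trait_frequency, other_possible_sources)
print(f"Frequency of the trait:              {trait_frequency:.6%}")
print(f"Chance someone else left the sample: {p_other:.1%}")
```

With these assumed inputs, about ten other people would be expected to match, so the chance that someone else is the source is roughly 91 percent rather than one in a million. Other evidence in a real case would change the starting point, which is why the trait frequency alone cannot answer the question.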

Civil cases present another broad range of challenges for factfinders.25 Jurors and judges alike can easily become confused by material presented during expert testimony in civil trials, such as the meaning of statistical significance, practical significance, confidence intervals, relative versus absolute risk, or regression models.

In addition to the difficulties of dealing with statistics, most if not all of the heuristics and biases made famous by psychologists Daniel Kahneman and Amos Tversky can foster distortions in the rational interpretation of information and lead to error. Even experts are susceptible to such sources of error. For example, physicians who regularly counsel patients on the results of screening tests like mammograms sometimes make erroneous inferences about the meaning of a positive test result, even when they have all the information needed to reach a correct interpretation.26
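The inference at issue for screening tests is the probability of disease given a positive result, which depends on how common the disease is as well as on the test's accuracy. The Python sketch below works through one case. The prevalence, sensitivity, and false positive rate are illustrative assumptions, not figures from the study cited.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Probability that a person who tests positive actually has the condition."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# Illustrative (assumed) screening characteristics.
prevalence = 0.01           # 1 percent of those screened have the condition
sensitivity = 0.90          # 90 percent of true cases test positive
false_positive_rate = 0.09  # 9 percent of healthy people test positive

ppv = positive_predictive_value(prevalence, sensitivity, false_positive_rate)
print(f"Chance of disease after a positive test: {ppv:.1%}")
```

Under these assumptions a positive result implies only about a 9 percent chance of disease, far below the 90 percent figure that an intuitive reading of the test's sensitivity might suggest.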

Many of the problems of comprehending, evaluating, and using unfamiliar technical evidence to make important decisions are not peculiar to jurors. They are problems for most people in most situations, certainly including judges, and sometimes or often including trained specialists, who should have a fighting chance to get things right, but who are incompletely schooled in the evidence or fall prey to misleading cognitive heuristics.

The situation is not, however, entirely bleak. Even in cases with extensive scientific evidence, some factual disputes do not demand expert analysis. Instead, their resolution turns on credibility or related judgments. These may reflect not just the credentials of rival experts but the consistency of their claims, or the way witnesses hold up under cross-examination, or judgments about facts in dispute that the experts mutually acknowledge to be dispositive. In the medical malpractice area, for example, various studies have assessed the reasonableness of jury verdicts. Some studies have compared jury verdicts with confidential assessments of the same cases made by neutral physicians. These studies generally find agreement between the physicians and the juries.27

Even when testimonial or other evidence is unfamiliar and complex, jurors and judges can absorb and ponder the evidence deeply (central processing), even if mixed with other, more superficial thinking (peripheral processing).28 Thus, a reasonable goal for improving the use of expert evidence is to find ways to facilitate an increased ratio of central to peripheral processing of trial information.29

Trials offer fact-finding benefits as well as challenges. The advantages might be leveraged for further improvement. We offer specific suggestions below, some modest, others more controversial. Some are based on findings derived from empirical research; others are in need of testing. These include:

  • Presenting expert evidence to maximize understanding;

  • Restructuring the trial to maximize understanding;

  • Implementing trial procedure reforms that promote understanding;

  • Educating judges;

  • Educating juries;

  • Ensuring diverse juries and robust deliberation; and

  • Changing the factfinder to special juries and expert judges.

Trials are inherently educational forums. The whole exercise is about communicating relevant information to factfinders for decision-making. Trial procedures can be tweaked so that their capacity for educating is improved. Judges have considerable discretion to manage evidence before and during the trial, so long as they do not unduly burden the fundamental right of the parties to assemble and present the evidence.

Where there is a battle of experts, jurors may end up skeptical of both sides, undermining their use of relevant expert evidence in their decisions.30 But smart and capable lawyers on one or both sides, with the cooperation of the judge, should be able to find ways to work with their experts to provide factfinders with sound and comprehensible information, such that the case rooted in sounder science helps itself while facilitating better decision-making.

There are ways to present unfamiliar or complex information so that it can be better understood and used in trial decisions. Attorneys and their expert witnesses can and should adopt these methods. Psychologist Gerd Gigerenzer and colleagues, for example, have put much energy into finding ways to make statistical presentations more intuitively understandable. Their suggestions include: use numbers, not just words, to describe quantities and risks; present numbers in data tables; use natural frequencies rather than conditional probabilities; use frequencies rather than single-event probability statements; and report absolute risks, not relative risks.31 Researchers have also recommended communicating numerical information using visual aids such as bar graphs, pie charts, 2×2 tables, and Venn diagrams. Well-crafted visual displays can help jurors understand probabilities and magnitudes and can help them avoid framing effects (a form of cognitive bias resulting from how information or questions are presented).32
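The difference between relative and absolute risk is easiest to see with the same data framed both ways. The Python sketch below uses hypothetical case counts of our own choosing and states the result as a frequency per 10,000 people, in line with the recommendations above.

```python
GROUP_SIZE = 10_000  # hypothetical reference group for the frequency statement

def compare_risks(cases_unexposed, cases_exposed):
    """Return (relative risk, additional cases) for two groups of equal size."""
    relative_risk = cases_exposed / cases_unexposed
    additional_cases = cases_exposed - cases_unexposed
    return relative_risk, additional_cases

# Hypothetical figures: 2 cases per 10,000 unexposed people, 4 per 10,000 exposed.
relative_risk, additional_cases = compare_risks(2, 4)

print(f"Relative framing: exposure makes the risk {relative_risk:.0f} times as high.")
print(f"Absolute framing: {additional_cases} additional cases per {GROUP_SIZE:,} people exposed.")
```

Both statements describe identical data. "Doubles the risk" sounds alarming, while "two additional cases in ten thousand" conveys how small the underlying magnitude is.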

Applying these and other educational approaches to the courtroom context has thus far generated mixed results. In one study, participants from a county jury pool had great difficulty inferring causality from the data in a 2×2 contingency table representing evidence in a toxic tort case (one in which the claim is that exposure to a toxic substance caused some person or persons to suffer adverse health effects). None of the various explications by an epidemiologist expert witness (how contingency tables work, how relative risk and odds ratios are calculated, how to properly interpret contingency table data) improved the participants' ability to reach correct inferences about causation or its absence.33 More testing of suggested techniques is needed in trial settings, but we are optimistic that research will find ways that enable attorneys and their expert witnesses to make more comprehensible the evidence they present to juries. Courts might also consider sharing experts' reports with juries, whether or not the parties request it.
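For readers unfamiliar with the quantities the expert witness was explaining, the Python sketch below computes relative risk and the odds ratio from a 2×2 contingency table. The counts are hypothetical and are not taken from the study described above.

```python
def relative_risk(exposed_ill, exposed_well, unexposed_ill, unexposed_well):
    """Risk of illness among the exposed divided by risk among the unexposed."""
    risk_exposed = exposed_ill / (exposed_ill + exposed_well)
    risk_unexposed = unexposed_ill / (unexposed_ill + unexposed_well)
    return risk_exposed / risk_unexposed

def odds_ratio(exposed_ill, exposed_well, unexposed_ill, unexposed_well):
    """Odds of illness among the exposed divided by odds among the unexposed."""
    return (exposed_ill * unexposed_well) / (exposed_well * unexposed_ill)

# Hypothetical 2x2 table:
#               ill   well
# exposed        30     70
# unexposed      10     90
table = (30, 70, 10, 90)

rr = relative_risk(*table)
or_ = odds_ratio(*table)
print(f"Relative risk: {rr:.2f}")
print(f"Odds ratio:    {or_:.2f}")
```

A relative risk or odds ratio of 1 would indicate no association between exposure and illness. Values above 1, as in this invented table, indicate an association, which by itself does not establish causation.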

In addition to clearer presentations, experts could help jurors by conveying more about areas of consensus in their fields. Some experts are criticized for advocating idiosyncratic views at odds with the majority view in their field, but judges and jurors without specialist knowledge have little ability to determine how common or infrequent the allegedly idiosyncratic views are. Though being in the mainstream is no guarantee of correctness, survey studies of experts about where the consensus lies regarding various phenomena could help factfinders put a trial expert's assertions in context.34

Judges have more power to regulate trial structure and proceedings than they typically exercise. Before a trial begins, courts could work harder with the parties to help them resolve disputes and stipulate to conclusions on some, if not all, of the highly technical issues that might arise in a case, thereby removing them from controversy at trial (with jurors instructed on what the agreed-upon conclusions were). “Hot-tubbing,” a procedure used in Australia and Canada that begins with experts meeting together without the parties or their lawyers before the trial, could aid in identifying areas of agreement and disagreement, as Nancy Gertner and Joseph Sanders discuss in their contribution to this issue.35

At trial, judges might improve their own as well as jurors' comprehension by requiring that opposing evidence on difficult scientific or technical issues be offered back-to-back, juxtaposing expert witnesses with competing views on the same topic.36 Thus, instead of hearing from a plaintiff's expert witness and not hearing from the defense's rebuttal expert until much later, a court could order that the direct and cross-examination of the defense witness occur immediately following the direct and cross of the plaintiff's expert. This procedural reform, however, is not without its challenges, as Gertner and Sanders discuss in greater depth.

On the criminal side, resources are often so imbalanced that special funding or other procedures are needed so that the defense is able to present expert evidence and prevent gatekeeping judges and factfinders from reaching decisions based on incomplete or distorted pictures of the state of the science. Our discussion of forensic sciences that lack scientific validity and the large role that forensic science evidence has played in producing wrongful convictions provides more than enough cautionary tales to justify wariness about one-sided presentations by interested experts.37

Optimally educating jurors will require changes in the way courts do things. If trials are to serve the parties and the larger society, means must be found to overcome inherent limitations that exist at the outset of the trial process. Traditional jury trials operate with the assumption that jurors are empty vessels who passively receive the evidence presented by the parties, refrain from forming even preliminary opinions, and wait until the trial has concluded to deliberate and decide the case. In most courtrooms, jurors are not allowed to ask questions of the witnesses or to talk with one another until the end of the trial. In a traditionally conducted complex trial, juror confusion and mistakes in interpreting scientific testimony during the case presentation can neither be detected nor corrected as they occur.

The American Bar Association's 2005 report Principles for Juries and Jury Trials advocates “active jury” trial practices to promote juror understanding.38 Allowing jurors to clarify evidence and issues by permitting them, under carefully controlled conditions, to submit questions for witnesses, and allowing jurors to talk to one another during the trial so they can discuss scientific evidence while it is fresh in their minds, could promote better understanding and use of scientific evidence.

There is now a modest body of research on active jury reforms, including note-taking, question-asking, and juror discussions. Jurors who serve in trials in which they are able to ask questions and to talk with other jurors during breaks have provided generally positive feedback about these changes, and few if any negative effects have been detected.39 Jurors who have the opportunity to take notes also typically perform better.40 One experiment assessing how well mock jurors understood scientific evidence found that those using checklists and jury notebooks performed better than jurors not allowed to employ these innovations.41

Judges and lawyers often greet active jury reforms with skepticism, but most change their views after participating in a trial in which the reforms are employed. The scientists and engineers surveyed by Shari Diamond and Richard Lempert and reported on in this issue also appear to prefer a more educational approach: 57.7 percent said they would be more likely to participate as expert witnesses if they could answer jurors' questions following their testimony.42

Judges have a variety of educational programs in law and science from which to choose. These range from panels lasting an hour or two in continuing judicial education programs, to day-long focused sessions, to four-to-six-week summer courses at universities such as Duke and Virginia. The potentially most useful of these efforts seek to teach judges how to be more thoughtful, critical consumers of specialized knowledge. We are not, however, aware of any systematic empirical attempts to see whether these efforts have enabled judges to better understand the science and technology issues that arise when they preside over trials.

Other programs focus on substantive science. For example, the Federal Judicial Center's (FJC) Education Division collaborates with universities to offer short courses on such topics as neuroscience and law, law and the biosciences, and the economics of antitrust law.43 Former Education Division Director Bruce Clark told us that judicial education programs at the FJC and elsewhere are increasingly using more active, engaged methods of teaching, which seems promising.

The Reference Manual on Scientific Evidence, now in its third edition, also attempts to educate judges on science. It provides well-informed guides to specific scientific fields, written by experts in those fields.44 The goal is to aid judges in managing cases with scientific and technical evidence. Chapters review and explain the science that commonly arises in legal cases, including such matters as DNA analysis, engineering, mental health evidence, survey methodology, epidemiology, and statistics.

Researchers have also suggested tutorials on technical and scientific topics for judges. Litigators Jeffrey Snow and Andrea Reed have outlined an approach to using tutorials to educate judges in patent cases: “The technical tutorial has few common ground rules. In its most general form, the technical tutorial is a non-evidentiary presentation for the educational benefit of the district court judge.”45 They distinguish between an adversarial approach to construction of the tutorial, in which each party has its own experts explain the underlying science, and the possibility of having both parties agree on a neutral court-appointed expert to provide technical background as a witness. Alternatively, the parties might collaborate on a report or video that the judge can review on his or her own. Although tutorials seem useful and judges request them, we know of no research on the effectiveness of technical tutorials in patent cases.

Jurors, too, might receive pretrial education and training through tutorials tailored to the science they are likely to encounter. However, although the idea has been floated and used on at least a few occasions, we know of no jurisdiction where it has been implemented as a routine practice when scientific evidence is involved.46

Research suggests that brief yet effective education in specific intellectual skills is possible. Social psychologist Richard Nisbett and colleagues have developed and tested a training intervention that attempts to teach laypeople the statistical concept of the “law of large numbers.” The intervention consists of two parts: “rule training” involves reading a description of the law of large numbers, and “example training” involves a worksheet containing three sample problems that highlight the various principles of the law of large numbers, followed by a written explanation and analysis of the problems. The greatest improvement in statistical reasoning was achieved by those participants who received both rule- and example-based training.47
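The statistical principle being taught can be illustrated with a simulation: proportions computed from large samples stay close to the true value, while those from small samples swing widely. The Python sketch below is our own illustration using simulated coin flips and is not drawn from the training materials described above. The sample sizes, number of trials, and random seed are arbitrary choices.

```python
import random
import statistics

def mean_abs_error(sample_size, trials, rng):
    """Average distance between a sample's share of heads and the true 0.5."""
    errors = []
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(sample_size))
        errors.append(abs(heads / sample_size - 0.5))
    return statistics.mean(errors)

rng = random.Random(0)  # fixed seed so reruns give the same numbers
small_sample_error = mean_abs_error(10, 500, rng)
large_sample_error = mean_abs_error(1000, 500, rng)

print(f"Typical error with 10 flips:   {small_sample_error:.3f}")
print(f"Typical error with 1000 flips: {large_sample_error:.3f}")
```

The typical error shrinks as the sample grows, which is the intuition a factfinder needs when weighing a conclusion drawn from a handful of cases against one drawn from thousands.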

Using the rule-plus-example approach, Schweitzer and Saks tried to improve upon past (unsuccessful) efforts to train jurors to understand scientific causation.48 The brief, non–case-specific intervention aimed to teach jurors to understand and identify the three requisites of causal inference: temporal precedence, covariation, and nonspuriousness. Jurors' grasp of the concepts was tested by presenting a videotaped mock toxic tort trial. The critical evidence was a study, presented by an expert witness, that tested the causal relationship between the defendant's product and lung disease through either a properly designed experiment or one in which one or another of the key elements of causal inference was absent. Untrained jurors were unable to distinguish the well-designed experiment from any of the defectively designed experiments. Trained jurors were better able to assess the quality of the research, and their verdicts reflected their sounder understanding.

Jonathan Koehler would go further and provide jurors with a “comprehensive pretrial training program” that would teach logical inference, how to distinguish between weak and strong evidence, how to combine pieces of evidence, and how to apply law to facts; test the jurors' performance; and exclude from service those who are not up to par.49 Excluding jurors on these grounds might well undermine the jury's ability to represent the community, however.

Perhaps the most ambitious study to date of jury tutorials is an Australian project in which some mock jurors hearing a DNA case received a DNA tutorial as part of the expert evidence.50 The tutorial, developed in consultation with scientific and forensic experts, devoted twelve minutes to the science of DNA profiling and five minutes to understanding random match probabilities, a key concept in assessing the meaning of a DNA match. Some participants heard an expert orally deliver the tutorial, while others heard an expert give the same talk accompanied by multimedia displays. Still others served in a control condition, receiving no expert evidence. Mock jurors then decided a case in which the DNA evidence was crucial. Most participants began knowing little about DNA. Those who started out knowing the least about DNA tended to express undue belief in DNA evidence; those knowing more about DNA were more skeptical at the start of the trial. The expert evidence that included the DNA tutorial significantly improved jurors' understanding. Compared with those in the control condition, who received no tutorial, those hearing any version of the tutorial showed greater comprehension of DNA identification.

In this study, the multimedia presentation of evidence did not significantly improve comprehension beyond the gains produced by the oral presentation alone, though it did more to close the gap between less knowledgeable jurors and those with greater knowledge. Whether the same would be true of such dramatically new media forms as virtual reality and augmented reality cannot yet be known, but these applications might turn out to be unusually effective and efficient teaching tools.51

The FJC has developed tutorials for use in patent jury trials.52 Roderick McKelvie, then a district court judge in Delaware, encouraged the FJC to prepare a tutorial video to educate juries in patent trials. He joined a group of patent lawyers and judges who contributed to the text for the video, which was then reviewed by the FJC and other experts. The first video, seventeen minutes long, was released in 2002 and updated in 2013. The videos did not seek to educate jurors on the scientific matters at issue in a case, but rather offered background information about what a patent is, the place of patents in society, and the work of the U.S. Patent and Trademark Office (PTO). The FJC aimed “to present a balanced view of the patent process” but cautioned judges to “review it carefully and consult with counsel before deciding whether to use it in a particular case.”53

Some patent lawyers criticized the 2002 video as unbalanced.54 The script did not concern them, but the images did. The visual portrayal of “conscientious, hard-working examiners” seemed to favor patentees, although other images of the “piles” of patent applications and “endless rows” of files seemed to suggest overworked and overwhelmed patent examiners, favoring defendants. One jury-consulting firm presented the 2002 video in mock jury exercises in five venues across the United States.55 Mock jurors' responses before and after seeing the tutorial were compared, showing dramatic improvements in reported understanding of patents. For example, before watching the video, a majority (57 percent) said they did not understand what a patent claim was, but that number dropped to 4 percent after the video. Just 24 percent initially knew that a patent granted by the PTO could be invalidated by a judge or a jury; afterward, that number jumped to 63 percent. The consultant concluded that the video was effective in educating juries about both pro-plaintiff and pro-defense perspectives. A repeat of the study using the 2013 FJC patent video produced similar results.56 Research that examines whether patent tutorials improve juror understanding of expert evidence in patent trials would be of substantial value.

The fact that juries engage in group decision-making allows them to bring more intellectual resources to their task than any one person, including a judge, can deliver. Indeed, juries have the potential, depending upon the methods used to recruit them, to possess knowledge, experience, and analytic capacity that exceed those of most judges. The sheer fact that juries are groups provides advantages. Where all citizens are required to serve, with very limited excuses granted, juries will be composed of people from all kinds of educational and occupational backgrounds. This means they will not infrequently include people with scientific, technical, and quantitative capabilities that few judges possess.57

The more jurors on the jury, the greater the chances of having some who are able to understand difficult subject matter. If the trend toward smaller juries of six or eight cannot be reversed entirely, complex cases at least ought to be tried to twelve jurors because deliberations are likely to be richer with greater educational potential. An individual juror who has a better grasp of the scientific evidence presented at a trial can explain the meaning and significance of the evidence to the other jurors, increasing their ability to properly weigh the scientific information.58

In a mock jury experiment in which mitochondrial DNA (mtDNA) was the focus of expert testimony, researchers examined the impact of deliberation on jurors with lower and higher levels of comprehension.59 Jurors' prior knowledge, as evidenced by science and mathematics courses they had taken, increased their ability to benefit from deliberation. However, mock jurors with lower initial levels of comprehension gained the most from deliberations.

Judges (or special masters appointed by judges to initially hear cases and report back on their findings) are sometimes suggested as an alternative to the jury in complex cases. Several decades ago, there were cases in which lawyers asked courts to recognize a “complexity exception” to the right to a jury trial, arguing that where it was thought that juries could not adequately understand the evidence, the case must be tried to a judge. Appellate courts divided on whether such an exception should exist.60 Regardless of whether a party might be denied a jury, it is worth noting that generalist judges may be no more able to master the intricacies of complex, expert scientific testimony than a representative jury. Reviewing a set of complex cases, Lempert concluded that when judges were competent and well organized, the juries they supervised were effective as well.61 If judges are to be used as an alternative to juries, they might do better if drawn from panels specially chosen for having relevant knowledge or if they sat as three-judge courts.

Complex cases might also be tried by special juries, drawn from pools of people with more formal education or particularly relevant experience or training. Special or “blue ribbon” juries have a long history in England and the United States. The earliest documented special jury convened in England in 1351: a jury of cooks and fishmongers for a defendant charged with selling bad food.62 Other special juries in early England included juries of matrons tasked with determining whether a woman defendant was with child, and a jury of business-people in a business contract case. In the United States, there was a time when almost half the states had special jury statutes for use in cases of high importance or great difficulty, although that number has dwindled. Special juries in the United States are rare today, owing partly to statutory requirements of cross-sectional representation on jury panels, but also to increased appreciation of the fact-finding benefits and symbolic significance of representative juries.63 Even without using a special jury, judges and lawyers could employ voir dire questions to explore scientific competence in an effort to increase the proportion of highly numerate or better-educated jurors.

Numerous studies have found that people with higher educational attainment generally and greater familiarity with mathematics and science in particular are better able to understand scientific and other technical information and to apply that understanding to solving problems.64 People high in numeracy have been found better able than their low-numeracy peers to comprehend and apply numerical principles, and they are somewhat less susceptible to being influenced by framing and other irrelevant factors.65 Research on the dynamics of juries with one or a few such members is limited. But, clearly, the juries they are on have the potential to benefit from their more knowledgeable members. Some studies have found that jurors with relevant knowledge are recognized by their peers and placed in leadership positions.66 To what extent their oversized influence is beneficial or not remains to be discovered.

We raise three caveats about these special juries. First, it is clear that numeracy and advanced education are not panaceas.67 Judges and highly numerate individuals make processing mistakes and are influenced by common heuristics and biases.68 Second, recent research finds that in controversial areas of science, people with substantial backgrounds and advanced education in a field may be more biased in their evaluations than those who are less knowledgeable.69 Relatedly, these highly knowledgeable jurors tend to be disproportionately influential in the jury deliberation, as others defer to their superior knowledge. Third, selecting jurors using one attractive characteristic may have unexpected negative consequences, since individual characteristics do not exist in isolation. More men than women major in science, for example. Educational attainment is linked to race, income, and political affiliation. Blue ribbon juries are likely to fail to adequately reflect the attitudes and experiences of the community, particularly in deciding on matters like damages. Moreover, scientific matters may not be the only matters in dispute; correctly resolving a purely scientific question may be only one part of the decision. As we discussed above, diverse juries composed of people from different parts of the community have their own fact-finding advantages, which could be lost if we selected jurors mainly for their educational attainment.

Generalist judges and lay juries face considerable challenges in trials with scientific evidence. Yet the adversary trial provides us with opportunities to modify procedures or educate or select factfinders to maximize the ability of judges and juries to understand expert scientific evidence and to use it effectively to resolve a case. We have suggested a number of reforms, but more study of possible changes is needed. We must collect data and run experiments; that is, we should take a scientific approach to deciding on those reforms that will best enable judges and juries to cope with modern scientific evidence.


Daubert v. Merrell Dow Pharmaceuticals, Inc., 43 F.3d 1311, 1316 (1995). The common law allowed expert evidence to be introduced only when the expert was testifying to matters beyond the ken of the average juror. This requirement was incorporated into Federal Evidence Rule 702, which allowed for expert evidence in circumstances in which the knowledge would assist the trier of fact in understanding the evidence or determining a fact in issue.


Learned Hand, “Historical and Practical Considerations Regarding Expert Testimony,” Harvard Law Review 15 (1) (1901): 54.


Daniel L. Rubinfeld and Joe S. Cecil, “Scientists as Experts Serving the Court,” Dædalus 147 (4) (Fall 2018).


Brandon Garrett, Convicting the Innocent: Where Criminal Prosecutions Go Wrong (Cambridge, Mass.: Harvard University Press, 2012).


John Lentini, “Fires, Arsons, and Explosions,” in Modern Scientific Evidence: The Law and Science of Expert Testimony, ed. David L. Faigman, David H. Kaye, Michael J. Saks, et al. (Toronto: Thomson Reuters, 2010).


Steve Mills and Maurice Possley, “Texas Man Executed on Disproved Forensics,” Chicago Tribune, December 9, 2004 [describing Willingham case].


National Fire Protection Association, NFPA 921: Guide for Fire and Explosion Investigations (Quincy, Mass.: National Fire Protection Association, 1st ed., 1992; 2nd ed., 1995).


Rachel Dioso-Villa, “Scientific and Legal Developments in Fire and Arson Investigation Expertise in Texas v. Willingham,” Minnesota Journal of Law, Science, and Technology 14 (2013).


National Research Council, Strengthening Forensic Science in the United States: A Path Forward (Washington, D.C.: National Academies Press, 2009); and President's Council of Advisors on Science and Technology, Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods (Washington, D.C.: Executive Office of the President, 2016).


National Research Council, Strengthening Forensic Science in the United States, 1–14 [see note 9].


John Lentini, “Fires, Arsons, and Explosions,” in Faigman et al., eds., Modern Scientific Evidence [see note 5].


National Research Council, On the Theory and Practice of Voice Identification (Washington, D.C.: National Academies Press, 1979); National Research Council, Committee on Scientific Assessment of Bullet Lead Elemental Composition Comparison, Forensic Analysis: Weighing Bullet Lead Evidence (Washington, D.C.: National Academies Press, 2004); and Paul C. Giannelli, “Forensic Science: Daubert's Failure,” Case Western Reserve Law Review (forthcoming).


Michael J. Saks, Thomas Albright, Thomas L. Bohan, et al., “Forensic Bitemark Identification: Weak Foundations, Exaggerated Claims,” Journal of Law and the Biosciences 3 (2016).


Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993).


When he was a federal district judge deciding Daubert on remand from the U.S. Supreme Court, Alex Kozinski (later, chief judge of the U.S. Ninth Circuit Court of Appeals) suggested that forensic science should be exempted from the validity test that he was required to apply to the epidemiological analysis he found inadequate in Daubert. At that time, he believed unreservedly in the claims of forensic science experts, without expecting them to demonstrate empirically the soundness of their assertions. Daubert v. Merrell Dow Pharmaceuticals, Inc., 43 F.3d 1311 (1995). Kozinski, after criticizing experts who acquire their knowledge for purposes of litigation, states that “there are, of course, exceptions. Fingerprint analysis, voice recognition, DNA fingerprinting and a variety of other scientific endeavors closely tied to law enforcement may indeed have the courtroom as a principal theatre of operations…. As to such disciplines, the fact that the expert has developed an expertise principally for purposes of litigation will obviously not be a substantial consideration.” Two decades later, after working with a White House advisory council on the forensic sciences, his views had changed dramatically: “Virtually all of these methods are flawed, some irredeemably so.” “Why,” he asked, “trust a justice system that imprisons and even executes people based on junk science?” See Alex Kozinski, “Rejecting Voodoo Science in the Courtroom,” The Wall Street Journal, September 19, 2016.


Richard Posner and Frank Easterbrook, who at the time were both eminent judges of the U.S. Seventh Circuit Court of Appeals, famously excoriated both judges and lawyers for their fear of science. Jackson v. Pollion, No. 12–2682 (7th Cir. 2013).


N. J. Schweitzer and Michael J. Saks, “The Gatekeeper Effect: The Impact of Judges' Admissibility Decisions on the Persuasiveness of Expert Testimony,” Psychology, Public Policy, and Law 15 (1) (2009): 1–18.


Daniel Kahneman, Thinking, Fast and Slow (New York: Farrar, Straus and Giroux, 2011).


See, for example, Special Committee on Jury Comprehension of the American Bar Association Section of Litigation, Jury Comprehension in Complex Cases (Chicago: American Bar Association, 1989); and Joseph Sanders, Shari S. Diamond, and Neil Vidmar, “Trial Lawyer Perceptions of Expert Knowledge,” Psychology, Public Policy, and Law 8 (2) (2002): 139–153. Richard Lempert, “Civil Juries and Complex Cases: Taking Stock After Twelve Years,” in Verdict: Assessing the Civil Jury System, ed. Robert E. Litan (Washington, D.C.: Brookings Institution Press, 1993), finds that jury performance is more likely to be a problem in those cases that are “complex” because of the technical nature of the evidence; and Joseph Sanders, “The Jury Deliberation in a Complex Case: Havner v. Merrell Dow Pharmaceuticals,” Justice System Journal 16 (2) (1993): 45–67, reaches a similar conclusion based on jurors' answers to researcher questions about testimony in a complex evidence case involving the drug Bendectin, concluding that the jurors had a weak grasp of the science resulting in an indefensible verdict.


Richard H. Thaler, Misbehaving: The Making of Behavioral Economics (New York: W. W. Norton and Company, 2015).


Joseph Sanders, Michael J. Saks, and N. J. Schweitzer, “Trial Factfinders and Expert Evidence,” in Faigman et al., eds., Modern Scientific Evidence [see note 5]; Daniel A. Krauss and Bruce D. Sales, “The Effects of Clinical and Scientific Expert Testimony on Juror Decision Making in Capital Sentencing,” Psychology, Public Policy, and Law 7 (2) (2002): 267–310; and Shari Seidman Diamond, Jonathan D. Casper, Cami L. Heiert, and Anna-Maria Marshall, “Juror Reactions to Attorneys at Trial,” Journal of Criminal Law and Criminology 87 (1) (1996): 17–47.


Jason Schklar and Shari Seidman Diamond, “Juror Reactions to DNA Evidence: Errors and Expectancies,” Law and Human Behavior 23 (2) (1999): 159–184.


Jonathan J. Koehler, “When Are People Persuaded by DNA Match Statistics?” Law and Human Behavior 25 (5) (2001): 493–513; and William C. Thompson, “Are Juries Competent to Evaluate Statistical Evidence?” Law & Contemporary Problems 52 (1989): 9–41.


Koehler, “When Are People Persuaded by DNA Match Statistics?” [see note 23].


Richard Lempert, “Befuddled Judges: Statistical Evidence in Title VII Cases,” in Legacies of the 1964 Civil Rights Act, ed. Bernard Grofman (Charlottesville: University of Virginia Press, 2000).


Gerd Gigerenzer, Calculated Risks: How to Know When Numbers Deceive You (New York: Simon and Schuster, 2002).


Neil Vidmar, “Are Juries Competent to Decide Liability in Tort Cases Involving Scientific/Medical Issues? Some Data from Medical Malpractice,” Emory Law Journal 43 (1994): 885–911; and Neil Vidmar, Medical Malpractice and the American Jury: Confronting the Myths about Jury Incompetence, Deep Pockets, and Outrageous Damage Awards (Ann Arbor: University of Michigan Press, 1997).


Shari Seidman Diamond and Jonathan D. Casper, “Blindfolding the Jury to Verdict Consequences: Damages, Experts and the Civil Jury,” Law and Society Review 26 (3) (1992): 513–564.


For example, see Neil J. Vidmar and Regina A. Schuller, “Juries and Expert Evidence: Social Framework Testimony,” Law and Contemporary Problems 52 (4) (1989): 133–176.


Diamond and Casper, “Blindfolding the Jury to Verdict Consequences” [see note 28]; Mitchell Pacelle, “Contaminated Verdict,” The American Lawyer, December 1986, 75–80; Molly Selvin and Larry Pinkus, The Debate over Jury Performance: Observations from a Recent Asbestos Case (Santa Monica, Calif.: The RAND Corporation, 1987); and Daniel Shuman and Anthony Champagne, “Removing the People from the Legal Process: The Rhetoric and Research on Judicial Selection and Juries,” Psychology, Public Policy, and Law 3 (2–3) (1997): 242–258.


Gerd Gigerenzer, Wolfgang Gaissmaier, Elke Kurz-Milcke, Lisa M. Schwartz, and Steven Woloshin, “Helping Doctors and Patients Make Sense of Health Statistics,” Psychological Science in the Public Interest 8 (2) (2008): 53–96; and Gerd Gigerenzer and Adrian Edwards, “Simple Tools for Understanding Risks: From Innumeracy to Insight,” British Medical Journal 327 (7417) (2003): 741–744.


Rebecca Helm, Valerie P. Hans, and Valerie F. Reyna, “Trial by Numbers,” Cornell Journal of Law and Public Policy 27 (107) (2017): 107–143.


Molly Treadway, “An Investigation of Juror Comprehension of Statistical Proof of Causation” (unpublished Ph.D. diss., Johns Hopkins University, 1990), 54–58.


Saul M. Kassin, V. A. Tubb, H. M. Hosch, and A. Memon, “On the ‘General Acceptance’ of Eyewitness Testimony Research: A New Survey of Experts,” American Psychologist 56 (5) (2001): 405–416. For further discussion, see Shari Seidman Diamond, “Reference Guide on Survey Research,” in Stephen G. Breyer, Margaret A. Berger, David Goodstein, et al., Reference Manual on Scientific Evidence, 3rd ed. (Washington, D.C.: Federal Judicial Center and National Academy of Sciences, 2011), 359–423.


Edie Greene and Natalie Gordon, “Can the ‘Hot Tub’ Enhance Jurors' Understanding and Use of Expert Testimony?” Wyoming Law Review 16 (2) (2016): 359–386. See also Nancy Gertner and Joseph Sanders, “Alternatives to Traditional Adversary Methods of Presenting Scientific Expertise in the Legal System,” Dædalus 147 (4) (Fall 2018).


G. Thomas Munsterman, Paula L. Hannaford-Agor, and G. Marc Whitehead, Jury Trial Innovations, 2nd ed. (Williamsburg, Va.: National Center for State Courts, 2006). See also Gertner and Sanders, “Alternatives to Traditional Adversary Methods of Presenting Scientific Expertise in the Legal System” [see note 35].


Michael J. Saks and Jonathan J. Koehler, “The Coming Paradigm Shift in Forensic Identification Science,” Science 309 (5736) (2005): 892–895.


American Bar Association, Principles for Juries and Jury Trials (Chicago: American Bar Association, 2005); and Valerie P. Hans, “U.S. Jury Reform: The Active Jury and the Adversarial Ideal,” St. Louis University Public Law Review 21 (1) (2002): 85–97.


Hans, “U.S. Jury Reform” [see note 38]; and Valerie P. Hans, David H. Kaye, B. Michael Dann, Erin J. Farley, and Stephanie Albertson, “Science in the Jury Box: Jurors' Comprehension of Mitochondrial DNA Evidence,” Law and Human Behavior 35 (2011): 60–71.


B. Michael Dann and Valerie P. Hans, “Recent Evaluative Research on Jury Trial Innovations,” Court Review 41 (2004): 12–19 [reviewing studies of juror notetaking]; and Lynne ForsterLee and Irwin Horowitz, “The Effects of Jury-Aid Innovations on Juror Performance in Complex Civil Trials,” Judicature 86 (4) (2003): 184–189 [research study showing improved performance for jurors allowed to take notes].


B. Michael Dann, Valerie P. Hans, and David H. Kaye, “Can Jury Trial Innovations Improve Juror Understanding of DNA Evidence?” Judicature 90 (2007): 152–156.


Shari Seidman Diamond and Richard O. Lempert, “When Law Calls, Does Science Answer? A Survey of Distinguished Scientists and Engineers,” Dædalus 147 (4) (Fall 2018).


Federal Judicial Center, “Programs and Resources for Judges, Special Focus Programs,” (accessed May 26, 2017).


Breyer et al., Reference Manual on Scientific Evidence, 3rd ed. [see note 34].


Jeffrey L. Snow and Andrea B. Reed, “Technical Advisors and Tutorials: Educating Judges,” Intellectual Property Litigation 21 (1) (2009): 1, 21–22.


Munsterman et al., Jury Trial Innovations [see note 36].


Richard E. Nisbett, Geoffrey T. Fong, Darrin R. Lehman, and Patricia W. Cheng, “Teaching Reasoning,” Science 238 (1987): 625–631; and Geoffrey T. Fong, David H. Krantz, and Richard E. Nisbett, “The Effects of Statistical Training on Thinking about Everyday Problems,” Cognitive Psychology 18 (1986): 253–292.


N. J. Schweitzer and Michael J. Saks, “Jurors and Scientific Causation: What Don't They Know, and What Can Be Done About It?” Jurimetrics Journal 52 (2012): 433–455.


Jonathan J. Koehler, “Train Our Jurors,” Northwestern University School of Law Faculty Working Paper 141 (Chicago: Northwestern University School of Law, 2006).


Jane Goodman-Delahunty and Lindsay Hewston, “Improving Jury Understanding and Use of Expert DNA Evidence,” Australian Institute of Criminology Technical and Background Paper Series No. 37 (Canberra City: Australian Institute of Criminology, 2010). The tutorials may be found at [dna]; and [rmp].


Reality Technologies, “How Reality Technology is Used in Education.”


Federal Judicial Center, “The Patent Process: An Overview for Jurors,” video presentation, January 1, 2013.




Heather N. Mewe and Darren E. Donnelly, “Going to the Videotape: An Introduction to the Patent System” (Mountain View, Calif.: Fenwick and West, 2006).


John Gilleland, “The Debate is On: Is the Federal Judicial Center's Patent Tutorial Video Too Pro-Plaintiff?” Illinois State Bar Association Newsletter, March 2012, section on intellectual property law.


John D. Gilleland e-mail to Valerie P. Hans, June 6, 2017.


Shari Seidman Diamond, Mary R. Rose, and Beth Murphy, “Embedded Experts on Real Juries: A Delicate Balance,” William and Mary Law Review 55 (2014): 885–933.


Diamond and Casper, “Blindfolding the Jury to Verdict Consequences” [see note 28].


Hans et al., “Science in the Jury Box,” 68 [see note 39].


Richard O. Lempert, “Civil Juries and Complex Cases: Let's Not Rush to Judgment,” Michigan Law Review 80 (1981): 68–132.


Richard O. Lempert, “Civil Juries and Complex Cases: Taking Stock after Twelve Years,” in Verdict: Assessing the Civil Jury System, ed. Robert E. Litan (Washington, D.C.: Brookings Institution Press, 1993), 181–247.


Neil Vidmar and Valerie P. Hans, American Juries: The Verdict (Amherst, N.Y.: Prometheus Books, 2007), 68–69 [on the blue ribbon jury].


Ibid., 69, 74–76. Nonetheless, jury pools still fall short of fully representing the community in many jurisdictions. See ibid., 76–81.


Hans et al., “Science in the Jury Box” [see note 39]; David H. Kaye, Valerie P. Hans, B. Michael Dann, Erin Farley, and Stephanie Albertson, “Statistics in the Jury Box: How Jurors Respond to Mitochondrial DNA Match Probabilities,” Journal of Empirical Legal Studies 4 (2007): 797–834; and B. Michael Dann, Valerie P. Hans, and David H. Kaye, “Can Jury Trial Innovations Improve Juror Understanding of DNA Evidence?” National Institute of Justice Journal 255 (2006): 2–6. See also Darrin R. Lehman, Richard O. Lempert, and Richard E. Nisbett, “The Effects of Graduate Training on Reasoning: Formal Discipline and Thinking about Everyday-Life Events,” American Psychologist 43 (6) (1988): 431–442; Darrin R. Lehman and Richard E. Nisbett, “A Longitudinal Study of the Effects of Undergraduate Training on Reasoning,” Developmental Psychology 26 (6) (1990): 952–960; Michael P. Weinstock, “Cognitive Bases for Effective Participation in Democratic Institutions: Argument Skill and Juror Reasoning,” Theory & Research in Social Education 33 (1) (2005): 73–102; and Valerie F. Reyna and Charles J. Brainerd, “The Importance of Mathematics in Health and Human Judgment: Numeracy, Risk Communication and Medical Decision Making,” Learning & Individual Differences 17 (2) (2007): 147–159.


Ellen Peters, Daniel Vastfjall, Paul Slovic, et al., “Numeracy and Decision Making,” Psychological Science 17 (5) (2006): 407–413.


Diamond and Casper, “Blindfolding the Jury to Verdict Consequences” [see note 28].


Valerie F. Reyna, Wendy L. Nelson, Paul K. Han, and Nathan F. Dieckmann, “How Numeracy Influences Risk Comprehension and Medical Decision Making,” Psychological Bulletin 135 (6) (2009): 943–973.


Kahneman, Thinking, Fast and Slow [general; see note 18]; Helm et al., “Trial by Numbers” [jurors; see note 32]; and Chris Guthrie, Jeffrey J. Rachlinski, and Andrew J. Wistrich, “Blinking on the Bench: How Judges Decide Cases,” Cornell Law Review 93 (2007): 1–43 [judges].


Rebecca Helm and James P. Dunlea, “Motivated Cognition and Juror Interpretation of Scientific Evidence: Applying Cultural Cognition to Interpretation of Forensic Testimony,” Penn State Law Review 120 (1) (2016); and Dan Kahan, Ellen Peters, Maggie Wittlin, et al., “The Polarizing Impact of Scientific Literacy and Numeracy on Perceived Climate Change Risks,” Nature Climate Change 2 (10) (2012): 732–735.