Abstract
Seventeenth-century “chance combinatorics” was a self-contained theory. It had an objective notion of chance derived from physical devices with chance properties, such as casts of dice, combinatorics to count chances and, to interpret their significance, a rule for converting these counts into fair wagers. It lacked a notion of chance as a measure of belief, a precise way to connect chance counts with frequencies and a way to compare chances across different games. The omitted elements were not needed for the theory’s interpretation of chance counts, which consisted in determining which wagers are fair. The theory provided a model for how indefinitenesses could be treated with mathematical precision in a special case and stimulated efforts to seek a broader theory.
1. Introduction
The writing of the history of probability theory has been controlled by one question: How close were earlier ideas to modern probability theory? The traditional starting point is the correspondence between Fermat and Pascal in 1654 on the problem of points, posed by the Chevalier de Méré. It asks how to divide fairly the stakes between two gamblers in an interrupted game of dice.
The question is worth answering. Problems arise if it is the only question asked. Then we are led to a distorted picture of the historical development of ideas of chance. In it, earlier ideas of chance are incomplete or defective and their value lies only in fostering the views we now hold. We should ask a different question: What conception of chance did these earlier figures have?
My goal in this paper is to answer this second question for a conception of chance that came to maturity in the seventeenth-century. The main elements of the theory are recounted in Sections 2 and 3. Its application was limited to traditional games of chance that are played with simple, physical devices, such as cast dice, shuffled cards, and lottery drawings. They supply players and gamblers with a discrete set of equal chance outcomes. The theory’s notion of chance was objective and secured by the physical properties of the device realizing the chance behaviors. While this notion can be subsumed by later probabilistic accounts, it is a cogent notion in its own right. The major computational elements of the theory were the use of combinatorics to determine the chances of compound outcomes and a rule for determining which wagers in a game are fair. This last rule was the principal means of interpreting the chance counts. It enabled gamblers to determine which games favored them and which did not; and it enabled a practical understanding of the significance of various counts of chances.
What might it mean that one outcome has seven chances and another five? It is that a stake of seven on the first to five on the second is a fair wager, since fair stakes are in proportion to the chances. That understanding is conveyed without any actual need to place wagers. The theory deserves a name and I will use “chance combinatorics” for it. Its leading expositors were Girolamo Cardano and Christiaan Huygens and it was a theory widely known and applied in the seventeenth-century.
Section 4 recounts what was missing from chance combinatorics. The theory lacked a notion of chance as a measure of belief; it had no precise connection between chances and frequencies; and it lacked a direct means to compare chances across different games. The theory counted equally like chances and did not form as the fundamental quantity the later notion of probability as the ratio of favorable to all chances. The omission of this ratio was not an oversight. It had no foundational role. That changed when Jacob Bernoulli sought a way to use Huygens’ analysis to assign chances to situations outside the tidy realm of games of chance. His version of the law of large numbers provided the means to recover probabilities a posteriori from frequencies. In the absence of this connection, frequencies were only seen to be loosely connected with chance counts and could not be used to compare chances in different games or to justify the rule for identifying fair wagers.
These omissions are no reason to dismiss chance combinatorics as an incomplete or defective theory. For the theory did not need these missing components to serve its function of assessing the relative chances of outcomes and discerning which are the fair wagers. When Huygens’ 1657 De Ratiociniis in Ludo Aleae presented a game theoretic derivation of the rule for fair wagers, it provided a satisfactory completion to chance combinatorics without consideration of frequencies. It was a self-contained theory, successful in its limited goals. It was stretched to its limits in 1693 when Samuel Pepys tasked Newton with a problem that required chance comparisons across different outcome spaces. The inability of chance combinatorics to make such comparisons explains the otherwise puzzling convolutions of Newton’s analysis.
Section 5 argues that it is hard to see why the 1654 correspondence of Fermat and Pascal has such prominence in our histories. Their analysis was mathematically sophisticated but proceeded fully within chance combinatorics. It used known methods on a known problem and did not add anything of foundational importance. Section 6 examines how historical literature came to elevate the importance of Fermat and Pascal’s correspondence and reports the later literature’s efforts to rectify this overestimation. Since this literature has been too concerned with finding modern probabilistic ideas in earlier writings, it left no place for chance combinatorics, which became the theory that history forgot.
Finally, Section 7 proposes an alternative historiography for probability. The seventeenth-century theory of chance combinatorics provided a model for later theorizing. It showed how indefinitenesses could be analyzed with mathematical precision in the narrow case of games of chance. Its success inspired the project of extending chance combinatorics to a wider range of indefinitenesses and eventually led to modern probabilistic analysis. The mathematical methods of modern analysis, based on the theory of additive measures, vastly outstrip the relatively meager powers of the mathematics of the seventeenth-century. However, interpretations of probability in the present analysis are fragmented. They have failed to provide a univocal understanding of probability that matches the simplicity of seventeenth-century chance combinatorics. In this one aspect, modern analysis falls short of its seventeenth-century model.
2. Chance Combinatorics
In the seventeenth-century, a serviceable theory of chance was brought to completion. It applied specifically to games of chance in which a physical device chose with equal favor among a finite set of outcomes. The theory’s goal was to compute the chances of different outcomes and to discern which bets are equally fair to all players. The notion of chance employed could later be subsumed by that of a uniform probability measure over a finite outcome space. However, since this modern theory was not available in the seventeenth-century, it could not then be understood in this way. In this sense, it was a non-probabilistic theory. The chance of an outcome was assessed by counting the number of primitive, equal chance cases that comprised it. The numerical notion of probability as the ratio of favorable to all chances was absent or merely introduced as an intermediate in other computations.
The pertinent notion of chance could only be loosely connected with frequencies. The precise application came in the theory’s identification of which are the fair wagers. It was the result of practical value to gamblers. Fair wagers could be accepted knowing that no player was advantaged. Deviations from them would be sought if they favored the player and avoided if they did not. It also gave a meaning to chance assessments that extended beyond the confines of the theory. A computation within the theory might tell us that one outcome arises with seven chances and another with five. The import of that difference could be conveyed by identifying a corresponding fair wager: a stake of seven on the first to five on the second, in proportion to the chances. This import could not be recovered precisely using frequencies, since the theory asserted no precise connection to them. It could not interpret equal chance outcomes as those that arise equally often, near enough. It could interpret them as those in which one could take either side of an equal stakes bet without either side gaining an advantage. The components of the theory were:
Setting: A finite set of primitive cases with selection among them by a physical process, such as die casts, whose mechanical operation assured the equality of their chances.
A mathematical component: The combinatorics of counting primitive cases. It answered questions such as “how many combinations of casts of two dice give a sum of seven?”
An interpretive component: for a given game where equal chance outcomes are known, which are the wagers that are equally fair to all players.
Long ago were determined, in the simplest games, the ratios of the chances which are favorable or unfavorable to the players; the stakes and the bets were regulated according to these ratios.
3. What Is in Chance Combinatorics
Two self-contained treatises on chances embody this theory of chance combinatorics. The first is Cardano’s Liber de Ludo Aleae (Book on Games of Chance). The text was, as reported by David (1955, p. 43), written and rewritten in fragments between 1525 and 1565. It was discovered among his papers posthumously and first printed in his collected works as Cardano (1663). Ore (1953) gives an extended analysis of it and includes an English translation, Cardano ([1663] 1953). It is a boisterous work, written by someone who himself gambled extensively and had considerable mathematical abilities. The work is a mixture of practical advice from an experienced gambler, raucous anecdotes, and the mathematical analysis of chance. It includes all the elements of the theory of chance combinatorics and additional constructions, idiosyncratic to Cardano. It is a challenging work to read since it is not always internally consistent and, for this reason, authors like Todhunter (1865, p. 3) are impatient with the text.
Huygens’ (1657) De Ratiociniis in Ludo Aleae (On Reasoning in Games of Chance) was, by contrast, a work of disciplined analysis by a careful mathematician. Its novel contribution was to provide a non-frequentist, game-theoretic justification of the rule that specifies which are the fair wagers. This justification perfected the theory of chance combinatorics. At least two translations into English were prepared within the ensuing half century (Arbuthnot 1692; Brown 1714). It was an influential work.
3.1. Setting: Equal Chance Cases
3.1.1. Cardano’s Principle
The analyses of Cardano’s Liber require that the casting of dice and related devices produce equal chance outcomes. This he makes clear in his Chapter VI ([1663] 1953, p. 189):
6. The Fundamental Principle of Gambling
The most fundamental principle of all in gambling is simply equal conditions, e.g., of opponents, of bystanders, of money, of situation, of the dice box, and of the die itself.
3.1.2. The Archeology of Physical Devices With Chance Properties
Dice and other similar chancy devices have been found in virtually all eras of history, including many centuries BCE.1 Their forms can provide some indication of the concepts of chance entertained by their users.
Since nature does not supply ready-made cubes, their ancient presence suggests that dice makers found it worth the effort to craft a shape that guarantees equal chances for each of the six sides in a fair cast. This presumes that they had a local concept of the chance of each face. We have anecdotal reports of ancient successes in forming fair cubical dice. Hacking (2006, p. 4) found dice in the Cairo Museum of Antiquities “exquisitely well balanced.” David (1955, pp. 6–7) reported less success in her investigations: “Many dice of the classical period have been thrown by the writer and they were nearly all biased but not all in the same way.”2
What complicates matters is that many forms of irregular dice-like devices were also recovered from ancient sites. They are dice that visibly deviate from cubical and irregularly shaped astragali or tali, now commonly known as knucklebones. Their popularity shows that many users were indifferent to whether each face arose with equal chances. Norton (2022) argues that these users would have found these irregular devices fit for their purposes if they approached the chance properties of the overall system globally. In gambling this means that the rules of play were such that each player has equal prospects of winning, independently of the probabilities of the individual device outcomes. In a popular ancient game, four tali are cast and the player takes the pool with a Venus (all four tali different); or must contribute to the pool with a “dog” (all ones).3 That the sides of the tali arise with different chances4 gave no player an advantage. They were all subject to the same rules.
What is important for our purposes is that dice makers gradually abandoned irregular shapes. In Artioli et al.’s (2011) examination of Etrurian dice in the eighth to third centuries BCE, 74 are described as “cube” and 17 as “parallelepiped.” Eerkens and de Voogt (2017, p. 169) examined 110 cubical dice recovered in the Netherlands and found that dice became more regular with time in the past two millennia:
… die symmetry increases steadily over time. Nearly 90% of the dice in our database that date before 650 CE have maximum sides that are more than 5% larger than the minimum (max/min > 1.05). After 1450 CE, less than 40% of the dice are similarly lopsided.
They judged that dice exceeding their 5% limit would be visibly irregular. This study shows that, by the end of the Middle Ages, there was a shift towards regularity in dice. This, along with the abandoning of knucklebones in gambling, indicates a growing concern for physical devices whose individual outcomes have equal chances.
3.1.3. Faked Dice
Another indication of the understanding that a regular die produces equal chance outcomes is the existence of faked dice. They outwardly resemble regular dice but have been surreptitiously tampered with to make the chances of different outcomes unequal. That the tampering must be surreptitious shows that it was not only the cheats who understood the effects of deviations from regularity. David (1955, p. 6) reports what might be a weighted die in Roman times. It has an opening that can be covered with a seal. Her description matches that of ivory dice recovered from the ruins of Pompeii. In a museum photo, two ivory dice with round openings and round plate seals can be seen.5
In sixteenth-century England, gambling was widespread and, with it, much cheating. Aydelotte (1913, Ch. IV) is a revealing compendium of the many forms of cheating taking place. Reproducing a list from an early salacious exposé of many forms of cheating, Anon (1555)6, Aydelotte (1913, p. 91) lists fourteen different types of faked dice in the cheater’s outfit and decodes the form of faking in some of them: Fullams were dice loaded with quicksilver or lead: bristles were those with a short hair set in one side to prevent that face lying on the table. Capell conjectures that gourds were dice hollowed out on one side to accomplish the same result as loading.
“Flats” are dice reduced in length on one axis, so that the faces on that axis are more likely. A “langret” or “barred die” was elongated on the axis with faces marked three and four and thus made a cast of three or four more difficult. A passage in Anon (1555) is reproduced7 in Aydelotte (1913, pp. 91–2):
Lo here saith the chetor to this yong Nouisse, a well fauored die that semeth good and square: yet is the forhed longer on the cater and tray, then any other way, and therefore holdeth the name of a langret, such be also called bard cater [4] tres [3], bicause commonly the longer end will of his owne sway draw downwards, and turne vp to the eye sice [6] sinke [5], deuis [2] or ace [1], the principal vse of them is at Nouem quinque. So long as a paier of bard quater tres be walking on the bord so long can ye cast neither .v. nor .ix. onles it be by a great mischance that the roughnes of the bord, or some other stay, force them to stay and run against their kind. For without quater trey, ye wot that, v. nor .ix. can neuer fall.8
3.2. The Mathematical Component: Combinatorics
The above understanding of equal chance cases is, by itself, only of limited use. Real dice games were played with several dice. In such games, a player needs to be able to compare the chance of, say, a pair of dice yielding a sum of two or a sum of seven. The standard approach is to count the number of equal chance cases comprising each outcome. There is only one such case for the sum of two, but there are six cases for the sum of seven. A novice might imagine that a sum of three comes about only in one way: a one on one die and a two on the other. Someone more adept at case counting would recognize that it can come about in two ways: a one on the first die; and a two on the second; and the reverse.
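The case counting just described can be checked by brute enumeration. The following is a minimal sketch in Python, a modern convenience that is of course no part of the historical sources; the names `FACES` and `ways_for_sum` are introduced here purely for illustration.

```python
from collections import Counter
from itertools import product

# Enumerate all 36 ordered casts of two six-sided dice and tally each sum.
FACES = range(1, 7)
ways_for_sum = Counter(first + second for first, second in product(FACES, repeat=2))

# A sum of two has one case, a sum of three has two, a sum of seven has six.
for total in (2, 3, 7):
    print(total, ways_for_sum[total])
```

Treating the two dice as distinguishable is what separates the adept count from the novice's: (1, 2) and (2, 1) are enumerated as different casts.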
3.2.1. De Vetula
There is ample evidence that this more adept understanding of the combinatorics was in wide circulation for centuries prior to the seventeenth-century. Strong evidence comes in the Latin poem De Vetula (“of the Old Lady”). It is nominally attributed to Ovid but was most likely written in the thirteenth-century by Richard de Fournival, Chancellor of the Cathedral of Amiens. It recounts Ovid’s disappointment in romantic engagements and how he turned to other pursuits. Part of the narrative includes a sustained account of how to count correctly the various combinations of die casts.10 It examines in great detail how the count should go for the casting of three dice. The results are summarized in a table reproduced in Figure 1.
My rather free translation is:
Each compound number has the following number of pips and ways of casting.
3 18 pips 1 casts 1
4 17 pips 2 casts 3
5 16 pips 2 casts 6
6 15 pips 3 casts 10
7 14 pips 4 casts 15
8 13 pips 5 casts 12 [21]
9 12 pips 6 casts 25
10 11 pips 6 casts 27
In all, 108 casts of all the pips.12
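The table can be recomputed by enumeration. The sketch below assumes that “pips” counts unordered combinations of three faces and that “casts” counts ordered throws; that reading, and the names `pips` and `casts`, are assumptions made here for illustration only.

```python
from collections import Counter
from itertools import combinations_with_replacement, product

FACES = range(1, 7)

# "Pips": unordered combinations of three faces, keyed by their sum.
pips = Counter(sum(combo) for combo in combinations_with_replacement(FACES, 3))
# "Casts": ordered throws of three dice, keyed by their sum.
casts = Counter(sum(throw) for throw in product(FACES, repeat=3))

for low in range(3, 11):
    high = 21 - low  # the table pairs each sum with its mirror image
    print(low, high, "pips", pips[low], "casts", casts[low])

# Each row stands for two sums, so the rows total half of the 216 throws.
print(sum(casts[low] for low in range(3, 11)))
```

On this reading the enumeration reproduces the cast counts of the table, with 21 for the sums 8 and 13, and the total of 108. It finds a single pip combination for a sum of 4, namely {2, 1, 1}, where the transcribed row shows 2.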
This was not an obscure manuscript in its time. Bellhouse (2000, p. 126) reports that nearly sixty copies still exist and that the poem was well cited. The earliest copies were produced manually by scribes. There were printed versions in 1479, 1534, and 1662 and even a French adaptation.
3.2.2. Cardano
Girolamo Cardano, in his Book on Games of Chance ([1663] 1953), was adept in the combinatorics of case counting. His presentation, however, is extended and idiosyncratic. Here is a part of his analysis of two die casts ([1663] 1953, p. 198):
In the case of two dice, the points 12 and 11 can be obtained respectively as (6, 6) and as (6, 5). The point 10 consists of (5, 5) and of (6, 4), but the latter can occur in two ways, so that the whole number of ways of obtaining 10 will be 1/12 of the circuit and 1/6 of equality. Again, in the case of 9, there are (5, 4) and (6, 3), so that it will be 1/9 of the circuit and 2/9 of equality. The 8 point consists of (4, 4), (3, 5), and (6, 2). All 5 possibilities are thus about 1/7 of the circuit and 2/7 of equality. The point 7 consists of (6, 1), (5, 2), and (4, 3). Therefore the number of ways of getting 7 is 6 in all, 1/3 of equality and 1/6 of the circuit. The point 6 is like 8, 5 like 9, 4 like 10, 3 like 11, and 2 like 12.13
This passage correctly counts the combinatorics associated with two die casts. For example, he notes that a sum of 10 arises from (5, 5) and from (6, 4), where the latter can arise in two ways, according to which die shows the 6. These 3 equal chance cases are 3/36 = 1/12th of the total number, 36, of chances. Cardano then reports the same result with his idiosyncratic notion of “equality.” We are to imagine the total number of equal chance outcomes to be divided into two equal parts. In this case, half of 36 equal chance cases is 18. An outcome in one such part has the same chance as an outcome in the other half. The three cases corresponding to a sum of 10 then constitute 3/18 = 1/6th of equality.
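Cardano's two ways of reporting a count can be set side by side in a short computation. This is a sketch only: `CIRCUIT`, `EQUALITY` and `ways` are names chosen here, and exact fractions are a modern device that Cardano did not use in this form.

```python
from fractions import Fraction
from itertools import product

CIRCUIT = 36             # all ordered casts of two dice
EQUALITY = CIRCUIT // 2  # Cardano's "equality": half of the circuit

def ways(point):
    """Count the ordered casts of two dice whose faces sum to `point`."""
    return sum(1 for a, b in product(range(1, 7), repeat=2) if a + b == point)

for point in (10, 9, 8, 7):
    n = ways(point)
    print(point, n, Fraction(n, CIRCUIT), Fraction(n, EQUALITY))
```

For the point 8 the exact values are 5/36 of the circuit and 5/18 of equality, which Cardano rounds to “about 1/7” and “2/7.”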
The book contains many more instances of Cardano correctly computing the combinatorics for other chance set ups, including the more complicated case of three die casts. For more details, Ore (1953) provides an extensive analysis of Cardano’s computations and his use of the notion of “equality.”
3.2.3. Galileo
There continued to be sporadic indications of widespread knowledge of combinatoric computations. The most celebrated examples are of computations by Galileo and Newton. In both cases, each was approached by someone with a puzzle concerning die casts. Galileo and Newton obliged by carrying out the calculations. This shows a broader knowledge of these issues concerning dice casts. In order to resolve problems, the leading thinkers of the time were consulted for assistance.14 We have a record of Galileo’s and Newton’s analyses simply because they were prominent enough to have their papers preserved. Chance was not a major topic of research for either of them.
Galileo’s note, “Sopra le scoperte dei dadi” [concerning an investigation on dice], was written sometime between 1613 and 1623 at the instigation of “… him who has ordered me …”15 The content of the note is recounted by David (1962, Ch. 7) and a translation of Galileo’s text is provided in David (1962, Appendix 2). The puzzle analyzed concerns the relative chances of throwing a sum of 9 or of 10 with three dice (or formally the same problem of a sum of 12 or 11). The question is motivated by the fact that, if we neglect the order in which they appear, sums of 9 and 10 both arise from six pip combinations:
9 from {6, 2, 1}, {5, 3, 1}, {5, 2, 2}, {4, 4, 1}, {4, 3, 2} and {3, 3, 3}
10 from {6, 3, 1}, {6, 2, 2}, {5, 4, 1}, {5, 3, 2}, {4, 4, 2} and {4, 3, 3}
The illusion that they have equal chances is dispelled, Galileo correctly notes, when we count how many ways each of these six pip combinations can be cast. If all the pips are unequal, such as {6,2,1}, they can arise in six casts. If two only are equal, such as {5,2,2}, they can arise in three casts. If all pips are equal, such as {3,3,3}, it can arise in only one cast. Multiplying by these factors, we have:
9 from 6x{6, 2, 1}, 6x{5, 3, 1}, 3x{5, 2, 2}, 3x{4, 4, 1}, 6x{4, 3, 2} and 1x{3, 3, 3}
10 from 6x{6, 3, 1}, 3x{6, 2, 2}, 6x{5, 4, 1}, 6x{5, 3, 2}, 3x{4, 4, 2} and 3x{4, 3, 3}
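Summing the factors gives 25 casts for a sum of 9 and 27 for a sum of 10, out of the 216 casts of three dice. A minimal sketch of the multiplication rule follows; the helper `casts_for` is a name introduced here, and it obtains the factors 6, 3 and 1 from the modern formula for arrangements with repeated elements rather than from Galileo's own wording.

```python
from collections import Counter
from math import factorial

def casts_for(combination):
    """Number of ordered casts that show the given unordered pips."""
    arrangements = factorial(len(combination))
    for repeats in Counter(combination).values():
        arrangements //= factorial(repeats)
    return arrangements

NINE = [(6, 2, 1), (5, 3, 1), (5, 2, 2), (4, 4, 1), (4, 3, 2), (3, 3, 3)]
TEN = [(6, 3, 1), (6, 2, 2), (5, 4, 1), (5, 3, 2), (4, 4, 2), (4, 3, 3)]

total_nine = sum(casts_for(c) for c in NINE)
total_ten = sum(casts_for(c) for c in TEN)
print(total_nine, total_ten)
```

The two totals agree with the cast counts for the sums 9 and 10 in the De Vetula table.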
3.2.4. Newton
Newton’s combinatoric calculations came in response to a query from Samuel Pepys.17 It concerned the comparison of three outcomes: securing at least one six on a cast of six dice; at least two sixes on a cast of 12 dice; or at least three sixes on a cast of 18 dice. Newton formulated the problem in a letter replying to Pepys of December 16, 1693, as a question concerning fair wagers (Newton 1961, p. 299):
A hath six dice in a box, with which he is to fling at least one six, for a wager laid with R. B hath twelve dice in another box, with which he is to fling at least two sixes, for a wager laid with S. C hath eighteen dice in another box, with which he is to fling at least three sixes, for a wager laid with T. The stakes of R, S, & T, are equal; what ought A, B, & C to stake, that the parties may play upon equal advantage?
A stakes 31,031 to R’s 15,625
B stakes 1,346,704,211 to S’s 830,078,125
A stakes 665l. 2s. 1/2d. to R’s (1,000l − 665l. 2s. 1/2d.)
B stakes 618l. 13s. 4d. to S’s (1,000l − 618l. 13s. 4d.)19
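The chance counts behind these stakes can be recovered by counting casts. The sketch below is a modern reconstruction, not Newton's own procedure: the function `chances_at_least` is a hypothetical helper, and it uses binomial coefficients to count the casts with fewer than the required number of sixes.

```python
from math import comb

def chances_at_least(sixes, dice):
    """Return (casts with at least `sixes` sixes, remaining casts) for `dice` dice."""
    total = 6 ** dice
    # Casts with exactly k sixes: choose which dice show a six, 5 faces for the rest.
    fewer = sum(comb(dice, k) * 5 ** (dice - k) for k in range(sixes))
    return total - fewer, fewer

for_a, against_a = chances_at_least(1, 6)
for_b, against_b = chances_at_least(2, 12)
print(for_a, against_a)
print(for_b, against_b)

# Share of a pool of 1,000 pounds staked by A and by B, in decimal pounds.
share_a = 1000 * for_a / (for_a + against_a)
share_b = 1000 * for_b / (for_b + against_b)
print(round(share_a, 3), round(share_b, 3))
```

The counts for A and B are on different scales, 6^6 and 6^12 casts respectively, which is why restating both wagers as shares of a common pool of 1,000l is needed before they can be compared.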
3.3. The Interpretive Component: Fair Wagers
The first two components of the theory of chance combinatorics enable the association of different numbers of equal chance outcomes to the outcomes of interest. The primary application of these number counts to game play was a specification of which are the fair wagers; that is, the wagers that favor no gambler in a game. Their identification also served to interpret the import of different chance counts for those who do not gamble. The simplest case has two outcomes with the same number of chances for each, such as a head or a tail on a coin toss, or an even or an odd number on a single die cast. A fair wager has gamblers placing equal stakes on each of the two outcomes. The winner then collects both stakes. What if two outcomes have different numbers of equal chances: one is associated with M chances and the other with N? Then the rule is that the stakes should be in proportion to the number of chances. A fair wager is M on the first and N on the second; or 2M on the first and 2N on the second; and so on. Knowing which are the fair wagers is of great practical utility. Any deviation from the fair wagers will favor one gambler over the other; and prudent gamblers will always ensure that the deviations favor them.
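From a modern standpoint, the rule can be checked by computing the average net gain of one gambler as a chance-weighted mean. This is an anachronistic check, offered only as a sketch: chance combinatorics did not have this justification available, and the function `expected_gain` and the values M = 7 and N = 5 are illustrative assumptions.

```python
from fractions import Fraction

def expected_gain(chances_for, chances_against, stake_for, stake_against):
    """Chance-weighted mean net gain per play for the backer of the first outcome."""
    total = chances_for + chances_against
    win = Fraction(chances_for, total) * stake_against   # collects the other stake
    loss = Fraction(chances_against, total) * stake_for  # forfeits own stake
    return win - loss

print(expected_gain(7, 5, 7, 5))  # stakes in proportion to the chances
print(expected_gain(7, 5, 6, 6))  # equal stakes on unequal chances
```

With stakes of 7 and 5 the mean gain is zero for both gamblers. With equal stakes of 6 the backer of the seven-chance outcome gains 1 per play on this reckoning, which is the kind of deviation a prudent gambler would seek.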
3.3.1. Cardano
Cardano was keenly interested in the conditions under which a wager was fair. He was not just a disinterested theoretician. He gambled frequently. In elaborating his “Fundamental Principle of Gambling,” Cardano explained his concerns in stark terms ([1663] 1953, pp. 189–90):
The most fundamental principle of all in gambling is simply equal conditions, e.g. of opponents, of bystanders, of money, of situation, of the dice box, and of the die itself. To the extent to which you depart from that equality, if it is in your opponent’s favor, you are a fool, and if in your own, you are unjust.
So there is one general rule, namely, that we should consider the whole circuit, and the number of those casts which represents in how many ways the favorable result can occur, and compare that number to the remainder of the circuit, and according to that proportion should the mutual wagers be laid so that one may contend on equal terms.
… if, therefore, the player who wants an ace, deuce, or trey were to wager three ducats and the other player one, then the former would win three times and would gain three ducats; and the other once and would win three ducats; therefore in the circuit of 4 throws they would always be equal. So this is the rationale of contending on equal terms; if, therefore, one of them were to wager more, he would strive under an unfair condition and with loss; but if less, then with gain. ([1663] 1953, p. 200)
The connection to frequencies is intuitively compelling. However, it is theoretically imprecise. For at this stage of the development of chance notions, there was no precise connection between the ratios of chances and the frequencies of their occurrence. If Cardano intended to use frequencies to justify the rule, then his analysis contradicts his recognition (reported in Section 4.2 below) that frequencies and chances do not reliably match.
3.3.2. Port-Royal Logic
The Port-Royal Logic (Arnauld and Nicole 1662) does not have any sustained treatment of chance and gambling. However, it does report the rule of a fair wager. The rule is used to demonstrate that playing a lottery is a poor choice, since the operator’s overhead makes the play unfair. The text that illustrates fair wagers reads (1662, pp. 384–5):21
There are games in which, if ten persons each put in a crown, only one wins the whole pot and all the others lose. Thus each person risks losing only a crown and may win nine. If we consider only the gain and loss in themselves, it would appear that each person has the advantage. But we must consider in addition that if each could win nine crowns and risks losing only one, it is also nine times more probable for each person to lose one crown and not win the nine. Hence each has nine crowns to hope for himself, one crown to lose, nine degrees of probability of losing a crown, and only one of winning the nine crowns. This puts the matter at perfect equality.
This text might be justifying the rule by means of frequencies, but there is insufficient evidence to conclude so definitively. If the phrase “nine times more probable” means “nine times more frequent,” then it is a frequency justification. The terms, in French, probable and probabilité appear in many places in the Port-Royal Logic. However, their use is informal and roughly equivalent to “likely” and “likelihood.” No explicit account of the meanings of the terms is given.
3.3.3. Newton
The rule was in broad, explicit use. It appears without apparently needing any justification as part of Newton’s analysis of the problem posed to him by Samuel Pepys, discussed above. Newton’s formulation is inserted in passing in the middle of his letter of December 16, 1693, to Pepys (Newton 1961, p. 299): “for their stakes must be as their expectations, that is, as the number of chances which make for them.”
3.3.4. Ozanam
Jacques Ozanam’s (1694) Récréations Mathématiques et Physiques was an introductory survey of the useful mathematics and science of his time. It extends from simple ideas in arithmetic through geometry to astronomy. Familiar problems of chance are treated fully within the chance combinatoric theory, without mention of “probability.” An illustration of Ozanam’s analysis is his treatment of the problem of points (1694, pp. 69–76).
This classic problem, through Pascal and Fermat’s treatments, figures centrally in the development of theories of chance and probability. In its simplest form, two players gamble in successive games, with the successful player winning one point in each round. When one player achieves some predetermined number of points, that player takes all the stakes and the gambling is over. If play must halt before that termination, what is a fair division of the stakes?
Ozanam considered several instances of the problem. The simplest is that play is halted when the first player lacks two points and the second lacks three points. Ozanam had first solved the problem using the “arithmetical triangle” (Pascal’s triangle). Ozanam then provided a simplified analysis in which he explicitly displayed all the permutations. In this case, one of the players must win sometime over the next four games. He wrote “a” for “player one wins one game” and “b” for “player two wins one game.” He then displayed all possible permutations in his figure (1694, p. 75):
aaaa | aabb | abbb |
aaab | abba | babb |
aaba | bbaa | bbab |
abaa | baab | bbba |
baaa | baba | bbbb |
     | abab |      |
On Divisions in Games
In game play, one calls a division [Parti] the fair distribution, or the rule [of division] that should be applied to several gamblers who are at play and who play up to a certain number of points. [The stakes are divided] proportionally to that which each has a right to hope for by fortune according to the number of points he lacks for completion. (1694, p. 69)
Thus the division of the first player is to the division of the second as 11 is to 5, etc. (1694, p. 76)
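The ratio of 11 to 5 can be recovered by enumerating the same sixteen permutations. The following sketch uses Ozanam's letters “a” and “b”; the constants `NEEDS_A`, `NEEDS_B` and `GAMES` are names introduced here for the case in which the first player lacks two points and the second lacks three.

```python
from itertools import product

NEEDS_A, NEEDS_B = 2, 3        # points each player still lacks
GAMES = NEEDS_A + NEEDS_B - 1  # four further games are sure to settle the match

# Every equal chance sequence of four games, written as in Ozanam's figure.
sequences = ["".join(s) for s in product("ab", repeat=GAMES)]
wins_a = [s for s in sequences if s.count("a") >= NEEDS_A]
wins_b = [s for s in sequences if s.count("b") >= NEEDS_B]
print(len(sequences), len(wins_a), len(wins_b))
```

Playing out all four games, even where the match would already be decided, is what makes the sixteen sequences equal chance cases that can simply be counted.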
3.4. Huygens’ Completion
So far, chance combinatorics specifies which are the fair wagers in games of chance. A weakness of the theory is that the rule used is declared, but without a rigorous basis. An informal basis is in the loose connection to frequencies. If stakes contributed by gamblers are in proportion to the chance of each winning, then each will, over repeated plays, win as much as they lose. That is so if the frequencies of wins and losses match exactly the chances of wins and losses. As every experienced gambler knows, that match is at best approximate and thus is insufficient to give a rigorous basis for the determination of which are the fair wagers.
Huygens’ (1657) De Ratiociniis provided the basis.22 Huygens commenced his analysis by presenting various forms of the problem of points. He then needed to justify his judgment of which division of the stakes was fair. To that end, he laid down a foundational proposition: “that my expectation to win something is worth just such a sum as would get me the same expectation in a fair game” (1657, p. 521).23 Here Huygens introduced the notion of an expectation. It corresponds to the modern probabilistic notion of expectation. Huygens used it to simplify the notion of what are fair stakes in a wager. That notion is a compound notion since it applies to the stakes that should be committed by two or more players in a game of chance. Huygens saw that the conditions of a fair wager can be recovered from a simpler notion, the expectation that involves only one player in a chance situation.
Huygens proceeded to compute expectations in a sequence of propositions that deal with chance situations of increasing complication. The method of analysis employed is the same in each. It is sufficient to look at one case to see how the method works. Huygens’ Proposition II applies when someone has equal chances of obtaining amounts a, b, or c. Huygens showed that the expectation in this case is (a + b + c)/3.
This result is unsurprising if we conceive of equal chance outcomes as arising with equal frequencies. For then each of a, b, or c would arise one third of the time in repetitions and the average return would just be the expectation indicated, (a + b + c)/3. Huygens did not mention a connection to frequencies. He was in no position to use them as a precise basis for his determination of the expectation. For these three equal chance outcomes will only rarely each arise in exactly one third of repetitions. Significant deviations from equal frequencies are quite possible.
Huygens’ strategy amounted to finding a surrogate for these equal frequencies in the form of three gamblers in a particular fair game.24 The exact equality of opportunity provided each gambler by the fair game provides the exact equality that frequencies could not provide. The game has three equal chance outcomes in which the three players win the amounts a, b, or c in cyclic permutations, such as in Table 1.
Outcome | Player 1 | Player 2 | Player 3 |
---|---|---|---|
Outcome 1 | a | b | c |
Outcome 2 | b | c | a |
Outcome 3 | c | a | b |
The game is fair in so far as no player has any advantage or disadvantage in these payoffs. To preserve this equality, each player should stake the same amount. Since the payoffs require a total stake of (a + b + c), each player must stake (a + b + c)/3. If they stake any less, there will not be enough to cover the payoffs; if they stake any more, there will be an undistributed surplus.
Huygens then used his foundational proposition to infer that (a + b + c)/3 is the appropriate expectation of the chance situation supposed in his Proposition II. Huygens’ analysis requires this foundational proposition for this inference since the original chance situation and the fair game analyzed are different cases. The foundational proposition fills the gap. It asserts that they are equivalent in matters of expectation.25
Huygens’ analysis included complications that, as far as I can see, provided no benefits to the analysis. He supposed that each player stakes an amount x and that one player wins with equal chance the total stakes of 3x. To arrive at the payoffs in Table 1, Huygens supposed that each player has entered into contracts with the other players. They are:26
Player 1-Player 2: if either wins, the winner gives the loser b from the winnings.
Player 1-Player 3: if either wins, the winner gives the loser c from the winnings.
Player 2-Player 3: if either wins, the winner gives the loser a from the winnings.
The overall effect is that each player is playing under the same conditions so that fairness is maintained. After the game is played and the winning player completes the agreed contracts, the returns to each player are as given in Table 2.
Result | Player 1 nets | Player 2 nets | Player 3 nets |
---|---|---|---|
Player 1 wins | 3x − b − c | b | c |
Player 2 wins | b | 3x − a − b | a |
Player 3 wins | c | a | 3x − a − c |
To return the fairness of the game, the winning player in each case must net the amount that Table 1 assigns: a, c, and b for Players 1, 2, and 3 respectively. For example, we must have for Player 1, 3x − b − c = a. A little algebra then shows the expected result, x = (a + b + c)/3. Huygens then noted the obvious extension. If there were equal chances of four amounts, a, b, c, and d, then the expectation would be (a + b + c + d)/4; and so on for larger numbers of amounts.
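The algebra can be checked mechanically. What follows is a minimal sketch in Python, not anything found in Huygens' text; the function names and the sample amounts 3, 5 and 10 are illustrative assumptions. It rebuilds the net returns of Table 2 at the stake x = (a + b + c)/3 and confirms that each player then nets a, b and c exactly once across the three equal chance outcomes, as in Table 1.

```python
from fractions import Fraction

def huygens_stake(a, b, c):
    """Solve 3x - b - c = a for the common stake x."""
    return Fraction(a + b + c, 3)

def table_2(a, b, c):
    """Net returns after the side contracts; rows = winner, columns = player."""
    pot = 3 * huygens_stake(a, b, c)  # total stakes, won by one player
    return [
        [pot - b - c, b, c],  # Player 1 wins and pays out b and c
        [b, pot - a - b, a],  # Player 2 wins and pays out b and a
        [c, a, pot - a - c],  # Player 3 wins and pays out c and a
    ]

# Hypothetical amounts, chosen only for illustration.
a, b, c = 3, 5, 10
nets = table_2(a, b, c)
for player in range(3):
    column = sorted(row[player] for row in nets)
    # Each player nets a, b and c exactly once over the three outcomes.
    assert column == sorted([a, b, c])
    # The average net over the equal chances equals the stake.
    assert sum(column) / 3 == huygens_stake(a, b, c)
print(huygens_stake(a, b, c))  # 6
```

Exact rational arithmetic is used so that the equality of stake and average return is checked without rounding.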
Huygens’ Proposition III considers the situation in which we have p chances of gaining a and q chances of gaining b, all chances being equal. The expectation is readily recoverable as (pa + qb)/(p + q). While Huygens did not then note it, this Proposition is sufficient to return the then standard rule for a fair wager. To recover it, we set b = 0 and imagine a game in which the first player wins a with p chances and the second player wins a with the remaining q chances.
The expectations are then for player 1, pa/(p + q); and for player 2, qa/(p + q). That is, their expectations are in the ratio of their chances of winning, p to q, which is the ratio prescribed for stakes in a fair wager.
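The recovery of the fair wager rule can be illustrated numerically. The sketch below is a modern restatement under stated assumptions: the function names are hypothetical, and the chance counts 11 and 5 with a pot of 16 are sample inputs chosen to echo the 11-to-5 division quoted from Ozanam above.

```python
from fractions import Fraction

def expectation(p, a, q, b):
    """Expectation with p equal chances of gaining a and q equal chances of gaining b."""
    return Fraction(p * a + q * b, p + q)

def fair_stakes(p, q, pot):
    """Expectations of two players when the whole pot goes to one of them."""
    first = expectation(p, pot, q, 0)   # wins the pot in p chances, else nothing
    second = expectation(q, pot, p, 0)  # wins the pot in q chances, else nothing
    return first, second

# Hypothetical inputs: 11 chances against 5, with a pot of 16.
first, second = fair_stakes(11, 5, 16)
print(first, second)  # 11 5
```

The two expectations sum to the pot and stand in the ratio p to q, which is what the rule for fair stakes requires.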
The main results of the remainder of Huygens’ De Ratiociniis use these propositions to determine the fair division of stakes in various versions of the problem of points. While the present historical literature has recognized the innovative game theoretic character of Huygens’ analysis, Shafer (2019) goes beyond this literature. He recognizes that Huygens’ analysis was not just a precursor to the later frequency-based probabilistic analysis. It is an alternative mode of analysis.
4. What is Missing from Chance Combinatorics
The elements of chance combinatorics just sketched can be fitted into a modern probabilistic account. What distinguishes chance combinatorics from the later theory is what is missing. The most obvious is that modern probabilistic analysis is routinely applied to outcome spaces of not just infinite but continuum sizes, using the theory of additive measures. The outcome space of chance combinatorics is finite. It is restricted, for example, to all possible combinations of the finite number of die casts in some game. This omission needs no further discussion. Others, however, are more interesting. They are:
No notion of chance as a measure of belief. Its chance notion is an objective property of the physical device with chance properties and its use. It was not a measure of subjective belief, whether well warranted in the evidence or not.
No precise connection to frequencies. While it was recognized that higher chance outcomes occurred more often, there was no precise rule connecting frequencies to chances.
No probability. The ratio of the favorable number of chances to all chances is not distinguished as a measure of chance and, if it appears at all, it is only as an intermediate in calculations.
No comparison of chances across different games. The relative numbers of chances assigned to various outcomes only enabled the direct comparison of the chanciness of outcomes in the same game.
4.1. No Subjective Conception of Chance
There is now a great variety of probabilistic concepts. We divide them loosely into objective and subjective notions and then find in each further subdivisions. The chance notion of chance combinatorics was much narrower. It was simply an objective notion that codified the chance behaviors of physical devices like cast dice, shuffled cards and lottery drawings; and its practical manifestation lay in the identification of which are the fair wagers in each game. The primary texts of chance combinatorics, notably Cardano (1663) and Huygens (1657), employ only this limited notion of chance.
Change was coming.27 Many would soon seek to take this notion of chance from probabilistic devices and use it in more general cases. Chapter 15 of the first edition (Arnauld and Nicole 1662) of the Port Royal Logic had already moved in that direction. There, the Logic admonishes us to make decisions in the face of uncertainty with considerations of the probability of both the good and bad outcomes. The recommendation is clarified by recounting the chances of a fair game and of a lottery, which is deplored as an unfair game. While the Logic’s connection to games of chance has the flavor merely of a useful analogy, Jacob Bernoulli, in his posthumously published Ars Conjectandi (1713), is more direct in making the connection. Part IV recounts a fictional murder and sets out to use chance computations to aid in determining the culprit.
An extensive, modern literature treats the history of the merging of these objective and subjective notions. The primary thesis of Hacking’s (2006) Emergence of Probability is that this merging around 1660 marked the birth of the notion of probability. The two senses he identified are (p. 12) “statistical, concerning itself with stochastic laws of chance processes” and “epistemological, dedicated to assessing reasonable degrees of belief in propositions quite devoid of statistical background.” His verdict is strong: “I say, with only very slight reservations that there was no probability until about 1660.” The definiteness of this moment of creation has been challenged by Garber and Zabell (1979). Franklin’s (2015) Science of Conjecture provides a very detailed examination of notions of probability, especially qualitative notions, up to the seventeenth-century.
The issues raised in this literature are historically of the greatest interest. Just when and how did the two senses of probability merge? The answer, however, is unimportant as far as the cogency of chance combinatorics is concerned. For that theory existed already in a polished and a self-contained form prior to the merging of its notion of chance with the subjective notion.
4.2. No Precise Connection to Frequencies
Chance combinatorics had no precise results connecting chances and frequencies. This was no flaw in the theory. No such results were needed to achieve the theory’s goal of identifying the fair wagers. Since this identification was unequivocal, it was practically of great use to gamblers. The vaguer results on frequencies sketched below could only provide vaguer guidance.
While there were no precise results connecting the number of chances for an outcome and the frequency of its occurrence in repeated plays, it had been long recognized that there is a loose connection. The discussion of the combinatorics of die casts is introduced in Book 1 of de Vetula by noting that not all casts are of equal value when three dice are cast. The “value” is tied to their frequency. The extreme sums of faces, 3 = 1 + 1 + 1 and 18 = 6 + 6 + 6, can each arise in only one way.
The remaining sixteen sums may be produced in multiple ways and, the text notes, arise more frequently according to how close they are to the middle sums of 10 and 11:28
On three dice there are eighteen [configurations],
Of which only three can be on top of the dice.
These vary in different ways and from them,
Sixteen compound numbers are produced.
They are not, however,
Of equal value, since the larger and the smaller of them
Come rarely and the middle ones frequently,
And the rest, the closer they are to the middle ones,
The better they are and more frequently they come.
… the die has six [faces and points]; in six casts each point should turn up once; but since some will be repeated, it follows that others will not turn up.
Without some version of the law of large numbers, Cardano had no way to make precise the association of chances and frequencies. Nonetheless, he tried. For the casting of two dice, he noted ([1663] 1953, p. 195):
But the throw (1, 2) can turn up in two ways, so that for it there is equality in nine casts; and if it turns up more frequently or more rarely, that is a matter of luck.
Cardano recognized that the connection of chances and frequencies is improved if there are many trials. He considered 3,600 casts of two dice and considered an outcome that can happen or not with equal chances. This case he calls “equality” since it obtains in one half of his “circuit,” which is the full set of possible outcomes. He wrote (Cardano [1663] 1953, p. 196):
Moreover, a repeated succession, such as favorable points occurring twice, arises from circuits performed in turn; for example, in 3,600 casts, the equality is ½ of that number, namely, 1,800 casts; for in such a number of casts the desired result may or may not happen [with equal probability].30
A direct association of chances and frequencies would require the outcome in 1,800 of the 3,600 casts. Yet, in so far as I can follow his text, the best Cardano can assure his readers is of some vaguely delimited approximation to this expectation. This imprecise connection between frequencies and chances could not be used to justify the precise rule that identifies the fair wagers. With this mode of justification precluded, we can appreciate Huygens’ (1657) acumen in using a game theoretic approach to justifying the rule.
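How loose the connection is can be quantified with later tools that Cardano lacked. The following sketch uses the binomial distribution, an assumption imported from modern probability theory and not part of chance combinatorics; the function name and the half-width of 30 casts are illustrative choices. It computes the chance that an even-odds outcome occurs exactly 1,800 times in 3,600 casts, and the chance that it falls within 30 of that number.

```python
from fractions import Fraction
from math import comb

def prob_within(trials, half_width):
    """Chance that an even-odds outcome occurs within half_width of trials/2 times."""
    centre = trials // 2
    favorable = sum(
        comb(trials, k)
        for k in range(centre - half_width, centre + half_width + 1)
    )
    return Fraction(favorable, 2 ** trials)

exact = float(prob_within(3600, 0))   # exactly 1,800 occurrences
near = float(prob_within(3600, 30))   # between 1,770 and 1,830 occurrences
print(f"exactly 1,800: {exact:.4f}")
print(f"within 30 of 1,800: {near:.4f}")
```

On this modern reckoning an exact match is rare, on the order of one percent, while an approximate match is common. That is consistent with the point that frequencies could offer only approximate guidance.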
Discerning more precisely the connection between chances and frequencies became more pressing with attempts to extend computations with chances beyond the physical devices of games of chance. This extension was the project of Book IV of Jacob Bernoulli’s (1713) Ars Conjectandi. Chapter IV of Book IV recalls that identifying the equal chance cases for physical devices is a solved problem. Because of the physical equality of die faces, each face arises with equal chance in die casts. How, Bernoulli then asks rhetorically, are we to estimate the chances of various fatal diseases in old age, of various future weather conditions, or of the prospects of players according to their shrewdness or agility? Bernoulli answers that these chances may be estimated “a posteriori” from the observed frequencies of occurrences. Here is one of several of Bernoulli’s examples ([1713] 2006, p. 327):
If, for example, there once existed three hundred people of the same age and body type as Titius now has, and you observed that two hundred of them died before the end of a decade, while the rest lived longer, you could safely enough conclude that there are twice as many cases in which Titius also may die within a decade as there are cases in which he may live beyond a decade.
… as the number of observations increases, so the probability increases of obtaining the true ratio between the numbers of cases in which some event can happen and not happen, such that this probability may eventually exceed any given degree of certainty. ([1713] 2006, p. 328)
4.3. No Probability, No Direct Comparison of Chance Across Different Games
4.3.1. No Probability
We now routinely compare chance events in different outcome spaces. The probability of being struck by lightning31 in any one year (1/500,000) is roughly half the probability of tossing 18 heads in a row (1/2^18 = 1/262,144). These comparisons were not supported by chance combinatorics. The omission is not reported in the secondary literature. It is very easy to fail to notice what is not there!
The main reason for the omission, I believe, is that the comparison of chances across different games was not needed for a prime application of chance combinatorics: discerning the fair wagers. For this purpose, all that matters are the relative chances of outcomes in the one game, or to use the later term, the one outcome space. As a result, the measure of uncertainty, the counting of chances, provides no useful comparison over different outcome spaces. For example:
When two dice are cast, there are five chances that a sum of six can arise.
When three dice are cast, there are ten chances that a sum of six can arise.
While five is less than ten, we cannot now conclude that the first outcome is less likely than the second.
We now make the comparison across outcome spaces by computing the probability of each outcome. That is, we form the ratio of the number of favorable, equal chance cases to the total number of equal chance cases. The probabilities for the above dice problems are 5/36 and 10/216 respectively. These ratios carry a much broader significance for us. Using the law of large numbers, they give an estimate of the frequency of occurrence of the two outcomes in repeated plays; and using each as the parameter in a binomial distribution, we can assess how likely it is for the actual frequency to be close to this estimate.
Terms like “probable” and “probability” appeared sometimes in writings within chance combinatorics as a qualitative notion. Probability in its quantitative, modern sense does not appear. To someone searching for an anticipation of modern probability theory, this is a curious and disappointing failure—a missed opportunity. For someone working within chance combinatorics, it is otherwise. Nothing prevented the formation of the ratio. It is a simple arithmetic division. There was no point in doing it. It would be merely an idle recalibration of the chances. It does not support the conclusion that the first outcome will happen 5 times in 36 casts; and the second 10 times in 216 casts. The best that could be said is that something like these frequencies would arise, but deviations from them should be expected.
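The chance counts for the two dice problems can be checked by brute-force enumeration. This is a minimal sketch, assuming that every ordered cast of the dice counts as one equal chance case; the function name is hypothetical. It reports the favorable and total chances for a sum of six and then forms the modern ratio that permits comparison across the two games.

```python
from fractions import Fraction
from itertools import product

def chance_count(num_dice, target):
    """Return (favorable, total) equal chance casts giving the target sum."""
    casts = product(range(1, 7), repeat=num_dice)
    favorable = sum(1 for faces in casts if sum(faces) == target)
    return favorable, 6 ** num_dice

for num_dice in (2, 3):
    favorable, total = chance_count(num_dice, 6)
    # The ratio is the later notion of probability, absent from the older theory.
    print(num_dice, favorable, total, Fraction(favorable, total))
```

The raw counts order the two outcomes one way and the ratios order them the other way, which is why counts alone cannot be compared across outcome spaces.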
The modern sense of probability is in neither Cardano (1663) nor Huygens (1657), and is also missing from Fermat and Pascal’s correspondence, and from Ozanam’s Récréations (1694). None of them had any need of it. Once Bernoulli derived his version of the law of large numbers, the ratio became useful in its own right as an estimate of frequencies. It is no surprise to find that probability is given the ratio definition in the same part of Bernoulli’s Ars Conjectandi that contains the derivation of the law of large numbers. He wrote:
Probability, indeed, is degree of certainty, and differs from the latter as a part differs from the whole. Truly, if complete and absolute certainty, which we represent by the letter a or by 1, is supposed, for the sake of argument, to be composed of five parts or probabilities, of which three argue for the existence or future existence of some outcome and the others argue against it, then that outcome will be said to have 3a/5 or 3/5 of certainty. ([1713] 2006, pp. 315–16, his emphasis)
The Probability of an Event is greater or less, according to the number of Chances by which it may happen, compar’d with the whole number of Chances by which it may either happen or fail. Thus If an Event has 3 Chances to Happen, and 2 to Fail; the Probability of it Happening may be estimated to be 3/5, and the probability of its Failing 2/5.
A misleading attribution in the history regarding Leibniz’s De incerti aestimatione ([1678] 1999) needs to be corrected. Hacking (2006, Table of Contents) noted “The definition of probability as a ratio among ‘equally possible cases’ originates with Leibniz.” And (2006, p. 32) “Laplace did define probability as the ratio of favorable cases to the total number of equally possible cases, but so did Leibniz in 1678.” De Melo and Cussens (2004, p. 31) repeated the attribution: “Leibniz’s 1678 manuscript De incerti aestimatione (DIA) contains the first appearance of the ‘Laplacian’ definition of probability in terms of equally possible cases.” If all this means is that Leibniz formed the ratio of favorable to all cases as an intermediate in his analysis, the remark would be correct. However, if we are to understand that Leibniz was initiating the transition in the fundamental concepts of chance from the seventeenth-century case counting to the later probability, then the claim is incorrect. While Leibniz uses the term “probability” (probabilitas) elsewhere in his text, he does not use it to identify this ratio. Throughout his text, the basic chance concept remains the familiar equally likely case of chance combinatorics and the ratio appears as an intermediate in calculating a “hope” (spes), which is comparable to Huygens’ expectation (expectatio).
4.3.2. A Newton Anomaly Resolved
That chance combinatorics did not compare chances across different games follows from a survey of the problems handled by the theory. How can we know that it could not compare them across outcome spaces? That it could not compare chances in this way would be shown if someone working in the theory was tasked with such a comparison. That happened when Pepys asked Newton how to advise Peter a Criminal Convict, in the correspondence recalled above. Pepys’ question involved three different outcome spaces: that of six die casts; of twelve die casts; and of eighteen die casts. He wrote:
… supposing [the dice] instead of 1, 2, 3, &c to bee branded wth ye 6 initiall Letters of ye Alphabet A. B. C. D. E. F. And the Case should then bee this; Peter a Criminal Convict being doom’d to dye, Paul his Friend prevails for his having ye benefitt of One Throw only for his Life, upon Dice soe prepared; with ye Choice of any one of these Three Chances for it, viz.
One F, at least upon Six such Dice.
Two F’s at least upon Twelve such Dice, Or
Three F’s at least upon Eighteen such Dice.
Question.—Which one of these Chances should Peter in this Case choose? (Pepys to Newton, December 9, 1693, in Newton 1961, pp. 297–98)
A modern analysis, such as Stigler (2006), computes the probabilities of the three outcomes directly and compares them. Newton’s analysis is oddly convoluted. He does not compare chances across the different spaces. Rather, he reformulates the question into another: what are fair wagers in each of the three cases. From those fair wagers he reads off which choice is most favorable to the Criminal Convict.
We can now understand why Newton followed this convoluted pathway. Chance combinatorics had no means to compare chances across different outcome spaces.32 The precise result of the application of the theory was the discerning of which are the fair wagers. Newton applied the theory, determined which are the fair wagers in each case and then used a comparison of them to answer Pepys’ question. It was not a needless detour but the most direct use of the resources of chance combinatorics.
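For comparison, the three options in Pepys' question can be set side by side in both idioms. The sketch below is not Newton's own computation; the function name is hypothetical, and it assumes fair six-sided dice with every ordered cast an equal chance case. For each option it counts the favorable and unfavorable chances, whose ratio gives the fair wager within that game, and then forms the modern probability that allows a direct comparison across the three games.

```python
from math import comb

def favorable_chances(num_dice, at_least):
    """Count casts of num_dice dice showing the marked face at least `at_least` times."""
    total = 6 ** num_dice
    # Casts with exactly j marked faces: choose their positions,
    # then any of the 5 other faces on each remaining die.
    too_few = sum(comb(num_dice, j) * 5 ** (num_dice - j) for j in range(at_least))
    return total - too_few, total

for num_dice, at_least in ((6, 1), (12, 2), (18, 3)):
    favorable, total = favorable_chances(num_dice, at_least)
    unfavorable = total - favorable
    # favorable : unfavorable is the fair wager; favorable / total is the probability.
    print(num_dice, at_least, favorable, unfavorable, round(favorable / total, 4))
```

The first option, one F at least upon six dice, comes out most favorable on either reckoning. The chance counts themselves differ enormously in size across the three games, so only the wager ratios or the probabilities can be compared.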
5. Fermat and Pascal
The origin of modern probability theory is traditionally attributed to the correspondence between Pierre de Fermat and Blaise Pascal in 1654 on the problem of points. That gets something right about the chronology. A young Huygens in 1655 (as reported in David 1962, pp. 111–12) knew of Fermat and Pascal’s interest in the problem, but not their solution, and was inspired by that knowledge. If we seek fundamental contributions of principle, then none can be found in their correspondence. Their analysis was fully within the existing theory of chance combinatorics. It counted and compared numbers of chances and translated them into the corresponding fair wagers. The only apparent novelty was that their analyses were mathematically more sophisticated and more general and provided the best solution up to that time of the problem of points.
Later historical work reassures us that otherwise there was little of novelty in the correspondence. The problem of points had been posed long before and other solutions offered. David (1962, pp. 37–8) traced early versions back at least to a 1494 work of Paccioli. Similarly, David (1955, pp. 61–2, 81–2) identified the use of Pascal’s triangle in combinatorics in several works a century prior to the correspondence. The correspondents were addressing an established problem using existing methods, but applying them to the problem better than anyone before them.
That the analysis was fully within the context of chance combinatorics explains the otherwise curious omissions noted by Franklin:
They say very little, however, about the nature of the entities they are calculating. In their initial letters on the just division of stakes, they merely calculate what would be “impartial” between the players. They appear to have no way of conceptualising a probability except as a just share of a stake, a concept just sufficient for them to deploy the symmetry arguments that result in a numerical solution to the problem. (2016, §2.8)
6. Standard Histories
There was, I have argued, a serviceable if limited theory of chance perfected in the seventeenth-century. How did it become the theory that history forgot? The answer lies in how the history of probability has been written and, especially, how it wrote of the ideas of the seventeenth-century. We shall see below that part of the problem results from an initial overvaluing of Pascal and Fermat’s contribution that became entrenched as standard. Historians expecting to find in it the origin of probability theory sought in vain for novel statements of foundational principles. While more careful historians have corrected this misattribution, the historical literature has persisted in seeking anticipations or precursors of later probability theory in the earlier work. The result was that there was no natural space in the historical narratives for chance combinatorics. It became the theory that history forgot.
6.1. The Elevation of Pascal and Fermat
As we have seen in Section 5, Fermat and Pascal’s correspondence was recognized in writing prior to the nineteenth-century as part of the chronology of events in the development of probability theory. The stronger claim that this correspondence initiated probability theory was made by Laplace. Given Laplace’s authority, his summary was highly influential. The same passage appears twice: in the second edition34 of his technical Theorie (Laplace 1814a, p. xcxix); and in his reflective Essai (Laplace 1814b, p. 89). The translated passage is (1902, p. 185):
Long ago were determined, in the simplest games, the ratios of the chances which are favorable or unfavorable to the players; the stakes and the bets were regulated according to these ratios. But no one before Pascal and Fermat had given the principles and the methods for submitting this subject to calculus, and no one had solved the rather complicated questions of this kind. It is, then, to these two great geometricians that we must refer the first elements of the science of probabilities, the discovery of which can be ranked among the remarkable things which have rendered illustrious the seventeenth century—the century which has done the greatest honor to the human mind.
Perhaps the most important endorsement was in Isaac Todhunter’s history (1865). It was the authoritative early history of probability. Its first chapter briefly recounted earlier work on chance by Cardano, Kepler, and Galileo. Their contributions to the theory of probability were judged dismissively as “extremely slight” (1865, p. 7). The “true origin” is attributed to Pascal and his correspondence with Fermat. The claim is supported with a quote from the first edition of Laplace’s Theorie (1812, p. 3) and a comparable quote from Poisson (1837, p. 1). This attribution became routine in histories of mathematics, such as Rouse Ball (1915, pp. 285, 300) and Cajori (1919, pp. 170–71). The celebration of Pascal and Fermat endures in popular writing. The title of Keith Devlin’s popular work (2008) is The Unfinished Game: Pascal, Fermat, and the Seventeenth-Century Letter that Made the Modern World.
More careful scholarship, starting in the mid-twentieth century, sought to correct Laplace’s overstatement. The corrections are hesitant. Historians felt a need to offer some account of the prominence of the correspondence in the existing literature, while not able to identify why it merits that place. David is ready to “pass over” Pascal and Fermat as the “real begetter[s]” of probability theory, for Fermat, she notes, merely extended work already done by Galileo (1962, p. 110). Hacking does not see in the correspondence any great novelty but praises them for “a completely new standard of excellence for probability calculations” (2006, p. 60). Garber and Zabell (1979, p. 49) locate the importance of the correspondence in the fact that Fermat and Pascal, two leading mathematicians of the day, had taken an interest in chance problems. That aroused the interest of the wider mathematical community.
6.2. The Neglect of Chance Combinatorics
These welcome efforts to correct the historical record have not gone far enough. Most historical writing on chance is still controlled by a quest for modern ideas in earlier times. This is revealed by the language commonly used. Hacking often talks of “precursors” or “anticipations” of modern probabilistic ideas, while denying that he seeks them (2006, p. 9). Nonetheless he reports “unsuccessful anticipations” (p. 12), “anticipations of probability theory” (p. 49), “some anticipation of mathematical expectation” (p. 92) and “… it has been no part of my thesis that there were never precursors nor anticipations.” (p. 56).
Less obvious is the practice of attaching the label of “probability” to seventeenth-century writing and earlier writing on chance where there is none. David (1962, p. 110) wrote of “the first calculations of a probability by Cardan and by Galileo.” Hacking (2006, pp. 11, 61, 92) calls Huygens’ text “the first probability textbook to be published” and “the first printed textbook of probability.” The title of Bellhouse’s (2000) analysis of de Vetula is “A Medieval Manuscript Containing Probability Calculations.” Stigler’s (2006) title identifies “Isaac Newton as a Probabilist.”
Probability is a modern term of art not used then in the modern sense. It is not so used in Cardano, in the correspondence of Fermat and Pascal, in Huygens, in Ozanam or, contrary to some claims, in Leibniz. In this later historical writing, great efforts were made to write a careful history, responsible to the sources. David’s (1962) monograph set out explicitly to recover the history of chance prior to Fermat and Pascal. This earlier history occupies the first seven of fifteen chapters. Chapter 3 asks why probabilistic ideas and the notion of equally likely possibilities were so long delayed.35 In its eagerness to find precursors of probability, David’s text is too quick to find probabilities. It asserts that John Graunt was “the first Englishman to calculate empirical probabilities on any scale” (p. 103) in his Bills of Mortality. Yet Graunt’s (1665) Bills of Mortality contain no probabilities or probabilistic reasoning. The work reports actual frequencies of various features of the population.36 Similarly David (p. 110) reports “… the first calculations of a probability by Cardan and by Galileo …” When we examine Galileo’s text (reproduced in translation in David 1955, pp. 192–94) there are no probabilities. Galileo computes the number of chances for various cases.
Hacking’s (2006) influential37 Emergence of Probability (first edition 1975) is controlled to its detriment by a dismissive appraisal of earlier ideas of chance in the seventeenth-century. “I say, with only very slight reservations,” he confided, “that there was no probability until about 1660” (2006, p. 17). This severe judgment is almost guaranteed by Hacking’s framing of his history.
His quest is for a quite specific, dual notion of probability that is articulated at some length in his Chapter 2, “Duality.” He summarized it as:
In Chapter 2 I emphasized the duality of the probability that emerged around 1660. On the one hand it is epistemological, having to do with support by evidence. On the other hand it is statistical, having to do with stable frequencies. (2006, p. 43)
Chance combinatorics has neither stable frequencies nor epistemology since it needs neither. Frequencies enter only vaguely and even these are rendered redundant by Huygens’ ingenious equality arguments. Hacking’s unnecessarily specific conception precludes his text from recognizing the cogency of earlier ideas of chance such as those of chance combinatorics.
The trend among more recent works that treat the historical emergence of probability in the seventeenth-century is clear. Chance combinatorics is diminished or overlooked. Gigerenzer et al. (1989) commences on page one with the familiar attribution to Fermat and Pascal. Daston (1988) seems to accept, perhaps reluctantly, the elevation of Pascal and Fermat’s role in the history. Her formulation is cautious:
… the Pascal/Fermat correspondence created a research tradition, complete with problems and concepts, that dominated the field for over fifty years. On these grounds alone it deserves its traditional place in the history of mathematical probability, and I shall not break with that tradition. (Daston 1988, p. 15)
7. An Alternative Historiography
The controlling idea of our histories has been that earlier ideas of chance, such as those in the seventeenth-century, were merely imperfect anticipations of a better, completed modern theory or even just dim glimpses of a greatness to come. I propose an alternative. We should conceive of the seventeenth-century theory of chance combinatorics as a well-formed theory that provides a model for how indefinitenesses of all types could be handled. It showed in one example, games of chance, how they could be represented mathematically and decisions pertaining to them could be reduced to objective numerical computations. This model inspired the search for comparably precise treatments of many other types of indefiniteness. We learn from his examples that Jacob Bernoulli sought an analysis of comparable precision for the indefinitenesses of forensics, of medicine, of the weather and more. This extension remains a part of the project of modern probabilistic analysis today. In many areas, it has met with great successes. In one area, the project is incomplete and possibly terminally so.
The greatest success has been the replacement of the simple mathematics of the seventeenth century, finite counting and finite combinatorics, with the modern theory of additive measures. Its application has led to advances far outstripping anything achieved in the seventeenth-century. A notable instance has been its application to stochastic processes in physics. Statistical mechanics and quantum theory, two jewels of modern physics, depend essentially on its application. We now think of the application as obvious and automatic. Yet in the mid-nineteenth-century, its application was a bold step taken by Maxwell. He recognized the impracticality of tracing the motions of the individual molecules in a kinetic gas by methods then standard in Newtonian mechanics. He suggested that, instead, these motions should be treated collectively using the statistical methods that had been employed in social contexts. In explaining this transition, Maxwell (1872, p. 289) used the example of a school registrar whose analyses proceed without identifying any individual students’ names.
The project met with less success in accounting for the nature of probability itself. In an inversion of the normal picture, the seventeenth-century conception of chance is univocal and the most secure of all. The later conceptions are fragmented and unstable. This problem was already apparent when Jacob Bernoulli first sought to extend the notion of chance beyond the confines of games of chance. In Book IV of Ars Conjectandi, he lamented the difficulty of determining chances outside these confines compared to the ease of determining them for physical devices:
The originators of these games took pains to make them equitable by arranging that the numbers of cases resulting in profit or loss be definite and known and that all the cases happen equally easily. But this by no means takes place with most other effects that depend on the operation of nature or on human will. So, for example, the numbers of cases in dice are known: for a single die there are manifestly as many cases as the die has faces. Moreover these all have equal tendencies to occur; because of the similarity of the faces and the uniform weight of the die, there is no reason why one of the faces should be more prone to fall than another—as would be the case if the faces had dissimilar shapes or if a die were composed of heavier material in one part than another. ([1713] 2006, pp. 326–27)
The theory of chance consists in reducing all the events of the same kind to a certain number of cases equally possible, that is to say, to such as we may be equally undecided about in regard to their existence, and in determining the number of cases favorable to the event whose probability is sought. The ratio of this number to that of all the cases possible is the measure of this probability, which is thus simply a fraction whose numerator is the number of favorable cases and whose denominator is the number of all the cases possible. (Laplace 1814b, p. 4, 1902, pp. 6–7)
When this classical notion of chance is applied in its original context of games of chance, the criticism of Boole, Venn, and many later commentators is misplaced. Bernoulli had identified the viability of the conception in the physical properties of devices with chance properties. However, their dissatisfaction with the classical interpretation is well placed when this notion of chance is used outside games of chance. The pressing problem was now to find an appropriate conception of probability that could be applied in broader contexts. The continuing development of work on the notion of probability has provided no univocal solution. It is marked by instability and fragmentation. One proposal after another has been advanced and criticized and replacements offered, only for the critical cycle to repeat.
The views of this enormous literature can be grouped loosely into several traditions.40 The frequentism of Boole and Venn found continuing support in the twentieth-century elaborations of Reichenbach and von Mises. In the early twentieth-century, Keynes and Carnap gave a logical interpretation of probability as partial entailment. At the same time, Ramsey, de Finetti, and Savage developed a subjective interpretation that identified probability as strength of belief. That probability has an objective referent in the world is maintained by approaches such as Popper’s propensity interpretation, Jaynes’ objective physical probability and Jon Williamson’s objective Bayesianism.
This brief inventory scarcely touches the vast literature. That each of these traditions retains a following indicates that none has found a fully satisfactory viewpoint. We should contrast this fragmentation41 with the solidity of the conception of chance within chance combinatorics. It remains today the least problematic conception.42 If we wish now to explain what it really means to say that some outcome has probability one half or one sixth, it is almost irresistible to call to mind a coin toss or a die cast and thereby to employ the very concept central to the seventeenth-century theory of chance combinatorics.
8. Conclusion
How close were the ideas of earlier figures to modern probability theory? Answering this question has yielded much interesting history. Another question has been neglected. What was the conception of chance held by earlier figures in the history? The result of this neglect is that the self-contained if limited theory of chance combinatorics of the seventeenth-century has been overlooked. I have tried to establish here that it served as more than a mere anticipation of the later probability theory. It offered a model for how mathematical methods could be applied to broader indefinitenesses. Modern probabilistic analysis emerged from efforts to extend its successes to larger domains.
Notes
Artioli et al. (2011) examined the physical properties of 91 Etruscan dice. The earliest, sample 40, was cubic and was dated to the eighth-century BCE. Many more came from the sixth to the third centuries BCE.
My chi-square test for uniformity on the data she supplies for casts of three ancient dice shows two passing and the third die, which is visually irregular, failing.
Augustus Caesar, as quoted by Suetonius (1914, p. 235), describes playing this game.
David (1955, p. 3) found that the two broader sides, numbered 3 and 4, each arise with probability roughly 4/10. The remaining two narrower sides are numbered 1 and 6 and each arise with probability roughly 1/10.
Exhibit “A Day in Pompeii,” Museum of Science, Boston, MA, October 2, 2011–February 12, 2012. https://www.outandaboutinparis.com/2011/10/day-in-pompeii-at-museum-of-science-in.html.
An original of this rare book is in the Huntington Library, San Marino, California.
Correctly—I checked against the original manuscript; Aydelotte dates the work to 1552. The Huntington Library Catalog reports 1555.
This passage proved so enticing to contemporaries that Aydelotte (1913, Appendix B) could demonstrate that it was reproduced nearly verbatim in three subsequent works of the time, all by different authors, in 1597, 1608, and 1612.
A set of 24 doctored dice of the late fifteenth-century has been recovered from the Thames foreshores. They include dice weighted with mercury and also those with repeated pip counts. See https://medievallondon.ace.fordham.edu/collections/show/92.
The image matches that of the corresponding page reproduced in Bellhouse (2000, p. 130), which is identified there as from the 1534 printing. Both include the typographical error in row 8 not found in other editions, such as the 1662 edition shown in Kendall (1956, p. 6) and David (1962, p. 32).
The entry 12 in row 8 is a typographical error that has inverted the order of the digits of the correct value of 21.
The translator has simplified the presentation of the text by writing the outcomes as “(6, 6),” “(6, 5),” etc. Cardano’s Latin is “bis, sex, atque sex, & quinque,” that is “twice six, and six & five.”
Help may well have been needed. The grasp of the combinatorics was then not always correct. David (1962, p. 35) quotes a 1477 commentary on Dante’s Divine Comedy that appears to count throwing a sum of four on three dice as arising only in one way.
David (1962, p. 65) conjectures plausibly that very few could so order Galileo, so the instigation came most likely from his sponsor, the Grand Duke of Tuscany.
It is more likely that the question was prompted by a concern that this simple pip count was an inadequate quantification of chance. In modern terms, a sum of 9 arises with probability 25/216 = 0.1157 and a sum of 10 with probability 27/216 = 0.125. This difference of 0.0093 would require thousands of casts with careful record keeping if the two are to be distinguished. In 1,000 trials, the standard deviation of the frequency of success of a binomially distributed variable is 0.0105.
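The figures in this note can be checked by direct enumeration of the 216 equally likely casts. The following is a minimal Python sketch and not part of the original analysis. The function names `count_sum` and `freq_sd` are illustrative choices, and the standard deviation is evaluated at the chance of a sum of 10, an assumption that reproduces the quoted 0.0105.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

def count_sum(target, dice=3, faces=6):
    """Count the equally likely casts of `dice` dice whose pips sum to `target`."""
    return sum(1 for cast in product(range(1, faces + 1), repeat=dice)
               if sum(cast) == target)

def freq_sd(p, trials):
    """Standard deviation of the relative frequency of success in `trials` binomial trials."""
    return sqrt(p * (1 - p) / trials)

total = 6 ** 3                        # 216 casts of three dice
p9 = Fraction(count_sum(9), total)    # 25/216
p10 = Fraction(count_sum(10), total)  # 27/216, which reduces to 1/8

# Chances of 9 and of 10, and the gap between them.
print(float(p9), float(p10), float(p10 - p9))
# Spread of the observed frequency of a 10 over 1,000 casts (assumed p = 1/8).
print(freq_sd(float(p10), 1000))
```

On these assumptions the gap between the two chances is smaller than one standard deviation of the frequency at 1,000 casts, which agrees with the note's remark that thousands of carefully recorded casts would be needed to tell them apart.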
For a recent analysis, see Stigler (2006).
If we assume l = “pounds,” we might now write this as £1,000, using the symbol for pound that was already in wide use in Newton’s time.
“665l. 2s. 1/2d.” is “665 pounds, 2 shillings and a halfpenny.” (31,031/46,656)1,000 = 665.102023. With twenty shillings to the pound and twelve pennies to the shilling, 0.10202l = 2.04046s and 0.04046s = 0.5d. = 1/2d. This corrects Newton’s erroneous 665.l 0s. 2d. and the failed editorial attempt in Newton (1961, p. 301fn 5) to correct it. (This is for me a familiar calculation. When I grew up in Australia, “pounds, shillings and pence” were still the official currency.)
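The conversion can be retraced with exact fractions. The sketch below is illustrative only. It assumes that 31,031/46,656 is the chance of at least one six in six casts of a die, which is 1 − (5/6)⁶, and the helper name `to_lsd` is hypothetical.

```python
from fractions import Fraction

def to_lsd(pounds):
    """Split an exact amount in pounds into (pounds, shillings, pence).

    Uses 20 shillings to the pound and 12 pence to the shilling.
    Pence are returned as an exact Fraction so rounding is left to the caller.
    """
    whole_pounds = pounds.numerator // pounds.denominator
    shillings_total = (pounds - whole_pounds) * 20
    whole_shillings = shillings_total.numerator // shillings_total.denominator
    pence = (shillings_total - whole_shillings) * 12
    return whole_pounds, whole_shillings, pence

# Assumed source of the fraction: at least one six in six casts of a die.
chance = 1 - Fraction(5, 6) ** 6   # 31031/46656
stake = chance * 1000              # the chance applied to 1,000 pounds

pounds, shillings, pence = to_lsd(stake)
print(pounds, shillings, float(pence))
```

The pence come out a little under a halfpenny, which rounds to the 1/2d. given in the note.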
6¹⁸ is roughly 10¹⁴.
Translation from Arnauld and Nicole ([1662] 1996, p. 274). This passage is identical in the first edition of 1662 and the 1683 edition of the translation.
See David (1962, Ch. 11) for an account of the circumstances surrounding the writing of De Ratiociniis.
There is no circularity in the eventual recovery of the rule for fair stakes in general. Huygens’ analysis proceeds from the special case of a game judged fair because, in modern terms, it has a perfect symmetry over all the players.
Huygens’ analysis requires a tacit assumption that any other association of the chance set up with different fair games will yield the same expectation.
In Huygens’ text, Player 1 is Huygens and the contract between Player 2 and Player 3 is left tacit.
Franklin (2015, pp. 305–306) notes that the mathematician, Roberval, in unpublished writing no later than 1647, had made an explicit connection between belief and combinatorial counts that was unusual for the time. We should believe, Roberval asserted, in a ten rather than a four in three dice casts since there are more ways to cast the ten.
Translation from Bellhouse (2000, p. 134).
Here Cardano’s combinatorics failed. We would now compute the probability of no sums of 3 appearing in nine casts of two dice as (17/18)⁹ = 0.5978; and the probability of exactly one sum of 3 as 9(1/18)(17/18)⁸ = 0.3165.
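These two values follow from the binomial formula, with a sum of 3 on two dice arising in 2 of the 36 equally likely casts. The following Python sketch is offered only as a check on the modern computation. The function name `binomial` is an illustrative choice and is not drawn from any source discussed here.

```python
from fractions import Fraction
from math import comb

def binomial(k, n, p):
    """Chance of exactly k successes in n independent trials with success chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# A sum of 3 on two dice arises in 2 of the 36 equally likely casts: (1,2) and (2,1).
p_three = Fraction(2, 36)  # reduces to 1/18

none = binomial(0, 9, p_three)  # (17/18)^9
one = binomial(1, 9, p_three)   # 9 (1/18) (17/18)^8

print(round(float(none), 4), round(float(one), 4))
```

Working in exact fractions and rounding only at the end avoids any doubt about the fourth decimal place.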
Here the standard translation of Cardano (1663) in Ore (1953) as Cardano ([1663] 1953) is misleading. The translation inserts the word “probability” where it is not in the original, albeit terse Latin (and makes similar insertions elsewhere). The Latin reads “In totidem enim potest contingere, & non contingere.” (p. 265) It is loosely translated as “In the same number [of casts], [the result] may or may not happen.” The Gould translation above in Ore (1953, p. 196) reads: “… for in such a number of casts the desired result may or may not happen with equal probability.” The phrase “with equal probability” is not in the Latin.
According to https://www.cdc.gov/disasters/lightning/victimdata.html.
Newton could have reduced the problem to a single outcome space of 18 die casts. The first outcome—at least one 6 in six casts—would have involved the first six of the 18 casts. Newton did not do this, perhaps because it would be computationally more onerous and more difficult to explain to Pepys.
The text of the first edition differs in many places. In it, Laplace (1812, p. 3) writes: “[The theory of probability] owes its birth to two French geometers [Pascal and Fermat] of the seventeenth-century, which was so fertile in great men and great discoveries …” That they are identified as French raises the possibility that their work was elevated by Laplace out of Gallic pride.
The word “probable” appears twice only in the text (p. 23, p. 96) as a synonym for the informal “likely.” “Chance” appears only once as “mischance” (p. 97) to characterize unfortunate accidents.
It is also disputed in its primary claims. See, for an example, Garber and Zabell (1979).
From the example above in Section 4.2 “there are twice as many cases in which Titius also may die within a decade as there are cases in which he may live beyond a decade.”
Boole directly addressed Poisson’s (1837, p. 31) later statement: “The measure of the probability of an event is the ratio of the number of cases favorable to that event, to the total number of cases favorable or contrary, and all equally possible, or all of which have the same chance” (my translation differs slightly from Boole’s).
See Hacking (2006, pp. 14–16) for an attempt to dismiss this fragmentation as unimportant.
It is the least problematic but it is not unproblematic if we conceive of coin tosses and die casts as mechanically deterministic processes. Then von Kries and Poincaré’s method of arbitrary functions, whose history is related in von Plato (1983), provides a serviceable way to secure their pseudorandomness.