The Reflection Effect for Higher-Order Risk Preferences

Abstract Higher-order risk preferences are important determinants of economic behavior. We apply insights from behavioral economics: we measure higher-order risk preferences for pure gains and losses. We find a reflection effect not only for second-order risk preferences, as did Kahneman and Tversky (1979), but also for higher-order risk preferences: we find risk aversion, prudence and intemperance for gains and much more risk-loving preferences, imprudence and temperance for losses. These findings are at odds with a universal preference for combining good with bad or good with good, which previous results suggest may underlie higher-order risk preferences.

, as well as in experiments (see the review article by Trautmann & Van de Kuilen, 2018).
In this paper, we apply insights from behavioral economics to the study of these higher-order risk preferences. Daniel Kahneman called reference dependence "the core of prospect theory" (Kahneman, 2003, p. 1457). Yet reference dependence, which leads to a reflection of risk aversion between the gain and loss domains, has not been thoroughly investigated or controlled for in existing studies. We measure higher-order risk preferences while directly controlling the reference point to separate gains and losses. We control the reference point by giving subjects an endowment before the experiment.² Separating gains and losses allows us to investigate whether higher-order risk preferences, like risk aversion, reflect between the gain and loss domains. This is important for two reasons. The first is external validity. Preferences measured under gains in an experiment may be a poor predictor for choices that naturally involve losses, such as insurance and self-protection decisions. Decision makers may purchase insurance that involves nonperformance risk even though prudence observed under gains predicts that they would not demand such insurance (Eeckhoudt & Gollier, 2005). Likewise, prevention efforts are closely related to prudence, and basing health policy on estimates of prudence under gains may be suboptimal.
The second reason is that the domain will influence decisions dependent on higher-order risk preferences. In a financial crisis, when their portfolios are deeply in the loss domain, investors may come to prefer downside risks, and greater volatility on the financial markets may lead investors to demand insurance against unrelated risks where under more normal circumstances, greater volatility on financial markets has no such effect.
Separating gains and losses means we measure preferences without loss aversion, which affects choices for lotteries that mix gains and losses. For example, if subjects take the highest possible outcome they are sure to get (the MaxMin) as the reference point, loss aversion will bias responses toward risk aversion, prudence, and temperance.³ Thus, we can test the hypothesis of a preference for combining good with bad or for combining good with good without the confounding effects of loss aversion. Eeckhoudt, Schlesinger, and Tsetlin (2009) and Crainich, Eeckhoudt, and Trannoy (2013) show that these simple preferences may underlie higher-order risk preferences, in which case decision makers who are risk averse should be temperate, whereas those who are risk loving should be intemperate, and both should be prudent.
[Figure 1 caption fragment: ... Eeckhoudt and Schlesinger (2006), for all $k, r > 0$ and all zero-mean random variables $\tilde{\varepsilon}$, $\tilde{\varepsilon}_1$, and $\tilde{\varepsilon}_2$.]
We also measure higher-order risk preferences with lotteries for which the branches in figure 1 involve probabilities smaller than 0.5.⁴ We extend the definitions of Eeckhoudt and Schlesinger (2006) to such small probability lotteries and show that the results of Eeckhoudt et al. (2009) apply to them.
We find a reflection effect for higher-order risk preferences: under gains, the majority of choices are risk averse, prudent, and intemperate, whereas under losses, the majority of choices are risk loving, imprudent, and temperate. The imprudence we find for losses and the combination of risk aversion and intemperance for gains are at odds with the hypothesis that a universal preference for combining good with bad or good with good underlies higher-order risk preferences. We find similar behavior for small probability lotteries and the usual 50-50 lotteries.
In section II, we discuss the definitions of higher-order risk preferences in detail and how these may be derived from preferences for combining good with bad or good with good. We also extend the definitions of higher-order risk preferences so they can be used with smaller probabilities. We present the design of our experiment in section III. In section IV, we present our results, which we discuss and relate to previous findings in section V. Section VI concludes.

II. Theoretical Background
Eeckhoudt and Schlesinger (2006) define higher-order risk preferences through simple lottery pairs. Let $[x, y]$ denote the lottery that gives outcome $x$ with probability 0.5 and outcome $y$ with probability 0.5. Let $\succeq$ denote the decision maker's preference relation. Risk aversion is defined as the preference $[-k, -r] \succeq [0, -k - r]$ for all wealth levels and for all $k, r > 0$. Eeckhoudt and Schlesinger (2006) name such an attitude risk apportionment of order 2. A decision maker is risk loving if the reverse preferences hold. To define prudence (risk apportionment of order 3), the fixed deduction $-r$ is replaced by a zero-mean nondegenerate random variable $\tilde{\varepsilon}$ (see figure 1). The decision maker is prudent if $[-k, \tilde{\varepsilon}] \succeq [0, \tilde{\varepsilon} - k]$ for all wealth levels, for all $k > 0$, and for all zero-mean, nondegenerate random variables $\tilde{\varepsilon}$. The decision maker is imprudent if the reverse preferences hold. Temperance (risk apportionment of order 4) is defined by replacing $-k$ by another independent random variable. The decision maker is temperate if $[\tilde{\varepsilon}_1, \tilde{\varepsilon}_2] \succeq [0, \tilde{\varepsilon}_1 + \tilde{\varepsilon}_2]$ for all wealth levels and for all zero-mean, nondegenerate, and independent random variables $\tilde{\varepsilon}_1$ and $\tilde{\varepsilon}_2$. If the reverse preferences hold, the risk attitude is called intemperance. Risk attitudes of orders higher than 4 can be defined through similar procedures, but we do not study those attitudes in this paper.
Eeckhoudt et al. (2009) show how stochastic dominance preferences lead to risk apportionment of any order. They show that if $\tilde{x}_i$ dominates $\tilde{y}_i$, $i = a, b$, through $i$th-order stochastic dominance, then the 50-50 lottery $[\tilde{x}_a + \tilde{y}_b, \tilde{x}_b + \tilde{y}_a]$ dominates $[\tilde{y}_a + \tilde{y}_b, \tilde{x}_a + \tilde{x}_b]$ through $(a + b)$th-order stochastic dominance. Stochastic dominance preferences thus imply a preference for combining the "good" lottery with the "bad" lottery and contain risk apportionment preferences as defined by Eeckhoudt and Schlesinger (2006) as a special case. A preference for combining good with bad thus leads to a combination of risk aversion, prudence, and temperance, for example. Crainich et al. (2013) apply this logic to decision makers who prefer combining good with good and show that this leads to a combination of risk-loving, prudent, and intemperate preferences. It follows that indifference toward combining good with bad and good with good leads to risk neutrality, prudence neutrality, and temperance neutrality. The results above also hold when all final outcomes are only in the domain of gains (relative to some reference point) or of losses. We can therefore test for these implications separately for both domains to see if there are behavioral differences between them.
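As an illustration of the 50-50 lottery pairs that Eeckhoudt and Schlesinger (2006) use to define risk apportionment, the following sketch compares the two lotteries of each pair under expected utility. The logarithmic utility function, the wealth level, the stakes k and r, and the two zero-mean risks are assumptions chosen purely for illustration; none of them are taken from the experiment.

```python
import math
from itertools import product

def expected_utility(lottery, u):
    """Expected utility of a lottery given as a list of (probability, outcome) pairs."""
    return sum(p * u(x) for p, x in lottery)

def add_independent(*risks):
    """Distribution of the sum of independent risks, each a list of (probability, outcome) pairs."""
    combined = []
    for branches in product(*risks):
        prob = math.prod(p for p, _ in branches)
        total = sum(x for _, x in branches)
        combined.append((prob, total))
    return combined

def fifty_fifty(left, right):
    """The lottery [left, right]: each branch is reached with probability 0.5."""
    return [(0.5 * p, x) for p, x in left] + [(0.5 * p, x) for p, x in right]

def sure(x):
    """A degenerate lottery paying x for certain."""
    return [(1.0, x)]

# Illustrative assumptions (not from the paper): log utility, wealth 20, these stakes.
wealth, k, r = 20.0, 4.0, 3.0
u = lambda x: math.log(wealth + x)
eps1 = [(0.5, -5.0), (0.5, 5.0)]  # zero-mean risk
eps2 = [(0.5, -2.0), (0.5, 2.0)]  # second, independent zero-mean risk

# Each pair: (lottery combining good with bad, lottery combining good with good).
pairs = {
    "risk aversion": (fifty_fifty(sure(-k), sure(-r)),
                      fifty_fifty(sure(0.0), sure(-k - r))),
    "prudence": (fifty_fifty(sure(-k), eps1),
                 fifty_fifty(sure(0.0), add_independent(eps1, sure(-k)))),
    "temperance": (fifty_fifty(eps1, eps2),
                   fifty_fifty(sure(0.0), add_independent(eps1, eps2))),
}

# A positive gap means the first lottery of the pair has higher expected utility.
eu_gap = {name: expected_utility(a, u) - expected_utility(b, u)
          for name, (a, b) in pairs.items()}
for name, gap in eu_gap.items():
    print(f"{name}: EU difference = {gap:.6f}")
```

With log utility all three gaps are positive, in line with a decision maker who prefers combining good with bad at every order. Swapping in a convex utility function would flip the sign of the second-order gap, which is one way to explore the good-with-good case.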
Because we also study preferences over lotteries involving probabilities different from 50-50, we need to extend the above definitions. Let $(p: x, p: y, 1 - 2p: z)$ denote the lottery that gives outcomes $x$ and $y$ each with probability $p$ and outcome $z$ with probability $1 - 2p$. Risk aversion is then defined as $(p: -k, p: -r, 1 - 2p: c) \succeq (p: 0, p: -k - r, 1 - 2p: c)$, prudence as $(p: -k, p: \tilde{\varepsilon}, 1 - 2p: c) \succeq (p: 0, p: \tilde{\varepsilon} - k, 1 - 2p: c)$, and temperance as $(p: \tilde{\varepsilon}_1, p: \tilde{\varepsilon}_2, 1 - 2p: c) \succeq (p: 0, p: \tilde{\varepsilon}_1 + \tilde{\varepsilon}_2, 1 - 2p: c)$ for all $p \in [0, 0.5]$, all $k, r > 0$, all $c$, all independent zero-mean risks $\tilde{\varepsilon}$, $\tilde{\varepsilon}_1$, and $\tilde{\varepsilon}_2$, and all wealth levels. Risk-loving preferences, imprudence, and intemperance are defined as the reverse preferences. Under expected utility, these extended definitions are equivalent to the usual definitions of risk aversion, prudence, and temperance (necessity follows from the independence axiom and sufficiency follows from their definition). Furthermore, the results of Eeckhoudt et al. (2009)

III. Design
We measure higher-order risk preferences in three treatments, two involving gains (with all outcomes positive additions to the initial payment) and one involving losses (with all outcomes negative additions to the initial payment). To induce a strong reference point, subjects face both gains and losses relative to their initial endowment; we therefore have a within-subject design for testing gains and losses. The two gain treatments involve a between-subject design (subjects were assigned randomly to either gain treatment). The 50-50 gain treatment involves the usual 50-50 lotteries, and the small probability gain treatment involves the small probability lotteries discussed in section II. The small probability treatment allows us to offer the possibility of sizable gains. For the loss treatment, we measure preferences using the usual 50-50 lotteries. Small probabilities cannot be used together with large losses, because losses exceeding the initial endowment would lead to negative earnings. For each of the three treatments, we measure three higher-order risk attitudes (risk aversion, prudence, and temperance). Thus, we have nine treatment-risk attitude pairs in total.
To study the effects of reference-dependence, it is important to control the reference point. To this end, subjects were given a 15 euro endowment at the start of the experiment. They were told that this endowment was their payment for participating in the experiment, that they could gain additional money or lose part of it, and that it was equal to the expected value of participating. Throughout our analysis, we assume that subjects take the initial endowment as their reference point. This is a common assumption in the literature and is consistent with a reference point based on rational expectations (Kőszegi & Rabin, 2006) or based on the status quo. Baillon, Bleichrodt, and Spinu (2020) find evidence that a sizable fraction of subjects take the status quo as their reference point, and Etchart-Vincent and l'Haridon (2011) find that behavior is similar under losses from an initial endowment and losses out of subjects' own pockets.
A screenshot of one of the tasks is in appendix B, and the tasks are listed in table 6 in appendix C. Lotteries are presented in compound form in all tasks. We use compound lotteries because they most clearly present the choice as between combining good with bad or good with good. Haering, Heinrich, and Mayrhofer (2020) find that prudence and temperance are stronger for compound lotteries than for reduced-form lotteries. Deck and Schlesinger (2017) compare lotteries presented in compound form and in reduced form and find that while aggregate patterns are not much different, individuals have different preferences between the two formats.
Probabilities are presented as drawing a colored token from a bag with 100 colored tokens. To avoid mixing gains and losses, outcomes are chosen in such a way that relative to the initial payment, they are always negative in the loss treatment and always positive in the gain treatments. There are no zero outcomes (relative to the initial 15 euro) to prevent possible effects such as loss and zero avoidance from influencing responses. Subjects were presented with twelve tasks for each risk attitude in a given treatment. The relatively large number of tasks allows us to distinguish subjects who are indifferent (or confused) from those who have a clear attitude without having to measure willingness to pay. Measuring willingness to pay would complicate the procedure while the binary choices are already quite complex, especially for temperance.
The experiment was performed in the ESE-econlab of the Erasmus School of Economics. Subjects were randomly selected from the ESE-econlab subject pool, which consists of people who have registered to participate in experiments and are invited to sign up for sessions through an automated email system. The 245 subjects who participated in the experiment made 17,640 choices.⁵ Upon entering the lab, subjects were given an envelope containing 15 euros and assigned a seat in a cubicle. They started the experiment and left the lab at the same time and made their choices on a computer. In the instructions, they were informed that one of their choices in the experiment would be implemented for real, which would result in winning additional money or losing part of their initial payment, and that their expected earnings were equal to the amount they had been given. The instructions furthermore contained an explanation of the possible outcomes of a lottery similar to one of the more complicated lotteries used in the experiment, and subjects were asked three comprehension questions that had to be answered correctly before they could proceed to the incentivized tasks. In addition, subjects were asked to answer a (nonincentivized) practice question before getting to the incentivized tasks to allow them to become familiar with the interface.
Each subject was randomly assigned to either the small probability gain treatment or the 50-50 gain treatment by the software, and all participated in the loss treatment. Every subject thus participated in two treatments: the loss treatment and one of the gain treatments. In total, 121 subjects participated in the 50-50 gain treatment, 124 subjects in the small probability gain treatment, and all 245 subjects in the loss treatment. The tasks measuring a particular higher-order risk attitude for a given treatment were presented together, and the order of the tasks within such a block of tasks was randomized, as was the order of the blocks themselves. Thus, whether a subject first faced a loss or a gain task was determined randomly. The location, left or right, of the option that indicates risk apportionment was randomized for each subject and each task. Subjects could go back to previous tasks in the same block of questions and change their choices if they so wanted.⁶ Subjects could continue to the next task only after making a choice in the current one, which they could do by clicking on the lottery they preferred. To indicate the choice they had made, the selected option was then highlighted. After participating in the treatments, subjects were asked some background questions (on gender, age, nationality, degree program, and income); they were explicitly allowed not to answer these questions.⁷ After answering all questions in the experiment, subjects were asked one by one to come to the front desk to play one of their choices for real. Subjects were asked to roll a six-sided die to select one of the six blocks of tasks, three of which measured preferences under losses and three of which measured preferences under gains, and a twelve-sided die to select according to which of the twelve choices in that block they would be paid.
The subject would then draw a colored plastic token from an opaque bag with a composition of tokens corresponding to that described in the selected task. Composing the bag was done in full view of the first subject in each session for whom that bag was needed, and subjects who came thereafter could inspect the bag if they wished. Some subjects had to draw tokens from more than one bag. Depending on the final outcome, the subject would then be paid in addition to their initial €15 payment or have to give up part of it. The average earnings of subjects were €14.80, and the total duration of a session, including the payment procedure, was less than one hour.

IV. Results

A. Aggregate Behavior, Nonparametric Methods
Figure 2 shows the number of times subjects chose in agreement with risk aversion, prudence, and temperance in the loss treatment and in the gain treatments. For the two gain treatments, the patterns are very similar: we find risk aversion and prudence in both gain treatments, which is consistent with the usual findings in the literature, and weak (if significant) intemperance, which is consistent with the findings of Deck and Schlesinger (2010) and Baillon, Schlesinger, and Van de Kuilen (2017) but not with the results of Ebert and Wiesen (2014), Deck and Schlesinger (2014), and Noussair, Trautmann, and Van de Kuilen (2014), who find modest temperance in the aggregate. Figure 2 suggests a reflection effect for higher-order risk attitudes: responses are markedly different for the loss treatment compared to the gain treatments. For losses, we find much more risk-loving preferences, leading to risk neutrality in the aggregate, as well as imprudence and much temperance. This reflection effect is also visible in table 1, which shows the proportion of choices consistent with risk apportionment in each of the three treatments.⁸ To test the reflection effect, we perform a Wilcoxon signed rank test. The test indicates that the loss treatment differs significantly from both the 50-50 gain treatment and the small probability gain treatment in the number of risk-averse choices, in the number of prudent choices, and in the number of temperate choices (which are more frequent in the loss treatment), all with p-values smaller than 0.001.
For all risk attitudes in all treatments, the mode is either twelve or zero choices consistent with risk apportionment (risk aversion, prudence, or temperance), meaning fully consistent choices for or against risk apportionment, and in most cases, the second-most-common outcome is twelve choices for the reverse higher-order risk attitude. This consistency is reassuring considering the involved choices subjects need to make when measuring higher-order risk preferences.
The data appear to become noisier as we go from measuring risk aversion, to measuring prudence, to measuring temperance, at least for gains. Taking the distance from six choices for the option satisfying risk apportionment as an ordinal measure of consistency,⁹ we find a negative Spearman rank correlation between the risk order and consistency for 50-50 gains (ρ = −0.186, p-value < 0.001) and small probability gains (ρ = −0.134, p-value 0.001). The small probability gains data are more consistent than the 50-50 gains data: the average distance from random choice is 4.71 for 50-50 gains and 5.05 for small probability gains for second-order tasks (Mann-Whitney U test, p-value 0.059), 4.05 and 4.74 for third-order tasks (p-value 0.008), and 3.68 and 4.36 for fourth-order tasks (p-value 0.015). For losses, we find no relation between consistency and the risk apportionment order of the tasks (ρ = 0.050, p-value 0.177).
[Table note: Asterisks indicate a significant difference from 50% (binomial test) at the **5% and ***1% level.]
The percentage of choices of the risk apportionment option is indicated in table 6 for each task. For most treatment-risk attitude pairs, the majority choice points in the same direction, for or against risk apportionment, across all tasks.
A preference for combining good with bad or good with good leads to three possible combinations of risk attitudes (see section II): those who are risk averse, prudent, and temperate; those who are risk loving, prudent, and intemperate; and those who are risk neutral, prudence neutral, and temperance neutral. At the aggregate level, we do not find evidence in support of a preference for combining good with bad or good with good underlying higher-order risk preferences. In the gain treatments, which induce the most risk aversion, we find the least temperance, and for losses we find imprudence in the aggregate.

B. Individual Behavior, Nonparametric Methods
Within treatments, we can test for the correlation between the various higher-order risk preferences. Spearman correlation coefficients and p-values are reported in table 2 for the three treatments. A preference for combining good with bad or good with good predicts that the number of temperate choices is positively correlated with the number of risk-averse choices, but the number of prudent choices is not. There is a significant correlation between temperance and risk aversion only for the small probability gain treatment, and there it is dwarfed by the correlation between prudence and risk aversion. We also find a significant correlation between prudence and risk aversion for losses.
A caveat to the tests of the correlations in table 2 is that large numbers of indifferent subjects would push any positive correlation toward 0. Many subjects chose the risk-averse, prudent, or temperate option no more than two times (out of twelve) or at least ten times, and the probability of doing so when choosing randomly is slightly less than 4%. We therefore classify subjects who chose that option at least ten times as risk averse, prudent, or temperate, and subjects who chose it no more than two times as risk loving, imprudent, or intemperate. The remaining subjects, who chose the risk-averse, prudent, or temperate option between three and nine times, we classify as risk neutral, prudence neutral, and temperance neutral.¹⁰ The frequencies of types are reported in table 7 in appendix D.
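The "slightly less than 4%" figure can be checked directly from the binomial distribution with twelve tasks and a choice probability of 1/2. The sketch below does so and also encodes the classification thresholds stated in the text; the function names and type labels are hypothetical and serve only this illustration.

```python
from math import comb

N_TASKS = 12  # tasks per risk attitude in a given treatment

def prob_extreme_if_random(n=N_TASKS, low=2, high=10):
    """P(X <= low or X >= high) for X ~ Binomial(n, 1/2), i.e. purely random choice."""
    favourable = sum(comb(n, x) for x in range(n + 1) if x <= low or x >= high)
    return favourable / 2 ** n

def classify(count, low=2, high=10):
    """Label a subject by the number of choices consistent with risk apportionment."""
    if count >= high:
        return "satisfies risk apportionment"
    if count <= low:
        return "opposite attitude"
    return "neutral"

p_extreme = prob_extreme_if_random()
print(f"{p_extreme:.4f}")  # 158/4096, roughly 0.0386
print(classify(11), "|", classify(6), "|", classify(1))
```

A randomly choosing subject therefore lands in one of the two strict categories by chance in fewer than 4 out of 100 cases per risk attitude.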
Using this classification, we also find little evidence that fourth-order risk attitudes depend on second-order risk attitudes. For losses and 50-50 gains, the patterns of temperance do not appear to depend on second-order risk preferences, and a Fisher's exact test (p-values 0.610 and 0.645, respectively) cannot reject that the patterns are the same. The distribution of fourth-order risk preferences does depend on second-order risk preferences for the small probability gain treatment (p-value 0.001), but this is mostly driven by the relatively large number of temperance-neutral subjects among the group of risk-neutral individuals.¹¹

C. Aggregate Behavior, Maximum Likelihood Estimation
The previous results are based on an informal argument that choices at the extremes are unlikely to be the result of a subject choosing randomly between the options. To model this explicitly, we perform maximum likelihood estimation (MLE). We estimate a mixed binomial distribution for each order of risk apportionment, with $\pi_s$ the proportion of decision makers who satisfy risk apportionment for that particular order, $\pi_o$ the proportion of decision makers with the opposite risk attitude, and $\pi_n$ the proportion of neutral or indifferent decision makers.
We allow for types with strict preferences to make errors. In any task, subjects who satisfy risk apportionment have a probability $\eta < 1/2$ of choosing the option that does not indicate risk apportionment, which is allowed to differ from the error rate $\delta < 1/2$ for those who have the opposite preference. We assume that the probability that an indifferent decision maker chooses the option that indicates risk apportionment is equal to 1/2.¹² The probability of choosing the option indicating risk apportionment $x$ out of twelve times can be represented by the following density for a given order,
$$f(x) = \binom{12}{x}\left[\pi_s (1 - \eta)^{x} \eta^{12 - x} + \pi_n \left(\tfrac{1}{2}\right)^{12} + \pi_o\, \delta^{x} (1 - \delta)^{12 - x}\right],$$
with $\pi_s + \pi_n + \pi_o = 1$. The error rates are estimated separately for each higher-order risk preference because the lotteries become increasingly complex as the order of the risk preference increases, and it is important to be able to distinguish risk preferences from a tendency toward random choice. Noisy behavior may also be captured by the proportion of decision makers with neutral risk attitudes: a decision maker who is confused or inattentive may simply choose (almost) randomly. The parameter values, estimated using numerical methods, are presented in table 3.
¹⁰ This classification is consistent with the maximum likelihood estimations of the proportion of each type, which we present below. We also considered the choice patterns of perfectly consistent subjects only. These looked qualitatively similar.
¹¹ When excluding neutral types, the difference is not statistically significant (p-value 0.117). Maximum likelihood estimation also does not indicate significance. See section IVD.
¹² The position (left or right) of the lottery indicating risk apportionment was randomized for each task, so even an indifferent subject who always chooses the left or right option would choose the option indicating risk apportionment with a probability of 0.5.
[Table note: $\pi_s$ indicates the share of subjects satisfying risk apportionment (risk aversion, prudence, and temperance for orders 2, 3, and 4), mistakenly choosing the reverse option with probability $\eta$. $\pi_o$ indicates those with the opposite attitude and error rate $\delta$, and $\pi_n$ the neutral subjects. Underlines indicate the mode.]
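A minimal sketch of the three-type mixed binomial model described in this subsection is given below, assuming the mixture takes the standard form implied by the stated type shares and error rates. The function names, the parameter values, and the choice counts are hypothetical and serve only to show how a log-likelihood could be evaluated; this is not the authors' estimation code, and the numerical maximization itself is left out.

```python
from math import comb, log

N_TASKS = 12  # tasks per risk attitude in a given treatment

def mixture_pmf(x, pi_s, pi_o, eta, delta, n=N_TASKS):
    """Probability of x risk-apportionment choices out of n under the three-type mixture.

    pi_s, pi_o: shares of subjects who satisfy / oppose risk apportionment;
    the neutral share is the remainder. eta and delta are the respective error rates.
    """
    pi_n = 1.0 - pi_s - pi_o
    satisfy = (1 - eta) ** x * eta ** (n - x)       # errs with probability eta per task
    opposite = delta ** x * (1 - delta) ** (n - x)  # errs with probability delta per task
    neutral = 0.5 ** n                              # chooses either option with probability 1/2
    return comb(n, x) * (pi_s * satisfy + pi_n * neutral + pi_o * opposite)

def log_likelihood(counts, pi_s, pi_o, eta, delta):
    """Log-likelihood of a sample of per-subject choice counts."""
    return sum(log(mixture_pmf(x, pi_s, pi_o, eta, delta)) for x in counts)

# Hypothetical per-subject counts (not data from the experiment).
counts = [12, 12, 11, 12, 0, 1, 6, 7, 5, 12, 10, 0]
params = dict(pi_s=0.5, pi_o=0.2, eta=0.05, delta=0.05)

total_mass = sum(mixture_pmf(x, **params) for x in range(N_TASKS + 1))
ll_mixture = log_likelihood(counts, **params)
ll_all_random = log_likelihood(counts, pi_s=0.0, pi_o=0.0, eta=0.05, delta=0.05)
print(f"total probability mass: {total_mass:.6f}")
print(f"log-likelihood, mixture: {ll_mixture:.3f}; all-neutral: {ll_all_random:.3f}")
```

In practice the log-likelihood would be maximized over the shares and error rates with a numerical optimizer, subject to the shares being nonnegative and summing to one and the error rates lying below 1/2.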
The estimated proportions support the results from the nonparametric analysis. For losses, there are slightly more risk lovers than risk averters; there is also much imprudence and much temperance. For 50-50 gains and small probability gains, there is much risk aversion, much prudence, and some intemperance. Likelihood ratio tests, reported in table 4, show that seven of these differences are statistically significant. Only the risk-loving attitude for losses and intemperance for small probability gains are not significantly more common than the opposite attitude; in these cases, we cannot reject the null hypothesis that $\pi_s = \pi_o$ against the alternative hypothesis that $\pi_s \neq \pi_o$.
[Table note: The null hypothesis is that the proportion of those who satisfy risk apportionment relative to those who have the opposite attitude ($\pi_s/\pi_o$) is the same for losses and the specified gain treatment.]
Finally, to test the reflection effect, we test whether the proportion of those who satisfy risk apportionment relative to those who have the opposite preference is the same for losses and gains. Table 5 reports likelihood ratio statistics and p-values for the null hypothesis that $\pi^l_s/\pi^l_o = \pi^g_s/\pi^g_o$, where $\pi^l_s/\pi^l_o$ is the proportion of subjects who satisfy risk apportionment for a given order relative to the proportion of subjects with the opposite attitude for losses, and $\pi^g_s/\pi^g_o$ has the same meaning but for either gain treatment. The results show that, for every order, the greater proportion of risk-loving, imprudent, and temperate (relative to risk-averse, prudent, and intemperate) subjects for losses than for gains is highly significant.

D. Individual Behavior, Maximum Likelihood Estimation
To investigate a preference for combining good with bad or good with good further, we perform a maximum likelihood estimation where we test the frequencies of combinations of second- and fourth-order risk attitudes. Subjects are risk averse and temperate if they have a preference for combining good with bad, risk loving and intemperate if they have a preference for combining good with good, and risk neutral and temperance neutral otherwise. There are nine possible combinations of second- and fourth-order risk attitudes. We denote these different types as $\pi_{a,b}$, $a, b \in \{s, n, o\}$, where $a$ indicates whether the second-order risk attitude satisfies risk apportionment of order 2 (risk aversion), matches the opposite preference (risk-loving preferences), or is neutral toward risk apportionment of order 2 (risk neutral) and $b$ indicates whether the fourth-order risk attitude satisfies fourth-order risk apportionment (temperance), matches the opposite attitude (intemperance), or is neutral (temperance neutral).
We again allow for errors, which differ across types and orders of risk attitudes. A risk-averse subject is assumed to have a probability $\upsilon$ of mistakenly choosing the riskier option, a risk-loving subject has a probability $\theta$ of mistakenly choosing the safer option, and a risk-neutral subject chooses either option with probability 1/2. Temperate subjects are assumed to mistakenly choose the intemperate option with probability $\omega$, intemperate subjects mistakenly choose the temperate option with probability $\zeta$, and temperance-neutral subjects choose either option with probability 1/2. The estimated proportions are presented in figure 3 for losses and in figure 4 for the gain treatments. If subjects have a preference for combining good with bad or good with good, then the symmetric types ($\pi_{s,s}$, $\pi_{n,n}$, $\pi_{o,o}$) should be most common. This means that most of the mass should be on the diagonals from the top left to the bottom right in figures 3 and 4. The data do not show such a pattern. For losses, the most common type in fact combines temperance with a risk-loving attitude. For 50-50 gains, the two most frequent types combine risk aversion with temperance neutrality and risk aversion with intemperance. For small probability gains, the most common type combines risk aversion with temperance, in agreement with a preference for combining good with bad, but the second most common type, with a share of 31% of the subjects, combines risk aversion with intemperance, and the difference in the proportions of these two types is only 1 percentage point.
To formally test the hypothesis of a preference for combining good with bad or good with good for gains and losses, we test whether $\pi_{s,s} + \pi_{o,o} = \pi_{s,o} + \pi_{o,s}$ against the alternative that $\pi_{s,s} + \pi_{o,o} \neq \pi_{s,o} + \pi_{o,s}$ using a (log) likelihood ratio test. We do not include the proportion of neutral types in the test as subjects may be classified as indifferent because they are confused, distracted, or inattentive, and the extent of this may depend on the order of the risk preference measured. Neither for losses (LR 1.11, p-value 0.292), nor for 50-50 gains (LR 2.01, p-value 0.157), nor for small probability gains (LR 0.18, p-value 0.674) can we reject that $\pi_{s,s} + \pi_{o,o} = \pi_{s,o} + \pi_{o,s}$. Thus, we do not find evidence that risk aversion is combined more often with temperance than intemperance or that risk lovers are more likely to be intemperate than temperate for gains or losses.
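The reported p-values can be related to the likelihood ratio statistics under the usual large-sample approximation. The sketch below assumes that each test imposes a single restriction, so that the statistic is compared with a chi-square distribution with one degree of freedom; the text does not state the reference distribution, so this is an assumption, and the function name is hypothetical.

```python
from math import erfc, sqrt

def lr_test_p_value(lr_statistic):
    """Upper-tail probability of a chi-square(1) variable.

    For one degree of freedom, P(chi2 > LR) = erfc(sqrt(LR / 2)).
    """
    return erfc(sqrt(lr_statistic / 2.0))

# LR statistics as reported in the text (rounded to two decimals).
reported_lr = {"losses": 1.11, "50-50 gains": 2.01, "small probability gains": 0.18}
p_values = {name: lr_test_p_value(lr) for name, lr in reported_lr.items()}
for name, p in p_values.items():
    print(f"{name}: LR = {reported_lr[name]:.2f}, p-value = {p:.3f}")
```

Under this assumption the computed p-values agree with the reported ones up to small differences that are consistent with the LR statistics having been rounded to two decimals.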

E. Background Characteristics and Robustness
In this section, we briefly report on tests to investigate the influence of various factors on responses.¹³ First, we investigate the effects of background characteristics. As all subjects participated in a gain treatment and the loss treatment, background characteristics cannot drive the reflection effect we find, but nonetheless it may be interesting to see whether they play a role. Gender in particular has been found to influence risk preferences in various studies (see Croson & Gneezy, 2009, for a review). Other characteristics such as age, nationality, income, or degree program may also play a role, although experimental samples tend to be quite homogeneous, making it difficult to detect such effects. We do not find robust evidence of background characteristics influencing responses. See the online appendix for details.
We also perform two robustness checks testing for session effects and ordering effects. The results of a Kruskal-Wallis test do not indicate that there are important session effects. ANOVA with variables indicating whether subjects were presented with gain or loss tasks first, whether the first block measuring a given risk attitude did so for losses or for gains, and the position (as the first, second, and so on) of each block of questions does not indicate significant ordering effects. The details are provided in the online appendix.
¹³ We thank an anonymous referee for this suggestion.

V. Discussion
The aggregate pattern of risk aversion, prudence, and slight intemperance we find for pure gains in our experiment mirrors earlier findings of Deck and Schlesinger (2010) and Baillon et al. (2017). Deck and Schlesinger (2014), Ebert and Wiesen (2014), and Noussair et al. (2014) also find risk aversion and prudence in the aggregate, but they find moderate temperance rather than intemperance. Deck and Schlesinger (2014) point out that the modest intemperance of Deck and Schlesinger (2010) can be explained if there were unusually many risk lovers in that sample, but they could not verify this explanation because Deck and Schlesinger (2010) collected no information on second-order risk attitudes. The aggregate intemperance in our sample does not appear to be caused by intemperate risk lovers. The number of risk lovers is small in the gain treatments, and we do not find evidence that fourth-order risk preferences are a function of second-order risk preferences. Baillon et al. (2017) also have few risk lovers in their sample while finding intemperance. The respective temperance or intemperance is quite weak at the aggregate level in all studies, suggesting that the observed discrepancies may simply be the consequence of differences in the makeup of samples combined with modest effects from differences in presentation.
Although a preference for combining good with bad or good with good seems to explain higher-order risk preferences for mixed lotteries, our results indicate this is not the case for pure gains or losses. We do not find evidence at the individual level that risk aversion is combined with temperance, and the imprudence we find for losses, as well as the combination of risk aversion and intemperance for gains, is evidence against the hypothesis. The differences between earlier findings and ours may be explained by loss aversion. This is a question that deserves attention in future studies.
We find clear reference dependence of higher-order risk preferences. As per the usual findings, the loss frame induces much more risk-loving preferences. We find that prudence and temperance are also affected: preferences shift from prudence and intemperance under gains to imprudence and temperance under losses. Thus, we have a full reversal of higher-order risk attitudes: risk aversion, prudence, and intemperance under gains and much more risk-loving preferences, imprudence, and temperance under losses. This is consistent with reflection (Kahneman & Tversky, 1979), where risk preferences are reversed under losses. The different findings suggest it should be worthwhile to investigate reference dependence further in the context of higher-order risk preferences.
Our results contrast with those by Deck and Schlesinger (2010) and Maier and Rüger (2012), who find no influence of a loss frame on higher-order risk preferences. Neither of those studies directly controls the reference point, which is needed to separate gains and losses. Deck and Schlesinger (2010) rewrite lotteries so fixed payments are presented as fixed deductions. Maier and Rüger (2012) have subjects return to the lab and lose part of their earlier winnings, but they cannot control what happens between sessions. Whether a reference point is induced successfully is ultimately an empirical question. Reproducing reference dependence of risk aversion demonstrates that outcomes intended to be gains and losses are perceived as such, and neither study does so. 14
Prospect theory (Tversky & Kahneman, 1992) has been suggested as an explanation of findings of higher-order risk preferences (Deck & Schlesinger, 2010). Reference dependence is an important component of prospect theory, which can explain the differences in higher-order risk preferences between the gain and loss treatments. The other important component of prospect theory is inverse-S-shaped probability weighting, which leads to risk-loving behavior for small probability gains. We do not find the predicted behavior: choices in the small probability gain treatment closely resemble those in the 50-50 gain treatment, meaning, in particular, that we observe clear risk aversion. A possible explanation for the observed risk aversion in the small probability gain treatment is that the probability weighting function is convex rather than inverse-S shaped. Convex probability weighting for gains has been found in some experiments, including Van de Kuilen and Wakker (2011). Table 8 in appendix E shows the predicted choices of prospect theory based on the functional form used in Tversky and Kahneman (1992) and the parameter values they estimate. These imply inverse-S probability weighting.
For our tasks, the predictions are the same for all tasks in a given treatment measuring a specific risk attitude. Besides predicting that people are risk loving for our small-probability treatment, prospect theory predicts prudence 15 and intemperance for losses. It predicts prudence for both losses and gains because the prudent option has both a better best and worst outcome than the imprudent option, the probabilities of which are overweighted with inverse-S probability weighting. Convex probability weighting for losses can accommodate the imprudence we find, as this leads to underweighting the worst outcome.
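The logic of these predictions can be illustrated with a minimal sketch of cumulative prospect theory for pure-gain and pure-loss lotteries. It assumes the Tversky and Kahneman (1992) weighting function and their median parameter estimates (0.88 for utility curvature, 2.25 for loss aversion, 0.61 and 0.69 for the weighting of gains and losses). The lotteries, the function names, and the convex weighting function p squared are hypothetical choices made only for illustration; they are not the tasks used in the experiment and do not reproduce Table 8.

```python
# Sketch of cumulative prospect theory for pure-gain or pure-loss lotteries.
# Parameters: median estimates of Tversky and Kahneman (1992).
# All lotteries below are hypothetical, not the experimental stimuli.

ALPHA = 0.88       # curvature of the value function
LAMBDA = 2.25      # loss aversion
GAMMA_GAIN = 0.61  # probability weighting parameter, gains
GAMMA_LOSS = 0.69  # probability weighting parameter, losses


def tk_weight(p, gamma):
    """Inverse-S weighting: w(p) = p^g / (p^g + (1 - p)^g)^(1/g)."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def value(x, linear=False):
    """Value of outcome x relative to a reference point of zero."""
    if linear:
        return float(x)
    return x ** ALPHA if x >= 0 else -LAMBDA * (-x) ** ALPHA


def cpt_pure(lottery, linear=False, weight=None):
    """CPT value of a list of (outcome, probability) pairs, all gains or all losses."""
    all_gains = all(x >= 0 for x, _ in lottery)
    all_losses = all(x <= 0 for x, _ in lottery)
    if not (all_gains or all_losses):
        raise ValueError("mixed lotteries are not handled in this sketch")
    if weight is None:
        gamma = GAMMA_GAIN if all_gains else GAMMA_LOSS
        weight = lambda p: tk_weight(p, gamma)
    # Cumulate probabilities from the most extreme outcome inward
    # (largest gain first, or largest loss first).
    ranked = sorted(lottery, key=lambda xp: abs(xp[0]), reverse=True)
    total, cum, prev = 0.0, 0.0, 0.0
    for x, p in ranked:
        cum = min(cum + p, 1.0)
        w = weight(cum)
        total += (w - prev) * value(x, linear)  # decision weight times value
        prev = w
    return total


# Small-probability gain versus its expected value for sure (hypothetical).
long_shot = [(10, 0.1), (0, 0.9)]
sure = [(1, 1.0)]

# Prudence pair in the loss domain (hypothetical): a zero-mean risk of +/-2
# is attached to the better state (prudent) or to the worse state (imprudent).
prudent_loss = [(-8, 0.5), (-2, 0.25), (-6, 0.25)]
imprudent_loss = [(-4, 0.5), (-6, 0.25), (-10, 0.25)]


def convex(p):
    """Hypothetical convex weighting function, for contrast only."""
    return p ** 2


print("inverse-S prefers long shot:", cpt_pure(long_shot) > cpt_pure(sure))
print("convex prefers long shot:",
      cpt_pure(long_shot, weight=convex) > cpt_pure(sure, weight=convex))
print("inverse-S prefers prudent option (linear utility):",
      cpt_pure(prudent_loss, linear=True) > cpt_pure(imprudent_loss, linear=True))
```

In this sketch, inverse-S weighting overweights the 10% chance of the large gain, so the lottery is valued above its expected value, whereas the illustrative convex function underweights it. For the loss pair, the comparison is made with linear utility to show that the ranking is driven by probability weighting rather than by utility curvature.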
As empirical findings point to a fourfold pattern of risk attitudes, it would have been interesting to include a small probability loss treatment. We did not do this for practical considerations. Another interesting extension would be to test for the effects of different endowments. We leave this for future work.
14 Using hypothetical choices, Attema, l'Haridon, and Van de Kuilen (2019) find risk aversion and prudence for gains and risk neutrality and prudence neutrality for losses. Brunette and Jacob (2019) claim to find imprudence for losses, but their results do not support this. Their loss tasks are the negative of their gain tasks, but this reverses which option is prudent. Hence, their modal preference is prudence for both gains and losses. They also pay out two tasks, one of which may involve gains, so prospects are mixed rather than pure losses, and they do not reproduce reference dependence of risk aversion.
15 The utility function that Tversky and Kahneman (1992) use is prudent (under expected utility), but this does not drive the results: with linear utility, the same predictions follow.
Our results indicate that it is important for policy recommendations to consider whether one is measuring higher-order risk preferences for gains, losses, or both. For example, when investigating how demand for insurance with nonperformance risk (a type of probabilistic insurance) relates to prudence, it will be important to measure prudence for losses, the domain in which insurance decisions are taken. The imprudence we find for losses indicates that demand for probabilistic insurance may be greater than expected based on the prudence found for gains, and policies aimed at reducing nonperformance risk (such as increasing trust in insurer solvency) may not increase demand by as much as expected based on an assumption of prudent consumers.
In a similar vein, prudence reduces the attractiveness of prevention efforts, which reduce but do not take away entirely the probability of a loss. Imprudence means that decision makers undertake more prevention efforts than predicted based on prudence found for gains. This suggests opportunities for nudging; for example, by framing outcomes as losses, people may be more willing to engage in action to reduce the probability of catastrophic climate change.
The influence of background risks on risk-taking behavior may also interact with the domain in unexpected ways: a financial crisis may at the same time lead to increased volatility of stocks and shift the domain of investment decisions of people with significant wealth tied to stocks to losses. Our results indicate this would lead to greater temperance, which means such people would become more risk averse and may, for example, respond by purchasing insurance against unrelated losses (such as health insurance) or by making more defensive investment decisions even for assets unaffected by the increased volatility of stocks.

VI. Conclusion
It is well established that second-order risk attitudes for gains are the mirror image of those for losses. In an experiment, we observe the reflection effect also for higher-order risk preferences. We find prudence for gains but imprudence for losses and intemperance for gains but temperance for losses. This reflection affects the external validity of higher-order risk preferences measured under gains only and has behavioral implications when choices involve losses.
The recent literature has found evidence for the hypothesis that higher-order risk preferences are generated by a preference for combining good with bad or good with good. Such a preference implies risk aversion should be combined with temperance and that all decision makers should be prudent. We find that correlations between the number of risk-averse choices and the number of temperate choices are small and insignificant. Furthermore, the imprudence we find as the majority preference for losses, and the simultaneous preference for risk aversion and intemperance on the aggregate level in the gain domain, are at odds with this hypothesis for pure gains and pure losses.