Abstract
While causal models are gaining popularity in science studies due to their explanatory power over certain phenomena, they are often misused, creating a false sense of confidence. This letter discusses the limitations of such models through an internal and an external critique, showing some of the risks they entail.
1. INTRODUCTION
In recent years, causal graph models, introduced by Pearl (2009), have taken an increasingly prominent position in the study of inequalities in science (Moffitt, 2005; Traag, 2021; Traag & Waltman, 2022; Woodward, 2007). They are regarded by some as the only way to disentangle the essential from the concomitant (Pearl & Mackenzie, 2018). By showing some limitations of this framework, I aim to warn against treating causal models as the sole option for the study of inequalities, and to show the dangers of their misuse.
By construction, within the context of inequalities in science, causal models aim to answer a specific type of question: Is there enough evidence to prove direct discrimination? This question can only be answered within the limits of the data set in use and after controlling for all relevant confounding factors. This way of posing the question reflects an underlying worldview of the phenomena under study. It establishes a directionality in the burden of proof, and it delimits what can be known from what falls beyond the scientific discussion. Both aspects can be considered external critiques. But first, I will examine the internal consistency of causal models for the study of inequalities in science.
2. INTERNAL CRITIQUE
As with any parametric statistical model, causal models rely on a series of assumptions. Among these, the completeness of the model is crucial (Pearl, 2009). No variable relevant to explaining the phenomena should be missing, or at least we should expect the omitted variables to cancel each other out on average, so that they can be treated as noise. If the model is incomplete, there is no way to measure the degree to which the conclusions of the ill-defined causal model are valid.
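The consequence of an incomplete model can be sketched with a small simulation. This is only an illustration under assumptions that are not part of the argument above: a linear data-generating process, a single unobserved factor `u` (hypothetically, something like accumulated prestige), and coefficients chosen arbitrarily. The observed variable `x` has no effect on the outcome `y`, yet a regression that omits `u` attributes an effect to it.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


def residualize(v, z):
    """Residuals of v after a simple least-squares regression on z."""
    slope = ols_slope(z, v)
    mean_v = sum(v) / len(v)
    mean_z = sum(z) / len(z)
    return [vi - mean_v - slope * (zi - mean_z) for vi, zi in zip(v, z)]


def simulate(n=20_000, true_effect=0.0, seed=0):
    """Return (naive, adjusted) estimates of the effect of x on y."""
    rng = random.Random(seed)
    # u: factor that the analyst cannot observe (assumed, illustrative).
    u = [rng.gauss(0, 1) for _ in range(n)]
    # x: observed variable of interest, correlated with u.
    x = [0.8 * ui + rng.gauss(0, 1) for ui in u]
    # y: outcome, driven by u; x matters only through true_effect.
    y = [true_effect * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    # Incomplete model: u is omitted.
    naive = ols_slope(x, y)
    # Complete model: partial out u from both x and y, then regress
    # (equivalent to the coefficient on x in a regression on x and u).
    adjusted = ols_slope(residualize(x, u), residualize(y, u))
    return naive, adjusted


naive, adjusted = simulate()
print(f"estimate omitting u:    {naive:.3f}")
print(f"estimate adjusting for u: {adjusted:.3f}")
```

In this sketch the adjustment is only possible because `u` was generated by the simulation; with real data, the analyst sees the first estimate alone and has no means of knowing how far it is from the second.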
The assumption of completeness is impossible to operationalize for two main reasons. The first is a technical limitation. Although we can measure, or at least conceptualize a metric for, many of the analytical categories involved in inequalities in science, a comprehensive data set that allows us to work with all those variables at the same time does not exist, and will not exist in the foreseeable future. The second limitation is deeper: many elements, such as prestige or habitus, are simply not directly measurable, yet they remain a relevant part of structural inequalities that needs to be accounted for qualitatively and/or theoretically. The intertwined nature of inequalities in science implies that unobserved factors have a multiplying effect rather than cancelling each other out. The lack of completeness is a problem that affects all statistical models, not just causal analysis. Although causal graphs make assumptions more explicit than other regression methods, this limitation persists and calls for complementary approaches.
A second internal problem is that causal theory is built around causal structures that can be defined as directed acyclic graphs (Pearl & Mackenzie, 2018). Yet the seminal work on inequalities in science, the Matthew effect (Merton, 1968), already implies a cyclical mechanism in which old citations drive new citations. The cumulative nature of inequalities implies both that inequalities cannot be conceived as directed acyclic graphs, and that any small (and possibly statistically undetectable) effect can eventually add up to a large difference over time. In this sense, history presents a dual challenge for causal models. First, cumulative effects are associated with cyclic graphs, which cannot be properly modeled. Second, the mechanisms through which systemic inequalities take form also evolve over time, adding further complexity to the modeling process.
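The cumulative mechanism can be illustrated with a deliberately simple sketch. It assumes, purely for illustration, two researchers who start with the same number of citations and whose new citations in each round are proportional to the citations they already hold; the growth `rate` and the small relative `edge` given to researcher A are arbitrary values, not estimates from any data set.

```python
def citation_ratio(rounds, rate=0.10, edge=0.05):
    """Ratio of A's to B's citations after each round of cumulative advantage.

    Assumed toy model: new citations are proportional to existing ones,
    and A's growth rate is larger than B's by the relative factor `edge`.
    """
    a = b = 10.0  # identical starting point (arbitrary)
    ratios = []
    for _ in range(rounds):
        a += rate * (1.0 + edge) * a  # old citations drive new citations
        b += rate * b
        ratios.append(a / b)
    return ratios


ratios = citation_ratio(100)
print(f"gap after 1 round:    {ratios[0] - 1:.2%}")
print(f"gap after 100 rounds: {ratios[-1] - 1:.2%}")
```

Under these assumptions the gap after a single round is below half a percent, a difference that would be hard to detect in any one cross-section, while after a hundred rounds it exceeds fifty percent.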
Alternatively, quasi-experimental designs can be defined for very specific case studies where it can be ensured that only the variables of interest are in play. This scenario, however, limits the external validity of the results. In this sense, it shares some common ground with case studies and qualitative analysis, which can provide valuable knowledge within their own case, but whose conclusions should not be considered valid for the population as a whole.
3. EXTERNAL CRITIQUE
One of the main harms that causal theory does to the discussion of inequalities is that it redirects its focus. Fairness, disparity, and bias need to be defined as independent concepts (Traag & Waltman, 2022), and in many cases the questions deemed valid are focused only on bias (Cruz-Castro & Sanz-Menéndez, 2021; van den Besselaar, Mom et al., 2020). The historical forms of discrimination (by race, gender, nationality, or any other form) change over time, and the lack of explicit discrimination does not imply an absence of systemic injustice. Social inequality is a historical process, and as Traag and Waltman (2022) acknowledge, the legacy of past discrimination is a central element in understanding current inequalities. Moreover, individual choices (in terms of fields, research topics, career path, work-life balance, etc.) are not made in a vacuum, but can be explained as part of the system (Larregue & Bourihane, 2024). Individuals’ choices are endogenous to the system, as they are based on their past and present material possibilities, their cultural and academic capital, and the stereotypes they are subjected to (Bourdieu, 1975). Causal models can be useful for the definition of specific policy interventions, but if we do not acknowledge their limitations, they can be used to funnel the discussion towards the existence, or absence, of direct discrimination, which is only a limited part of the overarching problem.
4. EXAMPLE
A recent article published in QSS (Cruz-Castro & Sanz-Menéndez, 2023) proposed the following definition:
by gender disparity we understand a difference in the outcome of interest between male and female¹ applicants; whereas by gender bias, we understand any difference between male and female applicants that is directly causally affected (and directly measured) by their gender; a gender disparity may be the result of an indirect causal pathway from someone’s gender to a particular outcome and may be affected by differences in merit, but a gender bias is a direct causal effect of the action of reviewers.
The main problem that our society faces nowadays is the cumulative nature of inequalities (within the life cycle and intergenerationally) rather than the direct discrimination that an individual faces at one specific moment of their life. Therefore, the work most needed in this area is work that focuses on the systemic aspect rather than on the explicit discrimination that could be measured with a causal framework.
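The distinction drawn in the definition quoted above can be made concrete with a toy calculation. It assumes a hypothetical linear model, not taken from the cited article, in which group membership can affect an outcome directly (the "bias") and indirectly through an accumulated record; all names and coefficients are illustrative.

```python
def expected_outcome(group, direct_effect, effect_on_record, record_weight,
                     baseline_record=1.0):
    """Expected outcome for a group (0 or 1) in a hypothetical linear model."""
    # Indirect pathway: group membership shifts the accumulated record.
    record = baseline_record + effect_on_record * group
    # Direct pathway: group membership enters the evaluation itself.
    return direct_effect * group + record_weight * record


def disparity(direct_effect, effect_on_record, record_weight):
    """Difference in expected outcome between group 1 and group 0."""
    return (expected_outcome(1, direct_effect, effect_on_record, record_weight)
            - expected_outcome(0, direct_effect, effect_on_record, record_weight))


direct = 0.0  # no direct effect of group membership on the evaluation
total = disparity(direct_effect=direct, effect_on_record=-0.5, record_weight=2.0)
print(f"direct effect (bias): {direct}")
print(f"total disparity:      {total}")
```

With these values the direct effect is zero while the disparity is not: an analysis that asks only about the direct pathway would report no bias, and the entire difference would sit in the pathway that carries the cumulative, systemic component.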
5. CONCLUSION
In a recent critical self-reflection on his field, the Nobel Prize winner in economics Angus Deaton observed that
The currently approved methods, randomized controlled trials, differences in differences, or regression discontinuity designs, have the effect of focusing attention on local effects, and away from potentially important but slow-acting mechanisms that operate with long and variable lags. Historians, who understand about contingency and about multiple and multidirectional causality, often do a better job than economists of identifying important mechanisms that are plausible, interesting, and worth thinking about, even if they do not meet the inferential standards of contemporary applied economics. (Deaton, 2024)
ACKNOWLEDGMENTS
I want to thank Yotam Sofer, Carolina Pradier, and Natsumi S. Shokida for their valuable comments and suggestions. This work was made possible by the support of the SSHRC.
COMPETING INTERESTS
The author has no competing interests.
FUNDING INFORMATION
This project was funded by the Social Sciences and Humanities Research Council of Canada Pan-Canadian Knowledge Access Initiative Grant (Grant 1007-2023-0001), and the Fonds de recherche du Québec-Société et Culture through the Programme d’appui aux Chaires UNESCO (Grant 338828).
Note
While causal analysis is often presented as a more rigorous approach to the discussion, the misuse of terminology (using male and female instead of men and women to refer to gender) is also telling of a lack of conceptual rigor when discussing gender inequality.
REFERENCES
Author notes
Handling Editor: Gemma Derrick