Symposium — Leaving Theory Behind: Why Simplistic Hypothesis Testing is Bad for IR

7 September 2013, 0915 EDT

Editor’s Note: This is a guest post by John J. Mearsheimer and Stephen M. Walt. It is the third installment in our “End of IR Theory” companion symposium for the special issue of the European Journal of International Relations. SAGE has temporarily ungated all of the articles in that issue. This post refers to the article of the same name (PDF). A response, authored by Dan Reiter, will appear at 10am Eastern.

Other entries in the symposium, when available, may be reached via the “EJIR Special Issue Symposium” tag.

Theory is the lodestone in the field of International Relations (IR). Its theorists are the field’s most prestigious scholars, and the books and articles that dominate the study of IR are all theory-laden works. Yet IR is moving away from developing or carefully employing theories and instead emphasizing “simplistic hypothesis testing.” Theory plays a minor role in this enterprise, where most of the effort is devoted to collecting data and testing empirical propositions.

Unfortunately, deemphasizing theory and privileging hypothesis testing is a misstep: this approach is less likely to produce important new knowledge about international politics. Although testing hypotheses is an essential component of social science, the creation and refinement of theory is the most important activity in any field of study. Because the world is infinitely complex, we need mental maps to identify what is important in different domains. In particular, we need theories to identify the causal mechanisms that explain recurring behavior and show how these mechanisms relate to each other.

Theories are simplified pictures of reality. They provide general explanations that apply across space and time. Although theories require simplification and abstraction, their component parts must still refer to entities and processes that exist in the real world. Even if they are not directly observable, the assumptions and causal mechanisms that underpin a theory must be a reasonable approximation of reality.

Theories are essential because they provide an overarching framework—the big picture—for a specific domain. Novel theories can revolutionize our understanding of the world—as Darwin’s theory of evolution did—and theories allow us to predict the consequences of different actions. Thus, theory is essential for diagnosing policy problems, making policy decisions, and evaluating policy outcomes. Theories help us to look at the past in different ways, and they are especially valuable when dealing with new situations or when facts are sparse. Finally, theory is essential for conducting valid empirical tests; hypothesis tests that are not guided by a sophisticated understanding of theory are unlikely to produce useful cumulative knowledge.

Social science requires both developing and testing theory; the challenge is to find the optimal balance between these two activities. Unfortunately, in recent years the balance in IR has shifted away from theory and toward simplistic hypothesis testing, to the detriment of the field.

Simplistic hypothesis testing begins by choosing a particular phenomenon (the dependent variable), which is often a familiar topic like war, alliance behavior, human rights performance, etc. The next step is to identify one or more independent variables that might account for significant variation in the dependent variable. The researcher then selects or compiles data sets containing measures of the independent, dependent, and possible control variables. Finally, the various hypotheses are tested against each other, using appropriate methodological techniques to deal with potential sources of bias. The desired result is one or more well-verified hypotheses, which hopefully will become part of a growing body of knowledge about international behavior.
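The workflow just described can be sketched in a few lines of Python. The variable names (`trade_ratio`, `dispute`) and the data are invented for illustration; real studies would use compiled data sets and more elaborate estimators, but the basic logic is the same: pick a dependent and an independent variable, assemble data, estimate a coefficient, and check its statistical significance.

```python
import math
import random

random.seed(42)

# Steps 1-2: choose a dependent variable ("dispute") and an independent
# variable ("trade_ratio"). Both names and the data-generating process
# below are hypothetical, invented purely for illustration.
n = 500
trade_ratio = [random.random() for _ in range(n)]
# Synthetic process: more trade, fewer disputes, plus noise.
dispute = [1.0 - 0.5 * x + random.gauss(0, 0.3) for x in trade_ratio]

# Steps 3-4: estimate a bivariate OLS slope and its t-statistic.
mean_x = sum(trade_ratio) / n
mean_y = sum(dispute) / n
sxx = sum((x - mean_x) ** 2 for x in trade_ratio)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(trade_ratio, dispute))
beta = sxy / sxx                   # estimated effect of the IV on the DV
alpha = mean_y - beta * mean_x     # intercept
residuals = [y - (alpha + beta * x) for x, y in zip(trade_ratio, dispute)]
sigma2 = sum(e ** 2 for e in residuals) / (n - 2)
se_beta = math.sqrt(sigma2 / sxx)
t_stat = beta / se_beta            # |t| > ~2 is read as "significant"

print(f"beta = {beta:.3f}, t = {t_stat:.2f}")
```

The point of the sketch is how little theory it requires: nothing in the procedure asks *why* the independent variable should cause the dependent variable, only whether the estimated relationship clears a significance threshold.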

For the most part, contemporary hypothesis-testers are not engaged in pure induction, insofar as the hypotheses under study are sometimes drawn from earlier theoretical works. Nonetheless, theory plays a modest role in much of their work. In particular, little attention is paid to explaining how or why a particular independent variable might cause the dependent variable. Nor is much effort devoted to devising a general explanation for the observed results. Instead, the main effort is on finding statistically significant relationships within the data.

Unfortunately, this approach to research leads to several important problems.

First, if scholars employ statistical models that do not conform to the underlying theory, then the hypothesis tests will not produce meaningful results. Second, valid hypothesis tests require data that correspond to the underlying concepts being studied. Insufficient attention to theory can lead researchers to employ indicators or measures that do not capture the concepts being “tested.” Third, privileging hypothesis testing is also unwise given the low quality of much of the data in the IR field. Fourth, by themselves, hypothesis tests cannot explain the observed results or tell us how different hypotheses fit together. Finally, lack of attention to theory inhibits the cumulation of knowledge. Lacking a common theoretical framework, simplistic hypothesis testers often use different models to study the same phenomena, define key variables in different ways, and employ different data sets and estimation procedures. Competing findings keep piling up, but consensus and cumulation remain elusive.

Why is the IR field headed in this direction? The main reason is the prevailing incentive structure in academia. For starters, Ph.D. programs tend to emphasize hypothesis testing rather than theory, simply because there is no program of study that can reliably create imaginative theorists. But any graduate program can train students how to test hypotheses, so that is what they do.

The prevalence of simplistic hypothesis testing also reflects the professionalization of academia itself. Established professions often employ esoteric vocabulary and arcane techniques in order to highlight their “specialized knowledge.” They also tend to adopt simple and seemingly objective ways to evaluate different members of their profession. Taken together, these features encourage academics to employ rarefied methodological techniques that are hard for outsiders to understand and to evaluate each other using citation counts or other basic metrics. Simplistic hypothesis testing is ideal from this perspective, because it can generate lots of publications with a minimum of risk. Doing theory is inherently chancier, because one can work diligently for many years, never make a significant conceptual advance, and thus have little to show for one’s efforts.

Diminished attention to theory is likely to widen the already-wide gulf between academia and the policy world. Many hypothesis testers are not interested in policy issues, and neglecting theory will leave the IR field less able to help policymakers understand the “big picture” or formulate effective policy responses. This trend risks making the academic study of IR even less relevant to understanding and solving important real-world problems.

The present trajectory could be reversed if enough IR scholars decided theory should be restored to its proper place. Such an epiphany is unlikely, however, due to the powerful professional incentives encouraging simplistic hypothesis testing. This situation does not augur well for the IR field. Without good theories, we cannot trust our empirical findings, and we cannot even make sense of all the provisional hypotheses that scholars keep generating. There are many roads to better theory, but that should be the ultimate destination.