John Mearsheimer and Stephen Walt have written a piece that is critical of the supposed move to hypothesis testing and the failure of IR folks to do grand theory. I have so many reactions to this development that I thought I would engage in a bit of a listicle:
- My first reaction was: Next title: why too much research is bad for IR….
- As folks pointed out on Twitter and in Facebook discussions, it seems ironic at the least that someone who made a variety of testable predictions that did not come true (the rise of Germany after the end of the Cold War, conventional deterrence, the irrelevance of international institutions, etc.) would suggest that testing our hypotheses is over-rated or over-done.
- When I was preparing for my comprehensive exams long ago, I worked with a member of my cohort who was a Political Theorist just trying to get through the process. He would just read everything and ask “what would Ken Waltz think of this?” Well, invoking the WWWD mantra here, I think he might wonder why M&W are writing this stuff when they could be producing yet more Grand Theory or more Grand Theorists. Waltz produced Walt after all ….
- Which leads to the next question: what does this complaint say about their students? Either they failed their students (their students did not learn to do good grand theory) or the students have failed them (their students have focused on stuff other than grand theory). I know a good number of their students, and their work is often quite terrific and influential, so I am confused.
- If M&W have failed to re-generate themselves, it could be because they and their generation of grand theorists have answered all of the big questions, leaving us with the small questions and the dirty work of testing hypotheses. Perhaps they should be happy that their work is done and ride off into the sunset?
- Perhaps the utility M&W really seek to maximize is citations (given what they say at the end of the piece, I guess I am wrong here…). I became convinced in the early 1990s that producing controversial work seemed to be more important than producing convincing work. Mearsheimer’s piece blasting the “False Promise of Institutions” seemed to be citation-bait to me. Similarly, Walt’s article finding fault with the move towards formal theory seemed aimed not so much at convincing people but at attracting counter-attacks. [It is interesting that their latest piece cites approvingly the Fearon 1995 IO piece on Rationalist Explanations for War that Walt considered old wine in new bottles way back when].
What really frustrates me is that their claims make them bad realists and make me a Marxist. How so?
As realists, they think that power matters greatly in international relations–determining not just outcomes but interests. But M&W seem to ignore the role of power in the Political Science profession and especially in the IR economy. Who controls the commanding heights of the IR profession? Those who run the major journals. Those that serve as editors of series at the major presses. Those that run or influence the major fellowship-granting, post-doc giving institutions. Those that work at the most prestigious institutions and thus have access to the smartest students, to the largest endowments, to the media, and to the policy world. Mearsheimer and Walt seem to forget that they are among the most powerful and influential figures in our profession, yet they often feel so oppressed that they support the Perestroika movement and complain when institutions do not hire their students.*
The important thing to keep in mind is that M&W have a great deal of power in the profession, that folks have often feared their ire, and yet they feel as if the field is passing them by, focusing more on medium range theory (which can be tested) rather than grand theory (which can always be rationalized). In their article, they assert that theory is downgraded, and I would respond that grand theory may be a bit passé at the moment but there is heaps of theory out there. Because Realism is so indeterminate, because Liberalism is such a broad school (of which I think I am a member) and because constructivism is not really a paradigm with a shared core set of logics, we need more theory, not less, to develop clear expectations that can be subjected to tests. I think, ultimately, the complaint here is not about theory vs hypothesis testing but grand theory versus everything else (just like Walt’s complaint about formal theory conflated rational choice theory with formal modeling).
In terms of specific gripes about their piece,
- they avow that they are scientific realists (as opposed to instrumentalists) in terms of epistemology–that assumptions must be realistic, not useful fictions. Ok, fine, and rational actor assumptions for individuals can be considered to be realistic, but rational actor assumptions for states? Probably closer to useful fictions than something that can “be shown to be right or wrong.” So, I am confused.
- I am confused why they digress into the scientific realism vs instrumentalism epistemology discussion at all since it does not really connect directly to their complaint about theory vs. hypothesis testing. I can imagine a two by two where we have the four combinations of work: scientific realism and theory, SR and hypothesis testing, instrumentalism and theory, and instrumentalism and hypothesis testing and that each box is chock full of IR articles/books.
- Instrumentalists do not believe that process tracing is a useful way to test theories? Oh, there is where this stuff matters. They are going to argue that process tracing is on the wane? I am really confused now because there is plenty of work that “tests hypotheses” via process tracing–do the causal processes avowed in the theory play out in reality as we trace the course of events? They cite the TRIP report stuff a lot in this piece (just as Satan can quote scripture), but it is not clear that the real numbers on the work being done today show that process tracing is less in style than before.
- They go on to list the virtues of theory. Which is cool and useful and completely familiar to anyone who graduated from a PhD program in the past forty years or so.
- Their discussion of “hypothesis testing” seems pretty insulting to me. In their telling, people have questions about a phenomenon, such as war, then they identify variables that might be relevant, then they identify a dataset, and then there is testing….. I love that this starts with “At the risk of caricature.” It is a caricature. Yes, some people start with problems as opposed to starting with a grand theory they want to play with. But their choice of variables is not just reaching into a random bag of variables (well, mostly), but considering the existing theory, extending/developing/inventing one’s own theory, and then indeed testing.
- This contrast they set up reminds me of something that actually does exist or did way back when–that there were two schools of IR–coastal (Chicago is on a lake so it counts as coastal) and midwestern with the former focused on generating theory and the latter focused on testing theory. There was something to this distinction, but over the years, the coastal folks have gotten more interested in the construction of datasets to test their hypotheses (note that David Lake and the rest of the bandits at UCSD are on a coast) and the midwestern folks who used to reach into the Correlates of War dataset to assess whether there were correlates … of war are now doing some very interesting theoretical work that they then test.
- They assert that the hypothesis testers are not focused on the microfoundations of the work: “little intellectual effort is devoted to creating or refining theory; i.e., to identifying the microfoundations and causal logics that underpin the different hypotheses. Nor is much effort devoted to determining how different hypotheses relate to one another.” Really?
- I love the fact that they use Fearon to attack Huth and Russett when Fearon would appear to be just as guilty of being an instrumentalist hack at times (see his work on Insurgency and Civil War that has been most influential) under the M&W definitions of instrumentalist hackery.
- This article here is reminiscent of other stuff I have seen that attacks quantitative work but calls it something else. Yes, the data is not great, but there are problems with process tracing as well. Indeed, if we find correlations despite shaky data, it might mean the relationships are actually that much more convincing.
- Lack of cumulation as a problem for the hypothesis testers? It seems to me that the grand theory debates of the 1980s and 1990s had limited cumulation. After all, Mearsheimer was revising Realism back to Morgenthau with his focus on the quest for power rather than security. How is that cumulative?
- Citing the democratic peace stuff here is kind of funny since that finding led to a heap of theoretical competition–each new entrant into this debate was compelled by the existence of a core empirical finding to develop distinct theories and then test them in new ways. This was not hypothesis testing of adding just a few variables but really thinking about the causal connections between democracy and peace.
- They blame the expansion of IR PhD programs for this focus on hypothesis testing. Any institution can train students in methodology but so few can attract the brains who can think creatively and theoretically. It would be tempting to point out that one can see many amazingly sharp people who have creative juices aflowing who were so poorly trained that they could not articulate a research design, but I shall refrain.
- Not sure that this is true: “privileging hypothesis testing creates more demand for empirical work and thus for additional researchers.” This is so anti-Moneyball–that if we do go too far down a hypothesis testing path, wouldn’t the folks who are the grand theorists be that much more rare and thus special and appreciated? I do think our market is a bit self-correcting (again, they make me a Marxist, damn it) as the fetish for formal modeling has worn off, the fad in the most high tech methods has run its course, and now we have a fad of experiments.
- I love one of the very last lamentations in the piece: “Instead of relying on “old boy” networks, a professionalized field will use indicators of merit that appear to be impersonal and universal. In the academy, this tendency leads to the use of “objective” criteria—such as citation counts—when making hiring and promotion decisions.” Are they actually saying that old boy networks are better than trying to use more objective criteria? That citation counts are causing the move away from grand theory? Sure, the person who invents a great dataset gets heaps of citations, but so do the folks who come up with some great theory. Or at least folks who come up with theories/articles that people pay attention to (perhaps they should do some google scholar searches to see who is getting the most cites–the theorists or the folks they would accuse of being inductive hypothesis testers–but that would involve … dare I say it .. hypothesis testing). As someone who has lamented about being Rudolph, left out of the reindeer games, I can say this: the sooner we come up with objective criteria instead of relying on old boy** networks, the better.
- “What matters is one’s citation count, not helping outsiders understand important policy issues.” Um, how does grand theory help outsiders understand important policy issues better than middle range theory?
- “Academic disciplines are socially constructed and self-policing; if enough IR scholars thought the present approach was not working, they could reverse the present trajectory.” Ah, realism does not apply to the discipline because it has to do with persuasion and not power.
- “Emphasizing quality over quantity in a scholar’s portfolio might help.” They apparently do not follow Political Science Job Rumors, which focuses oh so much on the importance of getting into APSR, IO, and the top presses. I wonder if you did some data collection and assessed who was getting tenure where, would it be about citation counts or hits in the top outlets or neither? If we just let the old boy network sort this out, I am sure things would be fine.
- Great conclusion: “The study of IR should be approached with humility.” Indeed. Perhaps this piece is not aimed at maxing citations but unintentional comedy?
Most folks are still doing their own thing–some quant, some qual, some realist, some not, some focused more on generating theory and some more focused on testing competing theories. Our discipline is actually a pretty big tent. As a result, one’s view of it depends on where one sits at the circus–people notice the folks who are unlike themselves, with lots of confirmation bias affirming their sense of minority status. One can always feel like an outsider even when one is standing astride some of the most important institutions in the profession.
*This is not just a realist thing. I remember seeing Peter Katzenstein complain at an APSA or ISA about how oppressed constructivists were at a time when they were getting the best jobs and the best post-docs, and getting their books and articles published in the best outlets.
** Oh, and, note while they probably meant nothing by the “boy” in “old boy”, one might ponder whether old boy networks might just be a wee bit sexist.