Tag: methods

Interested in critical international politics?

The Gregynog Ideas Lab, a thinking space for scholars interested in studying global politics from a range of critical, postcolonial, feminist, post-structural and psychoanalytic traditions, takes place every summer at Gregynog Hall in mid-Wales (UK).

This unusual summer school offers a set of seminars & workshops, an artist-in-residence, methods training and one-on-one consultations to allow graduate students and established scholars to re-examine their own work, participate in ongoing conversations and meet new people who share an interest in critical international politics. Participants – both guest professors and students – come from various corners of the world and it is above all the informal and open atmosphere that is valued by all.



Labels and tribes

In the Matrix, it’s trivial to specify the underlying
data-generating process. It involves kung fu.

Given PTJ’s post, I wanted to clarify two points from my original post on Big Data and the ensuing comment thread.

I use quantitative methods in my own work. I’ve invested a lot of time and a lot of money in learning statistics. I like statistics! I think that the development of statistical techniques for specifying and disciplining our analytic approach to uncertainty is the most important development in social science of the past 100 years. My objection in the comments thread, then, was not to the use of statistics for inference. I’m cautious about our ability to recover causal linkages from observational data, but no more so than, say, Edward Leamer–or, for that matter, Jeffrey Wooldridge, who wrote the first econometrics textbook I read.

My objection instead is to the simple term “inferential statistics,” because the use of that term to describe certain statistical models, as opposed to the application of statistical models to theoretically-driven inquiry, often betrays an unconscious acceptance of a set of claims that are logically untenable. The normal opposition is of “inferential” to “descriptive” statistics, but there is nothing inherently inferential about the logistic regression model. Indeed, in two of the most famous applications of handy models (Gauss’s use of least-squares regression to plot asteroid orbits and von Bortkiewicz’s fitting of a Poisson distribution to data about horses kicking Prussian officers), there is no inference whatsoever being done; instead, the models are simply descriptions of a given dataset. More formally, then, it is not the case that “inferential” describes a property of statistical models; rather, the term should be taken strictly to refer to their use. What is doing the inferential work is the specification of parameters, which is why it is sometimes entirely appropriate to have a knock-down fight over whether a zero-inflated negative binomial or a regular Poisson is the best fit for a given test of a given theory.
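To see how a model can be purely descriptive, here is a minimal Python sketch using von Bortkiewicz’s classic published frequencies (deaths per corps-year). The Poisson rate is estimated from the data, and the fitted distribution simply summarizes the dataset; nothing beyond the 200 corps-years is being inferred.

```python
import math

# von Bortkiewicz's horse-kick data: number of corps-years in which
# 0, 1, 2, 3, or 4 Prussian officers were kicked to death
observed = {0: 109, 1: 65, 2: 22, 3: 3, 4: 1}
n = sum(observed.values())                       # 200 corps-years
total_deaths = sum(k * c for k, c in observed.items())
lam = total_deaths / n                           # sample mean = 0.61, the MLE rate

# Expected counts under Poisson(lam): purely a *description* of this dataset
expected = {k: n * math.exp(-lam) * lam**k / math.factorial(k)
            for k in observed}

for k in observed:
    print(f"{k} deaths: observed {observed[k]:3d}, expected {expected[k]:6.1f}")
```

The fitted counts track the observed ones closely, which is the whole point: the model describes the data it was fit to, and any inferential use would require a further argument about parameters and populations.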

So, my objection on this score is narrowly to the term “inferential statistics,” which I simply suggest should be replaced by something slightly more cumbersome but much more accurate: “the use of statistics for inference.” What this phrase loses in concision it gains in accuracy.

The second point is that my post about Big Data was meant to serve as a warning to qualitative researchers about what could happen if they did not take the promise of well-designed statistical methods for describing data seriously. My metaphor of an invasive species was meant to suggest that we might end up with a much-impoverished monoculture of data mining that, by dint of its practitioners’ superior productivity, would displace traditional approaches entirely. But the proper response to this is not to equate the use of statistical methods with data mining (as I think a couple of commenters thought I was arguing). Quite the contrary: It would be much preferable for historians to learn how to use statistics as part of a balanced approach than to be displaced by pure data miners.

This is all the more relevant because the flood of Big Data that is going to hit traditionally qualitative studies will open new opportunities for well-informed and teched-up researchers who can take advantage of the skills that leverage the availability of petabytes of data. After all, the real enemy here for qual and quant researchers in social science is not each other but a new breed of data miner who believes that theory is unnecessary, a viewpoint best expressed in 2008 by Chris Anderson in Wired:

But faced with massive data, this approach to science — hypothesize, model, test — is becoming obsolete. … There is now a better way. Petabytes allow us to say: “Correlation is enough.” We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.

I feel confident that no reader of the Duck wants to see this come to pass. The best way to head that off is not to adopt an unthinking anti-statistical stance but rather to use those methods when proper in order to support a deeper, richer understanding of social behavior.

Stuff political scientists like #5 — a Large N

I have been doing a lot of work with survey data lately, as well as some reading in critical theory. Maybe that inspired my deconstruction of the gendered language of stats. Or maybe I just like to work blue.

Your girlfriend has told you, “Honey, your data set is big enough for me. It’s OK if it doesn’t get you into the APSR.” She might tell you, “It is not the size of your p-value that matters, it is what you do with it.” A good theory can make up for a small N, she reassures you. But political scientists know the truth. Size matters. Political scientists like a large-N.

A large-N enables you to find a statistically significant relationship between any two variables, and to find evidence for any number of crazy arguments that are so surprising, they will get you published. Political scientists like to be surprised. Your theory might be dapper and well dressed, but without the large-N, political scientists will not swoon. They go crazy for those little asterisks.

Some qualitative researcher might come in and show that your variables are not actually causally related, but it will be too late. You will have 200 citations on Google Scholar, and their article will be in the Social Science Research Network archive forever. Your secret is safe. Go back to Europe, qually!

Political scientists also like a large-N because it gives you degrees of freedom. You can experiment with other variables in your model without worrying about multicollinearity. You aren’t tied down to one boring variable. Political scientists like to swing.

Political scientists prefer it if the standard error in your data is smooth and consistent and does not increase as the X value rises. Consider waxing or shaving your data with simple robust standard errors if you have problems with heteroskedasticity. They also like a big coefficient that slopes upward. Doesn’t everyone? And fit, don’t forget about fit. Fit makes things more enjoyable.

It is best if your large-N data does not have a lot of measurement error. You might say, a little is natural, like when I jump in the pool, but this is not acceptable in political science. You should, however, have variation in your dependent variable. Variety is good. It keeps things spicy. When a political scientist wants to get really kinky, he or she will bootstrap the data.

It is best if your data is normally distributed, but political scientists generally forgive that. They like data of all shapes and sizes. They just close their eyes and pretend that it is symmetrical. Binomial. Fat tails. Oooh. That just sounds dirty.

Political scientists will tell you that if your dataset is not big enough, your confidence intervals will be too wide. Paradoxically, this will drain your confidence and make it harder for you to perform in the future. But don’t worry, they have drugs for that.

Don’t leave anything to chance. Get yourself a large-N. But don’t listen to those ads on TV late at night. Those quick data fixes don’t work.


Beyond Qual and Quant

PTJ has one of the most sophisticated ways of thinking about different positions in the field of International Relations (and, by extension, the social sciences), but his approach may be too abstract for some. I therefore submit for comments the “Political Science Methodology Flowchart” (version 1.3b).

Note that any individual can take multiple treks down the flowchart.

Of Quals and Quants

Qualitative scholars in political science are used to thinking of themselves as under threat from quantitative researchers. Yet qualitative scholars’ responses to quantitative “imperialism” suggest that they misunderstand the nature of that threat. The increasing flow of data, the growing availability of computing power and easy-to-use software, and the relative ease of training new quantitative researchers make the position of qualitative scholars more precarious than they realize. Consequently, qualitative and multi-method researchers must stress not only the value of methodological pluralism but also what makes their work distinctive.

Few topics are so perennially interesting for the individual political scientist and the discipline as the Question of Method. This is quickly reduced to the simplistic debate of Quant v. Qual, framed as a battle of those who can’t count against those who can’t read. Collapsing complicated methodological positions into a single dimension obviously does violence to the philosophy of science underlying these debates. Thus, even divisions that really affect other dimensions of methodological debate, such as those that separate formal theorists and interpretivists from case-study researchers and econometricians, are lumped into this artificial dichotomy. Formal guys know math, so they must be quants, or at least close enough; interpretivists use language, ergo they are “quallys” (in the dismissive nomenclature of Internet comment boards), or at least close enough. And so elective affinities are reified into camps, among which ambitious scholars must choose.

(Incidentally, let’s not delude ourselves into thinking that multi-method work is a via media. Outside of disciplinary panels on multi-method work, in their everyday practice, quantoids proceed according to something like a one-drop rule: if a paper contains even the slightest taint of process-tracing or case studies, then it is irremediably quallish. In this, then, those of us who identify principally as multi-method stand in relation to the qual-quant divide rather as Third Way folks stand in relation to left-liberals and to all those right of center. That is, the qualitative folks reject us as traitors, while the quant camp thinks that we are all squishes. How else to understand EITM, which is the melding of deterministic theory with stochastic modeling but which is not typically labeled “multi-method”?)

The intellectual merits of these positions have been covered better elsewhere (as in King, Keohane, and Verba 1994, Brady and Collier’s Rethinking Social Inquiry, and Patrick Thaddeus Jackson’s The Conduct of Inquiry in International Relations). Kathleen McNamara, a distinguished qualitative IPE scholar, argues against the possibility of an intellectual monoculture in her 2009 article on the subject. And I think that readers of the Duck are largely sympathetic to her points and to similar arguments. But even as the intellectual case for pluralism grows stronger (not least because the standards for qualitative work have gotten better), we should recognize that it is incontestable that quantitative training makes scholars more productive (in the simple articles/year metric) than qualitative workers.

Quantitative researchers work in a tradition that has self-consciously made the transmission of the techne of data management, data collection, and data analysis vastly easier not only than its case-study, interpretivist, and formal counterparts but even than quant training a decade or more ago. By techne, I do not mean the high-concept philosophy of science. All of that is usually about as difficult and as rarefied as the qualitative or formal high-concept readings, and about as useful to the completion of an actual research project–which is to say, not very, except insofar as it is shaped into everyday practice and reflected in the shared norms of the average seminar table or reviewer pool. (And it takes a long time for rarefied theories to percolate. That R^2 continues to be reported as an independently meaningful statistic even 25 years after King (1986) is shocking, but the Kuhnian generational replacement has not yet really begun to weed out such ideological deviationists.)

No, when I talk about techne, I mean something closer to the quotidian translation of the replication movement, which is rather like the business consultant notion of “best practices.” There is a real craft to learning how to manage data, and how to write code, and how to present results, and so forth, and it is completely independent of the project on which a researcher is engaged. Indeed, it is perfectly plausible that I could take most of the thousands of lines of data-cleaning and analysis code that I’ve written in the past month for the General Social Survey and the Jennings-Niemi Youth-Parent Socialization Survey, tweak four or five percent of the code to reflect a different DV, and essentially have a new project, ready to go. (Not that it would be a good project, mind you, but going from GRASS to HOMOSEX would not be a big jump.) In real life, there would be some differences in the model, but the point is simply that standard datasets are standard. (Indeed, in principle and assuming clean data, if you had the codebook, you could even write the analysis code before the data had come in from a poll–which is surely how commercial firms work.)
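The point about standard datasets can be made concrete with a hypothetical sketch. GRASS and HOMOSEX are real GSS variable names, but everything else here (the data layout, the helper functions) is invented for illustration, not actual research code:

```python
# A hypothetical sketch: when the dataset is standard, the analysis
# pipeline can be parameterized by the dependent variable, so "tweaking
# four or five percent of the code" reduces to changing one argument.

def clean(rows, dv):
    """Drop rows with a missing value on the chosen dependent variable."""
    return [r for r in rows if r.get(dv) is not None]

def run_analysis(rows, dv, ivs):
    """Stand-in for a model fit: here, just group means of the DV by each IV."""
    cleaned = clean(rows, dv)
    results = {}
    for iv in ivs:
        groups = {}
        for r in cleaned:
            groups.setdefault(r[iv], []).append(r[dv])
        results[iv] = {g: sum(v) / len(v) for g, v in groups.items()}
    return results

# Invented toy rows standing in for survey respondents
toy = [
    {"GRASS": 1, "HOMOSEX": 0, "cohort": "young"},
    {"GRASS": 0, "HOMOSEX": 1, "cohort": "old"},
    {"GRASS": 1, "HOMOSEX": None, "cohort": "young"},
]

# Going from GRASS to HOMOSEX is one keyword argument:
print(run_analysis(toy, dv="GRASS", ivs=["cohort"]))
print(run_analysis(toy, dv="HOMOSEX", ivs=["cohort"]))
```

The design point is simply that the codebook, not the research question, fixes the shape of the pipeline; that is what makes the marginal project so cheap for quantitative workers.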

There is nothing quite like that for qualitative researchers. Game theory folks come close, since they can tweak models indefinitely, but of course they then have to find data against which to test their theories (or not, as the case may be). Neither interpretivists nor case-study researchers, however, can automate the production of knowledge to the same extent that quantitative scholars can. And neither of those approaches appears to be as easily taught as quant approaches.

Indeed, the teaching of methods shows the distinction plainly enough. Gary King makes the point well in an unpublished paper:

A summary of these features of quantitative methods is available by looking at how this information is taught. Across fields and universities, training usually includes sequences of courses, logically taken in order, covering mathematics, mathematical statistics, statistical modeling, data analysis and graphics, measurement, and numerous methods tuned for diverse data problems and aimed at many different inferential targets. The specific sequence of courses differ across universities and fields depending on the mathematical background expected of incoming students, the types of substantive applications, and the depth of what will be taught, but the underlying mathematical, statistical, and inferential framework is remarkably systematic and uniformly accepted. In contrast, research in qualitative methods seems closer to a grab bag of ideas than a coherent disciplinary area. As a measure of this claim, in no political science department of which we are aware are qualitative methods courses taught in a sequence, with one building on, and required by, the next. In our own department, more than a third of the senior faculty have at one time or another taught a class on some aspect of qualitative methods, none with a qualitative course as a required prerequisite.

King has grown less charitable toward qualitative work than he was in KKV. But he is on to something here: If every quant scholar has gone from the probability theory –> OLS –> MLE –> {multilevel, hazard, Bayesian, … } sequence, what is the corresponding path for a “qually”? What could such a path even look like? And who would teach it? What books would they use? There is no equivalent of, say, Long and Freese for the qualitative researcher.

The problem, then, is that it is comparatively easy to make a competent quant researcher. But it is very hard to train up a great qualitative one. Brad DeLong put the problem plainly in his obituary of J.K. Galbraith:

Just what a “Galbraithian” economist would do, however, is not clear. For Galbraith, there is no single market failure, no single serpent in the Eden of perfect competition. He starts from the ground and works up: What are the major forces and institutions in a given economy, and how do they interact? A graduate student cannot be taught to follow in Galbraith’s footsteps. The only advice: Be supremely witty. Write very well. Read very widely. And master a terrifying amount of institutional detail.

This is not, strictly, a qual problem. Something similar happened with Feynman, who left no major students either (although note that this failure is regarded as exceptional). And there are a great many top-rank qualitative professors who have grown their own “trees” of students. But the distinction is that the qualitative apprenticeship model cannot scale, whereas you can easily imagine a very successful large-lecture approach to mastering the fundamental architecture of quant approaches or even a distance-learning class.

This is among the reasons I think that the Qual v Quant battle is being fought on terms that are often poorly chosen, both from the point of view of the qualitative researcher and also from the discipline. Quant researchers will simply be more productive than quals, and that differential will continue to widen. (This is a matter of differential rates of growth; quals are surely more productive now than they were, and their productivity growth will accelerate as they adopt more computer-driven workflows, as well. But there is no comparison between the way in which computing power increases have affected quallys and the way they have made it possible for even a Dummkopf like me to fit a practically infinite number of logit models in a day.) This makes revisions easier, by the way: a quant guy with domesticated datasets can redo a project in a day (unless his datasets are huge) but the qual guy will have to spend that much time pulling books off the shelves.

The qual-quant battles are fought over the desirability of the balance between the two fields. And yet the more important point has to do with the viability, or perhaps the “sustainability,” of qualitative work in a world in which we might reasonably expect quants to generate three to five times as many papers in a given year as a qual guy. Over time, we should expect this to lead to first a gradual erosion of quallies’ population, followed by a sudden collapse.

I want to make plain that I think this would be a bad thing for political science. The point of the DeLong piece is that a discipline without Galbraiths is a poorer one, and I think the Galbraiths who have some methods training would be much better than those who simply mastered lots and lots of facts. But a naive interpretation of productivity ratios by university administrators and funding agencies will likely lead to qualitative work’s extinction within political science.


What is Coding, Anyway?

One of my tasks since getting back from hiatus has been to wade through political science journals that piled up over the summer. In the April issue of PS: Political Science and Politics, I discovered this little gem: a one-page “article” entitled “Picturing Political Science” which consisted of the following:

“What do political scientists study? As part of a larger project, we coded every article in 25 leading journals between 2000 and 2007. We then created a word cloud of the 6,005 titles using https://www.wordle.net. The 150 most-used words appear in the word cloud. The size of each word is proportional to the number of times the word is mentioned. Draw your own conclusions.”

I really like the idea that a mainstream political science journal would legitimate a Wordle cloud as a genuine piece of scholarship. And in that vein, let me engage with the piece on its methodological and conceptual merit.

Leave aside the questionable link between the image and the title of the “piece.” (Are titles alone really the best indicator of the content of a work of scholarship? My recent experience with my book publisher suggests not.) What got me was the claim in the “abstract” of the piece that the authors “coded every article” in these journals. What in the world do they mean by “coding”? The authors do not tell us, and do not share any coded data with us. Did they simply mean they extracted the titles, pasted them into a word document, and plugged them into Wordle?

If so, that’s not coding, and claiming that it is only spreads confusion about the meaning of the term. (Which may be considerable. The Wikipedia page on “coding in the social sciences” is little more than a stub at present – consider this a call for concerned academics with more time on their hands than I to flesh it out with citations and nuance.) Broadly speaking, coding is the act of categorization for the purposes of analysis; it is not the same as simply counting frequencies of terms.
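For contrast, here is what a pure frequency tally — the Wordle-style counting that is not coding — looks like in a few lines of Python. The titles below are invented examples, not the actual 6,005:

```python
import re
from collections import Counter

# Invented article titles standing in for the journal corpus
titles = [
    "Democracy and Conflict in Comparative Perspective",
    "War, Trade, and Democracy",
    "The Politics of International Trade",
]

# A frequency count involves no categorization: just tokenize and tally
stopwords = {"and", "the", "of", "in", "a", "an"}
words = [
    w for title in titles
    for w in re.findall(r"[a-z]+", title.lower())
    if w not in stopwords
]
top = Counter(words).most_common(150)  # the article kept the 150 most-used words
print(top)
```

Nothing here assigns a category of meaning to anything; that interpretive step is exactly what the tally lacks.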

Researchers working with quantitative datasets “code” as they prepare datasets for statistical analysis, when they determine (for example) that conflicts with fewer than 1000 battle deaths constitute a 0 and those with 1000 or more constitute a 1 in a spreadsheet. A process of interpretation (not simply an automated frequency count of words) is involved in analytically converting historical records to numbers.
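Once that interpretive threshold decision has been made, the recode itself is mechanical. A toy sketch, with invented casualty figures:

```python
# The battle-deaths example from the text: conflicts with fewer than
# 1,000 deaths are coded 0, those with 1,000 or more are coded 1.
# The figures below are invented for illustration.
battle_deaths = [250, 1000, 40000, 999]
war = [1 if d >= 1000 else 0 for d in battle_deaths]
print(war)  # [0, 1, 1, 0]
```

The interpretive work — deciding which sources count, what a “battle death” is, and where the threshold sits — happens before this line ever runs.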

For those using qualitative methods and working with text rather than numbers, “coding” involves assigning categories of meaning to specific passages in text (interviews, focus group transcripts, blog posts, news articles, or something else). The method for doing so can be entirely interpretive, as when a graduate student goes through a stack of Security Council resolutions with different colored highlighters; or it can involve a more rigorous process in which a detailed codebook is designed for use by independent coders who apply the annotations separately from the principal investigator, and in which statistics such as Cohen’s kappa are used to measure the reliability of annotations across coders. It can involve sorting documents into stacks on an office floor, or it can involve sophisticated and layered annotations on a text file using advanced qualitative data analysis software, documenting an analytical process whereby others might replicate one’s work.
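Cohen’s kappa is simple to compute: it is observed agreement between two coders, corrected for the agreement you would expect by chance. A self-contained sketch with invented coding data:

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: two-coder agreement, corrected for chance agreement."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    # Chance agreement: probability both coders pick the same label at random,
    # given each coder's own label frequencies
    expected = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Two coders assigning categories to the same ten passages (invented data)
a = ["threat", "threat", "norm", "norm", "threat", "norm", "norm", "threat", "norm", "norm"]
b = ["threat", "threat", "norm", "threat", "threat", "norm", "norm", "threat", "norm", "norm"]
print(round(cohens_kappa(a, b), 3))  # 0.8
```

Here the coders agree on 9 of 10 passages, but half that agreement would be expected by chance, so kappa lands at 0.8 rather than 0.9.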

So as we discuss what political scientists “do,” let’s just not cheapen the term “coding” by using it too loosely. And let’s not cheapen the significance of tools like Wordle in the profession by implying they do something they do not.


If A Computer Model Says It, It Must Be True

“If the U.S. merely doubled its annual aid [to Pakistan] from $700 million to $1.5 billion, America’s influence in the country would significantly jump, while the militants’ would drop drastically. Why? Because with that sort of financial flow, corrupt rural officials would suddenly profit more from helping the U.S. than from helping the Taliban.”

So says the computer model that predicted Khamenei’s rise to power and the timing of Pervez Musharraf’s fall.

This from yesterday’s NYTimes exposé on political scientist Bruce Bueno de Mesquita, who helped popularize the application of game theory to political and economic decision-making, and who in addition to scholarly publishing consults with firms like British Aerospace and Marconi Electronics, as well as the Central Intelligence Agency.

The article, sort of an advertisement for Bueno de Mesquita’s new book The Predictioneer’s Game, is good press not only for Bueno de Mesquita but for the political science profession. Importantly, the article did not equate political science with game-theoretic models, and in fact (as both Dan Drezner and some of the commenters at the Monkey Cage note) demonstrated that qualitative methods – interviewing, interpretation, and coding – are key to the number-crunching for which Bueno de Mesquita is famous.


Truthy or Dare

If you’ve not done so, open up your Political Science & Politics and read James Fowler‘s article “The Colbert Bump in Campaign Donations: More Truthful than Truthy.” In this brilliant piece, Fowler empirically tests whether support exists for Stephen Colbert‘s claim that Congresspersons who appear on his late-night comedy show receive a “bump” in their approval ratings.

Now I need to go back through my video archive to make sure I’m correct in thinking the correct indicator of approval ratings, according to Colbert, is opinion polls (Fowler uses donations to Congressional campaigns as a proxy). That notwithstanding, I think this article is brilliant for three reasons.

1) It’s brilliantly, refreshingly funny – bravo to PS&Politics for publishing not only a scholarly article about political satire, but one written in a satirical style. (Fowler peppers his descriptions of selection effects and Mann-Whitney U nonparametric tests with such gems as: “I’m sure Stephen will be pleased there is a ‘man’ in his statistical test – though, what kind of a man calls himself Whitney?”)

2) It’s an article born to make students excited about political science: a simple empirical test of a popular empirical claim, with all the boring theory-relevance evacuated. Of course, because it does absolutely nothing to build theory, many journals might not have published it. But I’m delighted it was published, because it advances our understanding of how popular culture impacts political outcomes. More political scientists should focus on using our methodological tools to test popular assumptions. Who says polisci has to be boring?

3) The article introduces political scientists who don’t watch the Colbert Report (I was surprised to learn that the average viewership is only about 1.3 million) to a popular phenomenon that nonetheless exerts “a disproportionate real-world influence” due to its elite demographic, while introducing Colbert fans to a dispassionate analysis (minus hype) of the show’s impact on real-world politics.

Fowler’s methodology is creative and intriguing. Instead of simply tracing the actual before-and-after campaign success of Colbert’s interviewees, he controls for selection effects by pair-matching Colbert Report guests with similar political candidates who did not go on the show. His none-too-counterintuitive finding is that the Colbert “bump” in fact exists, but only for Democrats.
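The shape of that matching logic can be sketched in a few lines. This is a toy illustration of pair matching, not Fowler’s actual data, covariates, or estimator:

```python
# Pair each "treated" candidate (a show guest) with the untreated candidate
# closest on a pre-treatment covariate, then compare outcomes within pairs.
# All figures are invented ($1000s of fundraising before and after).

treated = [   # (pre-appearance fundraising, post-appearance fundraising)
    (120, 150),
    (300, 340),
]
controls = [  # candidates who never went on the show
    (100, 110),
    (130, 135),
    (290, 300),
    (500, 520),
]

def nearest_control(t, pool):
    """Match on the pre-treatment covariate (first element of each tuple)."""
    return min(pool, key=lambda c: abs(c[0] - t[0]))

effects = []
for t in treated:
    c = nearest_control(t, controls)  # matching with replacement, for simplicity
    # Difference-in-differences within the matched pair
    effects.append((t[1] - t[0]) - (c[1] - c[0]))

print(sum(effects) / len(effects))  # average "bump" among matched pairs
```

Matching on pre-treatment characteristics is what lets the comparison speak to selection effects: the control candidate stands in for what the guest’s fundraising would have done absent the appearance.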

Read the whole thing here.


© 2021 Duck of Minerva
