
On Paradigms, Policy Relevance and Other IR Myths

I had every intention this evening of writing a cynical commentary on all the hoopla surrounding Open Government, Open Data and the Great Transparency Revolution. But truth be told, I am brain-dead at the moment. Why? Because I spent the last two days down in Williamsburg, VA arbitrating codes for a Teaching, Research and International Politics (TRIP) project (co-led by myself and Jason Sharman) which analyzes what the field of IR looks like from the perspective of books. It is all meant as a complement to the innovative and hard work of Michael Tierney, Sue Peterson and the TRIP founders down at William & Mary, who have sought to map the field of IR by systematically coding all published articles in the top 12 peer-reviewed disciplinary journals for characteristics such as paradigm, methodology, epistemology and policy relevance. In addition, the TRIP team has conducted numerous surveys of IR scholars in the field, the latest round capturing nearly 3000 scholars in ten countries. The project, while not immune from nit-picky criticism about its methodological choices and conclusions, has yielded several surprising results that have both confirmed and dismantled myths about the field of IR.

So, in the spirit of recent diatribes on the field offered by Steve and Brian, I summarize a few of the initial findings of our work to serve as fodder for our navel-gazing discussion:

Myth #1: IR is now dominated by quantitative work

Truth: Depends on where you look. This is somewhat true if you confine yourself to the idea that we can know the field only by peering into the pages of IO, ISQ, APSR and the like. Between 2000 and 2008, according to a TRIP study by Jordan et al (2009), 38.8% of journal articles employed quantitative methods, while 30.4% used qualitative methods. [In IPE, however, the trend is definitely clearer: in 2006, 90% of articles used quantitative methods (see Maliniak and Tierney 2009, 20).] But the myth of quantitative dominance is dispelled when we look beyond journals. In the 2008 survey of IR scholars, 72% of scholars reported that they use qualitative methods as their primary methodology. In our initial study of books published between 2000 and 2010, Jason and I found that 58% of books use qualitative methods and only 9.3% use quantitative (the rest using mainly descriptive methods, policy analysis and the rare formal model).

Myth #2: In IR, it’s all about PARADIGMS.

Truth: Well, not really. As much as we kvetch about how everyone has to pay homage to realism, liberalism, constructivism (and rarely, Marxism) in order to get published, the truth is that only a minority of published IR work takes one or more of these paradigms as the chosen framework for analysis. Surveys reveal that IR scholars still think of realism as the dominant paradigm, yet realism shows up as the paradigm of choice in less than 10% of both books and articles. Liberalism is slightly more prevalent – it is the paradigm of choice in around 26% of journal articles and 20% of books. Constructivism has actually overtaken realism, but still amounts to only 11% of journal articles and 17% of books in the past decade. Instead, according to the TRIP coding scheme, most IR work is “non-paradigmatic” (meaning it takes theory seriously, but doesn’t use one of the usual paradigmatic suspects) or is “atheoretic”. [Stats alert: 45% of journal articles are non-paradigmatic and 9.5% atheoretic, whereas 31% of books are non-paradigmatic and 23% are atheoretic.]

So, Brian: does IR still “really like” the isms?

Myth #3: Positivism rules.

Truth: Yep, that one is pretty much on the mark. 86% of journal articles AND 85% of books between 2000 and 2010 employed a positivist methodology. Oddly, however, only 55% of IR scholars surveyed describe themselves as positivists. I’m going to add that one to the list of “things that make me go hmmmmm…..”

Myth #4: IR scholarship is not oriented towards policy.

Truth: Sadly, true. Only 12% of journal articles offer policy recommendations. [Ok, a poor proxy, but all I had to go on from the TRIP coding system]. Books are slightly more likely to dabble in policy, with 22% offering some sort of policy prescriptions – often quite limited and lame in my humble coding experience. Still, curiously, scholars nonetheless perceive themselves differently: 29% of scholars say they are doing policy-oriented research. This could be entirely true if they are doing this outside the normal venues of published research in the discipline and we’re simply not capturing it in our study (blogs, anyone?). All of which raises several questions: are IR scholars really engaging in policy debates? If so, how? Where? If not, why not? (Hint: fill out the next TRIP survey in fall 2011 and we’ll find out!!)

(Note to readers: I was unable to provide a link to the draft study that Jason and I conducted on books, as it is not yet ready for prime time on the web. But if you have any questions about our project, feel free to email me).


Crowdsourcing Data Coding

I just finished watching a video of CrowdFlower’s presentation at the TechCrunch50 conference. CrowdFlower is a platform that allows firms to crowdsource various tasks, such as populating a spreadsheet with email addresses or selecting stills from thousands of videos that have particular qualities. The examples in the video involve very labor-intensive tasks, but tasks that a firm is unlikely to need again or to consider worth dedicating staff to.

As I was watching the video I thought about the potential to leverage such a platform for large-scale coding of qualitative data. In the social sciences, large-scale research often requires the massive coding of data, whether it is language from a speech, the tenor or sentiment of quotations (or of newspaper articles in media studies), the nature of cases (e.g., did country A make a threat to country B, did country B back down as a result, etc.), or the responses to an open-ended survey. Coding is an issue whether you are conducting qualitative or quantitative analysis, especially where you have captured large amounts of data. Often the data are not inherently numerical and need to be translated so that quantitative analysis can be conducted. Likewise, with a qualitative approach one still needs to categorize various data points to allow for meaningful comparisons.

The interesting thing about a service like CrowdFlower is that it can tap a ready pool of workers globally who are willing to conduct the coding at a reasonable price. Additionally, CrowdFlower uses various real-time methods to ensure the quality of the coding. This is achieved partly by scoring coders on their past performance, on how they fare on tasks that are “planted” by CrowdFlower (i.e., salting the work with tasks where the correct answer is known ahead of time), and on how much agreement there is between coders on various items.
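The “planted task” check described above can be sketched in a few lines. This is a hypothetical illustration, not CrowdFlower’s actual implementation: the item IDs, labels, and coder names are invented, and the idea is simply to score each coder against items whose correct answer is known in advance.

```python
# Gold items: passages whose correct label was decided ahead of time.
gold = {"item_3": "negative", "item_7": "positive", "item_9": "neutral"}

# Each coder's answers on those same planted items (invented data).
coder_answers = {
    "alice": {"item_3": "negative", "item_7": "positive", "item_9": "positive"},
    "bob":   {"item_3": "negative", "item_7": "positive", "item_9": "neutral"},
}

def gold_accuracy(answers, gold):
    """Fraction of planted gold items a coder labeled correctly."""
    correct = sum(answers[item] == label for item, label in gold.items())
    return correct / len(gold)

for name, answers in coder_answers.items():
    print(name, round(gold_accuracy(answers, gold), 2))
# alice scores 0.67, bob scores 1.0
```

A platform would then down-weight or drop coders whose gold-item accuracy falls below some threshold, which is one reason salted tasks are a cheap but effective quality control.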

The final method comes up quite a bit in social science research when you have to determine how to categorize a given piece of data. The level of agreement is crucial to confidently coding a particular case. I would imagine that a platform such as CrowdFlower could make that task easier and more robust by quickly tapping into a larger pool of coders.

Has anyone used a service like CrowdFlower in this way (i.e. coding data from qualitative research)? Would be interested in your perspective.

[Cross-posted at bill | petti]


What is Coding, Anyway?

One of my tasks since getting back from hiatus has been to wade through political science journals that piled up over the summer. In the April issue of PS: Political Science and Politics, I discovered this little gem: a one-page “article” entitled “Picturing Political Science” which consisted of the following:

“What do political scientists study? As part of a larger project, we coded every article in 25 leading journals between 2000 and 2007. We then created a word cloud of the 6,005 titles using https://www.wordle.net. The 150 most-used words appear in the word cloud. The size of each word is proportional to the number of times the word is mentioned. Draw your own conclusions.”

I really like the idea that a mainstream political science journal would legitimate a Wordle cloud as a genuine piece of scholarship. And in that vein, let me engage with the piece on its methodological and conceptual merit.

Leave aside the questionable link between the image and the title of the “piece.” (Are titles alone really the best indicator of the content of a work of scholarship? My recent experience with my book publisher suggests not.) What got me was the claim in the “abstract” of the piece that the authors “coded every article” in these journals. What in the world do they mean by “coding”? The authors do not tell us, and do not share any coded data with us. Did they simply mean they extracted the titles, pasted them into a Word document, and plugged them into Wordle?

If so, that’s not coding, and claiming that it is only spreads confusion about the meaning of the term. (Which may be considerable. The Wikipedia page on “coding in the social sciences” is little more than a stub at present – consider this a call for concerned academics with more time on their hands than I to flesh it out with citations and nuance.) Broadly speaking, coding is the act of categorization for the purposes of analysis; it is not the same as simply counting frequencies of terms.

Researchers working with quantitative datasets “code” as they prepare datasets for statistical analysis, when they determine (for example) that conflicts with fewer than 1000 battle deaths constitute a 0 and those with 1000 or more constitute a 1 in a spreadsheet. A process of interpretation (not simply an automated frequency count of words) is involved in analytically converting historical records to numbers.
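The battle-deaths example above can be made concrete. A minimal sketch, with invented conflict records, of dichotomizing cases at the 1,000-battle-death threshold:

```python
# Invented conflict records for illustration only.
conflicts = [
    {"name": "Conflict A", "battle_deaths": 250},
    {"name": "Conflict B", "battle_deaths": 4800},
    {"name": "Conflict C", "battle_deaths": 1000},
]

# Code each case: 1 if battle deaths reach the 1,000 threshold, else 0.
for c in conflicts:
    c["war"] = 1 if c["battle_deaths"] >= 1000 else 0

print([(c["name"], c["war"]) for c in conflicts])
# → [('Conflict A', 0), ('Conflict B', 1), ('Conflict C', 1)]
```

The interpretive work, of course, lies upstream of this snippet: deciding what counts as a battle death, and which historical records to trust, is where the real judgment calls happen.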

For those using qualitative methods and working with text rather than numbers, “coding” involves assigning categories of meaning to specific passages in text (interviews, focus group transcripts, blog posts, news articles, or something else). The method for doing so can be entirely interpretive, as when a graduate student goes through a stack of Security Council resolutions with different colored highlighters; or it can involve a more rigorous process where a detailed codebook is designed for use by independent coders who apply the annotations separately from the principal investigator, and where statistics such as Cohen’s kappa are used to measure the reliability of the annotations across coders. It can involve sorting documents into stacks on an office floor, or it can involve sophisticated and layered annotations on a text file using advanced qualitative data analysis software, documenting an analytical process whereby others might replicate one’s work.
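Cohen’s kappa, mentioned above, compares the agreement two coders actually achieve against the agreement they would reach by chance given their label frequencies. A small sketch, using invented “threat”/“no_threat” labels on ten passages:

```python
from collections import Counter

def cohens_kappa(coder1, coder2):
    """Cohen's kappa: observed agreement between two coders,
    corrected for the agreement expected by chance alone."""
    assert len(coder1) == len(coder2)
    n = len(coder1)
    observed = sum(a == b for a, b in zip(coder1, coder2)) / n
    # Chance agreement: sum over labels of the product of each
    # coder's marginal proportion for that label.
    c1, c2 = Counter(coder1), Counter(coder2)
    expected = sum(c1[k] * c2[k] for k in c1.keys() & c2.keys()) / (n * n)
    return (observed - expected) / (1 - expected)

# Two hypothetical coders labeling the same ten passages.
a = ["threat", "threat", "no_threat", "threat", "no_threat",
     "no_threat", "threat", "no_threat", "threat", "no_threat"]
b = ["threat", "no_threat", "no_threat", "threat", "no_threat",
     "no_threat", "threat", "no_threat", "threat", "threat"]

print(round(cohens_kappa(a, b), 2))  # → 0.6
```

Here the coders agree on 8 of 10 passages (80%), but because each uses the two labels half the time, 50% agreement would be expected by chance, so kappa is (0.8 − 0.5) / (1 − 0.5) = 0.6, which is the sort of number reliability conventions are then applied to.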

So as we discuss what political scientists “do,” let’s just not cheapen the term “coding” by using it too loosely. And let’s not cheapen the significance of tools like Wordle in the profession by implying they do something they do not.


© 2021 Duck of Minerva
