Tag: statistics (Page 1 of 2)

The Crass Argument for Teaching More Math In Poli Sci Courses

LATE UPDATE: PTJ blogs about undergrad education from a very different starting point.

A few months back, we had a lively debate about what to teach undergraduates in political science. As I prepare to motivate 20 undergraduates to learn elementary statistical analysis tools AND basic R skills, I’ve been thinking about this subject a lot. I think we should both aim to teach political science to undergraduates (that is, the skills and methodologies necessary for understanding research published in, say, the APSR of the 1980s) and think hard about what employable skills our students should leave with.

I submit that, up to a point, research methodologies and employable skills are pretty well the same thing.

Here’s some crass, utterly unscientific, and in-your-face data to support this point.

This figure (slightly easier to view PDF version here) draws on data from Georgetown’s Public Policy Institute and reflects my impression of plausible alternative careers for the students I’ll be teaching. (This ranges from the ministry to math/computer science–extremes that, at least at Georgetown, carry a vow of chastity.) Across this range, political science does fairly well on both percent employed full time and median wage. What I find striking, though, is that the more “mathy” a subject is, the better its graduates tend to score on both measures.
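The pattern the figure suggests can be sketched in a few lines. The numbers below are invented stand-ins, not the Georgetown Public Policy Institute data, and the "mathiness" ordering is my own impression; the point is only that both outcome measures rise together across the range of majors:

```python
# Toy numbers standing in for the figure (invented, not the actual
# Georgetown data). Majors are ordered from least to most "mathy".
majors = [
    # (major, percent employed full time, median wage)
    ("ministry", 55, 38_000),
    ("political science", 68, 50_000),
    ("economics", 72, 58_000),
    ("math/computer science", 78, 70_000),
]

def is_increasing(xs):
    """True if each value is strictly larger than the one before it."""
    return all(a < b for a, b in zip(xs, xs[1:]))

employment = [m[1] for m in majors]
wages = [m[2] for m in majors]

# Both measures improve as the majors get mathier in this toy data.
print(is_increasing(employment), is_increasing(wages))  # True True
```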

For students in my seminar, this will become my warrant to expect them to become pretty good at certain types of skills. (In conversations with folks at other institutions, I’ve been assured that undergrads are, on average, more eager and willing to learn these skills than Ph.D. students, which sounds about right.) For the broader discipline, I think this sort of evidence can be used to justify including more analytical training in our major programs.


What’s Wrong With This Picture?


This graph comes to you from an article on the politics of the drone campaign published this week in International Studies Perspectives. I haven’t yet read the full piece, so I cannot yet comment on it substantively or theoretically. Nor have I looked closely at the authors’ codebook. However, based on the abstract, the analysis appears to rest on a newly coded dataset (the latest of many out there presuming to calculate the percentage of civilians – v. non-civilians – killed in drone strikes) to make claims about the justifiability of such attacks, presumably by weighing civilian harms against military effectiveness. My reaction here pertains solely to this graph, and what strikes me is the disjuncture between the authors’ coding of “civilians” and the actual definition of civilians in the 1977 1st Additional Protocol to the Geneva Conventions.



Emerging Technologies, Material and Social

Recording Casualties and the Protection of Civilians from Oxford Research Group (ORG) on Vimeo.

As the lone social scientist in a room of lawyers, philosophers and technicians last week, I was struck by a couple of things. One was the disconnect between descriptive and normative ethics, or rather questions of is versus ought. Everyone was speaking about norms and rules, but whereas the lawyers treated existing norms and rules as social facts, the philosophers treated them as questions of ethics that could and should be altered if necessary on their ethical merit. Another was the disconnect between material and social technologies. Engineers in the room seemed especially likely to assume that material technology evolves independently of social facts like laws, ethical debates, or architectures of governance, though they disagreed about whether this was for better or worse.
I suspect, to the contrary, that there is an important relationship between all three that bears closer investigation. To give an example, an important thread seemed to unite the discussion despite inevitable interdisciplinary tensions: that both material technologies (like weaponry or cyber-architecture) and social technologies (like international laws) should evolve or change to suit the demands of human security. That is, the protection of vulnerable non-combatants should be a priority in considerations of the value of these technologies. Even those arguing for the value of lethal autonomous robots made the case on these terms, rather than national security grounds alone.

Yet it bears pointing out (as I think the video above does quite well) how difficult that very factor is to measure.
How do we know whether a particular rule or law or practice has a net benefit to vulnerable civilians? How does one test the hypothesis, for example, that autonomous weapons can improve on human soldiers’ track record of civilian protection, without a clear baseline of what that track record is? Knowing requires both material technologies (like databases, forensics, and recording equipment) and social technologies (like interviewing skills and political will).

And make no mistake: the baselines are far from clear because our social technologies for casualty counting are lacking. Nothing in the existing rules of war requires record-keeping of war casualties, efforts to do so are patchy and non-comparable, and the resulting data map poorly onto the legal obligations of parties to armed conflict. Hence, an important emerging social technology would be efforts to standardize casualty reporting worldwide. Indeed, such social technologies are already under development, as the presentation from the Oxford Research Group exemplifies. Such a governance architecture would be a logical complement to emerging material technologies whose raison d’être is predicated on improving baseline compliance with the laws of war. In fact, without them I wonder whether the debate about the effects of material technologies on war law compliance can really proceed in the realm of descriptive ethics, or must remain purely in the realm of the philosophical.

Anyway. These kinds of “emerging social technologies,” or what scholars like me might call “norm-building efforts,” received relatively little consideration at the workshop, which focused primarily on the relationship between emerging material technologies (robotics, cyberspace, non-lethals, human augmentation) and existing governance architecture (e.g. the law of armed conflict). But I think – and will probably write more on this question presently – that an important question is how emerging material technologies can expose gaps and irregularities in social technologies of governance, catalyze shifts in norms, and (possibly) strengthen enforcement of and adherence to those norms themselves if put to good use.


Winecoff vs. Nexon Cage Match!

Kindred Winecoff has a pretty sweet rebuttal to my ill-tempered rant of late March. A lot of it makes sense, and I appreciate reading a graduate student’s perspective on things.

Some of his post amounts to a reiteration of my points: (over)professionalization is a rational response to market pressure, learning advanced methods that use lots of mathematical symbols is a good thing, and so forth.

On the one hand, I hope that one day Kindred will sit on a hiring committee (because I’d like to see him land a job). On the other hand, I’m a bit saddened by the prospect because his view of the academic job market is just so, well, earnest.  I hate to think what he’ll make of it when he sees how the sausage actually gets made.

I do have one quibble:

While different journals (naturally) tend to publish different types of work, it’s not clear whether that is because authors are submitting strategically, editors are dedicated to advancing their preferred research paradigms, both, or neither. There are so many journals that any discussion of them as doing any one thing — or privileging any one type of work — seems like painting with much too wide a brush.

Well, sure. I’m not critical enough to publish in Alternatives, Kindred’s not likely to storm the gates of International Political Sociology, and I doubt you’ll see me in the Journal of Conflict Resolution in the near future. But while some of my comments are applicable to all journals, regardless of orientation, others are pretty clearly geared toward the “prestige” journals that occupy a central place in academic certification in the United States.

But mostly, this kind of breaks my heart:

I’ve taken more methods classes in my graduate education than substantive classes. I don’t regret that. I’ve come to believe that the majority of coursework in a graduate education in most disciplines should be learning methods of inquiry. Theory-development should be a smaller percentage of classes and (most importantly) come from time spent working with your advisor and dissertation committee. While there are strategic reasons for this — signaling to hiring committees, etc. — there are also good practical reasons for it. The time I spent on my first few substantive classes was little more than wasted; I had no way to evaluate the quality of the work. I had no ability to question whether the theoretical and empirical assumptions the authors were making were valid. I did not even have the ability to locate what assumptions were being made, and why it was important to know what those are.

Of course, most of what we do in graduate school should be about learning methods of inquiry, albeit understood in the broadest terms. The idea that one does this only in designated methods classes, though, is a major part of the problem that I’ve complained about. As is the apparent bifurcation of “substantive” classes and “methods of inquiry.” And if you didn’t get anything useful out of your “substantive” classes because you hadn’t yet had your coursework in stochastic modeling… well, something just isn’t right there. I won’t tackle what Kindred means by “theory-development,” as I’m not sure we’re talking about precisely the same thing, but I will note that getting a better grasp of theory and theorization is not the same thing as “theory-development.”

Anyway, I’ll spot a TKO to Kindred on most of the issues.


Challenges to Qualitative Research in the Age of Big Data

Technically, “because I didn’t have observational data.” Working with experimental data requires only calculating means and reading a table. Also, this may be the most condescending comic strip about statistics ever produced.

The excellent Silbey at the Edge of the American West is stunned by the torrents of data that future historians will be able to deal with. He predicts that the petabytes of data being captured by government organizations such as the Air Force will be a major boon for historians of the future —

(and I can’t be the only person who says “Of the future!” in a sort of breathless “better-living-through-chemistry” voice)

 — but also predicts that this torrent of data means that it will take vastly longer for historians to sort through the historical record.

He is wrong. It means precisely the opposite: history is on the verge of becoming a quantified academic discipline, for two reasons. The first is that statistics is, very literally, the art of discerning patterns within data. The second is that the history academics practice in the coming age of Big Data will not be the same discipline that contemporary historians are creating.

The sensations Silbey is feeling have already been captured by an earlier historian, Henry Adams, who wrote of his visit to the Great Exposition of Paris:

He [Adams] cared little about his experiments and less about his statesmen, who seemed to him quite as ignorant as himself and, as a rule, no more honest; but he insisted on a relation of sequence. And if he could not reach it by one method, he would try as many methods as science knew. Satisfied that the sequence of men led to nothing and that the sequence of their society could lead no further, while the mere sequence of time was artificial, and the sequence of thought was chaos, he turned at last to the sequence of force; and thus it happened that, after ten years’ pursuit, he found himself lying in the Gallery of Machines at the Great Exposition of 1900, his historical neck broken by the sudden irruption of forces totally new.

Because it is strictly impossible for the human brain to cope directly with large amounts of data, in the age of big data we will have to turn to the tools we’ve devised to solve exactly that problem. And those tools are statistics.

It will not be human brains that directly run through each of the petabytes of data the US Air Force collects. It will be statistical software routines. And the historical record that the modal historian of the future confronts will be one that is mediated by statistical distributions, simply because such distributions will allow historians to confront the data that appears in vast torrents with tools that are appropriate to that problem.
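The division of labor described above can be sketched with a toy streaming summary. This is only an illustration of the principle (the real routines would be far more sophisticated): the software consumes records one at a time, so no human, and indeed no single in-memory table, ever holds the whole torrent:

```python
# A minimal sketch: a streaming routine "reads" arbitrarily many records
# and reduces them to a statistical summary a human can actually use.
def summarize(stream):
    """Return (count, mean) of a stream of numbers, one pass, O(1) memory."""
    n = total = 0
    for x in stream:
        n += 1
        total += x
    return n, total / n

# Stand-in for a torrent of records too large to eyeball.
count, mean = summarize(iter(range(1_000_000)))
print(count, mean)  # 1000000 499999.5
```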

Onset of menarche plotted against years for Norway. In all seriousness, this is the sort of data that should be analyzed by historians but which many are content to abandon to the economists by default. Yet learning how to analyze demographic data is not all that hard, and the returns are immense. And no amount of reading documents, without quantifying them, could produce this sort of information.

This will, in one sense, be a real gift to scholarship. Although I’m not an expert in Hitler historiography, for instance, I would place a very real bet with the universe that the statistical analysis in King et al. (2008), “Ordinary Economic Voting Behavior in the Extraordinary Election of Adolf Hitler,” tells us something very real and important about why Hitler came to power that simply cannot be deduced from the documentary record alone. The same could be said for an example closer to (my) home, Chay and Munshi (2011), “Slavery’s Legacy: Black Mobilization in the Antebellum South,” which identifies previously unexplored channels for how variations in slavery affected the post-war ability of blacks to mobilize politically.

In a certain sense, then, what I’m describing is a return of one facet of the Annales school on steroids. You want an exploration of the daily rhythms of life? Then you want quantification. Plain and simple.

By this point, most readers of the Duck have probably reached the limits of their tolerance for such statistical imperialism. And since I am a member in good standing of the qualitative and multi-method research section of APSA (which I know is probably not much better for many Duck readers!), who has, moreover, just returned from spending weeks looking in archives, let me say that I do not think the elimination of narrativist approaches is desirable or possible. First, without qualitative knowledge, quantitative approaches are hopelessly naive. Second, there are some problems that can only practically be investigated with qualitative data.

But if narrativist approaches will not be eliminated they may nevertheless lose large swathes of their habitat as the invasive species of Big Data historians emerges. Social history should be fundamentally transformed; so too should mass-level political history, or what’s left of it, since the availability of public opinion data, convincing theories of voter choice, and cheap analysis means that investigating the courses of campaigns using documents alone is pretty much professional malpractice.

The dilemma for historians is no different from the challenge that qualitative researchers in other fields have faced for some time. The first symptom, I predict, will be the retronym-ing of “qualitative” historians, in much the same way that the emergence of mobile phones created the retronym “landline.” The next symptom will be that academic conferences will in fact be dominated by the pedantic jerks who only want to talk about the benefits of different approaches to handling heteroscedasticity. But the wrong reaction to these and other pains would be a kneejerk refusal to consider the benefits of quantitative methods.


Crunching Corpse Counts: A Rejoinder by Michael Spagat et al.

One of the few items recently that has caused me to emerge from my nothing-but-Friday-nerd-blogging temporary hiatus was this article on civilian war deaths by Michael Spagat and his collaborators. I wrote a post with some praise and some questions, and recently received a thoughtful response by email from Michael and his crew in which they further detail the coding methods used in the project. Since the original thread generated some interest, I’ve decided to post their response here.

Civilian Targeting Index Clarification
by Madelyn Hicks, Uih Ran Lee, Ralph Sundberg and Michael Spagat

Since publishing our paper in PLoS ONE on the Civilian Targeting Index we have received some interesting feedback both in emails and on the “Duck of Minerva” blog concerning the nature of the one-sided violence data we use from the Uppsala Conflict Data Program (UCDP). In particular, some readers wonder how solid the underlying evidence of intentionality can really be for incidents coded as one-sided violence (i.e. ’civilian targeting’). We would like to take this opportunity to clarify this important coding issue in some detail.

First, here is a short general discussion. UCDP coding of ‘deliberate’ or ‘intentional’ civilian targeting is not a judicial assessment (e.g. of manslaughter or murder), nor is it an attempt to ‘know’ a perpetrator’s motivations. Instead, UCDP coding methodology assesses whether particular conflict-related deaths were likely to be one-sided or battle-related based on a combined review of: the plausible target, the method by which a killing was carried out, presented evidence, and credible statements or attributions of guilt. In each situation of violence the human coders, using what evidence is at hand, first attempt to identify a likely target. Coders also consider what method of attack was used (bombing, shooting, IED, etc.).

It is from this analysis of available information that a coder draws a conclusion regarding whether this was a likely ‘intentional’ one-sided death.
Often Uppsala infers the intention to kill civilians – and only civilians – from the absence of a possible military or conflict-related target. Conversely, the presence of such a target will be sufficient to force a coding of battle death (rather than one-sided death). This means that the bar is set fairly high for classifying deaths as one-sided, since for many incidents there will be plausible military or conflict-related targets. Thus, many killings that could in reality have been largely intentional will nevertheless be classified as battle deaths, because the available evidence will not be considered strong enough to code these deaths as one-sided.

Note, however, that there are some coding outlets for expressing uncertainty over intentionality. For example, when there is weak evidence of intentionality, deaths can be placed into the “high” category of one-sided deaths rather than into the “best” category. Such coding might be reconsidered later as further evidence comes to light. For example, in Burundi some such deaths were later transferred from “high” to “best” as new evidence became available that these events were really massacres.

We now turn to specific examples of coding in practice, focusing on the borderline between one-sided (i.e., intentional civilian targeting) fatalities and battle deaths. First, in many cases, the correct coding is fairly obvious due to the method used in killing, as reported by credible sources. For example, many one-sided deaths were attributed to the Lord’s Resistance Army (LRA) based on a Human Rights Watch Report (“The Christmas Massacres”, February 16, 2009) that contained graphic descriptions such as the following: “LRA combatants hacked their victims to death with machetes or axes or crushed their skulls with clubs and heavy sticks.” In Colombia there were many incidents of armed groups entering villages and then massacring people by shooting them in their heads at close range. In Iraq there were numerous incidents of bodies of executed individuals found, often with their hands tied behind their backs, shot in the head and bearing marks of torture. Such killings are unambiguously one-sided deaths.

The Iraq example does, however, raise an important point about UCDP coding which everyone should bear in mind when interpreting the data: the coding scheme only admits deaths that can be attributed to particular armed groups. Thus, many execution deaths in Iraq (and elsewhere) are not included because the perpetrating groups are not known. (These deaths can be added later if the identities of perpetrators are discovered.) Of course, the requirement of identifying perpetrators limits the coverage of the UCDP data. More importantly, it affects the tallies for some groups differently than it affects others. This biases the data, and we need to think through the implications of these biases when we make interpretations. It is, for example, worth knowing that the Taliban often claim credit for executions of teachers and unmarried couples, while groups in Iraq typically execute their victims anonymously.

Of course, many events are much more ambiguously situated on the battle death/one-sided death borderline than the ones described so far. For example, IED-caused deaths of civilians are coded as battle deaths because one cannot rule out the possibility that the intention of the perpetrators was to attack NATO or Afghan forces. Mortar fire is treated similarly; since mortars are hard to control it is difficult to conclude that a mortar was intended to hit civilians only even if it does only hit civilians. Fatalities from rockets fired from the Gaza Strip and southern Lebanon are coded as battle deaths for the same reason. Civilian deaths in aerial bombings, which usually have plausible military objectives, are also normally coded as battle deaths unless there is clear evidence of intentional targeting of civilians. Checkpoint killings in which soldiers fire on an approaching vehicle after going through a sequence of warnings will usually be classified as battle deaths based on statements of soldiers that they believed they were firing on hostile individuals inside the attacked vehicle, and in the absence of evidence to the contrary.

It is clear from the examples of the previous paragraph that the battle death/one-sided death distinction, and hence the Civilian Targeting Index (CTI), is of no use for analyzing the reckless endangerment or indiscriminate killing of civilians, as we point out in our paper. Other tools, such as the Dirty War Index [Hicks and Spagat (2008)] are required for this purpose, with a recent example of measuring indiscriminate effects on civilians in Iraq [Hicks et al. (2011)].
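The borderline rules described above can be restated as a toy decision procedure. This is my gloss for illustration only, not the actual UCDP codebook, which weighs far more evidence than two inputs:

```python
# Toy restatement of the coding logic: hard-to-aim methods (IEDs, mortars,
# rockets, aerial bombing) and any plausible military or conflict-related
# target force a coding of battle death, setting a high bar for 'one-sided'.
INDISCRIMINATE_METHODS = {"IED", "mortar", "rocket", "aerial bombing"}

def code_fatality(method, plausible_military_target):
    """Classify a civilian fatality as a battle death or a one-sided death."""
    if method in INDISCRIMINATE_METHODS or plausible_military_target:
        return "battle death"
    return "one-sided death"  # e.g. execution-style killings of civilians

print(code_fatality("IED", False))        # battle death
print(code_fatality("execution", False))  # one-sided death
```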

Another important point is that UCDP systematically reexamines incidents as new evidence appears. For example, there was an incident in Afghanistan in March of 2007 in which US Marines killed 12-19 civilians following an ambush. Those civilians were killed as the Marines sped off from the scene of the ambush, peppering passing vehicles with gunfire even though they were already clear of danger. When a US military court ruled that the soldiers had used ‘excessive force’, these deaths were transferred by UCDP from battle deaths to one-sided deaths. New evidence also comes from bodies like truth commissions, forensic investigations, court proceedings (including the tribunals for the former Yugoslavia, the ICC, military courts and criminal courts) and NGOs. Note the wide range of sources that UCDP brings into play in compiling its data. The UCDP data are not based only on media surveillance, as many people seem to believe.

Soon UCDP will start releasing its data at the incident level. At that stage, anyone will be able to inspect the data incident by incident and develop a much stronger understanding than we are able to provide in this short comment. Moreover, people will be able to question incidents and to argue that codings should be changed. Thus, over time UCDP data should improve while we simultaneously improve our understanding of its strengths and weaknesses.


New Statistics on Civilian Targeting

In a new paper, Michael Spagat and a number of collaborators explore the determinants of intentional civilian killing in war.

Using sophisticated regression analysis, they claim to have found “four significant behavioral patterns”:

“First, the majority (61%) of all formally organized actors in armed conflict during 2002-2007 refrained from killing civilians in deliberate, direct targeting.

Second, actors were more likely to have carried out some degree of civilian targeting, as opposed to none, if they participated in armed conflict for three or more years rather than for one year.

Third, among actors that targeted civilians, those that engaged in greater scales of armed conflict concentrated less of their lethal behavior into civilian targeting and more into involvement with battle fatalities.

Fourth, an actor’s likelihood and degree of targeting civilians was unaffected by whether it was a state or a non-state group.”

Now those who follow the literature on war law compliance will find a number of these arguments to be quite interesting, somewhat counter-intuitive, and highly policy-relevant. I’ll leave that discussion to comments (may even kick it off) but in the body of this post let me just say two things.

First, this paper is path-breaking not just for its findings but for the data it relies on. The authors are working with a new dataset that combines three existing Uppsala datasets: One-Sided Violence, Battle-Related Deaths and Non-State Conflict. Their aim is to disaggregate “intentional targeting of civilians” from wider civilian battle deaths, thus distinguishing, for the first time I know of in a large-N study, between civilian deaths caused on purpose and those caused incidentally by lawful operations. This is a fundamental distinction in international law that has until now been poorly reflected in the datasets on civilian deaths, making it difficult to track war law compliance, as I’ve argued here and here. Parsing existing data this way is a huge step forward for those of us trying to understand the effectiveness of the civilian immunity norm.

That said, I do see a limitation with the data coding. It seems the “battle-related civilian deaths” category ends up including both “collateral” deaths, which are legitimate under international law, and “indiscriminate” deaths, which are not (see p. 12; on the legal rule, see AP I Article 51(4)). So the “intentional targeting” category, which is based on the one-sided violence data, reflects only one type of war law violation, rendering the data as currently coded only partially useful for tracking compliance with the Geneva Conventions. A more versatile dataset would code each of these categories separately, so that scholars designing their research in different ways could choose to sum or disaggregate them in different combinations. Conflating indiscriminate attacks with collateral damage is both conceptually problematic and, I fear, risks leading to misunderstandings about the status of indiscriminate attacks in international law. I hope continued work will be done on this dataset to correct this problem, as it will make the data far more useful for replicating and testing these and other hypotheses on civilian victimization.
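The more versatile coding argued for here can be sketched in miniature. The category names and counts below are hypothetical; the point is that keeping the three legally distinct categories separate lets different research designs sum or split them as needed:

```python
# Hypothetical counts for one actor, coded into three separate categories
# rather than folding indiscriminate deaths into "battle-related".
deaths = {"intentional": 120, "indiscriminate": 80, "collateral": 200}

# A compliance study can sum the two kinds of war law violation...
war_law_violations = deaths["intentional"] + deaths["indiscriminate"]

# ...while a study of battle dynamics can group deaths another way.
battle_related_civilian = deaths["indiscriminate"] + deaths["collateral"]

print(war_law_violations, battle_related_civilian)  # 200 280
```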


Beyond Qual and Quant

PTJ has one of the most sophisticated ways of thinking about different positions in the field of International Relations (and, by extension, the social sciences), but his approach may be too abstract for some. I therefore submit for comments the “Political Science Methodology Flowchart” (version 1.3b).

Note that any individual can take multiple treks down the flowchart.

Worst. Argument. Ever.

Frank Pasquale at Balkinization:

The Dodd-Frank Act also promises to shed some sunlight on ever-rising CEO pay levels. As Sam Pizzigatti explains, “corporations must now also report their overall wage ‘median’ and the ratio between this median and their top pay.” Seizing on some laughable comments on how “unduly burdensome” the law is, “the House Financial Services Committee’s Capital Markets Subcommittee [recently] approved, by a vote of 20 to 12 . . . legislation (H.R. 1062) to repeal the Dodd-Frank pay ratio mandate.”

Here’s the argument for why this requirement is “unduly burdensome”:

The burden of this median pay calculation requirement is significant. It would require a company to gather and calculate compensation information for each employee as required for senior executives under the SEC disclosure rules, determine the pay of each employee from highest to lowest, and then identify the employee whose pay is at the midpoint between the highest- and lowest-paid employee. No public company currently calculates each employee’s total compensation as it calculates total pay for CEOs on the proxy statement; therefore, companies would be required to invest considerable resources to implement this mandate to produce a meaningless statistic.
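For perspective, once each employee’s total compensation has been computed, the “significant burden” of the median step the quote describes amounts to a few lines (toy figures, obviously):

```python
# Sort the pay figures and take the midpoint -- the entire "burdensome"
# median calculation, given per-employee totals.
def median(values):
    s = sorted(values)
    mid = len(s) // 2
    # midpoint between the highest- and lowest-paid employees
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

pay = [30_000, 42_000, 55_000, 61_000, 12_000_000]  # CEO last
print(median(pay), round(pay[-1] / median(pay), 1))  # 55000 218.2
```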

And, OMG, many companies have overseas offices with different pay systems on, get this, different computers!!

Do you hear that noise? That, my friends, is the sound of oligarchy.

The process of finding the median (pictured above) was so exhausting that Dan went to sleep immediately after writing this post.

Congo Rape Study: Systematic or Simplistic?

I have not yet read the new report on rape in the Congo, but judging from the news coverage of its reported findings, I have three thoughts:

1) I am not as concerned as some critics about the methods used (a population sample of household interviews) or the staggering results: 400,000 women assaulted in a single year. I am concerned about the comparisons to the US or other countries, since unless the same methods are replicated elsewhere there is no way to compare rape rates or to accurately call Congo the “rape capital of the world.”

2) Though the emphasis is on the number of rapes committed by soldiers, the report also shows that nearly a quarter of the rapes recorded were perpetrated by the women’s husbands or domestic partners. This is consistent with earlier Oxfam data that demonstrated the majority of rapes in the Congo between 2004 and 2008 were perpetrated by civilians, not soldiers.

3) Since patterns of sexual violence against men in the Congo are better documented than in many other conflicts, it is particularly surprising that this study focuses only on women, and only on women “of reproductive age.” This promotes a troubling stereotype about rape and rape victimization.

Jason Stearns, who has been writing and blogging up a storm about Congo this past week, is the first to point out that the social construction of the rape angle has been as much about selling the Congo story to Western grassroots constituencies as about reporting the conflict accurately.

It’s hard to know from his various op-eds and articles where Stearns actually stands on this. At Foreign Policy, he argues that it was only when John Prendergast‘s Enough Project stopped trying to explain the conflict and started focusing on “rape and conflict minerals” that they were able to get Western publics interested in putting pressure on their elected officials. But at CSM, in “Congo is More Than Rape and Minerals,” he proposes a point-by-point list of pitfalls journalists should avoid in writing about the Congo, not least simplistic portrayals of rape:

Some Congolese are unscrupulous and vicious, but they usually have reasons for what they do. If we can understand why officials rape (and it’s not always just as a “weapon of war”) and why they steal money (it’s not just because they are greedy) we might get a bit better at calibrating solutions. Of course, it’s much harder to interview a rapist or a gun-runner than their victims. But don’t just shock us; make us understand. Otherwise we only have ourselves to blame when we react to a rape epidemic by just building hospitals and not trying to get at the root causes.

So it sounds like the most important report on rape in the Congo is one that hasn’t been written yet: one in which perpetrators themselves are systematically interviewed. Actually, political scientists have already blazed a trail here: in this study, authors Maria Eriksson Baaz and Maria Stern find, among other things, that Congolese soldiers see ethical distinctions between different types of rapes.

That kind of insight might not be so useful for advocacy purposes in the West, but it might help aid workers, peace-keepers and protection specialists in the Congo in their prevention efforts – at least vis a vis military perpetrators. (As Laura Seay details, advocacy attention to a problem doesn’t by itself ensure the policy outcome you want – for that, you need to understand the situational context.)

But ultimately Stearns would like to see a wider repertoire of stories about the Congo in the Western press:

Who are the Chinese companies working in the Congo and what have their experiences been? Did you know that Congo was one of the first countries to experiment with mobile cash-transfers to pay for demobilized soldiers? Have you checked out the famous artist studios in Kinshasa of Cheri Samba or Roger Botembe? The country’s tax revenues have doubled over the past several years – how does that square with its corrupt reputation? What are Dan Gertler’s financial relations with the Israeli right-wing? The Kivus apparently produce 40 percent of the world supply of quinine – might be a story there.

[cross-posted at Lawyers, Guns and Money]


Actually, We Don’t Know How Many Civilians Are Dying in Drone Strikes.

Peter Bergen and Katherine Tiedemann at the New America Foundation are keeping one of the most useful datasets on drone strike fatalities that I know of. They’ve been tallying reports of strikes since 2004. They limit their data to those reported by:

“news organizations with deep and aggressive reporting capabilities in Pakistan (the New York Times, Washington Post, and Wall Street Journal), accounts by major news services and networks (the Associated Press, Reuters, Agence France-Presse, CNN, and the BBC), and reports in the leading English-language newspapers in Pakistan (the Daily Times, Dawn, the Express Tribune, and the News), as well as those from Geo TV, the largest independent Pakistani television network.”

This gives them a systematic, if conservative, estimate of total fatalities. They then gather, archive and code the data in a transparent and replicable way – unlike other estimates of drone strikes that don’t provide evidence of how they derive their statistics. Bergen and Tiedemann’s results give us a descriptive picture of how drone strikes have increased over time and changed by location and impact. Their website includes a set of helpful visualizations:

While I find the effort impressive and have sometimes cited Bergen and Tiedemann’s data as decent mid-range estimates of drone-strike fatalities, I am developing some reservations about the coding methods being used and the inferences being made after looking more closely at their dataset. In particular, Bergen and Tiedemann’s estimates of the ratio of civilian to militant deaths bear closer examination.

1) It’s important to emphasize that these estimates, most recently outlined in a Foreign Policy article entitled “There Were More Drone Strikes — and Far Fewer Civilians Killed”, do not actually measure the ratio of civilian to militant deaths. They measure the ratio of reported civilian to reported militant deaths. This is a very important distinction that seems to have been lost on Bergen and Tiedemann, who claim in their recent Foreign Policy piece that “even as the number of reported strikes has skyrocketed, the percentage of non-militants killed by the attacks has plummeted.” It is more accurate to say that the percentage of non-militants reported killed by the attacks has plummeted.

Acknowledging that this is data on news reporting more than data on actual deaths puts the data in a different light. For example, the declining trend in ‘civilian deaths’ could mean fewer civilians are in fact being killed. Or it could mean a shift in how reporters are interpreting ‘civilian’ or ‘militant’ over this time period – a period in which the very concept of the “civilian” is being degraded in popular, media and diplomatic discourse both by evolving events and by the notion, among other things, that a person loses their civilian status simply by being suspected of militancy against their government.

2) But let us set aside for a moment the question of whether (and which part of) war law (and therefore the civilian/combatant distinction) really applies to US airpower inside Pakistan. And let’s assume that it is legitimate to treat “suspected militant” as synonymous with “combatant” and “non-suspect” as synonymous with “civilian.” I still worry that Bergen and Tiedemann are overestimating militant deaths in these reports. One of the reasons for this is probably inevitable given their method: they rely on what mainstream reporters say, and reporters rely on information from the governments doing the killing. But another reason is completely within their control: they use “militant” rather than “civilian” as the default code when the actual status of the deceased, according to the reports, is “unknown” or contested.

For example, Bergen and Tiedemann record a December 31, 2009 attack in which CNN reported 2 were killed, 3 injured, and it was unclear whether any of the dead or injured were militants; and in which AFP reported 3 militants were killed and that “the identity of the militants is not known yet.” This event was coded in the Bergen/Tiedemann dataset as “Al-Qaeda/Taliban killed: 2-5; Others killed: unknown.”

At a minimum, it would seem to me, this event should have been coded as 2-5 deaths “status unknown” rather than counting as either definitely militants or definitely civilians. In fact, however, it would be more consistent with humanitarian law, from which the civilian/combatant distinction is derived, to record any deaths in which the status of the deceased are unknown as civilians. (Article 50(1) of the 1st Additional Protocol to the Geneva Conventions states that “In case of doubt whether a person is a civilian, that person shall be considered to be a civilian.”)

I would be interested to know how the Bergen/Tiedemann ratio of “civilians” (non-militant suspects) to “combatants” (militant suspects) would change if their coding were replicated with either of these two minor yet significant changes introduced. (In the case of the Jamestown Foundation study released earlier this year, the latter approach would have made an enormous difference in their findings even with males over 13 excluded, jumping the civilian hit rate from 5% to 27%.)
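To make the stakes of the coding rule concrete, here is a minimal sketch of that sensitivity check. The event records are hypothetical, invented purely for illustration (not drawn from the Bergen/Tiedemann dataset); the point is only how much the civilian share moves when deaths of unknown status are defaulted to “militant,” excluded, or presumed civilian per Article 50(1).

```python
# Sketch of the recoding sensitivity check described above, using
# hypothetical event records (NOT Bergen/Tiedemann's actual data).
# Each event lists deaths reported as militant, civilian, or unknown.

events = [
    {"militant": 4, "civilian": 0, "unknown": 3},
    {"militant": 2, "civilian": 1, "unknown": 2},
    {"militant": 6, "civilian": 0, "unknown": 0},
    {"militant": 0, "civilian": 2, "unknown": 5},
]

def civilian_share(events, unknown_as):
    """Percent of deaths counted as civilian under a given default code."""
    civ = mil = 0
    for e in events:
        civ += e["civilian"]
        mil += e["militant"]
        if unknown_as == "civilian":    # Article 50(1) presumption
            civ += e["unknown"]
        elif unknown_as == "militant":  # default-to-militant coding
            mil += e["unknown"]
        # unknown_as == "excluded": leave contested deaths out of the ratio
    return round(100 * civ / (civ + mil), 1)

for rule in ("militant", "excluded", "civilian"):
    print(rule, civilian_share(events, rule))
# militant 12.0 / excluded 20.0 / civilian 52.0
```

Even in this toy example, the same reports yield a civilian share anywhere from 12% to 52% depending solely on the default code, which is why the choice deserves scrutiny.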

3) All this only goes to show how impoverished our understanding of the civilian impacts of different weapons will remain until some independent verification mechanism is established for tallying and reporting the dead in today’s wars. Important efforts are underway to fill this critical gap in the Geneva regime and should be supported by advocates of human rights and humanitarian accountability.

[cross-posted at Lawyers, Guns and Money]


“Statistics is the New Grammar”

[Cross-posted at Signal/Noise]

In the latest issue of WIRED, Clive Thompson pens a great piece which echoes a sentiment I’ve touched on before: in a data-driven world it is critical that all citizens have at least a basic literacy in statistics (really, research methodology broadly, but I’ll take what I can get).

Now and in the future, we will have unprecedented access to voluminous amounts of data. The analysis of this data and the conclusions drawn from it will have a major impact on public policy, business, and personal decisions. The net effect of this could go either way–it can usher in a period of unprecedented efficiency, novelty, and positive decision making or it can precipitate deleterious actions. Data does not speak for itself. How we analyze and interpret that data matters a great deal, which puts a premium on statistical literacy for everyone–not just PhDs and policy wonks.

Thompson notes a number of statistical fallacies that many, including members of the media, fall prey to. Using a single event to prove or disprove a general property or trend is one spectacular one that we see all the time, particularly with large, macro-level events. Regardless of what side of the climate change debate you are on, a single snowstorm or record-breaking heat wave does not rise to the level of hypothesis-nullifying or -verifying evidence.

There are oodles of other examples of how our inability to grasp statistics–and the mother of it all, probability–makes us believe stupid things. Gamblers think their number is more likely to come up this time because it didn’t come up last time. Political polls are touted by the media even when their samples are laughably skewed.
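The gambler’s fallacy is easy to check by simulation. A quick sketch (assuming a fair coin): after a run of three tails, the next flip is still heads about half the time.

```python
# Simulating the gambler's fallacy: conditioning on a streak of three
# tails does not change the probability of heads on the next flip.
import random

random.seed(0)
flips = [random.random() < 0.5 for _ in range(200_000)]  # True = heads

after_streak = [
    flips[i + 3]
    for i in range(len(flips) - 3)
    if not any(flips[i:i + 3])  # previous three flips were all tails
]
rate = sum(after_streak) / len(after_streak)
print(round(rate, 3))  # close to 0.5, streak or no streak
```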

Take correlation and causation. The cartoon below nicely illustrates the common fallacy that the correlation of two events is enough to prove that one causes the other:
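The same point can be made with a toy simulation. Here a confounder drives two otherwise unrelated series into strong correlation (the “summer” / ice cream / drownings setup is my illustration, not the cartoon’s):

```python
# Two series that never influence each other can still be strongly
# correlated if both track a common cause (here, a "summer" index).
import math
import random

random.seed(1)
n = 1000
summer = [random.gauss(0, 1) for _ in range(n)]          # confounder
ice_cream = [s + random.gauss(0, 0.5) for s in summer]   # driven by summer
drownings = [s + random.gauss(0, 0.5) for s in summer]   # also driven by summer

def corr(x, y):
    """Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) *
                           sum((b - my) ** 2 for b in y))

r = corr(ice_cream, drownings)
print(round(r, 2))  # strongly positive, despite no causal link between them
```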

In thinking about this I remembered an argument I had with a number of colleagues while in grad school over why they had to be at least somewhat literate in quantitative analysis and game theory even if they never intended to use such methods. Given that we will only see an increase of data and data-based (no pun intended) arguments, policies, and decisions, we need to, at a minimum, be able to understand how the results were achieved and whether or not the studies are flawed. Patrick is probably the last person to apply quantitative methods to social scientific problems, but he can certainly speak the language with the best of them.

Bottom line: the importance of statistical literacy will only increase. Statistics will come to permeate our lives, more so than ever before. We had better be able to speak the language.

Methodology Lessons: DOE’s Natural-gas Overstatement

[Cross-posted at Signal/Noise]

The Wall Street Journal reported yesterday that the US Department of Energy is set to restate the data it collects on U.S. natural-gas production. The reason? The Department has learned that its methodology is seriously flawed:

The monthly gas-production data, known as the 914 report, is used by the industry and analysts as guide for everything from making capital investments to predicting future natural-gas prices and stock recommendations. But the Energy Information Administration (EIA), the statistical unit of the Energy Department, has uncovered a fundamental problem in the way it collects the data from producers across the country—it surveys only large producers and extrapolates its findings across the industry. That means it doesn’t reflect swings in production from hundreds of smaller producers. The EIA plans to change its methodology this month, resulting in “significant” downward revision.

The gap in output between what the 914 report has been predicting and what is actually occurring has been growing larger and larger. Many analysts have long suspected the methodology underlying the reports was faulty, but the EIA has been slow to revise it. The overestimation of output has depressed prices, which are now at their lowest in seven years. Any revision to the methodology will bring about a “correction” in energy markets, and particular states will surely see their output dip significantly.

So what can we learn from this from a methodological perspective? A few things:

  1. How you cast the die matters: The research methodology that we employ for a given problem significantly impacts the results we see and, therefore, the conclusions we draw about the world. The problem with the DOE’s 914 report wasn’t simply a matter of a bad statistical model, it was the result of unrepresentative data (i.e. relying only on the large producers). This isn’t simply an issue of noisy or bad data, but of systemic bias as a result of the methodology employed by the EIA. The data itself is seemingly reliable. The problem lies with the validity of the results, caused by the decision to systematically exclude small producers and potentially influential observations from the model.
  2. Representativeness of data doesn’t necessarily increase with the volume of data: More than likely, the thought went that if the EIA collected data on the largest producers, their extrapolations about the wider market would be sound–or close enough–since the largest players tend to account for the bulk of production. However, as we see with the current case, this isn’t necessarily true. At some point in history this methodology may have been sound, but it appears that changes to the industry (technology, etc.) and the increased importance of smaller companies have rendered the old methodology obsolete. Notice that the EIA’s results are probably statistically significant, but achieving significance really isn’t that difficult once your sample size gets large enough. What is more important is representativeness–is the sample you’ve captured representative of the larger population? Many assume that size and representation are tightly correlated–this is an assumption that should always be questioned and, more importantly, verified before relying on the conclusions of research.
  3. Hypothesis-check your model’s output: The WSJ article notes that a number of independent analysts long suspected a problem with the 914 reports by noticing discrepancies in related data. For example, the 914 report claimed that production increased 4% in 2009. This was despite a 60% decline in onshore gas rigs. If the 914 report is correct, would we expect to see such a sharp decline in rigs? Is this logically consistent? What else could have caused the 4% increase? The idea here is to draw various hypotheses about the world assuming your conclusions are accurate and test them–try to determine, beyond your own data and model, whether your conclusions are plausible. Too often I’ve found that businesses fail to do this (possibly because of time constraints and less of a focus on rigor), but academics often fall into the same trap.
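Point 2 above lends itself to a toy simulation, using a made-up skewed market rather than actual EIA figures: surveying only the biggest producers stays biased no matter how many you survey, while a far smaller random sample tracks the true average.

```python
# Illustration: a huge but non-random sample (largest producers only)
# is biased regardless of its size; a small random sample is not.
# The market below is invented for illustration, not real EIA data.
import random

random.seed(2)
# 10,000 producers with a heavy-tailed (Pareto) size distribution:
# a few large players, many small ones.
producers = [random.paretovariate(3) for _ in range(10_000)]
true_mean = sum(producers) / len(producers)

top = sorted(producers, reverse=True)[:1_000]   # survey the top 10% only
top_mean = sum(top) / len(top)

rand = random.sample(producers, 300)            # small random survey
rand_mean = sum(rand) / len(rand)

print(true_mean, top_mean, rand_mean)
# top_mean badly overstates the market average; rand_mean lands near truth
```

The large-producer survey here is more than three times the size of the random one, yet it is the random one that generalizes, which is the whole point about representativeness versus volume.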


On The Ecology of Human Insurgency

Charli highlighted the recently published work of Sean Gourley in Nature on the pattern of frequency and magnitude of attacks in insurgencies, so I wanted to cross-post my critique of this work to initiate a discussion here at Duck.

The cover story of this month’s Nature features the work of a team of researchers examining the mathematical properties of insurgency. One of the authors is Sean Gourley, a physicist by training and TED Fellow, and this work represents the culmination of research by Gourley and his co-authors—a body of work that I have been critical of in the past. The article is entitled, “Common ecology quantifies human insurgency,” (gated) and the article attempts to define the underlying dynamics of insurgency in terms of a particular probability distribution; specifically, the power-law distribution, and how this affects the strategy of insurgents.

First, I am very pleased that this research is receiving such a high level of recognition in the scientific community, e.g., Sean tweeted that this article “beat out ‘the new earth’ discovery and the ‘possible cancer cure’ for the cover of nature.” Scholarship on the micro-level dynamics of a conflict is undoubtedly the future of conflict science, and these authors have ambitiously pushed the envelope, collecting an impressive data set spanning both time and conflict geography. Bearing in mind the undeniable value of this work, it is important to note that several claims made by the authors do not seem consistent with the data, or at least require a dubious suspension of disbelief.

In many ways I reject the primary thrust of the article, which is that because the frequency and magnitude of attacks in an insurgency follow a power-law distribution this somehow illuminates the underlying decision calculus of insurgents. Without belaboring a point that I have made in the past, the observation that conflicts follow a power-law is in no way novel, though I am encouraged that the authors did cite the seminal work on this subject (thank you for pointing out my errata, Sean). The data measure the lethality and frequency of attacks perpetrated in the Iraq, Afghanistan, Peru and Colombia insurgencies, but the connection between this and the strategy of an insurgent is missing.

The authors’ primary data sources are open media reports on attacks; therefore, their observation simply reveals that open-source reporting on successful insurgent attacks follows a power-law. There are two critical limitations in the data that prevent it from fully answering the questions posited by the authors. First, there is some non-negligible level of left-censoring, i.e., we can never quantify the attacks that are planned by insurgents and never carried out, or those that are attempted but fail (defective IEDs, incompetent actors, etc.). Although they do not inflict damage, these attacks are clearly byproducts of insurgent strategy, and therefore must be present in a model of this calculus. Second, while the authors claim to overcome selection bias by cross-validating attack observations, this remains a persistent problem. Consider the insurgencies in Iraq and Afghanistan: in the former, most of the attacks occurred in heavily populated urban areas, garnering considerable media coverage. In contrast, Afghanistan is largely a rural country, where the level of media scrutiny is considerably lower, meaning that media outlets there are inherently selective in what they report, or most reports are generated by US DoD reporting. How do we handle the absence of attack observations for Afghan villages outside the purview of the mainstream media?

The role of the media is central to the decision model proposed by the authors, which is illustrated in the figure above. Again, however, this presents a logical disconnect. As the figure describes, the authors claim that insurgents are updating their beliefs and strategies based on the information and signals they receive from broadcast news, then deciding whether to execute an attack. For lack of a better term, this is clearly putting the cart before the horse. The media is reporting attacks, as the authors’ data clearly proves; therefore, the insurgents’ decision to attack is creating news, and as such insurgents are gaining no new information from media reports on attacks that they themselves have perpetrated. Rather, the insurgents retain a critical element of private information, and are updating based on the counter-insurgency policies of the state—information they are very likely not receiving from the media. The framework presented here is akin to claiming that in a game of football (American) the offense is updating their strategy in the huddle before ever having seen how the defense lines up. Without question, in football both sides are updating strategy constantly, but it is the offense that dictates this tempo, and in an insurgency the insurgents are on offense.

This interplay between an insurgency and the state is what must be the focus of future research on the micro-dynamics of conflict. From the perspective of this research, a more novel track would be to attempt to find an insurgency that does not follow a power-law, but rather a less skewed distribution, such as the log-normal or a properly fit Poisson. Future research may also benefit from examining the distribution of attacks in the immediate or long-term aftermath of a variation in counter-insurgency policy. After addressing some of the limitations described above, such research might begin to identify the factors that contributed to why some counter-insurgency policies shift the attack distribution away from the power-law. The key to any future research, however, is to connect this to the context of the conflict in a meaningful way.
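As a sketch of what distinguishing a power law from a log-normal might look like in practice, one can fit both distributions by maximum likelihood and compare log-likelihoods. The data below are synthetic (deliberately log-normal), not the authors’ dataset, and the power-law fit uses the standard continuous MLE (Hill-type estimator).

```python
# Fit a power law and a log-normal to event-size data and ask which
# the likelihood prefers. Data here are synthetic, not the Nature data.
import math
import random

random.seed(3)
sizes = [random.lognormvariate(1.0, 0.8) for _ in range(2_000)]
n = len(sizes)
x_min = min(sizes)

# Power-law MLE on [x_min, inf): p(x) = ((a-1)/x_min) * (x/x_min)^(-a)
alpha = 1 + n / sum(math.log(x / x_min) for x in sizes)
ll_pl = sum(math.log((alpha - 1) / x_min) - alpha * math.log(x / x_min)
            for x in sizes)

# Log-normal MLE: mu, sigma are the mean and sd of log(x)
logs = [math.log(x) for x in sizes]
mu = sum(logs) / n
sigma = math.sqrt(sum((l - mu) ** 2 for l in logs) / n)
ll_ln = sum(-math.log(x * sigma * math.sqrt(2 * math.pi))
            - (math.log(x) - mu) ** 2 / (2 * sigma ** 2) for x in sizes)

print("log-normal wins" if ll_ln > ll_pl else "power law wins")
```

On genuinely log-normal data the log-normal fit should win decisively; running the same comparison on real attack data is exactly the kind of check suggested above.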

Again, congratulations to Sean and his team, I hope their piece will initiate a productive discussion in both academic and policy arenas on the methods and techniques for studying the micro-dynamics of conflict.

Photo: Nature


Speaking of Fatality Data…

A physicist named Sean Gourley has created a model that he claims explains the power law distribution of deaths in insurgencies across a range of country contexts. Just published in Nature. The abstract is here. Check out his presentation on his original correlational findings from last May:

Q&A about his new model here. I’m not sure I understand it well enough to comment, but I figured Duck readers would find it interesting, and I’m asking myself how I can get my hands on his data to look at whether it’s broken down by category of victim…


“OMG”: War Death Statistics Reconsidered

Researchers associated with the Human Security Report Project have a new article in the Journal of Conflict Resolution contradicting a recent critique of corpse-counting techniques prevalent in the battle-deaths community. The original critique, authored by Obermayer, Murray and Gakidou (humorously referred to as OMG by the authors of this new rebuttal) compared war death reports from the World Health Organization‘s sibling survey data with battle-death estimates for the same countries from the International Peace Research Institute in Oslo (PRIO) and concluded that battle-death estimating methods (which draw on incident reports by third party observers) significantly undercount war deaths because so many go unreported. Surveys, OMG argued, constitute a better measure of the death toll of war; and they conclude that in the wider survey data there is less support for the widely reported global decline in war deaths.

Michael Spagat, Andrew Mack and their collaborators point out a few errors and inconsistencies in the comparison drawn by “OMG.” The most damning of these is that the data are non-comparable, since the PRIO dataset is measuring “battle-deaths” (soldiers killed and civilians killed in the crossfire in two-sided wars where a government is one of the parties) whereas the WHO dataset is measuring all “war deaths” as reported by conflict-affected populations. So the most OMG can say is battle-death estimating methods undercount war deaths because they aren’t counting war deaths. Maybe they have a point. Actually I think both sets of indicators – as well as the labels we assign to them – are subject to critique, and I’ve said so elsewhere.

However, regardless of how we define which corpses to count and what to call them, what ought to be at issue here is how best to arrive at valid estimates. Suppose OMG’s original findings were true, and suppose both datasets were actually trying to measure the same thing. Does this mean surveys are a more accurate measure than incident reporting, or simply that both measures are inaccurate in different directions? I can imagine that surveys would result in significant over-reporting, just as I find it plausible that incident report-based estimation methods may miss some data. I am no number-cruncher, but if I were constructing a casualty dataset for specific wars, I expect I would want to take the average of the two estimates. So this debate over which methodology is more accurate strikes me as slightly misplaced.


Tallying Collateral Damage

Earlier I blogged about the importance and absence of data disaggregating unintentional civilian deaths from total civilian deaths in wars worldwide. To get a preliminary handle on this question, I examined a dataset on civilian victimization developed by Alexander Downes at Duke University for his study on why governments target civilians in war. His dataset includes 100 interstate wars and runs from 1823-2003. It includes low, medium and high estimates for the number of civilian deaths for each party in each conflict, based on available secondary sources. It also includes a separate binary variable for whether there is evidence that governments targeted civilians directly. His not uncontroversial methodological appendices are here. Wars are coded as including evidence of intentional civilian victimization if hostilities included indiscriminate bombardment of urban areas, starvation blockades or sieges, massacres or forced relocation. Civilian deaths in wars not using these techniques can be roughly assumed to be unintentional, or “collateral damage.”*

So are unintentional civilian deaths trending up or down in absolute terms and / or as a percentage of all civilian deaths? This analysis – which is a rough first cut, mind you – suggests that collateral damage rather than war crimes may now constitute the majority of civilian deaths in international wars worldwide, and that the total number of collateral damage deaths is 20 times higher than at the turn of the last century.

The ratio of collateral damage victims to war crimes victims has dramatically increased since the end of the Cold War. According to Downes’ dataset, between 1823 and 1900, unintentional deaths constituted 17% of all deaths in war. Since 1990, that number has risen to 59%.

In other words, the majority of civilian deaths since 1990 have not been war crimes but have been perfectly legal “accidental” killings. Of course this could partly be a result of a decrease in direct targeting of civilians over time, which would be a good thing.

But collateral damage is not only increasing as a percentage of all civilian deaths.
The number of collateral damage victims is also increasing over time in absolute terms. Between 1823 and 1900, 84 civilians per year on average were the victims of collateral damage. Since 1990, the number is 1688 per year – a twenty-fold increase.

So it’s not just a question of collateral damage staying constant while war crimes drop. According to this data, at least, collateral damage is actually taking many more lives than ever before – despite purported increases in precision munitions.

What does this all mean? First, because this cut at the numbers is so rudimentary and so based on data designed to track actual civilian victimization rather than collateral damage, it seems crucial to gather some genuine data on the actual problem. Human rights and humanitarian law organizations should launch cross-national studies aimed at determining the actual numbers. They should also regularly disaggregate their civilian casualty data into intentional v. unintentional in their reporting.

But if these numbers are anywhere close to correct (and I suspect if anything they are conservative) this analysis suggests an urgent need for a rethinking the laws of war designed to protect civilians. In the 1970s, when the [added: Additional Protocols to the] Geneva Conventions were hashed out, a key concern of governments’ was to protect civilians from intentional attack. War crimes are dropping in part because international laws against targeting civilians are working. Collateral damage is increasing in part because of the absence of such clear-cut rules. It’s time for this to change.
*It’s a crude measure because in any given conflict, some civilians may be targeted directly and others may be “collateral damage;” but collateral damage counts here show up only for wars in which there was not also intentional civilian victimization. The data is also limited to interstate wars. But assuming Downes’ data is more or less accurate, we can derive a very conservative set of collateral damage numbers by tallying all civilian deaths for each war in which the state killing civilians was coded as not having done so intentionally. (I used the mid-level estimates in the dataset).
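The tallying rule described in the footnote can be sketched as follows. The war records and the `collateral_stats` helper here are invented for illustration; they are not Downes’s data, only the shape of the computation.

```python
# Minimal sketch of the footnote's tallying rule: take the mid-level
# civilian-death estimate for each war, and count it as collateral
# damage only when the killing state is coded as NOT targeting
# civilians. Records below are invented, not Downes's dataset.

wars = [
    # (start_year, end_year, mid_estimate_civilian_deaths, intentional)
    (1825, 1827, 400, True),
    (1870, 1871, 250, False),
    (1991, 1991, 3500, False),
    (1999, 1999, 500, True),
    (2003, 2003, 4000, False),
]

def collateral_stats(wars, start, end):
    """Collateral share of civilian deaths (%) and per-year average."""
    in_era = [w for w in wars if start <= w[0] <= end]
    collateral = sum(d for (_, _, d, intent) in in_era if not intent)
    total = sum(d for (_, _, d, _) in in_era)
    per_year = collateral / (end - start + 1)
    return round(100 * collateral / total, 1), round(per_year, 1)

print(collateral_stats(wars, 1823, 1900))  # (share %, per-year average)
print(collateral_stats(wars, 1990, 2003))
```

As the footnote warns, this is conservative: wars with any intentional victimization contribute nothing to the collateral count, even if some of their civilian deaths were in fact accidental.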


How Many of War’s Civilian Casualties are “Collateral Damage”?

This is an important question from a legal and humanitarian perspective.

In legal terms, targeting civilians is a war crime. Accidentally killing or maiming them in the pursuit of legitimate military objectives is, well, just too bad. So in judging government’s records of compliance with the law, one needs to measure the difference.

There are policy ramifications to such measurements as well. Over time, atrocities against civilians seem to be falling. But at the same time, some governments seem more complacent than ever about accidental deaths. The assumption behind the wiggle room in the law is that if countries do their best not to hit civilians, then collateral damage will always be the least of the problem for civilian populations. And perhaps this was true in earlier times. But what if in fact the majority of civilian deaths worldwide now come from these “accidents of war”? If so, this would suggest that the laws of war are woefully outdated – that even if fully implemented they do not, in fact, do enough to protect civilians. In that case, humanitarian organizations really should be in an uproar.

So what percentage of total civilian deaths are “collateral damage,” and is this percentage trending up or down over time? I’ve begun investigating the answer as part of my current book project, and as far as I can tell, no one really knows. Human rights reporting generally doesn’t distinguish intentional from unintentional deaths, treating all civilian casualties as the tragedies that they are. Neither do academic tools such as the Dirty War Index or various datasets on conflict fatalities in general or civilian victimization. Even databases that count casualties for specific wars, like the Iraq Body Count, tend to break down the data by type of incident (suicide bombing v. shooting) rather than the intent of the perpetrator. And if a comprehensive study exists tracking unintentional civilian deaths worldwide, I haven’t heard of it.

So if any of you has, please let me know.

© 2021 Duck of Minerva