Tag: forecasting

Can’t stop thinking about tomorrow…

Michael Horowitz and Philip Tetlock have an interesting piece in Foreign Policy that examines the record on long-range forecasting of global events 15 to 20 years into the future. They acknowledge the inherent difficulties of such projections, but still wonder:

whether there are not ways of doing a better job — of assigning more explicit, testable, and accurate probabilities to possible futures. Improving batting averages by even small margins means the difference between runner-ups and World Series winners — and improving the accuracy of probability judgments by small margins could significantly contribute to U.S. national security.

Overall, I like the piece, but I do wonder about a couple of the basic premises and their prescription.

1. Would improving the accuracy of probability judgments actually enhance US national security? I’m not convinced. And, unfortunately, Horowitz and Tetlock don’t unpack this claim. They do acknowledge, and I agree, that improving accuracy would be difficult and that any gains would only be improvements on the margins. The world is getting more complex, not less. It is more dynamic, not less. More actors, many of them new, are interacting in the international system with greater frequency, more intensity, and faster speeds. The result is a constantly changing strategic environment in which actors act and react, and in doing so continue to change that environment. In short, minor improvements in accuracy might not do anything because, on the whole, everything is getting more complex.

2. Is accuracy the right metric? Even if we did have a better understanding (or thought we did) of the future, any policy calibrations made today on the basis of what that future might look like could alter the future in ways that make the original long-range forecast inaccurate. In this sense, accuracy may well be the wrong metric.

3. Is there a downside in trying to get better? Maybe. Horowitz and Tetlock conclude:

Even if we were 80 percent or 90 percent confident that there is no room for improvement — and the Global Trends reports are doing as good a job as humanly and technically possible at this juncture in history — we would still recommend that the NIC conduct our proposed experiments. When one works within a government that routinely makes multibillion-dollar decisions that often affect hundreds of millions of lives, one does not have to improve the accuracy of probability judgments by much to justify a multimillion-dollar investment in improving accuracy.

Again, I think there is utility in long-range forecasting exercises; I’m just not sure I see any real benefits from improved accuracy on the margins. There may actually be some downsides. First, a “multi-million dollar investment” (they don’t tell us exactly how much) is still money, and it may be a waste of time and money to throw even more resources at an effort that is principally of interest only to the participants. Do policymakers really get much from projects like Global Trends or other long-range forecasts, and would they get added benefits from marginal improvements in accuracy? They already have their own biases and perceptions of the future. Do these exercises have any real influence?

Second, what if we spend more time, money, and other resources to enhance those capabilities, and doing so alters decision-makers’ perceptions and gives them an unfounded sense of accuracy, i.e., they come to see long-range forecasting as producing accurate or realistic futures? We may get a whole host of policy reactions that are unnecessary, wrong, and counterproductive based on what are still probabilistic outcomes.

I’m not saying we shouldn’t tweak these exercises to make them better for all involved. I also agree with Horowitz and Tetlock that there is utility in conducting these long-range forecasting efforts. It is helpful to enlist a broad set of academic and government views to assess current and long-term trends. My own sense is that these efforts probably tell us more about the present than they do about the future. They force analysts to articulate their often embedded assumptions and to project into the future the likely consequences of their current assessments. I think we should keep them; I’m just not sure we need to spend too much more time and money on them. Of course, I might be wrong.


Who needs experts to forecast international politics?

This is a guest post by Michael C. Horowitz, Associate Professor of Political Science at the University of Pennsylvania.

Who can see the future? For us mere mortals, it’s hard, even for so-called experts. There are so many cognitive biases to take into consideration, and even knowing your own weaknesses often does not help. Neither does being smart, apparently. So, what does make for “good judgment” when it comes to forecasting? When, if ever, do experts have advantages in making predictions? And how can we combine expertise and statistical models to produce the best possible predictions? This is not just an academic question, but one relevant for policy makers as well, as Frank Gavin and Jim Steinberg recently pointed out. There are new efforts afoot to try to determine the boundary conditions in which experts, both political scientists and otherwise, can outperform methods such as the wisdom of crowds, prediction markets, and groups of educated readers of the New York Times. At the bottom of this post is information on how to assist in this research. I hope you will consider doing so.

Jacqueline Stevens recently argued in the New York Times that “Political Scientists are Lousy Forecasters.” In her article, which others have already dissected, she discusses Phil Tetlock’s work on expert forecasting. His book, Expert Political Judgment, has become the definitive work on the subject. The postage stamp version she cites is that experts are only slightly better than dart-throwing chimps at predicting the future, if they are better at all.

However, the notion that Tetlock argues that experts are know-nothings when it comes to forecasting is simply wrong, as others have already pointed out. More important, Expert Political Judgment was a first foray into the uncharted domain of building better forecasting models. Several years later, Tetlock is back at it, and this time he has invited me, Richard Herrmann of Ohio State University, and others to join him. The immediate goal this time is to participate in a forecasting “tournament” sponsored by the United States intelligence community. The intelligence community has funded several teams to go out and build the best models possible, however they can, to forecast world events. Each team has to forecast the same events, a list of questions given to the teams by the sponsor, and then submit predictions [note: Tetlock’s team dominated the opposition in year one, so we’ll find out this year whether adding me helps or not. Unfortunately, there’s no place to go but down].

Our team is called the Good Judgment team, and the idea is to not only win the tournament, but also to develop a better understanding of the methods and strategies that lead to better forecasting of political events. There are many facets to this project, but the one I want to focus on today is our effort to figure out when experts such as political scientists might have advantages over the educated reader of the New York Times when it comes to forecasting world events.

One of the main things we are interested in determining is the situations in which expert knowledge adds value when it comes to making predictions about the world. Evidence from the first year of the project (year 2 started on Monday, June 18) suggests that, contrary to Stevens’ argument, experts might actually have something useful to say after all. For example, we have some initial evidence on a small number of questions from year 1 suggesting that experts update faster than educated members of the general public: they are better at determining the full implications of changes in events on the ground and updating their beliefs in response to those events.

Over the course of the year, we will be exploring several topics of interest to the readers (and hopefully authors) of this blog. First, do experts potentially have advantages when it comes to making predictions that are based on process? In other words, does knowing when the next NATO Summit is occurring help you make a more accurate prediction about whether Macedonia will gain entry by 1 April 2013 (one of our open questions at the moment)? Alternatively, could it be that the advantage of experts is that they have a better understanding of world events when a question is asked, but then that advantage fades over time as the educated reader of the New York Times updates in response to world events?

Second, when you inform experts of the predictions derived from prediction markets, the wisdom of groups, or teams of forecasters working together, are they able to use this information to yield more accurate predictions than the markets, the crowd, or teams, or do they make it worse? In theory, we would expect experts to be able to assimilate that information and use it to more accurately determine what will happen in the world. Or, maybe we would expect an expert to be able to recognize when the non-experts are wrong and outperform them. In reality, will this just demonstrate that the experts are stubborn, but not in a good way?

Finally, are there types of questions where experts are more or less able to make accurate predictions? Might experts outperform other methods when it comes to election forecasting in Venezuela or the fate of the Eurozone, but prove less capable when it comes to issues involving the use of military force?

We hope to explore these and other issues over the course of the year and think this will raise many questions relevant for this blog. We will report back on how it is going. In the meantime, we need experts who are willing to participate. The workload will be light, I promise. If you are interested in participating, expert or not, please contact me at horom (at) sas (dot) upenn (dot) edu and let’s see what you can do.

Better Political Forecasts through Crowdsourcing

Dan Drezner links to a recent article by Philip Tetlock on the difficult business of political forecasting. His evaluation of this troubled pastime is accomplished through the review of three recent books that all claim to provide a better way to see the future of politics. His own research (Expert Political Judgment: How Good Is It? How Can We Know?, a fantastic book that you really should read) offers solid reasons to be skeptical of any pronouncements by ‘experts’ that they have some kind of proprietary knowledge about the future.

While I think his critique of the three books and of political forecasting in general is quite good, I find lacking one of his suggestions for how to improve the practice; namely, crowdsourcing. My issue does not lie with the practice of crowdsourcing, but rather with the way that Tetlock describes it.

After his review of the three books (and the respective approaches to forecasting each represents), Tetlock provides a powerful suggestion for how to improve the prediction business, namely crowdsourcing political forecasts:

Aggregation helps. As financial journalist James Surowiecki stressed in his insightful book The Wisdom of Crowds, if you average the predictions of many pundits, that average will typically outperform the individual predictions of the pundits from whom the averages were derived. This might sound magical, but averaging works when two fairly easily satisfied conditions are met: (1) the experts are mostly wrong, but they are wrong in different ways that tend to cancel out when you average; (2) the experts are right about some things, but they are right in partly overlapping ways that are amplified by averaging. Averaging improves the signal-to-noise ratio in a very noisy world. If you doubt this, try this demonstration. Ask several dozen of your coworkers to estimate the value of a large jar of coins. When my classes do this exercise, the average guess is closer to the truth than 80 or 90 percent of the individual guesses. From this perspective, if you want to improve your odds, you are better-off betting not on George Friedman but rather on a basket of averaged-out predictions from a broad ideological portfolio of George Friedman–style pundits. Diversification helps.
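The coin-jar demonstration in the passage above can be sketched as a small simulation. This is only an illustration under stated assumptions, not Tetlock's or Surowiecki's procedure: the function name `simulate_jar_guesses`, the jar value of 500, the 36 guessers, and the independent, unbiased Gaussian noise are all hypothetical choices made for the example.

```python
import random
import statistics


def simulate_jar_guesses(true_value=500.0, n_guessers=36, noise_sd=100.0, seed=0):
    """Simulate independent, unbiased guesses and compare them with their average.

    All parameters are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible

    # Each guesser is wrong in their own way: independent noise around the truth.
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(n_guessers)]

    # The "crowd" forecast is the simple average of the individual guesses.
    crowd_guess = statistics.fmean(guesses)
    crowd_error = abs(crowd_guess - true_value)

    individual_errors = [abs(g - true_value) for g in guesses]

    # Share of individuals whose guess is farther from the truth than the average.
    beaten = sum(1 for e in individual_errors if e > crowd_error)

    return {
        "crowd_error": crowd_error,
        "mean_individual_error": statistics.fmean(individual_errors),
        "share_beaten": beaten / n_guessers,
    }


if __name__ == "__main__":
    result = simulate_jar_guesses()
    print(f"crowd error:            {result['crowd_error']:.1f}")
    print(f"mean individual error:  {result['mean_individual_error']:.1f}")
    print(f"share of guessers beaten by the average: {result['share_beaten']:.0%}")
```

One property holds regardless of the noise model: the error of the average can never exceed the average of the individual errors. How many individuals the average actually beats depends on the errors being diverse enough to cancel, which is the point taken up below.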

As Dan points out in his post, this suggestion potentially violates two of the necessary conditions of successful crowdsourcing, namely the independence of the experts and the diversity of their opinions. Dan says it best:

One of the accusations levied against the foreign policy community is that because they only talk to and read each other, they all generate the same blinkered analysis. I’m not sure that’s true, but it would be worth conducting this experiment to see whether a Village of Pundits does a better job than a single pundit.

I would actually go farther than Dan here. The problem with this approach isn’t simply that political scientists and pundits may conduct their analysis in an echo chamber (although that is definitely an issue), but rather that for the crowdsourcing of these issues to work properly you would want as diverse a crowd as possible, meaning you would want to include individuals from outside of political science and the political pundit community.

Beyond an effective aggregation mechanism, James Surowiecki points to three necessary conditions for successful crowdsourcing:

  1. Diversity of opinion
  2. Independence of those opinions
  3. Decentralization (i.e. ability to lean on local knowledge)

Political scientists and pundits do not hold a monopoly on useful insights into the world of politics. Other actors have an interest in understanding and predicting what will happen politically, including financial analysts, corporations, journalists, and politicians and citizens around the globe. Each of these groups likely brings its own perspective and lens for analyzing political outcomes to the table, and from a crowdsourcing perspective that is precisely what one would want (diversity, independence, and decentralization). The answer isn’t simply to gather more opinion from political pundits, but rather to gather more opinion from additional actors who represent an even greater diversity of opinion.
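Why independence matters can be made concrete with a standard statistical identity. The sketch below assumes, purely for illustration, that every forecaster's error has the same standard deviation `sigma` and that every pair of forecasters shares the same error correlation `rho`; neither assumption comes from Tetlock, Surowiecki, or Drezner, and the function name `variance_of_average` is hypothetical.

```python
import math


def variance_of_average(n, sigma=1.0, rho=0.0):
    """Variance of the mean of n forecast errors.

    Assumes (for illustration only) equal error variance sigma**2 and a common
    pairwise correlation rho between any two forecasters' errors:

        Var(mean) = sigma**2 * (1/n + (1 - 1/n) * rho)
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be between 0 and 1 in this sketch")
    return sigma ** 2 * (1.0 / n + (1.0 - 1.0 / n) * rho)


if __name__ == "__main__":
    for rho in (0.0, 0.2, 0.5):
        for n in (1, 10, 100):
            sd = math.sqrt(variance_of_average(n, sigma=1.0, rho=rho))
            print(f"rho={rho:.1f}  n={n:>3}  sd of averaged error={sd:.3f}")
```

With independent errors (`rho = 0`) the variance shrinks toward zero as the crowd grows. With shared errors it can never fall below `rho * sigma**2`, however many forecasters are added. Under these assumptions, enlarging an echo chamber buys little, while adding forecasters whose errors are less correlated with the existing crowd is what lowers the floor.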

I agree with Dan that it would be worthwhile to set up some kind of experiment to determine the optimal composition of a political forecasting crowd. I smell a side project a brewin’….

[Cross-posted at bill | petti]


© 2021 Duck of Minerva