It’s not just what you forecast, it’s how you forecast

January 15, 2016 12:00 PM

The search term “stock market crash 2016” produces more than 12,400 Google results. 

Jeremy Grantham has predicted the market will collapse “around the [2016] presidential election or soon after” and that the Dow will fall to “around half of its peak or worse.” 

At MarketWatch, Paul Farrell predicts a coming mega crash, writing “Dow 5,000.” 

Robert Kiyosaki, author of “Rich Dad, Poor Dad,” says the biggest market crash in history is set for 2016.

There’s no shortage of sky-is-falling prophecies. Still, Farrell and Kiyosaki have both predicted similar crashes within the last four years, only to see the markets rise. Media outlets rarely revisit such predictions when they prove badly wrong. But more 2016 predictions will come; after all, “stock market crash 2015” produces 125,000 Google results.

These so-called experts have a spotty record when it comes to forecasting, as evidenced by the analysis of many market predictions in 2015 found in the following pages. But now academic research and a groundbreaking book back the argument that amateurs outperform experts in finance, politics and even national intelligence when it comes to prediction.

University of Pennsylvania professor Philip Tetlock has identified the behavioral traits that successful forecasters share and developed a methodology for forecasting well. In “Superforecasting: The Art and Science of Prediction,” co-authored with Dan Gardner, he explains why an elite class of amateurs gets it right and how anyone can improve their forecasting skills.

The good judgment project 

Co-author Dan Gardner is well-known for his book, “Risk: The Science and Politics of Fear,” which explores risk psychology and why humans make irrational decisions based on fear. 

“We’re the healthiest people who have ever lived, but we don’t act like it,” says Gardner, a Canadian journalist. As he began considering future events that could affect today’s standard of living and levels of comfort, Gardner rigorously explored areas of geopolitical forecasting.

“When you focus on the science of forecasting, you’ll end up exploring the work of Professor Philip Tetlock,” Gardner says. “He is one of the few top-flight researchers who are trying, in a scientific way, to tackle forecasting.”

Tetlock is famous for his 2005 statement that the average expert is “roughly as accurate as a dart-throwing chimpanzee” when it comes to forecasting. That conclusion came after a 20-year forecasting competition that ended in 2004. In his book, “Expert Political Judgment,” Tetlock analyzed more than 80,000 predictions made by nearly 300 experts in the fields of political science, economics and other advanced disciplines. The results didn’t justify the title “experts” when it came to forecasting the future.

But “Superforecasting” is Tetlock’s most exhaustive effort yet on the art and science of prediction. The book explores results from the Good Judgment Project, a rigorous forecasting competition that matched thousands of amateurs against some of the world’s most prominent prognosticators and prediction algorithms. It turns out the amateurs delivered far more confident, accurate forecasts. Even more surprising, some members of the top 2% of participants did not hold the professions or statistical background one might expect. Among the scientists, mathematics professors, and financial engineers were a housewife, a welfare case worker, an opera director and a local pharmacist. These individuals might not be members of Mensa, but their attitude and approach helped rank them among an elite crop of participants now officially known as “Superforecasters.”

The authors explain that not only is forecasting an inherent skill, but it also is one that can be improved through a rigid discipline of practice, practice and more practice.

“Basketball players get better at free throws by [practicing] free throws and forecasters get better at forecasting by making forecasts,” says Gardner. 

“But there’s a huge caveat here: Merely making forecasts isn’t enough. Like a good basketball player practicing free throws, you have to get prompt, clear feedback, pay close attention to it, think hard about what you’re doing and adjust it in light of the feedback. 

“The problem for forecasters is that the feedback they usually get is slow in coming and so vague it can be read any number of ways. To get the clear feedback that’s essential to fruitful practice, you need to develop something like the methodology Phil Tetlock created, and we describe in the book.”

Funded by the Intelligence Advanced Research Projects Activity (IARPA), part of the U.S. intelligence community, the Good Judgment Project was launched by Tetlock and his wife, Barbara Mellers, in 2011. It recruited more than 2,800 volunteers, experts and amateurs alike with an interest in current events and geopolitics, to set probabilities for future events that had a specific time horizon.

Gardner argues that forecasting—despite its flawed role in today’s media—is one of the world’s most important disciplines, one that has tremendous ramifications for global politics, economics and social structures.

“All our decisions are based on forecasts, so forecasts matter. They can be the difference between wealth and poverty, even life and death,” says Gardner. “Yet, far too many of us are remarkably unserious about forecasts and forecasting. Even some of the biggest, most sophisticated organizations in the world—organizations that spend millions, even billions, on forecasting—do not scientifically test the accuracy of the forecasting, so they don’t really know how good their forecasting is or if it could be better.”

As Gardner notes, Tetlock’s project is the largest exercise of its kind ever done. Although the Good Judgment Project is still young, the research has been nothing short of groundbreaking.

The Good Judgment Project asked 2,800 volunteers to use publicly available data—news streams, research reports and more—to calculate probabilities on the likelihood of future events. Questions were quite diverse and centered on topics including future gold prices, expectations for OPEC’s global production output and the potential for political agreements among nations. 

It wasn’t long before the top 2% began to separate from the field, and their performance continued to improve. In the ensuing projects, these superforecasters beat the average participant by 65% and crushed the predictions of four algorithmic platforms by 35% to 60%. The superforecasters also outperformed government professionals who have security clearances and access to classified intelligence briefings on the topics.

Traits of superforecasters

Walter Hatch, a portfolio manager at McAlinden Research Partners, joined the competition as a volunteer forecaster in 2011. Having read Tetlock’s “Expert Political Judgment,” he jumped at the chance to take part in the rigorous project. 

Despite his mixed performance in the first year, he buckled down and increased his participation level. A CFA with a Ph.D. in political science, Hatch followed the project’s methodology and became a superforecaster by his second year. As Hatch notes, not everyone has a Wall Street or statistics background, nor does that matter.

“Superforecasters are from all walks of life,” says Hatch, who is now also the senior vice president of the Good Judgment Project. “Having expertise [in a field] does not mean you will have better forecasting accuracy.”

What makes superforecasters successful is how they approach and analyze predictions. 

“What you think is much less important than how you think,” Tetlock said in a September interview with The Wall Street Journal. The elite forecasters consider their predictions as “hypotheses to be tested, not treasures to be guarded.” That means forecasts are subject to change, quickly and often, following rigorous collection and analysis of accessible data (see “5 traits,” below).

Superforecasters begin by gathering as much information as possible. This includes an emphasis on identifying data that could affect and disprove the likelihood of their forecasts.

Next, they nurture and develop the habit of thinking in terms of probabilities when exploring the likelihood of specific events. But these probabilities are not back-of-the-envelope projections like “50-50” or “75-25.” As they assess data and gather more information, they make incremental changes that might produce a probability of “54 to 46” or “73 to 27.” That critical lesson helped separate the better forecasters from the field.

“Some people had a little bit of training, and some did not,” Hatch says. “The results were dramatic when individuals learned what a probability is, how it works and how to be comfortable with it. You could see the divergence a month later. A little training was an important part of improving accuracy for everyone, including the superforecasters.”

Third, forecasting improves when individuals work in teams. Hatch says that a variety of perspectives adds value in determining a range of probabilities and provides unique insights that one person might not have considered given their background or approach to questions.
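One simple way to see why a variety of perspectives helps is to pool a team’s probabilities into a single number. The sketch below assumes a plain, unweighted average, which is an illustrative choice and not necessarily how the project combined its forecasts. The function name and the four team probabilities are hypothetical.

```python
from statistics import mean


def pooled_forecast(probabilities):
    """Combine individual probabilities with a plain, unweighted average."""
    if not probabilities:
        raise ValueError("need at least one forecast")
    return mean(probabilities)


# Hypothetical team: four members' probabilities that an event will occur.
team = [0.55, 0.70, 0.85, 0.60]
outcome = 1  # suppose the event did occur

pooled = pooled_forecast(team)

# Squared error of the pooled forecast vs. the members' average squared error.
pooled_error = (pooled - outcome) ** 2
average_member_error = mean((p - outcome) ** 2 for p in team)

print(f"pooled forecast: {pooled:.3f}")
print(f"pooled error: {pooled_error:.4f}")
print(f"average member error: {average_member_error:.4f}")
```

Under squared-error scoring, the pooled forecast can never score worse than the members do on average, which is one reason disagreement within a team is useful rather than a nuisance.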

Fourth, superforecasters regularly keep score of their projections. After they reach a conclusion and the target date passes, they spend a significant amount of time determining why forecasts were correct or incorrect, and they explore the ranges of their probabilities to determine what might have been a better projection (e.g., 71-29 or 79-21).
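The article does not name the scoring rule involved, but one common way to keep score of probability forecasts is the Brier score: the average squared gap between the probability a forecaster gave and what actually happened (1 if the event occurred, 0 if it did not). The sketch below uses the single-outcome form, which runs from 0 (perfect) to 1 (confidently wrong every time). The function name and the sample record are hypothetical, not data from the project.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is perfect, and always saying 0.5 scores 0.25.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally sized, non-empty sequences")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical record: three questions and what actually happened.
outcomes = [1, 0, 1]
coarse = [0.5, 0.5, 0.5]      # "50-50" on every question
refined = [0.71, 0.21, 0.79]  # finer-grained probabilities

print(f"coarse:  {brier_score(coarse, outcomes):.4f}")
print(f"refined: {brier_score(refined, outcomes):.4f}")
```

Because the score is a single number per forecaster, it gives the prompt, unambiguous feedback that vague predictions never can.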

Finally, and perhaps most important, the most successful forecasters are willing to admit error and quickly change course on their projections. “If something changes, you want to update your beliefs accordingly,” says Hatch. “And the best way to internalize that process is by getting direct feedback about the forecast itself. So some people might get wedded to a particular idea. And they can stay wedded to that idea. Because oftentimes there’s no incentive or reason for them to necessarily abandon that idea.” Given the competitive nature of the project, good forecasters quickly abandoned inaccurate sentiments and altered their outlooks accordingly.

That last lesson is critical to the development of a growth mindset that encourages participants to improve their forecasting skillset.

“Two aspects of the superforecaster mindset are fundamental: Intellectual humility and a desire to improve,” Gardner says. “Here, ‘humility’ does not mean self-doubt or thinking you’re unworthy. It means recognizing that reality is immense and complex, seeing even a little of it accurately is a struggle and foreseeing what is to come is even harder, even when it is possible at all, which it often is not. Intellectual humility could, conceivably, lead one to simply give up. That’s why the desire to improve (the ‘growth mindset,’ to use the term coined by psychologist Carol Dweck) is so important. It’s a belief that with effort comes improvement. Combine intellectual humility with a desire to improve and you get someone who is cautious, careful, willing to acknowledge uncertainty and admit errors, but also who has the drive to keep working and learning from feedback and get better. That’s a potent combination.”
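The habit of updating beliefs when something changes can be made concrete with Bayes’ rule, a standard way to revise a probability in light of new evidence. The article does not say superforecasters apply this formula explicitly, so treat the sketch below as an illustration only. The function name, the 54% starting point and the evidence likelihoods are all hypothetical.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return the revised probability of an event after one piece of evidence.

    prior: probability of the event before the evidence arrived.
    p_evidence_if_true: chance of seeing this evidence if the event will happen.
    p_evidence_if_false: chance of seeing it if the event will not happen.
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    if denominator == 0:
        raise ValueError("evidence is impossible under both scenarios")
    return numerator / denominator


# Hypothetical: start at 54%, then see news that is somewhat more likely
# to appear if the event is going to happen than if it is not.
prior = 0.54
posterior = update(prior, p_evidence_if_true=0.6, p_evidence_if_false=0.4)

print(f"before: {prior:.2f}, after: {posterior:.2f}")
```

Evidence that is equally likely either way leaves the forecast where it was, while evidence that favors one outcome moves it by an amount proportional to how strongly it does so.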

Lessons for the media

Tetlock is optimistic about the future of predictive science. In a recent Wall Street Journal editorial, he predicted that the art and science of forecasting will see a major revolution in the 21st century, comparing it to how medicine advanced “from leeches to antibiotics.” 

But that leap faces challenges in public perception. As The Economist notes in a review of “Superforecasting,” today’s world of prediction is “based on eminence rather than evidence.” 

The bulk of superforecasters have little media experience. It’s unlikely investors will see a housewife or out-of-work engineer appear on CNBC to offer a geopolitical outlook, no matter how accurate they have been in the past. Individuals who have been famously wrong still appear regularly on networks and in print. Any shift toward lesser-known forecasters and greater accountability runs up against the media’s desire to produce forecast clickbait or to fill 30 seconds of air time with hyperbolic headlines. Gardner argues that outrageous predictions are popular because the forecaster sounds certain. He also notes that audiences have little appetite for listening to a forecaster explain probabilities.

“There is the traditional negativity bias. If it bleeds, it leads,” Gardner says. “If someone says there is a 75% probability that something will happen, that’s not psychologically satisfying. There is a profound aversion to uncertainty. If you can find an answer that eliminates uncertainty, they will steer toward the blowhards who are certain they know what will happen.”

As the authors admit, forecasts also provide an element of entertainment. They cite Jim Cramer and John McLaughlin’s respective stock and political outlooks. They also cite the use of forecasts by activists to urge action to prevent a statistically unlikely event. And some partisans use forecasts to reassure audiences that the viewers’ preexisting biases are justified. Gardner argues the media should do more to establish credibility and revisit such projections in order to incentivize individuals to improve their forecasting (see “Separating signal noise,” below).

“Explicitly or implicitly, the media tell people that the forecasters they interview—and help make famous and influential—are accurate. To do that without having good evidence of forecasting accuracy is shoddy reporting,” Gardner says. “To compound that mistake by not reporting whether the forecasts they reported turned out to be accurate or not, or worse, by going back to the same experts after their forecasts fail and not mentioning the failures, they engage in professional malpractice. Unfortunately, it happens routinely.”

Putting reputation on the line

The Good Judgment Project is now an even larger, more dynamic competition with more than 5,000 forecasters offering insight on topics like the Syrian Civil War, the stability of the Eurozone and the Federal Reserve’s monetary policy. But Tetlock and members of the project don’t appear to be finished just yet. A 2014 paper by Tetlock, Mellers, Nick Rohrbaugh and Eva Chen calls for an increase in prediction tournaments to improve accountability, credibility and critical thinking among different schools of thought. 

“Tournaments can nudge players in polarized debates toward the right epistemic direction,” Tetlock writes. “Imagine a world in which the comment that ‘someone has good psychological intuitions’ is more than a vague hunch but is instead based on objective performance in reputational bets against prominent psychologists in major controversies.”

Modern Trader readers can join the Good Judgment competition and find out if they can become the next superforecaster. Visit to take part in the competition’s next round. 

About the Author

Garrett Baldwin is the Managing Editor of the Alpha Pages and the Features Editor of Modern Trader. An author and Baltimore native, he earned a BS in journalism from the Medill School at Northwestern University, an MA in Economic Policy (Security Studies) from The Johns Hopkins University and an MS in Agricultural Economics from Purdue University.