*“Hegel was right when he said that we learn from history that man can never learn anything from history”* – George Bernard Shaw

In 1977, the psychology researchers Daniel Kahneman and Amos Tversky produced an influential paper entitled *Intuitive Prediction: Biases and Corrective Procedures*. In the paper, Kahneman and Tversky described systematic biases that impacted the accuracy of expert forecasts. They also presented some “corrective procedures” that could be employed by forecasters to counter these biases and improve their forecasting accuracy.

Their paper contained a central idea: Forecasters insufficiently consider historical data when attempting to calibrate their predictions. Specifically, the paper identified two different types of data that we use when attempting to formulate a prediction. The first class, known as singular data, is the information unique to the specific case at hand. A second class, known as distributional data (or base-rate information), is the accumulated history of data regarding similar cases.

Let’s look at an example from the world of medicine. A doctor needs to forecast the 2-year survival probability of a patient with a terminal illness. Singular information would include the patient’s age, their medical history and the results of diagnostic tests. Distributional data would include the history of outcomes of similar cases, also known as a reference class. Alternatively, consider a product manager from a footwear company who must decide whether his division should start producing a new basketball sneaker. Singular data would include a description of the product and the feedback provided by focus groups. Distributional data would include the historical success rate of similar products.

Unfortunately, when making forecasts, people tend to focus too much on singular data, either underweighting or ignoring distributional data. We tend to place too much importance on the “case at hand”, feeling that it is unique and that it provides sufficient data to make an accurate prediction. Let’s take another look at our product manager, attempting to decide whether that new sneaker should be offered by his division.

After looking at demos of the product and reviewing market research performed on prospective customers, our product manager is very impressed. In his opinion, this sneaker is better than 9 out of 10 that he has reviewed. He estimates that it will sell at least 100,000 units. Kahneman and Tversky would label this behavior “prediction by impression”. That is, the prediction reflects the impression that the product made on our product manager. This decision-making strategy represents an over-reliance on singular information.

Let’s look at three possible scenarios related to our new product introduction and see the prediction strategy that best matches each situation:

Perfection – Imagine that the product manager has a fantastic “crystal ball”. His track record has been 100% in determining the exact sales volumes of new sneakers. In this highly improbable instance, it would make sense to simply rely on singular information related to the new product to make a forecast (Note: This is the equivalent of the “prediction by impression” method). Unfortunately, virtually no forecasting domain exhibits this level of predictability.

Throwing Darts – The product manager has no ability to actually forecast results. His historical track record is no better than chance. He cannot beat the proverbial “dart throwing monkey”. In this instance, it would make sense to rely exclusively on distributional data. The product manager should examine a “reference class” of similar products, and utilize their historical sales data. Many forecasters are loath to adopt this model, as it minimizes their role in the process. However, there are many documented instances where expert forecasters, in fact, fare no better than chance when making predictions.

Blended – The product manager has a track record that is better than chance, but less than perfect. In this case, he should use a blended model that combines his own forecast with the historical average for the reference class. If the manager has been close to perfection, the prediction should be closer to his impression of the new item. If the manager has been just better than a “dart thrower”, the prediction should be closer to the reference class average.

As Kahneman and Tversky described in their paper, forecasters tend to take an overconfident approach to predictions. Virtually no one admits to being a “dart thrower” with no actual forecasting expertise. Many experts will simply rely on the singular data presented by the case they are evaluating. Even the more conservative experts, using the blended approach, will typically rely too heavily on case data.

In order to combat faulty intuitive predictions, Kahneman and Tversky recommended a five-step procedure. Let’s walk through the procedure, using our new product introduction problem as an example:

**1) Establish Reference Class** – The first step in the process is identifying a class of items that is similar to your current prediction challenge to provide historical guidelines. In our case we are trying to predict the sales of a new basketball sneaker. A reference class with attributes that are closest to the case at hand should provide the most accurate predictive guidance. Examples of potential reference classes, with varying levels of similarity, comprehensiveness and history, would include the following:

- All shoes introduced by the company over the last five years
- Basketball sneakers introduced by the company and its competitors last year
- High end basketball sneakers (similar in price to the contemplated new model) produced by the company over the last three years

For argument’s sake, let’s assume our product manager chooses the third example, feeling it is the closest representation of our case and contains adequate history. For that reference class, the average sneaker has sold 40,000 units.

**2) Assess Distribution of Reference Class** – Now that the reference class has been selected, it’s important to identify how data is distributed throughout the class. Measures such as variance or standard deviation will show whether typical outcomes are closely clustered around the mean or broadly distributed across the range. In their paper, Kahneman and Tversky didn’t clearly explain why this step was important. I believe it is a useful piece of data to determine our confidence level with the “corrected estimate” that we’ll identify in Step 5.
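To make Steps 1 and 2 concrete, here is a minimal Python sketch that summarizes a reference class. The sales figures are invented for illustration (chosen so that they average 40,000 units, matching our example), and the variable names are my own, not anything from the paper.

```python
from statistics import mean, stdev

# Hypothetical unit sales for high-end basketball sneakers released over
# the last three years. These numbers are made up for illustration only.
reference_class_sales = [22_000, 31_000, 36_000, 40_000, 44_000, 49_000, 58_000]

# Step 1: the reference class average.
reference_mean = mean(reference_class_sales)

# Step 2: how widely outcomes are spread around that average
# (sample standard deviation).
reference_stdev = stdev(reference_class_sales)

print(f"Reference class average: {reference_mean:,.0f} units")
print(f"Standard deviation:      {reference_stdev:,.0f} units")
```

A small standard deviation relative to the mean would suggest that most sneakers in the class sell close to the average, while a large one would suggest that the average alone tells us less about any single product.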

**3) Intuitive estimation** – In this step our forecaster attempts to make an initial estimate utilizing their intuition, combined with data that is specific to the case at hand. In our scenario, the product manager has seen samples of the new sneaker and has reviewed feedback from focus groups. He would also have a sense of the current market for new, high-end sneakers. Based on this information, the product manager estimated sales of 100,000 units.

**4) Assessment of predictability** – Here’s where things start to get kind of squishy. In this step, the forecaster attempts to estimate the accuracy of their prognosticating capabilities in this domain. Kahneman and Tversky offered two different methods for computing this value. For simplicity’s sake, I’ll demonstrate one method. To complete this step, our product manager contemplates the following problem: “If you were to consider two different prospective sneaker offerings, how often could you correctly predict which shoe would be the better seller?” That percentage is then expressed as a proportion. A value of .5 would represent a “dart thrower”, since guessing at random picks the better seller half the time. A value of 1 would represent perfect forecasting ability. For our example, let’s say our product manager estimates his ability at .75, meaning he believes he could pick the better selling sneaker three out of four times.
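The procedure asks for a subjective answer to this question. If the manager happens to have records of past forecasts, one possible sanity check (my own suggestion, not part of the procedure described above) is to count how often those forecasts ranked pairs of products in the right order. The function name `pairwise_accuracy` and the data below are hypothetical.

```python
from itertools import combinations

def pairwise_accuracy(predicted, actual):
    """Fraction of product pairs whose predicted ranking matched the actual one."""
    correct = 0
    total = 0
    for i, j in combinations(range(len(predicted)), 2):
        predicted_diff = predicted[i] - predicted[j]
        actual_diff = actual[i] - actual[j]
        if predicted_diff == 0 or actual_diff == 0:
            continue  # skip ties: there is no "better seller" to pick
        total += 1
        if (predicted_diff > 0) == (actual_diff > 0):
            correct += 1
    if total == 0:
        raise ValueError("need at least one untied pair of products")
    return correct / total

# Hypothetical past forecasts and actual results (units sold).
# These numbers are invented for illustration only.
past_forecasts = [90_000, 60_000, 30_000, 75_000, 50_000]
past_actuals = [50_000, 55_000, 35_000, 30_000, 45_000]

accuracy = pairwise_accuracy(past_forecasts, past_actuals)
print(f"Picked the better seller in {accuracy:.0%} of pairs")
```

With only a handful of past products this figure will be noisy, so it is probably best treated as a reality check on the self-assessment rather than a replacement for it.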

**5) Correction of intuitive estimate** – Here’s where we put it all together. Kahneman and Tversky developed a formula that uses the information from steps 1-4 to compute a corrected estimate:

First, using the assessment of predictability from Step 4, compute a correlation coefficient:

(2 * Assessment of Predictability) – 1 = Correlation Coefficient

For our example, the calculation would be: (2 * .75) – 1 = .5

Next, calculate the corrected estimate as follows:

Reference Class Average + Correlation Coefficient * (Intuitive Estimate – Reference Class Average) = Corrected Estimate

For our example: 40,000 + .5 * (100,000 – 40,000) = 70,000
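The two formulas above can be sketched in a few lines of Python. The function and parameter names are my own labels for the quantities in the text, and the input check is an added assumption rather than something the paper specifies.

```python
def correlation_from_predictability(predictability):
    """Convert the Step 4 proportion (share of pairs ranked correctly)
    into a correlation coefficient: .5 maps to 0 and 1 maps to 1."""
    if not 0.0 <= predictability <= 1.0:
        raise ValueError("predictability must be a proportion between 0 and 1")
    return 2 * predictability - 1

def corrected_estimate(reference_average, intuitive_estimate, predictability):
    """Move from the reference class average toward the intuitive estimate,
    in proportion to the forecaster's correlation coefficient."""
    correlation = correlation_from_predictability(predictability)
    return reference_average + correlation * (intuitive_estimate - reference_average)

# The example from the text: 40,000 average, 100,000 intuitive estimate, .75 predictability.
estimate = corrected_estimate(40_000, 100_000, 0.75)
print(f"Corrected estimate: {estimate:,.0f} units")  # Corrected estimate: 70,000 units
```

The same function reproduces the three scenarios described earlier. A predictability of .5 (the “dart thrower”) returns the reference class average of 40,000, a predictability of 1 (the “crystal ball”) returns the intuitive estimate of 100,000, and anything in between returns a blend of the two.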

**Adding in the variability from the reference class** – As discussed, although Kahneman and Tversky didn’t include it in their formula, their procedure includes a step that looks at the distribution of the reference class. We could apply some basic concepts from statistics to use this data to create a range for our estimate. We could arbitrarily pick a confidence level that is appropriate for our problem. For example, we could observe that 90% of sneakers in the reference class had sales that were within 20% of the mean. We could then apply this to our corrected estimate to build a range of values:

70,000 * .2 = 14,000

Therefore, we could enhance our estimate by saying we expect it to range between 56,000 and 84,000 (i.e. 70,000 + or – 14,000).
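As a rough sketch, the same range calculation in Python might look like the following. The function name `estimate_range` is hypothetical, and the 20% spread is simply the figure assumed in the example above, not a value taken from real sales data.

```python
def estimate_range(point_estimate, relative_spread):
    """Return (low, high) bounds at plus or minus relative_spread around the estimate."""
    if relative_spread < 0:
        raise ValueError("relative_spread must be non-negative")
    margin = point_estimate * relative_spread
    return point_estimate - margin, point_estimate + margin

# Corrected estimate of 70,000 units with the assumed 20% spread.
low, high = estimate_range(70_000, 0.20)
print(f"Expected sales: {low:,.0f} to {high:,.0f} units")  # Expected sales: 56,000 to 84,000 units
```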

Here is a graphic showing the steps from our example:

**Some Additional Thoughts/Caveats** – While Kahneman and Tversky based their paper on scientific research, their 5-step process, containing several estimates and guesses, should not be mistaken for a rigorous statistical analysis. My use of the term “confidence level”, for estimating variability, should not be confused with the statistical concept of a confidence interval. However, on the whole, their process for “corrected estimates” should offer improvements in accuracy for intuitive predictions. It helps by countering our profound bias towards single-case forecasting, providing a moderated prediction.

As an alternative to “corrected estimates”, firms could also consider statistical prediction rules (SPR). This algorithmic approach to prediction standardizes the process and also does a good job of removing the bias of the expert.