HealthCare.gov – technology, complexity and politics

The headlines and political repercussions from the HealthCare.gov debacle continue unabated. The focus is unsurprising, given the polarizing debate that preceded the passage of the Affordable Care Act (ACA). As this is a technology management blog, I’ll refrain from adding to the angry, partisan debate. Instead I’d like to focus on the lessons IT leaders and project managers can glean from this unfortunate event. Let’s start with some historical background.

The ACA was signed into law by the president on March 23, 2010. It was a dramatic piece of legislation, long envisioned by Democrats and critical to Obama’s legacy. It represented the most significant overhaul of the nation’s healthcare system since the implementation of Medicare and Medicaid in 1965. But unlike Medicare and Medicaid, the ACA was enacted in the Internet era, with high expectations of web-based enrollment capabilities. Spinning up a massive agency with paper-based sign-up processes would be expensive and a logistical nightmare.

A centerpiece of the legislation was the concept of state-based insurance marketplaces, or exchanges. These websites would give citizens the ability to comparison shop different policies and to apply for coverage. However, only 14 states (and the District of Columbia) ran their own exchange websites for the first open enrollment period. The remainder defaulted to the federally managed exchange or engaged in a state-federal partnership. This left the federal government with the responsibility of creating and operating an exchange on behalf of 36 states.

In order for the ACA to be effective, Americans would have to be registered, with active policies, by January 1, 2014. To accommodate this deadline, the Obama administration determined that the federal marketplace, aka HealthCare.gov, would open for enrollment on October 1st. Furthermore, this would be a “big bang” event, without any phasing in of enrollment populations. There was also a requirement that individuals first sign up before using the site to evaluate and compare policies.

The project to create the website was massive in scope. While run by the Centers for Medicare and Medicaid Services (CMS), it involved the coordination of numerous federal and state agencies. It also involved 47 different contracting firms. It is fair to say that it was the most ambitious public-facing, transactional website ever created by the federal government.

Even prior to the site’s high-profile startup failures in October, it was apparent that potential trouble was looming. Several months earlier, the project manager, Henry Chao, sent emails to other officials at CMS warning that the site could “crash the plane upon takeoff”. He described serious issues with software, contractors and staffing shortages. The emails, made public as part of congressional hearings, should have served notice to rational individuals that a successful October launch was highly unlikely. In mid-September, with the launch date only two weeks out, the system failed a test that simulated a mere 500 concurrent users.

As they say, “the rest is history”. The launch was an abysmal failure, with virtually no usable services. The sign-up portion of the site was overwhelmed with performance issues, with only 6 people able to successfully establish coverage on the first day! The response was angry and dramatic. Even proponents of the new law were publicly critical of the Obama administration and its handling of this project. The president’s approval ratings are down, and his legacy is potentially threatened if this “ship is not righted”.

As of this blog post, the administration has moved into crisis management mode, spending the last two months remediating the numerous issues plaguing the website. They added “SWAT” team resources from private industry with a track record of building high-volume websites. All of this, to get the site in working order by a promised date of the end of November. This morning, the administration is boldly claiming that the site is now fixed. I’ll reserve judgment until we get through the enrollment period. Only then will it be apparent whether the dramatic repair efforts were sufficient to meet the original design requirements of this system.

So how did we get here? How did such an important initiative get so radically off-track, leading to a disastrous “opening night performance”? In hindsight, there were several contributing factors that made failure a predictable outcome:

Complexity – This was an enormously complex project, involving the coordination of numerous organizations and resources. Complexity has a tendency to increase geometrically, quickly leading to challenges in forecasting, coordination, status assessment, and ultimately to delivery. That’s one reason behind the growth in iterative delivery models. Progressive organizations understand that one way to address complexity is to limit the amount of functionality that is delivered with a given wave of development. Unfortunately, the politics surrounding HealthCare.gov provided back pressure against any adjustments in features or delivery dates.

Novelty – The ability to accurately plan and deliver a complex project is hampered by the novelty of that effort. That is, projects with no similar historical parallel lack reference points for accurate project forecasting. Furthermore, human nature often leads to a gross underestimation of completion times. In fact, this phenomenon has a name – the planning fallacy. In the case of HealthCare.gov, the project had no reasonably similar forerunners. Therefore, a more conservative stance was warranted regarding estimations of timelines.

Politics – The contentious nature of the ACA meant that the stakes would be high regarding the success of the website. The Obama administration desperately needed the site to be available on-time and to perform as advertised. There was little room for compromise on requirements or deadlines. This led to two well-known phenomena:

First, there is the problem of the “slain messenger”. When people are told that a “damn the torpedoes” directive is in play, they will ultimately stop raising issues about an initiative. No one wants to be labeled a complainer or a poor team player. An atmosphere that discourages constructive feedback will ultimately result in projects that move forward without adequate risk assessment. It is typically the technicians – the folks at “ground level” – who have the most accurate understanding of a project’s issues. If those people are muffled, management will cluelessly believe that their high priority initiative is on track. Like the Emperor who was assured that he had new clothes, management will be forced into reality when their project walks “naked” through the streets.

Secondly, the leaders of the project, ultimately including the president, fell victim to plan continuation bias. This well documented phenomenon occurs when people refuse to cancel an effort or change direction, despite obvious issues. Human nature dictates that people will be influenced by inertia when dealing with key initiatives. They will continually look for evidence confirming their original plans and will discount information critical to their efforts.

A common idea used to frame the priorities of a project is “good, fast, cheap; pick any two”. This saying highlights the notion that you can’t “have your cake and eat it”. That is, all projects involve tradeoffs. A project of very high quality will require a greater investment in time and resources. A project done quickly may not have the same level of quality. Unfortunately, there are additional limitations to this “rule”. Some projects are so complex, or have such tight deadlines, that no amount of additional resources can make them successful. In the case of HealthCare.gov, the government set constraints on two of the dimensions. They had unyielding expectations on fast (an October 1st deadline) and good (specific features, capacity and service levels). Trying to add resources at the 11th hour was wholly insufficient to compensate for the two firmly fixed dimensions.

What lessons can your organization learn from HealthCare.gov? Are you unwittingly setting the stage for your own similar failures? When you are planning important initiatives, keep the following concepts in mind:

  • Avoid setting inflexible deadlines and functional requirements. While there are times (e.g. complying with a new law or regulation) when this is necessary, it is often done for political reasons.
  • Utilize an iterative approach to project delivery with functionality and user populations implemented in a progressive fashion.
  • Create a culture that encourages constructive criticism from all project participants.
  • Establish a “pre-mortem” process (in the planning stage) where project participants brainstorm possible scenarios that could derail an initiative.
  • Utilize historical examples as part of the project planning process. Those efforts with a limited historical base should be planned in a more conservative fashion.
  • Be mindful of plan continuation bias. It is better to make course corrections than to plow into an iceberg.

 

Posted in Complexity, General Management, Organization, Project Management

Accidents, Errors and Swiss Cheese

On Saturday, Asiana Airlines flight 214, a Boeing 777, crashed upon landing at San Francisco Airport. Through a combination of luck and robust aircraft design, only two people were killed, a surprising outcome given the photos of the wreckage. Unfortunately, many more were injured, a number of them in serious or critical condition. Intense speculation surrounds the cause of the crash, with theories ranging from pilot error to inoperative runway guidance systems to mechanical failure. What is known so far is that the plane made an abnormal final approach, striking its tail against a sea wall that precedes the runway. What’s unclear is why.

The NTSB, the government body in charge of the investigation, has numerous experts on scene. Their process is methodical, sometimes taking years to establish a series of root causes and recommendations following a plane crash. In this case, the NTSB has a number of factors in its favor. The remaining pieces of the plane are immediately observable, unlike crashes that happen at sea. The black boxes have already been recovered. The crash happened in a populated area, with numerous eyewitnesses as well as video. The crew survived the crash and will be able to provide important information about the final fateful minutes of the flight.

When accidents like Asiana 214 happen, people hunger for immediate and simplistic answers. Our need to feel secure fuels a desire for straightforward explanations that bring closure to mysteries. Our culture of real-time social media rapidly turns people into aviation experts, investigative sleuths and conspiracy theorists. Unfortunately, armchair speculation is rarely accurate, and ultimately the contributing factors of a crash can be diverse and complex.

The intense media focus creates a false sense of insecurity, leading many to feel that commercial air travel is risky. In truth, aviation safety efforts have produced remarkable results — commercial air travel is an incredibly safe mode of transportation. Prior to Saturday, the last fatal plane crash on US soil — of a commercial jet of at least regional size — was back in 2006. The previous fatal domestic crash involving a larger, “transatlantic” class jet — think 737s and larger — was in 2001. During that timeframe there have been tens of millions of safe, uneventful flights. Much of this is due to the disciplined, evidence-based approach to safety of modern aviation.

One of the foundational safety principles of modern aviation is known as the Swiss cheese model. It is a concept that was first described by the cognitive psychologist and researcher James Reason. In his seminal book Human Error, Reason chronicled a number of famous disasters, including Three Mile Island and the Challenger space shuttle accident. But instead of merely reviewing the underlying factors of each incident, he proposed an integrated theory of accident causation. Reason had several profound insights:

  • Accidents involving complex systems are often the result of the confluence of multiple contributing factors.
  • Contributing factors can occur in a wide range of domains from unsafe acts — such as a pilot approaching a runway at an improper altitude — to organizational errors — such as a culture of fiscal austerity that does not prioritize training activities.
  • As opposed to the active errors that occur at the time of an incident, many contributing factors are in fact latent errors. These latent errors lie dormant, waiting for an active error to combine with them and trigger an incident.
  • Human beings, lacking unlimited concentration, focus and memory, will always be prone to operational errors. Properly designed systems account for this limitation, expect a level of human error, and ultimately keep these errors from resulting in an actual incident.

Reason summarized his integrated theory of accident causation with an excellent visual known as the Swiss cheese model.

[Figure: Reason's Swiss cheese model of accident causation]

Let’s consider the model in the context of an investigation into a crash landing such as Asiana flight 214. As a disclaimer, this example is not intended to represent a factual analysis of this tragic event. It is simply an example of how the Swiss cheese model can be used by investigators to gain a deeper perspective on the root cause of an accident. Instead of merely focusing on the immediate visible possibilities (e.g. pilot error), the Swiss cheese model forces investigators to look at the latent failures lurking deep within the organizational body.

In the case of a crash landing, examples of unsafe acts could include items such as:

  • Incomplete use of mandatory checklists
  • Insufficient intra-crew or crew-to-tower communications

Stepping back a level is a layer of failures known as preconditions for unsafe acts. A classic example in a plane crash would be fatigue, as when pilots on a long flight have had insufficient sleep. The next layer, unsafe supervision, could be represented by the following examples:

  • Insufficient training
  • Incorrect pairing of flight personnel — for example, two junior pilots

At the deepest background level are organizational influences. As an example, an organization that has a strong focus on growth may not be as invested in extensive training programs for new personnel. An airline undergoing margin pressures may be disinclined to make investments in state-of-the-art safety programs.

Reason’s profound contribution was the idea that an unsafe act was simply the hole in the final layer of cheese that allowed the ultimate accident. This unsafe act was unlikely to happen without a series of previous failures lying dormant in the background. Prior to Reason, the predominant focus of an accident investigation was on the operators (e.g. pilots, nuclear plant technicians) themselves. They were the people most directly connected to the events of the accident, and the simple answer was to fire, discipline or retrain the offending employees. Reason argued that doing so would not solve the deeper organizational issues that led to the problem.
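To make the image of “holes lining up” concrete, here is a minimal Monte Carlo sketch in Python. The layer names mirror the ones discussed above, but the failure probabilities are invented purely for illustration, and the layers are treated as independent simply to keep the arithmetic easy to follow.

```python
import random

# Hypothetical defensive layers and the chance that each has a "hole"
# (a latent or active failure) at the moment an error occurs.
# These probabilities are made up purely for illustration.
LAYERS = {
    "organizational influences": 0.10,
    "unsafe supervision": 0.05,
    "preconditions for unsafe acts": 0.08,
    "unsafe acts": 0.02,
}

def holes_line_up(layers):
    """An accident occurs only when every layer happens to have a hole."""
    return all(random.random() < p for p in layers.values())

def accident_rate(layers, trials=1_000_000):
    """Roughly estimate the accident rate by simulation."""
    return sum(holes_line_up(layers) for _ in range(trials)) / trials

if __name__ == "__main__":
    print(f"Accident rate with all layers in place: {accident_rate(LAYERS):.2e}")

    # Remove one background layer and the rate jumps sharply, even though
    # the front-line "unsafe acts" probability is unchanged.
    weakened = {k: v for k, v in LAYERS.items() if k != "unsafe supervision"}
    print(f"Accident rate without the supervision layer: {accident_rate(weakened):.2e}")
```

Even this crude sketch captures Reason’s central point: the accident rate is driven not only by the front-line error rate, but by how many intact layers stand behind the operator.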

Since the publication of Human Error, Reason’s Swiss cheese model has been adopted by a number of high-risk industries. Along with commercial aviation, it has become a key source of guidance in hospitals and nuclear power plants. Unfortunately, outside of these high-profile industries — where human life is on the line — the Swiss cheese model is relatively unknown. In my field of information technology, few professionals are aware of its existence. This leads to inadequate error investigation protocols that frequently focus on the operator.

Our inherent human nature has us seeking simple answers when tragedy strikes. In the case of commercial “accidents” — whether they are plane crashes or computer failures — it’s easiest to find causes, and direct blame, at the actual site of the incident. Progressive organizations will take the Swiss cheese model to heart and adopt a more holistic approach to accident prevention and investigation. They will recognize that culture, organization, and process design are all needed to provide adequate defensive layers for inevitable human errors.

Posted in Complexity, General Management, Organization, Risk

Bits vs. Atoms – The Driver of Many IT Trends

It’s almost the halfway point of 2013 and time for pundits to take stock of their New Year’s predictions. Sadly, you are unlikely to see such a reckoning, unless of course the pundit happened to be “dead on” with most of their guesses. As the great Yogi Berra once quipped, “Prediction is very hard, especially about the future”. I’ve written before about the difficulties of prediction and the poor track record of recognized experts. We live in an inordinately complex world. Numerous unforeseeable factors ultimately determine outcomes that weren’t seen in a crystal ball or on a Ouija board. In the information technology (IT) world, many pundits attempt to make very bold or specific predictions. This will be “the year” of public cloud adoption. Company X will get a 32% share of mobile operating systems. Desktop PC shipments will decline by x%.

I understand the allure of these predictions. Decision makers want to hear specific, near-term, actionable forecasts to guide their choices. But while these types of predictions make good headlines, they’re not particularly valuable for CIOs and other enterprise IT decision makers. A better way to think about the future is in terms of general direction, long-term trends and the “laws” underlying these ideas.

The futurist Nicholas Negroponte coined the phrase “bits vs. atoms” in his 1995 book, Being Digital. Negroponte made the profound claim that digital information — bits — had many qualities that made it vastly superior to its physical counterpart — atoms. His ideas presciently predicted the rise of digital music and on-demand video, and the decline of their physical analogues — CDs and DVDs. They are also at the heart of emerging technologies such as 3D printing.

Negroponte saw that bits were superior to atoms in a number of dimensions:

  • They have negligible replication costs, obviating the need for “inventory” and eliminating out-of-stock conditions.
  • They have negligible storage costs.
  • They move at the speed of light, allowing for instant, global access.

Additionally, bits ride the virtuous curve of “Moore’s law”. While the cost of physical items doesn’t dramatically change over short timeframes, the price of processing power, storage and bandwidth continues to shrink geometrically. Examples in the business world abound. The cost of printing and mailing paper statements continues to rise. The cost of online statements continues to drop, while their features and capabilities improve. The cost of business travel continues to grow while videoconferencing gets cheaper and more capable.
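As a toy illustration of what a geometric decline does over an investment cycle, the short Python sketch below compares a physical, per-item cost that creeps upward against a digital cost that falls by a fixed percentage each year. Every number in it is an invented round figure, not a forecast.

```python
# Toy comparison of an "atoms" cost versus a "bits" cost over time.
# All starting costs and rates below are made up purely for illustration.
atoms_cost = 1.00        # e.g. printing and mailing one statement
bits_cost = 0.50         # e.g. producing and storing one online statement
atoms_inflation = 0.03   # hypothetical 3% annual increase for the physical process
bits_decline = 0.25      # hypothetical 25% annual decrease riding Moore's law

for year in range(6):
    atoms = atoms_cost * (1 + atoms_inflation) ** year
    bits = bits_cost * (1 - bits_decline) ** year
    print(f"Year {year}: atoms ${atoms:.2f}  vs.  bits ${bits:.2f}")
```

The exact rates don’t matter; the point is that a cost riding a geometric decline pulls further away from a flat or rising physical cost with every additional year of the investment cycle.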

The “bits vs. atoms” idea can be used to consider the likely direction of a number of key trends in the IT world. I’ll now examine several trends through this lens. My intent here is not to replicate the hubris of pundits who try to make exact or provocative predictions. Instead, I’d like to think of each of these “megatrends” as a strong, reliable undercurrent that can inform technology strategy and decisions.

The Virtual Organization – There is likely to be a continued growth in virtual organizations that are geographically independent, span corporate boundaries and are transient in nature. While this concept is dependent on technology and influences the structure and challenges of IT organizations, it has applicability to the organization-at-large. Easily implemented, powerful collaborative toolsets, effective remote computing solutions, and declining bandwidth costs are all driving this trend. It will result in the continued growth of the following specific phenomena:

  • Geographically dispersed teams
  • Temporary teams, built from multiple departments, frequently aligned to project initiatives
  • Outsourcing
  • Telecommuting and a general reduction in assigned physical office space (sorry Marissa Mayer!)

Decentralization – There is likely to be continued growth in decentralized decision-making and management of technology. This trend is closely linked to both the consumerization of technology and cloud computing capabilities. The ability of departments and individuals to rapidly procure and provision low-cost, effective technology solutions continues to grow. These solutions range from bring-your-own-device (e.g. iPad) to personal productivity tools (e.g. Dropbox) to enterprise-class SaaS applications (e.g. Salesforce). Central organizations (Corporate IT, Audit) have rightfully noted the security and regulatory concerns surrounding these solutions. And there have been small historical pendulum swings back towards centralization in the past. But the compelling nature of these tools will ultimately keep the momentum in the direction of more decentralized capability. In a world dominated by “bits”, these tools grow in power, and more importantly, in accessibility to small organizations.

Virtualized technology – There will continue to be a strong trend towards virtualized technologies. The idea that platforms or solutions should be implemented in software, abstracted from their underlying hardware, has been around for 40 years. IBM popularized this idea with their VM (Virtual Machine) operating system. Since then, other virtualization platforms such as VMware and Xen have become extremely popular in the enterprise. Additionally, virtualization concepts have also become popular in the storage and networking worlds. The benefits are significant, and demonstrate the superiority of “bits”:

  • Rapid provisioning, modifying and deprovisioning of resources and features
  • Rapid two-way scaling for “right-sized” capacity
  • Ease of automation for maintenance and administration

In contrast, the “atoms” world, represented by hardware-based implementations, suffers from the following:

  • Slow provisioning that involves the ordering, delivery, installation and setup of physical devices
  • Locked-in capacity with minimal downsizing capability and cumbersome physical upgrade processes
  • The need for continued physical maintenance

The “Document-less” world – There will continue to be a decline in the production and use of physical documents. Yes, I know, we’ve heard unfulfilled predictions of a paperless world for decades. I believe this is more a case of when, rather than whether, it will happen. I intentionally entitled this trend document-less to take a broader view than the world of paper reports. It includes books, application forms, snail mail, tickets and any other paper-based documents. Within the organization, paper-based processes will continue to grow in expense (both direct and hidden) relative to digital processes. Of course, a complete disappearance of documents won’t happen overnight. But it is a very strong and compelling trend. Judged through the lens of “bits vs. atoms”, it should be accelerating. Tablets — now owned by approximately one-third of Americans — along with other digital devices, offer a compelling and constantly improving alternative to paper. This capability gap will continue to grow at an accelerating pace.

The continued security arms race – Enterprises will need to make continual investments in security technology to combat the constantly improving capabilities of the “bad guys”. Unfortunately, in the security space, the bad guys get to take advantage of the bits vs. atoms story as well. They can use tools to remotely attack from anywhere. They use improved processing power to crack previously safe encryption methods. They leverage the increased processing power of hijacked “zombie” hosts and increasingly available bandwidth to conduct denial of service attacks. They use collaboration tools to instantly share knowledge and code to enable hacking activities. Enterprises will need to constantly upgrade their defensive posture to stay a step ahead of this continuously growing threat capability.

Bits vs. atoms is a powerful notion with broad impact, likely to remain a force for many years. When making key IT decisions, it’s helpful to swim with — not against — the prevailing currents. Here are some useful checkpoint questions to ask as you consider your next decision:

  • Which option takes best advantage of the bits vs. atoms story?
  • Are there hidden costs related to atoms in any option?
  • What is the length of the investment cycle of the decision?
    • Given the growing advantage of the bits model, what is the advantage of each option over the lifetime of the investment?

Highly specific predictions are difficult, fraught with error and not useful to guide IT decision making. Using mega-trends — such as bits vs. atoms — as a general guidepost to inform your decisions is a more effective enterprise strategy.

 

Posted in General Management, History, Organization

Sticky Decisions – Anchoring and Adjusting

It’s the end of a long day of Saturday shopping. You’re at the big-box appliance store, looking to replace that old refrigerator. After several rounds of negotiating with your salesperson, you relent, agreeing to his latest offer. Heck – it seems like a good deal; it’s $200 off the original price!

This scene probably feels familiar to you. Often, when purchasing an item, we have a difficult time determining whether we have done an adequate job of negotiating. Marketing and sales professionals, with insights into human behavior, do a good job of “stacking the deck” against the buyer. One common technique is to exploit a flaw in human reasoning known as anchoring and adjusting.

Anchoring and adjusting is a heuristic, or rule-of-thumb, used for rapid, intuitive judgments. Unfortunately, it can lead to inaccurate judgments and estimations across a wide range of domains. In addition to our price negotiation example, the effect of anchoring and adjusting is seen in contract negotiation, project planning, budgeting and forecasting. It is a well-studied phenomenon that should be understood by any professional interested in improving their decision making and prediction skills. A particularly insidious aspect of anchoring and adjusting is that it can be influenced by subconscious exposure to irrelevant data. This enables unscrupulous actors to exploit unwitting counterparties.

The research team of Kahneman and Tversky first coined the term anchoring and adjusting in a seminal 1974 paper entitled Judgment under uncertainty: Heuristics and biases. They described a consistent two-stage process by which people estimated values. First, an individual establishes a reference point, based on an initial value. This starting point is known as the anchor. In the case of our refrigerator example above, it would be the list price of the item. Next, people will make one, or a series of, revisions (i.e. adjustments) to the anchor to arrive at a final estimation. In our example, this would be represented by the rounds of negotiation that involved changes to the list price of the item.

Kahneman and Tversky noted that the initial values that people were exposed to had a dramatic effect, influencing subsequent revisions. Their “adjustments” were typically too small, resulting in inaccurate final estimations. In one study, subjects were given five seconds to estimate the answer to a multiplication problem. One group was presented with the following problem:

1 x 2 x 3 x 4 x 5 x 6 x 7 x 8

Another group saw the same equation, but in reverse:

8 x 7 x 6 x 5 x 4 x 3 x 2 x 1

The median estimate for the first group was 512. The second group had a median estimate over four times as high at 2,250. The correct value was 40,320. Kahneman and Tversky concluded that the subjects were demonstrating the anchoring and adjustment heuristic. Given limited time, the subjects needed to perform some quick multiplication (i.e. establish an anchor value) and then extrapolate (i.e. adjust to) a final estimate. The folks presented with the first sequence were starting with lower values, and arrived at a lower anchor value. But both groups did insufficient adjusting; the correct value was almost 20 times the median estimate of the second group.

In a second study described in the paper, Kahneman and Tversky rigged a wheel of fortune to always land on either 10 or 65. Subjects were asked two questions:

  • Was the percentage of African countries in the United Nations higher or lower than the number from the wheel of fortune?
  • What is the percentage of African countries in the United Nations?

Subjects presented with the number 10 from the wheel had an average guess of 25%. Amazingly, subjects presented with the number 65 had an average guess of 45%. The only difference between the two groups was the “random” spin of a wheel, producing what should have been an irrelevant “cue”. However, as with the order of the math problem, the wheel spin acted as an anchor, serving to limit the effects of future adjustments.

Numerous follow-up studies have confirmed the pervasive effect of the anchoring and adjustment heuristic. In another experiment, similar to the wheel of fortune example, subjects were asked to estimate Mahatma Gandhi’s age at death. However, first they were asked if it was higher or lower than an anchoring value (9 or 150). Both of these anchor values were clearly irrelevant, yet influenced the subjects’ estimates. The first group averaged a guess of 50 while the second group averaged 67.

In both the wheel of fortune example, and the Gandhi study, researchers were demonstrating an element of the anchoring and adjusting process known as priming. The initial number, while completely irrelevant, acted as a primer, and as an anchor to the estimating process. In both these cases, the primer was presented directly to the subjects. Further research has shown that the primer can actually be presented subliminally, or outside of conscious perception.

In a 2005 study by Mussweiler et al. at a German university, students were asked to estimate the average price of a midsize car. However, prior to being asked for their estimate, they were told to stare at a computer screen. The screen appeared to contain a series of nonsense strings such as “$§?#ß#”. Periodically, subliminal priming information would replace the nonsense string, appearing for 33 milliseconds, below the level of conscious awareness. One group was presented with a subliminal “low” anchor number of 10,000 while the second group was presented with the number 30,000. The average estimate for the low group was 17,150 euros, while the high group estimated 21,219. A third group that received no priming provided a mean estimate of 18,312.

These are but a few of the studies that have consistently demonstrated the robust nature of the anchoring and adjustment heuristic. It is a reasoning flaw that is very hard to extinguish, even when people are explicitly aware of its dynamics. In the workplace it can have profound implications across a number of common functions:

Project Planning – A common practice amongst project planners is to create an initial estimate of duration and cost. This first “cut” is arrived at by adding together estimates for each phase or task of the project. These estimates typically represent best-case or normal timeframes for completion of the individual items. As a next step, the planner adds a “fudge factor”, or padding, to account for unexpected complications. However, the initial estimate acts as an anchor, and the fudge factor typically represents an insufficient adjustment to that anchor. This problem is so common that it has its own name, the planning fallacy. A method for countering this issue involves utilizing historical information for estimations. Instead of attempting to guess the completion time and costs for tasks/phases, a planner should use actual historical data from similar projects, as sketched below.
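As a rough sketch of what “use actual historical data” can look like in practice, the snippet below derives a task estimate from the recorded durations of similar past tasks rather than from a gut-feel guess plus a fudge factor. The durations are invented sample data, and the choice of the median and 90th percentile is just one reasonable convention.

```python
from statistics import median, quantiles

# Invented example: actual durations (in days) of similar past tasks,
# pulled from project history rather than estimated from scratch.
historical_durations = [12, 15, 14, 22, 18, 30, 16, 25, 19, 21]

# A reference-class estimate: the median is a realistic "typical" outcome,
# and a high percentile provides a planning buffer grounded in history
# instead of an arbitrary fudge factor.
typical = median(historical_durations)
buffer_estimate = quantiles(historical_durations, n=10)[-1]  # ~90th percentile

print(f"Typical duration (median): {typical} days")
print(f"Plan-to date (90th percentile): {buffer_estimate:.1f} days")
```

The same idea scales up from tasks to whole projects: the reference class, not the planner’s anchored first guess, supplies the starting number.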

Budgeting – A common budgeting practice involves taking the previous period’s budget and adjusting it to reflect the goals and requirements of the current cycle. Similar to our project plan example, this causes budgets to be anchored to the previous cycle’s number, with adjustments that are not reflective of current realities. A way to counter the effects of anchoring and adjusting in the budgeting process is to utilize a zero-based approach. Rather than starting with last cycle’s number, a budget proposal should be built from the ground up, with no reference to previous budgets.

Contract Negotiations – As we saw with our refrigerator example, anytime we are negotiating over the pricing or terms of a deal, anchoring and adjusting can affect the outcome. Whether the topic is cost or cancellation terms, the first figure that is aired will serve as a “stake in the ground”. All negotiations from that point forward will be anchored to the “stake”, typically with adjustments that don’t adequately reflect market realities. There are a couple of good practices that one can use to counter this problem:

  • Ignore list pricing! Use your own research (e.g. advisory services, historical deal pricing, competitor pricing) to establish market pricing.
  • Use an RFP or bidding process to force the provider to submit honest pricing
  • Be the first mover. Offer your own pricing or terms first, to create your own anchor.

As mentioned previously, anchoring and adjusting is an insidious reasoning flaw that is difficult to counteract. Here are two general ideas that can give you a fighting chance:

  • Awareness – Understand the mechanics and hallmarks of anchoring and adjusting so that you can spot situations where it is in play. Utilize a standard defense (e.g. obtaining market pricing data) to eliminate anchors from the picture.
  • Standardized Processes – Employ a disciplined, standardized approach to vulnerable processes — such as the ones highlighted above — that enables you to sidestep the effects of anchoring and adjusting.

 

Posted in Cognitive Bias, General Management, Project Management

Better Predictions Through History

“Hegel was right when he said that we learn from history that man can never learn anything from history” - George Bernard Shaw

In 1977, the psychology researchers Daniel Kahneman and Amos Tversky produced an influential paper entitled Intuitive Prediction: Biases and Corrective Procedures. In the paper, Kahneman and Tversky described systematic biases that impacted the accuracy of expert forecasts. They also presented some “corrective procedures” that could be employed by forecasters to counter these biases and improve their forecasting accuracy.

Their paper contained a central idea: Forecasters insufficiently consider historical data when attempting to calibrate their predictions. Specifically, the paper identified two different types of data that we use when attempting to formulate a prediction. The first class, known as singular data, is the information unique to the specific case at hand. A second class, known as distributional data (or base-rate information), is the accumulated history of data regarding similar cases.

Let’s look at an example from the world of medicine. A doctor needs to forecast the 2-year survival probability of a patient with a terminal illness. Singular information would include the patient’s age, their medical history and the results of diagnostic tests. Distributional data would include the history of outcomes of similar cases, also known as a reference class. Alternatively, consider a product manager from a footwear company who must decide whether his division should start producing a new basketball sneaker. Singular data would include a description of the product and the feedback provided by focus groups. Distributional data would look at the historical success rate of similar products.

Unfortunately, when making forecasts, people tend to focus too much on singular data, either underweighting or ignoring distributional data. We seem to have a tendency to place too much importance on the “case at hand”, feeling it is unique and that it provides sufficient information to produce an accurate prediction. Let’s take another look at our product manager, attempting to decide whether that new sneaker should be offered by his division.

After looking at demos of the product and reviewing market research performed on prospective customers, our product manager is very impressed. In his opinion, this sneaker is better than 9 out of 10 that he has reviewed. He estimates that it will sell at least 100,000 units. Kahneman and Tversky would label this behavior “prediction by impression”. That is, the prediction reflects the impression that the product made on our product manager. This decision-making strategy represents an over-reliance on singular information.

Let’s look at three possible scenarios related to our new product introduction and see the prediction strategy that best matches each situation:

Perfection – Imagine that the product manager has a fantastic “crystal ball”. His track record has been 100% in determining the exact sales volumes of new sneakers. In this highly improbable instance, it would make sense to simply rely on singular information related to the new product to make a forecast (Note: This is the equivalent of the “prediction by impression” method). Unfortunately, virtually no forecasting domain exhibits this level of predictability.

Throwing Darts – The product manager has no ability to actually forecast results. His historical track record is no better than chance. He cannot beat the proverbial “dart-throwing monkey”. In this instance, it would make sense to rely exclusively on distributional data. The product manager should examine a “reference class” of similar products, and utilize their historical sales data. Many forecasters are loath to adopt this model, as it minimizes their role in the process. However, there are many documented instances where expert forecasters, in fact, fare no better than chance when making predictions.

Blended – The product manager has a track record that is better than chance, but less than perfect. In this case, he should use a blended model that combines his own forecast with the historical average for the reference class. If the manager’s track record is close to perfection, the prediction should sit closer to his impression of the new item. If he is only slightly better than a “dart thrower”, the prediction should sit closer to the reference class average.

[Figure: intuitive prediction graphic]

As Kahneman and Tversky described in their paper, forecasters tend to take an overconfident approach to predictions. Virtually no one admits to being a “dart thrower” with no actual forecasting expertise. Many experts will simply rely on the singular data presented by the case they are evaluating. Even the more conservative experts, using the blended approach, will typically rely too heavily on case data.

In order to combat faulty intuitive predictions, Kahneman and Tversky recommended a five-step procedure. Let’s walk through the procedure, using our new product introduction problem as an example:

1) Establish Reference Class - The first step in the process is identifying a class of items that is similar to your current prediction challenge to provide historical guidelines. In our case we are trying to predict the sales of a new basketball sneaker. A reference class with attributes that are closest to the case at hand should provide the most accurate predictive guidance. Examples of potential reference classes, with varying levels of similarity, comprehensiveness and history, would include the following:

  • All shoes introduced by the company over the last five years
  • Basketball sneakers introduced by the company and its competitors last year
  • High end basketball sneakers (similar in price to the contemplated new model) produced by the company over the last three years

For argument’s sake, let’s assume our product manager chooses the third example, feeling it is the closest representation of our case and contains adequate history. For that reference class, the average sneaker has sold 40,000 units.

2) Assess Distribution of Reference Class - Now that the reference class has been selected, it’s important to identify how data is distributed throughout the class. Such measures as variance or standard deviation will show whether typical outcomes are closely clustered around the mean or broadly distributed across the range. In their paper, Kahneman and Tversky didn’t clearly explain why this step was important. I believe it is a useful piece of data for determining our confidence level in the “corrected estimate” that we’ll identify in Step 5.

3) Intuitive estimation - In this step our forecaster attempts to make an initial estimate utilizing their intuition, combined with data that is specific to the case at hand. In our scenario, the product manager has seen samples of the new sneaker and has reviewed feedback from focus groups. He would also have a sense of the current market for new, high-end sneakers. Based on this information, the product manager estimated sales of 100,000 units.

4) Assessment of predictability – Here’s where things start to get kind of squishy. In this step, the forecaster attempts to estimate the accuracy of their prognosticating capabilities in this domain. Kahneman and Tversky offered two different methods for computing this value. For simplicity’s sake, I’ll demonstrate one method. To complete this step, our product manager contemplates the following problem: “If you were to consider two different prospective sneaker offerings, how often could you correctly predict which shoe would be the better seller?” That percentage, expressed as a value between .5 and 1, becomes the assessment of predictability. A value of .5 would represent a “dart thrower”, with no better ability than chance, while 1 would represent perfect forecasting ability. For our example, let’s say our product manager estimates his ability at .75, meaning he believes he could pick the better selling sneaker three out of four times.

5) Correction of intuitive estimate – Here’s where we put it all together. Kahneman and Tversky developed a formula, using the information from steps 1-4, to compute a corrected estimate:

First, using the assessment of predictability from Step 4, compute a correlation coefficient:

(2 * Assessment of Predictability) – 1 = Correlation Coefficient

For our example, the calculation would be: (2 * .75) – 1 = .5

Next, calculate the corrected estimate as follows:

Reference Class Average + Correlation Coefficient * (Intuitive Estimate – Reference Class Average) = Corrected Estimate

For our example: 40,000 + .5 * (100,000 – 40,000) = 70,000

Adding in the variability from the reference class – As discussed, although Kahneman and Tversky didn’t fold it into their correction formula, their procedure included a step that looked at the distribution of the reference class. We could apply some basic concepts from statistics to use this data to create a range for our estimate. We could arbitrarily pick a confidence level that is appropriate for our problem. For example, we could observe that 90% of sneakers in the reference class had sales that were within 20% of the mean. We could then apply this to our corrected estimate to build a range of values:

70,000 * .2 = 14,000

Therefore, we could enhance our estimate by saying we expect it to range between 56,000 and 84,000 (i.e. 70,000 + or – 14,000).
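Pulling the five steps together, here is a small Python sketch of the correction applied to our sneaker example. The function simply encodes the formulas above; the inputs are the illustrative numbers from this post, not real data.

```python
def corrected_estimate(reference_class_avg, intuitive_estimate, predictability):
    """Kahneman and Tversky style correction of an intuitive forecast.

    predictability: estimated probability (between .5 and 1) of correctly
    picking which of two cases will have the higher outcome.
    .5 = dart thrower, 1 = perfect foresight.
    """
    correlation = 2 * predictability - 1  # Step 4 converted to a correlation coefficient
    return reference_class_avg + correlation * (intuitive_estimate - reference_class_avg)

# Numbers from the sneaker example in this post.
reference_avg = 40_000    # Step 1: average sales for the reference class
intuitive = 100_000       # Step 3: the product manager's intuitive estimate
predictability = 0.75     # Step 4: picks the better seller three times out of four

estimate = corrected_estimate(reference_avg, intuitive, predictability)

# Step 2 revisited: 90% of reference-class sneakers sold within 20% of the mean,
# so use that observed spread to bracket the corrected estimate.
spread = 0.20 * estimate
print(f"Corrected estimate: {estimate:,.0f} units "
      f"(range {estimate - spread:,.0f} to {estimate + spread:,.0f})")
```

Running it reproduces the numbers above: a corrected estimate of 70,000 units with a range of 56,000 to 84,000.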

Here is a graphic showing the steps from our example:

[Figure: steps 1 through 5 applied to the sneaker example]

Some Additional Thoughts/Caveats – While Kahneman and Tversky based their paper on scientific research, their 5-step process, containing several estimates and guesses, should not be mistaken for a rigorous statistical analysis. My use of the term “confidence level”, for estimating variability, should not be confused with the statistical concept of a confidence interval. However, on the whole, their process for “corrected estimates” should offer improvements in accuracy for intuitive predictions. It helps by countering our profound bias towards single-case forecasting, providing a moderated prediction.

As an alternative to “corrected estimates”, firms could also consider statistical prediction rules (SPR). This algorithmic approach to prediction standardizes the process and also does a good job removing the bias of the expert.

Posted in Uncategorized