by Annie Duke
One day, Nick the Greek didn’t show up to the game. When I asked where he was, another player muttered, confidentially (though it seemed everyone in the game already knew about this), “Oh, he got sent back.”
“Sent back?”
“Yeah, to Greece. They deported him.”
I can’t say that Nick the Greek’s deportation was the result of his wacky poker beliefs, but I have my suspicions. Other players speculated that he went broke, or dipped into the till at the hotel, or lost his work visa because he was playing poker every day on company time.
I can say that Nick the Greek lost a lot of money based on his beliefs—or, more accurately, because he ignored lots of feedback that his strategy was a losing one. He eventually went broke because he didn’t recognize learning opportunities as they arose.
If Nick the Greek were unique in his resistance to learning from the outcomes he was having at the poker table, I suppose he would just be a footnote for me, a funny story of a guy unique in his ability to hold tight to his strategy despite that strategy resulting in a lot of losing. But, while an extreme case to be sure, Nick the Greek wasn’t all that unique. And that was a puzzle for me. I was taught, as all psychology students are, that learning occurs when you get lots of feedback tied closely in time to decisions and actions. If we took that at face value, poker would be an ideal learning environment. You make a bet, get an immediate response from opponents, and win or lose the hand (with real-money consequences), all within minutes.
So why was Nick the Greek, who had been playing for years, unable to learn from his mistakes? Why was a novice like me cleaning up in the game? The answer is that while experience is necessary to becoming an expert, it’s not sufficient.
Experience can be an effective teacher. But, clearly, only some students listen to their teachers. The people who learn from experience improve, advance, and (with a little bit of luck) become experts and leaders in their fields. I benefited from adopting the learning habits of some of the phenomenal poker players I was exposed to along the way. We can all benefit from those practical strategies to become better decision-makers. Thinking in bets can help us get there.
But before getting to the solutions, we must first understand the problem. What are the obstacles in our way that make learning from experience so difficult? We all clearly have a desire to reach our long-term goals. Listening to what our outcomes have to teach us is necessary to do that. So what is systematically getting in the way?
Outcomes are feedback
We can’t just “absorb” experiences and expect to learn. As novelist and philosopher Aldous Huxley recognized, “Experience is not what happens to a man; it is what a man does with what happens to him.” There is a big difference between getting experience and becoming an expert. That difference lies in the ability to identify when the outcomes of our decisions have something to teach us and what that lesson might be.
Any decision, whether it’s putting $2 on Count de Change at the racetrack or telling your kids they can eat whatever they want, is a bet on what will likely create the most favorable future for us. The future we have bet on unfolds as a series of outcomes. We bet on staying up late to watch the end of a football game and we sleep through our alarm, wake up tired, get to work late, and get reprimanded by the boss. Or we stay up late and any of the myriad other outcomes follows, including waking up perfectly on time and making it to work early. Whichever future actually unfolds, when we decide to stay up late to see the end of the game, we are making a bet that we will be happier in the future for having seen the final play. We bet on moving to Des Moines and we find our dream job, meet the love of our life, and take up yoga. Or, like John Hennigan, we move there, hate it within two days, and have to buy our way home for $15,000. We bet on firing a division president or calling a pass play, and the future unfolds as it does. We can represent this like so:
As the future unfolds into a set of outcomes, we are faced with another decision: Why did something happen the way it did?
How we figure out what—if anything—we should learn from an outcome becomes another bet. As outcomes come our way, figuring out whether those outcomes were caused mainly by luck or whether they were the predictable result of particular decisions we made is a bet of great consequence. If we determine our decisions drove the outcome, we can feed the data we get following those decisions back into belief formation and updating, creating a learning loop:
We have the opportunity to learn from the way the future unfolds to improve our beliefs and decisions going forward. The more evidence we get from experience, the less uncertainty we have about our beliefs and choices. Actively using outcomes to examine our beliefs and bets closes the feedback loop, reducing uncertainty. This is the heavy lifting of how we learn.
Ideally, our beliefs and our bets improve with time as we learn from experience. Ideally, the more information we have, the better we get at making decisions about which possible future to bet on. Ideally, as we learn from experience we get better at assessing the likelihood of a particular outcome given any decision, making our predictions about the future more accurate. As you may have guessed, when it comes to how we process experience, “ideally” doesn’t always apply.
Learning might proceed in a more ideal way if life were more like chess than poker. The connection between outcome quality and decision quality would be clearer because there would be less uncertainty. The challenge is that any single outcome can happen for multiple reasons. The unfolding future is a big data dump that we have to sort and interpret. And the world doesn’t connect the dots for us between outcomes and causes.
If a patient comes into a doctor’s office with a cough, the doctor must work backward from that one symptom, that one outcome of a possible disease process, to decide among the multiple reasons the patient might have that cough. Is it because of a virus? Bacteria? Cancer? A neurological disorder? Because a cough looks roughly the same whether it is from cancer or a virus, working backward from the symptom to the cause is difficult. The stakes are high. Misdiagnose the cause, and the patient might die. That is why doctors require years of training to properly diagnose patients.
When the future coughs on us, it is hard to tell why.
Imagine calls to a customer by two salespeople from the same company. In January, Joe pitches the company’s products and gets $1,000 in orders. In August, Jane calls on the same customer and gets $10,000 in orders. What gives? Was it because Jane is a better salesperson than Joe? Or was it because the company updated its product line in February? Did a low-cost competitor go out of business in April? Or is the difference in their success due to any of a variety of other unconsidered reasons? It’s hard to know why because we can’t go back in time and run the controlled experiment where Joe and Jane switch places. And the way the company sorts this outcome can affect decisions on training, pricing, and product development.
This problem is top of mind for poker players. Most poker hands end in a cloud of incomplete information: one player bets, no one calls the bet, the bettor is awarded the pot, and no one is required to reveal their hidden cards. After those hands, the players are left guessing why they won or lost the hand. Did the winner have a superior hand? Did the loser fold the best hand? Could the player who won the hand have made more money if they chose a different line of play? Could the player who lost have made the winner forfeit if they chose to play the hand differently? In answering these questions, none of the players knows what cards their opponents actually held, or how the players would have reacted to a different sequence of betting decisions. How poker players adjust their play from experience determines their future results. How they fill in all those blanks is a vitally important bet on whether they get better at the game.
We are good at identifying the “-ER” goals we want to pursue (better, smarter, richer, healthier, whatever). But we fall short in achieving our “-ER” because of the difficulty in executing all the little decisions along the way to our goals. The bets we make on when and how to close the feedback loop are part of the execution, all those in-the-moment decisions about whether something is a learning opportunity. To reach our long-term goals, we have to improve at sorting out when the unfolding future has something to teach us, when to close the feedback loop.
And the first step to doing this well is in recognizing that things sometimes happen because of the other form of uncertainty: luck.
Luck vs. skill: fielding outcomes
The way our lives turn out is the result of two things: the influence of skill and the influence of luck. For the purposes of this discussion, any outcome that is the result of our decision-making is in the skill category. If making the same decision again would predictably result in the same outcome, or if changing the decision would predictably result in a different outcome, then the outcome following that decision was due to skill. The quality of our decision-making was the main influence over how things turned out. If, however, an outcome occurs because of things that we can’t control (like the actions of others, the weather, or our genes), the result would be due to luck. If our decisions didn’t have much impact on the way things turned out, then luck would be the main influence.*
When a golfer hits a tee shot, where the ball lands is the result of the influence of skill and luck, whether it is a first-time golfer or Rory McIlroy. The elements of skill, those things directly in the golfer’s control that influence the outcome, include club choice, setup, and all the detailed mechanics of the golf swing. Elements of luck include a sudden gust of wind, somebody yelling their name as they swing, the ball landing in a divot or hitting a sprinkler head, the age of the golfer, the golfer’s genes, and the opportunities they received (or didn’t receive) up to the moment of the shot.
An outcome like losing weight could be the direct result of a change in diet or increased exercise (skill), or a sudden change in our metabolism or a famine (luck). We could get in a car crash because we didn’t stop at a red light (skill) or because another driver ran a red light (luck). A student could do poorly on a test because they didn’t study (skill) or because the teacher is mean (luck). I can lose a hand of poker because I made poor decisions, applying the skill elements of the game poorly, or because the other player got lucky.
Chalk up an outcome to skill, and we take credit for the result. Chalk up an outcome to luck, and it wasn’t in our control. For any outcome, we are faced with this initial sorting decision. That decision is a bet on whether the outcome belongs in the “luck” bucket or the “skill” bucket. This is where Nick the Greek went wrong.
We can update the learning loop to represent this like so:
Think about this like we are an outfielder catching a fly ball with runners on base. Fielders have to make in-the-moment game decisions about where to throw the ball: hit the cutoff man, throw behind a base runner, throw out an advancing base runner. Where the outfielder throws after fielding the ball is a bet.
We make similar bets about where to “throw” an outcome: into the “skill bucket” (in our control) or the “luck bucket” (outside of our control). This initial fielding of outcomes, if done well, allows us to focus on experiences that have something to teach us (skill) and ignore those that don’t (luck). Get this right and, with experience, we get closer to whatever “-ER” we are striving for: better, smarter, healthier, happier, wealthier, etc.
It is hard to get this right. Absent omniscience, it is difficult to tell why anything happened the way it did. The bet on whether to field outcomes into the luck or skill bucket is difficult to execute because of ambiguity.
Working backward is hard: the SnackWell’s Phenomenon
In the nineties, millions of people jumped on the SnackWell’s bandwagon. Nabisco developed these devil’s food cookies as a leading product to take advantage of the now-discredited belief that fat, not sugar, makes you fat. Foods made with less fat were, at the time, considered healthier. With the blessing of the U.S. government, companies swapped in sugar for fat as a flavoring ingredient. SnackWell’s came in a green package, the color associated with “low fat” and, therefore, “healthy”—like spinach!
For all those people trying to lose weight or make healthier snacking choices, SnackWell’s were a delicious godsend. SnackWell’s eaters bet their health on substituting these cookies for other types of snacks like, say, cashews, which are high in fat. You could ingest sugar-laden SnackWell’s by the box, because sugar wasn’t the enemy. Fat was the enemy, and the packaging screamed “LOW FAT!”
Of course, we know now that obesity rose significantly during the low-fat craze. (Michael Pollan used the phrase “SnackWell’s Phenomenon” in describing people increasing their consumption of something that has less of a bad ingredient.) As those SnackWell’s eaters gained weight, it wasn’t easy for them to figure out why. Should the weight gain be fielded into the skill bucket, used as feedback that their belief about the health value of SnackWell’s was inaccurate? Or was the weight gain due to bad luck, like a slow metabolism or something else that wasn’t their fault or at least didn’t have to do with their choice to eat SnackWell’s? If the weight gain got fielded into the luck bucket, it wouldn’t be a signal to alter the choice to eat SnackWell’s.
Looking back now, it seems obvious how the weight gain should have been fielded. But it is only obvious once you know that SnackWell’s are an unhealthy choice. We have the benefit of twenty years of new research, more and better-quality information about what causes weight gain. The folks on the low-fat bandwagon had only their weight gain to learn from. The cards remained concealed.
Working backward from the way things turn out isn’t easy. We can get to the same health outcome (weight gain) by different routes. One person might choose SnackWell’s; another might choose Oreos (also a Nabisco product, developed by the same person who invented SnackWell’s); a third might choose lentils and kale. If all three people gain weight, how can any of them figure it out for sure?
Outcomes don’t tell us what’s our fault and what isn’t, what we should take credit for and what we shouldn’t. Unlike in chess, we can’t simply work backward from the quality of the outcome to determine the quality of our beliefs or decisions. This makes learning from outcomes a pretty haphazard process. A negative outcome could be a signal to go in and examine our decision-making. That outcome could also be due to bad luck, unrelated to our decision, in which case treating that outcome as a signal to change future decisions would be a mistake. A good outcome could signal that we made a good decision. It could also mean that we got lucky, in which case we would be making a mistake to use that outcome as a signal to repeat that decision in the future.
When Nick the Greek won with a seven and a deuce, he fielded that outcome into the skill bucket, taking credit for his brilliant strategy. When he lost with that hand—a much more common occurrence—he wrote it off as bad luck. His fielding error meant he never questioned his beliefs, no matter how much he lost. We’re all like Nick the Greek sometimes. Uncertainty—luck and hidden information—gave him the leeway to make fielding errors about why he was losing. We all face uncertainty. And we all make fielding errors.
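To see how much damage that kind of one-sided fielding can do, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the 30 percent win rate, the 100-hand sample, and the `loss_weight` parameter (how much of each loss the player is willing to count as evidence about the strategy) are hypothetical, not figures from the game described here.

```python
def perceived_win_rate(outcomes, loss_weight):
    """Estimate a hand's win rate when every win counts in full as evidence
    but each loss counts only as `loss_weight` (0.0 = "pure bad luck, ignore
    it", 1.0 = "count it in full").

    `outcomes` is a sequence of 1 (win) and 0 (loss).
    Returns None if nothing was counted as evidence.
    """
    wins = sum(outcomes)
    losses = len(outcomes) - wins
    counted = wins + loss_weight * losses
    if counted == 0:
        return None
    return wins / counted


# Hypothetical record: 100 hands, 30 wins and 70 losses (an assumed figure).
record = [1] * 30 + [0] * 70

honest = perceived_win_rate(record, loss_weight=1.0)     # every loss counts
partial = perceived_win_rate(record, loss_weight=0.25)   # most losses excused
one_sided = perceived_win_rate(record, loss_weight=0.0)  # every loss is "luck"

print(honest, partial, one_sided)
```

Under these assumed numbers, the player who counts every loss sees a hand that wins 30 percent of the time. The player who writes every loss off as bad luck sees a hand that has never lost, and no amount of additional losing will change that estimate.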
Rats get tripped up by uncertainty in a way that should appear very familiar to us. Classical stimulus-response experiments have shown that the introduction of uncertainty drastically slows learning. When rats are trained on a fixed reward schedule (for example, a pellet for every tenth press of a lever), they learn pretty fast to press that lever for food. If you withdraw the reward, the lever-pressing behavior is quickly extinguished. The rats figure out that no more food is on its way.
But when you reward the rats on a variable or intermittent reinforcement schedule (a pellet that comes on average every tenth lever press), that introduces uncertainty. The average number of lever presses for the reward is the same, but the rat could get a reward on the next press or not for thirty presses. In other words, the rats are rewarded the way humans usually are: having no way to know with certainty what will happen on the next try. When you withdraw the reward from those rats, the lever-pressing behavior extinguishes only after a very long time of fruitless lever pushing, sometimes thousands of tries.
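A little arithmetic shows why a dry spell is so much less informative on the variable schedule. The sketch below, in Python, rests on assumptions that are mine rather than the experimenters': it models the variable schedule as an independent 1-in-10 chance of a pellet on every press, and it uses an arbitrary 1 percent cutoff for when a dry spell becomes "surprising." It is not a model of how rats actually decide to give up.

```python
import math


def prob_dry_run_variable(n_presses, p_reward=0.1):
    """Chance of n unrewarded presses in a row while rewards are still
    active, assuming each press pays off independently with p_reward."""
    return (1.0 - p_reward) ** n_presses


def prob_dry_run_fixed(n_presses, ratio=10):
    """On a fixed schedule (a pellet on every `ratio`-th press), a dry run
    starting right after a pellet is certain to last ratio - 1 presses and
    cannot reach `ratio` presses while rewards are still active."""
    return 1.0 if n_presses < ratio else 0.0


def presses_until_surprising(p_reward=0.1, threshold=0.01):
    """Smallest dry run whose probability, with rewards still active on the
    assumed variable schedule, falls below `threshold`."""
    return math.ceil(math.log(threshold) / math.log(1.0 - p_reward))


print(prob_dry_run_fixed(10))        # tenth press with no pellet
print(prob_dry_run_variable(30))     # thirty presses with no pellet
print(presses_until_surprising())    # dry run needed to dip below 1 percent
```

On the fixed schedule, the tenth press with no pellet is already impossible unless the reward has been withdrawn, so the message arrives quickly. On the assumed variable schedule, thirty empty presses in a row still happen about 4 percent of the time when the pellets are flowing, and it takes 44 empty presses before the dry spell drops below a 1 percent chance. "I've just been getting unlucky" stays a plausible explanation for a long time.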
We might imagine the rats thinking, “I bet the next lever press will get me a pellet. . . . I’ve just been getting unlucky . . . I’m due.” Actually, we don’t even have to imagine this. We can hear it if we listen to what people say while they play slot machines. Slot machines operate on a variable-payoff system. It’s no wonder that, despite those machines being among the worst bets in the casino, the banks of slots in a casino are packed. In the end, our rat brains dominate.
If this all doesn’t seem difficult enough, outcomes are rarely all skill or all luck. Even when we make the most egregious mistakes and get appropriately negative outcomes, luck plays a role. For every drunk driver who swerves into a ditch and flips his car, there are several who swerve harmlessly across multilane highways. It might feel like the drunk driver in the ditch deserved that outcome, but the luck of the road conditions and presence or absence of other drivers also played a role. When we do everything right, like drive through a green light perfectly sober and live to tell the tale, there is also an element of luck. No one else simultaneously ran a red light and hit us. There wasn’t a patch of ice on the road to make us lose control of our vehicle. We didn’t run over a piece of debris and blow a tire.
When we field our outcomes as the future unfolds, we always run into this problem: the way things turn out could be the result of our decisions, luck, or some combination of the two. Just as we are almost never 100% wrong or right, outcomes are almost never 100% due to luck or skill. Learning from experience doesn’t offer us the orderliness of chess or, for that matter, folding and sorting laundry. Getting insight into the way uncertainty trips us up, whether the errors we make are patterned (hint: they are) and what motivates those errors, should give us clues for figuring out achievable strategies to calibrate the bets we make on our outcomes.