Home attacking ability × Away defensive weakness × Home advantage factor
Here the “home advantage factor” accounts for the boost teams often get when playing at home. In a similar fashion, the expected number of away goals was equal to the away team’s attacking ability multiplied by the home defensive weakness (the away team didn’t get any extra advantage).
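As a rough illustration of the two formulas above, the following sketch multiplies the ability factors together and then turns the expected goals into score line probabilities. It assumes each team's goals follow an independent Poisson distribution, which is a simplification, and every number and function name here is hypothetical rather than taken from Dixon and Coles's fitted model.

```python
import math


def expected_goals(home_attack, home_defensive_weakness,
                   away_attack, away_defensive_weakness, home_advantage):
    """Return (expected home goals, expected away goals) for one match."""
    # Home side: own attack x opponent's defensive weakness x home boost.
    home_rate = home_attack * away_defensive_weakness * home_advantage
    # Away side: own attack x opponent's defensive weakness, with no boost.
    away_rate = away_attack * home_defensive_weakness
    return home_rate, away_rate


def poisson_pmf(k, rate):
    """Probability of exactly k goals when the expected number is `rate`."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def scoreline_probability(home_goals, away_goals, home_rate, away_rate):
    """Probability of an exact score line, assuming independent Poisson goals."""
    return poisson_pmf(home_goals, home_rate) * poisson_pmf(away_goals, away_rate)


# Made-up ability values, purely for illustration.
home_rate, away_rate = expected_goals(
    home_attack=1.4, home_defensive_weakness=0.9,
    away_attack=1.1, away_defensive_weakness=1.2,
    home_advantage=1.3,
)
print(f"Expected goals: home {home_rate:.2f}, away {away_rate:.2f}")
print(f"P(2-1 home win) = {scoreline_probability(2, 1, home_rate, away_rate):.3f}")
```

With these invented inputs, the home side is expected to score about twice as often as the visitors, and most of that gap comes from the away team's weak defense and the home advantage factor rather than from attacking ability alone.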
To estimate each team’s attacking and defensive prowess, Dixon and Coles collected several years of data on English soccer games from the top four divisions, which among them contained a total of 92 teams. Because the model included an attack and defensive ability for each team, plus an extra factor that specified the home advantage, this meant estimating a total of 185 factors. If every team had played every other team the same number of times, estimation would have been relatively straightforward. However, promotions and relegations—not to mention cup games—meant some match-ups were more common than others. Much like the races at Happy Valley, there was too much hidden information for simple calculations. To estimate each of the 185 factors, it was therefore necessary to enlist help from computational methods like the ones developed by the researchers at Los Alamos.
When Dixon and Coles used their model to make predictions about games that had been played in the 1995–1996 season, they found that the forecasts lined up nicely with the actual results. But would the model have been good enough to bet with? They tested it by going through all the games and applying a simple rule: if the model said a particular result was 10 percent more likely than the bookmakers’ odds implied, it was worth betting on. Despite using a basic model and betting strategy, the results suggested that the model would be capable of outperforming the bookmakers.
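The betting rule described above can be sketched in a few lines. This version makes several assumptions that the text does not specify: odds are quoted in decimal form, the bookmaker's implied probability is taken as one divided by the odds (ignoring the bookmaker's margin), and "10 percent more likely" is read as a ratio of 1.10 between the two probabilities. The function names are hypothetical.

```python
def implied_probability(decimal_odds):
    """Probability implied by decimal odds, ignoring the bookmaker's margin."""
    return 1.0 / decimal_odds


def worth_betting(model_probability, decimal_odds, threshold=1.10):
    """True if the model rates the result at least `threshold` times as likely
    as the odds imply (an assumed reading of the 10 percent rule)."""
    return model_probability / implied_probability(decimal_odds) > threshold


# Illustrative example: odds of 2.5 imply a 40 percent chance.
for model_probability in (0.42, 0.46):
    decision = worth_betting(model_probability, decimal_odds=2.5)
    print(f"model says {model_probability:.0%}: bet = {decision}")
```

Under these assumptions, a model estimate of 42 percent is not enough of an edge over the implied 40 percent, whereas an estimate of 46 percent clears the threshold.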
Not long after publishing their work, Dixon and Coles went their separate ways. Dixon set up Atass Sports, a consultancy firm that specialized in the prediction of sports results. Later, Coles would join Smartodds, a London-based company that also worked on sports models. There are now several firms working on soccer prediction, but Dixon and Coles’s research remains at the heart of many models. “Those papers are still the main starting points,” said David Hastie, who co-founded soccer analytics firm Onside Analysis.
As with any model, though, the research has some weaknesses. “It’s not an entirely polished piece of work,” Coles has pointed out. One problem is that the measurements for teams’ attacking and defending abilities don’t change over the course of a game. In reality, players may tire or launch more attacks at certain points in the game. Another issue is that, in real life, draws are more common than a Poisson process would predict. One explanation might be that teams that are trailing put in more effort, with the hope of leveling the score line, whereas their opponents get complacent. But, according to Andreas Heuer and Oliver Rubner, two researchers at the University of Münster, there’s something else going on. They reckon the large number of draws is because teams tend to take fewer risks—and hence are less likely to score—if the score line is even in the later stages of a game. When the pair looked at matches in the German Bundesliga from 1968 to 2011, they found that the goal-scoring rate decreased when the score was a draw. This was especially noticeable when the score was 0–0, with players preferring to settle for the “coziness of a draw.”
It turned out that certain points in a game created particularly draw-friendly conditions. Heuer and Rubner found that Bundesliga goals tended to follow a Poisson process during the first eighty minutes of the match, with teams finding the net at a fairly consistent rate. It was only during the last period of play that things became more erratic, especially if the away team was leading by one or two goals in the dying minutes of the match.
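To see what a plain Poisson model predicts for draws, and hence the baseline that real results exceed, one can add up the probabilities of every level score line. The sketch below assumes independent Poisson goals for each team and uses invented scoring rates; it does not reproduce Heuer and Rubner's analysis.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals when the expected number is `rate`."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def outcome_probabilities(home_rate, away_rate, max_goals=20):
    """Return (home win, draw, away win) probabilities under independent
    Poisson scoring, truncating each team's tally at `max_goals`."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Hypothetical scoring rates for an evenly matched, low-scoring game.
home_win, draw, away_win = outcome_probabilities(1.0, 1.0)
print(f"home {home_win:.3f}, draw {draw:.3f}, away {away_win:.3f}")
```

If the observed share of draws between such teams is noticeably higher than the figure this baseline produces, that gap is the kind of discrepancy the researchers set out to explain.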
By adjusting for these types of quirks, sports prediction firms have built on the work of Dixon, Coles, and others and have turned soccer betting into a profitable business. In recent years, these companies have greatly expanded their operations. But though the industry has grown, and new firms have appeared, the scientific betting industry is still relatively new in the United Kingdom. Even the most established firms started post-2000. In the United States, however, sports prediction has a much richer history—sometimes quite literally.
TO PASS TIME DURING dull high school classes, Michael Kent would often read the sports section of the newspaper. Despite living in Chicago, he followed college athletics from all over the country. As he leafed through the scores, he would get to thinking about the winning margin in each game. “A team would beat another team 28–12,” he recalled, “and I would say, well how good is that?”
After high school, Kent completed a degree in mathematics before joining the Westinghouse Corporation. He spent the 1970s working in the corporation’s Atomic Power Laboratory in Pittsburgh, Pennsylvania, where they designed nuclear reactors for the US Navy. It was very much a research environment: a mixture of mathematicians, engineers, and computer specialists. Kent spent the next few years trying to simulate what happens to a nuclear reactor that has coolant flowing through its fuel channels. In his spare time, he also started writing computer programs to analyze US football games. In many ways, Kent’s model did for college sports what Bill Benter’s did for horse races. Kent gathered together lots of factors that might influence a game’s result, and then used regression to work out which were important. Just as Benter would later do, Kent waited until he had his own estimate before he looked at the betting market. “You need to make your own number,” Kent said. “Then—and only then—do you look at what other people have.”
STATISTICS AND DATA HAVE long been an important part of American sport. They are particularly prominent in baseball. One reason is the structure of the game: it is split into lots of short intervals, which, as well as providing plenty of opportunities to grab a hotdog, makes the game much easier to analyze. Moreover, baseball innings can be broken down into individual battles—such as pitcher versus batter—that are relatively independent, and hence statistician-friendly.
Most of the stats that baseball fans pore over today—from batting averages to runs scored—were devised in the nineteenth century by Henry Chadwick, a sports writer who’d honed his ideas watching cricket matches in England. With the growth of computers in the 1970s, it became easier to collate results, and people gradually formed organizations to encourage research into sports statistics. One such organization was the Society for American Baseball Research, founded in 1971. Because the society’s acronym was SABR, the scientific analysis of baseball became known as “sabermetrics.”
Sports statistics grew in popularity during the 1970s, but several other ingredients are needed to cook up an effective betting strategy. It just so happened that Michael Kent had all of them. “I was very fortunate that a whole bunch of things came together,” he said. The first ingredient was data. Not far from Kent’s atomic laboratory in Pittsburgh was the Carnegie Library, which had a set of anthologies containing several years’ worth of college sports scores and schedules. The good news was that these provided Kent’s model with information it needed to generate robust predictions; the bad news was that each result had to be input manually. Kent also had the technology to power the model, with access to the high-speed computer at Westinghouse. His university had been one of the first in the country to get a computer, so Kent already had far more programming experience than most. That wasn’t all. As well as knowing how to write computer programs, Kent understood the statistical theory behind his models. At Westinghouse, he’d worked with an engineer named Carl Friedrich, who’d shown him how to create fast, reliable computer models. “He was one of the most brilliant people I ever met,” Kent said. “The guy was unbelievable.”
Even with the crucial components in place, Kent’s gambling career didn’t get off to the best start. “Very early on, I had four huge bets,” he said. “I lost them all. I lost $5,000 that Saturday.” Still, he realized that the misfortunes did have some benefits. “Nothing motivated me more than losing.” After working on his model at night for seven years, Kent finally decided to make sports betting his full-time job in 1979. While Bill Benter was making his first forays into blackjack, Kent left Westinghouse for Las Vegas, ready for the new college football season.
Life in the city involved a lot of new challenges. One of them was the logistics of placing the actual bets. It wasn’t like Hong Kong, where bettors could simply phone in their selections. In Las Vegas, gamblers had to turn up at a casino with hard currency. Naturally, this made Kent a little nervous. He came to rely on valet parking, because it stopped him having to walk through poorly lit parking lots with tens of thousands of dollars in cash.
Because it was tricky to place bets, Kent teamed up with Billy Walters, a veteran gambler who knew how Las Vegas worked and how to make it work for them. With Walters taking care of the betting, Kent could focus on the predictions. Over the next few years, other gamblers joined them to help implement the strategy. Some assisted with the computer model, while others dealt with the bookmakers. Together, they were known as the “Computer Group,” a name that would become admired by bettors almost as much as it was dreaded by casinos.
Thanks to Kent’s scientific approach, the Computer Group’s predictions were consistently better than Las Vegas bookmakers’. The success also brought some unwanted attention. Throughout the 1980s, the FBI suspected the group was operating illegally, conducting investigations that were driven partly by bemusement at how the group was making so much money. Despite years of scrutiny, however, the investigations didn’t come to anything. There were FBI raids, and several members of the Computer Group were indicted, but all were eventually acquitted.
It has been estimated that between 1980 and 1985, the Computer Group placed over $135 million worth of bets, turning a profit of almost $14 million. There wasn’t a single year in which they made a loss. The group eventually disbanded in 1987, but Kent would continue to bet on sports for the next two decades. Kent said the division of labor remained much the same: he would come up with the forecasts, and Walters would implement the betting. Kent pointed out that much of the success of his predictions came from the attention he put into the computer models. “It’s the model-building that’s important,” he said. “You have to know how to build a model. And you never stop building the model.”
Kent generally worked alone on his predictions, but he did get help with one sport. An economist at a major university on the West Coast came up with football predictions each week. The man was very private about his betting research, and Kent referred to him only as “Professor number 1.” Although the economist’s estimates were very good, they were different from Kent’s forecasts. So, between 1990 and 2005, they would often merge the two predictions.
Kent made his name—and his money—predicting college sports such as football and basketball. But not all sports have received this level of attention. Whereas Kent was coming up with profitable football models in the 1970s, it wasn’t until 1998 that Dixon and Coles sketched out a viable method for soccer betting. And some sports are even harder to predict.
ONE AFTERNOON IN JANUARY 1951, Françoise Ulam came home to find her husband Stanislaw staring out of the window. His expression was peculiar, his eyes unfocused on the garden outside. “I found a way to make it work,” he said. Françoise asked him what he meant. “The Super,” he replied. “It is a totally different scheme, and it will change the course of history.”
Ulam was referring to the hydrogen bomb they had developed at Los Alamos. Thanks to the Monte Carlo method and other technological advances, the United States possessed the most powerful weapon that ever existed. It was the early stages of the Cold War, and Russia had fallen behind in the nuclear arms race.
Yet grand nuclear ideas weren’t the only innovations appearing during this period. While Ulam had been working on the Monte Carlo method in 1947, a very different kind of weapon had emerged on the other side of the Iron Curtain. It was called the “Avtomat Kalashnikova” after its designer Mikhail Kalashnikov. In subsequent years, the world would come to know it by another name: the AK-47. Along with the hydrogen bomb, the rifle would shape the course of the Cold War. From Vietnam to Afghanistan, it passed through the hands of soldiers, guerrillas, and revolutionaries. The gun is still in use today, with an estimated 75 million AK-47s having been built to date. The main reason for the weapon’s popularity lies in its simplicity. It has only eight moving parts, which means it’s reliable and easy to repair. It might not be that accurate, but it rarely jams and can survive decades of use.
When it comes to building machines, the fewer parts there are, the more efficient the machine is. Complexity means more friction between the different components: for example, around 10 percent of a car engine’s power is wasted because of such friction. Complexity also leads to malfunctions. During the Cold War, expensive Western rifles would jam while the simple AK-47 continued to function. The same is true of many other processes. Making things more complicated often removes efficiency and increases error. Take blackjack: the more cards a dealer uses, the harder it is to shuffle properly. Complexity also makes it harder to come up with accurate forecasts about the future. The more parts that are involved, and the more interactions going on, the harder it is to predict what will happen from limited past data. And when it comes to sport, there is one activity that involves a particularly large number of interactions, which can make predictions very difficult.
US President Woodrow Wilson once described golf as “an ineffectual attempt to put an elusive ball into an obscure hole with an implement ill adapted to the purpose.” As well as having to deal with ballistics, golfers must also contend with their surroundings. Golf courses are littered with obstacles, ranging from trees and ponds to sand bunkers and caddies. As a result, the shadow of luck is never far away. A player might hit a brilliant shot, sending the ball toward the hole, only to see it collide with the flagstick and ricochet into a bunker. Or a player could slice the ball into a tree and have it bounce back into a strong position. Such mishaps are so common in golf that the rulebook even has a phrase to cover them. If the ball hits a random object or otherwise goes astray by accident, it’s just the “rub of the green.”
Whereas horse races in Hong Kong resemble a well-designed science experiment, golf tournaments are more likely to require one of Ronald Fisher’s statistical postmortems. Over the four days of a tournament, players tee off at all sorts of different times. The location of the hole also changes between rounds—and if the tournament is in the United Kingdom, so will the weather. If that isn’t bad enough, the field of potential winners is huge in a golf tournament. Whereas the Rugby World Cup has twenty teams competing for the trophy, and the UK Grand National has forty horses running, ninety-five players compete for the US Masters each year, and the other three majors are even larger.
All these factors mean that golf is particularly difficult to predict accurately. Golf has therefore been a bit of an outlier in terms of sports forecasting. Some firms are rising to the challenge—Smartodds now has statisticians working on golf prediction—but in terms of betting activity, the sport still lags far behind many others.
Even among different team sports, some games are easier to predict than others. The discrepancy comes partly down to the scoring rates. Take hockey. Teams playing in the NHL score two or three goals per game on average. Compare that to basketball, where NBA teams will regularly score a hundred points in a game. If goals are rare—as they are in hockey—a successful shot will have more impact on the game. This means that a chance event, such as a deflection or lucky shot, is more likely to influence the final result. Low-scoring games also mean fewer data points to play with. When a brilliant team beats a lousy team 1–0, there is only one scoring event to analyze.
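The effect of scoring rates on upsets can be illustrated with a crude calculation. The sketch below assumes each team's score is an independent Poisson count, which is a loose approximation for hockey and a very rough one for basketball, where baskets are worth two or three points. The scoring rates are invented so that the stronger team is 20 percent better in both cases.

```python
import math


def poisson_pmf(k, rate):
    """Poisson probability computed in log space so large rates do not overflow."""
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))


def upset_probability(strong_rate, weak_rate, max_score=400):
    """Probability that the weaker team outscores the stronger one."""
    total = 0.0
    strong_below = 0.0  # running P(strong team's score < w)
    for w in range(max_score + 1):
        total += poisson_pmf(w, weak_rate) * strong_below
        strong_below += poisson_pmf(w, strong_rate)
    return total


# Hypothetical rates: the stronger side scores 20 percent more in both sports.
low_scoring = upset_probability(strong_rate=3.0, weak_rate=2.5)
high_scoring = upset_probability(strong_rate=120.0, weak_rate=100.0)
print(f"low-scoring upset chance:  {low_scoring:.2f}")
print(f"high-scoring upset chance: {high_scoring:.2f}")
```

Under these assumptions the weaker team wins far more often in the low-scoring game, even though the relative gap in quality is the same, because a handful of scoring events leaves more room for chance.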
Fortunately, it’s possible to squeeze extra information out of a game. One approach is to measure performance in other ways. In hockey, pundits often use stats such as the “Corsi rating” (the difference between the number of shots directed at an opponent’s net and the number targeted at a team’s own goal) to make predictions about score lines. The reason they use such rating systems is that the number of goals scored in previous games does not say much about a team’s future ability to score goals.
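Taking the definition of the Corsi rating given here at face value, the calculation is a simple difference. The function name and the shot counts below are hypothetical.

```python
def corsi_rating(shots_at_opponent_net, shots_at_own_net):
    """Shots directed at the opponent's net minus shots aimed at the team's own."""
    return shots_at_opponent_net - shots_at_own_net


# Illustrative game: 58 shots directed at the opponent, 47 faced.
print(corsi_rating(58, 47))
```

Because a game produces dozens of shots but only a few goals, a measure like this gives a forecaster many more data points per match than the final score does.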
Scoring is far more common in games such as basketball, but the way in which the game is played can affect predictability, too. Haralabos Voulgaris has spent years betting almost exclusively on basketball and is now one of the world’s top NBA bettors. At the MIT Sloan Sports Analytics Conference in 2013, he pointed out that the nature of scoring in basketball was changing, with players attempting more long-distance three-point shots. Because randomness plays a bigger role in these types of shots, it was becoming harder to predict which team would score more points. Traditional forecasting methods assume that team members work together to get the ball near the basket and score; these approaches are less accurate when individual players make speculative attempts from farther away.
Why does Voulgaris bet on basketball rather than another sport? It comes partly down to the simple fact that he likes the game. Sifting through reams of data doesn’t make for a great lifestyle if it’s not interesting. It also helps that Voulgaris has lots of data to sift through. Models need to process a certain amount of data before they can churn out reliable predictions. And in basketball plenty of information is available to analyze. The same cannot be said for other sports, however. In the early days of English soccer prediction, it was a struggle to dig up the necessary data. Whereas American pundits were dealing with a flood of information, in the United Kingdom there was barely a puddle. “We don’t realise how easy we have it sometimes these days,” Stuart Coles said.