Smart Baseball

by Keith Law


  By my last spring with the Blue Jays, I’d developed additional scripts to strip play-by-play game logs from college sites that offered them so we could estimate groundball rates and swinging-strike percentages for pitchers and very basic splits for position players. Within a few years, however, an independent data provider, collegesplits.com, began offering these data and much more (including left/right splits for pitchers and hitters) to teams for a fee, and by 2010 or so at least half of MLB teams were using this kind of information in their draft processes.

  The other major part of my job at the time was to work with the data MLB provided for all professional players through delivery of daily flat files and the posting of game logs for every player. I wrote further scripts to deal with all of these, so we could easily identify, say, pitchers who were particularly effective at retiring left-handed hitters, or who had high groundout rates, or hitters whose value might be obscured by tough home ballparks. I spent more time working to collect, clean, and format these data than I did to “analyze” them, because the latter part was so straightforward—applying park effects, for example—while tiny glitches in the format of a Web page could throw a beautifully designed Perl script (if I do say so myself) into disarray.

  MLB’s Pitch f/x data set just became available in my last year with Toronto, and I left to join ESPN before I got to do much of anything with it. Had I stuck around, my old tools would have been inadequate to the job; where I could store everything in Microsoft Access and export it to Excel for formatting, Pitch f/x had too many rows of data for basic desktop software. That was a job for a database programmer, and I am not one of those. This was the first inflection point, where hiring more than one person to staff an analytics department—and hiring someone with greater technical skills than I possessed—started to make sense.

  Before the 2015 arrival of Statcast data, there were already teams that employed departments of six or more analysts, handling Pitch f/x data, college data, and some of the TrackMan data available for high school players from showcase events. Now Statcast data and its sheer size—the aforementioned terabyte of data per season—have led to an even greater rise in hiring; one department head estimated to me that all thirty MLB teams in total employ about two hundred people in analytics departments, from directors to entry-level programmers. I was one of the only full-time employees of any MLB team in 2002 whose job was to work with data; now, there are fifty to a hundred times more such people working for clubs, and I am completely underqualified for the job I used to hold.

  As teams get smarter, the gap widens between what teams know about players and what we know about players—and by we, I mean not just the fans, but those of us who cover the industry for a living. Where fifteen or twenty years ago, the idea of even employing a single consultant to provide insight via statistical analysis was unorthodox, today teams employ entire departments of a half dozen or more analysts, some sporting Ph.D.s, to help gather, organize, and process data and queries to improve their decision-making on players. Increasing the accuracy of player projections—that is, what the player’s performance is likely to be next year, the year after, or over the life of a long-term contract—has long been a sort of holy grail for front offices, which is why you’re seeing so many resources thrown into analytics departments. Projections can never be perfect and should always have confidence intervals around them (“We’re 95 percent confident his OBP will be between .340 and .360”), but even marginal improvements in their accuracy can mean millions of dollars in value to a team.
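
  To make the idea of an interval concrete, here is a minimal Python sketch. It is not how any team builds projections; it only shows how much uncertainty sample size alone puts around an observed OBP, using a simple normal approximation. The function names and the stat line are hypothetical, chosen purely for illustration.

```python
import math


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Return (OBP, number of chances) using the standard OBP formula."""
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / chances, chances


def obp_interval(hits, walks, hit_by_pitch, at_bats, sac_flies, z=1.96):
    """Approximate 95% interval around an observed OBP.

    Assumption: each chance is treated as an independent trial with the
    same probability of reaching base (a normal approximation to the
    binomial). Real projection systems use far richer models.
    """
    p, n = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Hypothetical season: 210 times on base in 600 chances, a .350 OBP.
low, high = obp_interval(hits=150, walks=55, hit_by_pitch=5,
                         at_bats=530, sac_flies=10)
print(f"{low:.3f} to {high:.3f}")  # 0.312 to 0.388
```

  Under these assumptions a full season of a .350 OBP still leaves an interval of roughly .312 to .388, far wider than the range in the quoted example.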

  This puts the fan (in other words, you) in a different place today than in 2007 or 1997. It was reasonable in prior eras to think that when it came to player stats, we all knew what the teams knew, and in certain cases we seemed to know more, or simply to consider it more than the front offices in question did. Today there is no question that the teams have more data than we have, and that they are drawing conclusions that we won’t know about until much later, if at all. We may certainly still disagree with team decisions on players, but we don’t have the same information they do.

  I still take hope in the recent statistical revolution and the ongoing changes promised by Statcast and any future data sources. Where once the discussion and coverage of baseball was ruled by superstition and myth, today more fans demand some rational underpinning to arguments over whether the Nationals gave up too much for Adam Eaton, whether Mike Trout is having the best start to a career in baseball history, whether Manny Machado or Bryce Harper will end up the best player from the 2010 draft, and so on. You can still try to write arrant nonsense or spew it on television, but you’ll be picked apart for doing so, because the rise of the analysts has led to a more educated fan base.

  Every player’s stat line tries to tell the story of his season, so if you want to get the story right, you have to use the right stats. Using the old-fashioned, outdated stats I broke down in Part One meant getting the story wrong. They ascribed credit to one player for the actions of another, and sometimes led writers and fans to believe that players had mythical powers like the ability to play better in a clutch situation. We know better now, whether it’s how to value what a player did or how to dismiss quackery like clutch hitters and lineup protection.

  Understanding more modern statistics, even those as simple as OBP or slugging percentage, allows everyone to better understand what’s happening on the field, whether it’s going well or poorly, or the moves that teams make off the field. If your favorite team just acquired a player you’ve never heard of before, you’re going to want to know whether he’ll help. The better the statistics you look at to answer this question, the more confident you can be in your answer. And now you’re better armed to watch the watchmen, to read the work of people who cover the game (like me) and see if we’re telling the right kind of stories about the game, or ignoring statistical information that leads to a different conclusion. When a broadcaster tells you that some player “just knows how to win” or is “a great RBI guy,” your BS detector will light up like a Christmas tree. When a manager or GM claims that a low-OBP player can lead off because he’s fast, you know why speed is a red herring. You’re armed to think rationally about a sport that, for most of its 150 or so years, was covered and treated and discussed in the most irrational terms.

  This will still be true for the savvy fan even as the information gap I mentioned above grows. You don’t need to know or understand the importance of exit velocity or launch angle or spin rate to watch and enjoy a game, or to follow a player or team through a season. This information may help you—for example, it appears that a fastball with high velocity but just average spin rate isn’t going to be as effective as the velocity alone might imply, missing fewer bats and leading to more hard contact. And you, the savvy fan (you’re welcome), should keep an open mind about new advances; ten years ago we never thought about putting a value on catcher framing, but now it’s driving transactions and pushing the worst framers out of regular jobs.

  Teams are developing better tools to drive their player projections, regressing performances to mean levels or employing mixed models to try to incorporate random effects into metrics for pitchers, but you don’t have to understand any of this to be an educated fan. You only have to accept that the search for knowledge within baseball never ends, so what appears to be a complete story of a player today may turn out to be incomplete tomorrow. I said in the chapter on pitching metrics that my 2009 NL Cy Young vote may end up looking wrong as we learn more about how much credit or blame falls on a pitcher when a ball in play becomes a hit. Using the best knowledge we have right now while remembering that we may know a lot more in the future is the essence of Smart Baseball.
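
  Regressing a performance to the mean can be sketched in a few lines of Python. This is one common, simple form, a weighted blend of the observed rate and the league rate, and not any team's model. The function name, the league OBP of .320, and the 300-plate-appearance weight are all assumptions made up for the example.

```python
def regress_to_mean(observed_rate, sample_size, league_rate, prior_weight):
    """Blend an observed rate with the league rate.

    prior_weight behaves like that many plate appearances of
    league-average performance added to the player's record, so small
    samples are pulled strongly toward the league rate and large
    samples only slightly. The value used below is an assumption.
    """
    total = observed_rate * sample_size + league_rate * prior_weight
    return total / (sample_size + prior_weight)


# A .400 OBP over 100 plate appearances versus over 600.
hot_start = regress_to_mean(0.400, 100, 0.320, 300)
full_season = regress_to_mean(0.400, 600, 0.320, 300)
print(f"{hot_start:.3f}")    # 0.340
print(f"{full_season:.3f}")  # 0.373
```

  The same .400 OBP is pulled most of the way back to the league rate after 100 plate appearances, but much less after 600, which is the intuition behind regressing to the mean.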

  Acknowledgments

  I’d like to thank my editor, Matt Harper, for shepherding this project from concept to completion, taking a set of essays and helping me weave them together into something coherent and cogent.

  My agents, Eric Lupfer for literary and Melissa Baron for anything else, helped make this book more than just some idea I had in the middle of thirty other ideas I had that never went anywhere. Eric in particular turned the elevator pitch into a written document and then into a formal proposal, one that landed me with HarperCollins faster than I could have hoped for.

  Meredith Wills provided some essential research help, especially early in the process, which formed a lot of the foundation of the early chapters on ERA and fielding, although much of the work she did doesn’t appear directly in the book. The commentary about catchers whose proficiency at throwing out runners might hurt their apparent defensive value because runners stop attempting to steal against them comes from research Meredith did for this project.

  I spoke to many people inside the industry to research this book, folks who made more time for me than I could have expected. The Statcast team at Major League Baseball Advanced Media, including Cory Schwartz, Greg Cain, Tom Tango (he exists!), Mike Petriello, and Daren Willman spent an afternoon walking me through the product’s history and capabilities. I felt like a kid walking through a science museum for the first time.

  Molly Knight was especially helpful with advice and a critical eye that helped make the final book cleaner and more polished.

  There are more team executives who helped than I can list, and some requested that they remain anonymous, but among those I can thank publicly are David Forst, Theo Epstein, Alex Anthopoulos, John Mozeliak, Chris Long, Sig Mejdal, Jason Pare, James Click, Dan Fox, Matt Klentak, John Coppolella, Mitchel Lichtman, and Farhan Zaidi, who’d like me to say that he was especially unhelpful.

  My editors and colleagues at ESPN, especially at ESPN.com and Insider, were gracious enough to give me the time I needed to write a book while maintaining a full-time job and regular presence across ESPN’s various platforms. I appreciate their constant support and understanding.

  My entire career in baseball has been something of a happy accident, and it only occurred at all thanks to J.P. Ricciardi, who gave me my first job in the game (and, among other things, made “Joey Bagodonuts” a permanent part of my vocabulary), and Billy Beane, who helped convince J.P. to give me a shot. I also worked with some wonderful people in my four-plus years in Toronto, and have to single out Tony LaCava and Tommy Tanous for the time they spent with me at games, teaching the most basic aspects of scouting to someone who, for all my comfort with numbers, could barely tell a slider from a changeup when I first got there.

  And finally, I’d like to thank my wife and daughter for their incredible patience throughout the writing process, for all the times I was there but not really there, buried in my computer or stuck on the phone, turning out a 275-page book inside of nine months.

  Index

  The pagination of this electronic edition does not match the edition from which it was created. To locate a specific entry, please use your e-book reader’s search tools.

  Aaron, Hank, 33, 118, 119, 126

  Adcock, Joe, 28

  Adjusted Batting Runs (ABR), 15, 190–91

  African-American players, 214–15

  Alfonzo, Edgardo, 39

  Alien & Sedition Acts, 5

  Allen, Cody, 54–55

  Alomar, Roberto, 35, 35, 211, 213

  Altitude, playing at, 25. See also Coors Field

  Altuve, Jose, 11

  Alvarez, Dario, 48

  American Sports Medicine Institute, 238, 266

  Anaheim Angels, 79, 190, 251, 262

  Angel Stadium, 190

  Aparicio, Luis, 118, 172

  Applied math, 207–29

  Arbitration Projection Model, 50–51

  Area scouts, 232–34

  Arizona Diamondbacks, 24–25, 48, 100, 199, 208, 241

  Arm, The (Passan), 238

  Arm strength, 239

  Arrieta, Jake, 154, 154–55, 161, 197

  Arthur, Rob, 252

  “Artificial intelligence,” 250

  Assists, 79–80, 164–65, 169–70, 172

  At bat, 178–79

  batting average and, 9–17

  At Bat app, 247

  Athleticism, 234–35

  Atlanta Braves, 16, 24, 28, 31, 47–48, 145–46, 146, 199

  Atlanta Journal-Constitution, 31

  Aurilia, Rich, 34

  AVG. See Batting average

  BABIP (Batting Average on Balls In Play), 148–53, 155

  common formula, 148–49, 148n

  2015 NL Cy Young Award, 154–55

  Baker, Dusty, 33–34

  Ball, Trey, 234–35

  Baltimore Orioles, 43, 44, 131, 142

  Barfield, Jesse, 142

  “Barrels,” 253

  Baseball Between the Numbers, 65–66

  Baseball Hall of Fame. See Hall of Fame

  Baseball Info Solutions (BIS), 165, 166, 168

  Baseball myths, 85–106

  clutch hitters, 86–89

  “hot hand,” 102–5

  lineup construction, 93–96

  lineup protection, 90–93

  productive outs, 96–102

  Baseball Prospectus, 2, 63–64, 111, 134, 178, 179, 180

  Baseball-Reference, 15, 50, 72, 117, 143n, 164, 166, 172, 192, 196n, 201

  Baseball Research Journal, 86, 88

  Baseball scouts. See Scouting

  Baseball title creep, 202–3

  Baseball traditions, 2–3

  Baseball Writers’ Association of America (BBWAA), 3, 209

  Whitaker and, 210, 212

  Base-stealing

  catcher and, 176–77

  times caught stealing, 32, 60, 62–63, 64, 66–68, 67, 187

  Base Stealing Runs, 189–90

  Batting average, 9–17

  calculating, 11–12

  correlation analysis, 13–14

  history of use, 10

  slugging percentage compared with, 122, 126

  Batting Average on Balls In Play. See BABIP

  Batting Runs, 15, 115n, 133, 188–91

  Batting titles, 9–10, 15

  Bauer, Trevor, 55, 243

  Beane, Billy, 30

  Bellos, Alex, 103

  Beltre, Adrian, 147

  Benard, Marvin, 34, 34

  Bench, Johnny, 176

  Bequeathed runners, 144–45

  Berger, Mike, 241

  Betances, Dellin, 48–49, 50, 145, 176–77

  Betts, Mookie, 267

  Beyond the Box Score (blog), 177

  Biggio, Craig, 118

  Biomechanical analysis, 266

  Blair, Willie, 24

  Blyleven, Bert, 29–30, 215

  Boggs, Wade, 10, 40, 124

  Bolt, Usain, 251, 257

  Boltzmann, Ludwig, 267

  Bonds, Barry, 33–34, 92

  batting average, 12, 33

  NL MVP, 16

  OBP, 117, 126

  RBIs, 39

  slugging percentage, 122, 126

  2001 season, 33–34

  Book, The: Playing The Percentages In Baseball (Tango, Lichtman, and Dolphin), 87, 90–91, 94, 95

  Borowski, Joe, 49

  Boston (TV show), 4, 97–98

  Boston Red Sox, 27, 53, 110, 116, 142–43, 207, 226–27, 234–35, 267, 270

  Boswell, Thomas, 134

  Boxberger, Brad, 49

  Brach, Brad, 43–44

  Brenly, Bob, 100

  Brett, George, 80

  Britton, Zach, 43–44, 51

  Brock, Lou, 60, 67, 67–68, 118

  Brooklyn Dodgers, 60

  Brooks, Harold, 88

  Brown, Kevin, 81, 215–19, 216, 218

  Bryant, Kris, 253

  Buchter, Ryan, 263

  Bumgarner, Madison, 208

  Bunning, Jim, 216

  Bunts, 97–102

  Cabrera, Mauricio, 47–48

  Cabrera, Miguel, 11, 88, 100–101, 126, 135, 135, 253

  Carpenter, Chris, 199–200

  Carter, Joe, 35, 35–36, 39

  Cary, Chuck, 142

  Castellanos, Nick, 147

  Castillo, Luis, 66, 122, 122

  Castro, Jason, 264–65

  Catcher defense metrics, 176–81

  Catcher errors, 73–74

  Catcher framing, 177–81

  Caught stealing, 32, 60, 62–63, 64, 66–68, 67, 187

  Chadwick, Henry, 10, 12

  Chance, Dean, 222

  Chapman, Aroldis, 54, 145, 254

  Chase Field, 190

  Chass, Murray, 45

  Chesbro, Jack, 21–22

  Chicago Cubs, 4–5, 113–14, 115, 151, 154, 161, 207–8, 258

  Chicago Sun-Times, 45

  Chicago White Sox, 52, 53, 77, 85, 117, 175, 233, 237, 239

  ChyronHego, 247–48

  Cincinnati Reds, 160, 160–61, 175, 254

  Clark, Jack, 35, 35, 114, 257

  Clemens, Roger, 23–24, 225–26

  Cleveland Indians, 6, 54, 101–2, 174, 207–8

  Closers, 21, 219–21

  Proven Closers, 47, 49, 50–55, 145

  save rule and, 47–55

  Clutch and Win Probability Added (WPA), 157–59

  Clutch hitters, 86–89, 157–59, 274

  Clutch pitching, 47, 161

  Cobb, Ty, 10, 67, 118, 118

  Coleman, Vince, 39–40, 40, 60, 61

  Collective bargaining agreement, 113

  College players, 98, 100, 242–43, 271–72

  Collegesplits.com, 272

  Colon, Bartolo, 25

  Colorado Rockies, 25, 49, 66, 116, 196

  Complete games, 20, 27, 141

  Concepcion, Davey, 210

  Cook, Earnshaw, 109

  Coors Field, 49, 100, 136, 187–88, 195, 196

  Correa, Carlos, 169

  Correlation analysis, 13–14, 38

 
