Wednesday, October 1, 2008

Open Bar's man-crush on Nate Silver, plus some stuff about sabermetrics, and yet another sneaky insult of Side Bar

The brain is the biggest erogenous zone.

First off, you may have noticed on the left side of this here blog a lil' website called FiveThirtyEight. If you're following the current election, shame on you if you haven't already RSS'd that sucker. Because if you haven't -- and I'll put this bluntly -- you don't know shit about what's going on.

Second, I've occasionally referenced another lil' site over there called Fire Joe Morgan. That's prolly my favorite site, as it combines awesome baseball writing with funnier jokes than even I can come up with. (You'll also notice, observant reader, that FJM is in both the funny section and the sports section. Yeah, it's bilingual, and it's tops on both lists. Kinda like Salma Hayek. Wait, I don't mean I find Salma Hayek funny and sporty; I was going more for the bilingual thing. And that she's really hot. Maybe she is funny too, I dunno, I've never met her...that I know of.)

Anyway, FJM's writers view and analyze baseball in a nontraditional way, rooted heavily in sabermetrics, which sounds nerdy but ultimately makes for a much clearer and more accurate analysis of how good baseball players are. At one point on this site, I mocked Side Bar's usage of batting average -- traditionally the primary statistic used to determine how good a hitter is -- because sabermetrics has revealed that batting average, while interesting and somewhat revealing, is way way way overrated and overused by all the announcers we grew up hearing and all the writers we grew up reading. If you want to know how good a hitter is, before batting average you should be looking at on-base percentage and slugging percentage. After all, not making outs is really good -- OBP -- and a home run is way better than a single -- slugging -- so a stat that ignores such obviousness -- batting average -- shouldn't be held in such high regard, should it?
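For the number-curious, here's a rough Python sketch of why that matters. It uses the standard formulas for batting average, OBP, and slugging, but the two hitters (and the names `Hitter`, `singles_guy`, and `slugger`) are completely made up for illustration. Both of them hit .300, yet one is obviously a lot more valuable than the other.

```python
from dataclasses import dataclass


@dataclass
class Hitter:
    """Season totals for a hypothetical hitter (invented numbers, not a real player)."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int = 0
    sac_flies: int = 0

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs

    def batting_average(self) -> float:
        # AVG = H / AB. Every hit counts the same; walks are ignored.
        return self.hits / self.at_bats

    def on_base_percentage(self) -> float:
        # OBP = (H + BB + HBP) / (AB + BB + HBP + SF). Rewards not making outs.
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = self.at_bats + self.walks + self.hit_by_pitch + self.sac_flies
        return reached / chances

    def slugging_percentage(self) -> float:
        # SLG = total bases / AB. A home run is worth four times a single.
        total_bases = (self.singles + 2 * self.doubles
                       + 3 * self.triples + 4 * self.home_runs)
        return total_bases / self.at_bats


# Two invented hitters with identical batting averages.
singles_guy = Hitter(at_bats=500, singles=150, doubles=0, triples=0,
                     home_runs=0, walks=20)
slugger = Hitter(at_bats=500, singles=90, doubles=30, triples=0,
                 home_runs=30, walks=80)

if __name__ == "__main__":
    for name, h in (("singles_guy", singles_guy), ("slugger", slugger)):
        print(f"{name}: AVG {h.batting_average():.3f}, "
              f"OBP {h.on_base_percentage():.3f}, "
              f"SLG {h.slugging_percentage():.3f}")
```

Batting average can't tell these two apart. OBP and slugging can, which is the whole point.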

(And for the record, I wasn't very funny when I mocked Side Bar; it was actually rather snotty and I regret that. We all know how I hate to point out when he says something stupid around here -- cough, cough, Facebook, Sarah Palin, cough. My bust, SB. I should treat your wisdom with more respect. Mea culpa.)

But as our awesome new subheading indicates, we love talking baseball and politics, so believe it or not, I have a point here. As I said before, the guys at FJM were heavily influenced by sabermetrics, and one of the more prominent sabermetricians of the past decade or so is a guy named Nate Silver. Nate invented something called PECOTA, which, essentially, is a complex statistical model that predicts the future performance of baseball players and teams. 

While anyone paying attention could tell right away that someone like Albert Pujols was gonna be the shit because the dude could very clearly hit the crap out of a baseball at a very early point in his career, it's much harder to predict marginal players. Or basically anyone unless it's Pujols-level obvious. PECOTA's genius was to realize that since people have been playing baseball for over a hundred years and there are enormous amounts of data available about every player and team dating way the hell back, it's possible to discover patterns in that data that can help you a great deal in predicting how good your new first baseman is gonna be or how many games your team will probably win. PECOTA has been remarkably accurate in its ability to predict these things. 

A fairly well-known recent example was the 2007 Chicago White Sox. Even though the White Sox were just two years removed from winning the World Series, PECOTA predicted they would finish with a pathetic record of 72-90. Many people unfamiliar with PECOTA who heard this prediction laughed. But lo and behold, the 2007 Chicago White Sox finished 72-90. PECOTA was exactly right.

Nate Silver's model took into account the ages of the players on the White Sox, their recent performances, and the players they had lost and gained, and compared all of that to baseball's history, trusting that the patterns baseball had established over the last hundred years or so would hold. And they did.

It wasn't PECOTA's intent to give a big fucking middle finger to the conventional wisdom espoused by baseball "writers" like Mike Lupica and Jay Mariotti. It just so happens that if you can properly analyze empirical data, that often leads you to more accurate conclusions than "going with your gut" or any other such subjective "analysis."

This PECOTA-style sort of analysis, as anyone who has read Moneyball knows, is slowly overtaking the old ways in baseball. Teams like the Oakland A's in the late '90s/early 2000s and, more recently, the Boston Red Sox have reached new levels of success by valuing things like OBP and slugging in player evaluation, rather than the old standbys like batting average and RBIs. While other folks such as Bill James and Rob Neyer have long been at the forefront of this revolution in baseball analysis, Nate Silver's PECOTA deserves its place as a monumental achievement in that field.

And fortunately for us political junkies, Nate has turned his attention to politics. FiveThirtyEight takes the same sort of heavily data-based approach to predicting elections. I've been following the site since around March or so, when Hillary Clinton won the Texas and Ohio primaries over Barack Obama. Seven weeks later, she took Pennsylvania. This was when Hillary appeared to be peaking.

At the time, though I knew from folks like Chuck Todd that the delegate numbers told the true story -- that Hillary couldn't possibly catch Barack -- I was getting a little worried. After that night in Pennsylvania, there were two weeks until the North Carolina and Indiana primaries. If Hillary could somehow capture those, who knows? Maybe she might be able to win this thing.

If you were reading the papers and watching cable news at the time, you might remember hearing a lot of people saying that Hillary would probably win Indiana by 5 or 6 points. And since North Carolina had always been considered a lopsided Obama state, if Hillary could pull even within 7 or 8 -- as many predicted she would -- that could seriously turn things around for her.

So there you had it: The conventional wisdom was that Hillary would win Indiana easily, by 5 or 6 at least, and cut Barack's once-enormous 15-20 point lead down to 7 or 8 in North Carolina. It was gonna be a huge night for Hillary, cementing her "comeback kid" persona following her Ohio, Texas, and Pennsylvania wins.

But there was one site that said, in the words of Lee Corso, Not so fast, my friend.

Nate Silver made the absurd prediction that Barack would win North Carolina by 17 points and lose Indiana by a mere 2.

I remember reading those predictions that night and having two thoughts: 1. That's ridiculous; everyone else is saying something completely different; and 2. PECOTA is awesome, so maybe -- just maybe -- Nate could be onto something.

As it turned out, he was remarkably accurate. Obama won North Carolina by 14 points. He lost Indiana by 2. Nate got Indiana exactly right, and he was far closer on North Carolina than anyone else. It was oddly reminiscent of his 2007 Chicago White Sox call. Once again, it seemed, in-depth data-based analysis had won out over the conventional wisdom. Just as reading Moneyball and Nate's stuff over at Baseball Prospectus had drastically changed my way of interpreting baseball, it now seemed like I had to look at political polling and demographics in a brand new way. A better way.

Since then, I've followed FiveThirtyEight daily. Nate's analysis goes beyond simple number-crunching; he also happens to be a great writer. His posts often use humor to illuminate the more mundane statistical stuff. If people on the Obama and McCain campaigns don't read this site at least as often as I do, then that's political malpractice.

Just as I used to believe that if you could hit .300, you were therefore a great hitter, I used to listen to political analysts who told me that -- regardless of demographics or recent polling data -- they knew what was gonna happen. Not anymore. 

And, in conclusion, here's a perfect look at the marriage of baseball stats and political data, and how Nate Silver understands and explains those things better than just about anyone. And as a bonus, it's filmed at Shea Stadium (R.I.P.). 

(Full disclosure: This is a 20-minute interview with Dan Rather and Nate Silver. So, ah, you may want to take lunch.)

1 comment:

ChuckJerry said...

That was very interesting.

A similar site for poll lovers is