A couple of days ago, I wrote about how the cluster of players after the big four was itself cementing its hold on the next few ranking spots. Since then, Ferrer, Berdych, and Tsonga have all lost. Oops!
Of that group, only Juan Martin del Potro has survived, and according to my numbers, he poses a serious challenge to Roger Federer tomorrow. But that isn’t the tightest match. That honor goes to Isner/Simon, the most dramatic contrast of playing styles in the quarterfinals. They’ve played once before, at last year’s US Open, when Isner won in four sets, including three tiebreaks. It was even closer than it sounds: Simon won more than half of the points that day.
Here are the full odds for the rest of the tournament:
Player                       SF       F        W
(1)Novak Djokovic            84.8%    69.0%    44.5%
(12)Nicolas Almagro          15.2%     6.7%     1.6%
(13)Gilles Simon             48.1%    11.5%     3.4%
(11)John Isner               51.9%    12.8%     4.0%
(9)Juan Martin Del Potro     44.6%    21.0%     8.9%
(3)Roger Federer             55.4%    29.2%    13.9%
David Nalbandian             23.3%     6.5%     1.8%
(2)Rafael Nadal              76.7%    43.2%    21.9%
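For anyone curious how round-by-round odds like these can be built up from head-to-head forecasts, here is a minimal Python sketch. It is not the algorithm behind the numbers above. It assumes you already have some function giving the probability that one player beats another; the Elo-style formula, the player names, and the ratings below are all hypothetical placeholders. The idea is that a player's chance of winning a round is his chance of reaching it, multiplied by his chance of beating each possible opponent, weighted by how likely that opponent is to be there.

```python
from typing import Callable, Dict, List


def round_probabilities(
    bracket: List[str],
    win_prob: Callable[[str, str], float],
) -> List[Dict[str, float]]:
    """For each remaining round, return the probability that each player wins it.

    `bracket` lists players in draw order, so neighbors meet first.
    `win_prob(a, b)` is the probability that a beats b.
    """
    n = len(bracket)
    if n < 2 or n & (n - 1):
        raise ValueError("bracket size must be a power of two")

    # alive[p] = probability that p is still in the draw before this round
    alive = {p: 1.0 for p in bracket}
    results = []
    size = 2  # number of draw slots that feed into one match this round
    while size <= n:
        half = size // 2
        nxt = {}
        for start in range(0, n, size):
            left = bracket[start:start + half]
            right = bracket[start + half:start + size]
            for side, other in ((left, right), (right, left)):
                for p in side:
                    # p must reach this round, then beat whoever comes
                    # out of the opposite half of this part of the draw.
                    nxt[p] = alive[p] * sum(
                        alive[q] * win_prob(p, q) for q in other
                    )
        results.append(nxt)
        alive = nxt
        size *= 2
    return results


# Hypothetical ratings for eight made-up quarterfinalists, in draw order.
RATINGS = {
    "Player A": 2300, "Player B": 1950,
    "Player C": 2000, "Player D": 2010,
    "Player E": 2080, "Player F": 2150,
    "Player G": 1980, "Player H": 2250,
}
DRAW = list(RATINGS)


def elo_win_prob(a: str, b: str) -> float:
    """Elo-style logistic forecast; an assumption, not the blog's model."""
    return 1.0 / (1.0 + 10 ** ((RATINGS[b] - RATINGS[a]) / 400))


if __name__ == "__main__":
    sf, final, title = round_probabilities(DRAW, elo_win_prob)
    for p in DRAW:
        print(f"{p:10s} {sf[p]:6.1%} {final[p]:6.1%} {title[p]:6.1%}")
```

A quick sanity check on any table of this kind: with eight players left, the SF column should sum to 400%, the F column to 200%, and the W column to 100%.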
I really enjoy this blog, having discovered it maybe 6 months ago and re-discovered it a couple of weeks ago. But with Indian Wells we see the limits of stats. Stats meet life, and life in all its messiness conquers. More simplistically, stats seem more useful for some things (how soon a great player is likely to emerge) and less useful for others (predicting individual tourney winners). I am reading a couple of books right now that emphasize the blind artificiality of many commonly abused statistical constructions, e.g. there is in fact no “average man.” But hey I still enjoy the blog.
You mean that we see the limits of stats because the guys my algorithm identifies as favorites do not always win?
If you put it that way, my comment sounds obtuse. I’d rather throw it back to you as a question: why pick favorites? I know it’s the habit of sportswriters, who do it very badly and only because fans seem to want it. Betting lines interest me a little more, because there is risk involved and if there is a house (as I guess there is with many sporting bets) the house needs to make a profit. So there is a point there. But what’s the point, really, of trying to develop an algorithm for tourney favorites?
Or to put it another way, I like the pieces that analyze enough data to draw useful general conclusions – e.g. “Breaking In and Breaking Through,” etc. But this piece about Indian Wells? It puts you in the company of the sportswriters and the fans and everyone else.
In other words it’s more to me a question of what interesting things math can reveal – not whether math can essentially duplicate existing efforts that are not interesting and often wrong.
Although I will quickly add, my interest is not yours. Looking through the “research” section, I see you are trying to refine techniques & the game of doing so is almost more interesting than the game of tennis. Fair enough. I guess I’m just expressing my own preferences as a lay reader.
Or put it this way (and then, really, I will lay off), I’d rather see you apply your talents to something that is relatively unexplored & therefore intriguing, than to something obvious. E.g., the ATP ranking system is an economic instrument trying to sustain multiple stakeholders; what are the interesting differences between this ranking system and your own efforts to create a system that is more predictive? MUCH more interesting to me than “who’s going to win.”
I appreciate that you think highly enough of my ‘talents’ to care about what I do with them.
That said, I think you’re underestimating the value and interest of these numbers. No, it’s not groundbreaking research, and maybe you personally don’t care at all. That’s fine with me.
However, there are a lot of interesting questions to be answered with numbers like these — it isn’t about ‘picking favorites’, it’s about quantifying the degree to which a player is favored, both for the immediate next match and for the tournament itself. Obviously Isner was the underdog against Djokovic, but by how much? When a player enters a Masters event as a dominating top seed, what is his actual chance of winning the event? When a top-four player is knocked out early, what effect does it have on the chances of a player who might not have met him until the semifinals? Which betting lines might reflect bias on the part of the betting public? Sportswriters and fans like to speculate about that stuff; this is one of the few places you can answer those questions with something concrete.
Well, maybe I was speaking partly from envy. I love conceptual modeling, but don’t have good math understanding (although I did teach myself the math for combinations and permutations and variance, as part of playing poker). And I used to hack around with Python. So your work has interest for me just around those things, in addition to the tennis. Anyway I look forward to reading more.