In his remarkable comeback this year, Brian Baker has already recorded two top-20 scalps, along with seven other victories against players in the top 100. In the same span of six months, he’s also lost to a player barely inside the top 400, and suffered another six defeats against guys outside the top 100.
This is inconsistency of historic magnitude. The list of players he’s beaten may actually be more impressive than the list of those who have beaten him! Adding to the confusion, we don’t have any other recent results from him. We can’t just wave our hands and point to his 2011 performance level as an accurate indicator of his current level.
One measurement of player ability, the ATP ranking system, places him at #78, a number that seems just as ridiculous when he’s beating Philipp Kohlschreiber at a Masters event as when he’s losing to Maxime Authom at a challenger. But overall, the ATP estimate doesn’t seem too far-fetched. It’s certainly better than what jrank (my rating system) spits out. That algorithm doesn’t know what to do with such a limited track record, so it places him far outside the top 100.
We can do better. As we’ll see, Baker’s results suggest he belongs on the cusp of the top 50.
Uniquely limited results
Imagine a completely unknown player is given a wild card into a major event. We don’t know where he came from or who he might have beaten in the past. He’s a completely blank slate. If we wanted to estimate his ability level, we would have to wait until we got some results.
If that player won an opening-round match against the 17th-best player in the world, our best guess would be that he is better than #17, but we wouldn’t know how much. If he lost that opening round match, we would assume he is worse than #17. We might use statistics from that match to estimate how much better or worse than #17.
As our unknown kept playing more matches, we would update our estimate, using additional data as it came in.
(You might protest that in the early going, we should regress our estimate to the mean, since if some random guy came out of nowhere, he probably isn’t one of the 16 best tennis players in the world–there was a reason he was nowhere. And, in such a real-world scenario, you would be right. But in such a case, what is the mean? If a baseball player is called up from Triple-A, an intelligent observer, such as a scout or team executive, considers him at least marginally MLB-level, so we would regress our estimate to the level of marginal MLB players. But if a player receives a wild card into a tennis tournament, what do we know?)
Few tennis players in history have come closer to this unknown than Brian Baker. Sure, everyone has to start somewhere, but usually “somewhere” is a long string of futures tournaments, followed by an even longer string of challengers. By the time a player bags his first top-20 scalp, we have lots and lots of data to work with.
When other players were racking up several dozen matches every year, Brian Baker was rehabbing injuries and coaching college tennis. We can only judge him based on a small number of recent results. And those results are particularly contradictory.
Working backward
Intuitively, it’s tough to accept that a single player has beaten a bunch of good players and lost to several weaker ones. No matter how good that guy is, such a set of outcomes is unlikely.
But how unlikely? That question is the key to estimating Baker’s current level.
Rather than assuming Baker is playing at a certain level (like that of #78) and scratching our heads at his inconsistency, we can work backwards–take his results and determine the likelihood that he is playing at various levels.
For instance, we could assume that Baker is #5 in the world. If so, some of his results would be very predictable (like the two wins against Blake Strode) and others would be particularly jarring. We could go further and calculate the probability that the #5 player in the world would amass Baker’s specific match record. Those odds, of course, are vanishingly small.
If you repeat the process for every possible ranking, you get a probability that #5, or #12, or #77 would win the matches Baker has won and lose the matches he has lost. One of those probabilities will be higher than the others, and that’s our best guess of how highly we should regard the American.
(If you’re interested in the methodology, see the Methodology section below.)
Using this method, we discover that Baker has played at the level of someone with about 820 ATP ranking points, putting him around #54, in a tight pack with Grigor Dimitrov, Gilles Muller, Alejandro Falla, and Lukas Lacko. With every match he plays, we can continue to fine-tune our estimate.
There are many factors we need to ignore to do an analysis like this, largely because of the limited data that led us to the topic in the first place. Many of Baker’s worst results have come on hard courts; perhaps he will prove over a longer period to be stronger on clay and grass. If his ability level has changed over the last six months, as seems very likely, this approach fails to take it into consideration.
But because the unique nature of Baker’s comeback makes it difficult to assume anything about his ability level, this approach allows us to make a reasonably good guess. And with such a strange mix of great wins and rough losses, a good guess is all we can hope for.
Methodology
For this post, I tried to keep things as simple as possible. To estimate the skill of each of Baker’s opponents, I used ATP ranking points from the week of their match. To calculate the probability of winning a match, I used a variation of the simple formula a/(a+b), where a is one player’s ranking points and b is the opponent’s. (I add a bit of accuracy by raising a and b to the power 1.1, so the probability becomes a^1.1/(a^1.1 + b^1.1).)
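As a rough illustration, the formula can be sketched in a few lines of Python. The function name and the sample point totals are hypothetical, chosen only to show the arithmetic; the one detail taken from the post is the exponent of 1.1.

```python
def win_probability(a_points, b_points, exponent=1.1):
    """Chance that the player with a_points beats the player with b_points.

    Implements a^1.1 / (a^1.1 + b^1.1), the variation of a/(a+b)
    described above. The function name is made up for this sketch.
    """
    a = a_points ** exponent
    b = b_points ** exponent
    return a / (a + b)


# Hypothetical point totals, not taken from any real match.
even_match = win_probability(1000, 1000)   # equal points give a coin flip
underdog = win_probability(820, 1640)      # opponent has twice the points
print(round(even_match, 3), round(underdog, 3))
```

With the exponent at 1.1, a player facing an opponent with twice as many points wins a bit less than a third of the time, slightly less often than the plain a/(a+b) formula would suggest.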
To determine the most likely ranking point total for Baker, I tested every possible ranking point total to calculate the probability that such a player ended up with Baker’s results. The resulting odds are tiny, being the probability that a player wins and loses a specific sequence of 34 matches. Nonetheless, one ranking point total is more likely than the others, and that’s the result.
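The search itself might look something like the sketch below. It is not the code behind this post: the match list is invented for illustration (it is not Baker’s record), the upper bound of 5,000 points is an arbitrary assumption, and summing logarithms instead of multiplying the tiny raw probabilities is a convenience that leaves the winning point total unchanged.

```python
import math


def win_probability(a_points, b_points, exponent=1.1):
    # Same a^1.1 / (a^1.1 + b^1.1) formula described in the Methodology.
    a = a_points ** exponent
    b = b_points ** exponent
    return a / (a + b)


def log_likelihood(candidate_points, matches):
    """Log of the probability that a player with candidate_points
    produces exactly this sequence of wins and losses."""
    total = 0.0
    for opponent_points, won in matches:
        p = win_probability(candidate_points, opponent_points)
        total += math.log(p if won else 1.0 - p)
    return total


def most_likely_points(matches, max_points=5000):
    # Try every whole point total from 1 to max_points (an assumed cap)
    # and keep the one that makes the observed results most probable.
    return max(range(1, max_points + 1),
               key=lambda pts: log_likelihood(pts, matches))


# Invented results: (opponent's ranking points, did our player win?).
example_matches = [
    (1500, True), (900, True), (600, True),
    (400, False), (150, False), (100, True),
]
print(most_likely_points(example_matches))
```

A record needs at least one win and one loss for the search to settle on an interior value; an undefeated player would simply be pushed to whatever cap is chosen.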
The 34 matches cover all of Baker’s 2012 ATP, ATP qualifying, and Challenger matches, including his first-round win over Kohlschreiber on Monday.
WOW – this is really good work. Good thing it’s not this time consuming for players with a longer history of results at this level. However, it suggests what might be done for other levels of the game, college, junior, where players are constantly switching into new age groups, starting over, if you will.
Rick