Breaking In and Breaking Through

Yesterday we looked at players who broke into the top 100 when they were teenagers.  As expected, those guys generally went on to great success–17 of the last 25 eventually reached the top 10, and at least two more may still do so.

We can gain a broader perspective by analyzing more than just teenagers.  If a 19-year-old entering the top 100 is likely to become a top-10 player, what chances do 22-year-olds or 26-year-olds have?  By examining a few decades of the ATP ranking system, we can begin to answer these questions.

I used a sample of 590 players–everyone who entered the top 100 between 1980 and 2005.  (It’s possible that a few recent players will continue to improve, but the vast majority of players get close to their peak within five years, so 2005 seems like a reasonable cutoff date.)  A bit less than half of those 590 broke into the top 100 between the ages of 20 and 22, about a third were older, and the remainder were teenagers.

As you can see in the table below, there is a clear correlation between breaking into the top 100 at an early age and reaching the higher echelons of the pro game.  In the last 30 years, only one #1-ranked player (Pat Rafter) hadn’t reached the top 100 as a teenager, and he made it into the top 100 when he was 20.  Almost every eventual top-10 player had broken into the top 100 by age 21.

Age  Players  Top50  Top20  Top10  Top5  Top1
16         4   100%   100%   100%   75%   50%
17        16   100%    88%    69%   56%   38%
18        38    87%    76%    61%   34%   11%
19        61    89%    48%    41%   20%    8%
20        88    86%    48%    25%   13%    1%
21        99    63%    22%    12%    7%    0%
22        83    47%     8%     5%    2%    0%
23        61    44%    16%     3%    0%    0%
24        62    31%     3%     0%    0%    0%
25        32    25%     0%     0%    0%    0%
26        10    60%    10%     0%    0%    0%
27        16    31%     0%     0%    0%    0%
28        10    20%     0%     0%    0%    0%
29+       10     0%     0%     0%    0%    0%
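For anyone who wants to check the age split quoted earlier against the Players column, here is a minimal Python sketch. The counts are copied from the table; the variable and function names are just illustrative choices.

```python
# Player counts by age of first top-100 appearance, copied from the table above.
players_by_age = {
    "16": 4, "17": 16, "18": 38, "19": 61,
    "20": 88, "21": 99, "22": 83,
    "23": 61, "24": 62, "25": 32, "26": 10, "27": 16, "28": 10, "29+": 10,
}

total_players = sum(players_by_age.values())


def share(ages):
    """Fraction of the whole sample that broke in at any of the given ages."""
    return sum(players_by_age[age] for age in ages) / total_players


teen_share = share(["16", "17", "18", "19"])
mid_share = share(["20", "21", "22"])
older_share = share(["23", "24", "25", "26", "27", "28", "29+"])

print(f"sample size: {total_players}")
print(f"teenagers: {teen_share:.1%}, ages 20-22: {mid_share:.1%}, "
      f"23 and older: {older_share:.1%}")
```

The three shares come out to roughly 20%, 46%, and 34% of the 590 players, which matches the "remainder," "a bit less than half," and "about a third" description.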

It’s not entirely clear that these trends are consistent from decade to decade–yesterday, I noted that fewer teenagers had reached the top 100 in the last ten years or so.  It’s possible that as the quality of the game improves and a larger amount of training is necessary to prepare for the pro tour, there will be fewer prodigies like Nadal, who broke in at age 16, and Richard Gasquet, who arrived as a 17-year-old.

But even if the ages shift by a year or two, the overall conclusions should hold.  The older you are when you arrive in the top 100, the less likely it is that you will advance considerably further.

One obvious application of this data is to make predictions regarding players as they enter the top 100.  The last two men to break in are Benoit Paire (age 22) and Matthias Bachinger (age 24).  Paire is still young enough to have an outside shot at the top 10; Bachinger will have a hard time doing much better than #50.  Another recent newbie is Go Soeda, a 26-year-old.  To find someone who made a top-20 success out of so late a breakthrough, you have to go back to Steve Denton in the mid-80s.

Another way to use this information is to find top prospects among current players.  Among active tour pros, the four men who broke in at the youngest ages are Nadal, Gasquet, Juan Martin del Potro, and Novak Djokovic.  The next two might surprise you: Kei Nishikori and Donald Young.  Nishikori has only now recovered from battles with injury–perhaps he will start to make good on his promise.  Young may be a unique case–were it not for his many, many wildcards, he would not have reached the top 100 so early.

Another surprise is the active player with the 10th-youngest age-of-reaching-100: Evgeny Korolev.  The Russian has also struggled with injury, but he did crack the top 50 last year.

The more oft-mentioned “prospects” are a little further down the list.  Grigor Dimitrov broke in at 19.7 years of age, while Milos Raonic appeared just after his 20th birthday–a few days older than the first appearance of Mischa Zverev.  Alexander Dolgopolov is further down than you might expect, having broken in at age 21.3, while Ryan Sweeting didn’t get there until 23.5.

Of course, “age of first appearance in the top 100” is just one metric, and it doesn’t tell the whole story.  Perhaps players who spend several years in college account for that blip in the table at age 23–John Isner, for instance, didn’t reach the top 100 until he was nearly 23, and he has already hit a peak ranking of #18.  The metric might also underrate the chances of those who suffer prolonged injury at an early age–perhaps if Nishikori had lost his two years to injury one season sooner, he would have only recently reached the top 100 with the same skills and potential.

Warts and all, this angle is a good reminder of why we should keep a close eye on youngsters in the futures and challenger tours–the latest, greatest 23-year-old is almost guaranteed not to be the future of the sport.

Teenagers in the Top 100

If Ryan Harrison qualifies for the French Open and reaches the second round, he’ll probably break into the top 100. I wouldn’t bet on that degree of success at Roland Garros, but the relevant point is that the young American is close–if he falters in Paris, a couple of deep runs at challenger events will do the trick.

Harrison just turned 19, and he is the youngest player in the top 150. When Grigor Dimitrov turns 20 next Monday, Harrison will be the top-ranked 19-year-old in the world. There is a widespread sense that reaching the top 100 is one measure of “making it,” and an equally popular notion that if a player hits that benchmark at such a young age, he is probably destined for success.

Indeed, hitting the top 100 as a teenager is rare, and it’s getting even less common.  Of the 940 players who have spent time in the top 100 in the history of the ATP ranking computer, fewer than 150 (16%) broke in when they were teenagers.  Since the beginning of 2001, 209 players have broken in, including only 25 teenagers (12%).

As you might expect, those 25 have generally gone on to very successful careers.  20 have reached the top 20, and 17 have climbed into the top 10.  It’s even better than that, since in time, Dimitrov and Kei Nishikori seem likely to make those numbers 22 and 19 out of 25.

If Harrison breaks into the top 100 by the end of July, he’ll become the 20th youngest player to do so since the beginning of 2001.  If we want to get technical and limit the span to exactly 10 years, he’ll become the 16th youngest player since mid-2001.  (Early 2001 was a good time for teenagers, with Jose Acasuso, Andy Roddick, Mikhail Youzhny, and Tommy Robredo all reaching the top 100 in the span of three months.)

Incidentally, Bernard Tomic has a chance to make an even more impressive mark, as he is five and a half months younger than Harrison.  However, he’s 50 spots and 130 points lower on the ranking computer, so his appearance in the top 100 as a teenager seems far less assured.

After the jump, see the full list of teenagers who reached the top 100 since 2001.


The Odds of Breaking Back

Perhaps the most unquestioned piece of conventional wisdom in tennis is this: after breaking serve, a player is particularly vulnerable to being broken himself.  It certainly seems to be true–to take just one example, in the Isner/Karlovic match last week, there were only two breaks of serve, and they were consecutive.

As with most bits of conventional wisdom, it’s not clear exactly what people mean by it.  When Djokovic crushes someone 6-0 6-1, do we really think his serve is more vulnerable after each of his five or six breaks than it is after the one game his opponent holds?  When a player does break back, is he then more vulnerable in his next service game?

Today, I’ll try to address the more basic versions of the cliche.  The results are a bit surprising.

The dataset

I’m working with all of the 2011 Australian Open matches from courts where Hawkeye was in place.  That’s about 80 of the men’s singles matches, and roughly the same number of women’s matches.  I’ve run the numbers on both genders but will keep them separate, for reasons that will become clear.

These matches give us over 2,700 men’s games across about 300 sets, and nearly 2,000 women’s games over a bit more than 200 sets.

Breaking back: Men

At this year’s Aussie Open, 24% of all men’s games were service breaks.  If we take the conventional wisdom literally, we would hypothesize that in the game following a service break, another break would occur more than 24% of the time.

But it doesn’t.  In the game following a service break, the server is broken only 19.5% of the time.  (I’m excluding service breaks that end a set or take a set to a tiebreak.)  In other words, in the aggregate, a player is more likely to hold serve after breaking serve than he is after his opponent holds.
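To make the two rates concrete, here is a rough Python sketch of how they could be counted. The input layout (one list of True/False values per set, one entry per regular service game, tiebreaks left out) is an assumption for illustration, not the format of the actual Hawkeye data, and the two sets in the example are made up.

```python
def break_rates(sets):
    """Return (overall break rate, break rate in the game right after a break).

    `sets` is a list of sets; each set is a list of booleans, one per regular
    service game in playing order, True when the server was broken.
    Tiebreaks are assumed to be left out of the lists.
    """
    games = breaks = after_break_games = after_break_breaks = 0
    for set_games in sets:
        games += len(set_games)
        breaks += sum(set_games)
        # Pair each game with the next one in the same set. A break in the
        # last listed game (set over, or tiebreak next) has no partner, so it
        # drops out of the "after a break" sample automatically.
        for previous, current in zip(set_games, set_games[1:]):
            if previous:
                after_break_games += 1
                after_break_breaks += current
    overall = breaks / games if games else 0.0
    after = after_break_breaks / after_break_games if after_break_games else 0.0
    return overall, after


# Toy input, invented for illustration: a 6-3 set and a 6-4 set.
example_sets = [
    [False, True, False, False, False, False, False, False, False],
    [True, True, False, False, False, False, False, False, False, True],
]
overall, after = break_rates(example_sets)
print(f"all games: {overall:.1%}, games following a break: {after:.1%}")
```

Because each game is only paired with the next game in the same set, set-ending breaks and breaks that lead into a tiebreak are excluded from the second rate, mirroring the exclusion described above.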

Of course, as I suggested by mentioning Djokovic a moment ago, there’s a huge selection bias here.  A player who breaks serve is (all else equal) likely to be a better player than one who doesn’t.  The best players in the most lopsided matches are breaking serve frequently, and because they are the better player, it makes sense that they are more likely (again, all else equal) to hold their own serve.

Without looking at individual matchups, it’s not immediately clear how to address this problem.  For one thing, I’m not convinced it’s a problem.  When Federer broke Kohlschreiber today, a commentator may have said, “Roger is particularly vulnerable here, let’s see if he can consolidate the break.”  One could easily respond: “Roger just showed us he’s in tremendous form; the very fact he just broke serve is an indication that he’s less vulnerable than usual on serve right now.”  And so it proved: Roger broke four times; Kohlschreiber never broke back.

What might be more instructive is to look at situations where the player who broke serve is considered to be roughly equal or inferior to his opponent.  Had Kohlschreiber broken serve early in the match, even given the assumption that he must be playing well in order to do so, the conventional wisdom would suggest that Federer is more likely to break back.  Perhaps that’s true.  It’s not something I can answer today–quantifying the matchups is beyond the scope of this afternoon project.  It’s also problematic in that it would shrink our already-small dataset.

In any event, it is clear that we can’t take this bit of conventional wisdom at face value.  It may be true in certain scenarios–some players may crumple under the pressure of consolidating a break, and others may rise to the occasion after losing serve.  But it is wrong to say that, in general, players are more vulnerable on serve after a break.

Breaking back: Women

As you might expect, breaks of serve are more prevalent in the women’s game, as are breaks-following-breaks.

At the 2011 AO, women broke serve 36.5% of the time.  In games following breaks of serve, they broke 36.0% of the time.  In contrast to the men’s results, this suggests that in the women’s game, a service break doesn’t tell us as much about the strength of the player who has accomplished the break–or, if it does, that a server is more vulnerable after breaking serve.  Anecdotally, it certainly seems that differences in mental strength play a larger role in WTA matches, so I would expect that the break-back rate would be higher.

As I’ve said, this is far from the final word.  As usual, the conventional wisdom masks many subtleties that only further analysis can unearth.

Monte Carlo Projections

Clay court rankings and projections are tough.  Most of the top 50 ATP players only compete on clay for a couple months of the year, so at the beginning of the clay-court swing, we’re using surface-specific results from almost a year ago; at the end, we’re depending heavily on each player’s recent results.

Of course, the very top of the list is easy.  Rafael Nadal hasn’t lost a clay court match since the 2009 French Open.  It gets messy soon after that, since Nadal left the rest of the field fighting for crumbs.  Roger Federer is the clear #2 in this field, with Andy Murray a distant third.  (And maybe he should be even more distant.)

These projections are clay-specific, as you can probably tell by some of the percentages.  My clay rankings, however, are heavily regressed back to overall rankings so, for instance, Milos Raonic gets plenty of credit for his recent success on hard courts.  (And today he justified that credit.)

The tournament organizers made it tough for me to do pure projections, since four main draw matches were complete by the time qualifiers were placed.  Thus, the numbers below are done as if I didn’t know anything about the outcome of today’s main draw matches.  I used rankings generated last Monday, so I might be selling Potito Starace (and, to a lesser extent, Victor Hanescu) a little short by excluding results from Casablanca.
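For readers curious how round-by-round numbers of this kind can be produced, here is a minimal Python sketch of a bracket simulation. It is not the system behind the projections in this post: the `win_prob` formula, the ratings, and the four-slot draw are all placeholder assumptions, and a bye is represented by `None`.

```python
import random
from collections import Counter


def win_prob(rating_a, rating_b):
    """Placeholder assumption: chance that A beats B, from two positive ratings."""
    return rating_a / (rating_a + rating_b)


def play_round(field, ratings, rng):
    """Play adjacent pairs in bracket order and return the winners."""
    winners = []
    for a, b in zip(field[0::2], field[1::2]):
        if a is None or b is None:
            # A bye: whoever is actually present advances.
            winners.append(b if a is None else a)
        elif rng.random() < win_prob(ratings[a], ratings[b]):
            winners.append(a)
        else:
            winners.append(b)
    return winners


def reach_odds(draw, ratings, runs=20000, seed=0):
    """Estimate P(player survives round k) for every player and round.

    `draw` lists players in bracket order; its length must be a power of two.
    """
    rng = random.Random(seed)
    reached = Counter()
    for _ in range(runs):
        field = list(draw)
        round_no = 0
        while len(field) > 1:
            field = play_round(field, ratings, rng)
            round_no += 1
            for player in field:
                if player is not None:
                    reached[(player, round_no)] += 1
    return {key: count / runs for key, count in reached.items()}


# Hypothetical draw: player "A" has a first-round bye.
draw = ["A", None, "B", "C"]
ratings = {"A": 3.0, "B": 2.0, "C": 1.0}
odds = reach_odds(draw, ratings, runs=5000, seed=1)
for (player, round_no), p in sorted(odds.items()):
    print(f"{player} survives round {round_no}: {p:.1%}")
```

With enough runs, the simulated frequencies settle near the exact values; the appeal of simulating rather than calculating is that the same loop works for any draw size without extra bookkeeping.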

Enjoy!

 

Player                   R32   R16    QF    SF     F     W
(1)Rafael Nadal         100% 92.7% 79.9% 66.8% 52.4% 36.6%
(q)Julien Benneteau      54%  4.4%  1.7%  0.6%  0.2%  0.0%
Jarkko Nieminen          46%  2.9%  1.0%  0.3%  0.1%  0.0%
(q)Vincent Millot        26%  5.6%  0.3%  0.1%  0.0%  0.0%
Guillermo Garcia-Lopez   74% 31.8%  5.5%  2.1%  0.7%  0.2%
Denis Istomin            27% 12.8%  1.3%  0.4%  0.1%  0.0%
(13)Richard Gasquet      73% 49.8% 10.3%  5.3%  2.4%  0.9% 

Player                   R32   R16    QF    SF     F     W
(12)Jo-Wilfried Tsonga   59% 35.2% 17.0%  4.0%  1.7%  0.6%
Juan Monaco              41% 21.6%  8.9%  1.8%  0.7%  0.2%
Ivan Ljubicic            55% 25.5% 10.5%  2.2%  0.9%  0.2%
Jeremy Chardy            45% 17.7%  6.3%  1.1%  0.3%  0.1%
(q)Olivier Rochus        46%  8.7%  2.6%  0.3%  0.1%  0.0%
Juan Ignacio Chela       54% 12.4%  4.2%  0.6%  0.1%  0.0%
(5)(WC)Tomas Berdych    100% 78.9% 50.5% 14.7%  7.8%  3.2% 

Player                   R32   R16    QF    SF     F     W
(3)(WC)Andy Murray      100% 70.6% 54.3% 38.4% 16.5%  8.7%
(WC)Radek Stepanek       30%  5.9%  2.5%  0.9%  0.2%  0.0%
Marcos Baghdatis         70% 23.5% 14.5%  7.8%  2.3%  0.8%
Albert Montanes          64% 30.9%  8.9%  4.0%  0.8%  0.2%
Xavier Malisse           36% 11.1%  2.0%  0.6%  0.1%  0.0%
Thomaz Bellucci          52% 30.9% 10.1%  4.7%  1.1%  0.3%
(16)Gilles Simon         48% 27.0%  7.7%  3.4%  0.7%  0.2% 

Player                   R32   R16    QF    SF     F     W
(10)Mikhail Youzhny      52% 38.1% 18.3%  7.0%  1.7%  0.5%
Florian Mayer            48% 33.3% 13.8%  4.8%  1.1%  0.3%
(q)Frederico Gil         41% 10.2%  2.1%  0.3%  0.0%  0.0%
Sergiy Stakhovsky        59% 18.4%  5.0%  1.1%  0.1%  0.0%
Daniel Gimeno-Traver     40%  7.9%  2.9%  0.5%  0.1%  0.0%
Santiago Giraldo         60% 15.6%  6.6%  1.8%  0.2%  0.1%
(8)Gael Monfils         100% 76.5% 51.3% 24.7%  7.7%  3.0% 

Player                   R32   R16    QF    SF     F     W
(6)Fernando Verdasco    100% 71.3% 54.5% 31.0% 13.6%  5.2%
Tommy Robredo            73% 24.6% 15.1%  5.7%  1.6%  0.4%
Ivan Dodig               27%  4.0%  1.5%  0.3%  0.0%  0.0%
Kevin Anderson           44% 16.8%  3.4%  0.7%  0.1%  0.0%
Fabio Fognini            56% 24.1%  6.0%  1.5%  0.3%  0.0%
(WC)Jean-Rene Lisnard    14%  3.4%  0.3%  0.0%  0.0%  0.0%
(11)Viktor Troicki       86% 55.6% 19.2%  6.5%  1.7%  0.4% 

Player                   R32   R16    QF    SF     F     W
Ernests Gulbis           67% 44.8% 22.0% 12.2%  5.0%  1.8%
(14)Alexandr Dolgopolov  33% 16.9%  5.5%  2.2%  0.6%  0.1%
Milos Raonic             59% 24.4%  8.8%  3.9%  1.1%  0.3%
Michael Llodra           41% 13.9%  4.1%  1.4%  0.3%  0.1%
Janko Tipsarevic         55% 13.5%  5.3%  1.9%  0.4%  0.1%
Feliciano Lopez          45% 10.7%  3.7%  1.2%  0.3%  0.0%
(4)David Ferrer         100% 75.8% 50.7% 31.4% 14.3%  6.0% 

Player                   R32   R16    QF    SF     F     W
(7)Jurgen Melzer        100% 54.7% 29.3%  9.3%  4.2%  1.3%
Nikolay Davydenko        72% 37.5% 20.3%  6.8%  3.2%  1.1%
Robin Haase              28%  7.8%  2.6%  0.5%  0.1%  0.0%
(q)Maximo Gonzalez       55% 12.3%  2.8%  0.3%  0.1%  0.0%
Victor Hanescu           45%  8.6%  1.8%  0.2%  0.0%  0.0%
Marcel Granollers        16%  8.5%  2.1%  0.3%  0.1%  0.0%
(9)Nicolas Almagro       84% 70.5% 41.1% 15.2%  7.7%  2.8% 

Player                   R32   R16    QF    SF     F     W
(15)Marin Cilic          86% 68.2% 19.9% 10.4%  4.8%  1.5%
(q)Filippo Volandri      14%  5.7%  0.6%  0.1%  0.0%  0.0%
(q)Pere Riba             43%  9.7%  1.0%  0.2%  0.0%  0.0%
Potito Starace           57% 16.4%  1.7%  0.3%  0.1%  0.0%
Philipp Kohlschreiber    61% 11.6%  6.4%  3.0%  1.2%  0.3%
Andrey Golubev           39%  5.2%  2.4%  0.9%  0.3%  0.1%
(2)Roger Federer        100% 83.2% 68.1% 52.6% 39.0% 22.1%

Net-rushing, or The Stats We Don’t Have

In yesterday’s morning recap, I made the following comment about Nadal’s baseline game:

The one baffling thing is Nadal’s reluctance to come to net.   He was often standing right on the baseline, even hitting groundstrokes from a step inside the baseline.  Yet he almost never came forward unless forced.  Even with an imperfect net game, even against the passing-shot machine that is Djokovic, I think he would’ve been more successful taking advantage of some of those offensive positions.

In the comments, Tom Welsh laid out the flip side of the argument concisely:

During the Nadal-Djokovic match yesterday I noticed several occasions when each of those brilliant players came in to the net and was left looking like a hopeless beginner – either by a passing shot, or a sizzling ground stroke to the short ribs, or by a perfect lob landing just a couple of feet inside the baseline. I’m not tennis player, but it seems to me that no one can afford to come in these days unless the opponent is stretched to the breaking point. Even then, it’s taking a big risk.

That’s the argument in a nutshell.  Even more briefly:

  • PRO: Players should be more aggressive and come forward more often.
  • CON: In the modern-day game, approaching the net is usually too risky.

Which is it?

Pick your poison

The first thing that needs to be understood is that, against an elite tennis player, anything is a risk.  Short of a decisive smash, any shot you hit is likely to come back, and there’s a non-zero chance that what comes back is going to be a winner.  Choosing to come forward isn’t a decision between risk and no risk, it’s a matter of degree.

The main difference is that, if you come forward and fail, you’ll look like a fool, and your opponent will look brilliant, in the ways Tom described.  If you stay back and fail, it’s somehow more understandable–in a 15-stroke rally between top players, somebody has to lose.  Of course, you lose the point either way.

One of the problems of arguing this point with anecdotal evidence is that I think we, as both fans and players, remember the brilliant passing shots and jaw-dropping lobs.  If you rush the net and your opponent misses what would’ve been a sensational running forehand, you remember the amazing shot-that-almost-was.  Human brains don’t default to probabilistic calculations, while brilliant moments catch and keep our attention.

Commentators, steeped in strategy of the 70’s and 80’s, will always want too much net-rushing.  Most players will tend to stay back too much.  If we can ever establish the proper opportunities to come forward, the “correct” answer will turn out to be somewhere in between.

Where the stats fail

Answering the question analytically will be very difficult, and given the information currently out there, it’s flat-out impossible.

In the meantime, let’s think through what it would take to answer the question.  Starting with what I take to be the question itself:

Given his skillset, his opponent’s skillset, and each player’s position on the court, when should a player come forward?

That’s a lot of stuff we can’t quantify.  Even if we posit a couple of generic pro players, it’s still an unanswerable question.

Particularly useless are existing net stats.  Occasionally during a match, a broadcast will show us that so-and-so has won 5 out of 8 points at net.  The commentators reliably chime in, usually suggesting that the player has better net skills than we give him credit for (perhaps he’s been playing some doubles lately), and that he would benefit by coming in more.

In most cases, those 8 points couldn’t be less relevant.  Think back to Sunday’s Nadal-Djokovic match.  Much of the time Nadal came forward, it was in response to a Djokovic drop shot.  In other words, Nadal came forward on the defense!  I’m guessing he lost most of those points.  On the flip side, imagine Del Potro cracking a serve out wide, then coming in behind it to hit a swinging volley winner.  That’s 1-for-1 on the net point tally, but it doesn’t say a thing about Delpo’s deftness of touch around the net.

A framework

Let’s imagine that we suddenly had access to Hawkeye’s shot-by-shot data.  We’d know the hit point for every ball of every point of every match where the Hawkeye system was installed.  (Drool.)

If we knew that, we could come up with a fairly simple model to estimate the likelihood of winning a point from any position on the court, against a certain quality of shot.  Standing at the middle of the baseline smacking a 60 mph service return, you might have a 70% chance of winning the point.  Stuck in the backhand corner after your opponent has cracked a 90 mph groundstroke, it might be more like 20%.

The details of the model aren’t important.  What matters is that, with a certain data set, we could estimate the probability of winning a point given a variety of conditions.
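As one possible starting point, the simplest version of such an estimate is just a table of observed win rates per condition. The Python sketch below assumes a made-up record layout of (court zone, incoming shot speed, whether the hitter won the point); the zone names, the 15 mph bucket width, and the sample records are all hypothetical, since no such dataset is available.

```python
from collections import defaultdict


def speed_bucket(mph, width=15):
    """Label a shot speed with the fixed-width bucket it falls into."""
    low = int(mph // width) * width
    return f"{low}-{low + width} mph"


def win_probability_table(shots, min_samples=1):
    """Estimate P(hitter wins the point) per (zone, incoming-speed bucket).

    `shots` is an iterable of (zone, incoming_mph, hitter_won_point) tuples.
    Buckets with fewer than `min_samples` observations are left out.
    """
    tally = defaultdict(lambda: [0, 0])  # key -> [points won, points seen]
    for zone, incoming_mph, won in shots:
        key = (zone, speed_bucket(incoming_mph))
        tally[key][0] += int(won)
        tally[key][1] += 1
    return {key: won / seen for key, (won, seen) in tally.items()
            if seen >= min_samples}


# Invented records, purely to show the shape of the input and output.
shots = [
    ("baseline_middle", 60, True),
    ("baseline_middle", 62, True),
    ("baseline_middle", 65, False),
    ("backhand_corner", 90, False),
    ("backhand_corner", 92, False),
    ("backhand_corner", 95, True),
    ("backhand_corner", 98, False),
    ("backhand_corner", 100, False),
]
table = win_probability_table(shots)
for key, p in sorted(table.items()):
    print(key, f"{p:.0%}")
```

A real model would need far more data per bucket, and probably smoothing across neighboring buckets, before any single number could be trusted.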

Extending this framework to analyze a tactic like net-rushing wouldn’t be that complicated.  Let’s say Nadal is standing on the baseline with a 70% chance of winning the point.  No matter what he does afterward, he will probably hit a forehand into one corner or the other, after which we can once again estimate his probability of winning the point.  From there, he has two choices: Come forward, or stay back.  Some game theory might get involved, since his opponent will probably see him approach the net and may change his own strategy accordingly.

Again, we can work out the details when there is data to play with.  Given these relatively simple figures, we could estimate Nadal’s probability of winning the point coming forward behind his forehand and staying back after hitting the shot.
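A bare-bones version of that comparison might look like the Python sketch below. It assumes we already have a set of comparable points, each tagged with whether the player came forward and whether he won; the sample numbers are invented. It ignores the game-theory wrinkle entirely and simply reports the two win rates, their difference, and the usual standard error for a difference of two proportions.

```python
import math


def compare_tactics(points):
    """Compare win rates when coming forward versus staying back.

    `points` is an iterable of (came_forward, won_point) booleans taken from
    comparable situations (same player, similar court position and shot).
    """
    won = {True: 0, False: 0}
    seen = {True: 0, False: 0}
    for came_forward, won_point in points:
        seen[bool(came_forward)] += 1
        won[bool(came_forward)] += int(won_point)
    if not seen[True] or not seen[False]:
        raise ValueError("need at least one point for each tactic")
    p_forward = won[True] / seen[True]
    p_back = won[False] / seen[False]
    # Standard error of the difference between two independent proportions.
    std_err = math.sqrt(p_forward * (1 - p_forward) / seen[True]
                        + p_back * (1 - p_back) / seen[False])
    return {"forward": p_forward, "back": p_back,
            "edge": p_forward - p_back, "std_err": std_err}


# Invented sample: 7 of 10 won coming forward, 12 of 20 won staying back.
sample = ([(True, True)] * 7 + [(True, False)] * 3
          + [(False, True)] * 12 + [(False, False)] * 8)
result = compare_tactics(sample)
print(result)
```

In this toy sample the edge for coming forward is smaller than its standard error, which is a reminder of how many points it would take before a gap like this meant anything.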

The numbers would give a better way of judging whether a particular play is advisable.  For Nadal, it may turn out that staying back is always smarter–after all, the numbers will probably tell us that, from any given position, he has a better chance than nearly anyone else of winning the point.  But, say, Ivo Karlovic may be better off coming in behind the exact same shot from the same position.  There’s a continuum between the extremes, of course, and we’ll need to know a lot more before we know what that looks like.

In the meantime, I’d still like to see Nadal come forward–and I’ll try harder to remember the times when his opponent goes for a blistering passing shot and misses.

Miami Projections

While Roger Federer still holds a very slight edge in my hard-court rankings, Novak Djokovic is the favorite to win in Miami.  My simulation gives the Serbian a 27.4% chance of winning back-to-back tournaments, while Federer comes in at 19.7%.

For more background on how I generate these projections, click here.  I’ve tweaked the system a bit since then; most notably, I discovered that my rankings were slightly underrating the chances of younger players and overrating those of older players.  I’ve adjusted my forecasts accordingly.

Enjoy!

Player               R64  R32   R16    QF    SF     F     W 
(1)Rafael Nadal      100% 84% 76.8% 62.0% 48.8% 28.5% 14.3% 
Jeremy Chardy        50%   9%  5.2%  2.3%  0.9%  0.2%  0.0% 
Kei Nishikori        50%   8%  4.5%  1.9%  0.7%  0.2%  0.0% 
Feliciano Lopez      69%  43%  7.3%  2.7%  0.9%  0.2%  0.0% 
Ricardas Berankis    31%  12%  1.5%  0.4%  0.1%  0.0%  0.0%
(26)Juan I Chela     100% 45%  4.8%  1.4%  0.4%  0.0%  0.0% 
(21)Dolgopolov       100% 74% 40.0% 12.8%  5.8%  1.7%  0.4% 
Andreas Seppi        52%  14%  4.3%  0.7%  0.2%  0.0%  0.0% 
Mischa Zverev        48%  12%  3.7%  0.7%  0.2%  0.0%  0.0% 
Teymuraz Gabashvili  44%   9%  2.4%  0.2%  0.0%  0.0%  0.0% 
Daniel Brands        56%  13%  3.6%  0.6%  0.1%  0.0%  0.0% 
(15)JW Tsonga        100% 78% 45.9% 14.4%  7.5%  2.6%  0.6% 

Player               R64  R32   R16    QF    SF     F     W
(11)Nicolas Almagro  100% 87% 54.4% 25.3%  8.3%  2.7%  0.6% 
Frederico Gil        61%  10%  2.8%  0.4%  0.0%  0.0%  0.0%
(q)Paul Capdeville   39%   4%  0.8%  0.1%  0.0%  0.0%  0.0% 
Leonardo Mayer       35%  12%  3.6%  0.9%  0.1%  0.0%  0.0% 
(WC)Ivo Karlovic     65%  30% 11.7%  4.2%  1.0%  0.2%  0.0% 
(20)Albert Montanes  100% 58% 26.7%  9.5%  2.5%  0.6%  0.1% 
(28)Ernests Gulbis   100% 91% 34.6% 17.1%  5.0%  1.3%  0.3% 
Carlos Berlocq       82%   8%  0.5%  0.1%  0.0%  0.0%  0.0% 
(WC)Jack Sock        18%   1%  0.0%  0.0%  0.0%  0.0%  0.0% 
Adrian Mannarino     72%  11%  3.5%  0.9%  0.1%  0.0%  0.0% 
Ramirez Hidalgo      28%   0%  0.0%  0.0%  0.0%  0.0%  0.0% 
(7)Tomas Berdych     100% 89% 61.4% 41.5% 17.4%  7.5%  2.3% 

Player               R64  R32   R16    QF    SF     F     W
(3)Roger Federer     100% 92% 80.4% 64.1% 50.0% 34.3% 19.7% 
Fabio Fognini        39%   3%  1.4%  0.4%  0.1%  0.0%  0.0% 
Radek Stepanek       61%   5%  2.5%  0.9%  0.3%  0.1%  0.0% 
Sergiy Stakhovsky    56%  18%  1.8%  0.5%  0.1%  0.0%  0.0% 
(q)Grigor Dimitrov   44%  11%  0.9%  0.2%  0.0%  0.0%  0.0% 
(32)Juan Monaco      100% 70% 13.0%  5.5%  2.2%  0.6%  0.1% 
(22)Marcos Baghdatis 100% 85% 54.0% 18.1%  9.6%  4.0%  1.1% 
Blaz Kavcic          43%   7%  1.9%  0.2%  0.0%  0.0%  0.0% 
(q)Olivier Rochus    57%   8%  2.2%  0.1%  0.0%  0.0%  0.0% 
Pere Riba            39%   6%  0.8%  0.0%  0.0%  0.0%  0.0% 
Yen-Hsun Lu          61%  14%  2.2%  0.2%  0.0%  0.0%  0.0% 
(13)Mikhail Youzhny  100% 81% 38.9%  9.7%  4.3%  1.4%  0.3% 

Player               R64  R32   R16    QF    SF     F     W
(10)Jurgen Melzer    100% 76% 37.0% 16.3%  4.8%  1.8%  0.4% 
Philipp Petzschner   64%  18%  3.9%  0.9%  0.1%  0.0%  0.0% 
Florent Serra        36%   6%  0.9%  0.1%  0.0%  0.0%  0.0% 
Janko Tipsarevic     57%  15%  6.5%  2.2%  0.5%  0.1%  0.0% 
Robin Haase          43%   7%  2.5%  0.7%  0.1%  0.0%  0.0% 
(18)Marin Cilic      100% 78% 49.1% 26.4%  9.1%  4.0%  1.2% 
(25)Gilles Simon     100% 78% 34.7% 16.9%  4.7%  1.4%  0.3% 
(WC)Ryan Harrison    62%  17%  3.7%  0.9%  0.1%  0.0%  0.0% 
(q)Rainer Schuettler 38%   5%  0.3%  0.0%  0.0%  0.0%  0.0% 
Pablo Cuevas         52%  10%  3.9%  1.1%  0.2%  0.0%  0.0% 
Michael Berrer       48%   6%  2.0%  0.5%  0.0%  0.0%  0.0% 
(8)Andy Roddick      100% 83% 55.3% 33.9% 13.5%  6.4%  2.3% 

Player               R64  R32   R16    QF    SF     F     W
(6)David Ferrer      100% 96% 65.3% 40.1% 18.5%  5.9%  2.5% 
(q)Robert Kendrick   49%   2%  0.3%  0.0%  0.0%  0.0%  0.0% 
(q)Igor Kunitsyn     51%   2%  0.3%  0.0%  0.0%  0.0%  0.0% 
Somdev Devvarman     49%   6%  0.8%  0.1%  0.0%  0.0%  0.0% 
Potito Starace       51%   8%  0.7%  0.1%  0.0%  0.0%  0.0% 
(31)Milos Raonic     100% 86% 32.6% 16.0%  5.9%  1.6%  0.5% 
(23)Michael Llodra   100% 67% 26.5%  9.1%  2.6%  0.5%  0.1% 
Xavier Malisse       67%  25%  5.6%  1.2%  0.2%  0.0%  0.0% 
(q)Ryan Sweeting     33%   8%  1.4%  0.2%  0.0%  0.0%  0.0% 
Benjamin Becker      54%   9%  3.7%  0.6%  0.1%  0.0%  0.0% 
Marcel Granollers    46%   8%  2.4%  0.4%  0.1%  0.0%  0.0% 
(12)Stan Wawrinka    100% 83% 60.3% 32.1% 14.5%  4.5%  1.8% 

Player               R64  R32   R16    QF    SF     F     W
(14)Mardy Fish       100% 73% 40.4% 12.3%  5.5%  1.4%  0.4% 
Julien Benneteau     61%  18%  6.0%  0.8%  0.2%  0.0%  0.0% 
Gimeno-Traver        39%   8%  1.9%  0.2%  0.0%  0.0%  0.0% 
Ivan Ljubicic        85%  39% 19.3%  4.8%  1.7%  0.3%  0.1% 
(q)Paolo Lorenzi     15%   1%  0.1%  0.0%  0.0%  0.0%  0.0% 
(17)Richard Gasquet  100% 60% 32.3%  9.5%  4.0%  0.8%  0.2% 
(29)Kohlschreiber    100% 16%  5.1%  2.2%  0.7%  0.1%  0.0% 
(PR)Del Potro        100% 83% 50.8% 38.6% 25.8% 11.5%  6.1% 
Ricardo Mello         0%   0%  0.0%  0.0%  0.0%  0.0%  0.0%
Ivan Dodig           38%   5%  0.4%  0.1%  0.0%  0.0%  0.0% 
Andrey Golubev       62%  12%  1.6%  0.5%  0.1%  0.0%  0.0% 
(4)Robin Soderling   100% 84% 42.1% 31.0% 20.1%  8.5%  4.2% 

Player               R64  R32   R16    QF    SF     F     W
(5)Andy Murray       100% 99% 87.6% 65.1% 30.5% 18.9% 10.8% 
Victor Hanescu       55%   0%  0.1%  0.0%  0.0%  0.0%  0.0% 
(q)Alex Bogomolov    45%   0%  0.0%  0.0%  0.0%  0.0%  0.0% 
Santiago Giraldo     56%  21%  2.4%  0.6%  0.1%  0.0%  0.0% 
Igor Andreev         44%  15%  0.9%  0.2%  0.0%  0.0%  0.0% 
(30)John Isner       100% 65%  8.9%  3.5%  0.5%  0.2%  0.0% 
(24)Garcia-Lopez     100% 35% 15.9%  3.1%  0.5%  0.1%  0.0% 
Nikolay Davydenko    87%  61% 39.6% 15.1%  4.9%  2.3%  1.0% 
Kevin Anderson       13%   3%  0.8%  0.0%  0.0%  0.0%  0.0% 
(WC)Bernard Tomic    68%  11%  1.8%  0.2%  0.0%  0.0%  0.0% 
Pablo Andujar        32%   3%  0.3%  0.0%  0.0%  0.0%  0.0% 
(9)Verdasco          100% 86% 41.5% 12.0%  3.1%  1.1%  0.4% 

Player               R64  R32   R16    QF    SF     F     W
(16)Viktor Troicki   100% 77% 43.0%  5.9%  1.4%  0.4%  0.1% 
Tobias Kamke         51%  13%  3.3%  0.1%  0.0%  0.0%  0.0% 
(q)Marsel Ilhan      49%  10%  2.6%  0.2%  0.0%  0.0%  0.0% 
Jarkko Nieminen      59%  22%  9.6%  0.8%  0.1%  0.0%  0.0% 
Mikhail Kukushkin    41%  12%  4.1%  0.4%  0.1%  0.0%  0.0% 
(19)Sam Querrey      100% 66% 37.4%  6.6%  1.9%  0.5%  0.1% 
(27)Thomaz Bellucci  100% 69%  8.5%  3.9%  0.8%  0.2%  0.0% 
(q)Michael Russell   31%   7%  0.1%  0.0%  0.0%  0.0%  0.0% 
(WC)James Blake      69%  25%  0.4%  0.1%  0.0%  0.0%  0.0% 
(q)Donald Young      47%   3%  1.1%  0.3%  0.1%  0.0%  0.0% 
Denis Istomin        53%   0%  0.2%  0.1%  0.0%  0.0%  0.0% 
(2)Novak Djokovic    100% 97% 89.7% 81.5% 55.9% 41.2% 27.4%

Hard Court Singles Rankings, 3/21/11

About two weeks ago, I introduced my ranking system.  Much of the rationale is explained here.  The important thing to keep in mind is that the system is designed to be predictive–that is, it values the things that tend to correctly forecast the outcome of matches.

Since then, I’ve made a few tweaks under the hood.  For the most part, the changes don’t affect the rankings, they just adjust the differences between players to better reflect surface-specific skills.

Still, Roger Federer is hanging on at the top, though it’s so close that it should be considered virtually a tie.  My algorithm to predict the outcome of individual matches also takes head-to-head results into account, and given Novak Djokovic’s recent dominance, that algorithm now gives Djokovic the slight edge in a battle with Federer.

The real value, here, is a little further down the list, as this system is much better than the ATP rankings at measuring the skill level of players who are scoring big upsets and enjoying recent success.  To wit, Ivo Karlovic is up to #33 here, in part thanks to his giant-killing run last week.  Also, my system places Ryan Harrison at #71 and Donald Young at #82 for similar reasons.

I intended for this to be a top 100, but #101 is Somdev Devvarman, notable due to his string of upsets, which moved him all the way up from #147.

1   Roger Federer          8191 
2   Novak Djokovic         8076 
3   Andy Murray            4749 
4   Rafael Nadal           4654 
5   Robin Soderling        4205 
6   Juan Martin del Potro  4047 
7   Nikolay Davydenko      2853 
8   David Ferrer           2772 
9   Stanislas Wawrinka     2660 
10  Andy Roddick           2494 
11  Tomas Berdych          2268 
12  Gael Monfils           2088 
13  Marcos Baghdatis       1879 
14  Mardy Fish             1838 
15  Marin Cilic            1666 
16  Fernando Verdasco      1603 
17  Jurgen Melzer          1565 
18  David Nalbandian       1547 
19  Jo-Wilfried Tsonga     1475 
20  Ivan Ljubicic          1449 

21  Michael Llodra         1385 
22  Richard Gasquet        1367 
23  Florian Mayer          1335 
24  Milos Raonic           1308 
25  Mikhail Youzhny        1276 
26  Gilles Simon           1235 
27  Nicolas Almagro        1209 
28  Alexander Dolgopolov   1124 
29  Guillermo Garcia-Lopez 1038 
30  Philipp Kohlschreiber  1030 
31  Viktor Troicki         1020 
32  Juan Monaco            1011 
33  Ivo Karlovic            994 
34  Radek Stepanek          989 
35  Albert Montanes         974 
36  Tommy Robredo           885 
37  Samuel Querrey          840 
38  Lleyton Hewitt          835 
39  John Isner              834 
40  Ernests Gulbis          782 

41  Jeremy Chardy           780 
42  Feliciano Lopez         765 
43  Janko Tipsarevic        728 
44  Julien Benneteau        695 
45  Kei Nishikori           666 
46  Xavier Malisse          634 
47  Jarkko Nieminen         624 
48  Dmitry Tursunov         603 
49  Fernando Gonzalez       597 
50  Juan Carlos Ferrero     596 
51  Thomaz Bellucci         586 
52  Andrei Goloubev         513 
53  Andreas Seppi           484 
54  Benjamin Becker         482 
55  Michael Berrer          465 
56  Thiemo de Bakker        453 
57  Juan Ignacio Chela      450 
58  Olivier Rochus          444 
59  Pablo Cuevas            441 
60  Igor Andreev            430 

61  Fabio Fognini           427 
62  Philipp Petzschner      423 
63  Santiago Giraldo        417 
64  James Blake             416 
65  Sergey Stakhovsky       399 
66  Ivan Dodig              384 
67  Denis Istomin           382 
68  Michael Zverev          369 
69  Robin Haase             364 
70  Arnaud Clement          364 
71  Ryan Harrison           360 
72  Daniel Gimeno           350 
73  Marcel Granollers       346 
74  Leonardo Mayer          343 
75  Robby Ginepri           338 
76  Paul-Henri Mathieu      335 
77  Lukasz Kubot            332 
78  Daniel Brands           330 
79  Alejandro Falla         327 
80  Mikhail Kukushkin       320 

81  Dudi Sela               309 
82  Donald Young            304 
83  Victor Hanescu          296 
84  Teimuraz Gabashvili     295 
85  Grigor Dimitrov         280 
86  Florent Serra           277 
87  Lukas Lacko             276 
88  Horacio Zeballos        276 
89  Ryan Sweeting           273 
90  Adrian Mannarino        272 
91  Yen-Hsun Lu             271 
92  Kevin Anderson          269 
93  Rainer Schuettler       267 
94  Edouard Roger-Vasselin  266 
95  Richard Berankis        266 
96  Bernard Tomic           263 
97  Marco Chiudinelli       261 
98  Nicolas Mahut           261 
99  Simon Greul             259 
100 Frederico Gil           258 
101 Somdev Devvarman        258

Hard-Court Singles Rankings

If you’ve found your way here from the Wall Street Journal, welcome! If you don’t know what I’m talking about, go read what Carl Bialik has to say in today’s paper, and in an online follow-up.  I’ve written at length about my rankings and prediction system and published full odds for Indian Wells here.

As you may have read in the Wall Street Journal, my ranking system rates Federer number one.  The difference between Fed and Nadal is even more striking if we use my hard-court-specific rankings.  However, in the hard-court-specific rankings, Djokovic closes the gap quite a bit.

Before you email me to tell me what an idiot I am for publishing something so blatantly wrong, please read my description of what the system does.

The goal of these rankings isn’t to say who is the greatest of all time, or to say that any player here is guaranteed to beat anyone below him.  Instead, they are the result of an algorithm that is better than anything else I’ve seen at predicting the outcome of tennis matches.

Here are the current top 100 hard-court players, along with the hard-court rankings of several other players who are in the Indian Wells main draw:

1   Roger Federer          8579 
2   Novak Djokovic         6853 
3   Andy Murray            5013 
4   Rafael Nadal           4892 
5   Robin Soderling        4363 
6   Juan Martin del Potro  3624 
7   Nikolay Davydenko      3118 
8   David Ferrer           2913 
9   Andy Roddick           2671 
10  Tomas Berdych          2284 
11  Gael Monfils           2226 
12  Stanislas Wawrinka     2094 
13  Marcos Baghdatis       2062 
14  David Nalbandian       1967 
15  Mardy Fish             1961 
16  Marin Cilic            1779 
17  Fernando Verdasco      1709 
18  Jurgen Melzer          1615 
19  Ivan Ljubicic          1602 
20  Jo-Wilfried Tsonga     1565 

21  Michael Llodra         1475 
22  Mikhail Youzhny        1317 
23  Gilles Simon           1314 
24  Florian Mayer          1312 
25  Nicolas Almagro        1305 
26  Milos Raonic           1231 
27  Alexander Dolgopolov   1223 
28  Guillermo Garcia-Lopez 1109 
29  Juan Monaco            1102 
30  Richard Gasquet        1091 
31  Radek Stepanek         1044 
32  Viktor Troicki         1021 
33  John Isner              901 
34  Lleyton Hewitt          883 
35  Tommy Robredo           867 
36  Albert Montanes         841 
37  Jeremy Chardy           840 
38  Ernests Gulbis          820 
39  Philipp Kohlschreiber   796 
40  Feliciano Lopez         787 

41  Samuel Querrey          773 
42  Janko Tipsarevic        734 
43  Fernando Gonzalez       711 
44  Julien Benneteau        695 
45  Kei Nishikori           686 
46  Jarkko Nieminen         638 
47  Juan Carlos Ferrero     635 
48  Dmitry Tursunov         633 
49  Xavier Malisse          588 
50  Thomaz Bellucci         578 
51  Ivo Karlovic            559 
52  Andreas Seppi           507 
53  Andrei Goloubev         488 
54  Benjamin Becker         487 
55  Michael Berrer          466 
56  Thiemo de Bakker        457 
57  Igor Andreev            455 
58  Olivier Rochus          449 
59  Philipp Petzschner      447 
60  Juan Ignacio Chela      434 

61  Fabio Fognini           434 
62  James Blake             432 
63  Pablo Cuevas            426 
64  Santiago Giraldo        413 
65  Sergey Stakhovsky       402 
66  Denis Istomin           400 
67  Ivan Dodig              389 
68  Arnaud Clement          375 
69  Michael Zverev          367 
70  Robin Haase             367 
71  Leonardo Mayer          352 
72  Robby Ginepri           351 
73  Marcel Granollers       350 
74  Daniel Brands           345 
75  Alejandro Falla         341 
76  Daniel Gimeno           341 
77  Paul-Henri Mathieu      341 
78  Mikhail Kukushkin       330 
79  Dudi Sela               325 
80  Lukasz Kubot            324 

81  Teimuraz Gabashvili     303 
82  Victor Hanescu          288 
83  Grigor Dimitrov         284 
84  Lukas Lacko             282 
85  Adrian Mannarino        279 
86  Kevin Anderson          275 
87  Florent Serra           275 
88  Simon Greul             274 
89  Potito Starace          270 
90  Edouard Roger-Vasselin  269 
91  Frank Dancevic          269 
92  Horacio Zeballos        268 
93  Richard Berankis        266 
94  Marco Chiudinelli       264 
95  Rainer Schuettler       263 
96  Ryan Harrison           262 
97  Frederico Gil           261 
98  Bernard Tomic           260 
99  Nicolas Mahut           259 
100 Tobias Kamke            259 

102 Yen-Hsun Lu             255 
104 Bjorn Phau              248 
106 Chris Guccione          247 
107 Ryan Sweeting           246 
112 Ricardo Mello           240 
114 Ilia Marchenko          236 
116 Matt Ebden              233 
120 Alex Bogomolov          228 
121 Michael Russell         226 
133 Marinko Matosevic       221 
141 Dustin Brown            217 
144 Donald Young            216 
145 Tim Smyczek             215 
147 Somdev Devvarman        215 
156 Rik de Voest            212 
174 Marsel Ilhan            208 
196 Flavio Cipolla          202 
261 Rohan Bopanna           109 
319 Pere Riba                55 
354 Ruben Ramirez-Hidalgo    22

Indian Wells Projections

If you’ve found your way here from the Wall Street Journal, welcome! If you don’t know what I’m talking about, go read what Carl Bialik has to say in today’s paper, and in an online follow-up.

I’ve developed a fairly sophisticated algorithm to predict the outcome of tennis matches.  It seeks to remedy some of the flaws in the present ranking system and do a better job of forecasting which players will perform better at certain times, on certain surfaces, against certain opponents.

In the past, I’ve written about the predictiveness of ATP ranking points–which are pretty darn good, for all their flaws.  By just about any standard, however, my system is better.  It’s not perfect–it’s far, far from it–but it does give you a valid second opinion on a player’s abilities at any given time.

The components

My algorithm does several things that traditional ranking points do not.  Here are a few of the components:

  • Points are awarded based on the quality of opponents, not on the round or tournament.  Thus, beating Mikhail Youzhny in the quarterfinals in Moscow is worth the same as beating him in the semifinals of Indian Wells.  Losing to a low-ranked player counts against you more than losing to Roger Federer.
     
  • These points, and everything else, are adjusted for surface.  Beating Federer counts for more on hard courts than on clay; beating Juan Carlos Ferrero is the opposite.
     
  • The algorithm generates a set of overall rankings, and it also generates two sets of surface-specific rankings, one for clay courts, one for everything else.  (There isn’t enough data on indoor hard courts or grass courts to treat them separately from any other type of fast court.)  So for Indian Wells, I’m using the hard-court rankings.  Of course, this drastically impacts the chances of many players.
     
  • The points awarded for any tournament are also based on how recent the event was.  Beating Andy Murray last week is more relevant than beating him last year.  Thus, Milos Raonic does better in my rankings (24th overall) than in the ATP rankings (37th).  Sure, it would help if Raonic had played more ATP-level events last year, but my algorithm recognizes that February results count for more than wins from last June.
     
  • My system considers matches from the last two years, not just one year, as the ATP rankings do.  This and the ‘recency’ adjustment remedy what I consider to be the most ridiculous part of the ATP ranking system.  A player can fall dozens of spots in the rankings simply because a tournament result “falls off.”  
     
     So, a match from 51 weeks ago tells us a lot about a player’s current skill level, but a match from 53 weeks ago does not?  In my system, both are counted; a match from 51 weeks ago counts for about 55-60% of the value of a match from last week, while a match from a few weeks earlier counts for a little less.
     
  • Grand slams count for a bit more, but not a lot more.  The main reason for this is that the winner of a five-setter is more likely to be the more skilled player than the winner of a three-setter.  A couple of bad bounces in a tiebreak can turn a three-setter against you, but it’s awfully hard to win a five-setter with luck.
     
  • There is a bit of home court advantage in tennis, though with the increasing use of the challenge system (which limits officiating bias), it seems to be decreasing.  It still exists, and it’s considered.
     
  • For whatever reason, it appears that qualifiers and wild cards do worse in ATP main draw matches than my system would otherwise expect.  So they are penalized a small amount.
     
  • Finally, there is a head-to-head component.  It turns out that the head-to-head component can’t improve that much on the rankings-based algorithm, but it does have some value.  So I do consider the history of each matchup, giving a slight edge to the player who has won more matches in the past.  (Depending, of course, on how long ago it was, what surface the matches were on, and so on.)
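
Two of the components above lend themselves to a quick illustration in Python. This is only a sketch under stated assumptions, not the actual algorithm: the exponential decay shape, the 0.575 calibration point (the midpoint of the 55-60% figure quoted above), the hard 104-week cutoff, and the idea that sets are independent with a fixed per-set win probability are all assumptions made for the example. The names `recency_weight` and `match_win_prob` are hypothetical.

```python
import math
from math import comb

# Assumption: weights decay exponentially, calibrated so that a match from
# 51 weeks ago is worth 57.5% of a match from last week.
TARGET_WEIGHT_AT_51_WEEKS = 0.575
WINDOW_WEEKS = 104  # assumption: "the last two years" means 104 weeks
DECAY_PER_WEEK = -math.log(TARGET_WEIGHT_AT_51_WEEKS) / 51


def recency_weight(weeks_ago: float) -> float:
    """Relative value of a match played `weeks_ago` weeks in the past."""
    if weeks_ago < 0:
        raise ValueError("weeks_ago must be non-negative")
    if weeks_ago > WINDOW_WEEKS:
        return 0.0  # outside the two-year window, the match is ignored
    return math.exp(-DECAY_PER_WEEK * weeks_ago)


def match_win_prob(p_set: float, best_of: int) -> float:
    """Chance of winning a match, given a fixed chance of winning each set.

    Assumption: sets are independent and p_set never changes.
    """
    need = best_of // 2 + 1  # sets required to win the match
    q = 1.0 - p_set
    # Sum over the number of sets lost before taking the deciding set.
    return sum(
        comb(need - 1 + lost, lost) * p_set**need * q**lost
        for lost in range(need)
    )


w51 = recency_weight(51)
w53 = recency_weight(53)
three_setter = match_win_prob(0.6, 3)
five_setter = match_win_prob(0.6, 5)
```

With these assumptions, a match from 53 weeks ago is worth only slightly less than one from 51 weeks ago, rather than dropping to zero. And a player who wins 60% of sets wins about 64.8% of best-of-three matches but about 68.3% of best-of-five matches, which is the sense in which a five-setter says a bit more about who is the better player.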

Whew!

Thanks for reading this far.

As I post this, a few matches have already been played.  But these numbers were generated this morning, after the full draw was released.  The table below shows the probability that each player reaches each round of the tournament.  I’ll have a little more to say at the bottom.

Player            R64   R32   R16    QF    SF     F     W 
(1)Nadal         100% 94.6% 78.3% 56.3% 40.1% 24.1% 13.0% 
(q)De Voest       54%  3.1%  0.8%  0.1%  0.0%  0.0%  0.0% 
Riba              46%  2.3%  0.5%  0.1%  0.0%  0.0%  0.0% 
(q)Sweeting       42%  8.4%  0.8%  0.1%  0.0%  0.0%  0.0% 
Granollers        58% 17.2%  2.0%  0.5%  0.1%  0.0%  0.0% 
(27)Monaco       100% 74.4% 17.7%  7.5%  2.9%  0.8%  0.2% 
(19)Baghdatis    100% 86.1% 52.9% 21.3% 11.3%  4.7%  1.6% 
(q)Devvarman      43%  5.0%  1.0%  0.1%  0.0%  0.0%  0.0% 
Mannarino         57%  8.9%  2.2%  0.2%  0.0%  0.0%  0.0% 
(q)Cipolla        28%  4.0%  0.7%  0.1%  0.0%  0.0%  0.0% 
Malisse           72% 22.1%  6.6%  1.5%  0.4%  0.1%  0.0% 
(15)Tsonga       100% 73.9% 36.7% 12.2%  5.9%  2.0%  0.6% 

(11)Almagro      100% 81.5% 51.0% 22.4%  7.8%  2.7%  0.8% 
(q)Russell        45%  8.1%  2.0%  0.3%  0.0%  0.0%  0.0% 
Anderson          55% 10.4%  3.1%  0.6%  0.1%  0.0%  0.0% 
Istomin           41% 13.1%  4.6%  1.0%  0.2%  0.0%  0.0% 
Nieminen          59% 24.4%  9.3%  2.8%  0.6%  0.1%  0.0% 
(23)Montanes     100% 62.5% 30.2% 10.8%  3.1%  0.8%  0.2% 
(28)Simon        100% 73.1% 27.2% 14.5%  4.6%  1.4%  0.4% 
Schuettler        40%  8.3%  1.2%  0.3%  0.0%  0.0%  0.0% 
Haase             60% 18.7%  4.0%  1.3%  0.2%  0.0%  0.0% 
(q)Matosevic      29%  2.7%  0.6%  0.1%  0.0%  0.0%  0.0% 
Karlovic          71% 12.7%  5.0%  1.8%  0.4%  0.1%  0.0% 
(6)Ferrer        100% 84.6% 61.9% 44.1% 22.2% 10.8%  4.4% 

(4)Soderling     100% 89.0% 71.0% 46.8% 27.3% 15.8%  7.6% 
Phau              37%  3.0%  0.9%  0.2%  0.0%  0.0%  0.0% 
Berrer            63%  8.0%  3.4%  0.9%  0.2%  0.0%  0.0% 
(q)Smyczek        48% 10.5%  1.1%  0.2%  0.0%  0.0%  0.0% 
Marchenko         52% 13.4%  1.5%  0.3%  0.0%  0.0%  0.0% 
(32)Kohlsch.     100% 76.1% 22.0%  7.7%  2.3%  0.6%  0.1% 
(20)Dolgopolov   100% 68.8% 24.4%  8.9%  2.8%  0.9%  0.3% 
Hanescu           39% 10.5%  1.8%  0.3%  0.0%  0.0%  0.0% 
Seppi             61% 20.8%  4.9%  1.1%  0.2%  0.0%  0.0% 
Stepanek          30% 12.1%  6.7%  2.3%  0.8%  0.2%  0.1% 
(PR)Del Potro     70% 46.4% 35.6% 20.8% 11.1%  6.1%  2.9% 
(14)Ljubicic     100% 41.6% 26.5% 10.6%  4.4%  1.7%  0.5% 

(9)Verdasco      100% 86.2% 60.7% 23.2% 10.1%  4.2%  1.3% 
(WC)Berankis      52%  7.4%  2.2%  0.3%  0.0%  0.0%  0.0% 
(q)Bogomolov      48%  6.3%  1.7%  0.2%  0.0%  0.0%  0.0% 
Tipsarevic        71% 34.2% 12.2%  3.3%  0.9%  0.2%  0.0% 
Kamke             29%  8.2%  1.7%  0.2%  0.0%  0.0%  0.0% 
(21)Querrey      100% 57.6% 21.5%  5.8%  1.5%  0.4%  0.1% 
(25)Robredo      100% 70.8% 16.9%  7.6%  2.2%  0.6%  0.1% 
Zverev            62% 20.9%  2.9%  0.8%  0.1%  0.0%  0.0% 
(q)Ebden          38%  8.3%  0.8%  0.2%  0.0%  0.0%  0.0% 
(q)Young          37%  2.2%  0.6%  0.1%  0.0%  0.0%  0.0% 
Starace           63%  6.3%  2.6%  0.7%  0.1%  0.0%  0.0% 
(5)Murray        100% 91.4% 76.3% 57.7% 35.6% 21.5% 11.1% 

(8)Roddick       100% 84.9% 63.0% 43.4% 21.7%  8.7%  3.9% 
(WC)Blake         63% 11.3%  4.5%  1.4%  0.3%  0.0%  0.0% 
(q)Guccione       37%  3.8%  1.1%  0.2%  0.0%  0.0%  0.0% 
Ram-Hidalgo       34%  5.1%  0.5%  0.1%  0.0%  0.0%  0.0% 
Mello             66% 16.4%  2.7%  0.6%  0.1%  0.0%  0.0% 
(30)Isner        100% 78.4% 28.1% 12.6%  3.6%  0.8%  0.2% 
(18)Gasquet      100% 73.4% 34.8% 14.2%  4.6%  1.2%  0.3% 
Cuevas            72% 22.8%  6.7%  1.7%  0.3%  0.0%  0.0% 
Andujar           28%  3.9%  0.5%  0.1%  0.0%  0.0%  0.0% 
Benneteau         46% 16.1%  7.1%  2.3%  0.6%  0.1%  0.0% 
Lopez             54% 18.9%  9.0%  3.1%  0.8%  0.2%  0.0% 
(10)Melzer       100% 65.0% 41.9% 20.4%  8.2%  2.7%  0.9% 

(16)Troicki      100% 82.3% 40.1% 10.5%  4.3%  1.1%  0.3% 
(q)Bopanna        30%  3.1%  0.3%  0.0%  0.0%  0.0%  0.0% 
(WC)Tomic         70% 14.6%  3.1%  0.3%  0.1%  0.0%  0.0% 
Giraldo           55% 14.6%  6.0%  1.0%  0.3%  0.0%  0.0% 
Gim-Traver        45% 10.9%  3.8%  0.6%  0.1%  0.0%  0.0% 
(24)Llodra       100% 74.5% 46.7% 15.8%  7.1%  2.2%  0.7% 
(31)Gulbis       100% 56.7% 12.5%  6.0%  2.3%  0.6%  0.1% 
Hewitt            75% 37.3%  7.5%  3.7%  1.4%  0.4%  0.1% 
Lu                25%  6.0%  0.6%  0.1%  0.0%  0.0%  0.0% 
Mayer             66% 12.7%  7.2%  3.8%  1.6%  0.4%  0.1% 
Golubev           34%  3.7%  1.5%  0.5%  0.1%  0.0%  0.0% 
(3)Djokovic      100% 83.6% 70.8% 57.7% 42.5% 24.8% 15.4% 

(7)Berdych       100% 84.1% 64.8% 33.2% 12.6%  5.6%  2.3% 
Kukushkin         48%  7.6%  2.8%  0.5%  0.1%  0.0%  0.0% 
Kubot             52%  8.3%  3.1%  0.5%  0.1%  0.0%  0.0% 
De Bakker         48% 20.6%  5.3%  1.3%  0.2%  0.0%  0.0% 
Becker            52% 21.9%  5.9%  1.5%  0.2%  0.1%  0.0% 
(26)Bellucci     100% 57.4% 18.1%  4.9%  0.9%  0.2%  0.0% 
(17)Cilic        100% 81.7% 37.2% 20.7%  6.6%  2.6%  1.0% 
Gabashvili        49%  9.6%  1.5%  0.3%  0.0%  0.0%  0.0% 
Serra             51%  8.7%  1.2%  0.3%  0.0%  0.0%  0.0% 
Davydenko         84% 49.6% 32.8% 21.0%  8.7%  4.4%  2.1% 
Fognini           16%  3.5%  1.1%  0.3%  0.1%  0.0%  0.0% 
(12)Wawrinka     100% 47.0% 26.2% 15.5%  5.2%  2.2%  0.9% 

(13)Fish         100% 64.5% 41.9% 13.0%  6.4%  2.7%  1.1% 
(WC)Raonic        81% 33.0% 17.9%  4.3%  1.7%  0.6%  0.2% 
Ilhan             19%  2.5%  0.6%  0.0%  0.0%  0.0%  0.0% 
(WC)Harrison      26%  5.7%  1.0%  0.1%  0.0%  0.0%  0.0% 
Chardy            74% 32.1% 12.0%  2.4%  0.8%  0.2%  0.1% 
(22)Garcia-Lopez 100% 62.2% 26.6%  5.9%  2.3%  0.8%  0.2% 
(29)Chela        100% 59.2%  7.7%  2.6%  0.7%  0.2%  0.0% 
Petzschner        66% 30.5%  3.4%  1.1%  0.3%  0.0%  0.0% 
Brown             34% 10.3%  0.7%  0.1%  0.0%  0.0%  0.0% 
Andreev           41%  3.0%  1.4%  0.4%  0.1%  0.0%  0.0% 
Nishikori         59%  6.4%  3.7%  1.4%  0.4%  0.1%  0.0% 
(2)Federer       100% 90.6% 83.1% 68.7% 52.4% 36.7% 24.5%
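
For readers curious how a table like this can be built, here is a minimal Python sketch of the general technique: given any function that returns the chance one player beats another, walk the bracket round by round and weight each possible opponent by his chance of getting that far. The function name `reach_probabilities`, the toy ratings, and the simple ratio model used for single-match odds are all hypothetical stand-ins; the real match predictions come from the algorithm described above, which also handles byes for seeded players.

```python
from typing import Callable, List, Sequence


def reach_probabilities(
    players: Sequence[str],
    win_prob: Callable[[str, str], float],
) -> List[List[float]]:
    """Return one row per round; row r holds each player's chance of
    surviving r rounds. Players are listed in draw order, and the draw
    size must be a power of two (no byes in this sketch)."""
    n = len(players)
    if n < 2 or n & (n - 1):
        raise ValueError("draw size must be a power of two")

    alive = [1.0] * n  # everyone starts in the draw
    table = [alive]
    block = 1  # number of players who could fill one side of a match
    while block < n:
        nxt = [0.0] * n
        for i in range(n):
            # Possible opponents sit in the other half of this section.
            section = (i // (2 * block)) * (2 * block)
            in_first_half = (i - section) < block
            opp_start = section + block if in_first_half else section
            beat = sum(
                alive[j] * win_prob(players[i], players[j])
                for j in range(opp_start, opp_start + block)
            )
            nxt[i] = alive[i] * beat
        table.append(nxt)
        alive = nxt
        block *= 2
    return table


# Toy example with made-up ratings and an assumed ratio model.
ratings = {"A": 3.0, "B": 1.0, "C": 2.0, "D": 2.0}


def ratio_model(a: str, b: str) -> float:
    return ratings[a] / (ratings[a] + ratings[b])


rounds = reach_probabilities(["A", "B", "C", "D"], ratio_model)
title_odds = rounds[-1]
```

In the toy draw, player A is a 75% favorite in the first match and ends up with a 45% chance at the title. The last row always sums to 100%, which is a handy sanity check on any projection table of this kind.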

You’ll probably notice right off that Federer and Djokovic have the best chances of winning. Indeed, they are the top two players on hard courts, according to my rankings. Yes, Nadal has won the slams lately, but he has also lost to a few players he shouldn’t have (Baghdatis, Melzer, Garcia-Lopez) in the recent past. I personally wouldn’t put money on Federer over Nadal in the final, but my algorithm disagrees.

A few other players my system likes are Juan Martin Del Potro, Nikolay Davydenko, and Marcos Baghdatis. It picks out some players for scoring wins over top-ranked players. It likes Del Potro both because of his strong record in the last few weeks and because the algorithm still considers his torrid summer of 2009, leading up to his U.S. Open win.

One more thing, and then I’ll shut up for now. In the first-round matches, there are very few that stray beyond a 70/30 split. Even Tomic-Bopanna is 70/30, and Bopanna barely plays singles. The narrow divides are partly because no top players are involved in the first round, but it also shows you the depth of the men’s game — even someone ranked outside of the top 150, like Flavio Cipolla, has a decent chance of advancing.

Of course, Flavio doesn’t have quite the same odds against Tsonga, and you can tell from Nadal’s second-round odds that neither Pere Riba nor Rik de Voest stands much of a chance against him.

Enjoy the tennis … and the numbers.