The High-Quality Cincinnati Draw

It’s tough to imagine a Masters series event featuring a higher-quality field than the one assembled in Cincinnati this week.  With the exception of Robin Soderling, virtually every “name” player is present.  Just as importantly, almost all of the players awarded wild cards are legitimate competitors at this level.  The same is true of most of the seven men who qualified.

For tennis fans, it’s an enjoyable outcome: With the possible exception of Robby Ginepri, everyone present “deserves” to be here.  The event gave the other three wild cards to Ryan Harrison, Grigor Dimitrov, and James Blake, three men inside the top 85 who excel on hard courts.  Four of the top seeds in qualifying advanced to the main draw, and each of them has a current ranking right on the cusp of making the cut in the first place.

All this made me wonder: How does the Cincinnati draw compare to other 56-player Masters fields?  Is Cinci always this strong?

I’ve previously looked at the field quality of ATP 250s, so it was a small step to point the guns at the bigger tourneys.  Here are all 48- and 56-draw Masters events since 2009, along with the average entry rank and median entry rank of players in the field, sorted by the latter:

Year  Event        Field  AvgRank  MedRank  
2011  Madrid          56     37.7     30.0  
2010  Paris           48     38.1     30.5  
2011  CINCINNATI      56     50.1     31.5  
2010  Shanghai        56     56.5     31.5  
2009  Paris           48     57.5     31.5  
2009  Cincinnati      56     38.5     32.0  
2009  Montreal        56     83.6     32.5  
2010  Cincinnati      56     38.5     33.5  
2009  Rome            56     42.0     33.5  
2009  Shanghai        56     54.8     33.5  
2011  Rome            56     42.2     34.5  
2009  Madrid          56     43.6     34.5  
2011  Montreal        56     50.7     35.5  
2009  Monte Carlo     56     45.1     36.5  
2011  Monte Carlo     56     51.9     36.5  
2010  Rome            56     43.1     38.5  
2010  Toronto         56     57.7     40.5  
2010  Madrid          56     59.5     43.0  
2010  Monte Carlo     56     50.6     43.5
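
For anyone who wants to reproduce the two summary columns, here is a minimal Python sketch.  The entry ranks are invented for illustration (they are not any real field), and `field_quality` is a hypothetical helper name.

```python
from statistics import mean, median

def field_quality(entry_ranks):
    """Return (average rank, median rank) for one tournament field."""
    return round(mean(entry_ranks), 1), median(entry_ranks)

# Invented eight-player field: seven highly ranked entries and one
# low-ranked wild card.  These ranks are for illustration only.
ranks = [1, 2, 3, 5, 8, 13, 21, 400]
avg_rank, med_rank = field_quality(ranks)
print(avg_rank, med_rank)
```

A single low-ranked entrant pulls the average far above the median, which is the pattern in rows like 2009 Montreal (average 83.6, median 32.5) and a good reason to sort by the median.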

There’s not a huge difference in quality–after all, players are required to show up for most of these events–but there is a noticeable differentiation into “haves” and “have-nots.”  Of course Monte Carlo is near the bottom, as it is not mandatory.  Rome is required, but it does get skipped.  Madrid is an interesting case, as this year’s new schedule meant all the best players showed up, while last year, it was near the bottom of the list.

Setting aside Paris, which is near the top of the list because its field has eight fewer players, Cinci appears to consistently offer one of the best Masters fields.  This makes sense, as even if it weren’t a required stop on the tour, it’s a perfectly scheduled warm-up for the U.S. Open.

How Long Does the Server’s Advantage Last?

In professional tennis, it’s a given that the server has an advantage.  The size of that advantage depends on the abilities of the two players and the surface, but especially in men’s tennis, it’s a sizable edge.  On average, a server in an ATP match starts a point with a roughly 65% chance of winning.

But how long does it last?  It seems that, at some stage in the rally, the server’s advantage has disappeared.  Four or five strokes in, the server may still be benefiting from an off-balance return.  But by ten strokes, one would assume that the rally is neutral–that the advantage conferred by serving has evaporated.

As usual with tennis analysis, one question begets several more.  Does the server’s advantage last longer on faster surfaces?  Do women settle into “neutral” rallies sooner than men do?  Do dominating players, like this year’s edition of Novak Djokovic, take away the server’s advantage faster than the average player?

Using the rally counts provided by Pointstream at the last three majors, we can start to answer these questions.

Neutralizing the serve

The first step is to take all the matches we have rally-count data for, and average them out.  Then, for each point length, we calculate the odds that the server wins a point of at least that length.  So, for instance, we look at all points of five shots or more, and figure out how many of those the server wins.

Each one of these numbers is biased, because a rally of exactly five strokes is, by definition, won by the server.  The server either hits a winner on his third shot (the fifth overall), or the returner makes an error attempting to hit his own third shot (the sixth overall).  Thus, if we look at all points of at least five strokes, the exactly five-stroke rallies virtually guarantee that the server will have the advantage.

However, the same reasoning shows us that a six-stroke rally will be biased in favor of the returner.  When we do the math for at-least-five, at-least-six, at-least-seven, and so on, we’ll see a yo-yo effect.  When the biases have equal effect, that means the serve is neutralized.
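
The calculation described above can be sketched in a few lines of Python.  This is only an illustration: the sample points are invented, the function name is hypothetical, and it assumes the rally count is the number of shots that landed in, so that an odd count means the server won the point.

```python
def server_win_pct_at_least(points, max_len):
    """For each n in 0..max_len, the share of points lasting at least
    n strokes that the server won.  `points` is a list of
    (rally_length, server_won) pairs."""
    result = {}
    for n in range(max_len + 1):
        subset = [won for length, won in points if length >= n]
        if subset:
            result[n] = sum(subset) / len(subset)
    return result

# Invented sample.  Length 0 is a double fault, length 1 an unreturned
# serve, and so on: odd lengths go to the server, even to the returner.
lengths = [0, 1, 1, 2, 3, 4, 5, 6]
points = [(n, n % 2 == 1) for n in lengths]
table = server_win_pct_at_least(points, 6)
for n, pct in table.items():
    print(f"at least {n}: {pct:.0%}")
```

Even on this tiny sample the numbers bounce up at odd thresholds and down at even ones, which is the yo-yo effect described above.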

Here are the results for the approximately 150 grand slam matches with Pointstream data so far this year:

At least…  Win%  Notes                          
0          63%   before point begins            
1          66%   if serve goes in               
2          50%   if serve is returned           
3          60%   if server makes second shot    
4          46%   if returner makes second shot  
5          58%                                  
6          45%                                  
7          57%                                  
8          44%                                  
9          56%                                  
10         44%                                  
11         56%                                  
12         43%                                  
13         56%                                  
14         43%                                  
15         56%

In the table, “Win%” refers to the server’s chance of winning the point.  The biases even out somewhere between the 4th and 8th shot, meaning that in that zone, the server’s advantage is neutralized.
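
One rough way to make “the biases even out” concrete is to average each odd threshold with the even threshold that follows it, so the server-favoring and returner-favoring biases roughly cancel.  This is an assumption about how to read the table, not necessarily the calculation used here; the percentages are copied from the table above, and `paired_average` is a hypothetical helper.

```python
# Server win% for points of at least n strokes, from the table above.
win_pct = {1: 66, 2: 50, 3: 60, 4: 46, 5: 58, 6: 45, 7: 57, 8: 44,
           9: 56, 10: 44, 11: 56, 12: 43, 13: 56, 14: 43}

def paired_average(win_pct):
    """Average each odd threshold with the next even one."""
    return {n: (win_pct[n] + win_pct[n + 1]) / 2
            for n in sorted(win_pct)
            if n % 2 == 1 and n + 1 in win_pct}

pairs = paired_average(win_pct)
for n, avg in pairs.items():
    print(f"shots {n}-{n + 1}: {avg:.1f}%")
```

On these numbers the paired average starts at 58% and settles within a point or two of 50% from the fifth shot onward.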

While the server retains the advantage at least until the fourth shot, it is interesting to see how quickly it decays.  The server’s win rate drops from 66% upon making a serve to 56% once the advantage is neutralized, and more than half of that drop (66% to 60%) comes between the first and third shots.  Thus, the returner doesn’t negate the server’s advantage simply by getting the ball back in play, but he does take a large step toward doing so.

Does surface matter?

As usual, it sure does.  The numbers for the Australian and French Opens are similar, and since they make up 2/3 of the data set, they are close to the aggregate numbers shown above.  But Wimbledon, as is so often the case, seems to play by a different set of rules:

At least…    Wimby    Austr    French  
0            66%        62%       62%  
1            68%        64%       67%  
2            52%        50%       48%  
3            62%        59%       58%  
4            48%        46%       45%  
5            61%        57%       57%  
6            47%        44%       44%  
7            61%        55%       56%  
8            47%        44%       44%  
9            59%        55%       54%  
10           47%        43%       43%  
11           60%        54%       55%  
12           46%        43%       43%  
13           59%        55%       55%  
14           43%        45%       42%  
15           56%        54%       56%

The biases don’t balance out until the very bottom–at 14 or more shots!  That’s only about 3% of points.  I’m not sure how to explain this, except perhaps psychologically: on grass (considered the best surface for servers), players may be less successful in return games simply because that’s what they expect to happen.  Regardless of surface, I can’t understand why else the server’s advantage would persist into double-digit shot counts.

What about the ladies?

WTA players (on average) start each service point with a smaller advantage than their male counterparts, and as it turns out, that advantage evaporates more quickly.

We saw a moment ago that, by putting the return in play, an ATP returner gives himself a 50% chance of winning the point–at least until his opponent hits another shot.  Women, however, knock the server’s winning percentage down to 47% by making the return.

The returner clearly neutralizes the point by the fourth stroke overall, and–here’s the good part–takes over a slight advantage herself by her third shot, the sixth stroke overall.  By making that sixth shot, the returner has a 57% chance of winning the point, while the server will never reach 57% again.  The advantage is only a percentage or two, but from the sixth stroke on, the returner has the edge.

Finally, presenting Novak Djokovic

Pointstream has tracked 17 of Djokovic’s slam matches this year, giving us a good set of data to work with.  When a man is having a season like this one–in large part because of his return game–it’s fascinating to see how comprehensively he is outplaying his opponents.

In the same terms as the tables above, here are Djokovic’s serve and return points across those 17 matches.  The return points are shown with the server’s winning percentages:

At least…    ND Sv    ND Ret  
0            70%         57%  
1            72%         60%  
2            59%         42%  
3            68%         50%  
4            56%         39%  
5            68%         49%  
6            57%         38%  
7            68%         47%  
8            58%         38%  
9            68%         45%  
10           55%         34%  
11           65%         44%  
12           55%         33%  
13           66%         38%  
14           52%         29%  
15           65%         40%

After seeing the averages above, you might reasonably conclude that these numbers are out of this world.  Even with the bias of 4-, 6-, and 8-stroke rallies, as discussed above, Djokovic still maintains an edge.  For everyone else, once the fifth or sixth shot is struck, the point is a 50/50 proposition.  For Novak, it’s at least 60/40 in his favor.

The amazing stats are on his return.  When he gets his return back in play, he’s more than likely to win the point.  That may not surprise anyone who has watched Djokovic play this year, but consider how remarkable that is in the context of modern men’s tennis.  By the 8th stroke or so, he’s back to the 60/40 odds of the service points that turn into longer rallies.

Thanks to Carl Bialik for suggesting this topic.

The Most (and Least) Consistent Players on the ATP Tour

“Consistency” is one of the many terms that commentators frequently use but rarely define.  It’s often misused, too: we say we want a player to be more consistent, when we really just want him to stop playing badly.

To me, consistency for a tennis player is similar to the notion of “playing up to his ranking.”  In other words, if a player is consistent, he usually beats players ranked lower, and he usually loses to players ranked higher.  No player is perfect in this regard, but clearly, some are much more reliable than others.

A recent poster boy for inconsistency is Ernests Gulbis.  At Roland Garros, he lost to Blaz Kavcic, ranked 82nd in the world.  That was on clay, a surface on which Gulbis had posted some excellent results the previous year.  Two months later, in Los Angeles, he beat Juan Martin del Potro, someone he shouldn’t have even challenged on a hard court.

Quantifying consistency

With my player rankings and match prediction system, I’m able to assign a win probability to each player for every match.  For instance, when Ivan Dodig beat Nadal last week, I had given him a 14.4% chance of doing so.  As you might imagine, that’s a major upset–as I wrote the next day, it was the 10th-biggest upset of the season.

In these terms, an ideally consistent player will never be on either end of an upset.  If he is the favorite, he wins; if he is the underdog, he loses.  In practice, no tour-level player accomplishes this, though over the last two years, Florent Serra and Eduardo Schwank have come very close.

I’ve come up with a metric to measure consistency.  This is how it works:

  • Gather a list of all ATP-level matches for the desired time period.  (Today, I’m using everything from January 2010 through Montreal last week.)
  • Eliminate matches that ended in retirement or walkover, as well as those where we don’t have enough information to make an educated prediction.  (e.g. the first few comeback matches of Tommy Haas, or one with a wildcard playing his first professional match.)
  • For each player, count how many matches he played.
  • For each player, find the matches where he was the favorite and lost, or was the underdog and won.
  • For each of those matches, take the probability that the eventual winner would win (e.g. 18% — always under 50%), multiply by 100 (e.g. 18, not 18%), subtract it from 50 (e.g. 50 – 18 = 32), and square the result (e.g. 32*32 = 1024).
  • Sum all of the squares, then divide by the number of total matches–not just the ones where the favorite lost.
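
The steps above can be sketched in Python as follows.  This is a minimal illustration rather than the actual code behind the tables: the function name and the four-match record are invented, and the win probabilities are assumed to come from some separate prediction system.

```python
def upset_score(matches):
    """`matches` is a list of (pre_match_win_prob, won) pairs for one
    player, with retirements and unpredictable matches already removed.
    Returns the sum of squared upset sizes divided by total matches."""
    total = 0.0
    for prob, won in matches:
        is_upset = (won and prob < 0.5) or (not won and prob > 0.5)
        if is_upset:
            # Pre-match chance of the eventual winner, on a 0-100 scale.
            winner_pct = 100 * (prob if won else 1 - prob)
            total += (50 - winner_pct) ** 2
    return total / len(matches)

# Invented four-match record for one player.
matches = [
    (0.82, False),  # heavy favorite, lost: the winner had 18%
    (0.70, True),   # favorite, won: no upset
    (0.30, False),  # underdog, lost: no upset
    (0.45, True),   # slight underdog, won: a mini-upset
]
score = upset_score(matches)
print(round(score, 2))
```

Here the one big upset contributes about 1024 and the mini-upset only 25, so the score of roughly 262 is driven almost entirely by the big one.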

Whew!  In something more like layman’s terms, we’re taking all the upsets a player was involved in, coming up with a number to represent how big (or surprising) the upset was, then averaging the results.

Using this method, we give big upsets considerably more weight than mini-upsets.  If a player had a 45% chance of winning a match and ends up winning, it barely counts as an upset–and this system treats it accordingly.  By dividing by the total number of matches, we give consistency credit to players who win the matches they’re “supposed to” win, and lose those they are supposed to lose.

Most importantly, the numbers this algorithm spits out are completely believable, matching up well with the conventional wisdom of which players are consistent and inconsistent.

The consistency of the top ten

The most consistent player on the tour, since the beginning of 2010, has been … Florent Serra.  Amazingly, Igor Kunitsyn comes in second.  But I doubt many of you care much about the consistency of guys like that.

Let’s start with the current top 10, ranked from most to least consistent:

Player              Upsets  Matches    Up%  UpsetScore  
David Ferrer            25      119  21.0%          55  
Rafael Nadal            16      131  12.2%          68  
Novak Djokovic          18      119  15.1%          69  
Roger Federer           21      123  17.1%          69  
Jo-Wilfried Tsonga      20       80  25.0%          75  
Mardy Fish              19       77  24.7%          82  
Tomas Berdych           32      113  28.3%         106  
Gael Monfils            24       89  27.0%         107  
Robin Soderling         23      115  20.0%         130  
Andy Murray             24       97  24.7%         151

The relevant column is the rightmost, “UpsetScore,” which is the result of the algorithm described above.  Ferrer has been part of more upsets than any of the top three (“Up%”), but his upsets are more minor.  Except for losses to Ivo Karlovic and Jarkko Nieminen early in the year on hard courts, Ferrer has not lost a match he had a 60% or better chance of winning.

The two ends of this list certainly line up with what I would have expected: Ferrer and Nadal are rock-solid (last week’s loss to Dodig notwithstanding), while Soderling and Murray both can be picked off by anybody, and frequently threaten higher-ranked players.

Right now, you may be tempted to put Djokovic higher on the list–after all, he’s ranked #1 and he’s beating everybody.  However, in the slightly longer term of 20 months, his movement around the top three has included some unexpected results, like losing to Ljubicic at Indian Wells last year, and victories over Federer and Nadal before his ranking suggested he would do so.

Tour wild cards

Outside of the top 10, there are a handful of players who are almost impossible to predict.  Some names that come to mind are Marcos Baghdatis, Ernests Gulbis, and Nikolay Davydenko, men who can take out one of the top three on a good day (well, maybe not Gulbis), but can lose to a qualifier on the next.

Player                 Upsets  Matches  Up%  UpsetScore  
Nikolay Davydenko          32       73  44%         273  
Marin Cilic                27       91  30%         180  
Marcos Baghdatis           38       89  43%         177  
Olivier Rochus             20       52  38%         164  
Milos Raonic               16       41  39%         164  
Juan Martin del Potro      11       48  23%         154  
Andy Murray                24       97  25%         151  
Jurgen Melzer              28       96  29%         150  
Fernando Verdasco          40      104  38%         150  
Ivan Ljubicic              27       69  39%         149  
Florian Mayer              30       72  42%         147  
Samuel Querrey             26       66  39%         146  
Andrei Goloubev            24       55  44%         143  
Ernests Gulbis             22       69  32%         140  
Jeremy Chardy              31       65  48%         133  
Juan Monaco                26       73  36%         131  
Robin Soderling            23      115  20%         130  
Michael Llodra             26       67  39%         130  
Rainer Schuettler          15       42  36%         119  
Mikhail Youzhny            25       78  32%         116

The “upset score” number tells the story for Davydenko.  The man who beat Nadal at the beginning of the year and threatened Djokovic last week recently suffered defeat at the hands of Cedrik-Marcel Stebe (twice!) and Antonio Veic.

While no one is in Davydenko’s league, names like Cilic, Baghdatis, Murray, and Verdasco seem appropriate.  Verdasco, along with Melzer and Milos Raonic, suggests a flaw in this approach: the algorithm reads very fast improvement or decline as inconsistency, which isn’t quite right.  Yes, Raonic has shocked the tennis world repeatedly this season, but he hasn’t mixed in too many disastrous losses alongside the surprise upsets.  I tinkered with ways to include that in the model, but nothing worked very well.

A couple more interesting notes from the “most inconsistent” players are found in the upset percentage column.  Guys like Davydenko, Baghdatis, Mayer, Goloubev, and Chardy are involved in upsets nearly half the time.  Chardy is highest in that category.  In fact, if I expanded the study to challenger events, he might rocket to the top of this list, as he plays quite a few, and often manages to lose against players outside the top 100.

The consistent ones

The flip side is considerably less star-studded.  Among the 20 most consistent players of the last 19-20 months, Ferrer is the only top-10 guy present, though #11 Nicolas Almagro is there as well.

Here’s my seat-of-the-pants theory.  In this sense, “consistent” isn’t good.  Yes, “consistent” sounds good, especially when “inconsistent” means Davydenko losing to Antonio Veic or Mayer falling to Federico del Bonis.  But inconsistent also means Davy beating Federer and Mayer beating Soderling.  So, the players who show up as “most consistent” are in fact consistent, but they are also mediocre.  Their consistency (perhaps a mental advantage) has helped them move up from the top 200 to the top 50 or 100, but that’s all they can do.

Ferrer and Almagro are good examples of this, actually.  Neither has the weaponry that makes commentators say, “This guy could be number one!”  But they’ve earned their rankings by regularly reaching the quarters and semis of tournaments, not suffering the boneheaded losses that afflict the likes of Cilic and Baghdatis.

All that said, here’s the list:

Player            Upsets  Matches  Up%  UpsetScore  
Florent Serra         11       56  20%          23  
Igor Kunitsyn         14       40  35%          33  
Ilia Marchenko        14       46  30%          40  
Potito Starace        28       81  35%          46  
Victor Hanescu        26       77  34%          50  
Tobias Kamke          12       41  29%          52  
Andreas Seppi         24       81  30%          53  
Julien Benneteau      23       59  39%          53  
Viktor Troicki        25      101  25%          54  
David Ferrer          25      119  21%          55  
Fabio Fognini         18       71  25%          55  
Pere Riba             13       41  32%          56  
Lukas Lacko           14       44  32%          57  
Igor Andreev          17       62  27%          58  
Lukasz Kubot          26       63  41%          59  
Nicolas Almagro       22      112  20%          59  
Frederico Gil         15       40  38%          60  
Denis Istomin         25       76  33%          65  
Jarkko Nieminen       25       74  34%          66  
John Isner            21       82  26%          67

These lists hardly represent the final word on who is or is not consistent–for one thing, I haven’t said anything about consistency within matches, which may be a completely separate issue.  But this approach does, I think, provide some insight into who is more likely to be part of an upset, and suggests that consistency might not be such a good thing after all.

ATP Cincinnati Predictions

If last week’s tournament in Montreal taught us anything, it’s that predicting the outcome of ATP matches is a fool’s errand.  With that in mind, let’s see what the draw has in store for us in Cinci!

The draw this week is what the Masters series is all about.  With the exception of a couple of late withdrawals (Tsonga?) that may yet come down the pike, nearly every top player in the men’s game is in Cincinnati.  Andy Roddick is trying to return from injury, David Ferrer makes his summer hard-court debut, and we’re already set for a Federer/del Potro showdown in the second round.

Del Potro’s mere presence makes every tournament a little more interesting.  He’s laid a couple of eggs recently, losing to Gulbis and Cilic, but he tore up the spring hard court circuit and lost only to the best of the best on clay.  My ranking system still gives him a lot of respect, keeping him within the top five, which makes Federer’s route to the semifinals (heck, the third round!) look particularly challenging.

Djokovic (who, once again, is in Fed’s half) could face a slew of Americans on their home turf.  His probable second-round opponent is Ryan Harrison, who I favor heavily over Juan Ignacio Chela.  After that, it’s easy to see John Isner in the third round, and possibly Andy Roddick in the quarters.  It’s theoretically possible, but a little less likely, that another American, James Blake, will make it to the semis to be Novak’s opponent in that round.

Here is my full projection.  For purity’s sake, it doesn’t reflect the results of today’s two matches, in which Delpo and Blake both advanced.

Player                        R32    R16     QF         W  
(1)Novak Djokovic          100.0%  90.1%  74.4%    29.61%  
(WC)Ryan Harrison           71.5%   8.6%   3.2%     0.07%  
Juan Ignacio Chela          28.5%   1.3%   0.3%     0.00%  
(q)Radek Stepanek           43.6%  18.2%   3.0%     0.09%  
John Isner                  56.4%  24.7%   5.0%     0.24%  
Andrey Golubev              23.2%   8.4%   1.0%     0.02%  
(16)Stanislas Wawrinka      76.8%  48.7%  13.0%     1.41%  

Player                        R32    R16     QF         W  
(11)Andy Roddick            63.7%  43.3%  23.0%     1.34%  
Philipp Kohlschreiber       36.3%  20.8%   9.1%     0.24%  
Juan Carlos Ferrero         23.9%   4.3%   0.9%     0.00%  
Feliciano Lopez             76.1%  31.6%  13.3%     0.29%  
Ivan Dodig                  39.4%  10.9%   3.9%     0.05%  
(q)Ernests Gulbis           60.6%  24.3%  11.6%     0.32%  
(6)Gael Monfils            100.0%  64.9%  38.2%     2.22%  

Player                        R32    R16     QF         W  
(3)Roger Federer           100.0%  58.9%  44.5%    10.38%  
Juan Martin del Potro       82.9%  38.8%  27.9%     5.00%  
Andreas Seppi               17.1%   2.4%   0.8%     0.01%  
(WC)James Blake             23.3%   8.1%   1.1%     0.01%  
Marcos Baghdatis            76.7%  46.2%  15.2%     1.02%  
Fabio Fognini               26.1%   7.6%   0.9%     0.01%  
(14)Viktor Troicki          73.9%  38.1%   9.6%     0.40%  

Player                        R32    R16     QF         W  
(9)Nicolas Almagro          62.0%  32.1%  14.0%     0.25%  
Albert Montanes             38.0%  13.8%   4.0%     0.03%  
Ivo Karlovic                32.6%  14.0%   4.6%     0.04%  
Florian Mayer               67.4%  40.2%  18.2%     0.56%  
Tommy Haas                  13.2%   0.8%   0.1%     0.00%  
Juan Monaco                 86.8%  28.8%  13.9%     0.21%  
(8)Tomas Berdych           100.0%  70.4%  45.3%     2.43%  

Player                        R32    R16     QF         W  
(5)David Ferrer            100.0%  75.5%  44.7%     2.52%  
(q)Marsel Ilhan             38.3%   7.5%   1.9%     0.00%  
(WC)Grigor Dimitrov         61.7%  17.0%   6.0%     0.05%  
Janko Tipsarevic            70.3%  31.7%  15.3%     0.43%  
(q)Edouard Roger-Vasselin   29.7%   8.2%   2.2%     0.01%  
Jurgen Melzer               44.2%  25.0%  12.0%     0.35%  
(10)Gilles Simon            55.8%  35.2%  17.9%     0.72%  

Player                        R32    R16     QF         W  
(15)Jo-Wilfried Tsonga      57.2%  46.4%  19.8%     2.28%  
Marin Cilic                 42.8%  32.7%  12.2%     0.85%  
(q)Alex Bogomolov Jr        57.7%  13.7%   3.0%     0.04%  
(WC)Robby Ginepri           42.3%   7.2%   1.1%     0.01%  
(q)Kei Nishikori            46.5%  10.6%   4.5%     0.14%  
David Nalbandian            53.5%  13.1%   5.9%     0.27%  
(4)Andy Murray             100.0%  76.3%  53.6%    10.62%  

Player                        R32    R16     QF         W  
(7)Mardy Fish              100.0%  63.6%  42.6%     3.59%  
Nikolay Davydenko           68.5%  28.8%  16.7%     0.99%  
Sergiy Stakhovsky           31.5%   7.6%   3.0%     0.04%  
Xavier Malisse              55.4%  21.5%   6.7%     0.11%  
Kevin Anderson              44.6%  15.4%   4.5%     0.04%  
Alexandr Dolgopolov         47.8%  29.6%  12.4%     0.39%  
(12)Richard Gasquet         52.2%  33.4%  14.0%     0.52%  

Player                        R32    R16     QF         W  
(13)Mikhail Youzhny         54.6%  28.3%   7.7%     0.41%  
Michael Llodra              45.4%  20.5%   4.9%     0.17%  
Thomaz Bellucci             36.9%  15.1%   3.1%     0.05%  
Fernando Verdasco           63.1%  36.0%  10.4%     0.61%  
Guillermo Garcia-Lopez      60.5%  13.3%   6.3%     0.19%  
(q)Julien Benneteau         39.5%   4.9%   1.8%     0.03%  
(2)Rafael Nadal            100.0%  81.8%  65.8%    18.36%
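
The projection method itself isn’t spelled out in this post.  As a rough sketch of one standard way to turn pairwise win probabilities into round-by-round numbers like those above: a player’s chance of winning a round is his chance of reaching it, times his probability-weighted chance of beating whoever arrives from the other side of the bracket.  The function name, the four-player bracket, and the ratings are placeholders, not the ranking system behind the table, and byes are not handled.

```python
def advance_probs(players, win_prob):
    """Return one dict per round: {player: P(player wins that round)}.
    `players` is in bracket order and its length is a power of two;
    `win_prob(a, b)` is the chance that a beats b."""
    reach = {p: 1.0 for p in players}   # P(player is alive this round)
    groups = [[p] for p in players]     # bracket sections merged so far
    rounds = []
    while len(groups) > 1:
        new_reach, new_groups = {}, []
        for i in range(0, len(groups), 2):
            left, right = groups[i], groups[i + 1]
            for side, other in ((left, right), (right, left)):
                for p in side:
                    # Weighted chance of beating whoever emerges opposite.
                    beat = sum(reach[q] * win_prob(p, q) for q in other)
                    new_reach[p] = reach[p] * beat
            new_groups.append(left + right)
        reach, groups = new_reach, new_groups
        rounds.append(dict(reach))
    return rounds

# Placeholder strengths: a beats b with probability r_a / (r_a + r_b).
ratings = {"A": 4.0, "B": 1.0, "C": 2.0, "D": 2.0}

def win_prob(a, b):
    return ratings[a] / (ratings[a] + ratings[b])

rounds = advance_probs(list(ratings), win_prob)
for i, probs in enumerate(rounds, start=1):
    print(i, {p: round(x, 3) for p, x in probs.items()})
```

The title probabilities in the last round sum to 1, which is a handy sanity check on any forecast of this kind.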

Welcome To Your 30s, Roger Federer

This week, Roger Federer turned 30.  In some sports, that age can represent peak performance; in tennis, it is often a signal that the end is near.

I’m sure Roger wouldn’t appreciate being treated as an age-grouper, but viewing him that way gives us more evidence of his greatness.  Regardless of whether he returns to the top of the ATP rankings, it would seem that he’ll remain the #1 thirty-something for as long as he wants to keep playing.

Here is the current list of best 30-somethings, based on this Monday’s ATP rankings. The only achievement that exceeds Fed’s domination of the 30-and-over set is Ivan Ljubicic’s standing among 32-year-olds. [Edit: That is, if you ignore Radek Stepanek, who is older and higher-ranked.  Never mind…]

3    Roger Federer       SUI    8/8/81
18   Jurgen Melzer       AUT   5/22/81
22   Juan Ignacio Chela  ARG   8/30/79
27   Radek Stepanek      CZE  11/27/78
30   Nikolay Davydenko   RUS    6/2/81
31   Ivan Ljubicic       CRO   3/19/79
33   Michael Llodra      FRA   5/18/80
47   Albert Montanes     ESP  11/26/80
49   Xavier Malisse      BEL   7/19/80
50   Jarkko Nieminen     FIN   7/23/81
63   Potito Starace      ITA   7/14/81
64   Victor Hanescu      ROU   7/21/81
79   Olivier Rochus      BEL   1/18/81
85   Michael Berrer      GER    7/1/80
86   James Blake         USA  12/28/79
88   Eric Prodon         FRA   6/27/81
91   Ricardo Mello       BRA  12/21/80
98   Diego Junqueira     ARG  12/28/80
100  Michael Russell     USA    5/1/78
103  Marc Gicquel        FRA   3/30/77

Andy Murray and The Worst Upsets of the Year

On Tuesday in Montreal, Andy Murray played an ugly, listless match against world #35 Kevin Anderson, losing 6-3 6-1.  While Murray has played some solid matches this year and is in no immediate danger of losing his top-four ranking, the Anderson loss is hardly the first disaster of his season.  Back in Indian Wells and Miami, he managed to lose to Donald Young and Alex Bogomolov in successive matches.  Ouch.

Using my rankings and match projection system, I’ve generated win probabilities for every ATP match of the season.  Combined with match outcomes, that allows us to find the upsets that were least expected on that surface, at that time.

Pre-match, my system gave Anderson a 16.3% chance of beating Murray–only a smidge better than Dodig’s 14.4% against Nadal.  (My system has never given the South African much credit; his hard-court ranking right now is #58.)  In fact, Anderson was the 4th-biggest underdog going into the 2nd round, ahead only of Dodig, Michael Russell, and Vasek Pospisil.

As it turns out, Anderson’s victory was the 14th-biggest upset win of the ATP season.  (I took out retirements and “comeback players,” like Fernando Gonzalez and Tommy Haas, whose rankings aren’t very predictive.)  That’s 14 out of nearly 1,700.

But, as you might guess, 14th-best of the season isn’t enough to be 1st with Murray on the losing end.  The Murray loss to Young in March is the biggest upset of the year–Donald entered the match with an 8.6% chance of winning.  The Bogomolov match comes in 4th overall; the American had a 10.1% chance before play began.

Edit: This is what I get for writing a draft the night before!  Dodig’s upset victory comes in tied for #10 on the season, pushing Murray/Anderson down one more spot on the list.  Nadal will go home having suffered the biggest upset of the Rogers Cup, though he played far and away better than Murray did to achieve the same outcome.

The biggest upsets of the year

I couldn’t possibly give you those numbers without following through with a complete table.  Here are the 36 matches where the winner entered the match with less than a 20% chance of winning.  This list is through last week’s matches, so it doesn’t yet show Murray’s latest meltdown and Dodig’s shocker.

P(UPSET)  WINNER                 LOSER               TOURNEY          SCORE
 8.6%     Donald Young           Andy Murray         Indian Wells     7-6(4) 6-3
 9.4%     Bernard Tomic          Robin Soderling     Wimbledon        6-1 6-4 7-5
 9.7%     Jimmy Wang             Igor Kunitsyn       Newport          4-6 7-5 6-2
10.1%     Alex Bogomolov         Andy Murray         Miami            6-1 7-5
11.6%     Stephane Robert        Tomas Berdych       French Open      3-6 3-6 6-2 6-2 9-7
13.1%     Milos Raonic           Mikhail Youzhny     Australian Open  6-4 7-5 4-6 6-4
13.8%     Denis Kudla            Grigor Dimitrov     Newport          6-1 6-4
14.0%     James Ward             Stanislas Wawrinka  Queen's Club     7-6(3) 6-3
14.1%     Federico del Bonis     Florian Mayer       Stuttgart        6-2 6-3
14.4%     Jo-Wilfried Tsonga     Rafael Nadal        Queen's Club     6-7(3) 6-4 6-1               

P(UPSET)  WINNER                 LOSER               TOURNEY          SCORE
15.0%     Jan Hernych            Thomaz Bellucci     Australian Open  6-2 6-7(11) 6-4 6-7(3) 8-6
15.2%     Nikolay Davydenko      Rafael Nadal        Doha             6-3 6-2
15.6%     Andrey Kuznetsov       Marcos Baghdatis    Casablanca       6-4 4-6 6-4
16.4%     Flavio Cipolla         Andy Roddick        Madrid Masters   6-4 6-7(7) 6-3
16.6%     Lukas Rosol            Jurgen Melzer       French Open      6-7(4) 6-4 4-6 7-6(3) 6-4
16.8%     Antonio Veic           Nikolay Davydenko   French Open      3-6 6-2 7-5 3-6 6-1
17.0%     Sergei Bubka Jr.       Daniel Gimeno       Doha             6-0 6-3
17.1%     Leonardo Mayer         Marcos Baghdatis    French Open      7-5 6-4 7-6(6)
17.6%     Ivan Dodig             Robin Soderling     Barcelona        6-2 6-4
17.6%     Michael Yani           Dudi Sela           Newport          7-6(5) 6-3                   

P(UPSET)  WINNER                 LOSER               TOURNEY          SCORE
17.7%     Frank Dancevic         Feliciano Lopez     Johannesburg     6-7(5) 6-2 7-6(8)
17.8%     Alexander Dolgopolov   Robin Soderling     Australian Open  1-6 6-3 6-1 4-6 6-2
17.9%     James Ward             Samuel Querrey      Queen's Club     3-6 6-3 6-4
17.9%     Jan Hernych            Sergey Stakhovsky   Halle            6-3 6-7(5) 7-6(8)
18.2%     Jan Hernych            Denis Istomin       Australian Open  6-3 6-4 3-6 6-2
18.3%     Jo-Wilfried Tsonga     Roger Federer       Wimbledon        3-6 6-7(3) 6-4 6-4 6-4
19.0%     Milos Raonic           Fernando Verdasco   San Jose         7-6(6) 7-6(5)
19.0%     Lukasz Kubot           Gael Monfils        Wimbledon        6-3 3-6 6-3 6-3
19.2%     Federico del Bonis     Sergey Stakhovsky   Stuttgart        6-4 6-3
19.3%     Lukasz Kubot           Nicolas Almagro     French Open      3-6 2-6 7-6(3) 7-6(5) 6-4    

P(UPSET)  WINNER                 LOSER               TOURNEY          SCORE
19.3%     Bernard Tomic          Feliciano Lopez     Australian Open  7-6(4) 7-6(3) 6-3
19.7%     Thomaz Bellucci        Andy Murray         Madrid Masters   6-4 6-2
19.7%     Pavol Cervenak         Victor Hanescu      Stuttgart        6-3 7-6(6)
19.7%     Richard Gasquet        Roger Federer       Rome Masters     4-6 7-6(2) 7-6(4)
19.9%     Rajeev Ram             Grigor Dimitrov     Atlanta          6-4 6-4
20.0%     Philipp Kohlschreiber  Robin Soderling     Indian Wells     7-6(8) 6-4
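
If you want to build a similar list from your own forecasts, the filter-and-sort step might look like the sketch below.  It reuses three rows from the table above plus one made-up non-upset.  The `Match` type and `biggest_upsets` function are hypothetical names chosen for illustration, and the code assumes you already have a pre-match win probability for each eventual winner.

```python
from typing import NamedTuple

class Match(NamedTuple):
    winner: str
    loser: str
    tourney: str
    p_winner: float  # pre-match probability that the eventual winner would win

def biggest_upsets(matches, threshold=0.20):
    """Return matches won by a player given less than `threshold` to win,
    ordered from most to least surprising."""
    upsets = [m for m in matches if m.p_winner < threshold]
    return sorted(upsets, key=lambda m: m.p_winner)

matches = [
    Match("Jo-Wilfried Tsonga", "Roger Federer", "Wimbledon", 0.183),
    Match("Donald Young", "Andy Murray", "Indian Wells", 0.086),
    Match("Bernard Tomic", "Robin Soderling", "Wimbledon", 0.094),
    # A made-up non-upset, included only to show the filter working.
    Match("Favorite A", "Underdog B", "Hypothetical Open", 0.850),
]

for m in biggest_upsets(matches):
    print(f"{m.p_winner:5.1%}  {m.winner:<20} d. {m.loser:<18} {m.tourney}")
```

The comparison is a strict "less than" on the unrounded probability, so the threshold itself is excluded.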

Prospect Rankings, 8 August 2011

I can’t believe it’s been three months and two grand slams since I last did one of these! Plenty has happened in the meantime, especially for Bernard Tomic, Wimbledon quarterfinalist. Tomic’s achievements have moved him into the top 100 overall, the top 3 among 20-and-unders, and the top 10 among 22-and-unders, quite a mark for an 18-year-old.

Note also that at the bottom of the 18-and-under list, there are three 17-year-olds, plus Jiri Vesely, who only recently turned 18. I wouldn’t be surprised to see Vesely at the top of the 18-and-under list before his next birthday.

18 AND UNDER
68   Bernard Tomic                AUS   10/21/92  
326  Denis Kudla                  USA    8/17/92  
347  Diego Schwartzman            ARG    8/16/92  
356  Benjamin Mitchell            AUS   11/30/92  
390  Guilherme Clezar             BRA   12/31/92  
406  Tiago Fernandes              BRA    1/29/93  
418  Alexander Rumyantsev         RUS    8/16/92  
440  Roberto Carballes-Baena      ESP    3/23/93  
510  Jozef Kovalik                SVK    11/4/92  
531  Victor Baluda                RUS    9/30/92  
549  Jack Sock                    USA    9/24/92  
576  Taro Daniel                  JPN    1/27/93  
628  Micke Kontinen               FIN   12/18/92  
629  Jiri Vesely                  CZE    7/10/93  
648  Suk-Young Jeong              KOR    4/12/93  
669  Liam Broady                  GBR     1/4/94  
673  Edoardo Eremin               ITA    10/5/93  
684  Jason Kubler                 AUS    5/19/93  
685  Mitchell Frank               USA   10/16/92  
698  Andres Artunedo-Martinavarr  ESP    9/14/93  

20 AND UNDER
26   Milos Raonic         CAN  12/27/90  
56   Grigor Dimitrov      BUL   5/16/91  
68   Bernard Tomic        AUS  10/21/92  
76   Ryan Harrison        USA    5/7/92  
140  Jerzy Janowicz       POL  11/13/90  
150  Cedrik-Marcel Stebe  GER   10/9/90  
185  Pablo Carreno        ESP   7/12/91  
197  Federico del Bonis   ARG   10/5/90  
209  Tsung-Hua Yang       TPE   3/20/91  
211  Facundo Arguello     ARG    8/4/92  
230  Javier Marti         ESP   1/11/92  
233  Marius Copil         ROU  10/17/90  
239  Laurynas Grigelis    LTU   8/14/91  
271  Axel Michon          FRA  12/16/90  
283  Gastao Elias         POR  11/24/90  
284  Alexander Lobkov     RUS   10/7/90  
305  Christian Lindell    SWE  11/20/91  
318  Daniel Cox           GBR   9/28/90  
319  Stefano Travaglia    ITA  12/28/91  
325  Andrey Kuznetsov     RUS   2/22/91  

22 AND UNDER
19   Juan Martin del Potro  ARG   9/23/88  
21   Alexander Dolgopolov   UKR   11/7/88  
26   Milos Raonic           CAN  12/27/90  
29   Marin Cilic            CRO   9/28/88  
48   Kei Nishikori          JPN  12/29/89  
55   Ernests Gulbis         LAT   8/30/88  
56   Grigor Dimitrov        BUL   5/16/91  
68   Bernard Tomic          AUS  10/21/92  
76   Ryan Harrison          USA    5/7/92  
89   Donald Young           USA   7/23/89  
107  Thiemo de Bakker       NED   9/19/88  
115  Thomas Schoorel        NED    4/8/89  
119  Benoit Paire           FRA    5/8/89  
134  Richard Berankis       LTU   6/21/90  
138  Martin Klizan          SVK   7/11/89  
140  Jerzy Janowicz         POL  11/13/90  
150  Cedrik-Marcel Stebe    GER   10/9/90  
155  Vasek Pospisil         CAN   6/23/90  
156  Evgeny Donskoy         RUS    5/9/90  
161  Vladimir Ignatik       BLR   7/14/90
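
For anyone who wants to reproduce lists like these, here is a minimal sketch of the age-group filter.  It assumes that age is measured in whole years on the ranking date (8 August 2011), which is my guess at the cutoff rather than a documented rule.  The function names are hypothetical, and the sample rows are copied from the lists above.

```python
from datetime import date

def age_on(birth: date, as_of: date) -> int:
    """Age in whole years on `as_of`."""
    had_birthday = (as_of.month, as_of.day) >= (birth.month, birth.day)
    return as_of.year - birth.year - (0 if had_birthday else 1)

def age_group_list(players, max_age, as_of, size=20):
    """Top `size` players by ranking who are `max_age` or younger on `as_of`."""
    eligible = [p for p in players if age_on(p[2], as_of) <= max_age]
    return sorted(eligible, key=lambda p: p[0])[:size]

AS_OF = date(2011, 8, 8)  # assumed cutoff: the date in the post title

# (ranking, name, birthdate): a handful of rows from the lists above
players = [
    (19, "Juan Martin del Potro", date(1988, 9, 23)),
    (26, "Milos Raonic", date(1990, 12, 27)),
    (56, "Grigor Dimitrov", date(1991, 5, 16)),
    (68, "Bernard Tomic", date(1992, 10, 21)),
    (629, "Jiri Vesely", date(1993, 7, 10)),
    (669, "Liam Broady", date(1994, 1, 4)),
]

for max_age in (18, 20, 22):
    names = [name for _, name, _ in age_group_list(players, max_age, AS_OF)]
    print(f"{max_age} and under: {names}")
```

Because each list is a simple "this age or younger" filter, a player like Tomic shows up in all three.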

Video: The Present and Future of Statistics in Tennis

Last Tuesday, I gave a talk at the Longwood Cricket Club in Boston about tennis statistics.  Many thanks to Rick Devereaux for extending the invitation, and to everyone at Longwood for their hospitality.  (And for their beautiful grass courts!)

In the talk, I discuss the value of different types of statistics in sports, what tennis stats are out there now, and what we can expect in the not-too-distant future. I also detour into baseball analysis to show some of the potential for research in tennis.  It’s about 36 minutes long.

Apologies for the video quality–the room was dark to accommodate the projector, and my handy little Flip camera could only do so much.  Still, the audio is generally clear.

Enjoy!

ATP Montreal Predictions

The big boys are back in action with this week’s Masters 1000 tournament in Montreal.  I’ve updated my rankings and generated some predictions for this week’s matches.  My system doesn’t give any credit to defending champions, so Andy Murray is in a distant fourth, while Djokovic’s chances of winning the tournament are lessened a bit because he landed in the same half of the bracket as Federer.

If you’re visiting for the first time, or the first time since last week, you may be interested in the variety of content I posted over the weekend:

Once you’ve caught up, enjoy my forecast for this week’s Rogers Cup, below.

Player                        R32    R16     QF         W  
(1)Novak Djokovic          100.0%  81.1%  59.3%    23.27%  
Nikolay Davydenko           78.8%  17.5%   7.9%     0.70%  
(q)Flavio Cipolla           21.2%   1.4%   0.2%     0.00%  
Andreas Seppi               30.4%   6.9%   1.0%     0.02%  
Marin Cilic                 69.6%  27.5%   7.2%     0.61%  
Jarkko Nieminen             14.9%   4.9%   0.7%     0.01%  
(16)Juan Martin Del Potro   85.1%  60.6%  23.8%     5.21%  
                                                           
Player                        R32    R16     QF         W  
(12)Viktor Troicki          81.3%  40.4%  19.1%     0.49%  
(q)Michael Yani             18.7%   3.3%   0.6%     0.00%  
Marcos Baghdatis            57.5%  34.7%  17.9%     0.78%  
John Isner                  42.5%  21.7%   9.7%     0.19%  
(q)Alex Bogomolov Jr        41.7%  10.7%   3.4%     0.02%  
Adrian Mannarino            58.3%  19.1%   7.8%     0.10%  
(5)Gael Monfils            100.0%  70.3%  41.5%     2.19%  
                                                           
Player                        R32    R16     QF         W  
(3)Roger Federer           100.0%  91.7%  63.9%    15.48%  
Juan Ignacio Chela          52.4%   4.4%   0.9%     0.00%  
(WC)Vasek Pospisil          47.6%   3.9%   0.7%     0.00%  
(WC)Bernard Tomic           67.8%  29.9%   9.1%     0.51%  
(LL)Yen Hsun Lu             32.2%   9.3%   1.6%     0.02%  
Fabio Fognini               17.3%   5.3%   0.8%     0.01%  
(13)Jo Wilfried Tsonga      82.7%  55.4%  22.9%     2.84%  
                                                           
Player                        R32    R16     QF         W  
(10)Richard Gasquet         52.9%  34.9%  19.5%     0.60%  
Florian Mayer               47.1%  30.1%  16.9%     0.51%  
Andrey Golubev              42.9%  13.9%   5.3%     0.04%  
Thomaz Bellucci             57.1%  21.1%   9.1%     0.10%  
Sergiy Stakhovsky           38.9%  16.3%   6.6%     0.06%  
Philipp Kohlschreiber       61.1%  31.5%  16.4%     0.37%  
(8)Nicolas Almagro         100.0%  52.1%  26.1%     0.46%  
                                                           
Player                        R32    R16     QF         W  
(6)Mardy Fish              100.0%  67.4%  43.5%     3.65%  
Feliciano Lopez             53.7%  16.4%   8.0%     0.16%  
(SE)Radek Stepanek          46.3%  16.1%   7.0%     0.11%  
(WC)Ernests Gulbis          78.1%  41.7%  18.7%     0.56%  
Juan Carlos Ferrero         21.9%   5.3%   0.9%     0.00%  
Michael Llodra              46.3%  23.9%   8.8%     0.20%  
(11)Mikhail Youzhny         53.7%  29.1%  13.1%     0.44%  
                                                           
Player                        R32    R16     QF         W  
(14)Stanislas Wawrinka      59.5%  48.6%  20.7%     1.84%  
David Nalbandian            40.5%  30.2%   9.1%     0.39%  
(q)Michael Russell          37.4%   6.0%   0.7%     0.00%  
Albert Montanes             62.6%  15.2%   2.5%     0.02%  
Pablo Andujar               23.1%   1.4%   0.2%     0.00%  
Kevin Anderson              76.9%  12.7%   4.6%     0.08%  
(4)Andy Murray             100.0%  86.0%  62.1%    12.44%  
                                                           
Player                        R32    R16     QF         W  
(7)Tomas Berdych           100.0%  60.8%  37.2%     2.20%  
Alexandr Dolgopolov         93.0%  38.8%  21.6%     0.78%  
(WC)Erik Chvojka             7.0%   0.4%   0.0%     0.00%  
Ivo Karlovic                38.8%  15.2%   4.7%     0.04%  
Juan Monaco                 61.2%  27.4%  10.4%     0.21%  
(q)Philipp Petzschner       34.4%  16.3%   5.5%     0.07%  
(9)Gilles Simon             65.6%  41.1%  20.7%     0.88%  
                                                           
Player                        R32    R16     QF         W  
(15)Fernando Verdasco       72.5%  44.2%  12.1%     0.65%  
(q)Tobias Kamke             27.5%   9.8%   1.4%     0.01%  
Janko Tipsarevic            70.6%  36.4%   9.8%     0.39%  
(q)Alejandro Falla          29.4%   9.7%   1.4%     0.01%  
Ivan Dodig                  44.4%   6.5%   2.7%     0.04%  
Jeremy Chardy               55.6%   9.7%   4.6%     0.13%  
(2)Rafael Nadal            100.0%  83.8%  68.0%    20.10%
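
To give a sense of how round-by-round numbers like these can be derived, here is a rough sketch of the standard bracket calculation.  It is not the system behind the table above: the Elo-style `win_prob` formula, the ratings, and the function names are all assumptions made up for illustration.  The idea is that a player's chance of winning a given round equals his chance of reaching it, times the sum over possible opponents of (opponent's chance of reaching it) x (chance of beating that opponent).

```python
def win_prob(rating_a: float, rating_b: float) -> float:
    """Hypothetical head-to-head model: logistic in the rating difference."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def round_probabilities(ratings):
    """Given ratings in bracket order (length a power of two), return a list
    of rounds; rounds[r][i] is the chance that player i wins r+1 matches."""
    n = len(ratings)
    reach = [1.0] * n   # everyone starts in the first round
    rounds = []
    block = 1           # size of each player's already-resolved sub-bracket
    while block < n:
        advance = [0.0] * n
        for i in range(n):
            # Possible opponents come from the sibling block of the same size.
            start = ((i // block) ^ 1) * block
            for j in range(start, start + block):
                advance[i] += reach[j] * win_prob(ratings[i], ratings[j])
            advance[i] *= reach[i]
        rounds.append(advance)
        reach = advance
        block *= 2
    return rounds

# Made-up ratings for a four-player bracket: 0 plays 1, 2 plays 3.
ratings = [2000.0, 1800.0, 1900.0, 1700.0]
for r, probs in enumerate(round_probabilities(ratings), start=1):
    print(f"wins {r} match(es):", [f"{p:.1%}" for p in probs])
```

In the table above, the R32 column plays the role of the first round of output here (the chance of winning one's opening match), and the W column corresponds to the last.  A first-round bye could be modeled with a placeholder opponent who always loses.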

Do Points Get Shorter as the Match Progresses?

On Friday, some interesting ideas were batted around in the comments to my post on the 61-shot rally.  One of the simpler ones boils down to the question that titles today’s post: Do points get shorter as the match progresses?

Two forces seem to work in opposite directions:

  • As players get used to each other’s games (and specifically their serves), more balls get returned.  Before looking at the numbers, I would’ve bet that this was the case, meaning that aces and service winners decline as you go deeper into a match.
  • The longer the match, the more tired the players.  Tired (or even slightly injured) players take more risks and probably have shorter rallies.

To answer the question, I looked at rally lengths shown in Pointstream at the last three grand slams.  That gives us close to 250 men’s matches, all best-of-five sets.

The short, unsatisfying conclusion is: The results are mixed.  At Wimbledon and Roland Garros, rally length increased later in matches, by as much as 10% in London and 20% in Paris.  At the Australian Open, the result was the exact opposite, with rally length decreasing substantially.  Perhaps rally length increases in most cases, except when it is extremely hot or the players are not yet in top shape.  The blistering heat in Melbourne is certainly a plausible reason for a decrease in rally length.

As we’ll see when we move into more specific findings, the results get even more jumbled.  It seems that points generally get longer as a match progresses, but not necessarily because players read and return serves better.  While rally lengths increase, the number of one-stroke points (aces, service winners, service return errors) often increases, as well.
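
The two measures in that last sentence can move in the same direction, which is easier to see with a toy calculation.  The sketch below groups points by set and reports both mean rally length and the share of one-stroke points.  The point data is invented (it is not Pointstream output), and counting a one-stroke point as a rally length of 1 is an assumption about how the data is coded.

```python
from collections import defaultdict
from statistics import mean

def rally_stats_by_set(points):
    """points: iterable of (set_number, rally_length) pairs, where
    rally_length counts the strokes in the point.
    Returns {set_number: (mean rally length, share of one-stroke points)}."""
    by_set = defaultdict(list)
    for set_no, length in points:
        by_set[set_no].append(length)
    return {
        s: (mean(lengths), sum(1 for n in lengths if n == 1) / len(lengths))
        for s, lengths in sorted(by_set.items())
    }

# Made-up points for illustration only.
points = [(1, 1), (1, 4), (1, 7), (1, 2), (2, 1), (2, 1), (2, 9), (2, 5)]
for set_no, (avg_len, one_stroke) in rally_stats_by_set(points).items():
    print(f"set {set_no}: mean rally {avg_len:.2f}, one-stroke share {one_stroke:.0%}")
```

In this invented sample, the second set has more one-stroke points and a longer average rally, because the points that do get going last longer.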

Follow the jump for my methodology and full results.
