The Effect of One More MPH

Italian translation at settesei.it

All else equal, increasing your first serve speed is a good thing … so how useful is it?  Earlier this week, I published some generic numbers, but those are far too crude to answer this question.

To get a better answer, we need to see what happens when specific players serve a little faster or slower.  Sometimes, players dramatically mix up serve speed (as with slice serves wide), but most of the time, each player stays within a fairly limited range defined by his own power and skill.

The algorithm I’ve employed is fairly complicated, so I’ll give you the results first.

It appears that most players, if they increased their average serve speed by one mile per hour, would win 0.2% more first service points.  That’s not many–it’s not even one point in every match.  But every little bit helps, and according to my win probability models, winning 0.2% more first serve points can increase your chance of winning an even match from 50% to just short of 51%.  Except possibly at the extremes, that continues to be the case for 2 MPH, 3 MPH, and greater increases–so a 5 MPH increase takes that 50/50 match and turns it into a 54/46 contest.

(One assumption here is that all players respond to increases in serve speed the same way.  I’m sure that’s not true, but at this stage it’s a necessary assumption.)

The effect of a speed increase is even greater on ace and service winner rates.  Each additional MPH on a player’s serve increases his ace rate by about 0.4%, and his service winner rate by about 0.5%.

Now for the algorithm and some caveats.

Process

The algorithm was designed to control (to the extent possible) for different serving and playing styles, for the different average speeds to the deuce and ad courts, and for different serve directions (wide, body, and T).

I used only US Open data, to avoid differences between surfaces and between the speed guns used at different events.  I used data only from the 18 players who had more than 150 first-serve points tracked by Pointstream.  For each of those players, I found their average first-serve speed for each of six directions: wide, body, and T to the deuce and ad courts.  Then, I randomly selected 150 of their first-serve points, and for each point, noted the difference between the point’s serve speed and the player’s average in the relevant court/direction.

Thus, every one of the 2,700 points was labeled 0 (average for that player/court/direction), or +1 (one mph above average), or -4, and so on.  That results in a separate pool of points for each label.  Many of the pools were too small for useful analysis, so I grouped them in overlapping sets of five: (-2, -1, 0, +1, +2), (-1, 0, +1, +2, +3), and so on.  The pools, then, were useful from about -15 to +15.

From there, I looked at each of several stats (points won, aces, service winners) for each pool, and compared the rates from one pool to the next.  The results were somewhat erratic–in some instances, an additional mph resulted in aces or points won going down, but over the set of 31 pools, they generally went up.  The numbers presented above are the averages of each one-mph change.
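The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the original code: the record layout (`player`, `court`, `direction`, `mph`, `won`) and every function name are assumptions made for the example, and it tracks only points won.

```python
from collections import defaultdict
from statistics import mean


def label_points(points):
    """Label each point with its speed relative to the server's own average.

    `points` is a list of dicts with hypothetical keys: player, court,
    direction, mph, won. Returns (label, won) pairs, where label is the
    rounded mph difference from that player's average in the same
    court/direction.
    """
    speeds = defaultdict(list)
    for pt in points:
        speeds[(pt["player"], pt["court"], pt["direction"])].append(pt["mph"])
    averages = {key: mean(vals) for key, vals in speeds.items()}

    labeled = []
    for pt in points:
        key = (pt["player"], pt["court"], pt["direction"])
        labeled.append((round(pt["mph"] - averages[key]), pt["won"]))
    return labeled


def pooled_rate(labeled, center, half_width=2):
    """Win rate over the five-label window around `center` (None if empty)."""
    outcomes = [won for label, won in labeled if abs(label - center) <= half_width]
    return sum(outcomes) / len(outcomes) if outcomes else None


def average_step(labeled, lo=-15, hi=15):
    """Average change in win rate between adjacent pools from `lo` to `hi`."""
    rates = [pooled_rate(labeled, c) for c in range(lo, hi + 1)]
    steps = [b - a for a, b in zip(rates, rates[1:])
             if a is not None and b is not None]
    return mean(steps) if steps else None
```

One property worth noting: when every pool is non-empty, the average of the one-mph steps telescopes to the difference between the two end pools divided by the number of steps, so the result leans heavily on the pools at the edges of the range.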

Caveats

It’s not a very big sample, especially when separating serves into pools of 0, +1, +2, and so on.

One issue with the dataset is that the 18 servers were usually winning–that’s how they got enough first serves to merit inclusion.  Thus, the average returner in the dataset is below average.  That isn’t necessarily a bad thing–perhaps below-average returners respond to changes in serve speed the way above-average returners do–but without more data, it’s tough to know.

Another concern is what the numbers really tell us below about 5 mph slower than average.  The algorithm operates on the assumption that a 120 mph serve is the same as a 121 mph serve, only slower.  Comparing 120 and 121, that’s probably true.  But comparing 120 and 108–for the same player, serving in the same direction–it probably isn’t.  The 108 mph serve isn’t a simulation of what would happen if the player weren’t as good; it’s probably a strategic choice, likely accompanied by some spin.

That said, the algorithm doesn’t directly compare 120 and 108; it compares 108 and 109, and perhaps in the aggregate, there is something useful to be gleaned from comparing a strategic spin first serve to an identical serve one mph faster.  In any event, limiting the range to between -10 and 10, or even -7 and 7, doesn’t change the results much.

Finally, the sample is completely inadequate to tell us what happens at the extremes.  The average player appears to improve his chances by adding another bit of speed, but does John Isner?  There may be a ‘sweet spot’ where a player can get maximum gains from an additional 1, 5, or 10 mph on his first serves, but beyond which, the gain is more limited.

US Open Serve Speed by Player

It’s time for more serve-speed research notes. Most of the matches at the 2011 U.S. Open were tracked by Pointstream, and serve speed was recorded for the vast majority of those points. The Open website published some serve speed numbers, but not as conveniently as I would like.

Below, find the average first and second serve speeds for every man who played three or more Pointstream-tracked matches. Oddly enough, the top and bottom of the list are held by Americans; John Isner is where you’d expect him, while Donald Young barely kept his first-serve average in the triple digits.

I didn’t expect to see nearly so much variation in the difference between first and second serve averages. Sure, Isner and Young are the endpoints in both lists, but David Nalbandian–below average on firsts–is third of 22 on seconds. To take another angle, Marin Cilic and Jo-Wilfried Tsonga each have about double, or more, the difference in averages of either Alex Bogomolov or Fernando Verdasco.

(“M” is the number of matches tracked by Pointstream for each player.)

Player                 M  1sts  1stAvg  2nds  2ndAvg  
John Isner             4   313   124.5   125   106.2  
Andy Roddick           5   249   122.1   118   100.5  
Tomas Berdych          3    85   120.3    71    95.0  
Jo-Wilfried Tsonga     5   289   119.7   206    90.6  
Marin Cilic            3   125   118.7   121    86.3  
Janko Tipsarevic       3   148   116.5    84    90.5  
Roger Federer          6   355   115.6   186    94.6  
Juan Martin Del Potro  3   180   114.5    96    88.2  
Julien Benneteau       3   177   114.0    86    89.9  
Tommy Haas             3   211   113.9   124    94.1  
Novak Djokovic         7   421   113.7   226    91.4  

Player                 M  1sts  1stAvg  2nds  2ndAvg
Andy Murray            6   338   112.6   204    85.2  
Mardy Fish             4   231   112.4   165    88.0  
David Nalbandian       3   165   112.3   125    96.1  
David Ferrer           3   128   112.2    74    88.9  
Rafael Nadal           7   435   110.5   176    84.5  
Juan Monaco            3   167   109.4    70    90.4  
Gilles Simon           3   235   108.3   179    81.6  
Fernando Verdasco      3   175   107.3    72    92.6  
Alex Bogomolov Jr.     3   264   103.1    96    89.1  
Donald Young           4   213   101.9   111    80.6

The Effect of Serve Speed

Italian translation at settesei.it

All else equal, you want to serve harder. But how much does it really matter?

That’s a more difficult question than it sounds, and I don’t yet claim to have an answer. In the meantime, I can share the results of some data crunching.

In 2011 U.S. Open matches covered by Pointstream, there were more than 9,000 first serve points. The server won almost exactly 70% of those points. About 11% of points were aces, and another 24% were service winners.

To see the effect of serve speed, I looked at four outcomes: aces, service winners, short points (three or fewer shots), and points won. It’s no surprise that each type of result generally happens more on faster serves.

Below, find the full numbers for serves of various speeds. The finding that sticks out to me is the small change in service points won from the 95-99 MPH group to the 115-119 MPH group. It may be that the modest increase–put another way, the surprising success rate at 95-104 MPH–is a result of strategic wide serves, or the better ground games of the players who hit slower serves.

So as I said, there’s much more work to be done, identifying the effects of faster serves for individual players, looking at deuce/ad court differences (for righties and lefties), and the results on different serve directions.

MPH      SrvPts   Ace%  SvcW%  Short%  PtsWon%  
85-89       140   2.1%  17.9%   47.1%    55.0%  
90-94       275   0.7%  21.5%   47.6%    63.6%  
95-99       546   2.2%  18.5%   48.4%    66.1%  
100-104     885   4.2%  24.6%   51.0%    66.0%  
105-109    1400   6.4%  29.3%   56.6%    68.7%  
110-114    1524   8.7%  34.0%   57.3%    69.1%  
115-119    1487  12.2%  35.9%   60.8%    69.4%  
120-124    1553  16.1%  40.1%   65.2%    73.2%  
125-129     941  21.5%  48.1%   72.4%    76.3%  
130-134     353  29.7%  58.4%   77.3%    84.4%  
135-139      66  27.3%  65.2%   80.3%    89.4%
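As a quick consistency check, the overall rates quoted above can be recomputed from the table by weighting each speed band by its number of points. This sketch uses only the rows shown; serves outside 85-139 MPH are not in the table, so the totals cover the listed bands only. The names are illustrative.

```python
# Rows copied from the table above:
# (speed band, first-serve points, ace %, service winner %, points won %)
ROWS = [
    ("85-89", 140, 2.1, 17.9, 55.0),
    ("90-94", 275, 0.7, 21.5, 63.6),
    ("95-99", 546, 2.2, 18.5, 66.1),
    ("100-104", 885, 4.2, 24.6, 66.0),
    ("105-109", 1400, 6.4, 29.3, 68.7),
    ("110-114", 1524, 8.7, 34.0, 69.1),
    ("115-119", 1487, 12.2, 35.9, 69.4),
    ("120-124", 1553, 16.1, 40.1, 73.2),
    ("125-129", 941, 21.5, 48.1, 76.3),
    ("130-134", 353, 29.7, 58.4, 84.4),
    ("135-139", 66, 27.3, 65.2, 89.4),
]


def weighted_pct(rows, column):
    """Average of a percentage column, weighted by points in each band."""
    total = sum(row[1] for row in rows)
    return sum(row[1] * row[column] for row in rows) / total


total_points = sum(row[1] for row in ROWS)
overall_ace = weighted_pct(ROWS, 2)
overall_svc_winner = weighted_pct(ROWS, 3)
overall_won = weighted_pct(ROWS, 4)

print(total_points, round(overall_ace, 1), round(overall_svc_winner, 1),
      round(overall_won, 1))
```

The weighted service winner figure comes out near 35%, which matches 11% aces plus "another 24%" service winners. That suggests (though the table does not say so explicitly) that the SvcW% column counts aces as well as other unreturned serves.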

Quantifying Comebacks and Excitement With Win Probability

Italian translation at settesei.it

As promised the other day, there’s a lot we can do with point-by-point and win probability stats for over 600 grand slam matches.

I’ve beefed up those pages a bit by borrowing some ideas from Brian Burke at Advanced NFL Stats.  He invented a couple of simple metrics using win probability stats to compare degrees of comebacks and the excitement level of (American) football games.

The concepts transfer to tennis quite nicely.  Comeback Factor identifies the odds against the winner at his lowest point.  I’ve defined it the same way Burke does for football: CF is the inverse of the winning player’s lowest win probability.  In the US Open Federer/Djokovic semifinal, Djokovic’s win probability was as low as 1.3%, or 0.013.  Thus, his comeback factor is 1/.013, or about 77.  That’s about as high a comeback factor as you’ll ever see.

On the other end, comeback factor cannot go below 2.0 — that’s the factor if the winning player’s WP never fell below 50%.  Matches in which the winner dominated are often very close to 2.0, as in the Murray/Nadal semifinal.  In that match, Nadal’s low point was facing a single break point at 2-3 in the first set; the comeback factor is 2.3.

A good way to think about comeback factor is this: “At his lowest point, the winning player faced odds of 1 in [CF].”
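In code, the definition is a one-liner. The sketch below is illustrative (the function name and input format are assumptions): it takes the winner's win probability after each point and includes the 50% pre-match starting value, which is what keeps the factor from dropping below 2.0.

```python
def comeback_factor(winner_wp_series, starting_wp=0.5):
    """Inverse of the winner's lowest win probability during the match.

    `winner_wp_series` holds the eventual winner's win probability after
    each point, as fractions between 0 and 1. The pre-match value
    (50% under the equal-players assumption) counts as a candidate low.
    """
    lowest = min([starting_wp, *winner_wp_series])
    if lowest <= 0:
        raise ValueError("win probability must stay above zero for the winner")
    return 1 / lowest
```

With the rounded 1.3% low quoted for Djokovic, this returns roughly 76.9; a figure computed from an unrounded probability would differ slightly.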

Excitement Index is a measure of volatility, or the average importance of each point in a match.  “Volatility” measures the importance of each individual point; EI is the average volatility over the course of a match.

(Burke sums the volatilities, reasoning that in football, a fast-paced game with many plays is itself exciting.  Since there is no clock in tennis [not exactly, anyway], it seems appropriate to average the volatilities.  Win probability already considers the excitement and importance of a deciding final set.)

At the moment, I’m calculating EI by multiplying the average volatility by 1000.  The Murray/Nadal match is 35 (not very exciting, though Murray fought back), the Djokovic/Federer match is 47 (more on that in a minute), while the 2nd rounder between Donald Young and Stanislas Wawrinka is 64.  I haven’t looked at all the matches yet, but EI should generally fall between 10 and 100, possibly exceeding 100 in rare instances like the Isner/Mahut marathon.
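A minimal sketch of that calculation, with hypothetical names, alongside a Burke-style sum for contrast. Volatilities are assumed to be fractions (0.04 for a four-point swing in win probability), one per point played.

```python
from statistics import mean


def excitement_index(volatilities, scale=1000):
    """Average per-point volatility, scaled up for readability."""
    return scale * mean(volatilities)


def summed_excitement(volatilities):
    """Football-style alternative: total volatility, which rewards length."""
    return sum(volatilities)
```

Averaging means a long match and a short one with the same typical point importance get the same EI, while the summed version would rank the longer match higher.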

It seems like Djok/Fed should be higher, perhaps because we remember the excitement of the final set.  (And it may be that the final set should be weighted accordingly.)  But looking at the match log, there were an awful lot of quick games, which translate to relatively low volatility.  By contrast, Donald/Stan was more topsy-turvy throughout, as the players traded sets, then sent volatility through the roof with a pair of breaks midway through the final set.

Both EI’s scaling and its exact definition are works in progress.  When I get a chance, I’ll do a survey of matches for which I have point-by-point data to further investigate both of these new (to tennis) metrics.

Win Probability Graphs and Stats

Win probability graphs and stats are now available for over 600 grand slam matches from 2011.  Thanks to IBM Pointstream’s tracking of this year’s slams, there is more data available than ever before.

Here’s the main menu.

Here’s a sample match: The US Open semifinal between Federer and Djokovic.

When I first started publishing tennis research, win probability was one of my focuses.  You can find earlier work here, which links to specific tables for games, sets, and tiebreaks.  I’ve also published much of the relevant code, which is written in Python.

Win probability represents the odds of each player winning after every point of the match, based on the score up to that point and which player is serving. It makes no assumptions about the specific skill level of each player, but does assume that the server has an advantage, which varies based on surface and gender.  With every point, each player’s win probability goes up or down, and the degree to which it rises or falls is dependent on the importance of the point–at 4-1, 40-0, winning the point is nice, but losing the point just delays the inevitable; at 5-6 in a tiebreak, the potential change in win probability is huge.

To quantify that in the graphs, I show another metric: Volatility, which measures the importance of each point. It is equal to the difference in win probabilities between the server winning and losing the following point. 10 percent is exciting, 20 percent is crucial, and 30 percent is edge-of-your-seat stuff.
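To make the definition concrete, here is a simplified sketch that works at the level of a single game rather than a whole match. It is an illustration under stated assumptions, not the code behind the graphs: the function names are made up, `p` is the server's chance of winning any one point, and the "win probability" here is the chance of winning the game. The graphs apply the same idea to match-win probability.

```python
from functools import lru_cache


def game_win_prob(p, server_pts=0, returner_pts=0):
    """Chance the server wins the game from the given point score.

    Points are counted 0, 1, 2, 3 (love, 15, 30, 40); p is the server's
    probability of winning each point, assumed constant.
    """
    q = 1 - p

    @lru_cache(maxsize=None)
    def win(a, b):
        if a >= 4 and a - b >= 2:
            return 1.0
        if b >= 4 and b - a >= 2:
            return 0.0
        if a >= 3 and a == b:
            # Deuce: win two points in a row before the opponent does.
            return p * p / (p * p + q * q)
        return p * win(a + 1, b) + q * win(a, b + 1)

    return win(server_pts, returner_pts)


def point_volatility(p, server_pts, returner_pts):
    """Swing in game-win probability between winning and losing the next point."""
    if_won = game_win_prob(p, server_pts + 1, returner_pts)
    if_lost = game_win_prob(p, server_pts, returner_pts + 1)
    return if_won - if_lost
```

With evenly matched point odds (`p = 0.5`), the point at deuce swings the game by 50 percentage points, while the point at 40-0 swings it by only 12.5.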

Assumptions

To produce these numbers, I needed to make several simplifying assumptions.  Some are more important than others; here are the big two:

  • The players are equal.
  • Each player’s ability does not vary from point to point.

The first of these is almost always false, and the second is probably false as well.  The first, however, makes things more interesting.  In most matches Novak Djokovic plays these days, he goes in with an 80-percent-or-better chance of winning.  If we graphed one of his matches starting at 85 percent, we’d usually get a very slowly ascending line.  Instead, by starting at 50 percent, we can see where he and his opponent had their biggest openings, and who took advantage.

(In this long-ago post, I showed a sample graph with an assumption similar to the 85 percent for Djokovic, and you can see some of what I mean.)

Assuming that the players are equal also sidesteps the messy question of how to quantify each player’s skill level on that day, on that surface, against that opponent.

The second big assumption ignores possible real-world attributes like clutch performance and streakiness, along with more pedestrian considerations like some players’ stronger serving in the deuce or ad court.

Another long-ago article of mine suggests that servers are not absolutely consistent, possibly because of natural rises and falls in performance, also possibly because of risk-taking (or lack of concentration) in low-pressure situations.  One of the most interesting directions for research with these stats is into this inconsistency: We need to figure out whether some players are more consistent than others, whether “clutch” exists in tennis, and much more.

One more set of assumptions regards the server’s advantage.  Since these graphs only encompass the four grand slams, I set the server’s win percentage for each tournament.  The numbers I used for men are: 63% in Australia, 61% at the French, 66% at Wimbledon, and 64% at the U.S. Open.  I used percentages two points lower for women at each event.
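For a sense of what those percentages imply, the sketch below stores them in a lookup table and converts a point-win rate into the chance of holding serve, using the standard closed-form expression for a game with constant, independent point probabilities. The dictionary keys and function name are illustrative choices, not taken from the published code.

```python
# Server's point-win rates quoted above for the men.
MEN_SERVE_PCT = {
    "Australian Open": 0.63,
    "French Open": 0.61,
    "Wimbledon": 0.66,
    "US Open": 0.64,
}

# The women's rates are two percentage points lower at each event.
WOMEN_SERVE_PCT = {event: round(p - 0.02, 2) for event, p in MEN_SERVE_PCT.items()}


def hold_probability(p):
    """Chance of winning a service game when each point is won with probability p.

    Sums the ways to win at love, 15, or 30, plus reaching deuce (three
    points each) and then winning from deuce.
    """
    q = 1 - p
    win_before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    reach_deuce = 20 * p**3 * q**3
    win_from_deuce = p**2 / (p**2 + q**2)
    return win_before_deuce + reach_deuce * win_from_deuce


for event, p in MEN_SERVE_PCT.items():
    print(f"{event}: {hold_probability(p):.1%}")
```

Under this model, the 64% figure for the U.S. Open corresponds to holding serve a little over 81% of the time.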

More on Win Probability

There’s very little out there on win probability and volatility in tennis.  I wasn’t the first person to work out the probability of winning a game, a set, or a match from a given score, but as far as I know, I’m the only person publishing graphs like this.  Much of the problem is the limited availability of play-by-play descriptions for professional tennis.

That problem doesn’t apply to baseball, where win probability has thrived for years.  Here’s a good intro to win probability stats in baseball, and fangraphs.com is known for its single-game graphs–for instance, here’s tonight’s Brewers game.  In many ways, win probability is more interesting in baseball than in tennis.  In tennis, there are only two possible outcomes of each point, while in baseball, there are several possible outcomes of each at-bat.

Enjoy the graphs and stats!

The Speed of Every Surface

Italian translation at settesei.it

Last week, I wrote an article for the Wall Street Journal noting the relatively slow speed of this year’s U.S. Open.  It’s not clear whether the surface itself is the cause, or whether the main factor is the humidity from Hurricane Irene and Tropical Storm Lee.  For whatever reason, aces were lower than usual, creating an environment more favorable to, say, Novak Djokovic than someone like Andy Roddick.

The limited space in the Journal prevented me from going into much detail about the methodology or showing results from tournaments other than the slams.  There’s no word limit here at Heavy Topspin, so here goes…

Aces and Server’s Winning Percentage

Surface speed is tricky to measure–as I’ve already mentioned, “surface speed” is really a jumble of many factors, including the court surface, but also heavily influenced by the atmosphere and altitude.  (And, possibly, different types of balls.)  If you were able to physically move the clay courts in Madrid to the venue of the Rome Masters, you would get different results.  But teasing out the different environmental influences is little more than semantics–we’re interested in how the ball bounces off the court, and how that affects the style of play.

So then, what stats best reflect surface speed?  Rally length would be useful, as would winner counts–shorter rallies and more winners would imply a faster court.  But we don’t have those for more than a few tournaments.  Instead, I stuck with the basics: aces, and the percentage of points won by the server.

Important in any analysis of this sort is to control for the players at each tournament.  The players who show up for a lower-rung clay tournament are more likely to be clay specialists, and the men who get through qualifying are more likely to be comfortable on clay.  Also, the players who reach the later rounds are more likely to be better on the tournament’s surface.  Thus, the number of aces at, say, the French Open is partially influenced by surface, and partially influenced by who plays, and how much each player plays.

Thus, instead of looking at raw numbers (e.g. 5% of points at Monte Carlo were aces), I took each server in each match, and compared his ace rate to his season-long ace rate.  Then I aggregated those comparisons for all matches in the tournament.  This allows us to measure each tournament’s ace rate against a neutral, average-speed surface.
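One plausible way to do that aggregation is to compare the aces actually hit with the number each server would have been expected to hit at his season-long rate. The sketch below takes that approach; weighting by points served is an assumption (the description above only says the comparisons were aggregated), and the function name and row format are made up for illustration.

```python
def relative_ace_rate(match_rows):
    """Tournament ace rate relative to a neutral surface.

    `match_rows` is a list of (serve_points, aces, season_ace_rate) tuples,
    one per server per match. Returns actual aces divided by the aces
    expected if every server had hit his season-long rate.
    """
    actual = sum(aces for _, aces, _ in match_rows)
    expected = sum(points * rate for points, _, rate in match_rows)
    return actual / expected


# Two servers in one match: 100 and 80 service points, 4 and 6 aces,
# against season-long ace rates of 8% and 10%.
print(relative_ace_rate([(100, 4, 0.08), (80, 6, 0.10)]))
```

A result of 1.0 would mean the event played like an average-speed surface for the players who showed up; the example rows give 0.625, in the range of the slower clay events.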

The Path to Blandness

The ace rate numbers varied widely.  While the Australian Open and this year’s US Open were close to a hypothetical neutral surface speed, other tourneys feature barely half the average number of aces, and still others have nearly half-again the number of aces of a neutral surface.   I’ve included a long list of tournaments and their ace rates below; you won’t be surprised to see the indoor and grass tournaments on the high end and clay events at the other extreme.

But there’s a surprise waiting.  I also calculated the percentage of points won by the server, and like ace rate, I controlled for the mix of players in every event.  While ace rate varies from 53% of average to 145% of average, the percentage of points won by the server never falls below 90% of average, rarely drops below 95%, and never exceeds 105%.  53 of the 67 tournaments listed below fall between 97% and 103%–suggesting that surface influences the outcome of only a handful of points per match.

That may defy intuition, but think back to the mix of players at each tournament.  Big-serving Americans don’t show up at Monte Carlo, while South Americans generally skip every non-mandatory event in North America.  The nominal rate at which servers win points varies quite a bit, but that’s because of the players in the mix.

Also, this finding suggests that, as a stat, aces are overrated.  They may be a useful proxy for server dominance–if a player hits 15 aces in a match, he’s probably a pretty good server–but they come nowhere near telling the whole story.  Aces on grass turn into service winners on hard courts, and then become weak returns and third-shot winners on clay.  The end result is usually the same, but Milos Raonic is a lot scarier when the serves bounce over your head.

Finally, it would be a mistake to say that a variance of 3-5% in serve points won is meaningless.  It may be less than expected, but especially between good servers, 3-5% can be the difference.  Move Saturday’s Federer/Djokovic semifinal to a surface like Wimbledon’s, and we’d be looking at a different champion.

All the Numbers

Here is the breakdown of ace rate and serve points won, compared to season average, for nearly every current ATP event.

Since I am using each season’s average, you may wonder whether the averages themselves have changed from year to year.  I’ve read that courts are getting slower, but in the five-year span I’ve studied here, the ace rate has actually crept up a tiny bit.  Each tournament varies quite a bit–probably due to weather–but generally ends up at the same numbers.

Below, find the 2011 ace rate and percentage of serve points won, as well as the average back to 2007.   Again, these are controlled for the mix of players (including how much each guy played), and the numbers are all relative to season average.

The little letter next to the tournament name is surface: c = clay, h = hard, g = grass, and i = indoor.

Tournament          2011Ace  2011Sv%    AvgAce  AvgSv%  
Estoril          c    57.5%    96.6%     53.3%   94.3%  
Monte Carlo      c    52.0%    92.1%     53.9%   91.2%  
Umag             c    58.6%    95.2%     58.7%   94.3%  
Serbia           c    54.2%    93.5%     61.0%   94.8%  
Rome             c    62.5%    95.9%     62.9%   94.4%  
Buenos Aires     c    61.9%    99.0%     62.9%   98.6%  
Houston          c    64.9%    97.2%     66.6%   96.8%  
Valencia         i                       68.0%   96.4%  
Barcelona        c    55.7%    94.3%     68.0%   96.2%  
Dusseldorf       c    45.7%    96.5%     72.8%   97.2%  

Hamburg          c    78.0%    96.6%     74.3%   96.4%  
Bastad           c    63.8%    94.5%     76.8%   97.7%  
Roland Garros    c    78.0%    98.4%     77.1%   97.5%  
Santiago         c    84.5%    98.5%     81.5%   99.4%  
Costa do Sauipe  c    83.4%   101.7%     84.2%   98.9%  
Nice             c    88.5%    97.4%     84.3%   98.1%  
Casablanca       c    79.1%    99.0%     84.9%   98.2%  
Acapulco         c    70.9%    95.6%     86.0%   98.7%  
Madrid           c    77.0%    98.5%     86.1%   98.0%  
Munich           c    87.9%   100.1%     86.5%  100.0%  

Beijing          h                       86.7%   97.3%  
Los Angeles      h    84.7%    97.2%     87.7%   97.3%  
Kitzbuhel        c    95.8%    97.9%     89.0%   98.6%  
Toronto          h                       89.6%   98.3%  
Chennai          h    82.3%    98.0%     89.6%   98.7%  
Stuttgart        c    77.0%    95.8%     89.7%   98.1%  
Indian Wells     h    88.9%    99.0%     90.9%   98.0%  
Doha             h   125.5%   101.9%     91.2%   97.6%  
Auckland         h   103.1%   102.0%     93.9%   98.7%  
Miami            h    94.5%    97.9%     94.4%   98.0%  

Shanghai         h                       94.6%   98.1%  
Australian Open  h    97.6%    97.3%     96.5%   96.9%  
Kuala Lumpur     h                       97.1%   97.3%  
Sydney           h   105.8%   100.0%     97.4%   99.1%  
St. Petersburg   i                       97.8%  101.7%  
Montreal         h    91.3%    98.4%     98.1%   98.2%  
Delray Beach     h   106.2%    99.9%     99.1%   98.6%  
Gstaad           c   104.5%   100.1%    101.2%  101.4%  
Dubai            h   102.7%    96.5%    103.2%   98.2%  
US Open          h   101.3%    97.4%    104.0%   98.7%  

Vienna           i                      105.8%  101.4%  
Johannesburg     h   110.0%   102.7%    106.0%  101.0%  
Washington DC    h    97.5%   100.1%    106.8%   99.8%  
Newport          g    93.3%    99.0%    107.5%  101.7%  
Winston-Salem    h   108.1%    99.6%    108.1%   99.6%  
Atlanta          h   110.0%   100.9%    108.4%   99.0%  
Bangkok          h                      110.5%  101.6%  
Cincinnati       h    96.2%    98.9%    111.7%  100.5%  
Zagreb           i   107.0%    99.2%    112.3%  102.3%  
Moscow           i                      113.0%  101.3%  

Brisbane         h   130.6%   100.3%    113.4%  100.0%  
Eastbourne       g   111.2%   101.8%    114.1%  102.9%  
Paris Indoors    i                      115.4%   99.6%  
Rotterdam        i   123.8%   103.7%    115.9%  101.0%  
Basel            i                      117.7%  101.3%  
San Jose         i   108.6%   103.0%    120.0%  102.7%  
Wimbledon        g   119.4%   102.8%    120.7%  103.0%  
Queen's Club     g   113.3%   101.8%    121.5%  103.2%  
Halle            g   122.9%   104.7%    123.2%  102.5%  
Marseille        i   127.4%   102.8%    124.2%  102.2%  

Stockholm        i                      124.4%   99.8%  
Metz             i                      124.6%  101.7%  
Tokyo            h                      124.7%  100.5%  
s-Hertogenbosch  g   110.9%   102.1%    126.3%  104.0%  
Memphis          i   117.1%   101.2%    129.1%  102.0%  
Montpellier      i                      145.4%  104.5%

US Open Draw Datasets

Earlier today, I published a thorough analysis of the last ten years of US Open draws, showing that while first and second seeds have had extremely easy first-round matchups, there is no other credible statistical evidence that suggests any nonrandom manipulation of the draw.

If you want to take a look at the draws yourself, I’ve made it easier.  The following files not only have the full draws going back to 2001, but they also include each player’s ATP or WTA ranking at the time of the tournament, their ordinal ranking among the players in the draw, the ordinal ranking of their first-round opponent, and the ordinal ranking of their best-possible second round opponent.

Click to download the files:

Here’s a quick rundown of the columns you’ll find in each sheet:

  • Year — each file contains the entire draws for the last ten years.
  • Draw Pos[ition] — numbers 1 to 128, so you can always sort the sheet to show the players in draw order.  (For instance, the #1 seed is 1, that player’s opponent is 2, and so on.)
  • Player
  • Country
  • Seed — the seeding assigned by the US Open
  • Rank [ATP/WTA] — the player’s official ranking the Monday that the tourney began.
  • Ordinal — the player’s rank among the 128 players in the field.  Last year, Shelby Rogers’s WTA ranking was 344, which made her ordinal ranking 124 out of 128.
  • 1stRdOpp — the ordinal ranking of the player’s first-round opponent.
  • Best2nd — the ordinal ranking of the player’s best possible second-round opponent.

Let me know if you find anything interesting!

Is the US Open Draw Truly Random?

Italian translation at settesei.it

Last week, an ESPN “Outside the Lines” article called into question the fairness of the U.S. Open main draw.  A researcher discovered that the top two seeds (both men and women) have gotten very easy first-round assignments.

This is one small step away from a direct accusation of draw-rigging by the USTA.  It’s a serious claim, and while the article’s author leans heavily on a single academic who supports the methodology used, it’s not at all clear that anything unacceptable is going on.

What they found

For some reason, the study focused on the top two seeds.  It’s not at all clear why it did so–I have no idea what the USTA’s motive would be for rigging the draw in favor of the top two seeds, regardless of their identity.  Sure, there were a few years when a Federer-Nadal final would have been particularly mouthwatering, or when American viewers craved a Serena-Venus showdown in Flushing, but why would the USTA be tweaking a draw in favor of Gustavo Kuerten?  Marat Safin?  Amelie Mauresmo? Dinara Safina?

For the moment, let’s set that major concern aside.  To quantify the difficulty of each player’s first-round opponents, the ESPN study invented a metric called “difficulty score.”  We’ll come back to “difficulty score” in a bit.

A simple look at the lists they assembled of first-round opponents does suggest that something untoward is going on.  In the last ten years of men’s draws, a top-two seed has faced a top-80 opponent only four times, and not once in the last five years.  Seeded players should face top-80 opponents about half the time.

If we are truly interested in the first-rounders assigned to top-two seeds, it’s clear that these players have been given an easier path than what would be statistically expected.  But it’s not yet clear that it’s anything other than good luck.

Breaking down “difficulty score”

Here’s the explanation of the metric that ESPN used:

So if a top two seed faced the 33rd-ranked player in the first round, he/she would get a difficulty score of 0.995 for that round; if he/she faced the 128th-ranked player in the first round, the score for that round would be 0.005. An average opponent (ranked around 80th or 81st), would correspond to a difficulty score near 0.500, which should be the average difficulty score over several years of draws.

I don’t understand why the ESPN study needed to switch from ordinal rankings (1 to 128) to difficulty scores between 0.005 and 0.995.  But I replicated the work using ordinal rankings instead of difficulty scores, and came up with the same results.
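For reference, one mapping that reproduces the endpoints in ESPN's description is a midpoint percentile over the 96 unseeded slots. This is a guess at the formula, not ESPN's published definition, and the function name is made up.

```python
def difficulty_score(ordinal_rank, first=33, last=128):
    """Map an opponent's ordinal rank (33..128) onto a 0-to-1 difficulty scale.

    Assumed form: the midpoint percentile of the opponent among the
    possible unseeded opponents, with tougher opponents scoring higher.
    """
    n_opponents = last - first + 1  # 96 possible first-round opponents
    return (last - ordinal_rank + 0.5) / n_opponents


print(round(difficulty_score(33), 3))   # toughest possible opponent
print(round(difficulty_score(128), 3))  # easiest possible opponent
```

Because the mapping is linear in rank, averaging difficulty scores and averaging ordinal ranks carry the same information, which is consistent with both approaches giving the same results.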

The average first round opponent for the top two seeds in each year’s men’s draw has been about the 98th-best player in the draw.  Given that seeds can draw anyone from 33 to 128, the average “should” be around 80.  With difficulty scores, ESPN says that the likelihood of the last ten years of easy draws is 0.3%.  With ordinal rankings, I found approximately the same.  The last thing the sports-analysis world needs is another superfluous metric, but at least this one doesn’t appear to be misleading.

What about better reasons for rigging?

The core problem here is this: Why do we care specifically about the draws for the first two seeds?  Or, why would the USTA care enough to compromise the fairness of the draw?

As ESPN highlighted, some of the first-round victims are American wild cards.  Scoville Jenkins, for instance, was fed to the wolves twice, once each against Federer and Roddick.  If we’re really fishing for an explanation, perhaps the USTA wants to put up-and-coming stars such as Jenkins, Devin Britton, and Coco Vandeweghe on a big stage, either to showcase these players, or to make otherwise pedestrian blowouts more interesting.  I suppose I’d rather watch Nadal play Jack Sock than, say, Diego Junqueira.

But that’s ex post facto reasoning of the most blatant sort.  If the USTA were going to rig the draw, wouldn’t they be more likely to do so in favor of top Americans?  Or in favor of a broader range of seeds, to better ensure marquee matchups for the second week?  Or rig second-round matchups for top players, to ensure that the big names make it to the middle weekend?

If no evidence of draw manipulation appears in any of those other scenarios, it would seem that ESPN discovered something more like the famous correlation between the S&P 500 and butter production in Bangladesh.  If your search for a newsworthy conclusion is sufficiently wide, you’re bound to find something.

The top seeds

As I’ve said, there’s no doubt that the top two seeds in the men’s draw have had an easy go of it in the last ten years, since the draws started seeding 32 players instead of 16.  The same is true of the women.

The top two seeds in both the men’s and women’s draws have faced first-round opponents who ranked, on average, roughly 98th in the 128-player field.  The odds of this happening on either side are tiny–about 0.25%.  The chances that a single tournament would randomly produce draws so easy for the top two men and women for ten years are effectively zero.

Beyond the top two, however, any suspicions quickly disappear.  The average opponent for the top four seeded men has been ranked about 89 out of 128, meaning that #3 and #4 face opponents around #80–dead average.  The average first-round assignment for the top eight seeded men has been around 87, meaning that seeds 5-8 face average opponents in the mid-80s.  Nothing to cause a raised eyebrow there, and the numbers are almost identical on the women’s side.
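
The step from group averages to the implied averages for seeds 3-4 and 5-8 is simple arithmetic, and the sketch below spells it out.  The helper name is hypothetical, and the inputs are the rounded averages quoted above, so the outputs are only as precise as those.

```python
def implied_average(avg_all, n_all, avg_known, n_known):
    """Average for the remaining members of a group, given the group's
    overall average and the average of a known subset."""
    remaining = n_all - n_known
    return (avg_all * n_all - avg_known * n_known) / remaining

# Top four average 89, top two average 98 -> seeds 3-4
seeds_3_4 = implied_average(89, 4, 98, 2)
# Top eight average 87, top four average 89 -> seeds 5-8
seeds_5_8 = implied_average(87, 8, 89, 4)
print(seeds_3_4, seeds_5_8)  # 80.0 85.0
```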

To go one step further, there’s no evidence of manipulation in the second-round draws.  In fact, the top two women’s seeds faced particularly tough second-round opponents–there was only a 20% chance that those twenty women would be given second-round assignments as tough as the ones they got.

Before looking at the draws of U.S. players, a quick summary.  While the top two seeds were given very low-ranked opponents in the first round, the effect did not extend to the second round, or to any seeds beyond the top two.

The American draws

If the USTA were to tweak the draws, you’d expect them to do so in favor of the home players, if for no other reason than television ratings.  But they haven’t.

Let’s start with the American men.  The top two ranked American men each year have faced opponents ranked, on average, 79 of 128.  That’s a bit tougher than average.  If we expand the analysis to the top four ranked Americans, or just seeded Americans, the results stay around average.  If anyone is manipulating the draws in favor of American men, they are either doing it without regard for ATP rankings, or they aren’t doing a very good job.

More surprising is the average opponent of all American men.  The average opponent of an American man in the last ten years has been 61.2 — considerably lower than 80, in part because unseeded men may draw seeded players in the first round.  But the average shouldn’t be that low.  In fact, there is only a 20% chance that American men would be given such a tough assignment.

Results for the women are mostly similar.  The top two American women each year have gotten a slightly easy draw–the average opponent rank is 83 of 128.  Keep in mind, however, that this overlaps with the analysis of the top two seeded women–five of the 20 top-two-seeded women were Americans, and in nearly every one of those five cases, those women faced one of the weakest players in the draw.  In other words, there’s more evidence that the draw is skewed in favor of the top two seeds than the top two Americans.

As with the men, American women in general have been given tough assignments.  In fact, there is only a 16% chance that American women would face such tough first round opponents as they have.

What this means

If the USTA (or anyone else) is messing with the US Open draws, they are doing so in a nearly inscrutable way.  The only evidence of manipulation is with each year’s top two seeds, as ESPN highlighted.

The theory I mentioned above–that it might be desirable to pit top players against up-and-coming Americans–is appealing, but also not supported by the evidence.  Only five of the 20 opponents of top-two men’s seeds (and six of 20 women’s opponents) have been American, despite the fact that the U.S. contributes five or six low-ranked wild cards each year, in addition to a disproportionate number of qualifiers.

It’s an odd situation.  The first-round opponents of the top two seeds make for a plausible target of draw manipulation, if not the most obvious one.

Postscript: One more question

I mentioned earlier that I’d rather watch Nadal play Jack Sock than Diego Junqueira.  I like up-and-comers, and it’s always interesting to see whether a new opponent forces a top player to change tactics.  It makes for a more interesting match than Nadal (or any top-tenner) against a 29-year-old who has hovered for years around #100.

My question, then: If you’re Rafa Nadal, and (presumably) you want to go deep at the U.S. Open, who would you rather play?  The American wild card ranked #450, or the veteran ranked #99?  A tougher question: Sock, or a veteran who was nearly seeded, like Fabio Fognini?  I can see different players making different choices, but I don’t think it’s clear cut.

It is the draws of Jenkins, Britton, Glatch–in other words, the Jack Socks of previous years–that give us this evidence of manipulation.  On paper, the 127th-highest-ranked player in the draw looks like the 127th-best, but in practice, it’s not nearly so clear cut.  And if these wild cards really are “wild cards,” what looks like an easy draw may not be much easier than yet another dissection of Sergiy Stakhovsky or Albert Montanes.

It may be true that at some stage, the US Open draws are being manipulated for (and only for) the top two seeds in each field.  But that doesn’t tell us whether those players are gaining anything from it.  It’s far from clear that the lowest-ranked players in each draw are the easiest opponents.

The High-Quality Cincinnati Draw

It’s tough to imagine a Masters Series event featuring a higher-quality field than the one assembled in Cincinnati this week.  With the exception of Robin Soderling, virtually every “name” player is present.  Just as importantly, almost all of the players awarded wild cards are legitimate competitors at this level.  The same is true of most of the seven men who qualified.

For tennis fans, it’s an enjoyable outcome: With the possible exception of Robby Ginepri, everyone present “deserves” to be here.  The event gave the other three wild cards to Ryan Harrison, Grigor Dimitrov, and James Blake, three men inside the top 85 who excel on hard courts.  Four of the top seeds in qualifying advanced to the main draw, and all four have current rankings that put them right on the cusp of making the cut in the first place.

All this made me wonder: How does the Cincinnati draw compare to other 56-player Masters fields?  Is Cinci always this strong?

I’ve previously looked at the field quality of ATP 250s, so it was a small step to point the guns at the bigger tourneys.  Here are all 48- and 56-draw Masters events since 2009, along with the average entry rank and median entry rank of players in the field, sorted by the latter:

Year  Event        Field  AvgRank  MedRank  
2011  Madrid          56     37.7     30.0  
2010  Paris           48     38.1     30.5  
2011  CINCINNATI      56     50.1     31.5  
2010  Shanghai        56     56.5     31.5  
2009  Paris           48     57.5     31.5  
2009  Cincinnati      56     38.5     32.0  
2009  Montreal        56     83.6     32.5  
2010  Cincinnati      56     38.5     33.5  
2009  Rome            56     42.0     33.5  
2009  Shanghai        56     54.8     33.5  
2011  Rome            56     42.2     34.5  
2009  Madrid          56     43.6     34.5  
2011  Montreal        56     50.7     35.5  
2009  Monte Carlo     56     45.1     36.5  
2011  Monte Carlo     56     51.9     36.5  
2010  Rome            56     43.1     38.5  
2010  Toronto         56     57.7     40.5  
2010  Madrid          56     59.5     43.0  
2010  Monte Carlo     56     50.6     43.5

There’s not a huge difference in quality–after all, players are required to show up for most of these events–but there is a noticeable differentiation into “haves” and “have-nots.”  Of course Monte Carlo is near the bottom, as it is not mandatory.  Rome is required, but it does get skipped.  Madrid is an interesting case, as this year’s new schedule meant all the best players showed up, while last year, it was near the bottom of the list.

Setting aside Paris, which is near the top of the list because its field has eight fewer players, Cinci appears to consistently offer one of the best Masters fields.  This makes sense, as even if it weren’t a required stop on the tour, it’s a perfectly scheduled warm-up for the U.S. Open.
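
For anyone who wants to build the same kind of table, the two summary columns are just the mean and median of the entry ranks in each field.  A minimal sketch follows; the ranks in it are invented for illustration, not taken from any real draw.  It also shows why the two columns can diverge: one very low-ranked entrant pulls the average up far more than the median, which may explain a row like Montreal 2009 (average 83.6, median 32.5).

```python
from statistics import mean, median

def field_quality(entry_ranks):
    """Return (average rank, median rank) for one tournament field."""
    return round(mean(entry_ranks), 1), median(entry_ranks)

# Hypothetical eight-player field: seven solid entrants plus one wild card ranked 600.
toy_field = [1, 5, 12, 20, 31, 44, 58, 600]
print(field_quality(toy_field))  # (96.4, 25.5)
```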

How Long Does the Server’s Advantage Last?

In professional tennis, it’s a given that the server has an advantage.  The size of that advantage depends on the abilities of the two players and the surface, but especially in men’s tennis, it’s a sizable edge.  On average, a server in an ATP match starts a point with a roughly 65% chance of winning.

But how long does it last?  It seems that, at some stage in the rally, the server’s advantage has disappeared.  Four or five strokes in, the server may still be benefiting from an off-balance return.  But by ten strokes, one would assume that the rally is neutral–that the advantage conferred by serving has evaporated.

As usual with tennis analysis, one question begets several more.  Does the server’s advantage last longer on faster surfaces?  Do women settle into “neutral” rallies sooner than men do?  Do dominating players, like this year’s edition of Novak Djokovic, take away the server’s advantage faster than the average player?

Using the rally counts provided by Pointstream at the last three majors, we can start to answer these questions.

Neutralizing the serve

The first step is to take all the matches we have rally-count data for, and average them out.  Then, for each point length, we calculate the odds that the server wins a point of at least that length.  So, for instance, we look at all points of five shots or more, and figure out how many of those the server wins.

Each one of these numbers is biased, because a rally of exactly five strokes is, by definition, won by the server.  The server either hits a winner on his third shot (the fifth overall), or the returner makes an error attempting to hit his own third shot (the sixth overall).  Thus, if we look at all points of at least five strokes, the exactly five-stroke rallies virtually guarantee that the server will have the advantage.

However, the same reasoning shows us that a six-stroke rally will be biased in favor of the returner.  When we do the math for at-least-five, at-least-six, at-least-seven, and so on, we’ll see a yo-yo effect.  When the biases have equal effect, that means the serve is neutralized.
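
The calculation can be sketched in a few lines.  This version assumes each point is recorded only as a rally length, counting every shot that lands in play (so a double fault is 0 and an ace is 1), which is how the reasoning above treats it; under that convention the server wins exactly the odd-length points.  The function name and the toy data are mine.

```python
def server_win_pct_at_least(rally_lengths, max_n):
    """For n = 0..max_n, share of points lasting at least n shots
    that the server won (None if no points are that long)."""
    results = {}
    for n in range(max_n + 1):
        long_enough = [length for length in rally_lengths if length >= n]
        if not long_enough:
            results[n] = None
            continue
        # Assumption: odd length means the server hit the last good shot.
        server_wins = sum(1 for length in long_enough if length % 2 == 1)
        results[n] = server_wins / len(long_enough)
    return results

# Made-up rally lengths for ten points
toy_points = [0, 1, 1, 1, 2, 3, 3, 4, 5, 6]
for n, pct in server_win_pct_at_least(toy_points, 4).items():
    print(n, f"{pct:.0%}")
```

Even on ten invented points, the output swings up at odd thresholds and down at even ones, which is the yo-yo effect in miniature.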

Here are the results for the approximately 150 grand slam matches with Pointstream data so far this year:

At least…  Win%  Notes                          
0          63%   before point begins            
1          66%   if serve goes in               
2          50%   if serve is returned           
3          60%   if server makes second shot    
4          46%   if returner makes second shot  
5          58%                                  
6          45%                                  
7          57%                                  
8          44%                                  
9          56%                                  
10         44%                                  
11         56%                                  
12         43%                                  
13         56%                                  
14         43%                                  
15         56%

In the table, “Win%” refers to the server’s chance of winning the point.  The biases even out somewhere between the 4th and 8th shot, meaning that in that zone, the server’s advantage is neutralized.
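
One way to make “the biases even out” concrete is to average each even threshold with the odd threshold that follows it, so the returner-favoring and server-favoring biases roughly offset.  That pairing is my own reading rather than a formal method, and it uses the rounded percentages from the table.

```python
# Server win% for points of at least n shots (index = n), copied from the table above.
WIN_PCT = [63, 66, 50, 60, 46, 58, 45, 57, 44, 56, 44, 56, 43, 56, 43, 56]

def paired_averages(win_pct, start=2):
    """Average each even threshold with the next odd one, from `start` up."""
    return {
        (n, n + 1): (win_pct[n] + win_pct[n + 1]) / 2
        for n in range(start, len(win_pct) - 1, 2)
    }

for pair, avg in paired_averages(WIN_PCT).items():
    print(pair, avg)
```

On these numbers the paired average slides from 55 at shots 2-3 to 50 at shots 8-9, and stays within half a point of 50 after that.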

While the server retains the advantage at least until the fourth shot, it is interesting to see how quickly it decays.  The server’s win rate drops from 66% once the serve is in to about 56% once the advantage is neutralized, and more than half of that drop (66% to 60%) comes between the first and third shots.  Thus, the returner doesn’t negate the server’s advantage simply by getting the ball back in play, but he does take a large step toward doing so.

Does surface matter?

As usual, it sure does.  The numbers for the Australian and French Opens are similar, and since they make up 2/3 of the data set, they are close to the aggregate numbers shown above.  But Wimbledon, as is so often the case, seems to play by a different set of rules:

At least…    Wimby    Austr    French  
0            66%        62%       62%  
1            68%        64%       67%  
2            52%        50%       48%  
3            62%        59%       58%  
4            48%        46%       45%  
5            61%        57%       57%  
6            47%        44%       44%  
7            61%        55%       56%  
8            47%        44%       44%  
9            59%        55%       54%  
10           47%        43%       43%  
11           60%        54%       55%  
12           46%        43%       43%  
13           59%        55%       55%  
14           43%        45%       42%  
15           56%        54%       56%

The biases don’t balance out until the very bottom–at 14 or more shots!  That’s only about 3% of points.  I’m not sure how to explain this, except perhaps psychologically: on grass (considered the best surface for servers), players may be less successful in return games simply because that’s what they expect to happen.  Regardless of surface, I can’t understand why else the server’s advantage would persist into double-digit shot counts.
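
To put a number on where each surface balances out, the sketch below pairs each even threshold with the following odd one and reports the first pair whose average is at or below 50%.  The pairing and the 50% cutoff are my own simplification, and the inputs are the rounded figures from the table, so a one-point rounding difference could shift the answer.

```python
# Server win% by "at least n shots" (index = n), copied from the table above.
SURFACES = {
    "Wimbledon":  [66, 68, 52, 62, 48, 61, 47, 61, 47, 59, 47, 60, 46, 59, 43, 56],
    "Australian": [62, 64, 50, 59, 46, 57, 44, 55, 44, 55, 43, 54, 43, 55, 45, 54],
    "French":     [62, 67, 48, 58, 45, 57, 44, 56, 44, 54, 43, 55, 43, 55, 42, 56],
}

def first_neutral_pair(win_pct, cutoff=50.0):
    """First even n (>= 2) where the average of n and n+1 is <= cutoff."""
    for n in range(2, len(win_pct) - 1, 2):
        if (win_pct[n] + win_pct[n + 1]) / 2 <= cutoff:
            return n
    return None

for name, column in SURFACES.items():
    print(name, first_neutral_pair(column))
```

With these inputs it returns 14 for Wimbledon and 6 for both the Australian and French Opens.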

What about the ladies?

WTA players (on average) start each service point with a smaller advantage than their male counterparts, and as it turns out, that advantage evaporates more quickly.

We saw a moment ago that, by putting the return in play, an ATP returner gives himself a 50% chance of winning the point–at least until his opponent hits another shot.  Women, however, knock the server’s winning percentage down to 47% by making the return.

The returner clearly neutralizes the point by the fourth stroke overall, and–here’s the good part–takes over a slight advantage herself by her third shot, the sixth stroke overall.  By making that sixth shot, the returner has a 57% chance of winning the point, while the server will never reach 57% again.  The advantage is only a percentage or two, but from the sixth stroke on, the returner has the edge.

Finally, presenting Novak Djokovic

Pointstream has tracked 17 of Djokovic’s slam matches this year, giving us a good set of data to work with.  When a man is having a season like this one–in large part because of his return game–it’s fascinating to see how comprehensively he is outplaying his opponents.

In the same terms as the tables above, here are Djokovic’s serve and return points across those 17 matches.  The return points are shown with the server’s winning percentages:

At least…    ND Sv    ND Ret  
0            70%         57%  
1            72%         60%  
2            59%         42%  
3            68%         50%  
4            56%         39%  
5            68%         49%  
6            57%         38%  
7            68%         47%  
8            58%         38%  
9            68%         45%  
10           55%         34%  
11           65%         44%  
12           55%         33%  
13           66%         38%  
14           52%         29%  
15           65%         40%

After seeing the averages above, you might reasonably conclude that these numbers are out of this world.  Even with the bias of 4-, 6-, and 8-stroke rallies, as discussed above, Djokovic still maintains an edge.  For everyone else, once the fifth or sixth shot is struck, the point is a 50/50 proposition.  For Novak, it’s at least 60/40 in his favor.

The amazing stats are on his return.  When he gets his return back in play, he’s more likely than not to win the point.  That may not surprise anyone who has watched Djokovic play this year, but consider how remarkable that is in the context of modern men’s tennis.  By the 8th stroke or so, he’s back to the 60/40 odds of the service points that turn into longer rallies.

Thanks to Carl Bialik for suggesting this topic.