First and Second Serves: Another ATP Info-miss

Breaking news, everybody: First serves are better than second serves!

That’s what I learned, anyway, from the latest article in the “Infosys ATP Beyond the Numbers” series:

When you average out the Top 10 players in the 2015 season, they are saving break points 72 per cent of the time when making a first serve. On average, that drops to 53 per cent with second serves. That 19 per cent difference is one of the most important, hidden metrics in our sport.

Is the difference between first and second serves “important?” Definitely. Is it in any way “hidden?” Not so much.

The melodramatic phrasing here suggests that break points are different from regular points, perhaps with a much larger spread between first and second serve winning percentages. But no, that’s not the case.

Last year, top ten players won 75.6% of first-serve points and 55.4% of second-serve points. Combined with the Infosys numbers–which I can’t verify, because the ATP doesn’t make the necessary raw data available–that means that top ten players win about 5% less often when making a first serve on break point, and about 4% less often when missing their first serve on break point.

At the risk of belaboring this: When it comes to the importance of making your first serve, break points are no different than other points.
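The comparison can be checked with a few lines of arithmetic. This is only a sketch built on the four percentages quoted above; the helper names are mine, and the break-point figures are the unverified Infosys ones.

```python
# Figures quoted in the text (percent of points won by top-ten servers).
overall = {"first": 75.6, "second": 55.4}       # all service points
break_point = {"first": 72.0, "second": 53.0}   # break points, per the Infosys article

def spread(rates):
    """Gap, in percentage points, between first- and second-serve results."""
    return rates["first"] - rates["second"]

def relative_drop(base, other):
    """Fractional decline from base to other (0.05 means 5% less often)."""
    return (base - other) / base

overall_spread = spread(overall)
bp_spread = spread(break_point)
first_drop = relative_drop(overall["first"], break_point["first"])
second_drop = relative_drop(overall["second"], break_point["second"])

print(f"spread overall: {overall_spread:.1f} points, on break points: {bp_spread:.1f} points")
print(f"relative drop on break points: first {first_drop:.1%}, second {second_drop:.1%}")
```

The spread between first and second serves is, if anything, slightly narrower on break points than on all points, which is the opposite of what "hidden metric" implies.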

Even that 5% difference is less meaningful than it looks. Break points don’t occur at random–better opponents generate more break opportunities. If you play two matches, one against Novak Djokovic and one against Jerzy Janowicz, you’re likely to face far more break points against Novak than against Jerzy … and of course, you’re less likely to win them.

Pundits tend to focus on break points, and in part, they are right to do so, because this small subset of points has an outsized effect on match outcomes. However, because of the small sample, it’s easy–and far too common–to read too much into break point results. My research has repeatedly shown that, once you control for opponent quality, most players win break points about as often as they do non-break points.

The ATP is sitting on a wealth of information. If we’re going to learn anything meaningful when they go “beyond the numbers,” it would be nice if they took advantage of more of their data and offered up more sophisticated analysis.

Are Two First Serves Ever Better Than One?

Italian translation at settesei.it

It’s one of those ideas that never really goes away. Some players have such strong first serves that we often wonder what would happen if they hit only first serves. That is, if a player went all-out on every serve, would his results be any better?

Last year, Carl Bialik answered that question: It’s a reasonably straightforward “no.”

Bialik showed that among ATP tour regulars in 2014, only Ivo Karlovic would benefit from what I’ll call the “double-first” strategy, and his gains would be minimal. When I ran the numbers for 2015–assuming for all players that their rates of making first serves and winning first-serve points would stay the same–I found that Karlovic only breaks even. Going back to 2010, 2014 Ivo was the only player-season with at least 40 matches for whom two first serves would be better than one.

Still, it’s not an open-and-shut case. What struck me is that the disadvantage of a double-first strategy would be so minimal. For Karlovic (and others, mainly big servers, such as Jerzy Janowicz, Milos Raonic, and John Isner), hitting two first serves would only slightly decrease their overall rate of service points won. For Rafael Nadal and Andy Murray, opting for double-first would reduce their rate of service points won by just under two percentage points.
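For readers who want to try the numbers themselves, here is a minimal sketch of the comparison. It leans on the same simplifying assumption as the analysis (first serves land, and win, at the same rate no matter how often they are hit). The function names are hypothetical and the inputs are illustrative, not any player's actual stats.

```python
def traditional(first_in, first_won, second_won):
    """Share of service points won with one first serve and one second serve.

    first_in   -- fraction of first serves that land in
    first_won  -- fraction of points won when the first serve lands in
    second_won -- fraction of second-serve points won (double faults count as losses)
    """
    return first_in * first_won + (1 - first_in) * second_won

def double_first(first_in, first_won):
    """Share of service points won when both deliveries are first serves."""
    # The second delivery is another first serve: it lands at the same rate
    # and wins at the same rate; missing both is a double fault.
    return first_in * first_won + (1 - first_in) * first_in * first_won

def double_first_disadvantage(first_in, first_won, second_won):
    """Points lost by switching to double-first (a negative value means a gain)."""
    return traditional(first_in, first_won, second_won) - double_first(first_in, first_won)

# Illustrative inputs only, not any player's real numbers.
gap = double_first_disadvantage(first_in=0.62, first_won=0.75, second_won=0.55)
print(f"double-first disadvantage: {gap * 100:.1f} percentage points")
```

Note that the first delivery contributes the same amount under both tactics, so the whole difference comes down to what happens after a missed first serve.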

Here’s a visual look at 2015 tour regulars (minimum 30 matches), showing the hypothetical disadvantage of two first serves. The diagonal line is the breakeven level; Ivo, Janowicz, and Isner are the three points nearly on the line.

[Plot: hypothetical double-first disadvantage for 2015 tour regulars on serve]

Since some players are so close to breaking even, I started to wonder if some matchups make the double-first strategy a winning proposition. For example, Novak Djokovic is so dominant against second serves that, perhaps, opponents would be better off letting him see only first serves.

However, it remains a good idea–at least in general–to take the traditional approach against Djokovic. Hypothetically, two first serves would result in Novak raising his rate of return points won by 1.2 percentage points. Gilles Simon and Andy Murray are in similar territory, right around 1 percentage point.

Here’s the same plot, showing the disadvantage of double-first against tour-regular returners this season:

[Plot: hypothetical double-first disadvantage against tour-regular returners]

There just aren’t any returners who would cause the strategy to come as close to breaking even as some big servers do.

The match-level tactic

What happens if a nearly-breakeven server, like Karlovic, faces a not-far-from-breakeven returner, like Djokovic? If opting for double-first is almost a good idea for Ivo against the average returner, what happens when he faces someone particularly skilled at attacking second serves?

Sure enough, there are lots of matches in which two first serves would have been better than one. I found about 1300 matches between tour regulars (players with 30+ matches) this season, and for each one, I calculated each player’s actual service points won along with their estimated points won had they hit two first serves. About one-quarter of the time, double-first would have been an improvement.

This finding holds up in longer matches, too, avoiding some of the danger of tiny samples in short matches. In one-quarter of longer-than-average matches, a player would have still benefited from the double-first strategy. Here’s a look at how those matches are distributed:

[Plot: double-first results in individual matches]

Finally, some action on the left side of the line! One of those outliers in the far upper right of the graph is, in fact, Ivo’s upset of Djokovic in Doha this year. Karlovic won 85% of first-serve points but only 50% of second-serve points. Had he hit only first serves, he would’ve won about 79% of his service points instead of the 75% that he recorded that day.

Another standout example is Karlovic’s match against Simon in Cincinnati. Ivo won 81% of first-serve points and only 39% of second-serve points. He won the match anyway, but if he had pursued a double-first strategy, Simon could’ve caught an earlier flight home.
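One way to read these two matches: after a missed first serve, a second first serve wins the point only if it lands and then wins, so double-first pays off exactly when the first-serve-in rate times the first-serve win rate exceeds the second-serve win rate. The first-serve-in rates for these matches aren't quoted here, so this sketch (with a hypothetical helper name) computes the breakeven threshold instead, under the same constant-rate assumption as above.

```python
def breakeven_first_in(first_won, second_won):
    """First-serve-in rate above which two first serves beat the usual tactic.

    After a missed first serve, double-first wins first_in * first_won of the
    time, against second_won for a normal second serve, so the two tactics tie
    when first_in == second_won / first_won.
    """
    return second_won / first_won

# First- and second-serve win rates quoted in the text for two Karlovic matches.
matches = {
    "Doha vs Djokovic": (0.85, 0.50),
    "Cincinnati vs Simon": (0.81, 0.39),
}
thresholds = {name: breakeven_first_in(w1, w2) for name, (w1, w2) in matches.items()}
for name, t in thresholds.items():
    print(f"{name}: double-first pays off if more than {t:.0%} of first serves land")
```

Both thresholds sit below 60%, which helps explain why a big server having a rough day on second serve can end up on the double-first side of the line.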

Predicting double-first opportunities

Armed with all this data, we would still have a very difficult time identifying opportunities for players to take advantage of the strategy.

For each player in every match, I multiplied his “double-first disadvantage” (the number of percentage points of serve points won he would lose by hitting two first serves) by the returner’s double-first disadvantage. Ranking all matches by the resulting product puts combinations like Karlovic-Djokovic and Murray-Isner together at one extreme. If we are to find instances where we could retroactively predict an advantage from hitting two first serves, they would be here.

When we divide all these matches into quintiles, there is a strong relationship between the double-first results we would predict using season-aggregate numbers and the double-first results we see in individual matches. However, even in the most double-first-friendly quintile–the one filled with Ivo serving and Novak returning–there’s still, on average, a one-percentage-point advantage to the traditional serving tactic.

It is only at the most extreme that we could even consider recommending two first serves. When we take the 2% of matches with the smallest products–that is, the ones we would most expect to benefit from double first–26 of those 50 matches are ones in which the server would’ve done better to hit two first serves.
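The ranking step can be sketched in a few lines. Everything below is illustrative: the function names are hypothetical, the pairings and their disadvantages are made up, and I'm assuming both disadvantages are non-negative so that a small product really does mean a double-first-friendly matchup.

```python
def rank_by_product(rows):
    """Sort rows by server disadvantage times returner disadvantage, smallest first.

    rows -- list of (label, server_disadvantage, returner_disadvantage), with the
            disadvantages in percentage points and assumed to be non-negative
    """
    return sorted(rows, key=lambda row: row[1] * row[2])

def split_quintiles(ranked):
    """Cut an already ranked list into five groups of nearly equal size."""
    n = len(ranked)
    return [ranked[i * n // 5:(i + 1) * n // 5] for i in range(5)]

# Made-up disadvantages for ten server-returner pairings, for illustration only.
sample = [
    ("pair A", 0.2, 1.0), ("pair B", 3.5, 4.0), ("pair C", 1.0, 1.2),
    ("pair D", 2.8, 3.1), ("pair E", 0.5, 2.0), ("pair F", 4.0, 4.5),
    ("pair G", 1.5, 2.5), ("pair H", 2.0, 2.2), ("pair I", 3.0, 3.6),
    ("pair J", 0.8, 1.1),
]
ranked = rank_by_product(sample)
groups = split_quintiles(ranked)
print([label for label, _, _ in groups[0]])  # the most double-first-friendly pairings
```

With real data, each group would then be compared against the match-level double-first results, which is where the one-percentage-point average for the friendliest quintile comes from.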

In other words, there’s a ton of variance at the individual match level, and since the margins are so slim, there are almost no situations where it would be sensible for a player to hit two first serves.

A brief coda in the real world

All of this analysis is based on some simplifying assumptions, namely that players would make their first serves at the same rate if they were hitting two instead of one, and that players would win the same number of points behind their first serves even if they were hitting them twice as often.

We can only speculate how much those assumptions mask. I suspect that if a player hit only first serves, he would be more likely to see streaks of both success and failure; without second serves to mix things up, it would be easier to find oneself repeating mechanics, whether perfect or flawed.

The second assumption is probably the more important one. If a server hit only first serves, his ability to mix things up and disguise serving patterns would be hampered. I have no idea how much that would affect the outcome of service points–but it would probably act to the advantage of the returner.

All that said, even if we can’t recommend that players hit two first serves in any but the extreme matchups, it is worth emphasizing that the margins we’re discussing are small. And since they are small, the risk of hitting big second serves isn’t that great. There may be room for players to profitably experiment with more aggressive second serving, especially when a returner starts crushing second serves.

Ceding the advantage on second-serve points to a player like Djokovic must be disheartening. If the risk of a few more double faults is tolerable, we may have stumbled on a way for servers to occasionally stop the bleeding.

Digging Out of the Holes of 0-40 and 15-40

In the men’s professional game, serving at 0-40 isn’t a death sentence, but it isn’t a good place to be. An average player wins about 65% of service points, and at that rate, his chance of coming back from 0-40 is just a little better than one in five.
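That one-in-five figure is easy to reproduce. A minimal sketch, assuming every point is independent and won by the server at the same rate (the function names are mine):

```python
def win_from_deuce(p):
    """Chance the server wins the game from deuce, winning each point with probability p."""
    q = 1 - p
    # Win two in a row, or split two points and start over from deuce.
    return p * p / (p * p + q * q)

def hold_from_love_40(p):
    """From 0-40 the server must win three straight points, then win from deuce."""
    return p ** 3 * win_from_deuce(p)

chance = hold_from_love_40(0.65)
print(f"{chance:.1%}")  # 21.3%
```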

Some players are better than others at executing this sort of comeback. Tommy Robredo, for instance, has come back from 0-40 nearly 60% more often than we’d expect, while Sam Querrey digs out of the 0-40 hole one-third less often than we would predict.

Measuring a player’s success rate in these scenarios isn’t simply a matter of counting up 0-40 games. That’s what we saw on the ATP official site last week, and it’s woefully inadequate. That article marvels at Ivo Karlovic’s “clutch” accomplishments from 0-40 and 15-40, when we could easily have guessed that Ivo would lead just about any serving category. Big serving isn’t clutch if it’s what you always do.

Statistics are only valuable in context, and that is particularly true in tennis. Simply counting 0-40 games and reporting the results hides a huge amount of potential insight. Whether a player wins or loses (a game, a set, a match, or a stretch of matches) is only the first question. To deliver any kind of meaningful analysis, we need to adjust those results for the competition and consider what we already know about the players we’re studying.

Rather than tear apart that article, though, let’s do the analysis correctly.

The number of times a player comes back from 0-40 or 15-40 isn’t what’s important. As we’ve seen, big servers will dominate those categories. That doesn’t tell us who is particularly effective (or, dare we say, “clutch”) in such a situation; it only identifies the best servers. What matters is how often players come back compared to how often we would expect them to, taking into consideration their serving ability.

Karlovic is an instructive example. Over the last few years–the time span available in this dataset of point-by-point match records–Ivo has gone down 0-40 56 times, holding 17 of those games, a rate of 30.4%. That’s third-best on tour, behind John Isner and Samuel Groth. But compared to how well we would expect Karlovic to serve, he’s only 7% better than neutral, right in the middle of the ATP pack.

Before diving into the results, a few more notes on methodology. For each 0-40 or 15-40 game, I calculated the server’s rate of service points won in that match. Since we would expect 0-40 games to occur more often in matches with good returners, in-match rates seem more accurate than season-long aggregates. Given the in-match rate of serve points won, I then determined the odds that the server would come back from the 0-40 or 15-40 score. For each game, then, we have a result (came back or didn’t come back) and an estimate of the comeback’s likelihood. Combining both numbers for all of a player’s service games tells us how effective he was at these scores.
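Here is a rough sketch of that procedure. Two things are assumptions on my part rather than a record of the exact calculation: points are treated as independent with a constant in-match win rate, and the W/Exp figure is taken as total comebacks divided by the sum of the per-game comeback probabilities. The function names and the sample games are made up.

```python
from functools import lru_cache

def hold_probability(p, server_pts, returner_pts):
    """Chance the server wins the game from a given point score.

    Points are counted 0, 1, 2, 3 (0, 15, 30, 40); p is the server's chance
    of winning any single point, assumed constant and independent.
    """
    @lru_cache(maxsize=None)
    def win(s, r):
        if s >= 4 and s - r >= 2:
            return 1.0
        if r >= 4 and r - s >= 2:
            return 0.0
        if s == r and s >= 3:
            # Deuce: use the closed form rather than recursing forever.
            return p * p / (p * p + (1 - p) ** 2)
        return p * win(s + 1, r) + (1 - p) * win(s, r + 1)
    return win(server_pts, returner_pts)

def wins_over_expected(games):
    """games: list of (in_match_serve_rate, server_pts, returner_pts, held)."""
    actual = sum(1 for _, _, _, held in games if held)
    expected = sum(hold_probability(p, s, r) for p, s, r, _ in games)
    return actual / expected

# Made-up games: in-match serve rate, score (server, returner), and whether he held.
sample_games = [
    (0.65, 0, 3, True),
    (0.60, 0, 3, False),
    (0.70, 1, 3, False),
    (0.68, 0, 3, True),
]
ratio = wins_over_expected(sample_games)
print(f"W/Exp: {ratio:.2f}")
```

A ratio above 1 means more comebacks than the server's in-match form would predict; below 1 means fewer.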

For 30 of the players best represented in the dataset, here are their results at 0-40, showing the number of games, the number of successful comebacks, the rate of successful comebacks, and the degree to which the player exceeded expectations from 0-40:

Player                  0-40  0-40 W  0-40 W%  W/Exp  
Tommy Robredo            110      30    27.3%   1.59  
Denis Istomin            114      26    22.8%   1.36  
John Isner                87      31    35.6%   1.34  
Guillermo Garcia-Lopez   161      29    18.0%   1.32  
Kevin Anderson           130      38    29.2%   1.28  
Bernard Tomic            110      24    21.8%   1.25  
Fernando Verdasco        141      30    21.3%   1.17  
Rafael Nadal             140      32    22.9%   1.15  
Kei Nishikori            122      23    18.9%   1.15  
Marin Cilic              125      26    20.8%   1.14  
                                                      
Player                  0-40  0-40 W  0-40 W%  W/Exp  
Jo-Wilfried Tsonga       124      29    23.4%   1.14  
Novak Djokovic           124      34    27.4%   1.12  
Andreas Seppi            145      24    16.6%   1.09  
Grigor Dimitrov          115      22    19.1%   1.08  
Philipp Kohlschreiber    146      28    19.2%   1.08  
Roger Federer            107      26    24.3%   1.07  
Ivo Karlovic              56      17    30.4%   1.07  
Santiago Giraldo         113      18    15.9%   1.06  
Alexandr Dolgopolov      141      25    17.7%   1.03  
Milos Raonic              82      23    28.0%   1.01  
                                                      
Player                  0-40  0-40 W  0-40 W%  W/Exp  
Tomas Berdych            149      30    20.1%   1.01  
Jeremy Chardy            122      21    17.2%   0.98  
Feliciano Lopez          136      26    19.1%   0.97  
Fabio Fognini            211      24    11.4%   0.97  
Mikhail Youzhny          155      18    11.6%   0.92  
David Ferrer             203      32    15.8%   0.89  
Richard Gasquet          152      25    16.4%   0.87  
Andy Murray              164      24    14.6%   0.80  
Gilles Simon             158      16    10.1%   0.72  
Sam Querrey               84      12    14.3%   0.68

As I mentioned above, Robredo has been incredibly effective in these situations, coming back from 0-40 30 times instead of the 19 times we would have expected. Some big servers, such as Isner and Kevin Anderson, are even better than their well-known weapons would lead us to expect, while others, such as Karlovic and Milos Raonic, aren’t noticeably more effective at 0-40 than they are in general.

Many of these extremes don’t hold up when we turn to the results from 15-40. Quite a few more games reach 15-40 than 0-40, so the more limited variation at 15-40 suggests that many of the extreme results from 0-40 can be ascribed to an inadequate sample. For instance, Robredo–our 0-40 hero–falls to neutral at 15-40. Here is the complete list:

Player                  15-40  15-40 W  15-40 W%  W/Exp  
John Isner                238      122     51.3%   1.33  
Milos Raonic              215       98     45.6%   1.18  
Feliciano Lopez           304      108     35.5%   1.17  
Jo-Wilfried Tsonga        301      119     39.5%   1.17  
Denis Istomin             304      101     33.2%   1.17  
Rafael Nadal              320      118     36.9%   1.16  
Ivo Karlovic              148       68     45.9%   1.15  
Kevin Anderson            338      132     39.1%   1.15  
Guillermo Garcia-Lopez    405      106     26.2%   1.14  
Andreas Seppi             396      113     28.5%   1.12  
                                                         
Player                  15-40  15-40 W  15-40 W%  W/Exp  
Bernard Tomic             273       86     31.5%   1.12  
Kei Nishikori             298       96     32.2%   1.10  
Novak Djokovic            348      132     37.9%   1.07  
Richard Gasquet           325      106     32.6%   1.07  
Roger Federer             281      109     38.8%   1.07  
Fernando Verdasco         306       94     30.7%   1.06  
Philipp Kohlschreiber     352      110     31.3%   1.06  
Andy Murray               431      135     31.3%   1.06  
Santiago Giraldo          331       86     26.0%   1.05  
Tomas Berdych             398      131     32.9%   1.05  
                                                         
Player                  15-40  15-40 W  15-40 W%  W/Exp  
Marin Cilic               357      109     30.5%   1.05  
Sam Querrey               244       78     32.0%   1.04  
Jeremy Chardy             300       91     30.3%   1.04  
Fabio Fognini             422       98     23.2%   1.03  
Tommy Robredo             285       78     27.4%   0.99  
Grigor Dimitrov           307       89     29.0%   0.99  
David Ferrer              498      138     27.7%   0.98  
Alexandr Dolgopolov       299       77     25.8%   0.95  
Mikhail Youzhny           339       77     22.7%   0.94  
Gilles Simon              426       93     21.8%   0.91

The big servers are better represented at the top of this ranking. Even though Isner is expected to come back from 15-40 nearly 40% of the time–better than almost anyone on tour–he exceeds that expectation by one-third, far more than anyone else considered here.

Finally, let’s look at comebacks from 0-30:

Player                  0-30  0-30 W  0-30 W%  W/Exp  
John Isner               338     229    67.8%   1.19  
Bernard Tomic            299     146    48.8%   1.15  
Grigor Dimitrov          342     166    48.5%   1.11  
Novak Djokovic           409     235    57.5%   1.10  
Santiago Giraldo         344     142    41.3%   1.10  
Fernando Verdasco        373     175    46.9%   1.10  
Rafael Nadal             376     194    51.6%   1.09  
Tomas Berdych            492     262    53.3%   1.09  
Tommy Robredo            296     132    44.6%   1.08  
Roger Federer            344     193    56.1%   1.08  
                                                      
Player                  0-30  0-30 W  0-30 W%  W/Exp  
Feliciano Lopez          326     161    49.4%   1.07  
Alexandr Dolgopolov      347     154    44.4%   1.07  
Marin Cilic              378     179    47.4%   1.06  
Jo-Wilfried Tsonga       357     185    51.8%   1.06  
Guillermo Garcia-Lopez   380     146    38.4%   1.06  
Ivo Karlovic             186     118    63.4%   1.04  
Philipp Kohlschreiber    395     185    46.8%   1.03  
Denis Istomin            314     135    43.0%   1.03  
Kei Nishikori            341     145    42.5%   1.03  
David Ferrer             529     227    42.9%   1.02  
                                                      
Player                  0-30  0-30 W  0-30 W%  W/Exp  
Kevin Anderson           361     181    50.1%   1.02  
Mikhail Youzhny          390     142    36.4%   1.00  
Andy Murray              419     185    44.2%   1.00  
Andreas Seppi            418     164    39.2%   0.99  
Jeremy Chardy            316     132    41.8%   0.99  
Milos Raonic             246     139    56.5%   0.99  
Fabio Fognini            478     153    32.0%   0.99  
Sam Querrey              292     131    44.9%   0.97  
Gilles Simon             442     155    35.1%   0.96  
Richard Gasquet          370     159    43.0%   0.95

Isner still stands at the top of the leaderboard, while Bernard Tomic and Grigor Dimitrov give us a mild surprise by filling out the top three. Again, as the sample size increases, the variation decreases even further, illustrating that, over the long term, players tend to serve about as well at one score as they do at any other.

The Slow but Steady Erosion of the Server’s Advantage

After a couple of weeks of data-driven skepticism, I can finally confirm a bit of tennis’s conventional wisdom: as a match wears on, breaks of serve become a little easier to come by.

This result–based on tens of thousands of matches from the last few years–is similar for both men and women. After about twelve games (total, not service games for each player), a hold is roughly 2% less likely than it was in the first few games of the match. By the 25th game, a hold is approximately 5% less likely than at the beginning of the match.

To control for the vagaries of surface, opponent, and other conditions, I’ve compared each service game to the server’s hold percentage within that match. Only the closest matches are likely to go very long, so it’s important to compare the last games of those matches to games with similarly even opponents.

It seems that this effect is the result of one or both of two factors: server fatigue (which may have more of an effect on results than an equivalent amount of returner fatigue), and the returner’s increasing familiarity with the server. It would be difficult to separate these two–and with this dataset, probably impossible–so for today, let’s stick with the nature of the effect, not its causes.

The following graph shows the relative probability of a hold of serve based on how much of the match (in games) has been played:

[Graph: relative hold percentage by number of games played]

I’ve set the hold probability of the first game at 100%, so all other numbers are relative to that. I’ve excluded tiebreaks from these calculations, though I considered them when counting games–that is, the first game of the second set after a tiebreak is considered the 14th game, not the 13th.
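To make the bookkeeping concrete, here is one way the calculation could be set up. It is a sketch, not the exact method: I'm assuming the comparison is holds observed at each game number divided by holds expected from each server's in-match hold rate, then rescaled so the first game equals 1. The data layout and function name are hypothetical.

```python
from collections import defaultdict

def relative_hold_by_game(matches):
    """Relative hold rate at each game number, scaled so game 1 equals 1.0.

    matches -- list of matches; each match is a list of (server, held) tuples
               in playing order, with server=None marking a tiebreak
    """
    actual = defaultdict(float)    # holds observed at each game number
    expected = defaultdict(float)  # holds expected from in-match hold rates
    for match in matches:
        # Each server's hold percentage within this match (tiebreaks excluded).
        played = defaultdict(int)
        holds = defaultdict(int)
        for server, held in match:
            if server is not None:
                played[server] += 1
                holds[server] += held
        for number, (server, held) in enumerate(match, start=1):
            if server is None:
                continue  # tiebreak: counted in the numbering, left out of the rates
            actual[number] += held
            expected[number] += holds[server] / played[server]
    ratio = {n: actual[n] / expected[n] for n in actual if expected[n] > 0}
    base = ratio[1]
    return {n: r / base for n, r in sorted(ratio.items())}

# Two tiny made-up matches; the (None, None) entry is a tiebreak.
toy = [
    [("A", True), ("B", True), ("A", True), ("B", False), (None, None), ("A", True)],
    [("A", True), ("B", True), ("A", False), ("B", True)],
]
result = relative_hold_by_game(toy)
print(result)
```

In the toy data the tiebreak occupies game 5, so the game that follows it is numbered 6, mirroring the 14th-not-13th convention described above.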

The results get a lot noisier starting around the women’s 25th game and the men’s 35th game, for the simple reason that most matches don’t get that far. For example, while the WTA calculations are based on 11,000 matches, only one-third reached the 25th game and fewer than one-tenth made it to the 31st.

The general downward trend indicates that the fatigue and/or familiarity effect dwarfs the effect of new balls. I have found that in men’s matches, the age of balls has a very small effect on hold percentage, and in women’s matches, it has no effect. In any case, the steady ebb of the server’s advantage is a stronger effect.

It is likely that some players suffer more from fatigue or familiarity than others. Due to the smaller size of the per-player samples, especially beyond the 20th game or so, I’m reluctant to draw any strong conclusions. Still, there are some intriguing numbers for the players for whom the dataset contains the most matches.

Here, I’ve calculated the hold percentage for several top players at various stages of the match, relative to their hold percentage in the first ten games. Thus, a number below 100% indicates less frequent holds, while a number above 100% means more frequent holds:

Player                 Matches  11 to 20  21 to 30  31 to 50  
Tomas Berdych              337     98.5%     98.3%    101.5%  
David Ferrer               330     97.0%     99.4%    102.4%  
Novak Djokovic             325    100.1%    101.8%    101.7%  
Roger Federer              325    100.2%     99.6%    100.4%  
Andy Murray                295     97.7%     98.7%     97.9%  
Rafael Nadal               293     99.2%    100.3%     93.7%  
Jo-Wilfried Tsonga         255    100.4%    100.9%     99.6%  
Philipp Kohlschreiber      252    101.4%     97.9%     96.7%  
John Isner                 251    100.4%    100.4%    100.3%  
                                                              
Player                 Matches  11 to 20  21 to 30  31 to 50  
Kevin Anderson             247    100.0%     98.1%     97.5%  
Richard Gasquet            246     99.1%     98.4%    105.1%  
Gilles Simon               245    100.1%    103.7%     95.0%  
Milos Raonic               238     97.1%     96.1%     96.7%  
Marin Cilic                238     95.4%     97.5%     94.5%  
Fabio Fognini              235    100.4%     99.6%     98.2%  
Kei Nishikori              233    101.8%    104.1%    107.2%  
Grigor Dimitrov            224    100.9%    100.3%     94.6%  
Andreas Seppi              221    106.4%    100.4%    103.1%  
Feliciano Lopez            221     99.2%     99.7%     98.4%  
                                                              
Total                    23326     98.1%     96.1%     95.1%

While John Isner is steady throughout the stages of the match, other big servers such as Milos Raonic and Marin Cilic are less dominant as the match progresses. The players whose hold percentage improves through the match–such as Novak Djokovic and David Ferrer–tend to be those without big serves, so we may be looking at more of an overall fatigue effect in those cases.

The most extreme drop in the table is Rafael Nadal’s relative hold percentage after the 30th game. Perhaps after that much time on court, his opponents finally figure out how to defend against the ad-court slider.

Here are the same calculations for top WTA players:

Player                Matches  11 to 15  16 to 20  21 to 40  
Agnieszka Radwanska       299    101.0%    104.9%     98.0%  
Sara Errani               279     97.7%     91.2%     92.7%  
Caroline Wozniacki        279    103.1%    102.3%    104.9%  
Serena Williams           266    102.8%    102.4%    104.9%  
Angelique Kerber          265    101.9%    103.0%    101.5%  
Samantha Stosur           253     99.2%    105.0%     97.6%  
Carla Suarez Navarro      252    102.2%    101.8%     93.7%  
Petra Kvitova             251     93.9%    100.4%     95.9%  
Roberta Vinci             250     94.2%     97.9%     95.4%  
Ana Ivanovic              241    100.8%    106.0%     95.2%  
Jelena Jankovic           241    102.2%    108.7%     96.4%  
                                                             
Player                Matches  11 to 15  16 to 20  21 to 40  
Maria Sharapova           236    100.1%    105.9%    104.9%  
Victoria Azarenka         228    100.6%    103.7%     97.8%  
Lucie Safarova            227    102.7%    100.5%     94.4%  
Simona Halep              224     89.2%     95.3%    101.7%  
Dominika Cibulkova        210     98.7%     89.9%     99.9%  
Alize Cornet              210     96.2%    102.8%     96.4%  
Andrea Petkovic           194    101.5%    104.2%    107.5%  
Sloane Stephens           185     97.5%     90.1%     88.7%  
Sabine Lisicki            185     97.4%     97.5%     96.6%  
Ekaterina Makarova        185     96.6%    102.8%     92.8%  
Flavia Pennetta           180    105.1%     92.9%    103.9%  
                                                             
Total                   22406     98.6%     97.2%     95.0%

Here is some confirmation that Serena Williams–at least on serve–gets better as the match progresses. Many of the other players with the strongest serve results late in matches are those known for fitness (like Caroline Wozniacki) or steeliness (Maria Sharapova).

Whether the root cause is fatigue or familiarity, most players are less effective on serve as the match progresses. With further research, I hope we’ll be able to better understand the cause and determine whether there are advantages to serving particularly well at certain stages of the match.

The Effects (and Maybe Even Momentum) of a Long Rally

Italian translation at settesei.it

In yesterday’s quarterfinal between Simona Halep and Victoria Azarenka, a highlight early in the third set was a 25-shot rally that Vika finished off with a forehand winner. It was the longest point of the match, and moved her within a point of holding serve to open the set.

As very long rallies often do, the point seemed like it might represent a momentum shift. Instead, Halep sent the game back to deuce after a 10-stroke rally on the next point. If there was any momentum conferred by these two points, it disappeared as quickly as it arose. It took eight more points before Azarenka finally sealed the hold of serve.

Does a long rally tell us anything at all? Does it have predictive value for the next point, or even the entire game, or is it just highlight-reel fodder that is forgotten as soon as the umpire announces the score?

To answer those questions, I delved into the shot-by-shot data of the Match Charting Project, which now contains point-by-point accounts of nearly 1,100 matches. I identified the longest 1% of points–17 shots or longer for women, 18 shots for men–and analyzed what happened afterwards, looking for both fatigue and momentum effects.

The next point

There’s one clear effect of a long rally: The next point will be shorter than average. The 10-shot rally contested by Vika and Simona yesterday was an outlier: Women average 4.45 shots on the point after a long rally, while the overall average (controlled for server and first or second serve) is 4.85. Men average 4.03 shots on the following point, compared to an average of 4.64.

For women, fatigue is also a factor for the server. Following a long rally, women land only 61.3% of first serves, compared to an average of 64.6%. Men don’t exhibit the same fatigue effect; the equivalent numbers are 62.3% and 62.2%.

There’s more evidence of an immediate fatigue factor for women, as well. The players who win those long rallies are slightly better than their opponents, winning 50.7% of points on average. Immediately after a long rally, however, players win only 49% of points. It’s not obvious to me why this should be the case. Perhaps the player who won the long rally worked a bit harder than her opponent, maybe putting all of her remaining effort into a groundstroke winner, or finishing the point with a couple of athletic shots at the net.

In any case, there’s no equivalent effect for men. After winning a long rally, players win 51.1% of their next points, compared to an expected 50.8%. That’s either a very small momentum effect or, more likely, a bit of statistical noise.

Both men and women double fault more often than usual after a long rally, though the effect is much greater for women. Immediately following these points, women double fault 4.7% of the time, compared to an average of 3.3%. Men double fault 4.5% of the time after a long rally, compared to an expected rate of 4.2%.

Longer-term momentum

Beyond a slight effect on the characteristics of the next point, does a long rally influence the outcome of the game? The evidence suggests that it doesn’t.

For each long rally, I identified whether the winner of the rally went on to win the game, as Vika did yesterday. I also combined the score after the long rally with the average rate of points won on the appropriate player’s serve to calculate the odds that, from such a score, the player who won the rally would go on to win the game. To use yesterday’s example, when Azarenka held game point at AD-40, her chances of winning the game were 77.6%.

For both men and women, there is no significant effect. Women who won long rallies went on to win 66.2% of those games, while they would have been expected to win 65.7%. Men won 64.4% of those games, compared to an expected rate of 64.1%.

With a much larger dataset, these findings might indicate a very slight momentum effect. But limited to under 1,000 long-rally points for each gender, the differences represent only a few games that went the way of the player who won the long point.

For now, we’ll have to conclude that the aftereffects of a long rally have a very short lifespan: barely one point for women, perhaps not even that long for men. These points may well have a greater effect on fans than they do on the players themselves.

Is Kevin Anderson Developing Into an Elite Player?

Italian translation at settesei.it

With his upset win over Andy Murray on Monday, Kevin Anderson reached his first career Grand Slam quarterfinal. At age 29, he’ll ascend to a new peak ranking, and with a bit of cooperation from the rest of the draw, one more win could put him in the top ten for the first time.

Anderson has been a stalwart in the top 20 for two years now, but this additional step comes as a bit of a surprise. Despite the overall aging of the ATP tour and the emergence of Stan Wawrinka as a multi-Slam champion, it’s still a bit difficult to imagine a player in his late twenties taking major steps forward in his career.

What’s more, Anderson’s game is very serve-dependent. With an excellent backhand, he isn’t as one-dimensional a player as John Isner, Ivo Karlovic, or perhaps even Milos Raonic, but it’s much easier to categorize him with those players than with more baseline-oriented peers.

In today’s game, it is very difficult to reach the very top ranks without a quality return game. Tiebreaks are too much of a lottery to depend on in the long-term; you have to consistently break serve to win matches. As I wrote in a post about Nick Kyrgios earlier this year, almost no players have finished a season in the top ten without winning at least 37% of return points. Anderson has achieved that mark only once, in 2010. Entering the US Open this year, he was winning only 34.2% of return points.

The only top-ten player this year with a lower rate of return points won is Raonic, at 30.2%. Raonic is a historical anomaly, and as his tiebreak winning percentage has tumbled, from a near-record 75% last year to a more typical 51% this year, his place in the top ten is in jeopardy as well. In other words, the only servebot in the top ten has to rely on plenty of luck–or outstanding, perhaps one-of-a-kind skills in the clutch–to remain among the game’s elite.

Anderson is a more well-rounded player than Raonic, and he wins more return points than that. But he still falls well short of the next-worst return game in the top ten, Wawrinka’s 36.7%. The 2.5 percentage points between Anderson and Wawrinka represent a big gap, almost one-fifth of the entire range between the game’s best and worst returners.

The less effective a player’s return game, the more he must rely on tiebreaks to win sets, and that’s one explanation for Anderson’s success this season. His 62% (26-16) tiebreak winning percentage in 2015 is the best of his career, and considerably higher than his career tiebreak winning percentage of 54%. Again, it sounds like a small difference, but take away three or four of the tiebreaks he’s won this year, and he no longer reaches the final at Queen’s Club … or might not be preparing for a quarterfinal in New York.
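The arithmetic behind “three or four tiebreaks” is easy to check. The sketch below assumes that taking away a tiebreak means flipping a win into a loss, rather than deleting it from the record.

```python
def tiebreak_pct(won, lost):
    """Share of tiebreaks won."""
    return won / (won + lost)


won, lost = 26, 16  # Anderson's 2015 tiebreak record, from above

for flipped in (0, 3, 4):
    # Assumption: each "taken away" tiebreak becomes a loss.
    pct = tiebreak_pct(won - flipped, lost + flipped)
    print(f"{won - flipped}-{lost + flipped}: {pct:.1%}")
```

Flip three and the rate is right around his career mark of 54%.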

Very few players have managed to spend meaningful time in the top ten while depending so heavily on winning tiebreaks. Another metric to help us see this is the percentage of sets won that are won in tiebreaks. Entering the US Open, just over 25% of Anderson’s sets won were won in tiebreaks. Only four times since 1991 has a player sustained a rate that high and ended the year in the top ten: Raonic last year, Andy Roddick in 2007 and 2009, and Greg Rusedski in 1998.

In fact, between 1991 and 2014, only 17 times did a player finish a season in the top ten with this rate above 20%. Roddick represents five of those times, and almost all, except for Roddick at his peak, were players who finished outside the top five. Wawrinka’s and Raonic’s 2014 seasons were the only occurrences in the last decade.

The one ray of light in Anderson’s statistical profile this season is a significantly improved first serve. His 2015 ace rate is over 18%, compared to the 2014 (and career average) rate of 14%. His percentage of first-serve points won is up to 78.8%, from last season’s 75.4% and a career average of 75.8%.

This is a major improvement, and is the reason why he is one of only five players on tour (along with Isner, Karlovic, Roger Federer, and Novak Djokovic) winning more than 69% of service points this year. In many ways, Anderson’s stats are similar to those of Feliciano Lopez, but the Spaniard–another player who has long stood on the fringes of the top ten–has never topped 68% of service points won for a full season.

If Anderson can sustain this new level of first-serve effectiveness, he will–at the very least–continue to see a bit more success in tiebreaks. A tiebreak winning percentage higher than his career average of 54% (though still probably below his 2015 rate of 62%) will help keep him in the top 15. However, even for the best servers, tiebreaks are often little more than coin flips, and players don’t join the game’s elite by relying on coin flips.

As his quarterfinal appearance at the Open shows, Anderson is moving in the right direction. It’s easy to see a path for him that involves ending the season in the top ten. But to move up to the level above that, following the path of someone like Wawrinka, he’ll need to start serving like peak Andy Roddick, or–perhaps just as difficult–significantly improve his return game.

Break Point Persistence: Why Venus is Better Than Her Ranking

Some points matter a lot more than others. A couple of clutch break point conversions or a well-played tiebreak make it possible to win a match despite winning fewer than half of the points. Even when such statistical anomalies don’t occur, one point won at the right time can erase the damage done by several other points lost.

Break points are among the most important points, and because tennis’s governing bodies track them, we can easily study them. I’ve previously looked at break point stats, with a special emphasis on Federer, here and here. Today we’ll focus on break points in the women’s game.

The first step is to put break points in context. Rather than simply looking at a percentage saved or converted, we need to compare those rates to a player’s serve or return points won in general. Serena Williams is always going to save a higher percentage of break points than Sara Errani does, but that has much more to do with her excellent service game than any special skills on break points.

Once we do that, we have two results for each player: How much better (or worse) she is when facing break point on serve, and how much better (or worse) she is with a break point on return.
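As a sketch, each of those two results is one rate measured against another. The version below assumes the comparison is relative (break point rate divided by overall rate, minus one) rather than a difference in percentage points; the function name and the sample counts are hypothetical.

```python
def break_point_edge(bp_won, bp_played, pts_won, pts_played):
    """Relative gap between a player's break point win rate and her
    overall win rate on the same side of the ball (serve or return).

    Assumption: the gap is expressed as a ratio, so +0.04 means she
    wins break points 4% more often than she wins points in general.
    """
    bp_rate = bp_won / bp_played
    base_rate = pts_won / pts_played
    return bp_rate / base_rate - 1.0


# Hypothetical counts for illustration only.
serve_edge = break_point_edge(63, 100, 660, 1000)
print(f"{serve_edge:+.1%}")
```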

For instance, this year Serena has won 2.8% more service points than average when facing break point, and 7.5% more return points than average with a break point opportunity. The latter number is particularly good–not only compared to other players, but compared to Serena’s own record over the last ten years, when she’s converted break points exactly as often as she has won other return points.

Serena’s experience isn’t unusual. From one year to the next, these rates aren’t persistent, meaning that most players don’t consistently win or lose many more break points than expected. Since 2006, Maria Sharapova has converted 1% fewer break points than expected. Caroline Wozniacki has recorded exactly the same rate, while Victoria Azarenka has converted 2% fewer break points than expected.

On serve, the story is similar, with a slight twist. Inexperienced players seem to perform a little worse when trying to convert a break point against a more experienced opponent, so most top players save break points about 4% more often than they win other service points. Serena, Sharapova, Wozniacki, Azarenka, and Petra Kvitova all have career rates at about this level.

Unlike in the men’s game, there’s little evidence that left-handers have a special advantage saving break points on serve. Angelique Kerber is a few percentage points above average, but Kvitova, Lucie Safarova, and Ekaterina Makarova are all within one percentage point of neutral.

While a few marginal players are as much as ten percentage points away from neutral saving break points or converting them, the main takeaway here is that no one is building a great career on the back of consistent clutch performances on break points. Among women with at least 250 tour-level matches in the last decade, only Barbora Strycova has won more than 3% more break points (serve and return combined) than expected. Maria Kirilenko is the only player more than 3% below expected.

This analysis doesn’t tell us anything very interesting about the intrinsic skills of our favorite players, but that doesn’t mean it’s without value. If we can count on almost all players posting average numbers over the long term, we can identify short-term extremes and predict that certain players will return to normal.

And that (finally) brings us to Venus Williams. Since 2006, Venus has played break points a little bit worse than average, saving 2% more break points than typical serve points (compared to +4% for most stars) and winning break points on return 3% less often than other return points.

But this year, Venus has saved break points 17% less often than typical service points, the lowest single-season number from someone who played more than 20 tour-level matches. That’s roughly once per match this year that Venus has failed to save a break point that–in an average year–she would’ve saved.

There’s no guarantee that saving those additional break points would’ve changed many of Venus’s results this year, but given the usual strength of her service game, holding serve even a little bit more would make a difference.

This type of analysis can’t say whether a rough patch like Venus’s is due to bad luck, mental lapses, or something else entirely, but it does suggest very strongly that she will bounce back. In fact, she already has. In her successful US Open run, she’s won about 66% of service points while saving 63% of break points. That’s not nearly as good as Serena’s performance this year, but it’s much closer to her own career average.

Like so many tennis stats that fluctuate from match to match or year to year, this is another one that evens out in the end. A particularly good or bad number probably isn’t a sign of a long-term trend. Instead, it’s a signal that the short-term streak is unlikely to last.

Ivo Karlovic and His Remarkable 10,466* Aces

Italian translation at settesei.it

Here’s the official story: This week, Ivo Karlovic crossed the much-heralded 10,000-ace milestone. Next up is the all-time record of 10,183 aces, held by Goran Ivanisevic.

Karlovic is one of the greatest servers in the game’s history, and he has in fact hit more than 10,000 aces. Ivanisevic was really good at serving, too, and he might even hold the all-time record. But when it comes down to the details in this week’s ATP press releases, all the numbers are wrong.

Last year, Carl Bialik laid out the two main problems with ATP ace records:

  • The ATP doesn’t have any stats from before 1991. (Ivanisevic started playing tour-level matches in 1988.)
  • ATP totals don’t include aces from Davis Cup matches, even though Davis Cup results are counted toward won-loss records and rankings.

I’ll add one more: There are plenty of other matches since 1991 with no recorded ace counts, too. By my count, we don’t have stats for 14 of Ivanisevic’s post-1991 matches. (They’re not on the official ATP site, anyway.) That doesn’t count Davis Cup, the Olympics (also no stats), and the now-defunct Grand Slam Cup.

If you like tracking records and comparing the best players from different eras, tennis might not be your sport. All of these problems exist for players who retired only recently, and some of the issues persist to the present day. And if you want to compare Federer or Ivanisevic with, say, Boris Becker or–it’s tough to write this without laughing–Pancho Gonzalez, you’re completely out of luck.

We’ll probably never find ace totals from all of the missing matches. But it seems silly to pretend we can identify the true record-holder and celebrate when these “records” are broken when we so obviously cannot.

Approximate* career* totals*

What we can do is estimate the number of missing aces for each of the top contenders. In Ivanisevic’s case, his 1988-90 seasons, combined with Davis Cup and other gaps in the record, total nearly 200 matches. Even if we can’t pinpoint the exact number of uncounted aces, we can come up with a number that demonstrates just how far ahead of Karlovic he currently stands.

To fill in the gaps, I calculated each player’s rate of aces per game for each surface for every season he played. For 1988-90, I used 1991 rates. (This post at First Ball In, which I discovered after writing mine, suggests that players improve their ace rates the first few seasons of their careers, so we should adjust a bit downward. That may be right. A 5% penalty for Goran’s 1988-90 knocks off about 60 aces from his total below.)
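In rough terms, the approximation looks like the sketch below. The data layout, the function name, and the sample numbers are all hypothetical, and I’m assuming “per game” means per service game; the optional penalty mirrors the 5% downward adjustment mentioned above.

```python
def estimate_missing_aces(ace_rates, missing, first_tracked=1991,
                          early_penalty=0.0):
    """Estimate aces hit in matches with no recorded stats.

    ace_rates: {(season, surface): aces per service game}
    missing:   list of (season, surface, service_games) tuples,
               one per match without an ace count
    Seasons before first_tracked borrow the first_tracked rate,
    optionally scaled down by early_penalty (e.g. 0.05 for 5%).
    """
    total = 0.0
    for season, surface, games in missing:
        rate = ace_rates[(max(season, first_tracked), surface)]
        if season < first_tracked:
            rate *= 1.0 - early_penalty
        total += rate * games
    return total


# Made-up rates and matches, purely for illustration.
rates = {(1991, "hard"): 1.0, (1995, "clay"): 0.5}
gaps = [(1989, "hard", 100), (1995, "clay", 40)]

print(estimate_missing_aces(rates, gaps))
print(estimate_missing_aces(rates, gaps, early_penalty=0.05))
```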

Once we crunch the numbers, we get an estimated 2,368 aces in Ivanisevic’s 195 “missing” matches. That gives him a career total of 12,551–a mark Karlovic couldn’t achieve until the end of 2017, if then.

But wait–Ivo has some missing matches, too! The gaps in his record only amount to 21 matches, mostly Davis Cup. The same approximation method adds 466 aces to his record, meaning he hit that 10,000th ace back in June, in his second-rounder against Alexander Zverev. Even with those nearly 500 “extra” aces, Ivanisevic’s record is almost surely out of reach.

What about Pete Sampras? Officially, Pete is fifth on the all-time list, with 8,858 aces. But like Goran, he played a lot of matches before record-keeping began in 1991. His ace record is missing nearly 200 matches, as well.

In Sampras’s case, we can estimate that he hit 1,815 aces that aren’t reflected in his official total. (In line with the caveat regarding Goran’s total above, we might want to knock that total down by 50 to reflect the possibility that he hit more aces in 1991 than in 1988-90.)

Making similar minor adjustments to the other members of the top five, Federer and Andy Roddick, here’s what the all-time list should look like, at least in general terms:

Player      Official  Est Missing  Est Total  
Ivanisevic     10183         2368      12551
Sampras         8858         1815      10673  
Karlovic       10022          466      10488  
Federer         9279          524       9803  
Roddick         9074          694       9768  

Coincidentally, Karlovic is officially within 200 aces of Ivanisevic’s all-time record, and while he really isn’t anywhere near the record, he is that close to our estimate of Sampras’s second-place total.
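For anyone who wants to check the arithmetic, this short snippet rebuilds the estimated totals from the table and computes the two gaps. The official and missing counts are copied from the table; the missing counts remain estimates.

```python
# (official aces, estimated missing aces), copied from the table
rows = {
    "Ivanisevic": (10183, 2368),
    "Sampras": (8858, 1815),
    "Karlovic": (10022, 466),
    "Federer": (9279, 524),
    "Roddick": (9074, 694),
}

totals = {name: official + missing
          for name, (official, missing) in rows.items()}
ranked = sorted(totals, key=totals.get, reverse=True)

# Karlovic's distance from the official record ...
official_gap = rows["Ivanisevic"][0] - rows["Karlovic"][0]
# ... and from Sampras's estimated total.
estimated_gap = totals["Sampras"] - totals["Karlovic"]

for name in ranked:
    print(f"{name:<12}{totals[name]:>7}")
print(official_gap, estimated_gap)
```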

We can be confident that Ivo is a great server. But if we can’t be sure of his own ace total, mostly amassed in the last decade, it seems foolish to pretend that we’ll know when–or even if–he breaks the all-time record.

The Almost Neutral Let Cord

Italian translation at settesei.it

Once I started charting matches–carefully watching and notating every shot–I thought I noticed a trend after “let” serves. It seemed that players missed far more first serves than usual after a let, and when players landed a post-let first serve, their offering was weaker than usual.

Now that we have nearly 500 pro matches in the Match Charting Project database, including at least 200 each from both the ATP and the WTA, there’s plenty of data with which to test the hypothesis.

To my surprise, there’s no such trend. If anything, players–men in particular–are more likely to make a first serve after a let cord. When they do, they are at least as likely to win the point as in non-let points, suggesting that the serve is no weaker than usual.

Let’s start with the ATP numbers. In over 1,100 points in the charting database, the server began with a let. He eventually landed a first serve 62.8% of the time, compared to 62.0% of the time on non-let points. When he made the first serve, he won 73.3% of points that began with a let serve, compared to only 70.6% of first-serve points when there was no let.

More first serves in, and more success on first serves. The latter finding, with its difference of 2.7 percentage points, is particularly striking.
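For the curious, the comparison boils down to splitting points by whether they began with a let and computing two rates for each group. The sketch below uses a made-up record format of three booleans per point, not the Match Charting Project’s actual notation, and the sample points are invented.

```python
from collections import defaultdict


def serve_rates(points):
    """Summarize first-serve results for let and non-let points.

    points: iterable of (had_let, first_in, server_won) booleans,
    where had_let marks a point that began with a first-serve let.
    Returns {had_let: {"first_in_pct": ..., "first_won_pct": ...}}.
    """
    # Per group: [points, first serves in, first-serve points won]
    counts = defaultdict(lambda: [0, 0, 0])
    for had_let, first_in, server_won in points:
        group = counts[had_let]
        group[0] += 1
        if first_in:
            group[1] += 1
            if server_won:
                group[2] += 1

    rates = {}
    for had_let, (n, n_in, n_won) in counts.items():
        rates[had_let] = {
            "first_in_pct": n_in / n,
            "first_won_pct": n_won / n_in if n_in else float("nan"),
        }
    return rates


# Invented points for illustration only.
sample = [
    (True, True, True), (True, True, False),
    (True, False, True), (True, True, True),
    (False, True, True), (False, False, False),
    (False, True, False), (False, False, True),
]
result = serve_rates(sample)
print(result[True])
print(result[False])
```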

Of the trends I had expected to see, only one is borne out by the data. Since a net cord let is only millimeters away from a fault into the net, it seems logical that net faults would be more common immediately after a let than otherwise. That is the case: 15.7% of men’s first serves result in faults into the net, but after a let, that figure jumps to 17.0%.

When we turn to WTA matches with available data, we find that the post-let effect is even stronger. In non-let points, first serves go in at a 62.8% rate. After a first-serve let, players record a 65.3% first-serve percentage. Given that first-serve percentages are usually concentrated in a relatively small range, a difference of 2.5 percentage points is quite significant.

The WTA data tells a different story than the ATP numbers do when we look at the end result of those first serves. On non-let points, WTA players win first-serve points at a 62.8% rate, while after a first-serve let, they win these points at only a 61.8% clip. It may be that some women approach post-let first serves a bit more conservatively, and they pay the price by winning fewer of those points.

WTA players also appear to miss a few more post-let first serves into the net, though the difference is not as striking as it is for men. On non-let points, net faults make up 16.2% of the total, and after first-serve lets, net faults account for 16.7% of first serves. Of all the numbers presented here, this one is most likely to be no more than random noise.

It turns out that let serves don’t have much to tell us about the next serve or its outcome–and that’s not much of a surprise. What I didn’t expect was that, after a let serve, professionals are a bit more likely than usual to find success with their next offering.

If you like watching tennis and think this kind of research is worth reading, please consider lending a hand with the Match Charting Project. There’s no other group effort of its kind, and the more matches in the database, the more valuable the analysis.

Donald Young’s Perpetual Hopes and the Lefty Serve That Isn’t

Donald Young celebrated his 25th birthday last week, and if you’ve been following the ATP for any part of the last decade, you know all about his talent, his potential, and his underwhelming results. Every time he goes deep in a tournament–as he has in Washington this week, upsetting Kevin Anderson in three sets today–all that upside talk gets dredged up again. Is this finally the breakthrough for which we’ve waited so long?

In general, it’s a safe bet to watch longer-term trends more closely than short-term peaks and valleys. So the short, obvious answer is: No, it’s unlikely to be a sign of much greater things to come. Still, Young has beaten three top-50 players this week, and it’s a good time to take a closer look at what might be holding him back.

A prime obstacle isn’t hard to identify. Donald has one of the weakest serves on the ATP tour. While that doesn’t automatically keep him out of the top fifty in the world, it sure doesn’t help. Young’s year-to-date ace percentage, 3.4%, is among the ten worst on tour, and with the exception of David Ferrer and Roberto Bautista Agut, none of the other players on that list are inside the top 35. This year’s number is no slump, as Young’s ace rate has been below 4% every year since 2009.

Another metric to indicate the effectiveness of a player’s service game is the ratio of service winning percentage to return winning percentage (SW/RW). If a player wins lots of service points, it might be due to a good serve, or it might owe to a strong overall game. This ratio gives us a rough measure of how much a player’s success on serve is due to the serve itself.

Coming into Washington this week, Young’s SW/RW was 1.49, one of the lowest marks of any left-handed tour regular in the last ten years. A few right-handers succeed while winning only 50% more service points than return points–including Ferrer and, for one season, Andy Murray–but the average player on tour wins roughly 73% more serve points than return points. Even Rafael Nadal hasn’t fallen below the 1.5 mark since 2005.
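The ratio itself is simple to compute from raw point counts. Here is a minimal sketch; the function name and the sample counts are hypothetical, chosen to land on the 1.5 mark discussed above.

```python
def sw_rw(serve_won, serve_played, return_won, return_played):
    """Ratio of service winning percentage to return winning
    percentage (SW/RW)."""
    serve_pct = serve_won / serve_played
    return_pct = return_won / return_played
    return serve_pct / return_pct


# Hypothetical counts: 60% of service points won and 40% of
# return points won give a ratio of about 1.5.
ratio = sw_rw(600, 1000, 400, 1000)
print(round(ratio, 2))
```

A ratio of 1.73, the tour average, means a player wins 73% more service points than return points.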

As Ferrer has demonstrated, a player with Young’s level of service success can have a very good career on tour. Yet Ferrer’s skillset is unusual, and importantly, he’s a righty.

Not every successful ATP left-hander is a big server. Nadal won dozens of titles before fully developing the serve he uses today. Neither Fernando Verdasco nor Jurgen Melzer, two lefties who cracked the top ten, is known for overpowering deliveries. But in the last decade, Nadal is the only left-hander to consistently succeed with a SW/RW under 1.6.

It’s a different story for righties. As we’ve seen, Ferrer is a perennial top player despite Young-like serve stats. Fabio Fognini, Nikolay Davydenko, and Lleyton Hewitt have all enjoyed solid seasons without greater serve dominance than Young. (Though Hewitt has racked up better ace totals.)

Surprisingly, it isn’t that lefties are bigger servers. On average, both lefties and righties win about 73% more service points than return points. The tentative conclusion I see from these numbers is that lefties–with the typical exception of Rafa–can’t get away with a weak serve the way that right-handers can.

Young’s SW/RW this week of 1.69 suggests that, despite only 13 aces in four matches, he’s playing well behind his serve, and the results have followed. It may be, though, that a modest improvement to his serve–or perhaps his tactics behind the serve–would be particularly valuable, seizing whatever specific advantages worked for guys like Verdasco and Melzer.

If Young is (finally) to take a big step forward, he’ll need to do more with his serve for a season–not just a week. He doesn’t need to become the next Feliciano Lopez; he just needs to be a little less like a left-handed Fognini.