After a couple of weeks of data-driven skepticism, I can finally confirm a bit of tennis’s conventional wisdom. As a typical match goes on, breaks of serve become a little easier to come by.
This result–based on tens of thousands of matches from the last few years–is similar for both men and women. After about twelve games (total, not service games for each player), a hold is roughly 2% less likely than it was in the first few games of the match. By the 25th game, a hold is approximately 5% less likely than at the beginning of the match.
To control for the vagaries of surface, opponent, and other conditions, I’ve compared each service game to the server’s hold percentage within that match. Only the closest matches are likely to go very long, so it’s important to compare the last games of those matches to games with similarly even opponents.
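As a rough illustration of that comparison, here is a minimal Python sketch. The input layout (a list of matches, each a list of `(game_number, server, held)` tuples) and the function name `relative_hold_by_game` are assumptions made for this example, and dividing observed holds by the holds expected from each server's in-match rate is one plausible way to aggregate, not necessarily the exact calculation behind the numbers in this post.

```python
from collections import defaultdict

def relative_hold_by_game(matches):
    """Compare holds at each game number to each server's in-match hold rate.

    matches: list of matches; each match is a list of
             (game_number, server, held) tuples (hypothetical layout).
    Returns {game_number: observed holds / expected holds}.
    """
    actual = defaultdict(float)    # holds observed at each game number
    expected = defaultdict(float)  # holds expected from in-match hold rates

    for games in matches:
        # First pass: each server's hold percentage within this match.
        served = defaultdict(int)
        held_count = defaultdict(int)
        for _, server, held in games:
            served[server] += 1
            held_count[server] += int(held)

        # Second pass: compare every service game to that baseline.
        for game_number, server, held in games:
            actual[game_number] += int(held)
            expected[game_number] += held_count[server] / served[server]

    # Skip game numbers where no holds were expected (avoids dividing by zero).
    return {g: actual[g] / expected[g] for g in sorted(actual) if expected[g] > 0}
```

A value below 1.0 at a given game number means servers held less often there than their own in-match rates would suggest.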
It seems that this effect is the result of one or both of two factors: server fatigue (which may have more of an effect on results than an equivalent amount of returner fatigue), and the returner’s increasing familiarity with the server. It would be difficult to separate these two–and with this dataset, probably impossible–so for today, let’s stick with the nature of the effect, not its causes.
The following graph shows the relative probability of a hold of serve based on how much of the match (in games) has been played:
I’ve set the hold probability of the first game at 100%, so all other numbers are relative to that. I’ve excluded tiebreaks from these calculations, though I considered them when counting games–that is, the first game of the second set after a tiebreak is considered the 14th game, not the 13th.
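The counting convention can be sketched in a few lines of Python. The function names and the set-score input format are hypothetical, and the sketch assumes completed sets, so any set decided by a one-game margin (7-6, for instance) is treated as having ended in a tiebreak.

```python
def service_game_numbers(set_scores):
    """Number the service games of a match, counting but skipping tiebreaks.

    set_scores: list of (games_won_a, games_won_b) per completed set,
                e.g. [(7, 6), (6, 4)] (hypothetical input format).
    """
    numbers = []
    played = 0  # games played so far, tiebreaks included
    for won_a, won_b in set_scores:
        total = won_a + won_b
        # Assumption: a one-game margin means the last game was a tiebreak.
        has_tiebreak = abs(won_a - won_b) == 1
        service_games = total - 1 if has_tiebreak else total
        numbers.extend(range(played + 1, played + service_games + 1))
        played += total  # the tiebreak still advances the game count
    return numbers


def index_to_first_game(hold_rate_by_game):
    """Rescale hold rates so that the first game equals 100."""
    base = hold_rate_by_game[1]
    return {g: 100.0 * rate / base for g, rate in hold_rate_by_game.items()}
```

For a 7-6, 6-4 match, `service_game_numbers([(7, 6), (6, 4)])` numbers the first set 1 through 12, skips 13 for the tiebreak, and starts the second set at 14.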
The results get a lot noisier starting around the women’s 25th game and the men’s 35th game, for the simple reason that most matches don’t get that far. For example, while the WTA calculations are based on 11,000 matches, only one-third reached the 25th game and fewer than one-tenth made it to the 31st.
The general downward trend indicates that the fatigue and/or familiarity effect dwarfs the effect of new balls. I have found that in men’s matches, the age of balls has a very small effect on hold percentage, and in women’s matches, it has no effect. In any case, the steady ebb of the server’s advantage is a stronger effect.
It is likely that some players suffer more from fatigue or familiarity than others. Due to the smaller size of the per-player samples, especially beyond the 20th game or so, I’m reluctant to draw any strong conclusions. Still, there are some intriguing numbers for the players for whom the dataset contains the most matches.
Here, I’ve calculated the hold percentage for several top players at various stages of the match, relative to their hold percentage in the first ten games. Thus, a number below 100% indicates less frequent holds, while a number above 100% means more frequent holds:
Player                 Matches  Games 11 to 20  Games 21 to 30  Games 31 to 50
Tomas Berdych          337      98.5%           98.3%           101.5%
David Ferrer           330      97.0%           99.4%           102.4%
Novak Djokovic         325      100.1%          101.8%          101.7%
Roger Federer          325      100.2%          99.6%           100.4%
Andy Murray            295      97.7%           98.7%           97.9%
Rafael Nadal           293      99.2%           100.3%          93.7%
Jo-Wilfried Tsonga     255      100.4%          100.9%          99.6%
Philipp Kohlschreiber  252      101.4%          97.9%           96.7%
John Isner             251      100.4%          100.4%          100.3%
Kevin Anderson         247      100.0%          98.1%           97.5%
Richard Gasquet        246      99.1%           98.4%           105.1%
Gilles Simon           245      100.1%          103.7%          95.0%
Milos Raonic           238      97.1%           96.1%           96.7%
Marin Cilic            238      95.4%           97.5%           94.5%
Fabio Fognini          235      100.4%          99.6%           98.2%
Kei Nishikori          233      101.8%          104.1%          107.2%
Grigor Dimitrov        224      100.9%          100.3%          94.6%
Andreas Seppi          221      106.4%          100.4%          103.1%
Feliciano Lopez        221      99.2%           99.7%           98.4%
Total                  23326    98.1%           96.1%           95.1%
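For readers who want to reproduce per-player figures like these from their own data, here is a minimal Python sketch of the stage-relative calculation. The function name, the `(game_number, held)` input layout, and the default stage boundaries (taken from the men's table) are assumptions for illustration. The sketch simply divides the hold rate in each stage by the hold rate in games 1 to 10, which may differ in detail from how the figures in this post were produced.

```python
def stage_relative_holds(service_games,
                         stages=((11, 20), (21, 30), (31, 50)),
                         baseline=(1, 10)):
    """Hold rate in each stage as a percentage of the baseline hold rate.

    service_games: list of (game_number, held) tuples for one player,
                   pooled across matches (hypothetical layout).
    Returns {(first_game, last_game): percentage or None}.
    """
    def hold_rate(first, last):
        outcomes = [int(held) for n, held in service_games if first <= n <= last]
        return sum(outcomes) / len(outcomes) if outcomes else None

    base = hold_rate(*baseline)
    result = {}
    for first, last in stages:
        rate = hold_rate(first, last)
        # None marks stages (or baselines) with no usable data.
        if rate is None or not base:
            result[(first, last)] = None
        else:
            result[(first, last)] = 100.0 * rate / base
    return result
```

A player who holds 80% of the time in the first ten games and 60% of the time in games 11 to 20 would come out at roughly 75% for that stage.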
While John Isner is steady throughout the stages of the match, other big servers such as Milos Raonic and Marin Cilic are less dominant as the match progresses. The players whose hold percentage improves through the match–such as Novak Djokovic and David Ferrer–tend to be those without big serves, so we may be looking at more of an overall fatigue effect in those cases.
The lowest number in the table is Rafael Nadal’s relative hold percentage after the 30th game (93.7%). Perhaps after that much time on court, his opponents finally figure out how to defend against the ad-court slider.
Here are the same calculations for top WTA players:
Player                Matches  Games 11 to 15  Games 16 to 20  Games 21 to 40
Agnieszka Radwanska   299      101.0%          104.9%          98.0%
Sara Errani           279      97.7%           91.2%           92.7%
Caroline Wozniacki    279      103.1%          102.3%          104.9%
Serena Williams       266      102.8%          102.4%          104.9%
Angelique Kerber      265      101.9%          103.0%          101.5%
Samantha Stosur       253      99.2%           105.0%          97.6%
Carla Suarez Navarro  252      102.2%          101.8%          93.7%
Petra Kvitova         251      93.9%           100.4%          95.9%
Roberta Vinci         250      94.2%           97.9%           95.4%
Ana Ivanovic          241      100.8%          106.0%          95.2%
Jelena Jankovic       241      102.2%          108.7%          96.4%
Maria Sharapova       236      100.1%          105.9%          104.9%
Victoria Azarenka     228      100.6%          103.7%          97.8%
Lucie Safarova        227      102.7%          100.5%          94.4%
Simona Halep          224      89.2%           95.3%           101.7%
Dominika Cibulkova    210      98.7%           89.9%           99.9%
Alize Cornet          210      96.2%           102.8%          96.4%
Andrea Petkovic       194      101.5%          104.2%          107.5%
Sloane Stephens       185      97.5%           90.1%           88.7%
Sabine Lisicki        185      97.4%           97.5%           96.6%
Ekaterina Makarova    185      96.6%           102.8%          92.8%
Flavia Pennetta       180      105.1%          92.9%           103.9%
Total                 22406    98.6%           97.2%           95.0%
Here is some confirmation that Serena Williams–at least on serve–gets better as the match progresses. Many of the other players with the strongest serve results late in matches are those known for fitness (like Caroline Wozniacki) or steeliness (Maria Sharapova).
Whether the root cause is fatigue or familiarity, most players are less effective on serve as the match progresses. With further research, I hope we’ll be able to better understand the cause and determine whether there are advantages to serving particularly well at certain stages of the match.
A clear reason why holds are 2% less likely than in the first few games of a match? Returners begin to ‘notice’ regular ‘patterns’.
The returner begins to get a read on the server’s patterns from the deuce and ad sides, and on how these vary ‘relative’ to the ‘scoreboard’.
Another factor that I believe may be playing a significant role is that players are more likely to get tight when down 3-4 or 4-5.
As far as the WTA is concerned, some coaches are coming on court and alerting their players to certain tendencies at particular moments. I believe there are quite a few ‘Venus Williams’ modelled players entering the game, and so the weak second serves are being abused.
Getting tight late in sets has a small effect, but isn’t the main part of what we’re seeing here. At 3-4, for instance, hold% is only 0.7% below average. Counting all ‘on-serve’ scores, holds are 1% less likely at 3-3 or later than before 3-3.
Okay. Interesting, thank you.
What about getting a ‘feel’ for the serving patterns in big important moments, such as on the break-points for instance? Players have their go-to serve in important moments and so the more break points they have to save, the more the returner may have lodged in their mind where they are likely to go on their future break points. As an example.
I saw you tweeted something about outright hostility from readers. I hope this wasn’t referring to me? lol. If you’re confused, let me be clear and say I appreciate the work you are doing and I have given your article a retweet. Cheers.
nope, def not at you, no worries.
Oh that’s fine then, because then Burton was referring to arbitrage and I was becoming baffled. No worries. Probably people’s comments who you’ve ‘unapproved’?