Not all points are created equal. Ask around, and you’ll get a variety of opinions as to which points are most important. Break points, obviously, are key. Pundits are fond of 15-30.
Then there’s the first point of the game. It’s been conventional wisdom for a long time that the opening point holds disproportionate weight. In a previous study, I disproved that. Of course it’s valuable to move from 0-0 to 15-0, and no one likes to start a game by dropping to 0-15. But the first point doesn’t have any magical effect on the outcome of the game beyond simply adding to one or the other player’s tally.
Yet here I am, talking about the first point again. While there still isn’t any magic, the first point is going to the returner too often. This is a rare analytical insight that pros, with a slight change in tactics or focus, may be able to use to win a few more service games.
Point by point
The balance between the server and returner varies a great deal depending on the point score. In men’s singles matches at the US Open between 2019 and 2021, servers won 63.6% of points in non-tiebreak games. Yet at 40-love, the server won 67.7%, and at ad-out, the server won only 59.6%.
The point scores that generated such extremes hint at what’s going on here. If a game has reached 40-love, the server is probably a good one. It’s not always the case, but if you look at all the 40-love games in a large dataset, you’ll get far more John Isner holds than Benoit Paire holds. The opposite applies to ad-out, a score that Isner rarely faces. Thus, the difference in point-by-point serve percentage isn’t (entirely) because of the point score–it’s because of the servers who get there.
Other differences are more prosaic. On average, servers win more deuce-court points than ad-court points. In the same three-year dataset, the difference was 64.2% to 62.9%. There’s no selection bias component here. The typical ATPer is simply stronger in that direction. Some players–particularly left-handers–break the mold, but most will favor the deuce side. Both Novak Djokovic and Roger Federer, for instance, win nearly two percentage points more often when serving to that court.
Unbiasing
Because scores like 40-love and ad-out aren’t randomly distributed among servers, we need to do a bit more work to figure out which scores really do favor the server. The trick here is to compare each service point to the rest of the server’s points in the same match. A point like 40-love has a ton of Isners and Opelkas in it, so we’ll end up comparing it to a lot of other Isner and Opelka points. And in fact, the average player who reaches 40-love wins 65.0% of their service points overall and 64.3% in the ad court (where 40-love is played), two numbers that are well above average.
Working through the same exercise for every point score gives us a list of “actual” serve points won, “expected” serve points won, and differences. The “actual” column tells us what really happened at that score, bias and all; “expected” tells us how often that particular set of players won service points during the entire matches in question; and the difference gives us a first look at where servers are over- or under-performing.
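The comparison described above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the record layout (match, server, score, court, result) and the toy data are hypothetical, and the baseline here is the server's win rate in the same court across the whole match, including the point in question.

```python
from collections import defaultdict

# Hypothetical toy records: (match_id, server, score, court, server_won).
# Real point-by-point data would have thousands of rows per tournament.
POINTS = [
    ("m1", "A", "0-0", "deuce", 0),
    ("m1", "A", "0-15", "ad", 1),
    ("m1", "A", "15-15", "deuce", 1),
    ("m1", "A", "30-15", "ad", 1),
    ("m1", "A", "40-15", "deuce", 1),
    ("m1", "B", "0-0", "deuce", 1),
    ("m1", "B", "15-0", "ad", 0),
    ("m1", "B", "15-15", "deuce", 0),
    ("m1", "B", "15-30", "ad", 1),
    ("m1", "B", "30-30", "deuce", 1),
    ("m1", "B", "40-30", "ad", 1),
]


def actual_vs_expected(points, score):
    """Return (actual, expected, difference) for one point score.

    actual:   share of points at `score` won by the server.
    expected: average of each server's same-match, same-court win rate,
              taken over the points played at `score` (an assumed baseline).
    """
    # Tally [won, played] per (match, server, court).
    totals = defaultdict(lambda: [0, 0])
    for match_id, server, _score, court, won in points:
        tally = totals[(match_id, server, court)]
        tally[0] += won
        tally[1] += 1

    at_score = [p for p in points if p[2] == score]
    if not at_score:
        raise ValueError(f"no points recorded at score {score!r}")

    actual = sum(p[4] for p in at_score) / len(at_score)
    expected = sum(
        totals[(m, s, c)][0] / totals[(m, s, c)][1]
        for m, s, _score, c, _won in at_score
    ) / len(at_score)
    return actual, expected, actual - expected


actual, expected, diff = actual_vs_expected(POINTS, "0-0")
print(f"actual={actual:.3f} expected={expected:.3f} diff={diff:+.3f}")
```

Because the baseline is weighted by how often each server reaches the score, a score dominated by big servers is automatically compared against big-server numbers.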
The following table shows these numbers for each point score:
Score Actual Expected Difference
40-AD 59.6% 61.4% -1.8%
0-0 63.3% 64.6% -1.3%
15-0 62.7% 63.3% -0.6%
40-30 61.6% 62.2% -0.6%
15-30 62.3% 62.7% -0.4%
30-0 64.7% 65.1% -0.3%
40-40 62.6% 62.8% -0.1%
0-15 63.2% 63.3% -0.1%
40-15 64.6% 64.5% 0.0%
30-15 62.8% 62.7% 0.1%
AD-40 61.6% 61.4% 0.2%
30-30 64.0% 63.6% 0.4%
0-30 65.9% 65.2% 0.8%
15-15 64.8% 64.0% 0.8%
30-40 63.6% 62.2% 1.4%
0-40 66.1% 64.7% 1.4%
15-40 66.9% 64.5% 2.4%
40-0 67.7% 64.3% 3.4%
The scores at the top of the table are the ones where servers win fewer points than expected. At the bottom of the list are those where the server seems to overperform.
Some of the results lend themselves to easy narratives. Servers really focus at 0-40 and 15-40, while returners know they have more break chances coming. 40-AD (ad-out) seems like a stressful time to serve, and the numbers back that up. Other results are a bit more baffling–shouldn’t 30-30 and 40-40 be the same, since they are logically equivalent? Why are servers performing so well at 30-40 if they ultimately struggle at 40-AD?
And to today’s topic: What about the first point? It ranks second only to 40-AD in how much the server underperforms, despite no obvious reason why it should lean one way or the other.
Second to none
When we consider a few more factors, this first-point underperformance has an even greater impact.
One useful way to measure the importance of a point is with win probability. Given any point score (or set/game/point score), combined with the likelihood that the server will win any given point, you can calculate the probability of a hold (or a match victory). If we assume that the server wins 64.2% of points, he’ll hold 81.6% of the time, so his win probability at the beginning of the game is 81.6%.
* 64.2% was the rate in non-tiebreak games at the 2021 US Open, while the overall rate for this 2019-21 dataset is a bit lower.
The next concept is volatility. A point’s volatility is determined by how much the result could swing the win probability. By winning the first point, the server’s win probability rises to 89.7%, the figure for such a server at 15-love. If he loses, it falls to 67.2%. The difference–22.5%–tells us how much is at stake in that single point.
In volatility terms, the first point isn’t particularly crucial. A 22.5% swing far outstrips, say, the 9.3% volatility at 30-love, but it pales next to the 76.3% volatility at 30-40. When the server faces break point, one swing of the racket can determine whether win probability drops to zero (because he loses the game), or bounces back north of 50% (because he gets back to deuce).
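For readers who want to check the win probability and volatility figures, here is a minimal Python sketch. It assumes every point is independent and won by the server at the same rate, which is the usual simplification rather than anything claimed about real matches. The function names and the (server points, returner points) encoding are my own choices for illustration.

```python
def hold_probability(p, server_pts=0, returner_pts=0):
    """Chance the server wins the game from a given score.

    Scores are counted in points won (0-0 is (0, 0), 30-40 is (2, 3),
    40-AD is (3, 4)). Assumes independent points won at constant rate p.
    """
    q = 1.0 - p
    # From deuce the server must win two in a row before the returner does.
    deuce = p * p / (p * p + q * q)

    def win(s, r):
        if s >= 4 and s - r >= 2:
            return 1.0  # game to the server
        if r >= 4 and r - s >= 2:
            return 0.0  # break of serve
        if s >= 3 and r >= 3:
            if s == r:
                return deuce
            # Advantage server, or advantage returner.
            return p + q * deuce if s > r else p * deuce
        return p * win(s + 1, r) + q * win(s, r + 1)

    return win(server_pts, returner_pts)


def volatility(p, server_pts=0, returner_pts=0):
    """Swing in hold probability between winning and losing the next point."""
    if_won = hold_probability(p, server_pts + 1, returner_pts)
    if_lost = hold_probability(p, server_pts, returner_pts + 1)
    return if_won - if_lost


P = 0.642  # serve-point win rate used in the text
print(f"hold from 0-0:       {hold_probability(P):.3f}")
print(f"volatility at 0-0:   {volatility(P):.3f}")
print(f"volatility at 30-40: {volatility(P, 2, 3):.3f}")
```

With a 64.2% server this lands on roughly the numbers quoted above: about 81.6% to hold, a swing of about 22.5% on the first point, and about 76.3% at 30-40. Other scores may differ from the published figures in the last decimal place because of rounding.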
What the first point of the game gives up in volatility, it wins back in volume. The stakes are never higher than at 40-AD, but at the US Open in the last few years, barely one-fifth of games ever get that far. By contrast, there’s a love-love kickoff in every single game.
By combining volatility and volume with the degree to which servers under- or over-perform, we can put together a top-level view of what players are gaining or losing at each point score.
Multipliers gone wild
In a tour de force of mathematical derring-do, I’m going to take these three numbers and multiply them together.
The “difference” from the previous table tells us how much better or worse players are serving at a specific point score, compared to their overall performance. If two differences are similar, the one that matters more is the one with higher volatility, right? So we multiply by volatility. And all else equal, the more often a situation occurs, the greater its impact on the end result. So we multiply by the number of occurrences in the dataset.
The final tally is volatility * occurrences * difference, cleverly dubbed “V*O*D” in the table below. The product of three percentages is tiny, so I’ve multiplied those figures by 10,000 to make the results easier to read.
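The arithmetic is simple enough to sketch in Python. The helper name and the three sample rows are just for illustration, and the inputs are the rounded values from the tables, so the output can differ slightly from figures computed on unrounded data.

```python
def vod(volatility, occurrences, difference, scale=10_000):
    """Volatility * occurrences * difference, scaled up for readability.

    All three inputs are fractions (22.5% is 0.225), not percentages.
    """
    return volatility * occurrences * difference * scale


# (volatility, occurrences, difference), using rounded table values.
ROWS = {
    "0-0": (0.225, 1.00, -0.013),
    "40-AD": (0.763, 0.22, -0.018),
    "15-40": (0.490, 0.24, 0.024),
}

# Print from most negative (biggest underperformance) to most positive.
for score, (v, o, d) in sorted(ROWS.items(), key=lambda kv: vod(*kv[1])):
    print(f"{score:>6}  {vod(v, o, d):6.1f}")
```

Note how a score that occurs in every game (0-0) can rival a much more volatile one (40-AD) that shows up in only about a fifth of games.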
Here are the results:
Score Volatility Occurrences Difference V*O*D
40-AD 76.3% 22% -1.8% -29.9
0-0 22.5% 100% -1.3% -29.2
15-30 44.9% 34% -0.4% -5.8
15-0 16.5% 50% -0.6% -4.9
40-30 23.8% 26% -0.6% -3.6
40-40 42.5% 43% -0.1% -2.6
0-15 33.2% 50% -0.1% -2.3
30-0 9.3% 27% -0.3% -0.9
40-15 8.5% 24% 0.0% 0.1
30-15 20.7% 34% 0.1% 0.6
AD-40 23.8% 22% 0.2% 1.1
40-0 3.0% 16% 3.4% 1.7
30-30 42.5% 32% 0.4% 5.9
0-40 31.4% 16% 1.4% 7.1
0-30 40.0% 27% 0.8% 8.2
15-15 29.4% 46% 0.8% 11.0
30-40 76.3% 25% 1.4% 26.3
15-40 49.0% 24% 2.4% 28.2
With all factors taken into account, we see that servers are giving up about as much on the first point of the game as they are when faced with nerves at 40-AD. Two point scores, 30-40 and 15-40, also stick out at the other end of the spectrum, and 30-40 puzzlingly continues to be a time when servers find their best stuff.
Exploiting the mundane
The exact V*O*D numbers are far (far!) from natural laws, but when I ran the same algorithm on data from other grand slams, the contours were nearly the same. In the 2017 and 2018 US Opens, for instance, 40-AD and 0-0 were again the standout “underperforming” points, and 0-0 was the one that topped the list.
* I took a rudimentary look at this topic very early in the blog’s history, using data from 2011. 0-0 didn’t stick out to the same degree, but I didn’t control for the deuce/ad difference, as I have today. When accounting for deuce-court strength, 0-0 performance looks relatively worse.
All of which is to say: I can’t explain why this is a thing, but it sure looks like it’s a thing. And if it’s a thing, it looks like an opportunity for savvy players and coaches.
I’m perfectly happy to accept that servers struggle to maintain their focus (and perhaps their ability to surprise) at 40-AD. More importantly, I’m sure that players and coaches are very aware of the necessary mental gymnastics so deep in a game.
On the other hand, there’s no good reason that servers should underperform at the start of every game. In fact, I’d be more ready to accept the idea that servers would have the edge. The opponent hasn’t seen a serve for a few minutes (or more), and the server’s arm is (relatively) fresh. While it’s not a recipe for domination, it sounds like a recipe for a tiny edge that the server can build on.
That’s why I believe there’s something to be exploited here. Perhaps players–or at least some of them–are taking a bit off their first-point first serves, using the opening salvo as a mini-warmup. Maybe they are more willing to hit their second-best serve, or aim to the returner’s stronger side, as a tactical move to set up more effective serves later in the game. As I’ve said, I don’t know why the numbers are turning up this underperformance, but it’s clear there’s a gap to be closed.
There’s no magic in the first point, but there’s an awful lot of value. Players who serve up their best stuff at the beginning of the game are getting an edge that their peers ought to be developing, too.