Dayana Yastremska’s Erratic Attack

Also today: February 2, 1974

Dayana Yastremska at the 2023 US Open. Credit: Hameltion

Power giveth, and power taketh away. Few women hit as hard as Dayana Yastremska does, and sometimes, when enough of her returns find the court, that translates into victory. She squeaked through Australian Open qualifying by winning three deciding sets against players outside the top 200, then demolished 7th seed Marketa Vondrousova and rode the resulting momentum all the way to the semi-finals.

Then, yesterday in Linz, she managed just two games against Donna Vekic. So it goes.

The Ukrainian is essentially Jelena Ostapenko lite, mixing a middling serve with monster groundstrokes and a do-or-die approach on return. I wrote a few weeks ago about how Ostapenko’s game style leaves her unusually susceptible to chance; that applies even more to her less accomplished colleague.

The good news for Yastremska is that momentum is temporary. She’ll have off days, like the 92-point flop against Vekic, and she’ll occasionally play a perfect hour, like the dismantling of Vondrousova. More often, though, she’ll pack it all into a single match. The 23-year-old’s stats from her third-round adventure in Melbourne against Emma Navarro make for a good illustration:

       SPW%  RPW%  Winners  UFE  
Set 1   64%   54%       12   11  
Set 2   50%   33%        6   15  
Set 3   73%   56%       15    8 

I’ll bet you can tell which sets she won. It was a lopsided match, just not always in favor of the same player.

Typically, the wildest fluctuation came in Yastremska’s return numbers. Her serve is a weak point–she holds less than 60% of service games, worse than all but one other top-50 player–and it is no picture of consistency, either. But her return is a shot she can ride to a major semi-final. In the first five matches of her Australian Open campaign, she won 48% of return points, including 21 of 38 break point chances. Against Victoria Azarenka in the fourth round, Yastremska landed only 60% of her returns, but when she put the serve back in play, she won nearly three-quarters of the time. Almost one in six Azarenka service points ended with a Yastremska return that Vika couldn’t handle.

A few days later against Qinwen Zheng, the same attack proved to be too risky. The Ukrainian put just half of Zheng’s serves back in play. More than 20% of those returns ended the point, but against all of the free points she gave away, it wasn’t enough. Unlike the scattershot second set of the Navarro match, there wasn’t enough time to find the range before the contest was over.

The streaky slugger

After Yastremska’s eight straight wins from qualifying to the Australian Open semi-final, it’s tempting to call her a streaky player. Combine the big-picture run with narrow-focus ups and downs like the three sets of the Navarro match, and she looks like a kite blown around by the winds of chance at both the macro and micro levels.

I normally dismiss claims that any player’s results are particularly momentum-driven: While athletes aren’t robots, study after study suggests that if momentum (or “clutch” or “streakiness”) is real, it’s a minor effect, far more minor than commentators and casual fans seem to believe. But after watching the Ukrainian’s three sets against Navarro, I had to test it.

Here’s a more precise hypothesis: Yastremska is more likely to win a game when she has won the previous game, compared to when she has lost the previous game. That isn’t the whole story of in-match streakiness, but for a single number, I think it gets to the core of the issue.

Result? True!

Player                 Change after Gm-W  
Alison Riske Amritraj             +11.9%  
Linda Fruhvirtova                  +9.7%  
Lesia Tsurenko                     +8.9%  
Irina Camelia Begu                 +8.7%  
Ajla Tomljanovic                   +7.1%  
Kaja Juvan                         +6.5%  
Polona Hercog                      +6.0%  
Yulia Putintseva                   +5.6%  
Shuai Zhang                        +5.4%  
Dayana Yastremska                  +5.3% 
--- 
Jelena Ostapenko                   +3.2%  
Iga Swiatek                        +1.6%  
-- Average --                      +1.0%  
Aryna Sabalenka                    +0.3%  
Elena Rybakina                     -0.8%  
Coco Gauff                         -1.2%  
Caroline Garcia                    -3.3% 

Among the 102 women with at least 20 charted matches since 2017, Yastremska ranks in the top ten, winning games more than 5% more often than average when she has won the previous game. She out-momentums her fellow hyper-aggressor Ostapenko by a modest amount. Another slugger, Aryna Sabalenka, seems to be impervious to previous results, even more so than the slightly streaky average player.

(The exact metric compares games-that-follow-games-won to games-that-follow-games [that is, games that don’t begin a set] within the same match, and excludes tiebreaks. Winning a match 6-0 6-0 isn’t “streaky” by this measure, because it’s impossible to know whether the result is due to a lopsided matchup [or injury] or to momentum–the winner went 10 for 10 in games that followed games won, and 10 for 10 in games that followed any game. With this metric, a streaky player is one who wins 10 of 20 total games in a match including, say, 7 of 10 games that follow other games won.)
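For readers who want to see the bookkeeping, here is a minimal Python sketch of the metric as described in the parenthetical above. The input format (each set as a list of booleans, True when the player won the game, with tiebreaks already dropped) and the function name are assumptions made for illustration; this is not the code behind the tables.

```python
def streak_rates(match_sets):
    """Return (win rate after a game won, win rate after any game) for one match.

    match_sets: list of sets; each set is a list of booleans in playing
    order, True if the player won that game. Tiebreaks are assumed to have
    been removed already. The first game of each set follows no game in
    that set, so it is skipped.
    """
    after_any = after_any_won = 0
    after_win = after_win_won = 0
    for games in match_sets:
        # Pair every game with the one played immediately before it.
        for prev, cur in zip(games, games[1:]):
            after_any += 1
            after_any_won += cur
            if prev:
                after_win += 1
                after_win_won += cur
    if after_any == 0 or after_win == 0:
        return None  # nothing to compare
    return after_win_won / after_win, after_any_won / after_any


# A 6-0 6-0 win: 10 for 10 either way, so no measured streakiness.
double_bagel = [[True] * 6, [True] * 6]
print(streak_rates(double_bagel))
```

The sketch handles a single match. Whether the published figures pool games across matches or average per-match differences isn't spelled out here, so treat the aggregation step as an open choice.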

So Yastremska is a little tougher to beat when she’s on a roll. She’s really hard to derail if she has just won a game and you have the misfortune of serving. Here is the same metric, only limited to winning percentage in return games:

Player             After Service Hold  
Katerina Siniakova             +13.4%  
Dayana Yastremska              +13.3%  
Lauren Davis                   +13.0%  
Linda Fruhvirtova              +12.2%  
Tatjana Maria                  +11.2%  
Alison Riske Amritraj          +10.8%  
Marta Kostyuk                  +10.3%  
Anhelina Kalinina               +9.9%  
Yulia Putintseva                +9.6%  
Qinwen Zheng                    +8.8% 
--- 
Jelena Ostapenko                +2.7%  
Iga Swiatek                     +2.6%  
Aryna Sabalenka                 +2.2%  
-- Average --                   +1.0%  
Coco Gauff                      -1.3%  
Caroline Garcia                 -4.9%

Yastremska’s success in return games skyrockets after she has held serve. Maybe she feels especially confident after getting through a service game; maybe a hold is a sign that her whole game is clicking. Whatever the reason, she rides this particular type of momentum as much as anyone, trailing Siniakova at the top of the list by a meaningless 0.031 percentage points.

You might suspect–or at least, I initially suspected–that streakiness is related to slugging. It’s easy enough to invent a story to link the two: Big hitting is risky; winners and errors come in batches. But no, there’s virtually no correlation, positive or negative, between these measures of streakiness and any of the metrics I use to quantify aggression. Grinders like Yulia Putintseva share the top of the list with Yastremska, while attackers like Caroline Garcia appear at the other extreme.

For the Ukrainian, it seems, the ups and downs are here to stay. Until she gets more out of her serve, she’ll continue to get dragged into three-set battles against opponents much further down the ranking list. As long as she doesn’t miss too many returns, she’ll keep herself in position to win. The losses will sometimes be ugly, but the victories–like the games that contribute to them–will compensate by coming in batches.

* * *

February 2, 1974: Five-dollar words

My favorite moments in early-1970s tennis came when Billie Jean King got feisty. I don’t mean the take-this-fight-to-Congress, crusading Billie Jean, though there was plenty of that. On the rare occasions when an opponent pushed Madame Superstar to the brink, she could get downright nasty. Pity the poor linesmen.

Fifty years ago today, King faced longtime friend, doubles partner, and punching bag Rosie Casals in the semi-finals of the Virginia Slims of Washington. It was the marquee match of the week, with all of the tour’s other stars absent. Chris Evert and Nancy Richey were taking the week off, Evonne Goolagong was chasing appearance fees on the other side of the globe, and Margaret Court was pregnant. Billie Jean took it upon herself to keep the crowds happy: She went to three sets in the opening round against Kerry Harris, then delivered a 6-0, 6-1 masterclass to win her quarter-final against the 17-year-old Kathy Kuykendall.

Some fans griped about the ticket prices: five bucks for the King-Casals semi and six dollars for the evening session, which featured Australians Kerry Melville and Helen Gourlay in the other semi-final. The 2,800 locals who showed up for the afternoon match, at least, got their money’s worth.

Casals rounded into form just in time, having struggled a bit to recalibrate her game as the tour seesawed between indoor and outdoor events. Her athletic net game outpaced King’s own attack throughout the first set, leading Billie Jean to find a scapegoat among the officials. She berated the service line judge, even threatening to quit; Casals had to calm her down and convince her to stay. (Rosie quipped later that she deserved 60% of the prize money for keeping her pal on court.) After the Old Lady vented her wrath at the chair and two separate linesmen, she settled for moving the offending service line judge to the net cord.

“What this game needs are professional linesmen,” King said. “We’re years behind the times. There are too many questionable situations for a bunch of amateurs to try to master. I’ve suffered through 21 years of bad line calls, and I’m fed up.”

Tennis officiating was certainly a mixed bag. A few months earlier, at the men’s season-ending Masters event in Boston, a last-minute strike forced organizers to pluck fans from the crowd to call the lines.

But not everyone believed that Billie Jean’s reaction was warranted, or that it was triggered by what King called her own “low boiling point.” Melville and Gourlay played their match with the same crew and had no problems. “Most of this arguing with linesmen is done for tactical reasons,” Melville said. “It helps intimidate them. You can get away with it over here, but not in Australia.”

The offending service line judge, Stew Saphier, had a few words of his own. Nothing like this had ever happened to him before, and he wasn’t embarrassed by it. Why not? “Because I was correct in all my calls.”

Whatever the cause of King’s outburst, the day ended as it usually did. After dropping a 7-5 first set to Casals, she came back to win, 6-2, 6-0. The next day, she dispatched Melville 6-0, 6-2, completing the rare feat of a tournament victory that included a 6-0 set won in every match. She was now 14-1 on the young season, her only loss coming in the previous week’s final against Evert. Past her 30th birthday, more famous than ever, she still had plenty of battles ahead.

* * *

Meanwhile, in Ohio…

The men competing at the 1974 Dayton Pro Tennis Classic didn’t draw much in the way of crowds, but tournament organizers slapped together a sure-fire attraction: an exhibition match between Bobby Riggs and Cincinnati Reds star Pete Rose. 4,000 fans turned out for the famous court hustler and baseball’s “Charlie Hustle.” It was clear what they came for: Half of them left before the next regulation match got started.

“This is a disgrace for tennis,” said Yugoslavian veteran Boro Jovanovic. “People don’t come out to see us all week, then they come out for something like this.”

Rose insisted that Riggs play him “straight,” but after three games of running the outfielder ragged with all the spin that a 55-year-old arm could muster, the clowning began. Riggs donned everything from baseball catcher’s gear to a dress, and he eventually set out beach chairs and carried a briefcase to further aid his opponent’s cause. Final score: five games to two, Riggs.

Bobby recognized that rematches with King and Court were off the table and that neither Evert nor Goolagong was likely to accept a challenge. “I’d like to play women from all the world,” he said, naming Casals as a potential foe. In the meantime, he’d take on all comers. With his Battle-of-the-Sexes celebrity still going strong, he knew people would show up to watch.


* * *


Aryna Sabalenka Under Pressure

Also today: January 26, 1924

Aryna Sabalenka at Wimbledon in 2023. Credit: Adrian Scottow

It felt like a pivotal moment. Aryna Sabalenka had taken a 5-2 first-set lead in yesterday’s Australian Open semi-final against Coco Gauff. Gauff kept the set going with a strong service game for 5-3. Sabalenka lost the first point on her serve, but bounced back with a plus-one backhand winner.

At 30-15, the American struck again. She took advantage of a Sabalenka second serve to drag the Belarusian into a backhand rally, ultimately drawing an unforced error on the ninth shot and putting the game back in play.

Then, still just two points from the set, Sabalenka double-faulted.

The narrative practically writes itself. Aryna hits hard, aims for the lines, and keeps points short. Let her do that, and she will destroy you. Her first five opponents in Melbourne managed a grand total of 16 games against her. On the other hand, if you keep the ball in play, she’ll start pressing, trying too hard to dictate with her serve, going for too much when a smackable groundstroke presents itself.

Gauff, by this reading, is Sabalenka’s nightmare opponent. She won the US Open final by denying the Belarusian one would-be winner after another. Not only can she take Sabalenka’s game away from her, but Coco–at least on a good day–won’t give it back on her own serve. When she lets loose, Gauff wields just as much power as her more tactically aggressive opponent.

As it turned out, Sabalenka did lose that service game. Several twists and turns later, Gauff led the set, 6-5. Only then did Aryna regroup, winning four straight points from 30-love to force a tiebreak, then dropping just two more points to clinch the set. Gauff kept the second set close, but Sabalenka never allowed her to reach break point. The contest closed with a narrative-busting move: Facing match point, Gauff pulled out a 12-stroke rally, the kind of point that has been known to steer her opponent off course. But instead of compounding the damage, Sabalenka came back with two unreturned serves. Game over.

What to believe, then? Was the apparent first-set turning point a reflection of the true Sabalenka? Or is this the new Aryna, who slams the door when challengers sniff opportunity? Or is it something else, the all-too-common story in which someone looks like a clutch hero or a constant choker, only for us to discover, after crunching all the numbers, that she’s impervious to momentum and plays pretty much the same all the time?

Recovering at a disadvantage

Sabalenka’s serve games do follow a pattern after she loses a longish rally. But the results are not entirely straightforward.

On the next point (assuming the lost rally didn’t end the service game), Aryna is more likely to miss her first serve:

Year   1stIn%  post-rallyL-1stIn%  Change  
2019    61.2%               55.9%   -8.6%  
2020    61.5%               57.0%   -7.3%  
2021    58.6%               52.6%  -10.3%  
2022    60.0%               59.9%    0.0%  
2023    61.1%               61.3%    0.4%  
2024    63.3%               62.5%   -1.2%
----  
TOTAL   60.5%               57.6%   -4.8% 

Most of the effect is concentrated in the earlier years of her career on tour. Yesterday, the trend ran in the opposite direction: She made nearly 76% of her first serves overall, but after Gauff won a rally, she landed 88% of them.
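A rough Python sketch of how a table like this could be tallied from point-by-point data. Everything about the data layout is assumed for illustration: the `ServePoint` fields, the grouping of points into service games, and especially the seven-shot cutoff for a “longish” rally, which is a guess rather than the definition used above.

```python
from dataclasses import dataclass

LONG_RALLY_SHOTS = 7  # assumed cutoff for a "longish" rally


@dataclass
class ServePoint:
    first_serve_in: bool
    server_won: bool
    rally_length: int  # number of shots in the point


def first_in_after_lost_rally(service_games):
    """Return (overall 1stIn rate, post-lost-rally 1stIn rate, relative change in %).

    service_games: list of service games, each a list of ServePoint in
    playing order. Only points that follow a lost long rally within the
    same service game count toward the post-rally rate.
    """
    total = total_in = post = post_in = 0
    for game in service_games:
        for i, point in enumerate(game):
            total += 1
            total_in += point.first_serve_in
            if i == 0:
                continue  # no earlier point in this service game
            prev = game[i - 1]
            if not prev.server_won and prev.rally_length >= LONG_RALLY_SHOTS:
                post += 1
                post_in += point.first_serve_in
    if total == 0 or post == 0 or total_in == 0:
        return None
    base_rate = total_in / total
    post_rate = post_in / post
    change_pct = 100 * (post_rate - base_rate) / base_rate
    return base_rate, post_rate, change_pct
```

The Change column reads as a relative change rather than a difference in percentage points: for 2019, (55.9 - 61.2) / 61.2 is about -8.7%, close to the -8.6% shown, with the small gap presumably down to rounding in the displayed rates.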

The trend is clearer–and persisting to the present–when we look at double faults after losing a rally:

Year     DF%  post-rallyL-DF%  Change  
2019    8.6%            10.4%   20.8%  
2020    6.2%             8.4%   36.9%  
2021    7.9%            11.8%   50.3%  
2022   10.7%            10.1%   -5.5%  
2023    6.2%             7.2%   16.5%  
2024    3.4%             8.3%  144.7%  
----
TOTAL   7.9%             9.6%   22.5%

2022 was Aryna’s year of the yips; she was more likely to bunch double faults together than hit them in particularly nervy spots. (Put another way: Every spot was a nervy one.) The 2024 number will surely come back to earth, but it is still revealing: Sabalenka has made so much progress in this aspect of her game, but her second-serve struggles continue when she faces the threat of getting dragged into another rally.

Some of these effects persist even longer. From those service games that last long enough, here are Sabalenka’s first-in and double-fault percentages two points after losing a long rally:

Year   1stIn%  +2 1stIn%  Change    DF%  +2 DF%  Change  
2019    61.2%      55.8%   -8.8%   8.6%    8.7%    1.2%  
2020    61.5%      50.5%  -17.9%   6.2%    7.2%   17.1%  
2021    58.6%      56.0%   -4.5%   7.9%    8.7%   10.5%  
2022    60.0%      63.1%    5.3%  10.7%    7.8%  -27.1%  
2023    61.1%      59.2%   -3.2%   6.2%    8.4%   35.6%  
2024    63.3%      57.1%   -9.7%   3.4%    2.4%  -30.1%  
----
TOTAL   60.5%      57.1%   -5.6%   7.9%    8.0%    2.0% 

She continues to miss more first serves even two points after the rally setback. To some degree, the memory should have dissipated–after all, something else happened on the intervening point. On the other hand, she’s back in the same court. If a reliable serve didn’t work in the deuce court at 30-love, there’s reason to doubt it at 30-all.

The double fault trends are less clear, in part because our sample size is shrinking and double faults are blessedly rare. If nothing else, it’s safe to conclude that the explosion of double faults on the point after the lost rally doesn’t continue to nearly the same degree.

Tallying the cost

Now, this all seems bad. Sabalenka possesses one of the best first serves in the game; her whole attack is built around it. Her emergence as a superstar came after she got control of the service yips and cut her double faults down to manageable levels. After losing a long rally, she needs her serve more than ever, and–at least by comparison with other situations–it isn’t there for her.

Except… it doesn’t matter! At least not on the first point. Here is the bottom-line figure of service points won:

Year    SPW%  post-rallyL-SPW%  Change  
2019   59.6%             63.8%    7.2%  
2020   60.3%             56.6%   -6.0%  
2021   61.5%             61.3%   -0.3%  
2022   57.2%             59.9%    4.7%  
2023   63.7%             63.9%    0.4%  
2024   66.7%             70.8%    6.3%  
----
TOTAL  60.7%             61.7%    1.6% 

Fewer first serves, but more serve points won. It isn’t supposed to work like that, but Sabalenka bounces back strong from lost rallies. A shift of +1.6% in her favor is solid enough, and it’s even better if you look solely at the last three years.

Part of the explanation is that she tightens up the rest of her game–exactly the opposite of what my off-the-cuff narrative suggests. Under pressure, I hypothesized, she would try too hard to end points. Instead, after losing a long rally, she’s more willing than usual to play another one: She commits 14% fewer plus-one errors than her usual rate, implying a lower rate of aggression when she has an early chance to put the point away.

On the second point after losing a long rally, the bottom-line outcomes are more mixed:

Year    SPW%  +2 SPW%  Change  
2019   59.6%    53.9%   -9.5%  
2020   60.3%    55.3%   -8.3%  
2021   61.5%    58.5%   -4.9%  
2022   57.2%    61.5%    7.4%  
2023   63.7%    60.7%   -4.7%  
2024   66.7%    71.4%    7.1% 
---- 
TOTAL  60.7%    58.2%   -4.0%

While these aren’t as rosy as the next-point results, focus on the last few years. Since the beginning of 2022, Aryna has won more service points than usual when she returns to the serving direction where she recently lost a long rally–despite landing fewer first serves. She is even stingier with plus-one errors on these points, coughing up 29% fewer than usual.

These trends did not hold in yesterday’s semi-final. While Sabalenka made more first serves on the two points after Gauff outlasted her in a rally, fewer of them ended in her favor: 4% less on the first point, 12% less on the second. We can’t read too much into single-match totals with stats like these: 4% is a difference of one point. And Gauff is a far better returner and baseline player than the typical WTAer, one who is unlikely to lose focus after going toe to toe with Sabalenka for a point or two. The average player pushes Aryna to a seventh shot barely one-tenth of the time; Gauff did so on one of every six points yesterday.

All of this leads us to an unexpected conclusion: Does Aryna Sabalenka have nerves of steel? First serves and double faults are just components in a larger picture; when we measure her results by points won, Sabalenka serves more successfully right after an opponent makes her uncomfortable. The yips are gone, and the on-court histrionics are a diversion that deceived us all. Aryna under pressure may be even more fearsome than her typical, terrifying self.

* * *

January 26, 1924: Suzanne’s longest day

Suzanne Lenglen wasn’t accustomed to spending much time on court. In eight tournaments since the 1923 Championships at Wimbledon, she had lost just ten games. Her doubles matches, especially with net maven Elizabeth Ryan at her side, were often just as lopsided. She never missed, she could put the ball anywhere on the court, and most opponents were lucky just to win a single point.

Lenglen and Ryan in 1925 at Wimbledon. Colorization credit: Women’s Tennis Colorizations

In January 1924, Lenglen eased her way back onto the circuit. Battling some combination of illness, anxiety, and hypochondria, she didn’t return to singles action until February. (She’d win her first three matches before dropping a game.) But she was a celebrity on the French Riviera, and she was prevailed upon to compete in doubles. She won the mixed at the Hotel Beau-Site tournament in Cannes to ring in the new year, and she entered both the women’s doubles–with Ryan–and the mixed at the Hotel Gallia tournament a few weeks later.

On the 26th, Lenglen and Ryan completed their waltz through the draw, defeating a British pair, Phyllis Covell and Dorothy Shepherd-Barron, 6-3, 6-4. Suzanne’s most aggravating foe was another Brit, a line judge with the temerity to call a foot-fault on the five-time Wimbledon champion. She tried to get the man removed and ultimately had to settle for his “voluntary” departure. “It is unfair,” she said. “The English are pigs.”

Her nerves would be tested even more severely in the mixed doubles final. Lenglen partnered Charles Aeschlimann of Switzerland, while Ryan teamed with the 43-year-old Canadian Henry Mayes. Both men were better known on the Riviera than in the tennis world at large, more clubbable than talented. Lenglen and Ryan–herself one of the top few women players in the world–would be the stars of the show.

Lenglen and Aeschlimann took the first set, 6-4; Ryan and Mayes came back with a 6-1 frame of their own. The underdogs–that is, the team without Suzanne–built up an early lead in the third, thanks to Aeschlimann’s inconsistency and Ryan’s glittering play. Mayes served for a 4-2 advantage, but a lucky netcord halted their momentum, and the deciding set settled into a rhythm it wouldn’t break for 20 more games.

Only at 13-14 did Ryan finally give in. She gifted a double fault to her opponents, and Mayes’s fatigue–he had played a four-set men’s doubles final beforehand–began to tell. Lenglen and Aeschlimann broke serve, securing the 6-4, 1-6, 15-13 victory. It would stand as the longest set of Suzanne’s unparalleled career.

* * *


Dominic Thiem, Tennys Sandgren, and Playing Your Way In

Dominic Thiem is one of the best clay-court players on earth, with eight titles and a Roland Garros final to his credit. But his impressive track record wasn’t worth much last night, when he lost his opening-round match in Rio de Janeiro. The straight-set defeat to 90th-ranked Laslo Djere calls to mind other first-match failures, such as Thiem’s loss to Martin Klizan last summer in Hamburg, or his truly gobsmacking upset at the hands of 222nd-ranked Ramkumar Ramanathan on grass in Antalya two years ago.

It’s also not the first time this season that a top seed has proven unable to live up to their billing. Two weeks ago, the No. 1 seeds in three different ATP events all lost their first matches. I dug a bit deeper and discovered that top seeds underperform by a modest amount at these smaller tournaments. Rio is technically a higher-profile event, but the result is the same: An elite player at a non-mandatory event, heading home early.

You’ll hear all sorts of theories for this sort of thing. In ATP 250s, when top seeds get a bye, it’s possible that the elites are in danger because their opponents have played their way into form. At any optional event, it’s possible that the top seeds are not particularly motivated, making the trip for a quick appearance fee and nothing more. Finally, there’s the old saw that some competitors need to get used to their surroundings. In other words, they need to “play their way in” to the tournament. It’s this last theory that I’d like to investigate.

Present and prepared

If a player needs time to get comfortable, we would expect him to underperform in the first round, and possibly continue playing below average to a lesser extent in the second round. The flip side of that is that the player would need to overperform in later rounds–if he didn’t, the earlier underperformance wouldn’t be below average, it would just be bad. These under- and over-performances are effects we can quantify.

Let’s start with Thiem. I went through his career results at the ATP level and broke his matches into several categories (some overlapping), like first match, second match, first match at a non-mandatory event, second-or-later match, finals, and so on. For each of those categories, I tallied up his results and compared them to expectations (Expected Wins, or “ExpWins” in the table), based on what Elo forecasted at the time. Here are Thiem’s results:

Category     Matches  ExpWins  Wins  
1st              141     94.3    94  
1st (small)       84     52.9    54  
1st/2nd          238    151.3   151  
2nd               97     59.9    60  
2nd+             203    117.7   118  
3rd               58     34.9    35  
3rd+             106     60.7    61  
4th               32     18.5    19  
Finals            17     10.2    10

The Austrian has been almost comically predictable. In 84 non-mandatory tournaments through last week, Elo expected that he would win his first match 53 times. He won 54. In all tournaments, he has won his first match 94 times, exactly in line with the Elo estimation. In the nine categories shown here, his performance was never more than 1.1 matches better or worse than expected. If he’s playing his way into tournaments, he’s doing it in a way that doesn’t show up in the results.
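The ExpWins column is a sum of pre-match win probabilities. As a hedged illustration, here is how that tally might look in Python using the textbook Elo formula on the 400-point scale. The ratings in the sample are made up, and the actual forecasts behind the table may come from a tuned or surface-adjusted variant of Elo, so this is a sketch of the idea only.

```python
def elo_win_prob(rating, opp_rating):
    """Standard Elo expectation on the 400-point logistic scale."""
    return 1 / (1 + 10 ** ((opp_rating - rating) / 400))


def tally(matches):
    """matches: iterable of (player_rating, opponent_rating, won) tuples.

    Returns (number of matches, expected wins, actual wins).
    """
    n = 0
    exp_wins = 0.0
    wins = 0
    for rating, opp_rating, won in matches:
        n += 1
        exp_wins += elo_win_prob(rating, opp_rating)
        wins += won
    return n, exp_wins, wins


# Hypothetical ratings, for illustration only.
sample = [(2000, 1800, True), (2000, 1900, True), (2000, 2000, False)]
print(tally(sample))
```

Running one tally per category (first matches, second matches, finals, and so on) gives the three columns of the table.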

What about Tennys?

Thiem has suffered some rough early-round upsets, but over the course of his career, he’s usually ended up on the winning side. Maybe we’d do better to focus on a true feast-or-famine player, someone who more often loses his first-round encounters, but is dangerous when he advances further.

A great recent example of such a player is Tennys Sandgren. The American raced to the quarter-finals of last year’s Australian Open, reached a final in Houston, and won a title in Auckland to start the 2019 season. Other than that, he rarely turns up on the tennis fan’s radar. He acknowledged his inconsistency on a recent Thirty Love podcast, explaining from a player’s perspective why he thinks his results are so erratic. Like Thiem, he lost easily in an opening match last night, winning only four games against Reilly Opelka in Delray Beach.

Sandgren’s round-by-round results are less predictable than Thiem’s, but even for such an apparently extreme example of the go-big-or-go-home-early phenomenon, the numbers offer little support. Because Sandgren has played fewer tour events than Thiem, I included his Challenger results before separating his matches into the same categories:

Category     Matches  ExpWins  Wins  
1st              124     64.7    62  
1st (small)      113     60.2    60  
1st/2nd          186     96.4    98  
2nd               62     31.7    36  
2nd+             120     60.3    63  
3rd               35     17.3    15  
4th               15      7.3     9  
Finals             8      4.2     3

The American has underperformed a bit in his first matches and beaten expectations in his second rounders, but the effect disappears after two matches are in the books. In any case, none of the over- or under-performances are even close to statistically significant. His extra first-match losses have about a one-in-three probability of happening by chance, and his bonus second-match wins would occur about one time in six. There could be something interesting going on here, but the effects are small, and it’s very likely that we’re seeing nothing more than randomness.
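One way to sanity-check those chance estimates is a simple binomial tail, sketched below in Python. It assumes every match in a category has the same win probability (expected wins divided by matches) and uses a one-sided tail. Neither assumption is confirmed as the method used here, and real Elo forecasts vary from match to match, so expect only rough agreement.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k wins in n matches at win probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_most(k, n, p):
    """Probability of k or fewer wins in n matches."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def prob_at_least(k, n, p):
    """Probability of k or more wins in n matches."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


# First matches: 124 played, 64.7 expected wins, 62 actual wins.
p_first = 64.7 / 124
print(prob_at_most(62, 124, p_first))

# Second matches: 62 played, 31.7 expected wins, 36 actual wins.
p_second = 31.7 / 62
print(prob_at_least(36, 62, p_second))
```

Under these simplifications, both tails should land in the neighborhood of the one-in-three and one-in-six figures quoted above.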

Positive results, anyone?

So far, we’ve investigated two players who seemed likely to over- or under-perform in certain groups of matches. Yet we found nothing. The “playing your way in” theory will surely survive this blog post, but let’s make sure there aren’t players who embody it, even if Thiem and Sandgren don’t.

I went through the same steps for the other 98 men in this week’s top 100, grouping their matches into categories, tallying up Elo-based expected wins and actual wins, and calculating the probability that their results–above or below expectations–are due to chance. The result is 1,043 player-categories, from Novak Djokovic’s finals to Pedro Sousa’s first matches. (The number of player-categories isn’t a round number because not every player has matches in every category, like 6th matches or finals.)

Of those 1,043 player-categories, only 29 meet the usual standard of statistical significance, in that there is less than a 5% chance they can be explained by randomness. A familiar example is Gael Monfils’s record in finals. Even with last week’s title in Rotterdam, his eight wins are outweighed by 21 losses. But such cases are extremely rare. Since fewer than 3% of the player-categories meet the 5% threshold, it’s wrong to say that these categories represent real trends (like, perhaps, a psychological basis for Monfils’s inability to win tournaments). When we test over one thousand groups of matches, dozens of them should look like outliers.
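The multiple-comparisons arithmetic is short enough to write out. This Python snippet assumes the tests are independent and that each has exactly a 5% false-positive rate when nothing real is going on, a simplification that is looser than reality, since the categories overlap and small samples rarely hit 5% exactly.

```python
n_tests = 1043   # player-categories tested
alpha = 0.05     # significance threshold
observed = 29    # player-categories that cleared it

# How many "significant" results pure chance would produce on average.
expected_by_chance = n_tests * alpha
observed_share = observed / n_tests

print(f"expected by chance: {expected_by_chance:.1f}")
print(f"observed: {observed} ({observed_share:.1%})")
```

With the observed count below what chance alone would be expected to produce, the 29 outliers don’t call for any explanation beyond randomness.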

In other words, there’s no statistical support for the claim that certain players are more or less effective in certain rounds. It’s always possible that a very small number of guys have certain characteristics along these lines, but among the 29 player-categories with particularly unlikely results, only Monfils’s finals record fits any kind of narrative I’ve heard before. Richard Gasquet has won 120 times–11 more than expected–in first matches at non-mandatory events. That overperformance is just as unlikely as Monfils’s letdown in finals, so maybe we should be talking about how assiduously he prepares for the start of each tournament, no matter the stakes?

It’s always possible that the top men do, in fact, play their way into tournaments. But based on this evidence, it’s only the case if everyone rounds their way into form at approximately the same rate. Maybe first rounders are lower in quality than semi-finals. But if we’re interested in predicting outcomes–even Thiem’s first-round results against journeymen–we’d do better to ignore the theories. Opening matches just aren’t that unique, even for the players who think they are.

The Naomi Osaka First-Set Guarantee

Italian translation at settesei.it

Today in the Australian Open quarter-finals, Naomi Osaka recorded a routine victory, beating 6th seed Elina Svitolina 6-4 6-1. She’ll face Karolina Pliskova in tomorrow’s semi-final, and she has a chance to finish the tournament as the top-ranked player in the world.

(See the bottom of this post for updates.)

Osaka’s sprint to the finish line against Svitolina was what we’ve come to expect from the 21-year-old. The Eurosport commentators shared a remarkable stat: The last 59 times Osaka has won the first set, she has gone on to win the match. (On Eurosport during the match, they said 57, making today’s win 58, but I believe they left out a 2017 win by retirement against Heather Watson in which the first set was completed.) The last time she failed to convert a one-set advantage into a victory was the final match of her 2016 season, in Tianjin against Svetlana Kuznetsova.

Of course, winning the first set is a big advantage for anyone. If two players are evenly matched and there’s no momentum effect, the winner of the first set has a 75% chance of finishing the job. In the real world, the woman who takes the first set is usually the superior player, so her odds in the second and third sets are even better still. On the 2018 WTA tour, the player who claimed the first set went on to win the match 81.5% of the time.
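The 75% figure follows from simple arithmetic: in a best-of-three match, the first-set winner loses only by dropping both remaining sets. A short Python sketch, assuming sets are independent and each is won with the same probability (the function names are illustrative):

```python
from math import sqrt

def match_win_prob_after_first_set(p_set):
    """Chance of winning at least one of the next two sets."""
    return 1 - (1 - p_set) ** 2

def implied_set_prob(p_match):
    """Per-set probability that would produce a given conversion rate."""
    return 1 - sqrt(1 - p_match)

even = match_win_prob_after_first_set(0.5)
implied = implied_set_prob(0.815)

print(f"evenly matched players: {even:.0%}")
print(f"per-set edge implied by an 81.5% conversion rate: {implied:.1%}")
```

The second function inverts the formula. It shows what per-set advantage the tour-wide 81.5% conversion rate would imply under the same independence assumption.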

Even if Osaka’s theoretical odds of converting one-set advantages are even higher, 59 matches in a row is one heck of a feat. Only 15 women have an active streak of 10 or more consecutive first-set conversions, and a mere four hold a running streak of at least 20. In addition to Osaka, Aryna Sabalenka has converted 25 straight first-set victories, Qiang Wang has won 27 in a row, and Serena Williams is ready to pounce as soon as Osaka falters, with a current tally of 51. Serena’s string of consecutive conversions stretches over an even longer span, back to April 2016, in Miami. (Remember who came back to beat her? Svetlana Kuznetsova.)
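For a rough sense of how unlikely such a run is, the sketch below multiplies out the chance of converting 59 straight one-set leads at a few per-match conversion rates. It assumes every match is independent with the same rate. The 90% and 95% rates are hypothetical inputs, not estimates of Osaka’s actual ability.

```python
def streak_prob(conversion_rate, length):
    """Probability that a given run of `length` first-set leads is converted."""
    return conversion_rate ** length

# 0.815 is the 2018 tour average quoted above; the others are hypothetical.
for rate in (0.815, 0.90, 0.95):
    print(f"conversion rate {rate:.1%}: P(59 in a row) = {streak_prob(rate, 59):.6f}")
```

This covers one specific run of 59 matches. The chance of such a streak turning up somewhere in a long career is higher, and would need a different calculation.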

It’s no surprise to see Serena showing up near the top of this list. After several years of looking up various tennis records and streaks, I’ve discovered a few general rules. First, if you think you’ve found a noteworthy recent achievement, Serena did it better. Second, if it involves brushing aside the tour’s rank and file, Steffi Graf was even better than Serena. And third, no matter how impressive Serena’s and Steffi’s feats, the all-time record will belong to either Chris Evert or Martina Navratilova.

The first-set-conversion streak is no different. In addition to her current streak of 51 straight, Serena won 61 in a row in 2002-03. That’s two matches and three places above Osaka, but it’s only 37th on the all-time list. Graf converted first-set advantages for more than twice as long, tallying 126 in a row from 1989 to 1991. As impressive as that is, my third rule holds with a vengeance: Evert converted 220 in a row between 1978 and 1981 to earn top billing on this list. Navratilova comes in second, but with the consolation that she holds third place as well. Martina and Steffi are the only women with multiple triple-digit streaks.

Here are the longest first-set conversion streaks held by players in the top 40. Many of these women put together multiple streaks of 60 or more, and in those cases I’ve listed only their longest:

Rank  Player                   Matches     Span     Notes  
1     Chris Evert                  220  1978-81  + 3 more  
2     Martina Navratilova          172  1982-84  + 5 more  
4     Steffi Graf                  126  1989-91  + 3 more  
6     Monica Seles                 112  1991-93  + 1 more  
7     Mary Joe Fernandez           105  1989-91            
8     Pam Shriver                  105  1986-88            
9     Vera Zvonareva               103  2006-08            
12    Martina Hingis                86  1996-97            
14    Arantxa Sanchez Vicario       85  1992-93            
16    Victoria Azarenka             79  2011-13            
17    Maria Sharapova               77  2010-12  + 1 more  
19    Margaret Court                74  1969-77            
21    Venus Williams                73  1999-01            
22    Sue Barker                    70  1973-78            
23    Evonne Cawley                 69  1978-80  + 1 more  
24    Lindsay Davenport             67  1999-00  + 1 more  
25    Tracy Austin                  67  1979-80            
26    Virginia Wade                 66  1975-78            
28    Gabriela Sabatini             65  1990-91            
30    Andrea Jaeger                 64  1981-82            
33    Claudia Kohde Kilsch          63  1986-87            
34    Kerry Reid                    62  1969-77            
37    Serena Williams               61  2002-03            
39    Anna Chakvetadze              60  2006-07            
40    Naomi Osaka                   59  2017-19  (active)

* Unfortunately all of these numbers come with a huge caveat. My historical WTA database isn’t perfect. I know that there are Evert and Navratilova matches missing, along with a handful of later results. For records like this, a single missing match could mean that Evert really had two streaks of 110 each, or any number of other permutations that would render my all-time list incorrect. So please, take these records as unofficial, and maybe the WTA will query their own–presumably more complete–database to produce a better list.

This is good company for the reigning US Open champion, and it looks even better if we narrow our view to 21st-century players. Only five of the women ahead of her on the list are active, and four of those are winners of multiple majors–another club that the 21-year-old could join this week. Her semi-final opponent, Karolina Pliskova, executed her own history-making comeback against Serena today. But if Pliskova finds herself down a set to Osaka, even she may not be enough of an escape artist to fight back against the best front-runner in women’s tennis.

Update: Osaka finished off the 2019 Australian Open with two more first-set conversions. In both the semi-final against Pliskova and the final against Kvitova, she won the first set and went on to win in three. Thus, her streak is up to 61 and she has matched Serena’s best.

Dominic Thiem In Pressure Service Games

Dominic Thiem has good reason to be frustrated.

Italian translation at settesei.it

On Tuesday night, Rafael Nadal and Dominic Thiem delivered the match of the 2018 US Open thus far. After nearly five hours of play, nothing separated them as they battled their way to 5-5 in a fifth-set tiebreak. Nadal finally crept ahead by the narrowest of margins, sealing a victory by the unlikely score of 0-6 6-4 7-5 6-7(4) 7-6(5).

Both players had plenty of chances, and while Rafa prepares for a semi-final against Juan Martin del Potro, Thiem will have plenty of time to mull over the opportunities he missed. In the second set, he failed to hold in both of his last two service games, including the final frame of the set, at 4-5. In the third set, he took the lead by breaking Nadal in the seventh game, but failed to follow up his advantage, losing serve when he attempted to serve it out at 5-4. Two games later, he proved unable to hold serve to stay in the set at 5-6, though he forced Rafa to four deuces before finally giving way.

These three missed chances are hardly the entire story of the match, but they stick out in memory. Overall, Thiem served quite well, allowing Nadal only one break per set. That’s 21 holds in 26 service games, an 81% hold rate, a significant achievement compared to the 66% that Nadal’s opponents have averaged against him on hard courts this year, or the paltry 52% that Rafa has allowed overall. The problem isn’t that the Austrian served badly–he didn’t–but that he weakened at the wrong times. Thiem broke Nadal more often than Rafa returned the favor–six to five–but because three of Thiem’s breaks came in the first, 6-0 set [editor’s note: !??!?!?!?] , Nadal’s six proved less costly than Thiem’s five.

Bad day, or just bad?

Is this something Thiem does, or is it just something that he did, perhaps nudged over the edge by one of the greatest returners of all time? Too often, viewers–along with many of those paid to talk and write about tennis–see the latter and assume the former. Does Thiem make a habit of serving strong in lower-leverage games and then wilting when the pressure ratchets up?

If he does, it would make him an exception. I looked at “serving for the set” opportunities a few years ago and found that ATP players serve almost exactly as well when a hold would earn them the set as otherwise. The difference is a mere 0.7%, meaning that the “difficulty” of serving for the set translates into one additional break per 143 opportunities. The effect wasn’t any more noticeable when I narrowed the focus to situations in which the player led by only a single break, like Thiem’s dropped service game at 5-4 in the third set last night.

Let’s look again, and pay specific attention to Thiem. My dataset of sequential point-by-point data, spanning most ATP tour matches between late 2011 and a few weeks ago, now covers over 400,000 service games, including 30,000 serving-for-the-set chances, over two-thirds of them with a lead of a single break. Over 1% of them have Thiem serving, so at least our sample size benefits from the Austrian’s strenuous schedule, even if it doesn’t do him any favors on the court. In other words, we’ve got a ton of data here, so if there is an effect, we should be able to find it.

Thiem’s missed chances included chances to both finish a set and stay in a set, so I’ve expanded our view to a variety of pressure situations. For each situation, I’ve calculated the hold rate for players in that position relative to their typical hold rate in those matches. (A player with a lot of serve-to-stay-in opportunities is probably on the losing end, with a lower hold rate than average, but this method should control for that.) A ratio of 1.0 means that the hold rate in the pressure situation is exactly the same as normal. A ratio above 1.0 means the hold rate is higher than usual, and below 1.0 signifies a lower hold rate–the lag many of us expect to see when the stakes get higher. Here are the ratios for a variety of situations, including serving for the set (plus a category for one-break leads), serving to stay in the set (also with one-break deficits identified), ties late in the set such as 4-4 and 5-5, and for comparison’s sake, low-pressure situations–“All Else”–which is a catch-all for everything not in the above categories.*

* Yes, it includes the famous seventh game, which I’ve previously shown isn’t particularly important, no matter what Bill Tilden said.

Situation          Examples  Hold% / Avg  
For-Set            5-4; 5-2        0.994  
- For-Set Close    5-3; 6-5        0.989    
To-Stay            4-5; 1-5        0.999  
- To-Stay Close    5-6; 3-5        0.969    
Tied Late          4-4; 5-5        0.953  
All Else           2-3; etc        1.003
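One way to compute ratios like those in the table is sketched below in Python. It is a guess at the mechanics, not the code behind the numbers. The tuple layout, the function name `pressure_ratios`, and the sample data are all hypothetical. Each game's "expected" hold is taken to be the server's overall hold rate in that same match, which is how the description above reads.

```python
from collections import defaultdict

def pressure_ratios(games):
    """games: iterable of (match_id, server, situation, held) tuples.

    Returns {situation: actual holds / holds expected from match hold rates}.
    """
    games = list(games)

    # First pass: each server's overall hold rate within each match.
    per_match = defaultdict(lambda: [0, 0])   # [games held, games served]
    for match_id, server, _situation, held in games:
        record = per_match[(match_id, server)]
        record[0] += int(held)
        record[1] += 1

    # Second pass: compare actual holds in each situation with expectation.
    actual = defaultdict(int)
    expected = defaultdict(float)
    for match_id, server, situation, held in games:
        held_total, served = per_match[(match_id, server)]
        actual[situation] += int(held)
        expected[situation] += held_total / served

    return {s: actual[s] / expected[s] for s in expected if expected[s] > 0}

# Made-up sample: one server, one match, four service games.
sample = [
    ("m1", "A", "All Else", True),
    ("m1", "A", "All Else", True),
    ("m1", "A", "For-Set", False),
    ("m1", "A", "For-Set", True),
]
ratios = pressure_ratios(sample)
print(ratios)
```

Weighting by the match-level hold rate is what controls for the fact that players who often serve to stay in a set tend to be having a bad day overall.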

The “serving-for-the-set” effect is almost exactly the same as what I found three years ago: a drop of a bit more than half a percent. Last year, the impact of serving for the set with a single break lead was a bit greater than I initially found, but it’s still small. We find servers struggling the most when serving to stay in the set while trailing by a single break–holding serve 3.1% less often than usual–and when serving at 4-4 and 5-5, when they hold almost 5% less frequently than expected. These are the most substantial effects I’ve seen, but keep in mind the magnitude–even a 5% difference means it only flips the outcome of one service game in twenty. It certainly matters, but it would be awfully hard to spot with the naked eye.

The one percent

How does Thiem compare? Here is the same set of ratios for him, with separate columns for his career numbers (subject to the limitations of my dataset, which includes few matches before 2012) and for single-season figures from 2016, 2017, and 2018:

Situation        Career   2016   2017   2018  
For-Set           0.996  1.049  1.011  0.966  
- For-Set Close   0.984  1.078  1.008  0.887  
To-Stay           1.030  1.160  1.027  0.940  
- To-Stay Close   0.984  1.148  0.957  0.964  
Tied Late         0.984  0.976  0.991  0.889  
All Else          1.004  0.994  1.009  1.030

Thiem’s career numbers reveal little, just a player who is a tiny bit worse in high-leverage situations, though perhaps a little less affected by the pressure than his peers. The concern is his numbers so far this year, which are way down across the board. Each one of the categories represents a relatively small sample–for example, I have only 42 games in which he was serving for the set with a single break advantage–but taken together, the set of sub-1.0 ratios doesn’t point in an encouraging direction. We could never have forecast before last night’s match that Thiem would serve so well in general but so much weaker in the clutch, but there were subtle hints lurking in his 2018 performance.

A puzzle

I want to show you the same set of data, but for another player. In one way, it’s the opposite of Thiem’s: many more breaks in pressure situations over the course of the player’s career, but the opposite trend in the last few years, pointing toward more service holds:

Situation        Career   2016   2017   2018  
For-Set           0.929  0.931  1.200  1.077  
- For-Set Close   0.910  0.895  1.333  1.000  
To-Stay           1.026  1.077  1.083  1.061  
- To-Stay Close   0.929  1.100  1.167  1.044  
Tied Late         0.905  1.050  1.000  1.048  
All Else          1.011  1.013  1.024  1.013

Any ideas? It’s a bit of a trick question–you’re looking at the tour serving against Rafa. From 2012-15, Nadal absolutely shut down opposing servers starting at about 4-4. (He wasn’t as good–relative to his average, anyway–late in sets on his own serve.) Very few players or seasons show effects of greater than 5% in either direction, but Rafa’s opponents saw their hold rate dip by more than twice that in some seasons. Yet the story has been different for the last year or two, with Rafa himself becoming the underperformer in his late-set return games.

Again, we shouldn’t read too much into a single year of this data: The sample size is an issue, especially for a top player’s return games, because not many guys find themselves serving for a set against him. But had we looked at Nadal’s return record in pressure situations alongside Thiem’s recent serve performance, it would have made for a more complicated picture, one less likely to predict some of the crucial moments in last night’s match. In any given contest, there are simply too few key games for us to forecast their outcome with any success, especially when a let cord, an untimely distraction, or a missed line call could reverse the result. But that doesn’t mean we shouldn’t try to understand them. Unlucky, unclutch, or whatever else, Thiem could have flipped the outcome of the entire match by holding just one of those three games. The stakes could hardly be higher.

Simona Halep and Recoveries From Match Point Down

Italian translation at settesei.it

In yesterday’s French Open quarterfinals, Elina Svitolina held a commanding lead over Simona Halep, up a set and 5-1. Depending on what numbers you plug into the formula, Svitolina’s chance of winning the match at that stage was somewhere between 97% and 99%. Halep fought back to 5-5, and in the second-set tiebreak, Svitolina earned a match point at 6-5. Halep recovered again, won the breaker, and then cruised to a 6-0 victory in the third set.

It’s easy to fit a narrative to that sequence of events: After losing two leads, Svitolina was dispirited, and Halep was all but guaranteed a third-set victory. Maybe. It’s impossible to test that sort of thing on the evidence of a single match, but this is hardly the first time a player has failed to convert match point and needed to start fresh in a new set.

Even without a match point saved, the player who wins the second set has a small advantage going into the decider. In the last six-plus years of women’s Slam matches, the player who won the second set went on to win 51.3% of third sets. On the other hand, if the second set was a tiebreak, the winner of the second set won the decider only 43.7% of the time. Though it sounds contradictory at first, consider what we know about such sets. The second-set winner just barely claimed her set (in the tiebreak), while usually, her opponent took the first set more decisively. Momentum helps a little, but it can’t overcome much of a difference in skill level.

Let’s dig into the specific cases of second-set match points saved. Thanks to the data behind IBM’s Pointstream on Grand Slam websites, we have the point-by-point sequence for most Slam singles matches going back to 2011. (The missing matches are usually those on non-Hawkeye courts and a few small courts at Roland Garros.) That’s over 2,600 women’s singles matches. In just over 1,700 of them, one of the two players earned a match point in the second set. Over 97% of the time, that player converted–needing an average of 1.7 match points to do so–and avoided playing a third set.

That leaves 45 matches in which one player held a match point in the second set, failed to finish the job, and was forced to play a third set. It’s a limited sample, and it doesn’t wholeheartedly support the third-set-collapse narrative suggested above. 60% of the time–27 of the 45 matches–the player who failed to convert match point in the second set, like Svitolina did, went on to lose the third set. The third set was often lopsided: 5 of the 27 were bagels (including yesterday’s match), and the average score was 6-2. None of the third sets went beyond 6-4.

The other 18 matches–the 40% of the time in which the player with the second-set match point bounced back to win the third set–featured rather one-way deciders, as well. In those, the third-set loser managed an average of only 2.3 games, also never doing better than 6-4.

This is a small sample, so it’s unwise to conclude that this 60/40 margin is anything close to an iron law of tennis. That said, it does provide some evidence that players don’t necessarily collapse after failing to convert a straight-sets win at match point. What happened to Svitolina yesterday is far from certain to happen next time.
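To put a number on how small the sample is, here is a minimal Python sketch of a 95% Wilson score interval for the 27-of-45 split. The Wilson interval is my choice of method for illustration; it is not used anywhere in the original analysis.

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    margin = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - margin, center + margin

# 27 of 45 players who missed a second-set match point lost the decider.
low, high = wilson_interval(27, 45)
print(f"observed: {27 / 45:.0%}, 95% interval: {low:.1%} to {high:.1%}")
```

The interval is wide and includes 50%, which fits the caution above about reading much into the 60/40 margin.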

The Odds of Successfully Serving Out the Set

Italian translation at settesei.it

Serving for the set is hard … or so they say. Like other familiar tennis conceits, this one is ripe for confirmation bias. Every time we see a player struggle to serve out a set, we’re tempted to comment on the particular challenge he faces. If he doesn’t struggle, we ignore it or, even worse, remark on how he achieved such an unusual feat.

My findings–based on point-by-point data from tens of thousands of matches from the last few seasons–follow a familiar refrain: If there’s an effect, it’s very minor. For many players, and for some substantial subsets of matches, breaks of serve appear to be less likely at these purportedly high-pressure service games of 5-4, 5-3 and the like.

In ATP tour-level matches, holds are almost exactly as common when serving for the set as at other stages of the match. For each match in the dataset, I found each player’s hold percentage for the match. If serving for the set were more difficult than serving in other situations, we would find that those “average” hold percentages would be higher than players’ success rates when serving for the set.

That isn’t the case. Considering over 20,000 “serving-for-the-set” games, players held serve only 0.7% less often than expected–a difference that shows up only once every 143 attempts. The result is the same when we limit the sample to “close” situations, where the server has a one-break advantage.

Only a few players have demonstrated any notable success or lack thereof. Andy Murray holds about 6% more often when serving for the set than his average rate, making him one of only four players (in my pool of 99 players with 1,000 or more service games) to outperform his own average by more than 5%.

On the WTA tour, serving for the set appears to be a bit more difficult. On average, players successfully serve out a set 3.4% less often than their average success rate, a difference that would show up about once every 30 attempts. Seven of the 85 players with 1,000 service games in the dataset were at least 10% less successful in serving-for-the-set situations than their own standard.
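The "once every N attempts" figures are just the reciprocal of the drop in hold rate. A tiny Python sketch (the function name is illustrative):

```python
def attempts_per_extra_break(hold_rate_drop):
    """How many serving-for-the-set games it takes, on average,
    for a given drop in hold rate to cost one extra break."""
    return 1 / hold_rate_drop

atp = attempts_per_extra_break(0.007)   # 0.7% drop quoted for the ATP
wta = attempts_per_extra_break(0.034)   # 3.4% drop quoted for the WTA

print(f"ATP: one extra break every {atp:.0f} attempts")
print(f"WTA: one extra break every {wta:.1f} attempts")
```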

Maria Sharapova stands out at the other end of the spectrum, holding serve 3% more often than her average when serving for the set, and 7% more frequently than average when serving for the set with a single-break advantage. She’s one of 30 players for whom I was able to analyze at least 100 single-break opportunities, and the only one of them to exceed expectations by more than 5%.

Given the size of the sample–nearly 20,000 serving-for-the-set attempts, with almost 12,000 of them single-break opportunities–it seems likely that this is a real effect, however small. Strangely, though, the overall finding is different at the lower levels of the women’s game.

For women’s ITF main draw matches, I was able to look at another 30,000 serving-for-the-set attempts, and in these, players were 2.4% more successful than their own average in the match. In close sets, where the server held a one-break edge, the server’s advantage was even greater: 3.5% better than in other games.

If anything, I would have expected players at lower levels to exhibit greater effects in line with the conventional wisdom. If it’s difficult to serve in high-pressure situations, it would make sense if lower-ranked players (who, presumably, have less experience with and/or are less adept in these situations) were not as effective. Yet the opposite appears to be true.

Lower-level averages from the men’s tour don’t shed much light, either. In main draw matches at Challengers, players hold 1.4% less often when serving for the set, and 1.8% less often with a single-break advantage. In futures main draws, they are exactly as successful when serving for the set as they are the rest of the time, regardless of their lead. In all of the samples, there are only a handful of players whose record is 10% better or worse when serving for the set, and a small percentage who over- or underperform by even 5%.

The more specific situations I analyze, the more the evidence piles up that games and points are, for the most part, independent–that is, players are roughly as effective at one score as they are at any other, and it doesn’t matter a great deal what sequence of points or games got them there. There are still plenty of situations that haven’t yet been analyzed, but if the ones that we talk about the most don’t exhibit the strong effects that we think they do, that casts quite a bit of doubt on the likelihood that we’ll find notable effects elsewhere.

If there is any truth to claims like those about the difficulty of serving for the set, perhaps it is the case that the pressure affects both players equally. After all, if a server needs to hold at 5-4, it is equally important for the returner to seize the final break opportunity. Maybe the level of both players drops, something we might be able to determine by analyzing how these points are played.

For now, though, we can conclude that players–regardless of gender or level–serve out the set about as often as they successfully hold at 1-2, or 3-3, or any other particular score.

How Important is the Seventh Game of the Set?

Italian translation at settesei.it

Few nuggets of tennis’s conventional wisdom are more standard than the notion that the seventh game of each set is particularly crucial. While it’s often difficult to pin down such a well-worn conceit, it seems to combine two separate beliefs:

  1. If a set has reached 3-3, the pressure is starting to mount, and the server is less likely to hold serve.
  2. The seventh game is somehow more important than its immediate effect on the score, perhaps because the winner gains momentum by taking such a pivotal game.

Let’s test both.

Holding at 3-3

Drawing on my database of over 11,000 ATP tour-level matches from the last few years, I found 11,421 sets that reached three-all. For each, I calculated the theoretical likelihood that the server would hold (based on his rate of service points won throughout the match) and his percentage of service games won in the match. If the conventional wisdom is true, the percentage of games won by the server at 3-3 should be noticeably lower.

It isn’t. Using the theoretical model, these servers should have held 80.5% of the time. Based on their success holding serve throughout these matches, they should have held 80.2% of the time. At three-all, they held serve 79.5% of the time. That’s lower, but not by enough that a human would ever notice. The difference between 80.2% and 79.5% is roughly one extra break at 3-3 per Grand Slam. Not Grand Slam match–an entire tournament.
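The post doesn't spell out the theoretical model. A common choice, assumed here, treats every service point as independent with the same win probability and uses the standard closed form for holding serve. The function name `hold_prob` is mine.

```python
def hold_prob(p):
    """Probability of winning a service game, given probability p of
    winning each service point (points assumed independent)."""
    q = 1 - p
    # Win to love, 15, or 30: four points won before the returner reaches 40-40.
    before_deuce = p ** 4 * (1 + 4 * q + 10 * q ** 2)
    # Reach deuce (three points each), then win two points in a row first.
    reach_deuce = 20 * p ** 3 * q ** 3
    win_from_deuce = p * p / (p * p + q * q)
    return before_deuce + reach_deuce * win_from_deuce

for p in (0.55, 0.60, 0.65, 0.70):
    print(f"service points won {p:.0%} -> hold probability {hold_prob(p):.1%}")
```

A small edge on each point compounds into a much larger edge over the game, which is why hold rates in men's tennis sit so far above 50%.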

None of that 0.7% discrepancy can be explained by the effect of old balls [1]. Because new balls are introduced after the first seven games of each match, the server at three-all in the first set is always using old balls, which should, according to another bit of conventional wisdom, work against him. However, the difference between actual holds and predicted holds at 3-3 is slightly greater after the first set: 78.9% instead of the predicted 79.8%. Still, this difference is not enough to merit the weight we give to the seventh game.

The simple part of our work is done: Servers hold at three-all almost as often as they do at any other stage of a match.

Momentum from the seventh game

At 3-3, a set is close, and every game matters. This is especially true in men’s tennis, where breaks are hard to come by. Against many players, getting broken so late in the set is almost the same as losing the set.

However, the focus on the seventh game is a bit odd. It’s important, but not as important as serving at 3-4, or 4-4, or 4-5, or … you get the idea. The closer a game to the end of the set, the more important it is–theoretically, anyway. If 3-3 is really worth the hoopla, it must grant the winner some additional momentum.

To measure the effect of the seventh game, I took another look at that pool of 11,000-plus sets that reached three-all. For each set, I calculated the two probabilities–based on each player’s service points won throughout the match–that the server would win the set:

  1. the 3-3 server’s chance of winning the set before the 3-3 game
  2. his chance of winning the set after winning or losing the 3-3 game
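The two probabilities listed above can be sketched with a small recursion over game scores. This is a simplified Python illustration rather than the model actually used. It takes each player's hold probability as a direct input (the analysis derives these from service points won). It also treats a tiebreak at 6-6 as a fixed-probability event, an assumption made to keep the example short.

```python
from functools import lru_cache

def set_win_prob(games_a, games_b, a_serves_next, hold_a, hold_b, tiebreak_a=0.5):
    """Probability that player A wins the set from the given game score."""

    @lru_cache(maxsize=None)
    def win(a, b, a_serving):
        if a == 6 and b == 6:
            return tiebreak_a              # assumed tiebreak win probability
        if a >= 6 and a - b >= 2:
            return 1.0                     # A has won the set
        if b >= 6 and b - a >= 2:
            return 0.0                     # B has won the set
        # A wins the next game by holding, or by breaking B's serve.
        p_game = hold_a if a_serving else 1 - hold_b
        return (p_game * win(a + 1, b, not a_serving)
                + (1 - p_game) * win(a, b + 1, not a_serving))

    return win(games_a, games_b, a_serves_next)

# Hypothetical example: both players hold 80% of the time, A serves at 3-3.
before = set_win_prob(3, 3, True, 0.8, 0.8)
after_hold = set_win_prob(4, 3, False, 0.8, 0.8)
after_break = set_win_prob(3, 4, False, 0.8, 0.8)
print(f"before: {before:.1%}, after hold: {after_hold:.1%}, after break: {after_break:.1%}")
```

Comparing `after_hold` and `after_break` with what servers actually went on to do is the test for momentum. Any gap beyond the model's usual error would be the part the score alone can't explain.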

In this sample of matches, the average server at three-all had a 48.1% chance of winning the set before the seventh game. The servers went on to win 49.4% of the sets [2].

In over 9,000 of our 3-3 sets, the server held at 3-3. These players had, on average, a 51.3% chance of winning the set before serving at 3-3, which rose to an average of a 57.3% chance after holding. In fact, they won the set 58.6% of the time.

In the other 2,300 of our sets, the server failed to hold. Before serving at three-all, these players had a 35.9% chance of winning the set, which fell to 12.6% after losing serve. These players went on to win the set 13.7% of the time. In all of these cases, the model slightly underestimates the likelihood that the server at 3-3 goes on to win the set.

There’s no evidence here for momentum. Players who hold serve at three-all are slightly more likely to win the set than the model predicts, but the difference is no greater than that between the model and reality before the 3-3 game. In any event, the difference is small, affecting barely one set in one hundred.

When a server is broken at three-all, the evidence directly contradicts the momentum hypothesis. Yes, the server is much less likely to win the set–but that’s because he just got broken! The same would be true if we studied servers at 3-4, 4-4, 4-5, or 5-5. Once we factor in the mathematical implications of getting broken in the seventh game, servers are slightly more likely to win the set than the model suggests. Certainly the break does not swing any momentum in the direction of the successful returner.

There you have it. Players hold serve about as often as usual at three-all (whether they’re serving with new balls or not), and winning or losing the seventh game doesn’t have any discernible momentum effect on the rest of the set [3]. Be sure to tell your friendly neighborhood tennis pundits.


The Effects (and Maybe Even Momentum) of a Long Rally

Italian translation at settesei.it

In yesterday’s quarterfinal between Simona Halep and Victoria Azarenka, a highlight early in the third set was a 25-shot rally that Vika finished off with a forehand winner. It was the longest point of the match, and moved her within a point of holding serve to open the set.

As very long rallies often do, the point seemed like it might represent a momentum shift. Instead, Halep sent the game back to deuce after a 10-stroke rally on the next point. If there was any momentum conferred by these two points, it disappeared as quickly as it arose. It took eight more points before Azarenka finally sealed the hold of serve.

Does a long rally tell us anything at all? Does it have predictive value for the next point, or even the entire game, or is it just highlight-reel fodder that is forgotten as soon as the umpire announces the score?

To answer those questions, I delved into the shot-by-shot data of the Match Charting Project, which now contains point-by-point accounts of nearly 1,100 matches. I identified the longest 1% of points–17 shots or longer for women, 18 shots for men–and analyzed what happened afterwards, looking for both fatigue and momentum effects.

The next point

There’s one clear effect of a long rally: The next point will be shorter than average. The 10-shot rally contested by Vika and Simona yesterday was an outlier: Women average 4.45 shots on the point after a long rally, while the overall average (controlled for server and first or second serve) is 4.85. Men average 4.03 shots on the following point, compared to an average of 4.64.

For women, fatigue is also a factor for the server. Following a long rally, women land only 61.3% of first serves, compared to an average of 64.6%. Men don’t exhibit the same fatigue effect; the equivalent numbers are 62.3% and 62.2%.

There’s more evidence of an immediate fatigue factor for women, as well. The players who win those long rallies are slightly better than their opponents, winning 50.7% of points on average. Immediately after a long rally, however, players win only 49% of points. It’s not obvious to me why this should be the case. Perhaps the player who won the long rally worked a bit harder than her opponent, maybe putting all of her remaining effort into a groundstroke winner, or finishing the point with a couple of athletic shots at the net.

In any case, there’s no equivalent effect for men. After winning a long rally, players win 51.1% of their next points, compared to an expected 50.8%. That’s either a very small momentum effect or, more likely, a bit of statistical noise.

Both men and women double fault more often than usual after a long rally, though the effect is much greater for women. Immediately following these points, women double fault 4.7% of the time, compared to an average of 3.3%. Men double fault 4.5% of the time after a long rally, compared to an expected rate of 4.2%.

Longer-term momentum

Beyond a slight effect on the characteristics of the next point, does a long rally influence the outcome of the game? The evidence suggests that it doesn’t.

For each long rally, I identified whether the winner of the rally went on to win the game, as Vika did yesterday. I also combined the score after the long rally with the average rate of points won on the appropriate player’s serve to calculate the odds that, from such a score, the player who won the rally would go on to win the game. To use yesterday’s example, when Azarenka held game point at AD-40, her chances of winning the game were 77.6%.
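The odds of winning a game from any point score can be sketched as follows in Python. The sketch assumes every point on serve is won independently with the same probability `p`. The function name and the integer encoding of the score (0, 1, 2, 3 for love, 15, 30, 40, and 4 versus 3 for advantage) are my own conventions.

```python
from functools import lru_cache

def game_win_prob(p, server_pts=0, returner_pts=0):
    """Probability the server wins the game from the given point score."""
    q = 1 - p
    from_deuce = p * p / (p * p + q * q)   # win two points in a row first

    @lru_cache(maxsize=None)
    def win(s, r):
        if s >= 4 and s - r >= 2:
            return 1.0                     # server has won the game
        if r >= 4 and r - s >= 2:
            return 0.0                     # returner has won the game
        if s >= 3 and r >= 3:
            if s == r:
                return from_deuce          # deuce
            if s > r:
                return p + q * from_deuce  # advantage server
            return p * from_deuce          # advantage returner
        return p * win(s + 1, r) + q * win(s, r + 1)

    return win(server_pts, returner_pts)

# Advantage server, with a hypothetical 51.8% of service points won.
print(f"{game_win_prob(0.518, 4, 3):.1%}")
```

By my calculation, a service-points-won rate a little under 52% puts the server's chances at AD-40 close to the 77.6% quoted above. That input is a back-calculation for illustration, not the rate used in the original analysis.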

For both men and women, there is no significant effect. Women who won long rallies went on to win 66.2% of those games, while they would have been expected to win 65.7%. Men won 64.4% of those games, compared to an expected rate of 64.1%.

With a much larger dataset, these findings might indicate a very slight momentum effect. But limited to under 1,000 long-rally points for each gender, the differences represent only a few games that went the way of the player who won the long point.

For now, we’ll have to conclude that the aftereffects of a long rally have a very short lifespan: barely one point for women, perhaps not even that long for men. These points may well have a greater effect on fans than they do on the players themselves.