Are You There, Margaret? It’s Me, Serena

As I write this, Serena Williams is two matches away from winning her 24th grand slam. She’s been stuck on 23 since early 2017, which must be frustrating, since the all-time record is 24. Serena already holds the open-era record (for titles since 1968), one ahead of Steffi Graf’s 22. But Margaret Court is the leader across all eras, with 24 major championships between 1960 and 1973.

Williams is, of course, one of the greatest players of all time. Maybe the greatest. Court is also in the conversation, along with other luminaries such as Graf, Martina Navratilova, and Chris Evert. Cross-era comparisons in tennis are extremely difficult, because nearly everything about the game has changed. Serena’s technique, training, equipment, and tour schedule–not to mention wealth and celebrity status!–would all be extremely foreign to a 1960s or 70s superstar such as Court.

The challenge of cross-era comparisons hasn’t stopped fans from expressing opinions about where Williams should stand on the all-time leaderboard. Regardless of whose trophy cabinet numbers 23 or 24, Serena supporters tend to rely on three main arguments:

  1. The level of competition is way higher now than it was back then.
  2. Court won the Australian Open 11 times, back when it was the weakest of the four majors.
  3. Court is an obnoxious blowhard whose opinions are unacceptable.

Number one is probably true, but if we’re going to attempt cross-era comparisons, I think the only valid way to do so is to treat all eras as equal. We’ll never know how Williams would have fared with a wooden racket, or how Court’s body would’ve responded to today’s more physical game. You can make a logical case that today’s players are simply better than those of a couple generations ago, who were better than the ones before them, and so on. But the very idea of a “greatest of all time” implies something different than the “greatest of all time measured by today’s standards,” so we’re going to treat all eras as equal.

Number three is also popular, but my database isn’t able to shine much light on that line of argument.

That leaves number two, the relative weakness of the Australian Open.

Aussie ease

Court won the Australian Open 11 times, more than any other woman has claimed a single major title. In itself, that’s not a negative. Nobody counts it against Rafael Nadal that he’s won the French Open 12 times. But in the amateur era–and for some years after tennis went fully professional–the Australian Open wasn’t a mandatory stop for the best players in the world. It was a long trip, and it hadn’t yet gained the prestige that it holds today.

Thus, it’s fair to conclude that Court’s 1963 Wimbledon title was a more noteworthy accomplishment than her trophy at the 1963 Australian Championships. Most of us would agree that we should discount those Australian Opens. But by how much?

Difficulty-adjusted slam titles

In the past, I’ve compared men’s greatest-of-all-time candidates by major titles, adjusted for the level of competition. In the modern game, the field is almost exactly the same from one major to the next, but the draw can make one tournament considerably more difficult to win than another. The same technique allows us to compare draw difficulty and field quality for tournaments from the 1970s, when both varied. For instance, the difficulty of Court’s path to the 1973 US Open title rated as average, in line with many of Williams’s title paths. But her 1973 Australian crown was only two-thirds as difficult–one of the easiest paths to a major title in the open era.

It’s no accident that I’m using Court’s last few major titles as examples. By analyzing performances from the 1970s, we’re pushing up against the edge of the weakness of historical tennis data. It’s well-nigh impossible to estimate the exact difficulty level of most of Court’s titles, because so little data is available from the amateur era. Instead, we’ll need to approximate using the limited information we have.

My difficulty adjustments rely on Elo ratings, which I have calculated as far back as 1972. (We have fairly complete results back to 1970 or so, but it takes a bit of time to amass a decent sample of match results for each player and for ratings to stabilize.) Let’s look at the relative difficulty of the four grand slams in the first five possible years, 1972-76:

Major            Difficulty  
Australian Open        0.60  
French Open            0.54  
Wimbledon              0.99  
US Open                0.85

The average major title, 1972-present, rates 1.0, with more difficult paths earning higher numbers. The fields weren’t as deep in the 1970s as they are now, so the typical path to a slam title then was lower than 1.0. In this first five-year period, we see that Wimbledon was in line with the historical average, the US Open was a bit easier, and the other two quarters of the grand slam were considerably less challenging. If we follow my suggestion above, to treat all eras as equal–except for the weakness of the Australian draws–we need to normalize these difficulties so that the other three slams average 1.0:

Major            Difficulty  
Australian Open        0.76  
French Open            0.68
Wimbledon              1.25  
US Open                1.07
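
The normalization step can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the code behind the tables: the raw values are copied from the first table, and the function name `normalize` is made up for the example.

```python
# Raw 1972-76 path difficulties, copied from the first table above.
raw = {
    "Australian Open": 0.60,
    "French Open": 0.54,
    "Wimbledon": 0.99,
    "US Open": 0.85,
}

def normalize(difficulty, excluded="Australian Open"):
    """Rescale every slam so that the slams other than `excluded` average 1.0."""
    others = [value for major, value in difficulty.items() if major != excluded]
    scale = sum(others) / len(others)  # average of the three non-Australian slams
    return {major: value / scale for major, value in difficulty.items()}

normalized = normalize(raw)
for major, value in normalized.items():
    print(f"{major:<16} {value:.2f}")
```

Dividing by the average of the French Open, Wimbledon, and US Open figures leaves those three averaging 1.0, while the Australian Open keeps its discount relative to them.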

Extrapolating backwards

We don’t know much about the field quality of the Australian majors in Court’s prime. For lack of a better option, then, we’ll use the 1972-76 average, since that’s as close as we can get. These probably overstate the quality of the Australian draws relative to the other slams, but if we’re inching toward calling Serena the all-time leader at Court’s expense, we should make conservative assumptions, to give us more confidence in our end result.

Here’s what happens to Court’s career totals if we apply the normalized adjustments:

Major            Difficulty  Slams  Adj Slams  
Australian Open        0.76     11        8.3  
French Open            0.68      5        3.4  
Wimbledon              1.25      3        3.7  
US Open                1.07      5        5.4  
Total                           24       20.8
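
For readers who want to check the table, here is a small Python sketch of the same multiplication. It assumes the adjustment factors are carried unrounded from the 1972-76 difficulties (0.60, 0.54, 0.99, 0.85), which appears to be what reproduces the figures shown. The variable names are illustrative only.

```python
# 1972-76 raw difficulties and Court's title counts, both taken from the tables above.
raw = {"Australian Open": 0.60, "French Open": 0.54, "Wimbledon": 0.99, "US Open": 0.85}
court_titles = {"Australian Open": 11, "French Open": 5, "Wimbledon": 3, "US Open": 5}

# Normalize so the three non-Australian slams average 1.0 (factors kept unrounded).
scale = sum(v for k, v in raw.items() if k != "Australian Open") / 3
adjusted = {major: court_titles[major] * raw[major] / scale for major in court_titles}
total = sum(adjusted.values())

for major, value in adjusted.items():
    print(f"{major:<16} {court_titles[major]:>2}  {value:.1f}")
print(f"{'Total':<16} {sum(court_titles.values()):>2}  {total:.1f}")
```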

The same process–adjusting each slam for difficulty, and normalizing for era–makes milder tweaks to Williams’s and Graf’s totals. Serena ends up with 23.3, and Graf with 21.9. Neither is enough to give us reason to change how we view those players’ accomplishments. And both are better than Court’s modified tally.

The small herd of GOATs

Remember that this is not an era adjustment. To the contrary, this calculation is based on the simplifying assumption that all eras are equal, except for the fact that for many years, some of the best players didn’t travel to Australia, making that major easier to win than the others.

These numbers also–obviously!–don’t tell us that Court wasn’t one of the best ever. Even if she had skipped her home slam, she still would’ve retired with 13 majors, plus a pile of doubles grand slam trophies and a long list of other career accomplishments. If Australia were less geographically remote, she probably wouldn’t have won those eleven titles–but she may well have won eight.

For all of Court’s accomplishments, she loses her top spot on the sport’s most hallowed list once we account for the weakness of the early Australian Open draws. At the very least, she falls behind Williams and Graf. Remember that my adjustments are conservative ones, so if we collect more data and discover that we should more aggressively discount her 1960s Australian titles, her resulting total might leave her closer to 18, tied with Evert and Navratilova.

Serena may never equal or beat Court’s 24 titles. But even if she retires with 23, the modern level of competition–which showed up at every major, every year–means that she already deserves her place atop the leaderboard.

There’s Always a Chance: Marie Bouzkova Edition

Last night in Toronto, 91st-ranked qualifier Marie Bouzkova won her quarter-final match against 4th-ranked Simona Halep. Halep retired with a leg injury after losing the first set, so there’s a caveat–even if we were prepared to read too much into a single match, we wouldn’t attribute a lot of meaning to this one. But it’s a big accomplishment for the 21-year-old Czech, who earned her second top-ten scalp of the week and will advance to her first Premier-level semi-final, against no less of an obstacle than Serena Williams.

Here’s the nutty thing: It was Bouzkova’s 62nd match of the 2019 season, her 61st against someone with a WTA ranking. She got the win against the highest-ranked foe–Halep–but just last week, she lost to 636th-ranked CoCo Vandeweghe, her lowest-ranked opponent of the year. Yeah, the caveats keep coming: Vandeweghe is coming back from injury and is surely better than a ranking outside the top 600, and the ITF Transition Tour hijinks mean that the ranking system didn’t work as usual in 2019. Some players who would normally have a very low ranking, like the Kazakh wild card who Bouzkova crushed a couple of weeks ago, don’t count.

Still. 61 matches, with a win against the highest-ranked player and a loss against the lowest.

That sent me to my database, which had plenty more surprises in store. Going back less than a decade, to 2010, I found 127 players who recorded the same oddball combination of feats in a single season, minimum 30 matches. (To be consistent with the Halep result, I included retirements if at least one set was completed.) While many of the players won’t be of wide interest–last year, one of the exemplars was Mira Antonitsch, who didn’t play anyone ranked in the top 400–63 of the 127 player-seasons involved beating a top-100 opponent, 44 included the defeat of someone in the top 50, and 25 were highlighted by a top-ten upset.
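
The filter is simple enough to sketch in Python. This is a rough reconstruction, not the actual database query: the `Match` record and its fields are hypothetical, and it assumes that the 30-match minimum applies to matches against ranked opponents and that a single win over the top-ranked opponent (or a single loss to the lowest-ranked one) is enough to qualify.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Match:
    opponent: str
    opponent_rank: Optional[int]  # None if the opponent had no ranking
    won: bool
    completed_sets: int = 2       # sets finished before any retirement

def beat_best_lost_to_worst(matches: List[Match], min_matches: int = 30) -> bool:
    """True if the player beat her highest-ranked opponent and lost to her lowest-ranked one."""
    # Ignore unranked opponents; count retirements only if at least one set was completed.
    counted = [m for m in matches if m.opponent_rank is not None and m.completed_sets >= 1]
    if len(counted) < min_matches:
        return False
    best = min(m.opponent_rank for m in counted)   # numerically lowest = highest ranked
    worst = max(m.opponent_rank for m in counted)
    beat_best = any(m.won for m in counted if m.opponent_rank == best)
    lost_to_worst = any(not m.won for m in counted if m.opponent_rank == worst)
    return beat_best and lost_to_worst

# Toy season: the Halep and Vandeweghe results come from the text,
# the 59 filler opponents are invented purely to reach 61 ranked matches.
season = [Match("Halep", 4, True, completed_sets=1), Match("Vandeweghe", 636, False)]
season += [Match(f"filler {i}", 100 + i, i % 2 == 0) for i in range(59)]
print(beat_best_lost_to_worst(season))
```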

Three of them included Halep as the top-ten scalp! That makes Bouzkova the fourth player to beat Halep, not face anyone higher ranked, and also lose to her lowest-ranked opponent of the season. (Through eight months, anyway.) Halep shouldn’t feel too bad, though, as Angelique Kerber has been the extreme-ranked loser in five such cases, four of them in 2017. Ouch.

Here are the 25 player-seasons between 2010 and 2018 in which a WTAer beat a top-ten player who was her highest-ranked opponent and lost to her lowest-ranked one:

Year  Player       High-Ranked  Rk  Low-Ranked  Rk       
2017  Kasatkina    Kerber       1   Kanepi      418      
2018  Hsieh        Halep        1   Gasparyan   410      
2010  Jankovic     Serena       1   Diyas       268      
2010  Clijsters    Wozniacki    1   G-Vidagany  258   *  
2014  Cornet       Serena       1   Townsend    205      
2010  Yakimova     Jankovic     2   Dellacqua   980      
2017  Bouchard     Kerber       2   Duval       896   *  
2017  Vesnina      Kerber       2   Azarenka    683      
2016  Bencic       Kerber       2   Boserup     225      
2014  Rybarikova   Halep        2   Eguchi      183      
2017  Mladenovic   Kerber       2   Andreescu   167   *  
2018  Goerges      Wozniacki    3   Serena      451      
2014  Tomljanovic  Radwanska    3   A Bogdan    308      
2015  Mladenovic   Halep        3   Savchuk     262      
2017  Kerber       Pliskova     4   Stephens    934      
2014  Pavlyu'ova   Radwanska    4   Wozniak     241      
2017  Dodin        Cibulkova    5   Rybarikova  453      
2017  Bellis       Radwanska    6   Azarenka    683      
2018  Buyukakcay   Ostapenko    6   Di Sarra    555      
2017  Sakkari      Wozniacki    6   Potapova    454      
2015  L Davis      Bouchard     7   E Bogdan    527      
2015  Ostapenko    S-Navarro    9   Dushevina   1100  *  
2016  KC Chang     Vinci        10  S Murray    862      
2018  Pera         Konta        10  Hlavackova  825      
2018  Danilovic    Goerges      10  Pegula      620

* also faced one unranked player

A quick glance is all it takes to establish that Vandeweghe isn’t the first lowest-ranked player to inspire a “yeah, but” reaction. The list of purportedly weak opponents is very strong for one made up of players with an average ranking outside of the top 500. We have stars such as Victoria Azarenka (twice) and Serena as well as a helping of prospects such as Bianca Andreescu and Victoria Duval.

Consider this as today’s reminder of the limitations of the WTA computer rankings. They tell us who has won a lot of matches in the last 52 weeks, not necessarily who is playing well right now. These cases include many of the most extreme mismatches between official ranking and on-the-day ability. I don’t think it says anything meaningful about a player to show up on this list–though Kerber’s many appearances (as both player and scalp!) are a good summary of her disappointing 2017 campaign.

Bouzkova will remain on the list for at least a couple more days: Serena is currently ranked 10th and both of the other semi-finalists are ranked lower, so Halep will remain her “toughest” opponent. Despite the Czech’s breakout week, it would be understandable if she found herself overawed to face a 23-time slam champion across the net. But one thing is certain: Bouzkova couldn’t care less about the number next to the name.

GOAT Races: Forecasting Future Slams With a Monkey

After Novak Djokovic won his 16th career major at Wimbledon this year, more attention than ever focused on the all-time grand slam race. Roger Federer has 20, Rafael Nadal has 18, and Djokovic is–by far–the best player in the world on the surface of the next two slams. This is anybody’s ballgame.

Forecasting tennis is hard, and that’s just if you’re trying to pick the results of tomorrow’s matches. Players improve and regress seemingly at random, making it difficult to predict what the ranking table will look like only a few months from now. Fans love to speculate about which of the big three will, in the end, win the most slams, but there are an awful lot of unknowns to contend with.

One can imagine some way to construct a crystal ball to get these numbers in a rigorous way. Consider each player’s age, his likely career length, his chances of injury, his recent performance at each of the four slams, his current ranking, the quality of the field on each surface, and probably more, and maybe you could come up with some plausible numbers. Or… what if we skip most of that, and build the simplest model possible?

Enter the monkey

Baseball statheads are familiar with the Marcel projection system, named after a fictional monkey because it “uses as little intelligence as possible.” Just three years of results and an age adjustment. It isn’t perfect, and there are plenty of “obvious” improvements that it leaves on the table. But as in tennis, baseball stats are noisy. For most purposes, a “basic” forecasting system is as good as a complicated one, and over the years, Marcel has outperformed a lot of models that are considerably more complex.

Let’s apply primate logic to slam predictions. First, I’m going to slightly re-cast the question to something a bit more straightforward. Instead of forecasting “career” slam results, we’re going to focus on major titles over the next five years. (That should cover the big three, anyway.) And in keeping with Marcel, we’ll use just a few inputs: slam semi-finals, finals, and titles for the last three years, plus age. Actually, we’re going to lop off a bit of the monkey’s brain right away, because slam results from three years ago aren’t that predictive. So our list of inputs is even shorter: two years of slam semi-finals, finals, and titles, plus age.

The resulting model is pretty good! For players who have reached a major semi-final in any of the last eight slams, it predicts 40% of the variation in next-five-years slam titles. Without building the hyper-complex, optimal model, we don’t know exactly how good that is, but for a forecast that extends so far into the future, capturing almost half of the player-to-player variation in slam results sounds good to me. Think of all the things we don’t know about the slams in 2022, let alone 2024: who is still playing, who gets hurt, who has improved enough to contend, which prospects have come out of nowhere, and so on. Point being, the best model is going to miss a lot, so we shouldn’t set our standards too high.

Follow the monkey

The two-years-plus-age algorithm is so simple that you can literally do it on the back of an envelope. For any player, count his semi-final appearances (won or lost), final appearances (won or lost), and titles at the last four slams, then do the same for the previous four. Then note his age at the start of the next major. Start with zero points, then follow along:

  • add 15 points for each semi-final appearance in the last four slams
  • add 30 points for each final appearance in the last four slams
  • add 90 points for each title in the last four slams
  • add 6 points for each semi-final appearance in the previous four slams
  • add 12 points for each final appearance in the previous four slams
  • add 36 points for each title in the previous four slams
  • if the player is older than 27, subtract 8 points for each year he is older than 27
  • if the player is younger than 27, add 8 points for each year he is younger than 27
  • divide the sum by 100

That’s it! Let’s try Djokovic. In the last four majors, he’s won three titles and made one more semi-final. In the four before that, he won one title. He’ll enter the US Open at 32 years of age. Here goes:

  • +60 (15 points for each of his four semi-finals in the last four slams)
  • +90 (30 points for each of his three finals in the last four slams)
  • +270 (90 points for each of his three titles in the last four slams)
  • +6 (6 points for his 2018 Wimbledon semi-final)
  • +12 (12 points for his 2018 Wimbledon final)
  • +36 (36 points for his 2018 Wimbledon title)
  • -40 (Novak is 32, so we subtract 8 points for each of the 5 years he is older than 27)

Add it all up, and you get 434. Divide by 100, and we’re predicting 4.34 more slams for Novak.
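
The envelope math translates directly into Python. The sketch below uses the point values from the list above. The function name and the `peak_age` parameter are my own additions for illustration, and other players' published forecasts may not be reproduced exactly if the underlying model uses unrounded coefficients.

```python
def monkey_forecast(recent, previous, age, peak_age=27):
    """Predicted slam titles over the next five years.

    recent / previous: (semi-finals, finals, titles) in the last four slams
    and in the four slams before that. Appearances count whether won or lost.
    """
    sf, finals, titles = recent
    prev_sf, prev_finals, prev_titles = previous
    points = 15 * sf + 30 * finals + 90 * titles
    points += 6 * prev_sf + 12 * prev_finals + 36 * prev_titles
    points += 8 * (peak_age - age)  # negative for players older than the peak age
    return points / 100

# Djokovic entering the 2019 US Open: four semi-finals, three finals, three titles
# in the last four slams, one title run in the four before that, age 32.
djokovic = monkey_forecast(recent=(4, 3, 3), previous=(1, 1, 1), age=32)
print(djokovic)  # 4.34
```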

Next-level GOAT trolling

I promise, I went about this project solely as a disinterested analyst. I just wanted to know how accurate a bare-bones long-term slam forecast could be. My goal was not to make you tear your hair out. But hey, you were probably going to lose your hair anyway.

Here is the number of slams that the model predicts for the big three between the 2019 US Open and 2024 Wimbledon:

  • Djokovic: 4.34
  • Nadal: 2.22
  • Federer: 0.26

You probably don’t need me to do the math for the next step, but you know I can’t not do it. Projected career totals:

  • Djokovic: 20.34
  • Federer: 20.26
  • Nadal: 20.22

Or, since we live in a world where you can’t win fractional majors:

  • Djokovic: 20
  • Federer: 20
  • Nadal: 20

Ha.

Back to the model

Djokovic’s forecast of 4.34 is quite high, in keeping with a player who has won three of the last four majors. For each year since 1971, I calculated a slam prediction for every player who had made a major semi-final in the previous two years–a total of more than 800 forecasts. Only 14 of those forecasts were higher than 4.34, and several of those belonged to the big three. Here are the top ten:

Year  Player         Age   Predicted  Actual     
2008  Roger Federer   26        6.38       5     
2007  Roger Federer   25        5.86       7     
2016  Novak Djokovic  28        5.20       6  *  
2005  Roger Federer   23        4.91      11     
2011  Rafael Nadal    24        4.89       5     
2006  Roger Federer   24        4.86      10     
2017  Novak Djokovic  29        4.79       4  *  
2012  Novak Djokovic  24        4.68       8     
1989  Mats Wilander   24        4.65       0     
1988  Ivan Lendl      27        4.56       2 

* actual slam counts that could still increase

All of these predictions are based on data available at the beginning of the named year. So the top row, 2008 Federer, is the forecast for Federer’s 2008-12 title count, based on his 2006-07 performance and his age entering the 2008 Australian. Had the model existed back then, it would have guessed he’d win a half-dozen slams in that time period. He came close, winning five.

There will be plenty of noise at the extreme ends of any model like this. At the beginning of 2005, the algorithm pegged Federer to win “only” five of the next twenty majors. Instead, he won 11. I can’t imagine any data-based system would have been so optimistic as to guess double digits. On the flip side, the 1989 edition of the monkey would’ve been nearly as hopeful for Mats Wilander, who was coming off a three-slam campaign. Sadly for the Swede, a gang of youngsters overtook him and he never made another major final.

Let’s also take a look at the next 10 rosiest forecasts, plus the current guesstimate for Djokovic:

Year  Player          Age  Predicted  Actual     
2010  Roger Federer    28       4.48       2     
1981  Bjorn Borg       24       4.47       1     
1996  Pete Sampras     24       4.47       6     
1975  Jimmy Connors    22       4.45       2     
Curr  Novak Djokovic   32       4.34       0  *  
1980  Bjorn Borg       23       4.28       3     
2013  Novak Djokovic   25       4.24       7     
2009  Roger Federer    27       4.20       4     
1995  Pete Sampras     23       4.16       7     
2009  Rafael Nadal     22       4.12       8     
1979  Bjorn Borg       22       4.09       5 

Plenty more noise here, with outcomes between 0 and 8 slams. Still, the average result of the 10 other predictions on this list is 4.5 slams, right in line with our forecast for Novak.

Missing slams…

The model expects that the big three will win around seven of the next twenty slams. You might reasonably wonder: What about the other thirteen?

The monkey only considers players with a slam semi-final in the last eight majors, so the forecasts shouldn’t add up to 20. There’s a chance that the champions in 2023 and 2024 aren’t yet on our radar, and many young names of interest to pundits these days, like Alexander Zverev, Felix Auger Aliassime, and Daniil Medvedev, haven’t yet reached the final four of a major. Here are the players for whom we can make predictions:

Player                 Predicted Slams  
Novak Djokovic                    4.34  
Rafael Nadal                      2.22  
Dominic Thiem                     0.71  
Stefanos Tsitsipas                0.63  
Hyeon Chung                       0.38  
Lucas Pouille                     0.31  
Kyle Edmund                       0.30  
Roger Federer                     0.26  
Juan Martin del Potro             0.19  
Marco Cecchinato                  0.06  
----------------                  ----  
TOTAL                             9.40 

(The five other players with semi-final appearances since the 2017 US Open are forecast to win zero slams.)

Yeah, I know, Lucas Pouille and Hyeon Chung aren’t really better bets to win a slam than Federer is. But they are (relatively) young, and the model recognizes that many players who reach slam semi-finals early in their careers are able to build on that success.

More to the point, we’re leaving a lot of majors on the table. If the overall forecast is correct, that list of players will win fewer than half of the next 20 slams, leaving at least ten championships to players who have yet to win a major quarter-final.

…and age

Remember, I retro-forecasted every five-year period back to 1971-75. Over the 44 five-year spans starting each season between 1971 and 2014, the model typically predicted that the players it knew about–the ones who had reached slam semi-finals in the last two years–would win 13 of the next 20 slams. In fact, those on-the-radar players combined to win an average of 12 majors in the ensuing five-year spans.

Only in the last few years has the total number of predicted slams fallen below 10. The culprit is age: Recall that every forecast has an age adjustment, and we subtract 8 points (0.08 slams) for each year a player is older than 27. That’s a 0.4-slam penalty for both Djokovic and Nadal, and it’s 0.8 slams erased from Federer’s future tally. Thus, the model predicts that the big three are fading, and there aren’t many youngsters (like Pouille and Chung) on the list to compensate.

How you interpret these big three forecasts in light of the “missing” slams depends on a couple of factors:

  • Has the aging curve for superstars changed? Is 30 the new 25; 32 the new 27?
  • Will the next few generations of players soon be good enough to topple the big three?

There’s plenty of evidence that the aging curve has changed, that we should expect more from 30-somethings these days than we did in the 1980s and 1990s. That would close much of the gap. Let’s say we set the new peak age at 31, four years later than the men’s Open Era average of 27. That would add 0.32 slams to every player’s forecast, possibly adding one more slam to each of the big three’s forecasted total. Overall, it would add a bit more than an additional three slams to the total of the previous table, putting that number close to the historical average of 13.

Shifting the age adjustment doesn’t disentangle the big three, though, because it affects them all equally. It just means a three-way tie at 21 is a bit more likely than a three-way tie at 20.
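
The arithmetic behind the last two paragraphs is easy to verify. This sketch assumes the shift applies only to the ten players listed in the table and that nothing else in the model changes. The projected career totals are the ones given earlier.

```python
POINTS_PER_YEAR = 8          # age adjustment from the algorithm, in points
OLD_PEAK, NEW_PEAK = 27, 31  # the hypothetical later peak age discussed above

# Moving the peak four years later adds the same amount to every forecast.
shift = POINTS_PER_YEAR * (NEW_PEAK - OLD_PEAK) / 100

table_total = 9.40        # predicted slams for the players in the table
players_in_table = 10
new_total = table_total + shift * players_in_table

# Projected career totals from earlier in the post, before the shift.
career = {"Djokovic": 20.34, "Federer": 20.26, "Nadal": 20.22}
shifted = {name: total + shift for name, total in career.items()}

print(f"shift per player: {shift:.2f}")
print(f"new table total:  {new_total:.1f}")
for name, total in shifted.items():
    print(f"{name}: {total:.2f} -> {round(total)}")
```

Because the same 0.32 is added to all three, the ordering does not change. Each total simply rounds to 21 instead of 20.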

The second question is the more important–and less predictable–one. It’s hard enough to know how well a single player will be competing in three, four, or five years. (Or, sometimes, tomorrow.) But even if we could puzzle out that problem, we’d be left with the still more difficult task of predicting the level of competition. Entering the 2003 season, the monkey would have opined that the then-current crop of stars–men who made slam semis in 2001 and 2002–would account for a combined 13 majors between 2003 and 2007. That included 2.5 for Lleyton Hewitt, plus one apiece for Thomas Johansson, Albert Costa, Pete Sampras, Marat Safin, David Nalbandian, and Juan Carlos Ferrero. Those seven men won only two. The entire group of 20 players who merited forecasts entering the 2003 Australian Open won only three.

We’ll probably never establish exactly how strong that group was in comparison with other eras. What we know for sure is that none of those men were as good as Federer in 2003-05, and by the end of the five-year span, they’d been shunted aside by Nadal as well. (Only Nalbandian ranked in the 2007 year-end top ten.) The generation of Zverev/Tsitsipas/Auger-Aliassime/etc won’t be as good as peak Big Four, but the course of the next 20 slams will depend a lot more on those players than it will on the (relatively) more predictable career trajectories of Djokovic, Federer, and Nadal.

So we’re left with a stack of known unknowns and error bars wider than a shanked Federer backhand. But based on what we do know, the top of the all-time slam leaderboard is going to get even more crowded. At least, that’s what the monkey says.

Anatomy of Alex de Minaur’s Serving Masterclass

The ATP Atlanta event is typically packed with big servers. John Isner won five titles in six years between 2013 and 2018, during which time the only man to stop him was Nick Kyrgios–in two tiebreaks, naturally. The last champion before Isner took over was Andy Roddick. It’s a fast hard court and the weather is often scorching, so the tournament tends to be a week-long ace festival.

The 2019 titlist posted another wave of eye-popping service numbers, winning four matches without facing a single break point, and winning more than 90% of his first serve points in each match. Those positively Isnerian numbers didn’t belong to the big man himself, nor were they posted by heir apparent Reilly Opelka. The serve king in Atlanta this year was the “six-feet tall” (sure, buddy) Australian grinder, Alex de Minaur.

Unlike many of his peers, de Minaur doesn’t make his money with a big serve. In the last 52 weeks, both Isner and Opelka have hit aces on one-quarter of their serve points. The Aussie’s 52-week rate is a mere 4.5%. He posted a tour-level career best of 14.8% against Taylor Fritz in the Atlanta final (excluding a Bernard Tomic retirement), but failed to reach double digits in the second round against Bradley Klahn, or in the semi-final against Opelka. Last week, de Minaur proved that there are a lot of ways to win serve points without necessarily piling up the aces.

Strike one

The easiest non-ace route to victory is the unreturned serve. Players don’t have the same level of control over the rate of unreturned serves that they do with aces. But many great serves are reachable–if not effectively returnable–so they don’t go down in the ace column. The unreturned-but-not-ace category was de Minaur’s bread and butter in Atlanta.

According to the point-by-point log of the final in the Match Charting Project dataset, Fritz put only 57% of the Aussie’s serves back in play. Across over 1,300 MCP-charted hard court matches from the 2010s, the ATP tour average is 70% returned serves, and de Minaur’s opponents have traditionally done even better than that. De Minaur’s unreturned-serve rate of 43% is exceptionally good, ranking in the 90th percentile of service performances. He was even better against Opelka. Only 5 of his 93 service points went for aces, but 38 more didn’t come back. That’s an unreturned-serve rate of 46%, a 94th-percentile-level showing.
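
As a quick check on the semi-final figures, the unreturned-serve rate counts aces together with serves that were touched but not put back in play. A minimal Python sketch, with a function name chosen just for this example:

```python
def unreturned_rate(aces, other_unreturned, service_points):
    """Share of service points on which the serve did not come back in play."""
    return (aces + other_unreturned) / service_points

# De Minaur against Opelka: 5 aces and 38 other unreturned serves on 93 service points.
opelka_rate = unreturned_rate(5, 38, 93)
ace_rate = 5 / 93

print(f"ace rate:        {ace_rate:.1%}")
print(f"unreturned rate: {opelka_rate:.1%}")
```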

Strike two

De Minaur was even better when the serve wasn’t quite as good. Coaches and commentators like to talk about the “plus one” tactic: Hit a strong serve and get in position to make an aggressive play on whatever comes back. This is where the Aussie truly excelled in the title match.

In addition to the 43% of unreturned serves against Fritz, another 26% of his service points fell into the “plus one” category: second-strike shots that his opponent couldn’t handle. Tour average is 15%, and again, de Minaur hasn’t always been this good. His average over 15 charted hard-court matches in 2018 was only 12.6%. His 26% rate on Sunday ranks in the 98th percentile among charted hard-court matches. Of the 67 single-match performances on record that were better than 26%, 15 were recorded by Roger Federer. Most players never have such a good day in the plus-one category.

Strike three

Even the best servers have to deal with the occasional long rally. In our sample of charted hard-court matches, 40% of points see the returner survive the plus-one shot and put the ball back in play. From that point, the rally is more balanced, and returners win a bit more than half of points. (That’s partly because 4-shot rallies are more common than 5-shot rallies, and so on, and because a 4-shot rally, by definition, is won by the returner. Put another way, once you exclude 3-or-fewer-shot rallies, you bias the sample toward the returner; if you excluded 4-or-fewer-shot rallies, you would bias the sample toward the server, because 5-shot rallies make up a disproportionate amount of the remaining points.)

Serving like de Minaur did, he didn’t see nearly so many “long” rallies. 22% of his service points against Fritz, and 29% against Opelka, reached four shots. Facing the typical one-dimensional big server, this is the returner’s chance to even the score. But de Minaur is known more for his ground game than his service. In the final, he won 58% of these points, good enough for the 83rd percentile in our sample.

De Minaur’s performance on longer rallies didn’t really move the needle on Sunday, mostly because he so effectively prevented points from lasting that long. But the fact that he won more than half of the extended exchanges is a reminder that a great serving performance depends on more than just the serve. On a good day, even a six-footer can post numbers that leave Isner and Opelka in the dust. It isn’t always about the aces.

Roger Federer, Lottery Winner

In today’s third-round match in Rome, Roger Federer posted a truly unusual stat line. He beat Borna Coric in three sets, 2-6 6-4 7-6(7), winning 95 points to Coric’s 107. That’s a total-points-won rate (TPW) of 47.0%, not unheard of for a match winner, but near the lower limit of what’s possible. By Dominance Ratio (DR)–the ratio of return points won to serve points lost–Fed comes out at 0.78, where 1.0 represents an evenly-split match. He has won only 24 times in his career with a DR below 1.0, and today was the first time since 2015. These types of decisions are often referred to as “lottery matches,” because there is more luck than usual involved in the result.

Not only did Federer win the match with a TPW below 50% and a DR below 1.0, all three of his individual sets were below those numbers. He won 23 of 55 points in the first set, 31 of 64 in the second, and 41 of 83 in the third. The low total in the first set is to be expected–he lost that set badly. But often, low numbers for an entire match stem from a bad performance in a single set, like the swoon in a 7-6 1-6 7-6 contest. Here, Coric outplayed him–narrowly, at least–in all three sets.
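For readers who want to check the arithmetic, here is a minimal sketch that recomputes the per-set and match-level TPW from the point counts quoted above. The `dominance_ratio` helper assumes DR is built from rates (return points won rate divided by serve points lost rate); the serve/return split for this match isn't given in the text, so the helper is defined but not applied to it.

```python
# Points won and points played per set, as quoted above (Federer's side).
sets = [(23, 55), (31, 64), (41, 83)]


def tpw(won, played):
    """Total-points-won rate as a fraction of all points played."""
    return won / played


def dominance_ratio(return_pts_won_rate, serve_pts_won_rate):
    """DR, assuming the rate-based form: return points won rate
    divided by serve points lost rate."""
    return return_pts_won_rate / (1.0 - serve_pts_won_rate)


per_set = [tpw(w, n) for w, n in sets]
match_won = sum(w for w, _ in sets)
match_played = sum(n for _, n in sets)
match_tpw = tpw(match_won, match_played)

for i, rate in enumerate(per_set, start=1):
    print(f"Set {i}: {rate:.1%}")
print(f"Match: {match_won} of {match_played} = {match_tpw:.1%}")
```

The set totals add up to the 95 points won out of 202 played, and each set comes in under 50%.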

You might suspect that this is extremely rare, and you’d be right. Only 4.5% of ATP tour-level matches end in favor of the player who won fewer points, and 7.2% go the direction of a player with a DR below 1.0. Those numbers usually overlap, but not always. Roughly 4.0% of matches are won by a player with a TPW below 50% and a DR below 1.0. Individual sets are even more likely to be awarded to the player who won more points. Just 2.4% of sets are won by the man who lost more points. The frequency of DR < 1.0 is 7.4%, about the same as at the match level.

It turns out that there is a precedent–exactly one!–for Fed’s feat, of winning a match with TPW < 50% and DR < 1.0 in each of three sets. That’s one previous occurrence in my dataset of point-by-point sequences for over 17,000 ATP tour-level matches since 2010. Inevitably, John Isner was involved. At Memphis in 2017, Isner lost his quarter-final match to Donald Young, 7-6 3-6 7-6. Young won only 46.9% of total points, and his DR was 0.66, both marks among the lowest you’ll ever see for a winner. Like Federer, Young came close in the sets he won, tallying 49.3% of all points in both the first and third sets. By saving eight of nine break points and withstanding the Isner serve in the tiebreaks, Young managed to overcome a statistically superior opponent.

Federer’s victory today wasn’t particularly reliant on break point performance, though fans will be encouraged that he converted two of his four opportunities. Much has been written about Roger’s ineffectiveness in this sort of match–against his 24 wins with a sub-1.0 DR, he has 49 losses with a DR above 1.0–and break point futility is often to blame. While big servers tend to play a lot of close matches, Federer has managed to record plenty of wins without relying on the lucky ones.

With a guaranteed place in the prominent parts of the record book, Fed is making a move on the obscure pages in the back. Having repeatedly shown us that he can win matches by outplaying the guy on the other side of the net, he finally came up with a victory when the stats pointed in the other direction.

Belinda Bencic Won a Historically Difficult Title, Just Not Last Week

Italian translation at settesei.it

Belinda Bencic is back among the WTA elites. Last week in Dubai, she won her first Premier-level title since 2015, knocking out four top-ten players in the process. They were hardly dominant victories, with all four going to deciding sets and two of the four culminating in final-set tiebreaks, but there is no question that the 21-year-old Swiss is once again a threat at the tour’s biggest events.

Her string of top-ten victories leaves us to wonder how her title stacks up against similar feats in the past. Most relevant is the path Bencic took to her last Premier title, the 2015 Canadian Open. Four years ago in Toronto, she defeated four members of the top six, including then-top-ranked Serena Williams in the semi-final and Simona Halep in the championship match. Even the two lower-ranked opponents she faced that week were dangerous players then ranked in the top 25, Eugenie Bouchard and Sabine Lisicki. Those two presented more serious challenges than Bencic’s first two matches last week against Lucie Hradecka and Stefanie Voegele.

Spoiler alert: Toronto was the tougher path. It wasn’t the most difficult of all time, but it’s in the conversation. Bencic’s Dubai title surely wasn’t easy, but it wasn’t quite as unusual as last weekend’s press made it out to be.

Quantifying path difficulty

This is something we’ve done before. I’ve written several articles comparing the quality of opposition faced in slams, particularly as it applies to the ATP’s big three. It’s more complicated to compare all WTA events, in part because there are so many different levels of tournament, and the categorizations have changed over the years. But we can wave some of that aside for today’s purposes.

Here’s the simple algorithm to measure the difficulty of a player’s path to a title:

  • Pick a standard Elo rating for the type of tournament won. (In this case, we’re using 1900 for hard-court wins. We’d use lower numbers for clay and grass, but it gets complicated, and it’s more practical for today’s purposes to focus solely on hard-court events.)
  • Find the surface-weighted Elos of each opponent she played in the tournament.
  • For each opponent, calculate the odds that a player with the standard Elo rating would beat her, given the opponent’s Elo rating.
  • Calculate the difficulty for each match as one minus the odds in the previous step.
  • Sum the single-match difficulties.
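The steps above can be sketched in a few lines of Python. This is an illustration rather than the exact code behind the numbers in this post: it assumes the conventional logistic Elo formula with a 400-point scale, and the opponent ratings in `example_path` are hypothetical, not taken from any real draw.

```python
def expected_win_prob(rating, opp_rating):
    """Chance that a player rated `rating` beats one rated `opp_rating`.
    Assumes the conventional Elo formula with a 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))


def path_difficulty(opponent_elos, standard_elo=1900.0):
    """Sum of (1 - win probability) for a standard-rated player
    against each opponent faced on the way to the title."""
    return sum(
        1.0 - expected_win_prob(standard_elo, opp) for opp in opponent_elos
    )


# Hypothetical surface-weighted Elo ratings for a six-match title run.
example_path = [1700, 1800, 1900, 2000, 2050, 2100]
print(f"Path difficulty: {path_difficulty(example_path):.2f}")
```

An opponent rated exactly at the standard adds 0.5 to the total, so a six-match path against nothing but 1900-rated players would come out at 3.0.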

In the grand slam exercises I’ve done in the past, I’ve taken a final step of normalizing the results so that an average major title is exactly 1.0. Here, the idea of ‘average’ is more nebulous, so we’ll leave our results un-normalized.

The average difficulty of a hard-court title (excluding majors and year-end championships) is about 1.8. Bencic’s 2015 Toronto run was 3.64, and her path last week was 3.01.

It’s hotter in Miami (and Indian Wells)

One of the variables that influences path difficulty is number of matches. Bencic played six last week (as she did at the 2015 Canadian Open), but the top eight seeds played only five. At Indian Wells and Miami, the top 32 seeds play up to six matches, but those might be expected to present more challenges than Bencic’s six in Dubai, since the round-of-64 opponent has already won a match.

Certainly it has turned out that way. Here are the top ten most difficult hard-court WTA title paths since 2000:

Year  Event          Winner             Matches  Difficulty  
2010  Miami          Kim Clijsters            6        3.80  
2011  Miami          Victoria Azarenka        6        3.78  
2007  Miami          Serena Williams          6        3.65  
2015  Canadian Open  Belinda Bencic           6        3.64  
2012  Indian Wells   Victoria Azarenka        6        3.59  
2018  Cincinnati     Kiki Bertens             6        3.54  
2000  Miami          Martina Hingis           6        3.46  
2002  Miami          Serena Williams          6        3.45  
2008  Miami          Serena Williams          6        3.37  
2013  Miami          Serena Williams          6        3.35

Seven of the ten are from Miami, an event with a grand-slam-like field. Indian Wells is similar, but featured a weaker draw for most of the 21st century because Serena and Venus Williams chose not to play there. Bencic’s Toronto run is one of only two in the top ten outside of the March sunshine swing. The other is Kiki Bertens’s path to last year’s Cincinnati title, in which she also defeated Halep, Petra Kvitova, and Elina Svitolina, albeit not quite in the same order as Bencic did last week.

Also hot in Dubai

I calculated title difficulty for about 600 hard-court champions going back to 2000. Bencic’s Dubai path doesn’t register among the very most challenging, but it still stands above most of the pack. Here are the next 25 toughest routes, including every path rated a 3.0 or above:

Year  Event         Winner              Matches  Difficulty  
2016  Wuhan         Petra Kvitova             6        3.32  
2000  Indian Wells  Lindsay Davenport         6        3.32  
2014  Beijing       Maria Sharapova           6        3.30  
2008  Olympics      Elena Dementieva          6        3.27  
2009  Indian Wells  Vera Zvonareva            6        3.27  
2007  Indian Wells  Daniela Hantuchova        6        3.23  
2002  Filderstadt   Kim Clijsters             5        3.23  
2013  Beijing       Serena Williams           6        3.21  
2018  Doha          Petra Kvitova             6        3.18  
2002  Los Angeles   Chanda Rubin              5        3.18  
2000  Los Angeles   Serena Williams           5        3.16  
2009  Miami         Victoria Azarenka         6        3.15  
2003  Miami         Serena Williams           6        3.13  
2002  Indian Wells  Daniela Hantuchova        6        3.10  
2018  Wuhan         Aryna Sabalenka           6        3.08  
2008  Indian Wells  Ana Ivanovic              6        3.08  
2012  Tokyo         Nadia Petrova             6        3.08  
2010  Sydney        Elena Dementieva          5        3.06  
2010  Indian Wells  Jelena Jankovic           6        3.03  
2000  Sydney        Venus Williams            6        3.02  
2000  Sydney        Amelie Mauresmo           4        3.02  
2019  Dubai         Belinda Bencic            6        3.01  
2009  Tokyo         Maria Sharapova           6        3.00  
2002  San Diego     Venus Williams            5        3.00  
2001  Sydney        Martina Hingis            4        2.99

There’s Belinda again, at 32nd overall. Historically, the February tournaments in the Gulf haven’t been the toughest on the calendar, at least compared with Indian Wells, Miami, and Sydney. Yet Kvitova took an even more difficult path to the title last year in Doha. (Dubai and Doha trade tournament levels each year. As a Premier 5, Doha was worth more points in 2018; Dubai took over the status and was worth more points in 2019.) Kvitova also plowed through four top-ten opponents, and she needed to beat 33rd-ranked Agnieszka Radwanska just to earn a place in the round of 16.

Strong but weaker

Again, Bencic’s Dubai title was an impressive feat. But as we’ve seen, it pales in comparison with her previous Premier title. I suppose she might have won anyway if faced with more difficult competition, but that pair of third-set tiebreaks suggests she was pushed to the limit as it was.

While the current WTA field is extremely deep, packed with very good players, the lack of one historically great superstar (or more!) shows up in the Elo ratings. Of the 35 champions shown in the two tables above, 12 had to beat a player with a surface-weighted rating of 2240 or higher, and 12 more needed to get past an opponent rated 2100 or above. Bencic’s toughest task last week was Halep, at 2054. While it isn’t easy to knock off several consecutive foes in the 2000 range, it’s not the same as including one victory over a superstar like Serena, Venus, Maria Sharapova, or Victoria Azarenka at her peak.

At the 2015 Canadian Open, Bencic counted Serena among the vanquished. Maybe in another four years, when the Swiss is due for her next odds-defying Premier title, she’ll face down a couple of new young superstars and earn a place at the top of this list.

Juan Ignacio Londero’s First Five ATP Match Wins

Italian translation at settesei.it

Last week’s ATP 250s had their share of surprises, with all three top seeds falling in their first matches. But the biggest shock of all was reserved for Sunday, when 25-year-old Argentine Juan Ignacio Londero capped an unexpected breakthrough week in Cordoba with a title. The hometown wild card pummelled Federico Delbonis and then came from behind to defeat Guido Pella in a three-set final. Londero was playing just his fourth tour-level event, and his first tour-level win came on Tuesday, a first-round upset of fifth-seed Nicolas Jarry.

There aren’t many players who’ve managed to win a title the same week as their first match win on tour. Going back to 1990, I found only five others who matched Londero’s feat:

Player                Age   Year  Event        
Nicolas Lapentti      19.1  1995  Bogota       
Lleyton Hewitt        16.9  1998  Adelaide     
Juan Ignacio Chela    20.5  2000  Mexico City  
Santiago Ventura      24.4  2004  Casablanca   
Steve Darcis          23.3  2007  Amersfoort   
Juan Ignacio Londero  25.5  2019  Cordoba  

It’s a diverse group. Lleyton Hewitt announced his presence with a title as he embarked on a Hall of Fame career, Nicolas Lapentti had great things ahead of him as well, and Juan Ignacio Chela would go on to win six more titles. (Next time a player named Juan Ignacio wins his first ATP match, watch out!) The other two players broke through at older ages and provide better clues as to what we should expect from Londero. Steve Darcis won one more title within a year of his first, and stuck around long enough to crack the top 40 at age 33. Santiago Ventura never played another final, and his career peak ranking was 65, just four spots above Londero’s new level.

A perfect 25

Still, pointing out that Londero is unlikely to develop into a top ten player doesn’t mean we shouldn’t celebrate his accomplishment. I found over 1,000 players who won their first tour-level match since 1990, and only 24% of them managed to win their second match at the same tournament, let alone the title. The average first-time winner claimed a mere 1.3 matches, including their debut win. In addition to the six titlists, only nine reached the final and 43 made it to the semis after recording their first win.

The results for debut winners are even more bleak when we narrow our focus to players in Londero’s age group. Despite the increasing age of the men’s tennis population, if a player hasn’t made an impact on tour before age 25, he is unlikely to do so. 17% of our first-time winners were 25 or older, and Londero is the only one of them to reach the final in his breakthrough event. These 185 players combined for only 53 wins after their first-round milestones, and four of those wins were recorded by Londero last week.

Pessimistic as this sounds, there are a few encouraging precedents for the Argentine to follow. Paolo Lorenzi won his first tour-level match about one month younger than Londero’s current age. It took Lorenzi nearly another decade to hoist his first ATP trophy, and he’s still hovering just outside the top 100 at age 37. Tennys Sandgren (who, coincidentally, lost to Lorenzi in New York last night) didn’t win a tour-level match until he was 26. Six months later he was in the quarter-finals of the Australian Open. The most extreme late bloomer is Victor Estrella, who was almost 33 years old at the time of his first ATP match win, which he followed with a tour-level title 18 months later.

Of course, Lorenzi is one of a kind, and the unexpected feats achieved by Sandgren and Estrella have minimal predictive value. Beyond the thrill of winning his hometown tournament, the most important implication of the title for Londero is that it launches his ranking into the top 70. He gets a place in the Roland Garros main draw, and in the next twelve months, he’ll have a number of other opportunities to play tour-level events. He deserves it: My Elo rankings suggest he is not only a top-70 player overall, but he is just outside the top 40 on clay courts. Londero’s title truly came out of nowhere, but there’s no reason to be surprised the next time he posts an excellent result on clay.

Top Seed Upsets in ATP 250s

Italian translation at settesei.it

In a typical week, no one would notice if Fabio Fognini, Karen Khachanov, and Lucas Pouille combined to go 0-3. This week is different, as those three men held the top seeds at the ATP events in Cordoba, Sofia, and Montpellier. After their first-round byes, each of them lost in the second round, to Aljaz Bedene, Matteo Berrettini, and Marcos Baghdatis, respectively. Two of the top seeds at least pushed their opponents to three sets, while Fognini lasted only 71 minutes.

This is not the first time a trio of number one seeds have suffered first-match upsets in the same week. Amazingly, it’s not even the first such occurrence in this very week on the calendar. Two years ago, when the South American event was played in Quito, the results were the same: top seeds Marin Cilic, Ivo Karlovic, and Dominic Thiem all failed to win a match. Thiem’s vanquisher, Nikoloz Basilashvili, even extended the streak the following week, heading to Memphis and handing Karlovic his second straight second-round ouster.

Predictable upsets?

Focusing on these losses, it’s natural to wonder whether top seeds are particularly fragile in this sort of tournament. There’s certainly a logic to it. The number one seed at an ATP 250 is usually ranked in the top 20, and is the sort of player who might have considered taking the week off. He knows that more ranking points are available at slams and Masters, so winning a smaller event isn’t his highest priority. His opponent, on the other hand, is competing every chance he gets, and the points on offer at a smaller event could make a big difference in his standing. Further, he has already played–and won–his first-round match, so he might be performing better than usual, or the conditions might suit him particularly well.

Let’s put it to the test. Since 2010, not counting this week’s carnage, I found 267 non-Masters events at which a top seed got a first-round bye and completed his second-round match. (Additionally, there have been three retirements and one withdrawal; only one of those resulted in a loss for the top seed.) The number one seeds had a median rank of 10, and the underdogs had a median rank of 89. Based on my surface-weighted Elo ratings at the time of each match, the favorites should have won 81.5% of the time. That’s better than this week’s trio of top-seeded losers, who were 64% (Fognini), 80% (Khachanov), and 69% (Pouille) favorites.

As it happened, the unseeded challengers were more successful than expected. The favorites won only 76.8% of those matches–a rate low enough that there is only a 3% probability it is due to chance alone. It’s not an overwhelming effect–certainly not enough that we should have predicted this week’s results–but it seems that a few of the top seeds are showing up unmotivated and a handful of the underdogs are playing better than expected.
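To get a feel for where a figure like that 3% comes from, here is a rough check. It is a simplified sketch, not the calculation behind the number above: it treats every one of the 267 matches as if the favorite had the same 81.5% chance, whereas the real forecasts vary from match to match. The win count is derived from the 76.8% rate quoted in the text.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n_matches = 267
expected_rate = 0.815  # Elo-based forecast, applied uniformly (simplification)
observed_wins = round(0.768 * n_matches)  # favorites' wins implied by 76.8%

# Chance of the favorites winning this few matches or fewer if the
# forecasts were right and the matches were independent.
p_value = binom_cdf(observed_wins, n_matches, expected_rate)
print(f"{observed_wins} wins in {n_matches}: p = {p_value:.3f}")
```

Under those assumptions, the one-sided tail probability lands in the low single digits of percent, in line with the figure quoted above.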

Riding the wave

What about the underdog winners? Once they’ve defeated the top seed, how many capitalize on the opportunity? Berrettini came back to beat Fernando Verdasco in his quarter-final match today, while Baghdatis and Bedene play later. My forecasts believe that, of the three, Bedene has the best chance of claiming a title, though still less than a one-in-five shot at doing so.

In our subset of 267 matches, the underdog won 66 of them. More than half the time, though, that was the end of the run. 38 of the 66 (58%) fell in the quarter-finals. Another 17 lost in the semis. Whatever works so well for these underdogs in the second round disappears afterward. In the 105 matches contested by these 66 men in the quarter-finals and beyond, Elo thinks they should have won 44.9% of them. Instead, they managed only 42.3%.

There’s still a bit of hope. Five men knocked out the top seed in the second round and went on to win the entire tournament. One of those was a challenger we’ve already mentioned: Estrella, who knocked out Karlovic and went on to hoist the trophy in Quito two years ago. Maybe there’s some magic in week six. This week’s trio of underdogs would surely love to think so.

Novak Djokovic and the Narrowing Slam Race

Italian translation at settesei.it

It doesn’t take a statistician, or even a spreadsheet, to recognize that the 2019 Australian Open wasn’t Novak Djokovic’s most difficult path to a major title. We can debate whether the straight-set win over Rafael Nadal in the final was due to Djokovic’s utter dominance or a subpar performance from (a possibly still recovering) Rafa. But there’s more to a grand slam title than the final, and the only top-18 opponent Novak faced in the first six rounds was Kei Nishikori, who retired after 52 minutes.

On the traditional grand slam leaderboard, quality of competition doesn’t matter. Roger Federer has 20, Nadal has 17, and now Djokovic has 15. As I’ve written before, the race is closer than that, since Nadal’s and Djokovic’s opponents have, on average, been stronger than Federer’s. My metric for “adjusted slams” estimates the likelihood that a typical major titlist would defeat the specific seven opponents that a player faced, based on their surface-weighted Elo at the time of the match. (I’ve also used this approach for Masters titles.) The explanation is a mouthful, but the underlying idea is simple: Some majors represent greater achievements than others, both because some eras offer stiffer competition and because some draws are particularly daunting.

A slam title against an average level of competition is worth exactly 1. Tougher paths are worth more than 1, and easier draws are worth less. Here is the current leaderboard, with each player’s raw tally, average difficulty rating of their titles, and adjusted total:

Player          Slams  Avg Diff  Adj Slams  
Roger Federer      20      0.88       17.7  
Rafael Nadal       17      1.01       17.1  
Novak Djokovic     15      1.11       16.6 

(The numbers in this post do not all precisely agree with those I’ve published in the past, because I’ve improved the accuracy of my Elo-based rating system. All three of the players have seen their adjusted slam totals decrease, because the improved Elo algorithm eliminates some of the Elo “inflation” that overvalued recent achievements.)

These three guys have often had to go through each other, but Djokovic has had the toughest paths of all. The average difficulty of his first 12 majors was 1.2, higher than all but three of Rafa’s titles, one of Roger’s, and two of those won by Pete Sampras. Only recently has he been able to boost his total without quite so much of a challenge. His Australian Open title was worth 0.84 majors, only the fourth of his titles against a below-average set of opponents. It was, however, tougher than Wimbledon or the US Open last year, which were worth 0.77 and 0.65, respectively.

It’s unlikely, of course, that the current leaderboard–adjusted or otherwise–will be the final reckoning among these three men. But on the adjusted list, they will probably remain tightly packed. Because the rest of the pack has weakened, with Andy Murray and Stan Wawrinka no longer regular features of the second week, major titles aren’t what they used to be. Early in the decade, it wasn’t uncommon for a player to beat multiple members of the big four en route to a title and add at least 1.2 to his adjusted tally.

In 2018, slam difficulty was barely half of that recent peak level:

Year    Avg Diff  
2002        0.73  
2003        0.65  
2004        0.82  
2005        0.95  
2006        0.77  
2007        0.93  
2008        1.05  
2009        1.00  
2010        0.95  
2011        1.19  
2012        1.23  
2013        1.22  
2014        1.28  
2015        1.12  
2016        1.27  
2017        0.91  
2018        0.69

This could all change, especially if Djokovic wins a Roland Garros title by upsetting Nadal. (Nothing generates high competition-adjusted numbers like beating Nadal on clay.) But it’s more likely that these three men will have to keep incrementing their totals by 0.6s and 0.7s. While that could be enough to put Rafa or Novak on top by the end of 2019, it won’t give anyone a commanding lead. It’s a good thing that there’s a lot more to the GOAT debate than slam totals, because slam totals–when properly adjusted for the difficulty of achieving them–make it awfully hard to pick a winner.

The Naomi Osaka First-Set Guarantee

Italian translation at settesei.it

Today in the Australian Open quarter-finals, Naomi Osaka recorded a routine victory, beating 6th seed Elina Svitolina 6-4 6-1. She’ll face Karolina Pliskova in tomorrow’s semi-final, and she has a chance to finish the tournament as the top-ranked player in the world.

(See the bottom of this post for updates.)

Osaka’s sprint to the finish line against Svitolina was what we’ve come to expect from the 21-year-old. The Eurosport commentators shared a remarkable stat: The last 59 times Osaka has won the first set, she has gone on to win the match. (On Eurosport during the match, they said 57, making today’s win 58, but I believe they left out a 2017 win by retirement against Heather Watson in which the first set was completed.) The last time she failed to convert a one-set advantage into a victory was the final match of her 2016 season, in Tianjin against Svetlana Kuznetsova.

Of course, winning the first set is a big advantage for anyone. If two players are evenly matched and there’s no momentum effect, the winner of the first set has a 75% chance of finishing the job. In the real world, the woman who takes the first set is usually the superior player, so her odds in the second and third sets are even better still. On the 2018 WTA tour, the player who claimed the first set went on to win the match 81.5% of the time.
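The 75% figure follows from simple arithmetic: the first-set winner either takes the second set, or loses it and takes the third. Here is a small sketch of that calculation, assuming every set is independent with the same win probability (the no-momentum case). The streak calculation further assumes every match has the same conversion rate, which is a simplification.

```python
def first_set_conversion(p_set):
    """Chance that the first-set winner wins set two or, failing that,
    set three, with each set won independently at rate p_set."""
    return p_set + (1.0 - p_set) * p_set


def streak_probability(conversion_rate, n_matches):
    """Chance of converting n_matches first-set leads in a row,
    assuming the same conversion rate in every match."""
    return conversion_rate**n_matches


even_match = first_set_conversion(0.5)
print(f"Evenly matched players: {even_match:.0%}")

# How likely a 59-match run would be at the 2018 tour-wide rate of 81.5%.
tour_average_streak = streak_probability(0.815, 59)
print(f"59 straight at the tour average: {tour_average_streak:.2e}")
```

At the tour-average rate, a run of 59 would be vanishingly unlikely, which is another way of saying that a player on such a streak must convert at a much higher rate than the typical first-set winner.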

Even if Osaka’s theoretical odds of converting one-set advantages are even higher, 59 matches in a row is one heck of a feat. Only 15 women have an active streak of 10 or more consecutive first-set conversions, and a mere four hold a running streak of at least 20. In addition to Osaka, Aryna Sabalenka has converted 25 straight first-set victories, Qiang Wang has won 27 in a row, and Serena Williams is ready to pounce as soon as Osaka falters, with a current tally of 51. Serena’s string of consecutive conversions stretches over an even longer span, back to April 2016, in Miami. (Remember who came back to beat her? Svetlana Kuznetsova.)

It’s no surprise to see Serena showing up near the top of this list. After several years of looking up various tennis records and streaks, I’ve discovered a few general rules. First, if you think you’ve found a noteworthy recent achievement, Serena did it better. Second, if it involves brushing aside the tour’s rank and file, Steffi Graf was even better than Serena. And third, no matter how impressive Serena’s and Steffi’s feats, the all-time record will belong to either Chris Evert or Martina Navratilova.

The first-set-conversion streak is no different. In addition to her current streak of 51 straight, Serena won 61 in a row in 2002-03. That’s two matches and three places above Osaka, but it’s only 37th on the all-time list. Graf converted first-set advantages for more than twice as long, tallying 126 in a row from 1989 to 1991. As impressive as that is, my third rule holds with a vengeance: Evert converted 220 in a row between 1978 and 1981 to earn top billing on this list. Navratilova comes in second, but with the consolation that she holds third place as well. Martina and Steffi are the only women with multiple triple-digit streaks.

Here are the longest first-set conversion streaks held by players in the top 40. Many of these women put together multiple streaks of 60 or more, and in those cases I’ve listed only their longest:

Rank  Player                   Matches     Span     Notes  
1     Chris Evert                  220  1978-81  + 3 more  
2     Martina Navratilova          172  1982-84  + 5 more  
4     Steffi Graf                  126  1989-91  + 3 more  
6     Monica Seles                 112  1991-93  + 1 more  
7     Mary Joe Fernandez           105  1989-91            
8     Pam Shriver                  105  1986-88            
9     Vera Zvonareva               103  2006-08            
12    Martina Hingis                86  1996-97            
14    Arantxa Sanchez Vicario       85  1992-93            
16    Victoria Azarenka             79  2011-13            
17    Maria Sharapova               77  2010-12  + 1 more  
19    Margaret Court                74  1969-77            
21    Venus Williams                73  1999-01            
22    Sue Barker                    70  1973-78            
23    Evonne Cawley                 69  1978-80  + 1 more  
24    Lindsay Davenport             67  1999-00  + 1 more  
25    Tracy Austin                  67  1979-80            
26    Virginia Wade                 66  1975-78            
28    Gabriela Sabatini             65  1990-91            
30    Andrea Jaeger                 64  1981-82            
33    Claudia Kohde Kilsch          63  1986-87            
34    Kerry Reid                    62  1969-77            
37    Serena Williams               61  2002-03            
39    Anna Chakvetadze              60  2006-07            
40    Naomi Osaka                   59  2017-19  (active)

* Unfortunately all of these numbers come with a huge caveat. My historical WTA database isn’t perfect. I know that there are Evert and Navratilova matches missing, along with a handful of later results. For records like this, a single missing match could mean that Evert really had two streaks of 110 each, or any number of other permutations that would render my all-time list incorrect. So please, take these records as unofficial, and maybe the WTA will query their own–presumably more complete–database to produce a better list.

This is good company for the reigning US Open champion, and it looks even better if we narrow our view to 21st-century players. Only five of the women ahead of her on the list are active, and four of those are winners of multiple majors–another club that the 21-year-old could join this week. Her semi-final opponent, Karolina Pliskova, executed her own history-making comeback against Serena today. But if Pliskova finds herself down a set to Osaka, even she may not be enough of an escape artist to fight back against the best front-runner in women’s tennis.

Update: Osaka finished off the 2019 Australian Open with two more first-set conversions. In both the semi-final against Pliskova and the final against Kvitova, she won the first set and went on to win in three. Thus, her streak is up to 61 and she has matched Serena’s best.