Aslan Karatsev Isn’t Better Than Novak Djokovic, But…

What’s better, winning 15 of 17 matches, or going undefeated for 9?

Even if you know that the 15-2 guy is Aslan Karatsev in 2021, and the 9-0 guy is Novak Djokovic this year, there’s no obvious answer. Sure, Djokovic beat Karatsev easily, and Novak’s nine wins included a grand slam title. We know Djokovic is the better player–he’s got more than a decade of proof to support that claim–and no one in their right mind would take Karatsev’s last three months over Novak’s.

True as all of that is, it’s not the question I’m asking.

The player with the 15-2 record has two advantages over his 9-0 peer. First, he has more wins. (Mind-blowing stuff, I know.) Second and more importantly, he has more evidence of his current level, even if it includes two losses. The 9-0 guy could go undefeated for 17 matches… but he could also end up 11-6. His nine-match record simply doesn’t give us as much information.

Again, if you know which players I’m talking about, that doesn’t matter–we have 1,100 matches worth of information about Djokovic, most of which say that his 9-0 is business as usual. He might not win his next eight matches, but he’s certainly not going to lose more than a few of them.

The yElo light at the end of the tunnel

If you’ve been reading my last couple of posts, you know where I’m going with this.

Last week, I introduced the concept of yElo. The “y” stands for year, but it can be used for any unit of time shorter than an entire career. Instead of using every bit of available information, we look only at a designated time frame, such as the 2021 season. While maintaining our knowledge of other players (e.g. Andrey Rublev is a really tough opponent; Egor Gerasimov not so much), we treat each player as if we know nothing else about him.

So truly, we’re comparing Karatsev’s 15-2 with Djokovic’s 9-0, taking into account the quality of their competition.

Plug every ATPer’s 2021 season into the formula, and here are the yElo leaders, through last weekend’s finals in Dubai and Acapulco:

Rank  Player                  W-L  yElo  
1     Aslan Karatsev         15-2  2082  
2     Novak Djokovic          9-0  2081  
3     Daniil Medvedev        13-2  2061  
4     Andrey Rublev          15-3  2006  
5     Marton Fucsovics       14-4  2000  
6     Stefanos Tsitsipas     14-4  1983  
7     Alexander Zverev        9-4  1922  
8     Matteo Berrettini       8-2  1918  
9     Jeremy Chardy          13-6  1915  
10    Lloyd Harris           11-5  1878  
11    Jannik Sinner           9-4  1848  
12    Alexei Popyrin          9-3  1836  
13    Roberto Bautista Agut   8-7  1831  
14    Taylor Fritz            7-4  1830  
15    Sebastian Baez         14-1  1820  
16    Felix Auger Aliassime   8-4  1818  
17    Karen Khachanov         9-5  1810  
18    Mackenzie McDonald     11-5  1809  
19    Tomas Machac           10-3  1806  
20    Daniel Evans            6-3  1800

Yes, Karatsev really does outscore Djokovic. Barely.

We are accustomed to 52-week rankings and Elo ratings that carefully weigh an entire career’s worth of work. So this is a deeply weird list, with only a handful of players anywhere near where we’d expect. #15 and #19 are Challenger-level guys, for crying out loud!

Embrace the race

The official Race to Turin doesn’t look as bizarre as the yElo list, but imagine showing it to someone in December, with Karatsev 5th, Marton Fucsovics 7th, and Rafael Nadal outside the top 20. Both the Race and the yElo list are “wrong” in the traditional sense, but they tell us much more about the 2021 season than the old-fashioned rankings do.

Tennis’s relentless focus on the long view sucks some excitement out of the season. Think of virtually any team sport. A month into the season, some unheralded club has gotten off to a hot start, and at least in some quarters, that’s the story–can they keep it up? should we have seen this coming all along? Nobodies are cast in the role of front-runners, and established stars play the part of underdogs.

In tennis, nobodies are… well, nobodies who won a few matches lately. Superstars play the part of superstars who’ve been taking some time off. Sure, we know that Djokovic and Nadal are going to end up near the top of the rankings list in November, just like we know the Dodgers and Yankees will be in the playoffs. But that doesn’t mean we ought to take it as a foregone conclusion from day one. In baseball, as the saying goes, everybody’s in first place on Opening Day.

Embracing the race–focusing on which players are leading the pack at each point throughout the season–doesn’t have to mean throwing away longer-term rankings. The traditional calculations should still be used for tournament entries and (maybe) for seedings. Top players have earned as much, and tournament entry is a factor that isn’t present in the major team sports.

Everybody wants to know how the ATP will survive when the Big Three are out of the picture. Well, this is a start–pay attention to who’s winning in 2021. If we take yElo’s word for it, a virtual nobody emerged to overtake Djokovic for the #1 spot going into Miami! An Argentinian prospect is playing like a top-15 guy just by winning a bunch of Challengers! Jeremy Chardy is more than just a hitting partner for the other Frenchmen!

The stories are out there, just like they are every year. It’s a shame that they get buried by all the talk about players who won last year.

I’ve added men’s and women’s yElo ratings to the Tennis Abstract website, and they’ll be updated weekly.

The Best 22-Match yElo Streaks

Earlier this week I wrote about Garbine Muguruza’s outstanding start to the season, and I introduced a new method to quantify a player’s level in a relatively short time span. Instead of using traditional Elo, which takes into account everything we know about a player, my new metric, yElo, uses what we know about everyone else, but treats a player’s short-term performance as if it is all we know about her. The parameters for yElo, such as k-value, are the same as the ones I’ve arrived at to make “regular Elo” as predictive as possible.

In other words, we measure Muguruza’s 22 matches in 2021 as if she had never played a WTA event before. As we saw in my earlier post, this approach considers the strength of opponents each player faced, and it rates her 18-4 record as better than anyone else in 2021, including Naomi Osaka’s 10-0 start.*

* excluding walkovers, which I ignore for all versions of Elo and yElo.

Muguruza’s season start has been outstanding and it is definitely underrated by the official WTA rankings and maybe even by the race, but I don’t want to make too much of it–one title in five tournaments is hardly world-historical stuff. On the other hand, it’s a good way to get our feet wet with a new metric that I think will prove useful for a wide range of tennis comparisons.

Garbine vs Garbine

The Spaniard won majors in 2016 and 2017, and she briefly reached number one in the rankings in September of 2017. Those achievements belong on a Hall of Fame plaque over her recent Dubai title and Yarra River Classic final. But was she really playing better back then?

She was not! I ran the yElo formula for every 22-match sequence in Muguruza’s career. The best of the bunch–again, taken entirely out of context, as if we know nothing beyond those 22 matches–was a run late in 2015 when she reached the Wuhan final, won Beijing, then went undefeated in the WTA Finals round robin stage. Her yElo based on those 22 matches was 2172, narrowly better than her 2021 yElo of 2160.
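For readers who want to try the "every 22-match sequence" exercise themselves, here is a minimal sketch of the sliding-window idea. Everything named here is hypothetical: `simple_yelo` is a stand-in that uses a constant K of 32 rather than the tuned parameters behind the numbers in this post, and `best_window` just rates each run of consecutive matches from scratch and keeps the best one.

```python
from typing import Callable, Sequence, Tuple

# (opponent's rating at match time, did the player win?)
Match = Tuple[float, bool]


def simple_yelo(window: Sequence[Match], start: float = 1500.0, k: float = 32.0) -> float:
    """Rate a run of matches from scratch. Constant K is an assumption for illustration."""
    rating = start
    for opp_rating, won in window:
        expected = 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))
        rating += k * ((1.0 if won else 0.0) - expected)
    return rating


def best_window(
    matches: Sequence[Match],
    size: int = 22,
    rate: Callable[[Sequence[Match]], float] = simple_yelo,
) -> Tuple[int, float]:
    """Return (start index, rating) of the best run of `size` consecutive matches."""
    if size <= 0 or len(matches) < size:
        raise ValueError("need at least `size` matches")
    ratings = [rate(matches[i:i + size]) for i in range(len(matches) - size + 1)]
    start = max(range(len(ratings)), key=ratings.__getitem__)
    return start, ratings[start]


# Toy career: three losses, three wins, two losses, all against 1600-rated opponents.
career = [(1600.0, False)] * 3 + [(1600.0, True)] * 3 + [(1600.0, False)] * 2
start, rating = best_window(career, size=3)
print(start)  # 3, the window made up of the three wins
```

Because adjacent windows share all but one match, neighboring ratings differ only slightly, which is why the full list of streaks is long and repetitive.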

The more memorable moments of her career don’t quite stack up:

yElo  W-L   Span
2172  17-5  2015 Wim R16 - WTA Finals RR    
2160  18-4  2021 Abu Dhabi R64 - Dubai F    
2148  18-4  2017 Birmingham R32 - Cinci F   
2122  19-3  2017 Wimb R128 - USO R16 (#1)   
2084  17-5  2017 Miami R64 - Wimb F         
2076  16-6  2016 Doha QF - Roland Garros F 

I haven’t shown every 22-match sequence of her career, because that list is long and boring–the streaks heavily overlap with each other, and thus there are often tiny differences between them. But it is instructive to look at the time periods that ended at key moments.

The best of that bunch was the 22-match run ending with Muguruza’s 6-1 6-0 beatdown of Simona Halep at the 2017 Cincinnati final. That set the stage for her ascent to #1, though the ranking move didn’t happen until after the US Open. That streak is close to her current level. The 22 matches leading up to the official #1 takeover are a bit lower (she lost to Petra Kvitova at the US Open, which was less forgivable then than now), and the timespans ending with her two slam finals are still further down the list.

Don’t misunderstand–Muguruza was playing very well throughout all of these time periods. But when we crunch the numbers, we find that her current level is roughly on par with the best she’s ever played.

Garbine vs the world

Metrics are a lot more informative once we gain some context. Many of you probably have a good sense of what regular Elo ratings mean–2100+ is outstanding, 2000+ is top ten-ish, 1900+ is approximately the top 20, and so on. We can piggyback on that for yElo. When Muguruza’s 22-match yElo this season is 2160, it really does mean that, when feeding that very limited set of results into the Elo formula, it thinks Muguruza’s level is close to that of the best player in the world.

Well… the best player in the world right now. There’s no truly dominant force in women’s tennis at the moment, so we’re not seeing players at the top end of the all-time Elo scale. In regular Elo, peak Martina Navratilova and peak Steffi Graf topped 2600, more than 400 points above Osaka’s current rating of 2189. It will not surprise you, then, to learn that Navratilova, Graf, Serena Williams, Chris Evert, and many others put together 22-match runs* that make Muguruza’s 2021 season look positively pedestrian.

* yes, I know how ridiculous it is that this whole article is based on the arbitrary 22-match time span. We could do the same stuff with the more natural-sounding 20-match span, but there wouldn’t be an intuitive way to fit Muguruza’s current run into the discussion. And let’s face it, 20 is just as arbitrary as 22.

Out of my entire database on women’s tennis results going back to 1950 or so, about 100 women have enjoyed a 22-match run that outscores Muguruza’s best. The top of the list is the end of Navratilova’s 1983 season, which is worth a yElo of 2445. Close behind is Monica Seles, who reached 2438 with a streak starting at the end of 1992 and extending into the 1993 season. Three more women topped 2400, another 27 exceeded 2300, and 46 more put together 22 consecutive matches worth at least 2200.

Here are the 15 active women who’ve played at least as well as Muguruza for their best 22-match spans:

yElo  Player                W-L   Year(s)  
2389  Serena Williams       21-1  2001-02  
2386  Venus Williams        22-0  2000     
2335  Kim Clijsters         20-2  2002-03  
2332  Victoria Azarenka     22-0  2012     
2234  Vera Zvonareva        18-4  2008     
2217  Svetlana Kuznetsova   19-3  2004     
2217  Naomi Osaka           20-2  2019-20  
2209  Samantha Stosur       20-2  2010     
2205  Petra Kvitova         19-3  2011-12  
2205  Simona Halep          20-2  2018     
2196  Caroline Garcia       18-4  2017     
2186  Ashleigh Barty        19-3  2019     
2180  Angelique Kerber      18-4  2015-16  
2174  Carla Suarez Navarro  18-4  2015     
2172  Garbine Muguruza      17-5  2015

With the caveat that I haven’t spent much of my life thinking about the best 22-match runs in women’s tennis history, this seems like a credible list. I particularly like how yElo manages to consider strength of opponent to the point that an 18-4 run*, like Zvonareva’s in 2008, can outrank so many 20-2s. (Vera even beats a few 22-0s from the amateur era.)

* the link shows a few extra matches–the 18-4 run starts in the QFs of Guangzhou and ends in the Tour Finals semi-final. Note again that yElo skips retirements.

I hope you find the new yElo metric as interesting as I do. I’ll definitely be doing more with it, since I suspect it has value even outside the narrow context of one player and a single timespan of arbitrary length.

Repurposing Elo for Streaks, Seasons, and Garbine Muguruza

Elo is a fantastic tool for its explicit purpose: estimating the skill level of players based on available information. For instance, my WTA ratings currently rank Ashleigh Barty second. That seems plausible enough–it may be correct to give her the edge in a head-to-head matchup with everyone on tour except for Naomi Osaka. But with women pursuing such different schedules this season, a rating is only so useful.

For all of Barty’s or Osaka’s skill, is it right to say either one of them has had a better 2021 season than Garbine Muguruza? Osaka won the Australian Open, so she has a valid claim. Barty’s argument is a lot more tenuous, based on only eight victories. The Spaniard’s case writes itself–only a handful of players are up to double digits in wins this year, and Muguruza already has 18. How could we decide? If Elo is the smart version of the official rankings, what’s the smart version of the official race?

Starting fresh

The Elo algorithm itself offers a solution. A big part of the reason Muguruza is rated 4th on my current Elo list–and not higher–is her career before 2021. We had hundreds of matches worth of data on Garbine before January 1st, and it would be silly to throw all that away. Her 18-4 start is fantastic, but it doesn’t supersede everything that came before. It just gives us reason to update our rating.

Here’s where the ranking/race analogy is useful. The official rankings use a time span of 52 weeks (or more). The race restarts on January 1st. We could do the exact same thing with Elo, throwing away all results from the previous year and starting over, but that would be wasteful–it wouldn’t allow us to take into account whether players had faced particularly easy or tough draws, for instance.

The solution is to set Elo ratings back to zero (or 1500, in Elo parlance) one player at a time.

Take Muguruza. Instead of starting the year with a rating of 1981 and a history of several hundred matches, we pretend to know nothing about her. We give her a newbie’s rating of 1500 and a history of zero matches. Then we run the Elo algorithm to update her rating over the course of her 22 matches. First she faces Kristina Mladenovic (with her actual rating at the time of 1817), and improves to 1605. Then she beats Aliaksandra Sasnovich (and her rating of 1805), and improves to 1692. Repeat for each of her 2021 results, and the end result is a rating of 2160–almost 100 points higher than her current “real Elo” rating and within shouting distance of Osaka’s 2189.
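The reset-and-replay procedure can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the ratings in this post: the expected-score formula is the standard Elo one, but the decaying `k_factor` schedule and its constants are assumptions, so the intermediate values will not exactly match the 1605 and 1692 quoted above.

```python
from typing import Callable, Iterable, Tuple


def expected_score(rating: float, opp_rating: float) -> float:
    """Standard Elo win expectancy for `rating` against `opp_rating`."""
    return 1.0 / (1.0 + 10 ** ((opp_rating - rating) / 400.0))


def k_factor(matches_played: int) -> float:
    """Hypothetical K schedule: large for a newcomer, shrinking with experience."""
    return 250.0 / (matches_played + 5) ** 0.4


def yelo(
    results: Iterable[Tuple[float, bool]],
    start: float = 1500.0,
    k: Callable[[int], float] = k_factor,
) -> float:
    """Replay (opponent rating, won) results for a player treated as a newcomer.

    Opponents keep their real ratings; only the player in question is reset.
    """
    rating = start
    for n, (opp_rating, won) in enumerate(results):
        outcome = 1.0 if won else 0.0
        rating += k(n) * (outcome - expected_score(rating, opp_rating))
    return rating


# The first two matches described above: wins over opponents rated 1817 and 1805.
print(round(yelo([(1817.0, True), (1805.0, True)])))
```

Passing `k` as a function makes it easy to swap in whatever K schedule a given Elo implementation uses without touching the replay loop.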

To compare players, work through the same steps for everybody else, calculating their current-season rating as if they played their first career match in January.

It’s worth taking a moment to think about exactly what we’re measuring. That outstanding 2160 rating is what you get if a complete unknown shows up with zero match experience, then goes on the 22-match run that has been Muguruza’s season so far. The difference between real-Garbine and fake-newbie-Garbine is that the real one has an extensive track record that tells us she’s always been good–but that she probably isn’t quite this good.

I call it … yElo

This approach is “Elo for seasons” or “year Elo”–yElo*. It doesn’t have to be limited to calendar years, as the same approach would be useful for comparing, say, 20-match segments. It allows us to take advantage of the Elo algorithm–and the well-informed ratings of other players–to measure partial careers.

* you can pronounce it like the color “yellow,” but I prefer to say it like Phil Dunphy from Modern Family answering the phone.

Muguruza’s 2160 rating sure looks good, so how does it stack up against the rest of the tour? Here’s the 2021 top 20, considering players with at least five match wins through the Dubai and Guadalajara finals last weekend:

Rank  Player                W-L  yElo  
1     Garbine Muguruza     18-4  2160  
2     Naomi Osaka          10-0  2094  
3     Jessica Pegula       15-5  2002  
4     Serena Williams       8-1  1997  
5     Elise Mertens        11-2  1971  
6     Karolina Muchova      7-1  1953  
7     Aryna Sabalenka      11-4  1943  
8     Iga Swiatek          10-3  1941  
9     Daria Kasatkina      10-4  1910  
10    Barbora Krejcikova   10-5  1905  
11    Shelby Rogers         9-4  1902  
12    Jil Teichmann         9-5  1899  
13    Anett Kontaveit       9-4  1897  
14    Jennifer Brady        9-4  1892  
15    Cori Gauff           11-5  1885  
16    Danielle Collins      9-4  1883  
17    Ashleigh Barty        8-2  1878  
18    Sara Sorribes Tormo   9-2  1867  
19    Ann Li                5-1  1864  
20    Simona Halep          6-2  1854 

Like any Race list in March, this isn’t really reflective of skill. But when we consider the small amount of data it has to work with for each player, it’s … pretty good?

Again, you can quibble over whether Osaka or Muguruza has had the better season, but this approach weighs the better winning percentage and stronger average opponent against the much higher absolute win count and gives us a credible answer. Muguruza’s additional evidence of good tennis playing puts her ahead of Osaka’s evidence of short-term unbeatability.

While yElo is basically just a toy–it certainly doesn’t have the same predictive value as regular Elo–this initial look makes me like it. The possibilities are endless, from more sophisticated race tracking, to ranking the greatest seasons of all time, to comparing a player’s current hot streak to what she’s done in the past. Stay tuned, as I’m sure I’ll have more yElo results to report in the future.

So, About Those Stale Rankings

Both the ATP and WTA have adjusted their official rankings algorithms because of the pandemic. Because many events were cancelled last year (and at least a few more are getting canned this year), and because the tours don’t want to overly penalize players for limiting their travel, they have adopted what is essentially a two-year ranking system. For today’s purposes, the details don’t really matter–the point is that the rankings are based on a longer time frame than usual.

The adjustment is good for people like Roger Federer, who missed 14 months and is still ranked #6. Same for Ashleigh Barty, who didn’t play for 11 months yet returned to action in Australia as the top seed at a major. It’s bad for young players and others who have won a lot of matches lately. Their victories still result in rankings improvements, but they’re stuck behind a lot of players who haven’t done much lately.

The tweaked algorithms reflect the dual purposes of the ranking system. On the one hand, they aim to list the best players, in order. On the other hand, they try to maintain other kinds of “fairness” and serve the purposes of the tours and certain events. The ATP and WTA computers are pretty good at properly ranking players, even if other algorithms are better. Because the pandemic has forced a bunch of adjustments, it stands to reason that the formulas aren’t as good as they usually are at that fundamental task.

Hypothesis

We can test this!

Imagine that we have a definitive list, handed down from God (or Martina Navratilova), that ranks the top 100 players according to their ability right now. No “fairness,” no catering to what tournament owners want, and no debates–this list is the final word.

The closer a ranking table matches this definitive list, the better, right? There are statistics for this kind of thing, and I’ll be using one called the Kendall rank correlation coefficient, or Kendall’s tau. (That’s the Greek letter τ, as in Τσιτσιπάς.) It compares lists of rankings, and if two lists are identical, tau = 1. If there is no correlation whatsoever, tau = 0. Higher tau, stronger relationship between the lists.
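For the curious, the tie-free version of the statistic (often called tau-a) is simple to write down. Take every pair of players; a pair is concordant if both lists put the two players in the same order, and discordant if the lists disagree. With C concordant pairs and D discordant pairs among n players:

```latex
\tau = \frac{C - D}{\binom{n}{2}} = \frac{C - D}{n(n-1)/2}
```

Two identical lists have D = 0 and tau = 1, while a list compared against its exact reverse has C = 0 and tau = -1. Whether the numbers in this post use this variant or one that adjusts for ties is an assumption on my part, though with strict rankings the variants coincide.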

My hypothesis is that the official rankings have gotten worse, in the sense that the pandemic-related algorithm adjustments result in a list that is less closely related to that authoritative, handed-down-from-Martina list. In other words, tau has decreased.

We don’t have a definitive list, but we do have Elo. Elo ratings are designed for only one purpose, and my version of the algorithm does that job pretty well. For the most part, my Elo formula has not changed due to the pandemic*, so it serves as a constant reference point against which we can compare the official rankings.

* This isn’t quite true, because my algorithm usually has an injury/absence penalty that kicks in after a player is out of action for about two months. Because the pandemic caused all sorts of absences for all sorts of reasons, I’ve suspended that penalty until things are a bit more normal.

Tau meets the rankings

Here is the current ATP top ten, including Elo rankings:

Player       ATP  Elo  
Djokovic       1    1  
Nadal          2    2  
Medvedev       3    3  
Thiem          4    5  
Tsitsipas      5    6  
Federer        6    -  
Zverev         7    7  
Rublev         8    4  
Schwartzman    9   10  
Berrettini    10    8

I’m treating Federer as if he doesn’t have an Elo rating right now, because he hasn’t played for more than a year. If we take the ordering of the other nine players and plug them into the formula for Kendall’s tau, we get 0.778. The exact value doesn’t really tell you anything without context, but it gives you an idea of where we’re starting. While the two lists are fairly similar, with many players ranked identically, there are a couple of differences, like Elo’s higher estimate of Andrey Rublev and its swapping of Diego Schwartzman and Matteo Berrettini.
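The nine-player calculation is small enough to check by hand or with a few lines of Python. The sketch below assumes the tie-free form of Kendall's tau (concordant pairs minus discordant pairs, divided by the number of pairs); the function name is mine, and the rank lists are read straight from the table above with Federer left out.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(ranks_a: Sequence[int], ranks_b: Sequence[int]) -> float:
    """Kendall's tau for two equal-length rank lists with no ties (tau-a)."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rank lists of the same length, at least 2 long")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        # Positive product: both lists order players i and j the same way.
        product = (ranks_a[i] - ranks_a[j]) * (ranks_b[i] - ranks_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) // 2)


# Djokovic, Nadal, Medvedev, Thiem, Tsitsipas, Zverev, Rublev, Schwartzman, Berrettini
atp = [1, 2, 3, 4, 5, 7, 8, 9, 10]
elo = [1, 2, 3, 5, 6, 7, 4, 10, 8]
print(round(kendall_tau(atp, elo), 3))  # 0.778
```

Of the 36 possible pairs, only four are out of order: Rublev against each of Thiem, Tsitsipas, and Zverev, plus the Schwartzman/Berrettini swap. That gives (32 - 4) / 36.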

Let’s do the same exercise with a bigger group of players. I’ll take the top 100 players in the ATP rankings who met the modest playing time minimum to also have a current Elo rating. Plug in those lists to the formula, and we get 0.705.

This is where my hypothesis falls apart. I ran the same numbers on year-end ATP rankings and year-end Elo ratings all the way back to 1990. The average tau over those 30-plus years is about 0.68. In other words, if we accept that Elo ratings are doing their job (and they are indeed about as predictive as usual), it looks like the pandemic-adjusted official rankings are better than usual, not worse.

Here are the year-by-year tau values, with a tau value based on current rankings as the right-most data point:

And the same for the WTA, to confirm that the result isn’t just a quirk of the makeup of the men’s tour:

The 30-year average for women’s rankings is 0.723, and the current tau value is 0.764.

What about…

You might wonder if the pandemic is wreaking some hidden havoc with the data set. Remember, I said that I’m only considering players who meet the playing time minimum to have an Elo rating. For this purpose, that’s 20 matches over 52 weeks, which excludes about one-third of top-100 ranked men and closer to half of top-100 women. The above calculations still consider 100 players for year-end 2020 and today, but I had to go deeper in the rankings to find them. Thus, the definition of “top 100” shifts a bit from year-end 2019 to year-end 2020 to the present.

We can’t entirely address this problem, because the pandemic has messed with things in many dimensions. It isn’t anything close to a true natural experiment. But we can look only at “true” top-100 players, even if the length of the list is smaller than usual for current rankings. So instead of taking the top 100 qualifying players (those who meet a playing time minimum and thus have an Elo ranking), we take a smaller number of players, all of whom have top-100 rankings on the official list.
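The difference between the two selection rules is easy to blur, so here is a rough sketch of both. The record layout (`rank`, `matches_52wk`) and the function names are hypothetical, and the toy data uses a "top 3" instead of a top 100 to keep it readable.

```python
from typing import Dict, List

Player = Dict[str, object]


def top_qualifying(players: List[Player], n: int = 100, min_matches: int = 20) -> List[Player]:
    """First n players by official rank who meet the playing-time minimum.

    Always returns n players if enough qualify, reaching below rank n if needed.
    """
    by_rank = sorted(players, key=lambda p: p["rank"])
    eligible = [p for p in by_rank if p["matches_52wk"] >= min_matches]
    return eligible[:n]


def true_top(players: List[Player], n: int = 100, min_matches: int = 20) -> List[Player]:
    """Players officially ranked inside the top n who also meet the minimum.

    May return fewer than n players.
    """
    by_rank = sorted(players, key=lambda p: p["rank"])
    return [p for p in by_rank if p["rank"] <= n and p["matches_52wk"] >= min_matches]


toy = [
    {"name": "A", "rank": 1, "matches_52wk": 40},
    {"name": "B", "rank": 2, "matches_52wk": 5},   # inactive, no Elo rating
    {"name": "C", "rank": 3, "matches_52wk": 30},
    {"name": "D", "rank": 4, "matches_52wk": 25},
    {"name": "E", "rank": 5, "matches_52wk": 22},
]
print([p["name"] for p in top_qualifying(toy, n=3)])  # ['A', 'C', 'D']
print([p["name"] for p in true_top(toy, n=3)])        # ['A', 'C']
```

The first rule keeps the sample size fixed but lets the definition of "top 100" drift; the second keeps the definition fixed and lets the sample shrink.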

The results are the same. For men, the tau based on today’s rankings and today’s Elo ratings is 0.694 versus the historical average of 0.678. For women, it’s 0.721 versus 0.719.

Still, the rankings feel awfully stale. The key issue is one that Elo can’t help us solve. So far, we’ve been looking at players who are keeping active. But the really out-of-date names on the official lists are the ones who have stayed home. Should Federer still be #6? Heck if I know! In the past, if an elite player missed 14 months, Elo would knock him down a couple hundred points, and if that adjustment were applied to Fed now, it would push down tau. But there’s no straightforward answer for how the inactive (or mostly inactive) players should be rated.

What we’ve learned today

This is the part of the post where I’m supposed to explain why this finding makes sense and why we should have suspected it all along. I don’t think I can manage that.

A good way to think about this might be that there is a sort of tour-within-a-tour that is continuing to play regularly. Federer, Barty, and many others haven’t usually been part of it, while several dozen players are competing as often as they can. The relative rankings of that second group are pretty good.

It doesn’t seem quite fair that Clara Tauson is stuck just inside the top 100 while her Elo is already top-50, or that Rublev remains behind Federer despite an eye-popping six months of results while Roger sat at home. And for some historical considerations–say, weeks inside the top 50 for Tauson or the top 5 for Rublev–maybe it isn’t fair that they’re stuck behind peers who are choosing not to play, or who are resting on the laurels of 18-month-old wins.

But in other important ways, the absolute rankings often don’t matter. Rublev has been a top-five seed at every event he’s played since late September except for Roland Garros, the Tour Finals, and the Australian Open, despite never being ranked above #8. When the tour-within-a-tour plays, he is a top-five guy. The likes of Rublev and Tauson will continue to have the deck slightly stacked against them at the majors, but even that disadvantage will steadily erode if they continue to play at their current levels.

Believing in science as I do, I will take these findings to heart. That means I’ll continue to complain about the problems with the official rankings–but no more than I did before the pandemic.

Podcast Episode 86: A New Documentary on Guillermo Vilas and the No. 1 Ranking

Episode 86 of the Tennis Abstract Podcast features Jeff and co-host Carl Bialik, of the Thirty Love podcast, discussing the new Netflix doc Guillermo Vilas: Settling the Score.

The Argentine star was a multi-slam winner in the 1970s, yet he never reached the top of the official ATP ranking list. The film covers journalist Eduardo Puppo’s quest to prove that Vilas deserved to be #1. Over the course of the episode, we ponder the importance of the top ranking, the vagaries of the ATP ranking algorithm, how Elo rates Vilas’s peak years, and the ATP’s response to Vilas’s case for the top spot. We didn’t love the documentary, but the issues it raises are fun to debate.

Fans of the TA podcast will also want to check out Dangerous Exponents, the new Covid-19 podcast that Carl Bialik and I are doing. Episode 3 will be available later today.

Thanks for listening!

(Note: this week’s episode is about 48 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

There’s Always a Chance: Marie Bouzkova Edition

Last night in Toronto, 91st-ranked qualifier Marie Bouzkova won her quarter-final match against 4th-ranked Simona Halep. Halep retired with a leg injury after losing the first set, so there’s a caveat–even if we were prepared to read too much into a single match, we wouldn’t attribute a lot of meaning to this one. But it’s a big accomplishment for the 21-year-old Czech, who earned her second top-ten scalp of the week and will advance to her first Premier-level semi-final, against no less of an obstacle than Serena Williams.

Here’s the nutty thing: It was Bouzkova’s 62nd match of the 2019 season, her 61st against someone with a WTA ranking. She got the win against the highest-ranked foe–Halep–but just last week, she lost to 636th-ranked CoCo Vandeweghe, her lowest-ranked opponent of the year. Yeah, the caveats keep coming: Vandeweghe is coming back from injury and is surely better than a ranking outside the top 600, and the ITF Transition Tour hijinks mean that the ranking system didn’t work as usual in 2019. Some players who would normally have a very low ranking, like the Kazakh wild card who Bouzkova crushed a couple of weeks ago, don’t count.

Still. 61 matches, with a win against the highest-ranked player and a loss against the lowest.

That sent me to my database, which had plenty more surprises in store. Going back less than a decade, to 2010, I found 127 players who recorded the same oddball combination of feats in a single season, minimum 30 matches. (To be consistent with the Halep result, I included retirements if at least one set was completed.) While many of the players won’t be of wide interest–last year, one of the exemplars was Mira Antonitsch, who didn’t play anyone ranked in the top 400–63 of the 127 player-seasons involved beating a top-100 opponent, 44 included the defeat of someone in the top 50, and 25 were highlighted by a top-ten upset.
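The filter itself is simple to express. The sketch below is one way to write it, under assumptions the post doesn't spell out: the 30-match minimum counts every match in the season, unranked opponents (rank `None`) are ignored when finding the extremes, and a single win over the top-ranked opponent or a single loss to the bottom-ranked one is enough. The function name and input layout are hypothetical.

```python
from typing import List, Optional, Tuple

# (opponent's ranking or None if unranked, did the player win?)
SeasonMatch = Tuple[Optional[int], bool]


def beat_highest_lost_to_lowest(matches: List[SeasonMatch], min_matches: int = 30) -> bool:
    """True if the player beat her highest-ranked opponent and lost to her lowest-ranked."""
    if len(matches) < min_matches:
        return False
    ranked = [(rank, won) for rank, won in matches if rank is not None]
    if not ranked:
        return False
    best = min(rank for rank, _ in ranked)    # numerically smallest = highest-ranked
    worst = max(rank for rank, _ in ranked)   # numerically largest = lowest-ranked
    if best == worst:
        return False
    beat_best = any(won for rank, won in ranked if rank == best)
    lost_to_worst = any(not won for rank, won in ranked if rank == worst)
    return beat_best and lost_to_worst


# A made-up season: a win over No. 4, a loss to No. 636, and 30 other wins.
season = [(4, True), (636, False)] + [(100, True)] * 30
print(beat_highest_lost_to_lowest(season))  # True
```

Retirements with at least one completed set would simply be included in `matches` as wins or losses, matching the rule used for the Halep result.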

Three of them included Halep as the top-ten scalp! That makes Bouzkova the fourth player to beat Halep, not face anyone higher ranked, and also lose to her lowest-ranked opponent of the season. (Through eight months, anyway.) Halep shouldn’t feel too bad, though, as Angelique Kerber has been the extreme-ranked loser in five such cases, four of them in 2017. Ouch.

Here are the 25 player-seasons between 2010 and 2018 in which a WTAer beat her highest-ranked opponent and lost to her lowest:

Year  Player       High-Ranked  Rk  Low-Ranked  Rk       
2017  Kasatkina    Kerber       1   Kanepi      418      
2018  Hsieh        Halep        1   Gasparyan   410      
2010  Jankovic     Serena       1   Diyas       268      
2010  Clijsters    Wozniacki    1   G-Vidagany  258   *  
2014  Cornet       Serena       1   Townsend    205      
2010  Yakimova     Jankovic     2   Dellacqua   980      
2017  Bouchard     Kerber       2   Duval       896   *  
2017  Vesnina      Kerber       2   Azarenka    683      
2016  Bencic       Kerber       2   Boserup     225      
2014  Rybarikova   Halep        2   Eguchi      183      
2017  Mladenovic   Kerber       2   Andreescu   167   *  
2018  Goerges      Wozniacki    3   Serena      451      
2014  Tomljanovic  Radwanska    3   A Bogdan    308      
2015  Mladenovic   Halep        3   Savchuk     262      
2017  Kerber       Pliskova     4   Stephens    934      
2014  Pavlyu'ova   Radwanska    4   Wozniak     241      
2017  Dodin        Cibulkova    5   Rybarikova  453      
2017  Bellis       Radwanska    6   Azarenka    683      
2018  Buyukakcay   Ostapenko    6   Di Sarra    555      
2017  Sakkari      Wozniacki    6   Potapova    454      
2015  L Davis      Bouchard     7   E Bogdan    527      
2015  Ostapenko    S-Navarro    9   Dushevina   1100  *  
2016  KC Chang     Vinci        10  S Murray    862      
2018  Pera         Konta        10  Hlavackova  825      
2018  Danilovic    Goerges      10  Pegula      620

* also faced one unranked player

A quick glance is all it takes to establish that Vandeweghe isn’t the first lowest-ranked player to inspire a “yeah, but” reaction. The list of purportedly weak opponents is very strong for one made up of players with an average ranking outside of the top 500. We have stars such as Victoria Azarenka (twice) and Serena as well as a helping of prospects such as Bianca Andreescu and Victoria Duval.

Consider this as today’s reminder of the limitations of the WTA computer rankings. They tell us who has won a lot of matches in the last 52 weeks, not necessarily who is playing well right now. These cases include many of the most extreme mismatches between official ranking and on-the-day ability. I don’t think it says anything meaningful about a player to show up on this list–though Kerber’s many appearances (as both player and scalp!) are a good summary of her disappointing 2017 campaign.

Bouzkova will remain on the list for at least a couple more days: Serena is currently ranked 10th and both of the other semi-finalists are ranked lower, so Halep will remain her “toughest” opponent. Despite the Czech’s breakout week, it would be understandable if she found herself overawed to face a 23-time slam champion across the net. But one thing is certain: Bouzkova couldn’t care less about the number next to the name.

Picking Favorites With Better Davis Cup Rankings

Yesterday, the ITF announced the seedings for the first new-look Davis Cup Finals, to be held in Madrid this November. The 18-country field was completed by the 12 home-and-away ties contested last weekend. Those 12 winners will join France, Croatia, Spain, and USA (last year’s semi-finalists) along with the two wild cards, recent champions Argentina and Great Britain.

The six nations who skipped the qualifying round will make up five of the top six seeds. (Spain is 7th, while Belgium, who had to qualify, is 4th.) The preliminary round of the November event will feature six round-robin groups of three, each consisting of one top-six seed, a second country ranked 7-12, and a third ranked 13-18. Seeding really matters, as a top position (deserved or not!) guarantees that a side will avoid dangerous opponents like last year’s finalists France and Croatia. Even the difference between 12 and 13 could prove decisive, as a 7-through-12 spot ensures that a nation will steer clear of the always-strong Spaniards, who are seeded 7th.

The seeds are based on the Davis Cup’s ranking system, which relies entirely on previous Davis Cup results. While the formula is long-winded, the concept is simple: A country gets more points for advancing further each season, and recent years are worth the most. The last four years of competition are taken into consideration. It’s not how I would do it, but the results aren’t bad. Four or five of the top six seeds will field strong sides, and one of the exceptions–Great Britain–would have done so had Andy Murray’s hip cooperated. Spain is obviously misranked, but given the limitations of the Davis Cup ranking system, it’s understandable, as the 2011 champions spent 2015 and 2016 languishing outside the World Group.

We can do better

The Davis Cup rankings have several flaws. First, they rely heavily on a lot of old results. If we’re interested in how teams will compete in November, it doesn’t matter how well a side fared three or four years ago, especially if some of their best players are no longer in the mix. Second, they don’t reflect the change in format. Until last year, doubles represented one rubber in a best-of-five-match tie. A good doubles pair helped, but it wasn’t particularly necessary. Now, there are only two singles matches alongside the doubles rubber. The quality of a nation’s doubles team is more important than it used to be.

Let’s see what happens to the rankings when we generate a more forward-looking rating system. Using singles and doubles Elo, I’m going to make a few assumptions:

  • Each country’s top two singles players have a 75% chance of participating (due to the possibility of injury, fatigue, or indifference), and if either one doesn’t take part, the country’s third-best player will replace him.
  • Same idea for doubles, but the top two doubles players have an 85% chance of showing up, to be replaced by the third-best doubles player if necessary.
  • The three matches are equally important. (This isn’t technically true–the third match is likely to be necessary less than half the time, though when it does decide the tie, it is twice as important as the other two matches.)
  • Andy Murray won’t play.

Those assumptions allow us to combine the singles and doubles Elo ratings of the best players of each nation. The result is a weighted rating for each side, one that has a lot of bones to pick with the official Davis Cup rankings.
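For readers who want to see the mechanics, here is a minimal sketch of how those assumptions could be turned into a single number per team. It is not the code behind the table below. The function names are hypothetical, the ratings in the usage line are made up, and the sketch adds three assumptions of its own: absences are independent, a fourth-best player steps in if both of the top two are missing, and a pairing is rated by the simple average of its two members.

```python
from itertools import product


def expected_pair_rating(ratings, p_play):
    """Expected average rating of the two players who end up on court.

    ratings: Elo ratings sorted best-first; needs at least four entries so
             a fallback exists when both of the top two are absent
             (the fourth player is an assumption of this sketch).
    p_play:  chance that each of the top two takes part, treated as
             independent events (also an assumption).
    """
    ratings = list(ratings)
    total = 0.0
    for first_in, second_in in product([True, False], repeat=2):
        prob = (p_play if first_in else 1 - p_play) * \
               (p_play if second_in else 1 - p_play)
        present = [r for r, ok in zip(ratings[:2], (first_in, second_in)) if ok]
        # Fill any empty slot from the bench, best remaining player first.
        lineup = (present + ratings[2:])[:2]
        total += prob * sum(lineup) / 2
    return total


def team_rating(singles_elo, doubles_elo):
    """Weight two singles rubbers and one doubles rubber equally."""
    singles = expected_pair_rating(singles_elo, 0.75)
    doubles = expected_pair_rating(doubles_elo, 0.85)
    return (2 * singles + doubles) / 3


# Made-up ratings, purely for illustration.
print(round(team_rating([2000, 1900, 1800, 1700], [1850, 1800, 1750, 1700]), 1))
```

The useful property of this setup is that a team with one superstar and a weak bench is penalized for the 25% chance the star stays home, while a deep team barely moves.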

Forward-looking rankings

The following table shows the 18 countries at the Davis Cup finals along with the 12 losing qualifiers. For each team, I’ve listed their Davis Cup ranking, and their finals seed (if applicable). To demonstrate my results, I’ve shown each nation’s weighted Elo rank and rating and their hard-court Elo rank and rating. The table is sorted by hard-court Elo:

Country  DC Rank  Seed  Elo Rank   Elo  sElo Rank  sElo  
ESP            7     7         1  1936          1  1891  
CRO            2     2         2  1898          2  1849  
FRA            1     1         3  1880          3  1845  
USA            6     6         4  1876          4  1835  
RUS           21    17         7  1855          5  1827  
AUS            9     9         5  1857          6  1820  
SRB            8     8         8  1849          7  1808  
GER           11    11         6  1855          8  1799  
AUT           16              10  1800          9  1766  
ARG            3     3         9  1803         10  1755  
GBR            5     5        11  1796         11  1750  
SUI           24              14  1763         12  1749  
ITA           10    10        12  1780         13  1745  
CAN           14    13        13  1777         14  1744  
JPN           17    14        15  1735         15  1719  
BEL            4     4        17  1688         16  1673  
CZE           13              16  1712         17  1661  
NED           19    16        18  1685         18  1643  
BRA           28              20  1659         19  1638  
IND           20              21  1652         20  1621  
SVK           29              22  1645         21  1617  
CHI           22    18        19  1682         22  1609  
KAZ           12    12        26  1582         23  1574  
COL           18    15        24  1597         24  1551  
SWE           15              27  1570         25  1542  
BIH           27              28  1552         26  1540  
POR           26              23  1610         27  1535  
HUN           23              25  1583         28  1533  
UZB           25              29  1491         29  1489  
CHN           30              30  1468         30  1465

Spain is the comfortable favorite, regardless of whether we look at overall Elo or hard-court Elo. When the draw is conducted, we’ll see which top-six seed is unlucky enough to end up with the Spaniards in their group, and whether the hosts will remain the favorite.

The biggest mismatch between the Davis Cup rankings and my Elo-based approach is in our assessment of the Russian squad. Daniil Medvedev is up to sixth in my singles Elo ratings, with Karen Khachanov at 10th. Those ratings might be a little aggressive, but as it stands, Russia is the only country with two top-ten Elo singles players. Spain is close, with Rafael Nadal ranked 2nd and Roberto Bautista Agut 11th, and the hosts have the additional advantage of a deep reservoir of doubles talent from which to choose.

In the opposite direction, my rankings do not forecast good things for the Belgians. David Goffin has fallen out of the Elo top 20, and there are no superstar doubles players to pick up the slack. In a just world, Spain and Belgium will land in the same round-robin group–preferably one without the Russians as well.

Madrid or Maldives

The results I’ve shown assume that every top singles player has the same chance of participating. That’s certainly not the case, with high-profile stars like Alexander Zverev telling the press that they’ll be spending the week on holiday in the Maldives. Some teams are heavily dependent on one singles player who could make or break their chances with a decision or an injury.

As it stands, Germany is 8th in the surface-weighted Elo. If we take Zverev entirely out of the mix, they drop to a tie for 14th with Japan. It’s something the German side would prefer to avoid, but it’s not catastrophic, partly because the Germans were never among the favorites, and partly because Zverev could play only one singles rubber per tie and the doubles replacements are competent.

Even more reliant on a single player is the Serbian side, which qualified last weekend without the help of their most dangerous threat, Novak Djokovic. With Djokovic, the Serbs rank 7th–a case where my surface Elo ratings almost agree with the official rankings. But without the 15-time major winner, the Serbs fall down to a tie with Belgium in 16th place. While the Serbs are unlikely to take home the trophy regardless, Novak would make a huge difference.

The draw will take place next Thursday. We’ll check back then to see which sides have the best forecasts, nine months out from the showdown in Madrid.

The Unique Late-Career Surge of Mihaela Buzarnescu

The newest member of the WTA top 32 got there the hard way. Mihaela Buzarnescu, who achieved her latest career-high ranking with a run to the final of last week’s Prague event, where she lost a three-setter to Petra Kvitova, made her professional debut 14 years ago. Despite a dose of junior success, including a junior doubles title at the 2006 US Open, she didn’t crack the top 100 until last October.

This isn’t how tennis career trajectories are supposed to work. Yes, the game is getting older and stars are extending their careers, but Buzarnescu’s year-long winning spree, in which she has climbed from outside the top 400 to inside the top 40, began after her 29th birthday. The closer we look at what the Romanian has achieved, and the age at which she’s doing so, the more unusual it appears.

The oldest top 100 debuts

Since the beginning of the 1987 season, 630 women have debuted in the top 100. Their average age, on the Monday they reached the ranking threshold, is just under 20 years and 6 months. Only 29 of the 630–less than five percent–broke into the top 100 after their 26th birthday.

Only 14 players did so after turning 27:

Player                 Debut  Age (Y)  Age (D)  Peak Rank  
Tzipi Obziler       20070219       33      306         75  
A. Villagran Reami  19880801       31      359         99  
Mihaela Buzarnescu  20171016       29      165         32  
Julie Ditty         20071105       28      305         89  
Eva Bes Ostariz     20010716       28      183         90  
Mashona Washington  20040719       28       49         50  
Maureen Drake       19990201       27      317         47  
Tatjana Maria       20150406       27      241         46  
Hana Sromova        20051107       27      211         87  
Laura Siegemund     20150914       27      193         27  
Flora Perfetti      19960708       27      160         54  
Louise Allen        19890227       27       51         83  
Kristina Barrois    20081020       27       20         57  
Iryna Bremond       20111017       27       11         93

Buzarnescu doesn’t quite top this list, but she is certainly a more consequential force on tour than either of the women who debuted at a more advanced age. Tzipi Obziler fought her way through the lower levels of the game for just as long as Buzarnescu did, though she never cracked the top 70. Adriana Villagran Reami played a limited schedule; she may have had the skills to play top-100 tennis long before the ranking table made it official, but she was never a tour regular.

The most comparable player to Buzarnescu is Laura Siegemund, who reached a double-digit ranking a few years ago, and has since climbed as high as No. 27. Of the oldest top-100 debutants, though, very few have continued to ascend the rankings as far as Buzarnescu and Siegemund have.

Here are the oldest top-100 debuts of players who went on to crack the top 32:

Player                      Debut  Age (Y)  Age (D)  Peak  
Mihaela Buzarnescu       20171016       29      165    32  
Laura Siegemund          20150914       27      193    27  
Sybille Bammer           20050822       25      117    19  
Shinobu Asagoe           20000710       24       12    21  
Manon Bollegraf          19880215       23      310    29  
Johanna Konta            20140623       23       37     4  
Anne Kremer              19981019       23        2    18  
Lesia Tsurenko           20120528       22      364    29  
Kveta Peschke            19980420       22      286    26  
Petra Cetkovska          20071022       22      256    25  
Tathiana Garbin          20000214       22      229    22  
Li Na                    20041004       22      221     2  
Mara Santangelo          20040202       22      219    27  
Ginger Helgeson Nielsen  19910325       22      192    29  
Casey Dellacqua          20070806       22      176    26

Here’s an indication of just how young women’s tennis is: The 8th-oldest top-100 debutant on this list achieved her feat before her 23rd birthday. Put another way: Of the 107 women to break into the top 100 after their 23rd birthday, only seven went on to a ranking of No. 32 or better. By comparison, about one-third of all top-100 players peak at a ranking in the top 32. In this category, Buzarnescu is charting entirely new territory.

Making up for lost time

The last six months or so have been a whirlwind for the Romanian, as she has gone from a fringe tour player that no one had ever heard of, to a solid tour regular that … well, most fans still don’t know much about. Many players need some time to adjust to the higher level of competition and spend months, even years, stagnating in the rankings. Buzarnescu, on the other hand, has barely stopped to take a breath.

It took 203 days from her top-100 debut last October to her latest career-high at No. 32 on Monday. Siegemund, by comparison, needed 315 days; Sybille Bammer took 574 days; Roberta Vinci, who eventually cracked the top ten, required 2,520 days, or nearly seven years. The average player who reaches the top 32 needs two and a half years between her first appearance in the top 100 and clearing the higher bar.
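The day counts are plain calendar arithmetic between two ranking dates. As a quick sketch, assuming the debut date format used in the tables above (YYYYMMDD) and assuming that the Monday in question is May 7, 2018 (the date is my inference, and the helper names are hypothetical):

```python
from datetime import date, datetime


def parse_ranking_date(text):
    """Turn a table-style date such as '20171016' into a date object."""
    return datetime.strptime(text, "%Y%m%d").date()


def ascent_days(top100_debut, top32_debut):
    """Days between a player's first top-100 ranking and her first top-32 ranking."""
    return (top32_debut - top100_debut).days


debut = parse_ranking_date("20171016")  # Buzarnescu's top-100 debut, from the table
career_high = date(2018, 5, 7)          # assumed date of the No. 32 ranking
print(ascent_days(debut, career_high))  # 203
```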

Buzarnescu’s climb doesn’t fit the mold of older debuts. Her climb has more in common with those of teenage sensations. Again since 1987, here are the 20 quickest ascents:

Player              Age (Y)  Age (D)  Peak  Ascent Days  
Jennifer Capriati        14       11     1            0  
Anke Huber               15      266     4           49  
Agnes Szavay             18      164    13           77  
Lindsay Davenport        16      238     1          112  
Naoko Sawamatsu          17       31    14          119  
Clarisa Fernandez        20      265    26          133  
Maria Sharapova          16       58     1          133  
Serena Williams          16       52     1          133  
Miriam Oremans           20      145    25          140  
Venus Williams           16      301     1          147  
Sofia Arvidsson          21      223    29          154  
Leila Meskhi             19      308    12          168  
Tatiana Golovin          16       22    12          175  
Eugenie Bouchard         19       42     5          189  
Martina Hingis           14       31     1          189  
Ana Ivanovic             16      361     1          196  
Conchita Martinez        16      107     2          196  
Mihaela Buzarnescu       29      165    32          203  
Darya Kasatkina          18      137    11          203  
Ashleigh Barty           20      316    16          210

The player Buzarnescu knocked out of the top 20: Kim Clijsters. Buzarnescu is the only woman on the list to have cracked the top 100 after her 22nd birthday, yet here she is, climbing from No. 101 to No. 32 in less time than 92% of her peers.

Common sense suggests that Buzarnescu can climb only so much higher: Most players don’t set new career highs in their 30s, especially those who have such a short track record of tour-level success. On the other hand, she has adapted quickly, recording her first top ten win, over Jelena Ostapenko, in February and taking a set from Kvitova in Saturday’s final.

What’s more, she’ll reap the benefits of seeds at many events, probably including Roland Garros and Wimbledon. Having proven that she can defeat top 50 players–she holds a 6-7 career record against them–her new status as a top-32 player means she’ll get plenty of opportunities to rack up points against a less-daunting brand of competition. After more than a decade of fighting steeply uphill battles, she has finally–improbably–earned a place among the game’s elite. Now all she has to do is keep winning.

Feast, Famine, and Sloane Stephens

Italian translation at settesei.it

Last week, Sloane Stephens reeled off an impressive series of victories, defeating Garbine Muguruza, Angelique Kerber, Victoria Azarenka, and Jelena Ostapenko to secure the title at the WTA Premier Mandatory event in Miami.  The trophy isn’t quite as life-changing as the one she claimed at the US Open last September, but it’s a close second, and the competition she faced along the way was every bit as good.

The Miami title comes with 1,000 WTA ranking points, and by adding those to her previous tally, Stephens moved into the top ten, reaching a career high No. 9 on Monday. With two high-profile championships to her name, not to mention semifinal showings last summer in Toronto and Cincinnati, there’s little doubt she deserves it. Elo isn’t quite convinced, but its more sophisticated algorithm (and its disregard for the magnitude of the US Open and Miami titles) puts her within spitting distance of the top ten as well.

What makes Stephens’s rise to the top ten so remarkable is her efficiency in converting wins to ranking points. Since her return from injury at Wimbledon last year, she has played only 38 matches, winning 24 of them. She has suffered six first-round losses, plus two more defeats at last year’s Zhuhai Elite Trophy round-robin and another pair in the Fed Cup final against Belarus. All told, in the last nine months, she has won matches at only six different events. Her unusual record illustrates some of the quirks in the ranking system, and how a player who peaks at the right times can exploit them.

24 wins is almost never enough for a spot in the vaunted top ten. From 1990 to 2017, a player has finished a season with a top-ten ranking only seven times while winning fewer than 30 matches. Only two of those involved fewer wins than Sloane’s 24: Monica Seles’s 1993 and 1995, the timespans leading up to her tragic on-court stabbing and following her eventual comeback. Here are the top-ten seasons with the fewest victories, including the last 52 weeks of a few players currently near the top of the WTA table:

Year  Player              YE Rk   W   L  W-L %  
1995  Monica Seles*           1  11   1    92%  
1993  Monica Seles            8  17   2    89%  
2018  Sloane Stephens**       9  24  14    63%  
2010  Serena Williams         4  25   4    86%  
1993  Jennifer Capriati       9  28  10    74%  
2015  Flavia Pennetta         8  28  20    58%  
2000  Mary Pierce             7  29  11    73%  
2004  Jennifer Capriati      10  29  12    71%  
1993  Mary Joe Fernandez      7  31  12    72%  
1995  Iva Majoli              9  31  13    70%  
2018  Venus Williams**        8  31  14    69%  
1995  Mary Joe Fernandez      8  31  15    67%  
2015  Lucie Safarova          9  32  21    60%  
2008  Maria Sharapova         9  33   6    85%  
1998  Steffi Graf             9  33   9    79%  
2018  Petra Kvitova**        10  33  14    70%

* ranking frozen after her assault

** rankings as of April 2, 2018; wins and losses based on previous 52 weeks

What almost all of these seasons have in common is exceptional performances at grand slams. Sloane won the US Open; Seles won the 1993 Australian; Serena Williams won a pair of majors in 2010; Flavia Pennetta capped an otherwise anonymous 2015 campaign with a title in New York. The slams are where the rankings points are.

Even within this group of slam successes, Sloane stands out. Of the 16 players on that list, only two–Pennetta and Lucie Safarova–won matches at a lower rate than Stephens has since her comeback. In other words, most women who are this efficient with their victories don’t lose quite so early or often at lesser events.

That 63% won-loss record is even more extreme than the above list makes it look. Of the nearly 300 year-end top-tenners since 1990, only eight finished the season with a lower win rate. Here’s that list, expanded to the top 11 to include another noteworthy recent season:

Year  Player              YE Rk   W   L  W-L %  
2014  Dominika Cibulkova     10  33  24    58%  
2000  Nathalie Tauziat       10  36  26    58%  
2015  Flavia Pennetta         8  28  20    58%  
1999  Nathalie Tauziat        7  37  25    60%  
2007  Marion Bartoli         10  47  31    60%  
2015  Lucie Safarova          9  32  21    60%  
2000  Anna Kournikova         8  47  29    62%  
2010  Jelena Jankovic         8  38  23    62%  
2018  Sloane Stephens*        9  24  14    63%  
2004  Elena Dementieva        6  40  23    63%  
2016  Garbine Muguruza        7  35  20    64%

* ranking as of April 2, 2018; wins and losses based on previous 52 weeks

There’s not much overlap between these lists; the first group generally missed some time, then made up for it by scoring big at slams, while the second group slogged through a long season and leveled up with a strong finish or two at a major. The typical player with a 63% winning percentage doesn’t end up in the top ten: She wraps up the season, on average, in the mid-twenties. At least that’s better than the average 24-win season: Those result in year-end finishes near No. 40.

Stephens has always been a big-match player: She made an early splash at the 2013 Australian Open, reaching the semifinals and upsetting Serena as a 19-year-old, and her overall career record at majors (66%) is nearly ten percentage points higher than her record at other tour events (57%). For all that, she will probably not conclude 2018 with such an extreme set of won-loss numbers. To do so, she’d probably need to win a major to replace her 2017 US Open points while losing early at most other events. Recovered from injury, Stephens may maintain her feast-or-famine ways to some degree, but it’s unlikely she’ll continue to display such extreme peaks and valleys.

Measuring the Impact of Wimbledon’s Seeding Formula

Italian translation at settesei.it

Unlike every other tournament on the tennis calendar, Wimbledon uses its own formula to determine seedings. The grass court Grand Slam grants seeds to the top 32 players in each tour’s rankings, and then re-orders them based on its own algorithm, which rewards players for their performance on grass over the last two seasons.

This year, the Wimbledon seeding formula has more impact on the men’s draw than usual. Seven-time champion Roger Federer is one of the best grass court players of all time, and though he dominated hard courts in the first half of 2017, he still sits outside the top four in the ATP rankings after missing the second half of 2016. Thanks to Wimbledon’s re-ordering of the seeds, Federer will switch places with ATP No. 3 Stan Wawrinka and take his place in the draw as the third seed.

Even with Wawrinka’s futility on grass and the shakiness of Andy Murray and Novak Djokovic, getting inside the top four has its benefits. If everyone lives up to their seed in the first four rounds (they won’t, but bear with me), the No. 5 seed will face a path to the title that requires beating three top-four players. Whichever top-four guy has No. 5 in his quarter would confront the same challenge, but the other three would have an easier time of it. Before players are placed in the draw, top-four seeds have a 75% chance of that easier path.

Let’s attach some numbers to these speculations. I’m interested in the draw implications of three different seeding methods: ATP rankings (as every other tournament uses), the Wimbledon method, and weighted grass-court Elo. As I described last week, weighted surface-specific Elo–averaging surface-specific Elo with overall Elo–is more predictive than ATP rankings, pure surface Elo, or overall Elo. What’s more, weighted grass-court Elo–let’s call it gElo–is about as predictive as its peers for hard and clay courts, even though we have less grass-court data to go on. In a tennis world populated only by analysts, seedings would be determined by something a lot more like gElo and a lot less like the ATP computer.
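To make the definition concrete, here is a small sketch of the two pieces involved. The first function is just the simple average described above. The second is the standard Elo win-probability formula with the usual 400-point scale, which I'm assuming is the form used here. Function names and the sample ratings are hypothetical.

```python
def weighted_surface_elo(overall_elo, surface_elo):
    """Weighted surface rating: the plain average of overall and surface-only Elo."""
    return (overall_elo + surface_elo) / 2


def win_probability(rating_a, rating_b):
    """Standard Elo expectation that player A beats player B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


# Made-up numbers: a player who is much better on grass than overall.
g_elo = weighted_surface_elo(overall_elo=1900, surface_elo=2100)
print(g_elo)                                  # 2000.0
print(round(win_probability(g_elo, 1900), 3))
```

Averaging pulls a surface specialist's rating back toward his overall level, which is one way to cope with the small number of grass-court matches played each year.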

Since gElo ratings provide the best forecasts, we’ll use them to determine the effects of the different seeding formulas. Here is the current gElo top sixteen, through Halle and Queen’s Club:

1   Novak Djokovic         2296.5  
2   Andy Murray            2247.6  
3   Roger Federer          2246.8  
4   Rafael Nadal           2101.4  
5   Juan Martin Del Potro  2037.5  
6   Kei Nishikori          2035.9  
7   Milos Raonic           2029.4  
8   Jo Wilfried Tsonga     2020.2  
9   Alexander Zverev       2010.2  
10  Marin Cilic            1997.7  
11  Nick Kyrgios           1967.7  
12  Tomas Berdych          1967.0  
13  Gilles Muller          1958.2  
14  Richard Gasquet        1953.4  
15  Stanislas Wawrinka     1952.8  
16  Feliciano Lopez        1945.3

We might quibble with some of these positions–the algorithm knows nothing about whatever is plaguing Djokovic, for one thing–but in general, gElo does a better job of reflecting surface-specific ability level than other systems.

The forecasts

Next, we build a hypothetical 128-player draw and run a whole bunch of simulations. I’ve used the top 128 in the ATP rankings, except for known withdrawals such as David Goffin and Pablo Carreno Busta, which doesn’t differ much from the list of guys who will ultimately make up the field. Then, for each seeding method, we randomly generate a hundred thousand draws, simulate those brackets, and tally up the winners.
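In rough outline, that procedure looks something like the sketch below. It is a simplified stand-in, not the code used for the tables that follow: the seed placement works in tiers (1, 2, 3-4, 5-8, and so on) so that seeds in the same tier land in separate sections, match outcomes come from the standard Elo formula, and all names are hypothetical. The draw size and the number of seeds are assumed to be powers of two.

```python
import random
from collections import Counter


def elo_win_prob(a, b):
    """Standard Elo expectation that a player rated a beats one rated b."""
    return 1 / (1 + 10 ** ((b - a) / 400))


def make_draw(seeds, unseeded, rng):
    """Return a bracket order; seeds is a list of names in seeding order."""
    size = len(seeds) + len(unseeded)
    draw = [None] * size
    placed, tier = 0, 1
    while placed < len(seeds):
        block = size // tier
        # Sections of the bracket that do not hold a seed yet.
        free = [i for i in range(0, size, block)
                if all(p is None for p in draw[i:i + block])]
        group = seeds[placed:tier]
        rng.shuffle(group)
        rng.shuffle(free)
        for name, start in zip(group, free):
            draw[start] = name
        placed, tier = tier, tier * 2
    rest = list(unseeded)
    rng.shuffle(rest)
    filler = iter(rest)
    return [p if p is not None else next(filler) for p in draw]


def play(draw, ratings, rng):
    """Play the bracket round by round and return the champion."""
    field = draw
    while len(field) > 1:
        field = [a if rng.random() < elo_win_prob(ratings[a], ratings[b]) else b
                 for a, b in zip(field[0::2], field[1::2])]
    return field[0]


def title_odds(seeds, unseeded, ratings, n_sims=10_000, seed=0):
    """Share of simulated tournaments won by each player."""
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(n_sims):
        wins[play(make_draw(seeds, unseeded, rng), ratings, rng)] += 1
    return {p: wins[p] / n_sims for p in ratings}
```

To compare seeding methods, keep the ratings dictionary fixed and change only the order of the seeds list, so that any difference in the output comes from the draw and not from the players' assumed ability.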

Here are the ATP top ten, along with their chances of winning Wimbledon using the three different seeding methods:

Player              ATP     W%  Wimb     W%  gElo     W%  
Andy Murray           1  23.6%     1  24.3%     2  24.1%  
Rafael Nadal          2   6.1%     4   5.7%     4   5.5%  
Stanislas Wawrinka    3   0.8%     5   0.5%    15   0.4%  
Novak Djokovic        4  34.1%     2  35.4%     1  34.8%  
Roger Federer         5  21.1%     3  22.4%     3  22.4%  
Marin Cilic           6   1.3%     7   1.0%    10   1.0%  
Milos Raonic          7   2.0%     6   1.6%     7   1.7%  
Dominic Thiem         8   0.4%     8   0.3%    17   0.2%  
Kei Nishikori         9   1.9%     9   1.7%     6   1.9%  
Jo Wilfried Tsonga   10   1.6%    12   1.4%     8   1.5%

Again, gElo is probably too optimistic on Djokovic–at least the betting market thinks so–but the point here is the differences between systems. Federer gets a slight bump for entering the top four, and Wawrinka–who gElo really doesn’t like–loses a big chunk of his modest title hopes by falling out of the top four.

The seeding effect is a lot more dramatic if we look at semifinal odds instead of championship odds:

Player              ATP    SF%  Wimb    SF%  gElo    SF%  
Andy Murray           1  58.6%     1  64.1%     2  63.0%  
Rafael Nadal          2  34.4%     4  39.2%     4  38.1%  
Stanislas Wawrinka    3  13.2%     5   7.7%    15   6.1%  
Novak Djokovic        4  66.1%     2  71.1%     1  70.0%  
Roger Federer         5  49.6%     3  64.0%     3  63.2%  
Marin Cilic           6  13.6%     7  11.1%    10  10.3%  
Milos Raonic          7  17.3%     6  14.0%     7  15.2%  
Dominic Thiem         8   7.1%     8   5.4%    17   3.8%  
Kei Nishikori         9  15.5%     9  14.5%     6  15.7%  
Jo Wilfried Tsonga   10  14.0%    12  13.1%     8  14.0%

There’s a lot more movement here for the top players among the different seeding methods. Not only do Federer’s semifinal chances leap from 50% to 64% when he moves inside the top four, even Djokovic and Murray see a benefit because Federer is no longer a possible quarterfinal opponent. Once again, we see the biggest negative effect on Wawrinka: A top-four seed would’ve protected a player who just isn’t likely to get that far on grass.

Surprisingly, the traditional big four are almost the only players out of all 32 seeds to benefit from the Wimbledon algorithm. By removing the chance that Federer would be in, say, Murray’s quarter, the Wimbledon seedings make it a lot less likely that there will be a surprise semifinalist. Tomas Berdych’s semifinal chances improve modestly, from 8.0% to 8.4%, with his Wimbledon seed of No. 11 instead of his ATP ranking of No. 13, but the other 27 seeds have lower chances of reaching the semis than they would have if Wimbledon stopped meddling and used the official rankings.

That’s the unexpected side effect of getting rankings and seedings right: It reduces the chances of deep runs from unexpected sources. It’s similar to the impact of Grand Slams using 32 seeds instead of 16: By protecting the best (and next best, in the case of seeds 17 through 32) from each other, tournaments require that unseeded players work that much harder. Wimbledon’s algorithm took away some serious upset potential when it removed Wawrinka from the top four, but it made it more likely that we’ll see some blockbuster semifinals between the world’s best grass court players.