Serena’s 23 vs Margaret’s 24

Since 2017, Serena Williams has held 23 major titles, leaving her just one shy of Margaret Court’s 24. The Williams-Court comparison forces us to think across eras in the same way that Federer-vs-Laver does, with the additional complication that Court has earned herself extreme dislike among many fans and fellow champions.

Let’s set aside the off-court stuff and work this out. The pro-Court case is simple: 24 is greater than 23, and you have to evaluate players relative to their own eras. The pro-Serena side is equally straightforward: 11 of Court’s 24 titles came in Australia, before Melbourne was a mandatory tour stop. Regardless of the era, Court’s home event was weaker back then.

As much as possible, I’m going to try to hold to the “relative to their own era” assumption. Everyone seems to accept it when it comes to Laver-vs-Federer. Plus, if we drop that constraint, the whole exercise is meaningless. With improved technology, fitness, and coaching, of course today’s players are better. But that’s not what people are talking about when they pick a side of Serena-vs-Margaret or Rod-vs-Roger.

Attentive readers of this blog might recall I took a stab at this problem back in 2019. That attempt relied on some extreme approximating due to the lack of pre-Open Era women’s tennis data. Regular readers will also know that the state of pre-Open Era women’s tennis data has vastly improved in the last few months. Tennis Abstract, plus the associated GitHub repo, now contains thousands of match results back to the mid-1950s.

Adjusting Australia

Let’s be clear: I’m not about to settle whether Margaret Court or Serena Williams (or someone else) is the GOAT of women’s tennis. That debate depends on much more than grand slam titles.

Today’s question is: How do Williams’s 23 titles stack up against Court’s 24?

That boils down to an even simpler question: How do Court’s 11 Australian titles measure up against other slams, then and now?

The anecdotal evidence is strongly anti-Margaret. As I mentioned in this morning’s Expected Points, the 1960 Australian Championships–Court’s first major title–had a 32-player draw (strike one), and 30 of those players were Australian (strikes two and three). Yes, it was a strong era for Australian women’s tennis, especially a few years later, but the tournament was hardly a showcase of international superstars. As such, it isn’t what we think of as a “major” tournament these days.

I’ve done a lot of “slam adjustments,” mostly to track the difficulty of the majors won by Djokovic, Federer, and Nadal. (Here’s the most recent.) The basic approach is simple. For each tournament, take the winning player’s draw, and for each match, calculate the chance that an average slam winner on that surface would beat that set of opponents. (Odds are determined by my Elo ratings, which are based on results before the event.) Take the resulting probabilities–on average, around 14% between 1952 and 2020–and normalize them, so that a mid-range slam draw is 1.0. Tougher draws are higher than 1, and easier draws are lower.
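To make the mechanics concrete, here is a minimal sketch of that calculation. The Elo win-probability formula is the standard one; the champion rating, the opponent ratings, and the normalization (dividing a baseline probability by the path probability) are all assumptions for illustration, not the exact procedure behind the published numbers.

```python
def elo_win_prob(rating_a, rating_b):
    """Standard Elo expectation that player A beats player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def path_probability(champion_rating, opponent_ratings):
    """Chance that a player with champion_rating beats every opponent in turn.

    Treats the matches as independent and multiplies the per-match odds.
    """
    prob = 1.0
    for opponent in opponent_ratings:
        prob *= elo_win_prob(champion_rating, opponent)
    return prob


def difficulty_score(path_prob, baseline_prob=0.14):
    """Assumed normalization: a draw at the baseline probability scores 1.0.

    Lower path probabilities (tougher draws) score above 1, and higher
    probabilities (easier draws) score below 1. The 0.14 baseline is the
    long-run average quoted in the text; the ratio form is a guess.
    """
    return baseline_prob / path_prob


# Hypothetical ratings, for illustration only.
AVERAGE_CHAMPION = 2100.0
draw = [1500.0, 1600.0, 1700.0, 1800.0, 1900.0, 2000.0, 2050.0]

p = path_probability(AVERAGE_CHAMPION, draw)
print(f"path probability: {p:.3f}, difficulty: {difficulty_score(p):.2f}")
```

Because the champion's rating is fixed at an "average slam winner" level, the result depends only on the quality of the opposition, which is the point of the adjustment.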

Equalizing the eras

This type of adjustment gets us most of the way there, but it doesn’t directly confront the “relative to the era” issue. The field in general was more lopsided in the 1960s than it is now, with a handful of very strong players swatting away a pack of also-rans who struggled to win more than a game or two per set against the elites. That in itself is a point in favor of Serena (and modern players in general), but again, on the Laver-vs-Federer principle, that’s not what we’re talking about today.

The easiest way to express this idea that all eras are equivalent is to use as a standard each season’s Wimbledon, the one tournament that everybody always wanted to play, and almost everyone actually did play. To avoid year-to-year fluctuations based on short-term injuries, we’ll make things a bit more resilient and compare the strength of each year’s Australian draw to the average strength of that year’s Wimbledon and US draws.

For example, my slam adjustments consider 1960 to be a strong year. Maria Bueno’s Wimbledon title was 40% more difficult than the average slam draw, and Darlene Hard’s US victory was about 30% tougher than usual. Court’s Australian title that year comes out as exactly average, so we compare Australia’s 1.0 to the average of Wimbledon and the US ( (1.4 + 1.3) / 2 = 1.35), and the 1960 Australian title, relative to the era, measures as:

1 / 1.35 = 0.75

The mostly-Australian field wasn’t as weak as the caricature makes it out to be, but it was weaker than the marquee majors that year.
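The same comparison can be written as a tiny function. This is only a sketch of the arithmetic described above; the function and argument names are made up for illustration. Note that the rounded inputs quoted in the text (1.0, 1.4, 1.3) give about 0.74, so the 0.75 figure presumably reflects unrounded draw scores.

```python
def relative_strength(australian, wimbledon, us):
    """Australian draw difficulty relative to the same year's benchmark.

    The benchmark is the mean of the Wimbledon and US draw scores, which
    smooths out one-off absences at either event.
    """
    benchmark = (wimbledon + us) / 2.0
    return australian / benchmark


# 1960, using the rounded draw scores quoted in the text.
strength_1960 = relative_strength(australian=1.0, wimbledon=1.4, us=1.3)
print(f"{strength_1960:.2f}")
```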

Here is how the strength of the Australian draw has evolved relative to the other grass- and hard-court slams from 1952 to the present:

Except for an outlier in 1965, when Bueno, Billie Jean King, and several other international stars turned up, the Australian Championships was a second-class member of the grand slam club until around 1980. It’s had plenty of weak years since then, as well, partly because of players who skipped due to injury, and partly due to contenders losing early, giving the eventual winners easier paths.

The main event

Margaret Court won the Australian 11 times. By this measure of relative strength, those titles were worth, on average, 62% as much as the other majors in those years. The strength of individual titles ranged from a low of 0.29 in 1961, when no international elites made the trip, to a high of 1.02 in 1965, when the field was positively star-studded.

Serena Williams has won the Australian seven times. It is tempting to leave that “7” as is, because Melbourne is now a mandatory tour stop and virtually every woman on tour considers it one of the top targets in her season. However, we should treat Serena’s seven the same way we adjusted Court’s 11. For all the era differences, some things remain the same, like jetlag and the difficulty of playing top-flight tennis only a few weeks into the season.

Williams’s seven were worth, on average, 88% as much as the other majors in their respective years. The weakest of the bunch was her last, in 2017. So many top players lost early that Serena never faced a top-eight opponent.

Court’s 11 titles, then, are equivalent to about 7 non-Australian majors–a penalty of four. Serena’s 7 are worth about 6 non-Australian majors–a penalty of one.

The final, adjusted tally: Williams 22, Court 20.
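For anyone who wants to check the arithmetic, here is a short sketch that reproduces the tally from the figures above. The function name is hypothetical, and rounding the Australian equivalent to a whole number of titles follows the "about 7" and "about 6" in the text.

```python
def adjusted_total(total_slams, australian_titles, avg_relative_strength):
    """Slam count with Australian titles scaled by their relative strength."""
    # Convert the Australian titles into whole "non-Australian major" units.
    australian_equivalent = round(australian_titles * avg_relative_strength)
    other_slams = total_slams - australian_titles
    return other_slams + australian_equivalent


court = adjusted_total(total_slams=24, australian_titles=11, avg_relative_strength=0.62)
williams = adjusted_total(total_slams=23, australian_titles=7, avg_relative_strength=0.88)
print(f"Williams {williams}, Court {court}")  # Williams 22, Court 20
```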

Margaret Court was one of the greatest players of all time, but her position on the all-time grand slam singles list depends too much on the shifting status of her home event. When we properly account for the Australian tournament’s position for decades as the most minor major, Court loses her remaining claim to the top spot. Serena may yet win 24, but to match or exceed Court, she shouldn’t have to.

How Should We Value the Masters and Premier Titles in the Bubble?

Tennis is back, but plenty of top players are still at home–or crashing out in the early rounds of their first tournament in months. While the ATP “Cincinnati” Masters event delivered the expected winner in Novak Djokovic, the Serb never had to face a top-ten opponent. The same was true of Victoria Azarenka, who won the WTA Premier tournament with the benefit of Naomi Osaka’s withdrawal in the final round, and without playing a top-tenner on her way there.

The tennis world’s “asterisk” talk has mostly focused on the US Open, since most people care about slams and don’t care about anything else. But judging from these easy paths to the two Cincinnati titles, should we be talking asterisk about the event just passed?

Novak’s 35th, but not (quite) his easiest

Last week, I explained why I thought the asterisk talk was premature, if not wrong. The field doesn’t matter, because the player who wins the title faces only a handful of players. The presence of, say, Rafael Nadal doesn’t have much to do with the difficulty of winning the title unless the eventual winner has to go through Rafa. If the champion’s opponents are very good, the path to the title is hard; if they are relatively weak, the path to the title is easy. Keep in mind I’m using the terms “good” and “weak” in theoretical terms. On paper, Djokovic was fortunate that his semi-final and final opponents were ranked 12th and 30th, respectively, and his title path was “easy.” As it happened, he was forced to work hard for both wins.

We now know that the title paths of the Cincinnati champions were relatively easy. But just how weak were they?

I calculate the difficulty of a path-to-the-title by determining the probability that the average Masters champion on that surface would beat the opponents that the champion faced. By using the “average Masters champion,” we are taking the skill level of the actual champ out of the equation, and looking only at the quality of his opposition. The resulting numbers vary wildly, from 2.5%–the odds that a typical Masters champion would have beaten the players that Jo-Wilfried Tsonga defeated to win the 2014 Canada Masters–to 61.2%–the chances that an average titlist would have beaten the players that confronted Nikolay Davydenko at the 2006 Paris Masters.

Novak’s number this week was 40.5%. In other words, an average hard-court Masters champion would have a four-in-ten shot at beating the five guys that fate threw in Djokovic’s path. That’s the 11th easiest Masters title since 1990:

Title Odds  Tournament       Winner             
61.2%       2006 Paris       Nikolay Davydenko  
50.5%       2012 Paris       David Ferrer       
49.8%       2000 Paris       Marat Safin        
48.3%       2004 Paris       Marat Safin        
47.0%       1999 Paris       Andre Agassi       
44.5%       2013 Shanghai    Novak Djokovic     
43.3%       2002 Madrid      Andre Agassi       
42.9%       2005 Paris       Tomas Berdych      
41.4%       2009 Canada      Andy Murray        
41.3%       2017 Paris       Jack Sock          
40.5%       2020 Cincinnati  Novak Djokovic     
39.6%       2011 Shanghai    Andy Murray        
39.1%       2019 Canada      Rafael Nadal       
37.9%       2008 Rome        Novak Djokovic     
36.2%       2007 Cincinnati  Roger Federer

Unless we’re prepared to put a permanent asterisk next to the Paris Masters, we should hold off on cheapening this year’s Cincinnati title. Surprisingly, Djokovic’s path was even easier at the 2013 Shanghai Masters. He had to face two top-ten opponents in the final rounds (Tsonga and Juan Martin del Potro), but Elo didn’t think that highly of them at the time.

Azarenka: asterisk squared

Evaluating the WTA title is trickier. Part of the problem is the small number of “Premier Mandatory” events, and the fact that two of them (Indian Wells and Miami) have substantially larger draws, and are thus that much harder to win. The even bigger issue is how to think about Azarenka’s final-round walkover.

Let’s start with the numbers. If we consider the five opponents that Vika defeated on court and calculate the odds that an average WTA Premier (not just Premier Mandatory) champion would beat them, her path-to-the-title number is 20.7%. If we add Osaka to the mix, on the theory that Azarenka should get credit for beating her, the resulting number is 7.4%.
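Since a path number is a product of per-match probabilities, the two figures together imply how the average Premier champion would fare against Osaka alone. The sketch below backs that out; it assumes the six-match number is simply the five-match number multiplied by one more match probability, and the function name is hypothetical.

```python
def implied_match_prob(path_without, path_with):
    """Per-match win probability implied by adding one opponent to a path.

    Assumes the matches are independent, so the longer path is the
    shorter path times the odds of winning the extra match.
    """
    return path_with / path_without


# 20.7% for the five opponents faced on court, 7.4% with Osaka added.
vs_osaka = implied_match_prob(path_without=0.207, path_with=0.074)
print(f"{vs_osaka:.1%}")
```

On those inputs, an average Premier titlist would have been roughly a one-in-three underdog in the final that never happened.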

Compared to the ATP numbers above, those sound pretty good. But the devil lies in the tournament-category details–the average WTA Premier event is much weaker than a marquee (dare I say “premier”?) tour stop like Cincinnati. Here’s how the Cinci title-paths stack up for the last dozen years:

Title Odds  Year  Winner             Note         
20.7%       2020  Victoria Azarenka  (W/O Osaka)  
7.4%        2020  Victoria Azarenka  (d. Osaka)   
7.3%        2016  Karolina Pliskova             
5.5%        2010  Kim Clijsters                 
5.5%        2012  Li Na                         
5.3%        2015  Serena Williams               
4.5%        2011  Maria Sharapova               
4.3%        2014  Serena Williams               
4.2%        2017  Garbine Muguruza              
3.9%        2019  Madison Keys                  
2.9%        2013  Victoria Azarenka             
2.0%        2009  Jelena Jankovic               
1.3%        2018  Kiki Bertens

20.7% is respectable for a run-of-the-mill Premier–in fact, Vika’s 2016 Brisbane title was almost exactly the same, at 20.8%. But Cincinnati reliably offers tougher competition. Even if we factor in the difficulty of beating Osaka, Azarenka’s path was (barely) the easiest at the event since the Premier-level designation came into being.

Yay, nay, meh

I’ll reiterate a main point from my last article about the US Open asterisk debate: There’s no simple yes or no answer when it comes to whether a title should “count.” (That’s assuming that you even think there are circumstances under which a title should be formally discounted.) Long before the COVID-19 pandemic messed with everything, there were titles–even at the grand slam level–that were a lot easier to win than others.

Djokovic’s championship falls squarely within the usual continuum, even if it will go down as one of his least challenging. Azarenka’s is tougher to define, but more because of Osaka’s withdrawal than because of the weakness of the field. The level of competition, despite missing many top players, was plenty good enough to offer Azarenka a path to the title that was comparable to at least one recent Cinci championship, and plenty of other top-tier events.

With that in mind, I’ll leave you with a couple of predictions. First: the US Open champions will face relatively easy paths to their titles, but like Djokovic’s, they will fall on the established continuum. And second: by the end of the fortnight, you’ll hope to never hear the word “asterisk” again.

US Open Asterisk Talk is Premature. It Might be Flat-Out Wrong.

Many high-profile players will be missing from the 2020 US Open. Rafael Nadal opted out of the abbreviated North American swing, and Roger Federer will miss the rest of the season due to injury. More than half of the WTA top ten is skipping Flushing Meadows as well. The thinned-out fields increase the odds that a few remaining favorites, such as Novak Djokovic and Serena Williams, add another major trophy to their collection.

As a result, pundits and fans are discussing whether the 2020 US Open deserves an “asterisk.” The idea is that, because of the depleted fields, this slam is worth less than others, so much so that the history books* should note the relative meaninglessness of this year’s titles.

* Nobody buys history books anymore, so we’re really talking** about a page on the US Open website, and a never-ending edit war on Wikipedia.

** Yes, I see the irony.

From what I’ve seen, people are thinking about this the wrong way. Yes, a weak field makes it easier–in theory–to win the tournament. It’s certainly true that the 2020 champions won’t have to go through Nadal or Ashleigh Barty to get their hardware. But the field isn’t what matters.

The field isn’t what matters

I repeated that on purpose, because it’s that important. The winner of a grand slam must get through seven matches. The difficulty of securing the title depends almost entirely on his or her opponents in those seven matches. Each main draw consists of 128 players, but 120 of them are mostly irrelevant.

I say “mostly” because I can foresee some objections. Sometimes a player can compete so hard in a loss that they weaken their opponent for the next round. Take the 2009 Madrid Masters, in which Nadal needed four hours to defeat Djokovic in the semi-final, then lost to Federer in the final. We could say that Djokovic’s presence was relevant, even though Federer won the title without playing him. That sort of thing happens, though probably not as much as you think. Even when it does, it needn’t be a top-tier player who wears out their opponent in an early round.

Another objection is that a depleted field affects seedings. For instance, Serena’s current WTA ranking is 9th, an unenviable position going into most slams. The 9th seed lines up for a fourth-round match with a top-eight player, meaning that she could face four top-eight players en route to the title. But with all the absences, Williams will instead be seeded third, behind only Karolina Pliskova and Sofia Kenin.

I’m not dismissing these concerns out of hand. They do matter a bit. But they only matter insofar as they affect the way the tournament plays out. The difference between the difficulties facing the 3rd and 9th seeds could be enormous … or it could be nothing, especially if the draw is riddled with early upsets.

Difficulty is a continuum

Even if you grant some credence to the objections above (or others that I haven’t mentioned), I hope you’ll agree that the most meaningful obstacles standing between a player and a grand slam title are the seven opponents he or she will need to overcome.

If those seven opponents are, on average, very strong, we would say that the player faced a particularly tough path to a slam title. Take Stan Wawrinka’s 2014 Australian Open title: he beat both Djokovic and Nadal at a time when those two were dominating the game. If the collective skill level of the seven opponents doesn’t amount to much–at least by grand slam standards–we’d say it was an easy path. For example, Federer clinched the 2006 Australian Open despite facing only a single player ranked in the top 20, and none in the top four.

We can quantify path difficulty in a variety of ways. One approach that will be useful here is to calculate the odds that an average slam champion would beat those seven opponents. The difference between easy and hard championships is enormous. The typical major titlist (that is, someone with an Elo rating around 2100) would have had a 3.3% chance of beating the seven men that Wawrinka drew in Melbourne the year that he won. Only two slam paths have ever been tougher: Mats Wilander’s routes to the 1982 and 1985 French Open titles. By contrast, the average slam champion would have had a 51% chance of going 7-0 when faced by Federer’s 2006 Australian Open draw.

The extreme “easy” draw is fifteen times easier than the extreme “hard” draw. Fifteen times! You can find plenty of champions for any approximate level of difficulty in between those extremes. The typical slam champ would’ve had a 10% chance of doing what Djokovic did in progressing through seven rounds at the 2011 US Open. Same in New York in 2012. Andy Murray’s 2016 Wimbledon path would have given the average champion a 20% chance. The 2018 Roland Garros draw was manageable for Rafael Nadal, and a typical major titlist would have had a 30% chance of securing those seven match wins.

None of this is to say that any of those players did or didn’t “deserve” their titles. Federer didn’t choose his 2006 Melbourne opponents any more than Wawrinka selected his foes eight years later. The trophy is the same, and in many important ways, their achievements are the same–both of the Swiss stars swept away all of their opponents, who in turn were the best performers (at least during those fortnights) of the players who showed up.

Asterisks for everybody

Here’s another thing 2006 Roger and 2014 Stan had in common: Almost all of the best players in the world participated in the tournaments that they ultimately won. (I say “almost” because defending champion Marat Safin was injured and missed the 2006 Aussie Open.) The “field” was effectively the same, but to win the titles, one player cruised through a two-week cakewalk and the other needed to put together one of the most impressive final weeks of the modern era.

Tennis fans have collectively decided that each major title counts as “one.” It doesn’t have to be that way: We could give more “slam points” for achievements like Wawrinka’s and grant fewer for the easy ones. Most people don’t like this idea, and I admit that it sounds a bit weird. I’m not advocating it for general use, though it is an interesting concept that I’ve pursued in a number of earlier articles, showing that Djokovic’s majors are–on average–more impressive than Nadal’s, which in turn have been tougher than Federer’s. Weighting majors by difficulty results in some changes in the order of the all-time grand slam list, ensuring that fans of all players hate me because I wrote some code and played with some spreadsheets.*

* With, I admit, malice aforethought.

Adjusting slam counts for difficulty is, in a sense, asterisking every slam title. The tricky draws get an acknowledgement of their difficulty, and the ones that opened up get tweaked to account for their ease. It’s a continuum, not a simple up-and-down decision between normal slams and abnormal slams.

The 2020 US Open champions will probably have title paths that sit in the easier half of that continuum. But even that modest claim is far from guaranteed.

Let’s say Venus Williams recaptures her vintage form and wins the title, beating 3rd seed Serena in the quarter-finals, 2nd seed Kenin in the semis, and top seed Pliskova in the title match. (It doesn’t matter if the surprise winner is Venus–it could be any lower-ranked player, though Venus seems more plausible than most.) An average slam champion would beat those three players in succession about 37% of the time. 37% is already lower odds than about 20% of women’s slam draws in the last 45 years. (Kenin’s Australian Open title rated 39%.)

37% for Venus’s hypothetical title isn’t even the whole story–four more rounds of journeywomen would knock the number down to around 26%–harder than one-third of women’s slam draws. Add in another tricky opponent or two–maybe Cori Gauff, or Petra Kvitova in the fourth round–and suddenly the path to the 2020 US Open women’s championship is just as hard as the typical slam.

It’s even easier to illustrate how the 2020 US Open men’s title could be as difficult as many other slams. By the numbers, simply upsetting Djokovic (simply! ha!) is more difficult than it was to defeat all seven of Federer’s opponents at the 2006 Australian Open. That’s right: Six withdrawals and one win over Novak wouldn’t be the easiest slam victory in the last 15 years. Tack on six actual wins, including a few against strong opponents, and the result is a seven-match path that stands up against the typical non-pandemic slam.

Ironically, the player who could win the title with the weakest possible draw is Djokovic. It would be odd to claim that any of Novak’s accomplishments should be asterisked, but it does make things much simpler when he doesn’t have to beat himself.

Masked competitiveness

Once again, the field doesn’t really matter. When we focus on the players who are in New York instead of the few dozen who aren’t, we see that the ingredients are in place for a couple of respectable paths to US Open titles. Wilander’s and Wawrinka’s marks are probably safe, but it’s more than possible that the winners will have faced competition equivalent to that of the average slam champ.

At the very least, we don’t know any better until the tail end of the second week. Until then, asterisk talk is premature. After that, it will probably be moot.

The Best Draw That Money Can Buy

Italian translation at settesei.it

Last week featured two events on the WTA calendar. First, both chronologically and by every conceivable ranking except for “most Hungarian,” was the Dubai Open, a Premier 5 event offering over $500,000 and 900 ranking points for the winner. The other was the Hungarian Open in Budapest, a WTA International tournament with $43,000 and 280 ranking points going to the champion. No top player would seriously consider going to Budapest, even before considering potential appearance fees and WTA incentives.

Fifteen of the top twenty ranked women went to Dubai, and the top seed in Budapest, defending champ Alison Van Uytvanck, was ranked 50th. Every Budapest entrant ranked in the top 72 got a top-eight seed, including a couple of players who would have needed to play qualifying just to earn a place in the Dubai main draw.

The rewards offered by the Dubai event and supported by the structure of the WTA tour make this an easy scheduling decision for many players. But at some point, if the rest of the field is zigging toward the Gulf, might it be better to zag toward Central Europe? Van Uytvanck would have been an underdog to reach even the third round of the richer event, yet she defended her title in Budapest. Marketa Vondrousova, who would have been stuck in Dubai qualifying, reached the Hungarian Open final. Opting for the smaller stage almost definitely proved the wise choice for those two women. Did other, better-ranked players leave money or ranking points on the table?

Motivations

Scheduling decisions depend on a lot of factors. Some women might prefer to play the event with the highest-quality field, both to test themselves against the best and to give themselves an opportunity for the circuit’s richest prizes. Others might head for the marquee events because of their doubles prowess: Timea Babos was part of the top-seeded doubles team in Dubai, but was the lowest-ranked direct entry in singles. Still others might choose to play closer to home or at tournaments they’ve enjoyed in the past.

For all that, ranking points should come first, with prize money also among the top considerations. Ranking points determine one’s ability to enter future events and to remain on tour. Prize money is needed to cover the vast expense of bankrolling a traveling support staff.

Dubai-versus-Budapest offers a fairly “pure” experiment, because both are played on similar surfaces and neither event is in the middle of a mini-circuit of events in a single region. Yes, Dubai immediately follows Doha, but that trip requires a flight, and most players headed back to Europe or North America after the tournament. Opting for one event over the other doesn’t substantially complicate anyone’s travel plans, like it would for an ATPer to mix and match destinations from the South American golden swing and the simultaneous European indoor circuit.

Revealed preferences

Let’s see which of the two main factors played a bigger role in scheduling decisions last week. To determine each player’s options, I tried to reconstruct as much as possible what information each woman had at her disposal six weeks earlier, on January 7th, when entry applications and stated preferences for Dubai and Budapest were due. I used the January 7th rankings to project how a player would be seeded at either event, and Elo ratings as of that date to forecast how far she would advance in each draw.

The major difficulty of this kind of simulation is the composition of the draws themselves. From our vantage point after the events, we know who opted for each draw as well as which players were unable to compete. In early January, none but the best-connected players would have known which of their peers would head in which direction, and no one at all could have known that Caroline Wozniacki would be a late withdrawal from Dubai, or that a viral illness would knock Kirsten Flipkens out of the Hungarian Open. Still, the resulting 2019 draws were very similar to what players could have predicted based on the player fields in 2018. So to simulate each player’s options, we’ll use the fields as they turned out to be.
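A forecast of this kind can be sketched as a small Monte Carlo simulation: draw the bracket at random, play each match with Elo-based odds, and count how often a given player wins each number of rounds. The version below is a simplified illustration, not the code behind the numbers in this post. It ignores seeding entirely (a real draw protects the seeds), and the player names and ratings are hypothetical.

```python
import random
from collections import Counter


def elo_win_prob(rating_a, rating_b):
    """Standard Elo expectation that player A beats player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def simulate_once(ratings, rng):
    """Play one randomly drawn single-elimination event.

    Returns a dict mapping each player to the number of matches won.
    Simplification: the bracket is a pure shuffle with no seeding.
    """
    players = list(ratings)
    if len(players) & (len(players) - 1) != 0:
        raise ValueError("field size must be a power of two")
    rng.shuffle(players)
    wins = {name: 0 for name in players}
    while len(players) > 1:
        next_round = []
        for i in range(0, len(players), 2):
            a, b = players[i], players[i + 1]
            a_wins = rng.random() < elo_win_prob(ratings[a], ratings[b])
            winner = a if a_wins else b
            wins[winner] += 1
            next_round.append(winner)
        players = next_round
    return wins


def round_distribution(ratings, player, n_sims=10000, seed=0):
    """Estimate the probability of each possible win total for one player."""
    rng = random.Random(seed)
    counts = Counter(simulate_once(ratings, rng)[player] for _ in range(n_sims))
    return {k: counts[k] / n_sims for k in sorted(counts)}


# Hypothetical eight-player field, for illustration only.
FIELD = {
    "Player A": 1950.0, "Player B": 1900.0, "Player C": 1850.0, "Player D": 1800.0,
    "Player E": 1750.0, "Player F": 1700.0, "Player G": 1650.0, "Player H": 1600.0,
}

print(round_distribution(FIELD, "Player A"))
```

The output is a distribution over win totals (0 wins is a first-round loss, the maximum is the title), which is exactly the input needed for an expected points or prize money calculation.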

Let’s start with Carla Suarez Navarro, the highest-ranked woman (at the January 7th entry deadline) who wasn’t seeded in Dubai. She ended up reaching the quarter-finals at the Premier event, in part because Kristina Mladenovic did her the favor of ousting Naomi Osaka from that section of the draw. For her efforts, Suarez Navarro grabbed 190 ranking points and almost $60,000. She would have needed to win the Budapest title to garner more points. And with a champion’s purse of “only” $43,000 in Hungary, she would have needed to rob a bank to improve on her Dubai prize money check.

However, that isn’t what Suarez Navarro should have anticipated taking home from Dubai. Sure, she should be optimistic about her own potential, but smart scheduling demands some degree of realism. I ran simulations of both the Dubai tournament (before the draw was made, so she doesn’t always end up in Osaka’s quarter) and the Budapest event with the Spaniard as the top seed and the rest of the field (minus last-in Arantxa Rus) unchanged. These forecasts suggest that Suarez Navarro only had a 12% chance of reaching the Dubai quarters, and that her expected ranking points in the Gulf were much lower:

Event     Points  Prize Money  
Dubai         76     $28.121   
Budapest     111     $15.384

(prize money in thousands of USD)

In all of these simulations, I’ve calculated points and prize money as weighted averages. Suarez Navarro had a 37% chance of a first-round loss, so that’s a 37% chance of one ranking point and first-round-loser prize money. And so on, for all of the possible outcomes at each event. For the Spaniard, her expected ranking points were nearly 50% higher as the top seed in Budapest. But because the Dubai prize pot is so much larger, her expected check was almost twice as big at the tournament she chose.
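The weighted average itself is a one-liner. In the sketch below, the 37% first-round exit, the one point for a first-round loss, and the 280 points for the title come from the text; every other probability and payoff is a made-up placeholder, so the printed result is not a reproduction of the figures in the table.

```python
def expected_value(outcome_probs, payoffs):
    """Probability-weighted average payoff across all finishing rounds."""
    total_prob = sum(outcome_probs.values())
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(prob * payoffs[rnd] for rnd, prob in outcome_probs.items())


# Chance of exiting in each round. Only the 0.37 is from the text;
# the rest are hypothetical.
exit_probs = {"R32": 0.37, "R16": 0.30, "QF": 0.18, "SF": 0.09, "F": 0.04, "W": 0.02}

# Points by finishing round. Only the 1 and the 280 are from the text;
# the intermediate values are illustrative.
points = {"R32": 1, "R16": 30, "QF": 60, "SF": 110, "F": 180, "W": 280}

print(f"expected points: {expected_value(exit_probs, points):.1f}")
```

Swapping in a prize money table for the points table gives the expected check by the same calculation.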

Consistent incentives

The total purse in Dubai was more than eleven times bigger than the prize money on offer in Hungary, while the points differed by only a factor of three. Thus, it’s no surprise that Suarez Navarro’s incentives are representative of those faced by many more women. I ran the same simulations for 26 more players: All of the competitors who gained direct entry into Dubai but were unseeded, plus Bernarda Pera, who would have been seeded in Budapest but instead played qualifying in the Gulf.

The following table shows each player’s expected points and prize money for Dubai (D-Pts and D-Prize), along with the corresponding figures for Budapest (B-Pts and B-Prize):

Player                    D-Pts   D-Prize   B-Pts   B-Prize   
Dominika Cibulkova           96   $36.794     130   $18.291   
Lesia Tsurenko               84   $31.528     119   $16.695   
Carla Suarez Navarro         76   $28.121     111   $15.384   
Aliaksandra Sasnovich        75   $27.920     111   $15.364   
Dayana Yastremska            72   $26.716     107   $14.803   
Anastasia Pavlyuchenkova     72   $26.590     106   $14.721   
Barbora Strycova             67   $24.809     102   $14.096   
Donna Vekic                  66   $24.143     100   $13.717   
Katerina Siniakova           63   $23.157      95   $13.062   
Ekaterina Makarova           58   $21.543      90   $12.265   
                                                              
Player                    D-Pts   D-Prize   B-Pts   B-Prize   
Petra Martic                 57   $21.019      88   $11.960   
Su Wei Hsieh                 54   $19.863      84   $11.396   
Belinda Bencic               53   $19.813      84   $11.372   
Ajla Tomljanovic             53   $19.530      82   $11.181   
Shuai Zhang                  49   $18.350      77   $10.416   
Sofia Kenin                  46   $17.109      72    $9.659   
Ons Jabeur                   45   $17.077      71    $9.624   
Viktoria Kuzmova             45   $17.009      70    $9.432   
Alize Cornet                 44   $16.823      69    $9.280   
Saisai Zheng                 40   $15.436      62    $8.307   
                                                              
Player                    D-Pts   D-Prize   B-Pts   B-Prize   
Vera Lapko                   37   $14.618      57    $7.695   
Mihaela Buzarnescu           36   $14.465      56    $7.548   
Alison Riske                 35   $14.309      55    $7.445   
Kristina Mladenovic          34   $13.910      51    $6.969   
Timea Babos                  32   $13.354      48    $6.572   
Yulia Putintseva             32   $13.407      48    $6.484   
Bernarda Pera*               25   $11.830      36    $5.061

Every single player could have expected more points in Budapest and more money in Dubai. The ratios are all similar to Suarez Navarro’s. The one possible exception is Pera (hence the asterisk). My simulation assumed she came through qualifying to make the main draw, and calculated only her expected points and prize money from main draw matches. Yet simply qualifying for the main draw is worth 30 ranking points, plus whatever points a player earns by winning main draw matches. Pera was no lock to qualify, but she was favored, and usually a couple of lucky loser spots make the main draw even more achievable. It’s possible that if we ran all those scenarios, Pera would be the one player for whom Dubai offered better hopes of prize money and points.

Loss aversion and game theory

It’s no accident that Van Uytvanck was one of the few players to choose the high-points, low-prize money route. She was defending 280 points from last year’s Hungarian Open, meaning that opting for a bigger check in Dubai would have a negative impact on her ranking. The thought of losing a couple hundred ranking points has a greater influence on behavior than the chance of gaining the same amount for a player who has few to defend.

For the majority of women who will face the same decision in 2020 without many points to defend, what should they do? Assuming, as I do, that they and their coaches will all carefully study this article, what happens if more top-70 players decide to chase ranking points and flock to the smaller event?

If the Budapest field gets stronger, each entrant’s expected points and prize money will decrease; if Dubai’s field weakens, each player there can anticipate a better chance of more points and even more money. As the entry system is currently structured, in which each player must state their preferences without knowledge of their peers’ choices, we can’t count on reaching an equilibrium. Even if every single player aimed solely to maximize ranking points, there wouldn’t be enough information available to reliably make the right choice. It’s conceivable, though unlikely, that Budapest could attract a stronger field and end up offering lower expected prize money checks and ranking points.

But don’t fret, dear readers and schedule optimizers. There are external factors and there always will be. And in this case, virtually all of those factors pull players to the bigger money event. (Even Hungarian heroine Babos skipped her home tournament.) At least a half-dozen of the players listed above are doubles elites, making it likely they’ll choose the Premier event. Others–probably many others–will go where the money is, because they like money.

Even those who don’t play doubles and don’t like money will chase the biggest available pot of ranking points, not entirely unlike the way people play the lottery. The WTA offers a very limited set of opportunities to earn 900 points in a single week. You can get close to 900 points with three International championships, but there’s a finite number of weeks on the annual schedule–not to mention a limited number of matches in each player’s body! Lots of people stock up on lottery tickets despite unfavorable odds, and players will continue to enter higher-profile events even if their expected points are higher on smaller stages. The chance of a prestigious title, however slim, doesn’t show up in a purely actuarial calculation.

The success of Belinda Bencic–expected Dubai points, 53; expected Budapest points, 84; actual Dubai points, 900–will keep players chasing the big prizes. That’s good news for level-headed would-be optimizers. Those players willing to forgo the skyscrapers, the shopping malls, and the prize money next year aren’t about to lose this opportunity. Budapest will almost certainly remain a better option for players who want to improve their ranking.

Quantifying Cakewalks, or The Time Rafa Finally Got Lucky

Italian translation at settesei.it

During this year’s US Open, much has been made of some rather patchy sections of the draw. Many great players are sitting out the tournament with injury, and plenty of others crashed out early. Pablo Carreno Busta reached the quarterfinals by defeating four straight qualifiers, and Rafael Nadal could conceivably win the title without beating a single top-20 player.

None of this is a reflection on the players themselves: They can play only the draw they’re dealt, and we’ll never know how they would’ve handled a more challenging array of opponents. The weakness of the draw, however, could affect how we remember this tournament.  If we are going to let the quality of the field color our memories, we should at least try to put this year’s players in context to see how they compare with majors in the past.

How to measure draw paths

There are lots of ways to quantify draw quality. (There’s an entire category on this blog devoted to it.) Since we’re interested in the specific sets of opponents faced by our remaining contenders, we need a metric that focuses on those. It doesn’t really matter that, say, Nick Kyrgios was in the draw, since none of the semifinalists had to play him.

Instead of draw difficulty, what we’re after is what I’ll call path ease. It’s a straightforward enough concept: How hard is it to beat the specific set of guys that Rafa (for instance) had to play?

To get a number, we’ll need a few things: The surface-weighted Elo ratings of each one of a player’s opponents, along with a sort of “reference Elo” for an average major semifinalist. (Or finalist, or title winner.) To determine the ease of Nadal’s path so far, we don’t want to use Nadal’s Elo. If we did that, the exact same path would look easier or harder depending on the quality of the player who faced it.

(The exact value of the “reference Elo” isn’t that important, but for those of you interested in the numbers: I found the average Elo rating of every slam semifinalist, finalist, and winner back to 1988 on each of the three major surfaces. On hard courts, those numbers are 2145, 2198, and 2233, respectively. When measuring the difficulty of a path to the semifinal round, I used the first of those numbers; for the difficulty of a path to the title, I used the last.)

To measure path ease, then, we answer the question: What are the odds that an average slam semifinalist (for instance) would beat this particular set of players? In Rafa’s case, he has yet to face a player with a weighted-hard-court Elo rating above 1900, and the typical 2145-rated semifinalist would beat those five players 71.5% of the time. That’s a bit easier than Kevin Anderson’s path to the semis, but a bit harder than Carreno Busta’s. Juan Martin del Potro, on the other hand, is in a different world altogether. Here are the path ease numbers for all four semifinalists, showing the likelihood that average contenders in each round would advance, giving the difficulty of the draws each player has faced:

Semifinalist   Semi Path  Final Path  Title Path  
Nadal              71.5%       49.7%       51.4%  
del Potro           9.1%        7.5%       10.0%  
Anderson           69.1%       68.9%       47.1%  
Carreno Busta      74.3%       71.2%       48.4%

(We don’t yet know each player’s path to the title, so I averaged the Elos of possible opponents. Anderson and Carreno Busta are very close, so for Rafa and Delpo, their potential final opponent doesn’t make much difference.)

There’s one quirk with this metric that you might have noticed: For Nadal and del Potro, their difficulty of reaching the final is greater than that of winning the title altogether! Obviously that doesn’t make logical sense–the numbers work out that way because of the “reference Elos” I’m using. The average slam winner is better than the average slam finalist, so the table is really saying that it’s easier for the average slam winner to beat Rafa’s seven opponents than it would be for the average slam finalist to get past his first six opponents. This metric works best when comparing title paths to title paths, or semifinal paths to semifinal paths, which is what we’ll do for the rest of this post.
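For readers who want to see the mechanics, here is a minimal sketch of a path ease calculation. It assumes the standard Elo win-probability formula (400-point logistic scale), which the post doesn’t spell out, and the opponent ratings are invented for illustration, not the real 2017 US Open values. Only the three reference ratings come from the text.

```python
def elo_win_prob(rating, opponent):
    """Standard Elo expectation that a player rated `rating` beats `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def path_ease(reference_elo, opponent_elos):
    """Chance that a player rated `reference_elo` beats every opponent in turn."""
    ease = 1.0
    for opp in opponent_elos:
        ease *= elo_win_prob(reference_elo, opp)
    return ease


# Hard-court reference ratings from the text.
SEMIFINALIST, FINALIST, WINNER = 2145, 2198, 2233

# Hypothetical surface-weighted Elo ratings for five opponents.
hypothetical_path = [1650, 1700, 1750, 1800, 1850]

for label, ref in (("semifinalist", SEMIFINALIST),
                   ("finalist", FINALIST),
                   ("winner", WINNER)):
    print(f"{label:>12} reference: {path_ease(ref, hypothetical_path):.1%}")
```

The same set of opponents looks easier as the reference rating rises, which is the source of the quirk described above: a title path is graded against a stronger yardstick than a final path.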

Caveats and quirks aside, it’s striking just how easy three of the semifinal paths have been compared to del Potro’s much more arduous route. Even if we discount the difficulty of beating Roger Federer–Elo thinks he’s the best active player on hard courts but doesn’t know about his health issues–Delpo’s path is wildly different from those of his semifinal and possible final opponents.

Cakewalks in context

Semifinalist path eases of 69% or higher–that is, easier–are extremely rare. In fact, the paths of Anderson, Carreno Busta, and Nadal are all among the ten easiest in the last thirty years! Here are the previous top ten:

Year  Slam             Semifinalist               Path Ease  
1989  Australian Open  Thomas Muster                  84.1%  
1989  Australian Open  Miloslav Mecir                 74.2%  
1990  Australian Open  Ivan Lendl                     73.8%  
2006  Roland Garros    Ivan Ljubicic                  73.7%  
1988  Australian Open  Ivan Lendl                     72.2%  
1988  Australian Open  Pat Cash                       70.1%  
2004  Australian Open  Juan Carlos Ferrero            69.2%  
1996  US Open          Michael Chang                  68.8%  
1990  Roland Garros    Andres Gomez                   68.4%  
1996  Australian Open  Michael Chang                  66.2%

In the last decade, the easiest path to the semifinal was Stan Wawrinka’s route to the 2016 French Open final four, which rated 59.8%. As we’ll see further on, Wawrinka’s draw got a lot more difficult after that.

Del Potro’s draw so far isn’t quite as extreme, but it is quite difficult in the historical context. Of the nearly 500 major semifinalists since 1988, all but 15 had an easier path than his 9.1% path ease. Here are the ten hardest, all of whom faced draws that would have given the average slam semifinalist less than an 8% chance of getting that far:

Year  Slam             Semifinalist              Path Ease  
2009  Roland Garros    Robin Soderling                1.6%  
1988  Roland Garros    Jonas Svensson                 1.9%  
2017  Wimbledon        Tomas Berdych                  3.7%  
1996  Wimbledon        Richard Krajicek               6.4%  
2011  Wimbledon        Jo Wilfried Tsonga             6.6%  
2012  US Open          Tomas Berdych                  6.8%  
2017  Roland Garros    Dominic Thiem                  6.9%  
2014  Australian Open  Stan Wawrinka                  7.0%  
1989  Roland Garros    Michael Chang                  7.1%  
2017  Wimbledon        Sam Querrey                    7.5%

Previewing the history books

In the long term, we’ll care a lot more about how the 2017 US Open champion won the title than how he made it through the first five rounds. As we saw above, three of the four semifinalists have a path ease of around 50% to win the title–again, meaning that a typical slam winner would have a roughly 50/50 chance of getting past this particular set of seven opponents.

No major winner in recent memory has had it so easy. Nadal’s path would rate first in the last thirty years, while Carreno Busta’s or Anderson’s would rate in the top five. (If it comes to that, their exact numbers will depend on who they face in the final.) Here is the list that those three men have the chance to disrupt:

Year  Slam             Winner                  Path Ease  
2002  Australian Open  Thomas Johansson            48.1%  
2001  Australian Open  Andre Agassi                47.6%  
1999  Roland Garros    Andre Agassi                45.6%  
2000  Wimbledon        Pete Sampras                45.3%  
2006  Australian Open  Roger Federer               44.5%  
1997  Australian Open  Pete Sampras                44.4%  
2003  Australian Open  Andre Agassi                43.9%  
1999  US Open          Andre Agassi                41.5%  
2002  Wimbledon        Lleyton Hewitt              39.9%  
1998  Wimbledon        Pete Sampras                39.1%

At the 2006 Australian Open, Federer lucked into a path that was nearly as easy as Rafa’s this year. His 2003 Wimbledon title just missed the top ten as well. By comparison, Novak Djokovic has never won a major with a path ease greater than 18.7%–harder than that faced by more than half of major winners.

Nadal has hardly had it easy as he has racked up his 15 grand slams, either. Here are the top ten most difficult title paths:

Year  Slam             Winner                Path Ease  
2014  Australian Open  Stan Wawrinka              2.2%  
2015  Roland Garros    Stan Wawrinka              3.1%  
2016  US Open          Stan Wawrinka              3.2%  
2013  Roland Garros    Rafael Nadal               4.4%  
2014  Roland Garros    Rafael Nadal               4.7%  
1989  Roland Garros    Michael Chang              5.0%  
2012  Roland Garros    Rafael Nadal               5.2%  
2016  Australian Open  Novak Djokovic             5.4%  
2009  US Open          J.M. Del Potro             5.9%  
1990  Wimbledon        Stefan Edberg              6.2%

As I hinted in the title of this post, while Nadal got lucky in New York this year, it hasn’t always been that way. He appears three times on this list, facing greater challenges than any major winner other than Wawrinka the giant-killer.

On average, Rafa’s grand slam title paths haven’t been quite as harrowing as Djokovic’s, but compared to most other greats of the last few decades, he has worked hard for his titles. Here are the average path eases of players with at least three majors since 1988:

Player           Majors        Avg Path Ease  
Stan Wawrinka         3                 2.8%  
Novak Djokovic       12                11.3%  
Rafael Nadal         15                13.6%  
Stefan Edberg         4                14.6%  
Andy Murray           3                18.8%  
Boris Becker          4                18.8%  
Mats Wilander         3                19.8%  
Gustavo Kuerten       3                22.0%  
Roger Federer        19                23.5%  
Jim Courier           4                26.4%  
Pete Sampras         14                28.9%  
Andre Agassi          8                32.3%

If Rafa adds to his grand slam haul this weekend, his average path ease will take a bit of a hit. Still, he’ll only move one place down the list, behind Stefan Edberg. After more than a decade of battling all-time greats in the late rounds of majors, it’s fair to say that Nadal deserved this cakewalk.


Update: This post reads a bit differently than when I first wrote it: I’ve changed the references to “path difficulty” to “path ease” to make it clearer what the metric is showing.

Nadal and Anderson advanced to the final, so we can now determine the exact path ease number for whichever one of them wins the title. Rafa’s exact number remains 51.4%, and should he win, his career average across 16 slams will increase to about 15%. Anderson’s path ease to the title is “only” 41.3%, which would be good for ninth on the list shown above, and just barely second easiest of the last 30 US Opens.

Putting the Antalya Draw Into Perspective

This is a guest post by Peter Wetz.

When the pre-Wimbledon grass court tournament in Antalya was announced by the ATP in May 2016, some people were scratching their heads: Which top players will be willing to play in Antalya, Turkey one week ahead of Wimbledon? Even more so, because one week earlier two events are played in London and Halle, the latter being considerably closer to London. If a player wanted to participate in Antalya, he would have to fly from Halle (or London) to Antalya and then back to London for Wimbledon, not an ideal itinerary.

A glance at the entry list confirms the doubts: After Dominic Thiem, the only top 10 player entered in the event, there were just three other men (Paolo Lorenzi, Viktor Troicki and Fernando Verdasco) ranked within the top 40. Only three (Thiem, Verdasco, and Lorenzi) of the 28 players who were directly accepted to the main draw of the event will be seeded at Wimbledon.

But how weak is the field really compared to others? Of course there are countless ways to measure the strength of a draw, but for a quick and dirty approach we will simply look at two measures, that is, the last direct acceptance (LDA) and the mean rank of quarterfinalists.

The LDA is the rank of the last player who gained direct entrance into a tournament’s main draw excluding lucky losers, qualifiers and special exempts. When we compare the last direct acceptance of the Antalya draw (86, Radu Albot) to all other ATP Tour level events with a draw size of 32 or 28 players, Antalya turns out to be at the 39th percentile. This means that 39% of the other tournaments have a better/lower (or equal) LDA and that 61% have a worse/higher LDA. The following image shows a percentile plot of LDAs of tournaments since 2012, highlighting this week’s event in Antalya:

The fact that the LDA compares well against the other tournaments tells us that despite the lack of top ranked seeds, the field seems to be more dense at the bottom. Not that bad after all?
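As a rough illustration of how such a percentile can be computed, the sketch below counts the share of comparison events whose LDA is lower than or equal to a given value. The function name and the ten sample LDAs are hypothetical; only Antalya’s LDA of 86 comes from the text, so the printed share will not match the 39% quoted above.

```python
def share_better_or_equal(value, others):
    """Fraction of `others` that are lower than or equal to `value`.

    For rankings, lower is better, so this is the share of tournaments
    with a better-or-equal last direct acceptance.
    """
    if not others:
        raise ValueError("need at least one comparison tournament")
    return sum(1 for x in others if x <= value) / len(others)


# Hypothetical LDAs for ten comparison events (not real data).
sample_ldas = [62, 71, 78, 84, 86, 90, 95, 101, 110, 124]
antalya_lda = 86  # from the text

share = share_better_or_equal(antalya_lda, sample_ldas)
print(f"{share:.0%} of the sample events have a better-or-equal LDA")
```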

Let us take a look at the mean rank of the eight players who made it into the quarterfinals. Choosing quarterfinalists limits the calculation to the players who were able to perform well at the event, winning at least one, and usually two, matches. This should reduce some of the noise in the data that would be otherwise included due to lucky first round wins.

The mean rank of the quarterfinalists at the Antalya Open 2017 is 109. Out of the 726 tournaments since 2000 with 32 or 28 player draws which were considered in this analysis, only 35 tournaments had a higher mean rank of players at the quarterfinal stage. With nine out of those 35 tournaments, the Hall of Fame Tennis Championships at Newport–which takes place each year after Wimbledon–stands out from the pack. As the following plot shows, the Antalya Open is at the 95th percentile in this category. This seems to be more aligned with what we would have expected.
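The 95th-percentile figure follows directly from the two counts quoted above. A quick check, using only those counts (the variable names are ours):

```python
total_tournaments = 726  # events with 32 or 28 player draws since 2000
worse_than_antalya = 35  # events with a higher mean quarterfinalist rank

# Antalya's position when sorting from worst to best mean rank.
position_from_worst = worse_than_antalya + 1

# Share of events whose quarterfinalists had a mean rank no worse than Antalya's.
percentile = 1 - worse_than_antalya / total_tournaments

print(f"position from worst: {position_from_worst}")
print(f"percentile: {percentile:.1%}")
```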

To provide some context, the following table lists the 10 tournaments with the worst mean rank of quarterfinalists, followed by Antalya.

#   Tournament            Mean QF Rank
1   Newport '10           240
2   Newport '01           197
3   Delray Beach '16      191
4   Moscow '13            166
5   Newport '11           166
6   Newport '07           165
7   's-Hertogenbosch '09  164
8   Newport '08           163
9   Gstaad '14            156
10  Amsterdam '01         152
...
36  Antalya '17           109

The seeds are to blame for this: Of the eight seeds, only Verdasco managed to win a match. The other seven went winless. We have to go back as far as 1983’s Tel Aviv tournament to find a draw where only one seed won a match. In Tel Aviv, however, the third seed Colin Dowdeswell won three matches all in all, whereas Fernando Verdasco crashed out in the second round. By the way, Tel Aviv 1983 marks the first title of Aaron Krickstein, then 16 years and 2 months old and still the youngest player to win a singles title on the ATP Tour. It happens about once per year that only two out of eight seeds win their first match. The last time was at the 2016 Brasil Open, where only Pablo Cuevas and Federico Delbonis won matches as seeds.

Despite the presence of only one top 30 player in this year’s Antalya draw, the middle and bottom of the field looked surprisingly solid, as we saw when considering the last direct acceptance. However, if we take into account the development of the tournament and calculate the mean rank of quarterfinalists, it becomes clear that the field got progressively weaker. Still, there have been worse draws in the past and there will doubtless be worse draws in future. Maybe even in the not too distant future, if we take a glance at this year’s Newport entry list.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria.

Diego Schwartzman’s Return Game Is Even Better Than I Thought

Click for an Italian translation

Diego Schwartzman is one of the most unusual players on the ATP tour. Even shorter than David Ferrer, his serve will never be a weapon, so the only way he can compete is by neutralizing everyone else’s offerings and winning baseline battles. Up to No. 34 in this week’s official rankings and No. 35 on the Elo list, he’s proven he can do that against some very good players.

Using the ATP stats leaderboard at Tennis Abstract, we can get a quick sense of how his return game compares with the elites. At tour level in the last 52 weeks (through Monte Carlo), he ranks third with 42.3% return points won, behind only Andy Murray and Novak Djokovic. He is particularly effective against second serves, winning 56.6% of those, better than anyone else on tour. He has broken in 31.8% of his return games, another third-place showing, this time behind Murray and Rafael Nadal.

Yet the leaderboard warns us to tread carefully. In the last year, Murray’s opponents have been far superior to Schwartzman’s, with a median rank of 24 and a mean rank of 41.5. The Argentine’s opponents have rated at 45.5 and 54.8, respectively. Murray, Djokovic, and Nadal are far better all-around players than Schwartzman, so they regularly reach later rounds, where the quality of competition goes way up.

Competition quality is one of the knottiest aspects of tennis analytics, and it is far from being solved. If we want to compare Murray to Djokovic, competition quality isn’t such a big factor. One or the other might get lucky over a span of months, but in the long run, the two best players on tour will face roughly equivalent levels of competition. But when we expand our view to players like Schwartzman–or even a top-tenner such as Dominic Thiem–we can no longer assume that opponent quality will even out. To use a term from other sports, the ATP has a very unbalanced schedule, and the schedule is always more challenging for the best players.

Correcting for competition quality is also key to understanding how any particular player evolves over time. If a player’s results improve, he’ll usually start facing more challenging competition, as Schwartzman is doing this spring in his first shot at the full slate of clay-court Masters events. If his return numbers decline, is he actually playing worse, or is he simply competing at his past level against tougher opponents?

Adjusting for competition

To properly compare players, we need to identify similarities in their schedules. Any pair of tour regulars have played many of the same opponents, even if they’ve never played each other. For instance, since the beginning of last season, Murray and Djokovic have faced 18 of the same players–some more than once. Further down the ranking list, players tend to have fewer opponents in common, but as we’ll see, that’s an obstacle we can overcome.

Here’s how the adjustment works: For a pair of players, find all the opponents both men have faced on the same surface. For example, both Murray and Djokovic have played David Goffin on clay in the last 16 months. Murray won 53.7% of clay return points against the Belgian, while Djokovic won only 42.1%, meaning that Djokovic returned about 22% worse than Murray did. We repeat the process for every surface-player combination, weight the results so that longer matches (or larger numbers of matches) count more heavily, and find the average.
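Here is a minimal sketch of that pairwise step. The data layout, the function name, and the choice to weight each shared opponent by the smaller of the two return-point counts are my assumptions for illustration; the description above only says that larger samples count more. The point counts in the example are invented to reproduce the Goffin percentages.

```python
def relative_return(stats_a, stats_b):
    """Player B's return-point rate relative to player A's, over shared opponents.

    Each stats dict maps (opponent, surface) to a tuple of
    (return_points_won, return_points_played). Only opponent-surface pairs
    present in both dicts are used.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for key in stats_a.keys() & stats_b.keys():
        won_a, played_a = stats_a[key]
        won_b, played_b = stats_b[key]
        if won_a == 0 or played_a == 0 or played_b == 0:
            continue  # a ratio is undefined without points on both sides
        ratio = (won_b / played_b) / (won_a / played_a)
        weight = min(played_a, played_b)  # assumed weighting scheme
        weighted_sum += ratio * weight
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no usable common opponents")
    return weighted_sum / total_weight


# Invented point counts chosen to match the percentages in the text.
murray = {("David Goffin", "clay"): (537, 1000),
          ("Hypothetical Player", "hard"): (40, 100)}   # not shared, ignored
djokovic = {("David Goffin", "clay"): (421, 1000)}

rel = relative_return(murray, djokovic)
print(f"Djokovic relative to Murray: {rel:.3f}")
```

With only the Goffin matchup in common, the result is simply 42.1% divided by 53.7%, the "about 22% worse" figure.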

When we do that for the top two men, we find that Djokovic has returned 2.3% better. (That’s a percentage, not percentage points. A great returner wins about 40% of return points, and a 2.3% improvement on that is roughly 41%.) Our finding suggests that Murray has faced somewhat weaker-serving competition: Since the beginning of 2016, he has won 42.9% of return points, compared to Djokovic’s 43.3%–a smaller gap than the competition-adjusted one.

It takes more work to reliably compare someone like Schwartzman to the elites, since their schedules overlap so much less. So before adjusting Diego’s return numbers, we’ll take several intermediate steps. Let’s start with the world No. 3 Stanislas Wawrinka. We follow the above process twice: Once for Wawrinka and Murray, then again for Stan and Novak. Run the numbers, and we find that Wawrinka’s return game is 12.5% weaker than Murray’s and 14.3% weaker than Djokovic’s. Wawrinka’s rates relative to the other two players correspond very well with what we already found, suggesting that Djokovic is a little better than his rival. Weighting the two numbers by sample size–which, in this case, is almost identical–we slightly adjust those two comparisons and conclude that Wawrinka’s return game is 12.4% worse than Murray’s.

Generating competition-adjusted numbers for each subsequent player follows the same pattern. For No. 4 Federer, we run the algorithm three times, once for each of the players ranked above him, then we aggregate the results. For No. 34 Schwartzman, we go through the process 33 times. Thanks to the magic of computers, it takes only a few seconds to adjust 16 months’ worth of return stats for the ATP top 50.
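The aggregation step can be sketched as a weighted average of the ratings implied by each pairwise comparison. This is an assumption about how the pieces fit together, not a transcript of my actual code. The function name is hypothetical, and apart from Murray’s baseline of 1.0 and Djokovic’s 2.3% edge, the numbers are made up.

```python
def chained_rating(comparisons):
    """Combine pairwise comparisons into one rating on the baseline scale.

    `comparisons` is a list of tuples:
        (rating_of_higher_player, ratio_vs_that_player, weight)
    Each tuple implies an estimate of rating_of_higher_player * ratio;
    the result is the weight-averaged estimate.
    """
    total_weight = sum(weight for _, _, weight in comparisons)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    implied = sum(rating * ratio * weight for rating, ratio, weight in comparisons)
    return implied / total_weight


# Murray is the baseline (1.000); Djokovic is 2.3% better (1.023).
# A hypothetical player returns 10% worse than Murray and 12% worse than
# Djokovic, with equal sample sizes for the two comparisons.
hypothetical_player = chained_rating([
    (1.000, 0.90, 400),
    (1.023, 0.88, 400),
])
print(f"rating relative to Murray: {hypothetical_player:.3f}")
```

When the two implied estimates agree this closely, as they do here, it suggests the pairwise comparisons are telling a consistent story.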

Below are the results for 2016-17. Players are ranked by “relative return points won” (REL RPW), where a rating of 1.0 is arbitrarily given to Murray, and a rating of 0.98 means that a player wins 2% fewer return points than Murray against equivalent opposition. The “EX RPW” column puts those numbers in a more familiar context: The top-ranked player’s rating is set equal to 43.0%–approximately the best RPW of any player in the last few seasons–and everyone else’s is adjusted accordingly.  The last two columns show each player’s actual rate of return points won and their rank among the ATP top 50:

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK  
1     Diego Schwartzman         1.04   43.0%   42.4%     4  
2     Novak Djokovic            1.02   42.1%   43.3%     1  
3     Andy Murray               1.00   41.2%   42.9%     2  
4     Rafael Nadal              0.98   40.3%   42.6%     3  
5     David Goffin              0.97   40.1%   41.3%     5  
6     Gilles Simon              0.96   39.6%   40.1%     9  
7     Kei Nishikori             0.95   39.3%   40.1%    10  
8     David Ferrer              0.95   39.1%   40.6%     7  
9     Roger Federer             0.94   38.7%   38.7%    15  
10    Gael Monfils              0.93   38.5%   39.8%    11  


RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
11    Roberto Bautista Agut     0.93   38.3%   40.3%     8  
12    Ryan Harrison             0.92   37.9%   36.7%    33  
13    Richard Gasquet           0.92   37.9%   40.8%     6  
14    Daniel Evans              0.91   37.6%   36.9%    27  
15    Juan Martin Del Potro     0.91   37.5%   36.8%    32  
16    Benoit Paire              0.90   37.0%   38.1%    19  
17    Mischa Zverev             0.90   36.9%   36.9%    28  
18    Grigor Dimitrov           0.89   36.4%   38.2%    18  
19    Fabio Fognini             0.88   36.4%   39.7%    12  
20    Fernando Verdasco         0.88   36.4%   38.3%    16  

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
21    Joao Sousa                0.88   36.2%   38.3%    17  
22    Dominic Thiem             0.88   36.2%   38.1%    20  
23    Stan Wawrinka             0.88   36.1%   37.5%    22  
24    Alexander Zverev          0.88   36.0%   37.5%    23  
25    Albert Ramos              0.87   35.9%   38.9%    14  
26    Kyle Edmund               0.86   35.5%   36.1%    37  
27    Jack Sock                 0.86   35.5%   36.6%    34  
28    Viktor Troicki            0.86   35.4%   37.1%    26  
29    Marin Cilic               0.86   35.4%   37.3%    25  
30    Pablo Carreno Busta       0.86   35.3%   39.4%    13  

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
31    Milos Raonic              0.86   35.2%   36.1%    38  
32    Pablo Cuevas              0.85   35.1%   36.9%    29  
33    Tomas Berdych             0.85   35.1%   36.9%    30  
34    Borna Coric               0.85   34.9%   36.1%    39  
35    Nick Kyrgios              0.85   34.9%   35.7%    41  
36    Philipp Kohlschreiber     0.84   34.7%   37.9%    21  
37    Jo Wilfried Tsonga        0.84   34.6%   36.2%    36  
38    Sam Querrey               0.83   34.3%   34.6%    44  
39    Lucas Pouille             0.82   33.9%   36.9%    31  
40    Feliciano Lopez           0.81   33.2%   35.2%    43  

RANK  PLAYER                 REL RPW  EX RPW  ACTUAL  RANK
41    Robin Haase               0.80   33.0%   36.1%    40  
42    Paolo Lorenzi             0.80   32.9%   37.5%    24  
43    Donald Young              0.78   32.2%   36.3%    35  
44    Bernard Tomic             0.78   32.1%   34.1%    45  
45    Nicolas Mahut             0.76   31.4%   35.4%    42  
46    Steve Johnson             0.75   31.0%   33.8%    46  
47    Florian Mayer             0.74   30.3%   33.5%    47  
48    John Isner                0.73   30.0%   29.8%    49  
49    Gilles Muller             0.72   29.8%   32.4%    48  
50    Ivo Karlovic              0.63   25.9%   26.4%    50
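The EX RPW column is just a rescaling of REL RPW so that the top-rated player lands on 43.0%. A small sketch of that conversion follows, with a hypothetical function name and the two-decimal REL RPW values from the table as inputs. Because those inputs are rounded, the results can differ from the published column by a few tenths of a percentage point.

```python
def expected_rpw(rel_rpw, top_rel_rpw, top_rate=0.430):
    """Rescale a relative rating so the top-rated player gets `top_rate`."""
    return top_rate * rel_rpw / top_rel_rpw


# Rounded REL RPW values taken from the table above.
rel = {
    "Diego Schwartzman": 1.04,
    "Novak Djokovic": 1.02,
    "Andy Murray": 1.00,
    "Ivo Karlovic": 0.63,
}
top = max(rel.values())

for player, rating in rel.items():
    print(f"{player:<20} {expected_rpw(rating, top):.1%}")
```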

The big surprise: Schwartzman is number one! While the average ranking of his opponents was considerably lower than that of the elites, it appears that he has faced bigger-serving opponents than have Murray or Djokovic. The top five on this list–Schwartzman, Djokovic, Murray, Nadal, and Goffin–do not force any major re-evaluation of who we consider to be the game’s best returners, but the competition-adjusted metric does offer more evidence that Schwartzman really belongs there.

There is a similar predictability at the bottom of the list. The five players rated the worst by the competition-adjusted metric–Steve Johnson, Florian Mayer, John Isner, Gilles Muller, and Ivo Karlovic–are the same five who sit at the bottom of the actual RPW ranking, with only Isner and Muller swapping places. This degree of consistency at the top and bottom of the list is reassuring: The metric is correcting for something important, but it isn’t spitting out any truly crazy results.

There are, however, some surprises. Three players do very well when their return games are adjusted for competition: Ryan Harrison, Daniel Evans, and Juan Martin del Potro, all of whom jump from the bottom half to the top 15. In a sense, this is a surface adjustment for Harrison and Evans, both of whom have played almost exclusively on hard courts. Players win fewer return points on faster surfaces (and faster surfaces attract bigger-serving competitors, magnifying the effect), so when adjusted for competition, someone who plays only on hard courts will see his numbers improve. Del Potro, on the other hand, has been absolutely hammered by tough competition, so in his case the correction is giving him credit for the difficult opponents he has had to face.

Several clay court specialists find their return stats adjusted in the wrong direction. Last week’s finalist, Albert Ramos, falls from 14th to 25th, Pablo Carreno Busta drops from 13th to 30th, and Roberto Bautista Agut and Paolo Lorenzi see their numbers take a hit as well. This is the reverse of the effect that pushed Harrison and Evans up the list: Clay-court specialists spend more time on the dirt and they play against weaker-serving opponents, so their season averages make them look like better returners than they really are. It appears that these players are all particularly bad on hard courts: When I ran the algorithm with only clay-court results, Bautista Agut, Ramos, and Carreno Busta all appeared among the top 12 in competition-adjusted return points won. It’s their abysmal hard-court performances that pull down their longer-term numbers.

Beyond RPW

This algorithm–or something like it–has a great deal of potential beyond simply correcting return points won for tour-level competition quality. It could be used for any stat, and if competition-adjusted return rates were combined with corrected rates of service points won, it would generate a plausible overall player rating system.

Such a rating system would be more valuable if the algorithm were extended to players beyond the top 50, as well. Just as Schwartzman doesn’t yet have that many common opponents with the elites, Challenger-level stalwarts don’t share many opponents with tour regulars. But there is enough overlap that, when combining the shared opponents of dozens of players, we might be able to get a better grip on how Challenger-level competition compares to that of the highest levels. Essentially, we can compare adjacent levels–the elites to the middle of the pack (say, ATP ranks 21 to 50), the middle of the pack to the next 50, and so on–to get a more comprehensive idea of how much players must improve to achieve certain goals.

Finally, by adjusting serve and return stats so that we have a set of competition-neutral numbers for every player, for each season of his career, we will gain a clearer picture of which players are improving and by how much. Official rankings and Elo ratings tell us a lot, but they are sometimes fooled by lucky breaks, close wins, or inconsistent opposition. And they cannot isolate individual stats, which may be particularly useful for developmental purposes.

Adjusting for opposition quality is standard practice for analysts of many other sports, and it will help tennis analytics move forward as well. If nothing else, it has shown us that one extreme performance–Schwartzman’s return game–is much more than a fluke, and that service return greatness isn’t limited to the big four.

Rafael Nadal’s Wide-Open Monte Carlo Draw

Italian translation at settesei.it

This afternoon, Rafael Nadal will take on Albert Ramos for a chance at his tenth Monte Carlo Masters title. Since 2005, Nadal has faced the best clay-court players in the sport and, with very few exceptions, beaten them all.

Yet this year, Nadal’s path to the trophy has been remarkably easy. The three top seeds–Andy Murray, Novak Djokovic, and Stan Wawrinka–all lost early, leaving Nadal to face David Goffin in the semifinals and Ramos (who ousted Murray) in the final. Goffin, at No. 13, was Rafa’s highest-ranked opponent, followed by Alexander Zverev, at No. 20, whom Nadal crushed in the third round.

When we run the numbers, we’ll see that this competition isn’t just weak: It’s the weakest faced by any Masters titlist in recent history. I’ll get into the mechanics and show you some numbers in a minute.

First, a disclaimer. By saying a draw is weak, I’m not arguing that the title “means less” or is somehow less deserved. It’s not in any way a reflection on the player. For all we know, Rafa would’ve cruised through the draw had he faced the toughest possible opponent in every round. The only thing a weak draw tells us about the champion is how to forecast his future. Had Nadal beaten multiple top-ten players this week, we might be more confident predicting future success for him than we are now, after he has beaten up on a bunch of players we already suspected he’d have no problem with.

Back to the numbers. To measure the difficulty of a player’s draw, I used jrank–my own surface-adjusted rating system, roughly similar to Elo–at the time of each Masters event back to 2002. For each tournament, I found the jrank of each player the titlist defeated, and calculated the likelihood that a typical Masters winner would beat that group of players.

That’s a mouthful, so let’s walk through an example. In the last 15 years, the median Masters winner was ranked No. 3, with a jrank (for the surface of the tournament) of about 4700, good for fourth at the moment. A 4700-rated player would have an 85.7% chance of beating Ramos, a 75.7% chance of defeating Goffin, and 87.3%, 68.4%, and 88.7% chances of knocking out Diego Schwartzman, Zverev, and Kyle Edmund, respectively. Multiply those together, and our average Masters winner would have a 34.3% chance of claiming the trophy, given that competition.
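For readers who want to check the arithmetic, here is a minimal Python sketch of that multiplication. The five probabilities are the ones quoted above; the function name `path_ease` and the assumption that matches are independent are mine, and the sketch does not reproduce how jrank turns ratings into those probabilities.

```python
from math import prod

# Win probabilities for a hypothetical 4700-rated player against each of
# Nadal's 2017 Monte Carlo opponents, as quoted in the text.
win_probs = {
    "Kyle Edmund": 0.887,
    "Alexander Zverev": 0.684,
    "Diego Schwartzman": 0.873,
    "David Goffin": 0.757,
    "Albert Ramos": 0.857,
}


def path_ease(probabilities):
    """Chance of winning every match, treating the matches as independent."""
    return prod(probabilities)


ease = path_ease(win_probs.values())
print(f"Path Ease: {ease:.1%}")
```

Because the inputs are already rounded to one decimal place, the product lands within a tenth of a percentage point of the 34.3% figure rather than matching it exactly.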

I’m using a hypothetical average Masters winner so that we measure the level of competition against a constant level. It doesn’t matter whether 2017 Nadal, peak Nadal, or someone else entirely played that series of opponents. If Djokovic had faced the same five players, we’d want the numbers to come out the same.

Here are the ten easiest paths to a Masters title since 2002, measured by this algorithm:

Year  Event                Winner          Path Ease  
2017  Monte Carlo Masters  Rafael Nadal*       34.3%  
2016  Shanghai Masters     Andy Murray         33.0%  
2011  Shanghai Masters     Andy Murray         30.8%  
2013  Madrid Masters       Rafael Nadal        30.8%  
2012  Paris Masters        David Ferrer        30.4%  
2010  Monte Carlo Masters  Rafael Nadal        27.3%  
2012  Canada Masters       Novak Djokovic      25.8%  
2014  Madrid Masters       Rafael Nadal        25.3%  
2016  Paris Masters        Andy Murray         24.7%  
2010  Rome Masters         Rafael Nadal        24.6%

* pending; extremely likely

The average ‘Path Ease’ is 15.6%, and as we’ll see in a moment, some players have had it much, much harder. In Shanghai last year, Murray certainly did not: His draw turned out much like Rafa’s this week, complete with Goffin along the way and a three-named Spaniard in the final–in his case, Roberto Bautista Agut.

Here are the ten most difficult paths:

Year  Event                 Winner              Path Ease  
2007  Madrid Masters        David Nalbandian         4.1%  
2007  Paris Masters         David Nalbandian         6.2%  
2014  Canada Masters        Jo Wilfried Tsonga       6.6%  
2011  Rome Masters          Novak Djokovic           6.6%  
2009  Madrid Masters        Roger Federer            7.0%  
2010  Canada Masters        Andy Murray              7.7%  
2004  Cincinnati Masters    Andre Agassi             7.9%  
2007  Canada Masters        Novak Djokovic           8.0%  
2009  Indian Wells Masters  Rafael Nadal             8.0%  
2002  Canada Masters        Guillermo Canas          8.4%

Those of us who remember the end of David Nalbandian’s 2007 season won’t be surprised to see him atop this list. In Madrid, he beat Nadal, Djokovic, and Roger Federer in the final three rounds, and in Paris, he knocked out Federer and Nadal again, along with three other top-16 players. Making his paths even more difficult, he didn’t earn a first-round bye in either event.

Given that Monte Carlo is the one non-mandatory Masters event, I expected that, over the years, it would prove to have the weakest competition. That was wrong. Entering this week, Monte Carlo is only fourth-easiest of the nine current 1000-series events. Indian Wells–which requires at least six victories for a title, unlike most of the others, which require only five–has been the toughest, while Miami, which also requires six wins, is closer to the middle of the pack:

Event         Avg Path Ease  
Indian Wells          12.8%  
Canada                14.3%  
Rome                  14.6%  
Miami                 15.3%  
Cincinnati            15.7%  
Monte Carlo*          16.5%  
Madrid**              16.7%  
Paris                 16.8%  
Shanghai              21.5%

* through 2016; ** hard- and clay-court eras included

Finally, seeing the presence of Nadal, Djokovic, and Murray on the list of easiest title paths raises another question. How have the big four’s levels of competition differed at the Masters events?

Player          Titles  Avg Path Ease  
Roger Federer       26          14.6%  
Novak Djokovic      30          16.1%  
Rafael Nadal        28          16.7%  
Andy Murray         14          18.1%

not including 2017 Monte Carlo

Federer has had the most difficult paths, followed by Djokovic, Nadal, and then Murray. Assuming Rafa wins today, his number will tick up to 17.3%.
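The 17.3% figure is just a running average updated with one more title. A quick sketch, using the numbers from the table and the 34.3% Path Ease from this week (the helper name `updated_average` is only for illustration):

```python
def updated_average(current_avg, n_titles, new_value):
    """Fold one more title's Path Ease into an existing average."""
    return (current_avg * n_titles + new_value) / (n_titles + 1)


# Nadal: 28 Masters titles averaging 16.7%, plus a 34.3% path in Monte Carlo.
nadal_after_monte_carlo = updated_average(16.7, 28, 34.3)
print(f"{nadal_after_monte_carlo:.1f}%")
```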

To reach ten titles at a single event, as Nadal is on the brink of doing in Monte Carlo, requires one to thrive regardless of draw luck. Rafa’s path to the trophy last year was tougher than any of his previous Monte Carlo campaigns, rating a Path Ease of 9.1%, almost difficult enough to show up on the top ten list displayed above. His 2008 title was no cakewalk either–a typical Masters winner would have only a 10.0% chance of coming through that draw successfully.

This year, Rafa’s luck has decidedly changed. To no one’s surprise, the best clay court player in history is taking full advantage.

Is Grand Slam Qualifying Worth Tanking For?

Italian translation at settesei.it

Earlier today in Hobart, Naomi Osaka lost her second-round match to Mona Barthel. Coming into the match, she was in a tricky position: If she won, she wouldn’t be able to play Australian Open qualifying. For a young player outside the top 100, a tour-level quarterfinal would be nice, but presumably Melbourne was intended to be the centerpiece of her trip to Australia.

Since she lost the match, she’ll be able to play qualifying. But what if she hadn’t? Is this a situation in which a player would benefit from losing a match?

Put another way: In a position like Osaka’s, what are the incentives? If she could choose between the International-level quarterfinal and the Slam qualifying berth, which should she pick? Or, put more crassly, should a player in this position tank?

Let’s review the scenarios. In scenario A, Osaka wins the Hobart second-rounder, reaches the quarterfinal, and has a chance to go even further. She can’t play the Australian Open in any form. In scenario B, she loses the second-rounder, enters Melbourne qualifying and has a chance to reach the main draw.

Before we go through the numbers, take a guess: Which scenario is likely to give Osaka more ranking points? What about prize money?

Scenario A is more straightforward. By reaching the quarterfinals, she earns 30 additional ranking points and US$2,590 beyond what a second-round loser makes. Beyond that, we need to calculate “expected” points and prize money, using the amounts on offer for each round and combining them with her odds of getting there.

Let’s estimate that Osaka would have about a 25% chance of winning her quarterfinal match and earning an additional 50 points and $5,400. In expected terms, that’s 12.5 points and $1,350. If she progresses, we’ll give her a 25% chance of reaching the final, then in the final, a 15% chance of winning the title.

Adding up these various possibilities, from her guaranteed QF points to her 0.94% chance (25%*25%*15%) of winning the Hobart title, we see that her expected rewards in scenario A are roughly 48 ranking points and just under $4,800.
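Here is a rough Python sketch of how the scenario A probabilities chain together. It uses only the estimates given above; the points and prize money for the final and the title aren’t listed in this post, so the sketch stops at the semifinal increment rather than reproducing the full 48-point, $4,800 totals.

```python
# Estimated chances of winning each remaining Hobart match (from the text).
p_win_qf = 0.25
p_win_sf = 0.25
p_win_final = 0.15

# Chance of reaching each later stage, multiplying round by round.
p_reach_sf = p_win_qf
p_reach_final = p_reach_sf * p_win_sf
p_title = p_reach_final * p_win_final

# Expected value of the semifinal increment: 50 points and $5,400,
# each weighted by the chance of winning the quarterfinal.
expected_sf_points = p_reach_sf * 50
expected_sf_money = p_reach_sf * 5400

print(f"Reach final: {p_reach_final:.2%}, win title: {p_title:.2%}")
print(f"Expected SF increment: {expected_sf_points} points, ${expected_sf_money:,.0f}")
```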

Scenario B starts in a very different place. Thanks to the recent increases in Grand Slam prize money, every player in the qualifying takes home at least US$3,150. That’s already close to Osaka’s expected financial reward from advancing in Hobart. The points are a different story, though: First-round qualifying losers only get 2 WTA ranking points.

I’ll spare you all the calculations for scenario B, but I’ve assumed that Osaka would have a 70% chance of winning qualifying round 1, a 60% chance of winning QR2, and a 50% chance of winning QR3 and qualifying. Those might be a little bit high, but if they are, consider it compensation for the possibility that she’ll reach the main draw as a lucky loser. (Also, if we knock her chances all the way down to 50%, 45%, and 40%, the conclusions are the same, even if the points and prize money in scenario B are quite a bit lower.)

Those estimated probabilities translate into an expectation of about 23 ranking points and US$11,100. Osaka isn’t guaranteed any money beyond the initial $3,150, but the rewards for qualifying are enormous, especially compared to the prize money in Hobart. A first-round main draw loser in Melbourne takes home more money than the losing finalist does in Hobart.

And, of course, if she does qualify, there’s a chance she’ll go further. Since 2000, female Slam qualifiers have reached the second round 41% of the time, the third round 9% of the time, the fourth round 1.8% of the time, and the quarterfinals 0.3% of the time. Those odds, combined with her 21% chance of reaching the main draw in the first place, translate into an additional 7 expected ranking points and $2,600 in prize money.
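The 21% figure, and the chances of going deeper, follow from multiplying the round-by-round estimates. A small sketch under those stated assumptions (the variable names are mine, and the lower 50%/45%/40% case is the alternative mentioned above):

```python
from math import prod


def chance_to_qualify(round_win_probs):
    """Chance of winning all three qualifying rounds, assumed independent."""
    return prod(round_win_probs)


baseline = chance_to_qualify([0.70, 0.60, 0.50])
pessimistic = chance_to_qualify([0.50, 0.45, 0.40])

# How often female Slam qualifiers since 2000 have reached each round,
# given that they made the main draw (figures from the text).
qualifier_round_rates = {
    "second round": 0.41,
    "third round": 0.09,
    "fourth round": 0.018,
    "quarterfinals": 0.003,
}

# Overall chance of reaching each round, starting from qualifying round 1.
overall = {rnd: baseline * rate for rnd, rate in qualifier_round_rates.items()}

print(f"Qualify: {baseline:.0%} (pessimistic case: {pessimistic:.0%})")
for rnd, p in overall.items():
    print(f"Reach {rnd}: {p:.2%}")
```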

All told, scenario B gives us 30 expected ranking points and US$13,600 in expected prize money.

The Slam option results in far more cash, while the International route is worth more ranking points. In the long term, those ranking points would have some financial value, possibly earning Osaka entry into a few higher-level events than she would otherwise qualify for. But that value probably doesn’t overcome the nearly $9,000 gap in immediate prize money.

I hope that no player ever tanks a match at a tour-level event so they can make it in time for Slam qualifying. But if one does, we’ll at least understand the logic behind it.

Will the US Open First-Round Bloodbath Benefit Serena Williams?

After only two days of play, the US Open women’s draw is a shell of its former self.

Ten seeds have been eliminated, only the fifth time in the 32-seed era that the number of first-round upsets has reached double digits. Four of the top ten seeds were among the victims, marking the first time since 1994 that so many top-tenners failed to reach the second round of a Grand Slam.

Things are particularly dramatic in the top half of the draw, where Serena Williams can now reach the final without playing a single top-ten opponent. In a single day of play, my (conservative) forecast of her chances of winning the tournament rose from 42% to 47%, only a small fraction of which owed to her defeat of Vitalia Diatchenko.

However, plenty of obstacles remain. Serena could face Agnieszka Radwanska or Madison Keys in the fourth round, and then Belinda Bencic–the last player to beat her–in the quarters. A possible semifinal opponent is Elina Svitolina, a rising star who took a set from Serena at this year’s Australian Open.

The first-round carnage didn’t include most of the players who have demonstrated they can challenge the top seed. Five of the last six players to beat Serena–Bencic, Petra Kvitova, Simona Halep, Venus Williams, and Garbine Muguruza–are still alive. Only Alize Cornet, the 27th seed who holds an improbable .500 career record against Serena, is out of the picture.

What’s more, early-round bloodbaths haven’t, in the past, cleared the way for favorites. In the 59 majors since 2001, when the number of seeds increased to 32, the number of first-round upsets has had little to do with the likelihood that the top seed goes on to win the tournament.

In 18 of those 59 Slams, four or fewer seeds were upset in the first round. The top seed went on to win five times. In 22 of the 59, five or six seeds were upset in the first round, and the top seed won eight times.

In the remaining 19 Slams, in which seven or more seeds were upset in the first round, the top seed won only five times. Serena has “lost” four of those events, most recently last year’s Wimbledon, when nine seeds fell in their opening matches and Cornet defeated her in the third round.
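To make the three groups easier to compare, here is a short sketch that turns the counts above into win rates. The bucket labels are just shorthand for the groupings in the text.

```python
# (number of Slams, titles won by the top seed) for each first-round upset bucket.
buckets = {
    "four or fewer seeds upset": (18, 5),
    "five or six seeds upset": (22, 8),
    "seven or more seeds upset": (19, 5),
}

total_slams = sum(slams for slams, _ in buckets.values())
win_rates = {label: wins / slams for label, (slams, wins) in buckets.items()}

for label, rate in win_rates.items():
    print(f"{label}: top seed won {rate:.1%}")
print(f"Total Slams covered: {total_slams}")
```

With samples this small, the gaps between the three rates are well within what chance alone could produce.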

This is necessarily a small sample, and even setting aside statistical qualms, it doesn’t tell the whole story. While Serena has failed to win four of these carnage-ridden majors, she has won three more of them when she wasn’t the top seed, including the 2012 US Open, when ten seeds lost in the first round and Williams went on to beat Victoria Azarenka in the final.

Taken together, the evidence is decidedly mixed. With the exception of Cornet, the ten defeated seeds aren’t the ones Serena would’ve chosen to remove from her path. While her odds have improved a bit on paper, the path through Keys, Bencic, Svitolina, and Halep or Kvitova in the final is as difficult as any she was likely to face.