October 20, 1973: Pigeon

No one ever accused Ilie Năstase of being boring. In the course of a single match, he could go from total focus and brilliant play to such extreme grandstanding that he could put a victory in doubt. There was no way of knowing which Ilie would turn up on a particular day. The stakes were irrelevant: He might clown his way through a crucial Davis Cup rubber or buckle down and obliterate an early-round foe.

By October 1973, only two things were certain. The first: Năstase was the best clay-court player in the world. Since the beginning of the year, he had won eight tournaments on dirt against only one loss. That record, combined with occasional success on other surfaces, put him atop both the ATP ranking list and the Grand Prix points table.

The other apparent certainty was that he couldn’t beat Tom Okker. Since their first encounter in 1968, Okker had won six of eight. The “Flying Dutchman” held second place in the Grand Prix standings, and his combination of intensity and blistering speed was a puzzle that Năstase couldn’t solve. The Romanian had won a Davis Cup tilt in straight sets back in May, but more recently, it had been all Okker. In the semi-finals at both Los Angeles and Chicago, the fastest man on tour had beaten Năstase–twice in three weeks.

Something had to give. On October 20th, the two men met in yet another semi, this time on the high-altitude clay of the Madrid Open. Năstase had been his usual inscrutable self, meandering through early-round three-setters with no-names Jose Guerrero and Julian Ganzabal, then brushing aside the much stronger Mark Cox and Niki Pilić. Okker hadn’t been much steadier, dropping two sets but turning in a confident win over the fast-rising 21-year-old from Argentina, Guillermo Vilas.

In the semi, Okker took the first set, 6-4, and Năstase stormed back to grab the second, 6-1. The Romanian kept streaking, all the way to 5-2, 40-0 in the decider.

There were no computers in the press boxes of 1973, but it didn’t take statistical proof to know that the match was in the bag. At a rough estimate, Năstase’s chances of winning, at triple match point with a two-break advantage, were 99.8%. Mercurial as he was, even Ilie couldn’t throw this one away.

And then he did.

Okker easily saved the first two match points, then took the third with a let-cord winner. Năstase had spent most of the third set distracted, griping about the chilly conditions, a less-than-enthusiastic crowd, and the state of the court. The unlucky dribbler pushed him over the edge. Even in such a mood, the Romanian could beat most players, but Okker wouldn’t be denied: He didn’t allow Năstase another game, and the match went to the underdog, 6-4, 1-6, 7-5.

The loss didn’t threaten Ilie’s status as the leader in the Grand Prix race; his lead was effectively insurmountable. Still, who would consider him the best player in the game while he was Okker’s pigeon?

This being Năstase, it wasn’t quite the end of the story in Madrid. He and Okker paired up for the doubles semi-final, facing the oddball duo of Ion Țiriac–Ilie’s former mentor and doubles partner–and Björn Borg. When Okker called Țiriac a cheat and crossed the net to check a ball mark, Țiriac swung a racket at him. The Romanian veteran was immediately disqualified, and the Năstase/Okker duo cruised to the title.

It wasn’t the championship Ilie had hoped for–or expected–when he arrived in Madrid. He managed much better when Okker was playing elsewhere–or, at least, on the same side of the net.

* * *

This post is part of my series about the 1973 season, Battles, Boycotts, and Breakouts. Keep up with the project by checking the TennisAbstract.com front page, which shows an up-to-date Table of Contents after I post each installment.

A Note Regarding Iga Świątek and Carlos Alcaraz

The Tennis 128 returns tomorrow, when I will unveil the 48th greatest player of the last century. Click here to read about the project and see the full list.

* * *

Iga Świątek is now a three-time major champion. Carlos Alcaraz just won his first slam. It’s easy to imagine both of them winning many, many more.

So, do they belong in the Tennis 128?

Many of you have asked me that. It is, by far, the most common question I’ve heard since kicking off the project in February. Some of you started wondering back in May, when Iga was in the middle of her winning streak and Alcaraz was proving he could hang with the big boys.

The short answer is no. Even if I hadn’t already announced players from #49 to #128, they wouldn’t get a spot.

If you think one or both of them deserve to be on the list, your reasoning probably falls into one of two categories:

  1. Peak level is extremely important, and they’ve shown themselves to be capable of truly exceptional things in a short period of time.
  2. They are young, and even very conservative forecasts of the rest of their careers add up to something special.

Both arguments are valid. The second point is especially powerful for Świątek, who is now up to three majors. Many of the players on my all-time list (and a couple of them in the to-be-announced top 48!) don’t have that many.

Here’s why these two points don’t sway me–or, to put it more accurately, why my algorithm rates players differently. First, I do give a great deal of weight to a player’s peak. But it’s not everything–even though pundits over the years have sometimes acted as if it were. You can find arguments that someone like Lew Hoad is the greatest of all time, simply because he could be so exceptional on a given day.

I worked hard to find a satisfactory balance between peak and longevity. The more weight you give to a player’s peak, the wackier the list starts to look. You might not like Hoad at #74, Jim Courier at #107, or Iga at a number greater than 128. But I guarantee you that you’d have more issues with a formula-based list that gave a player’s strongest moments considerably more weight.

As a result, neither Iga nor Carlito has enough career achievements to merit a spot on the list. They probably will, and it probably won’t take long. They just don’t right now, and they can’t get there by the end of this year.

Second, no forecasting went into the making of this list. All-time greats are outliers by definition; it would be wrong to apply some generic aging curve and give them credit for future excellent seasons on that basis.

Fortunately, the lack of forecasting didn’t end up being too important. Most of the best active players are either winding down their careers or don’t yet qualify for the list.

So, where do this year’s US Open champions rank?

Świątek, with her two-major campaign, has almost certainly played her way into the top 200. A flawless end to the season–let’s say, a couple more titles plus an undefeated run at the Tour Finals–would move her up to around #150.

Alcaraz had the same potential when I first looked into this issue back in May. He’s had an amazing season by any realistic standard for a 19-year-old, but it hasn’t been as otherworldly as the April/May edition of Carlos suggested it might be. A very strong finish to 2022 would move him into the top 200. A more realistic projection for the rest of his season would put him somewhere between #200 and #250.

Still, it doesn’t take that long to assemble an all-time great tennis career. Check back in twelve months. The answers to these questions could be very different.

Commercial or Political

You’ve probably heard: If you go to the Australian Open wearing a shirt that says, “Where is Peng Shuai?,” you’ll be required to change clothes or leave.

Surely Tennis Australia isn’t against raising awareness about a famous tennis player who accused a high-ranking political figure of sexual assault, was immediately censored, and has only been spotted in obviously scripted scenes witnessed by Chinese state media, right?

Of course not. Tennis Australia has a policy:

“Under our ticket conditions of entry we don’t allow clothing, banners or signs that are commercial or political”

This is arrant nonsense. I’m sure a more thorough statement of this policy is buried somewhere in the ticket terms and conditions that no one ever reads. I’m equally sure it is almost never enforced. And that’s the problem.

First off, most clothing is commercial. Every player on the court wears athletic gear with a (usually prominent) logo on it. Thousands of fans do the same. No, the clothing doesn’t explicitly say, “Buy Adidas!” But it doesn’t have to. Just like the slogan, “Where is Peng Shuai?” doesn’t explicitly say, “The Chinese Communist Party is detaining or censoring someone because they dared to accuse someone of a crime. They shouldn’t do that!”

And let’s face it, a whole lot of clothing is political. You don’t have to believe that everything is political to accept this. Is anyone at Melbourne Park wearing a “Black Lives Matter” shirt or hat? How about the H&M tee in my kid’s wardrobe that says, “There is No Planet B?” Neither statement sets out a policy recommendation, but both are closely associated with political positions. Just like “Where is Peng Shuai?” is inoffensive unless you know why her whereabouts are unknown.

Has anyone been kicked out of the Happy Slam for wearing a BLM shirt or for a gentle nudge toward climate awareness? You know the answer to that as well as I do.

The point is, a sweeping prohibition like Tennis Australia’s is so broad as to be meaningless. It gives them political cover when there’s a slogan they want to remove, but they ignore their own rule 99% of the time. It’s only when a sponsor complains, or when they fear controversy, that the rule is enforced.

The spokesperson I quoted above continued:

“Peng Shuai’s safety is our primary concern. We continue to work with the WTA and global tennis community to seek more clarity on her situation and will do everything we can to ensure her well-being.”

Tennis Australia has now proven that this statement is false. “Commercial or political” messages are fine, except in the rare instances when they don’t approve, or they fear the backlash. Apparently “Where is Peng Shuai?” crosses the line. Don’t be fooled by the claim that this is just routine enforcement of a bland policy.

The WTA has been forceful and consistent in its handling of Peng Shuai’s disappearance, and the organization deserves great credit for that. Tennis Australia’s actions have shown just how easy it is to cave to pressure and become complicit in human rights abuses. We must hold the organization to a higher standard.

Expected Points, June 25: The Many Paths To the Eastbourne Semi-Finals

Expected Points, my new short, daily podcast, highlights three numbers to illustrate stats, trends, and interesting trivia around the sport.

Up today: Marc Polmans and Ramkumar Ramanathan fight out an old-school Wimbledon marathon, an unlikely unseeded foursome remains in the Eastbourne women’s draw, and African tennis is alive in Brazzaville.

Scroll down for a transcript.

You can subscribe on iTunes, Spotify, Stitcher, and elsewhere in the podcast universe.

Music: Love is the Chase by Admiral Bob (c) copyright 2021. Licensed under a Creative Commons Attribution Noncommercial (3.0) license. Ft: Apoxode

The Expected Points podcast is still a work in progress, so please let me know what you think.

Expected Points, March 17: A Breakthrough Win for Lorenzo Musetti

Up today: The Italian teen scores his first top-ten win, the WTA Monterrey field has an improbable favorite, and fans will have to wait for clay season for their next glimpse of Rafael Nadal or Dominic Thiem.

Expected Points, March 16: Russians in Command in St. Petersburg

Up today: The St. Petersburg draw leaves little room for foreign challengers, Cristian Garin prefers to keep his clay court points short, and the upcoming Miami Open will feature a global assortment of IMG clients.

Tanking: A Model

The logic behind tanking a part of a tennis match–deliberately playing with less than maximum effort–is simple. If you have fallen behind early in the first set, you could choose to take it easy for the rest of the set. You probably would’ve lost the set anyway, and having semi-rested for several games, you’ll have more mental and physical energy to draw upon for the rest of the match.

By the end of this post, we’ll have some idea how useful that extra energy must be to make tanking worthwhile. It will take a few steps to get there.

The scenario

Consider some sample numbers to make this more concrete. Take two evenly matched men, each of whom wins 70% of his service points. Maybe they are powerful–though not one-dimensional–servers on a reasonably fast surface. Winning seven out of ten service points means that nine out of ten games are holds of serve, so in our hypothetical match, breaks are at a premium.
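The step from 70% of service points to a 90% hold rate follows from the standard formula for winning a game when every point is won independently with the same probability. Here is a minimal sketch under that independence assumption; the function name `hold_prob` is mine and is not taken from the published code.

```python
def hold_prob(p):
    """Chance the server wins a game, given probability p of winning each point.

    Assumes every point is independent and won with the same probability.
    """
    q = 1.0 - p
    # Win the game before deuce: four points won, with 0, 1, or 2 points lost.
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce (three points each), then win two in a row before the opponent does.
    from_deuce = 20 * p**3 * q**3 * (p**2 / (p**2 + q**2))
    return before_deuce + from_deuce


print(round(hold_prob(0.70), 3))  # 0.901
```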

Imagine that the match opens with one of those rare breaks. Given the 90% hold rate for both players, the man who got his nose in front has improved to an 83% chance of winning the set. In the simplest formulation, the player who has fallen behind faces two options for the balance of the set:

  • Continue playing at his usual level despite the low chance of winning, or
  • Take it easy, as the set is probably lost.
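The 83% figure can be checked with a small recursion over game scores. This is a sketch, not the published code: it assumes a tiebreak at 6-6, treats each game as an independent event won at the server's hold rate, and takes the tiebreak win chance as an input (0.5 for evenly matched players). The name `set_win_prob` and its parameters are illustrative.

```python
from functools import lru_cache


def set_win_prob(hold_a, hold_b, tiebreak_a=0.5,
                 games_a=0, games_b=0, a_serves_next=True):
    """Chance that player A wins the set from the given game score."""

    @lru_cache(maxsize=None)
    def win(a, b, a_serves):
        if a == 6 and b == 6:
            return tiebreak_a          # set decided by a tiebreak
        if a >= 6 and a - b >= 2:
            return 1.0                 # 6-0 through 6-4, or 7-5
        if b >= 6 and b - a >= 2:
            return 0.0
        # A wins the next game by holding, or by breaking B's serve.
        p_game = hold_a if a_serves else 1.0 - hold_b
        return (p_game * win(a + 1, b, not a_serves)
                + (1.0 - p_game) * win(a, b + 1, not a_serves))

    return win(games_a, games_b, a_serves_next)


# The leader broke in the opening game, so he is up 1-0 and serves next.
print(round(set_win_prob(0.9, 0.9, games_a=1, games_b=0), 2))  # 0.83
```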

The tank

In the continue-as-usual scenario, our early front-runner has an 83% chance of winning the set. If both players continue playing at the same level for the duration of a best-of-five-sets match, that translates to a 62% chance of winning the match, leaving our player who decided not to tank with a 38% chance. (I’m using best-of-five because in a longer match, it’s more likely that a player can recover from losing the first set. That makes tanking a more plausible strategy.)
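The conversion from set odds to match odds is a short recursion: win the required number of sets before the opponent does. The sketch below assumes every set after the first is a coin flip between the evenly matched players; `race` and `match_win_prob` are names I made up for illustration.

```python
def race(need, opp_need, p):
    """Chance of winning `need` sets before the opponent wins `opp_need`."""
    if need == 0:
        return 1.0
    if opp_need == 0:
        return 0.0
    return p * race(need - 1, opp_need, p) + (1 - p) * race(need, opp_need - 1, p)


def match_win_prob(p_first_set, p_later_sets=0.5, sets_to_win=3):
    """Match-win chance given the odds of the first set and of each later set."""
    after_winning_first = race(sets_to_win - 1, sets_to_win, p_later_sets)
    after_losing_first = race(sets_to_win, sets_to_win - 1, p_later_sets)
    return (p_first_set * after_winning_first
            + (1 - p_first_set) * after_losing_first)


leader = match_win_prob(0.83)
print(round(leader, 2), round(1 - leader, 2))  # 0.62 0.38
```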

To evaluate the take-it-easy scenario, we need to pile on more assumptions. How much worse does a tanking player play? You will probably disagree with my estimates of the point-level costs and benefits of tanking, which is fine. I don’t have strong opinions about them, and they don’t matter much to the conclusions below. Consider these numbers just one illustration of the model. As soon as the trailing player decides to take it easy, let’s say his numbers fall to the following:

  • 20% return points won (instead of 30%)
  • 65% serve points won (instead of 70%)

That’s not a very good player–picture an unmotivated Nick Kyrgios. Down a break after the first game and playing a newly lackadaisical brand of tennis, he has a mere 1.3% chance of coming back to win the set. We’re simplifying quite a bit here, in large part because a player could always decide midway through the set to pause the tank, perhaps raising his game if he reaches 15-30 or better on his opponent’s serve. But again, this is just a model, and one I’m trying to keep from getting too complex.

The trade-off

The tanking player has, according to these assumptions, chosen to decrease his chance of winning the first set from 17% to a tick above 1%. If he received no benefit from conserving energy and both players returned to their 90% hold rate at the beginning of the next set, the tanking player’s chances of winning the match have fallen from 38% to 32%.
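One way to see the size of the trade-off: with later sets at 50/50, a player who wins the first set of a best-of-five takes the match 11/16 of the time, and one who loses it still wins 5/16 of the time. Match odds are therefore a straight line in first-set odds. The sketch below spells out that arithmetic; the constant and function names are illustrative only.

```python
# Best-of-five with every later set a coin flip:
# after winning set one, you need two of the next four sets (11/16);
# after losing it, you need three of the next four (5/16).
WIN_AFTER_WINNING_FIRST = 11 / 16
WIN_AFTER_LOSING_FIRST = 5 / 16


def trailing_player_match_prob(p_first_set):
    """Match-win chance as a linear function of first-set win chance."""
    swing = WIN_AFTER_WINNING_FIRST - WIN_AFTER_LOSING_FIRST  # 0.375
    return WIN_AFTER_LOSING_FIRST + swing * p_first_set


no_tank = trailing_player_match_prob(0.17)   # keep fighting for the first set
tank = trailing_player_match_prob(0.013)     # take it easy
print(no_tank, tank, no_tank - tank)         # roughly 0.38, 0.32, and a cost of about 0.06
```

Each percentage point of first-set win probability is worth 0.375 points of match probability here, so surrendering about 16 points of first-set odds costs roughly six points of match odds.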

Clearly that’s not the whole story. A player who chooses to conserve energy at the expense of their immediate fortunes must assume that there are benefits coming later.

To further simplify, let’s assume that the tanking player loses the first set. Here are his chances of winning the match based on a few possible post-tank levels he could sustain:

  • 70% serve points won (SPW), 30% return points won (RPW): 31.3% (no benefit from tanking)
  • 71% SPW, 32% RPW: 46.3%
  • 72% SPW, 34% RPW: 61.9%
  • 73% SPW, 36% RPW: 75.8%
  • 74% SPW, 38% RPW: 86.4%
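For readers who want to try other post-tank levels, here is a self-contained sketch of the chain from points to games to sets to the match. It is not the published code. It assumes independent points, tiebreak sets throughout, and a tanking player who already trails by a set in a best-of-five; all function names are my own.

```python
from functools import lru_cache


def game_prob(p):
    """Chance of winning a game when each point is won with probability p."""
    q = 1 - p
    return p**4 * (1 + 4 * q + 10 * q**2) + 20 * p**3 * q**3 * p**2 / (p**2 + q**2)


def tiebreak_prob(spw, rpw):
    """Chance of winning a first-to-seven tiebreak, serving the first point."""

    @lru_cache(maxsize=None)
    def win(a, b):
        if a == b and a >= 6:
            # From 6-6 on, each player serves one of every two points.
            both, neither = spw * rpw, (1 - spw) * (1 - rpw)
            return both / (both + neither)
        if a >= 7:
            return 1.0
        if b >= 7:
            return 0.0
        # Serving order: one point, then two points each, alternating.
        serving = ((a + b + 1) // 2) % 2 == 0
        p = spw if serving else rpw
        return p * win(a + 1, b) + (1 - p) * win(a, b + 1)

    return win(0, 0)


def set_prob(spw, rpw):
    """Chance of winning a tiebreak set, serving the first game."""
    hold, brk = game_prob(spw), game_prob(rpw)
    tiebreak = tiebreak_prob(spw, rpw)

    @lru_cache(maxsize=None)
    def win(a, b):
        if a == 6 and b == 6:
            return tiebreak
        if a >= 6 and a - b >= 2:
            return 1.0
        if b >= 6 and b - a >= 2:
            return 0.0
        p = hold if (a + b) % 2 == 0 else brk
        return p * win(a + 1, b) + (1 - p) * win(a, b + 1)

    return win(0, 0)


def match_after_losing_first_set(spw, rpw):
    """Best-of-five: win three sets before losing one more."""
    s = set_prob(spw, rpw)
    return s**3 * (4 - 3 * s)  # WWW, or exactly one loss among the first three


for spw, rpw in [(0.70, 0.30), (0.71, 0.32), (0.72, 0.34), (0.73, 0.36), (0.74, 0.38)]:
    print(f"{spw:.0%} SPW, {rpw:.0%} RPW: {match_after_losing_first_set(spw, rpw):.1%}")
```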

Remember that our tanking player has only a 38% chance of winning the match after sustaining the opening-game break, so the second scenario, in which his level improves to 71% SPW and 32% RPW, represents an improvement. That would be hardly noticeable over the course of three or four more sets. If the remainder of the match spanned 200 more points, it would mean winning 103 of them, instead of 100. If conserving energy early on confers even bigger benefits, it starts to look like a no-brainer.

Complications

Of course, it’s never this simple. The leading player might realize that his opponent was tanking and conserve some energy himself. The tanking player could have a hard time resuming his usual level (or better) at the right moment. Some points are more important than others, so the difference between 100 and 103 might not matter. Most matches are best-of-three, and giving up on the opening set in a shorter match is much more dangerous.

Those qualifications shouldn’t stop us from considering what tanking has to offer. While players don’t tank sets as often as they used to, there’s surely some energy-conservation benefit, and extra energy must have some value for the remainder of the match, right? I have no idea whether that value is equivalent to one point per hundred or something much higher or lower, but surely it’s possible that in some situations, it’s worth it.

The illustration I’ve used shows that the value of the extra energy doesn’t have to be that substantial to make tanking a plausible tactic. The small margins that determine the outcome of tennis matches mean that we’ll rarely recognize when a player is taking advantage of a tank, but those margins also mean that a small edge could be enough to make it worthwhile.

All calculations of game, set, and match probabilities are based on my publicly-available code.

The Most Predictable Woman in Tennis

Italian translation at settesei.it

Caroline Wozniacki is set in her habits. In the eight service games of her first round match in Charleston against Laura Siegemund last week, she followed a strict pattern: wide serve on the first point, T serve on the second, T on the third, and wide on the fourth. Aside from two missed first serves that weren’t classified as “wide” or “T”, that’s 30 points. Wozniacki served in her preferred direction on all 30. From the fifth point in each game, her choices were closer to random.

This is nothing new for the Danish former No. 1. Against Monica Niculescu in the Miami third round, she had 11 service games. In the first four points of each, she followed the exact pattern: wide/T/T/wide. 44 service points, and zero deviations from the first-serve script. The Match Charting Project (MCP) has logged over 2,600 WTA matches, and no other player has ever gone an entire match without varying their first-four-point serve direction. Wozniacki has done so 17 times.

Measuring serve predictability

Just how extreme is Caro’s reliability, and how much does she differ from the competition? Let’s take a look.

I classified each first serve as either “wide” or “T.” MCP coding provides for three categories (wide, body, and T), and where a serve is coded as “body,” I used the returner’s first shot as an indication of the serve direction. That’s not perfect, because some returners will run around a weak serve, but it gets us pretty close. I excluded unreturned body serves and body serve faults. Here is Caro’s percentage of wide serves for each point of over 1,000 charted service games:

Point  Wide%  
1st    82.8%  
2nd    17.4%  
3rd    16.7%  
4th    78.5%  
5th    52.3%  
6th    46.8%  
deuce  48.0%  
ad     50.6%

Wozniacki only varies her first serve direction on the first four points about once every five deliveries. If we convert the first four rates (82.8%, 17.4%, 16.7%, and 78.5%) to the frequency with which she hit her favored serve (82.8%, 82.6%, 83.3%, 78.5%), we get an average–call it FSP, for First Serve Predictability–of 81.8%. Only two other women with at least ten charted matches, Kateryna Kozlova and Justine Henin, exceed 70%, and Henin’s repetition has more to do with her preference for the T serve in all situations.
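The FSP arithmetic is simple enough to sketch in a few lines. This assumes every first serve is classified as either wide or T, so the T rate is one minus the wide rate; the function name is illustrative.

```python
def first_serve_predictability(wide_rates):
    """Average share of first serves hit in the favored direction on each point."""
    # On each point, the favored direction is whichever one is used more often.
    favored = [max(w, 1 - w) for w in wide_rates]
    return sum(favored) / len(favored)


wozniacki = [0.828, 0.174, 0.167, 0.785]  # wide% on the first four points
print(f"{first_serve_predictability(wozniacki):.1%}")  # 81.8%
```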

Amazingly, Caro’s overall numbers obscure just how often she uses the pattern these days. The MCP has 52 Wozniacki matches dating from the beginning of 2017, and that more recent subset gives us an FSP of 94.0%. I suspect that the more extreme number is a better representation of Woz’s tendencies, because the more recent data includes a broader selection of matches, including contests against weaker opponents. The MCP is not a random sample, and older matches tend to be more notable ones involving higher-quality opponents.

Wozniacki’s not-really-peers

Let’s take a look at some of the other women who are more predictable than average. The median WTAer with at least 10 charted matches in the MCP dataset has an FSP of about 58%, meaning that they might prefer one direction to the other, or that they often aim for a right-hander’s backhand, but that they vary the first serve delivery quite a bit.

Here are the 20 who change direction the least. For each player, the following table shows the frequency with which they hit a wide serve on each of the first four points, their FSP on the first four points–FSP(1-4)–and their FSP on points from the fifth onward, FSP(5+).

Player         1st  2nd  3rd  4th  FSP(1-4)  FSP(5+)  
Wozniacki      83%  17%  17%  79%       82%      52%  
Kozlova        60%  35%  10%  73%       72%      64%  
Henin          38%  11%  57%  25%       71%      66%  
Vikhlyantseva  92%  46%  38%  63%       68%      54%  
Petkovic       74%  72%  36%  38%       68%      58%  
Vondrousova    15%  63%  30%  54%       68%      68%  
Brengle        82%  67%  53%  68%       67%      56%  
Clijsters      86%  32%  61%  52%       67%      56%  
Stephens       76%  21%  53%  46%       65%      62%  
Voegele        71%  35%  59%  34%       65%      60%  
Dementieva     76%  54%  71%  60%       65%      60%  
Dodin          58%  14%  43%  43%       65%      64%  
Li Na          28%  33%  52%  33%       65%      56%  
Kerber         43%  78%  56%  67%       65%      64%  
Doi            21%  60%  64%  56%       65%      63%  
Vandeweghe     35%  35%  62%  66%       65%      55%  
A Beck         59%  24%  45%  33%       64%      61%  
Sanchez V      43%  77%  42%  65%       64%      64%  
Buzarnescu     19%  39%  58%  46%       64%      59%  
Sevastova      73%  58%  37%  60%       64%      55%

Only two servers, Kozlova and Natalia Vikhlyantseva, follow the general principle of Wozniacki’s wide/T/T/wide pattern. Many of these players, like Henin, prefer wide or T serves at all times, and others, including Andrea Petkovic and Coco Vandeweghe, often opt for one type of serve on the first two points and another on the next two. It’s tough to see much in the patterns among these players, especially since most of them are closer to the median level of predictability than they are to Wozniacki’s extreme consistency.

I included the final column, FSP(5+), to illustrate another aspect of Caro’s uniqueness. While she closely follows her script for the first four points, she reverts to almost 50/50 wide and T serves after that–even in the more extreme 2017-present subset of matches. Many of the other players on this list do not. Angelique Kerber, for instance, is a near Woz-level lock to go wide in the ad court late in games. She hits wide first serves more than 80% of the time at 40-30 or 30-40, and 73% of the time at AD-40 or 40-AD. Henin also stuck with her preferences on higher-leverage points.

Equilibrium

For whatever reason, Wozniacki is comfortable with this pattern, and is confident that it works. Or, at least, that it doesn’t work against her. It’s not a secret–the sequence came to my attention after Siegemund’s coach pointed it out during an on-court coaching visit in Charleston.

Tennis is full of decisions like this: when to follow a pattern, and how often to vary things to keep an opponent from getting too comfortable. On this week’s podcast, Carl and I speculated about how often a player would need to deploy an underarm serve in order to force a returner out of position. If Wozniacki’s tendencies are any indication, the answer is: not very often. The mere fact that Caro could serve the other direction was apparently enough to prevent Niculescu or Siegemund from pouncing on her first serves, even if Woz stuck to the script from the first game to the last.

I realize I’ve left a lot of questions unanswered. Does Caro win more first serve points when she varies her delivery more? Does she follow any similar patterns with her second serve? Does she use the results of the first four points to help decide the direction of the following points? Are there particular types of players who force her to mix things up–as Madison Keys did in the Charleston final, with her aggressive return tactics?

Keep an eye on this space–maybe I’ll be able to offer some answers. In the meantime, I hope you derive some extra enjoyment the next time you watch a Wozniacki match, knowing in advance where her next serve will go. Or, perhaps, you’ll witness one of the rare occasions when the most predictable woman in tennis goes off-script.

Thanks to Kees for charting the Siegemund match, passing along the on-court coaching conversation, and providing the impetus for this post.

WTA Aging Patterns and Bianca Andreescu’s Future

Italian translation at settesei.it

Bianca Andreescu is really good, right now. Still a few months away from her 19th birthday, she has collected her first Premier Mandatory title, beaten a few top-ten players (including Angelique Kerber twice), and climbed to 7th in the Elo ratings. She is the only teenager in the WTA top 30 and one of only five in the top 100.

The burning question about Andreescu isn’t how good she is, it’s how good she could become. It’s easy to look at the best 18-year-old in the game and imagine her becoming the best 19-year-old, best 20-year-old, and so on, until she’s at her peak age and she’s the best player in the world, period. As the sport in general has gotten older, teenage champions have become rarer, so she seems all the more destined for success. But it isn’t that simple: Prospects get injured, opponents learn how to beat them, they peak early and fizzle out. Tennis history is littered with teen starlets who failed to reach their potential.

Building an aging curve

Let’s start with the basics. What is the trajectory of the typical WTA career? Answering that question requires a whole slew of assumptions, so keep in mind that this is approximate. I found every player born between 1960 and 1989* who played at least five full** seasons, a total of about 500 players. For each one, I calculated her year-end Elo for every full season she played, as well as the difference between that year’s Elo and her peak year-end Elo.

* I wish we knew more about players born in the 1990s, since their experience is most relevant to today’s teens, but many of them have yet to reach their peaks, whenever that will be.

** I’ve defined a full season very broadly, as 20 or more completed matches at the ITF $50K level or higher.
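The bookkeeping described above can be sketched as follows. The ratings here are made up for two hypothetical players, purely to show the shape of the calculation; the function name, the input layout, and the `min_seasons` parameter are my own assumptions rather than the code behind this post.

```python
from collections import defaultdict


def aging_curve(careers, min_seasons=5):
    """Average gap between year-end Elo and career-peak year-end Elo, by age.

    `careers` maps a player to {age: year_end_elo}, covering full seasons only.
    """
    gaps = defaultdict(list)
    for seasons in careers.values():
        if len(seasons) < min_seasons:
            continue  # skip players without enough full seasons
        peak = max(seasons.values())
        for age, elo in seasons.items():
            gaps[age].append(elo - peak)
    return {age: sum(g) / len(g) for age, g in sorted(gaps.items())}


# Invented ratings for illustration only; not real players or real data.
toy = {
    "Player A": {18: 1700, 19: 1760, 20: 1800, 21: 1790, 22: 1780},
    "Player B": {18: 1650, 19: 1640, 20: 1720, 21: 1750, 22: 1740},
}
print(aging_curve(toy))
# {18: -100.0, 19: -75.0, 20: -15.0, 21: -5.0, 22: -15.0}
```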

For every player, then, we have an idea of how they aged. To get our bearings, let’s look at a couple of players with unique aging trajectories: Martina Navratilova and Venus Williams:

(Martina’s peak was about 50 Elo points higher than Venus’s, but I set them equal to each other for the purpose of this graph.)

Venus peaked at age 21 and had her last all-time-great-level season at 23, while Martina’s peak came at age 30. There’s more than one way to amass a Hall of Fame career, and it’s important to keep in mind that “average” aging patterns hide a lot of more extreme possibilities.

The usual route

When we take Venus’s and Martina’s trajectories and average them with the other 500-or-so players in our dataset, here’s what we get:

The most common peak age is 24, with 23 a very close second. In the above graph, I set peak Elo at 1,820, the average peak Elo of the players I looked at, but the absolute number isn’t important. The typical player who completes a full season at age 18 is about 70 Elo points away from her peak. There isn’t much downward movement in the 20s; at age 30, those players who are still active are only 43 Elo points below their peak.

There’s a poison pill in that last sentence that is difficult to avoid when analyzing aging patterns–we only know what happens to those players who are still active. That’s even more troublesome for young players. Venus, for instance, improved 211 Elo points between her year-end finish as an 18-year-old and her best year-end rating. Kerber, on the other hand, wasn’t even good enough to show up in the ratings until she was 19. If we were able to estimate Kerber’s level at that age, it would probably be very low. Thus, forecasting an 18-year-old using this dataset may understate the degree to which a player can improve.

Changing times

Using the numbers above, we can make a baseline estimate. Those players who had year-end Elo ratings as 18-year-olds typically improved about 70 more points before hitting their peak. Through her Indian Wells title, Andreescu is rated at 2,017, giving us an estimated peak of 2,087. That’s good enough for 2nd place on the current list and just inside the top 50 of all time (as measured by the player’s best year-end Elo). Still, that seems a bit modest–it doesn’t represent much of an additional improvement for a player who has come so far in just a few months.

The forecast is slightly more optimistic if we narrow our view to players born in the 1980s. It seems like a reasonable thing to do, because Andreescu is facing an era with older competition, more like the last decade than, say, the one faced by players born in the 1960s. Our dataset shrinks to about 200 players, and those players do show a bigger gap between their 18-year-old Elo rating and their career peak. The difference is about 83 points, giving Bianca a revised estimated peak of 2,100–exactly even with Simona Halep, who currently tops the list, and around the 40th best of all time.

The biggest difference in the overall aging curve and the curve for players born in the 1980s isn’t the timing of the peak, it’s the duration. I looked at several age cohorts, and the typical WTA peak is always at 23 or 24 years old. But there’s more to it than that. Take a look at the trajectory of players born in the 1960s compared to those born in the 1980s:

For the more recent generation of players, there is little difference between age 23 and 28 or 29. Even into the early 30s, those players who stick around are competing almost as well as they did at their peak.

Bespoke for Bianca

Aging patterns in women’s tennis have changed, so it’s important to look at a relevant era when there’s enough data to do so. But what if that’s not the best way to narrow our view? As I’ve noted, the average peak Elo of the 500 players in our dataset is 1820. Bianca is already 200 points higher than that. What if the best players are qualitatively different as well as quantitatively superior?

Here are 20 players whose year-end Elo ratings at age 18 were similar to Andreescu’s current rating, the ten closest who were higher and the ten closest who were lower:

Player                     Birth Year  18yo Elo  Peak Elo  
Jelena Dokic                     1983      2110      2110  
Conchita Martinez                1972      2085      2191  
Arantxa Sanchez Vicario          1971      2084      2314  
Hana Mandlikova                  1962      2071      2160  
Iva Majoli                       1977      2067      2067  
Belinda Bencic                   1997      2066      2066  
Caroline Wozniacki               1990      2059      2194  
Lindsay Davenport                1976      2053      2353  
Nicole Vaidisova                 1989      2043      2121  
Manuela Maleeva Fragniere        1967      2035      2059  
---                                                        
Mary Pierce                      1975      2008      2161  
Ana Ivanovic                     1987      1994      2133  
Victoria Azarenka                1989      1986      2270  
Anke Huber                       1974      1980      2072  
Magdalena Maleeva                1975      1961      2024  
Agnieszka Radwanska              1989      1957      2116  
Mary Joe Fernandez               1971      1955      2110  
Anna Kournikova                  1981      1954      2020  
Kathy Rinaldi Stunkel            1967      1947      1947  
Justine Henin                    1982      1946      2411

Both halves of the list include some of the greatest of all time: Arantxa Sanchez Vicario, Lindsay Davenport, Victoria Azarenka, and Justine Henin. Yet several of these players failed to build on their early-career peaks, such as Jelena Dokic and (so far, at least) Belinda Bencic.

The average 18-year-old year-end Elo of these 20 players is 2,018, virtually the same as Andreescu’s post-Indian Wells level. The average peak year-end Elo of these 20 players is 2,145, a 127-point improvement and a more optimistic forecast than anything we’ve seen so far. That rating would put her a tick above Ana Ivanovic at her best, a bit below Hana Mandlikova at hers, and just inside the 30 greatest of all time.
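The two averages can be reproduced directly from the table. The Python sketch below copies the 20 rows as printed; the variable names are illustrative, not part of the original analysis.

```python
# (age-18 year-end Elo, peak year-end Elo) for the 20 players in the table.
COMPARABLES = {
    "Jelena Dokic": (2110, 2110),
    "Conchita Martinez": (2085, 2191),
    "Arantxa Sanchez Vicario": (2084, 2314),
    "Hana Mandlikova": (2071, 2160),
    "Iva Majoli": (2067, 2067),
    "Belinda Bencic": (2066, 2066),
    "Caroline Wozniacki": (2059, 2194),
    "Lindsay Davenport": (2053, 2353),
    "Nicole Vaidisova": (2043, 2121),
    "Manuela Maleeva Fragniere": (2035, 2059),
    "Mary Pierce": (2008, 2161),
    "Ana Ivanovic": (1994, 2133),
    "Victoria Azarenka": (1986, 2270),
    "Anke Huber": (1980, 2072),
    "Magdalena Maleeva": (1961, 2024),
    "Agnieszka Radwanska": (1957, 2116),
    "Mary Joe Fernandez": (1955, 2110),
    "Anna Kournikova": (1954, 2020),
    "Kathy Rinaldi Stunkel": (1947, 1947),
    "Justine Henin": (1946, 2411),
}

n = len(COMPARABLES)
avg_at_18 = sum(at_18 for at_18, _ in COMPARABLES.values()) / n
avg_peak = sum(peak for _, peak in COMPARABLES.values()) / n
avg_gain = avg_peak - avg_at_18  # average improvement from age 18 to peak

print(round(avg_at_18), round(avg_peak), round(avg_gain))
```

Note that this is a simple unweighted mean: a single outlier such as Henin, who gained more than 450 points, pulls the average gain up noticeably.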

This is heady stuff for a teenager, but after watching her ascent this year, it’s tough to bet against her. And as long as Kerber is in the draw, apparently, we can expect Andreescu to keep winning.

Belinda Bencic Won a Historically Difficult Title, Just Not Last Week

Italian translation at settesei.it

Belinda Bencic is back among the WTA elites. Last week in Dubai, she won her first Premier-level title since 2015, knocking out four top-ten players in the process. They were hardly dominant victories, with all four going to deciding sets and two of the four culminating in final-set tiebreaks, but there is no question that the 21-year-old Swiss is once again a threat at the tour’s biggest events.

Her string of top-ten victories leaves us to wonder how her title stacks up against similar feats in the past. Most relevant is the path Bencic took to her last Premier title, the 2015 Canadian Open. Four years ago in Toronto, she defeated four members of the top six, including then-top-ranked Serena Williams in the semi-final and Simona Halep in the championship match. Even the two lower-ranked opponents she faced that week were dangerous players then ranked in the top 25, Eugenie Bouchard and Sabine Lisicki. Those two presented more serious challenges than Bencic’s first two matches last week against Lucie Hradecka and Stefanie Voegele.

Spoiler alert: Toronto was the tougher path. It wasn’t the most difficult of all time, but it’s in the conversation. Bencic’s Dubai title surely wasn’t easy, but it wasn’t quite as unusual as last weekend’s press made it out to be.

Quantifying path difficulty

This is something we’ve done before. I’ve written several articles comparing the quality of opposition faced in slams, particularly as it applies to the ATP’s big three. It’s more complicated to compare all WTA events, in part because there are so many different levels of tournament, and the categorizations have changed over the years. But we can wave some of that aside for today’s purposes.

Here’s the simple algorithm to measure the difficulty of a player’s path to a title:

  • Pick a standard Elo rating for the type of tournament won. (In this case, we’re using 1900 for hard-court wins. We’d use lower numbers for clay and grass, but it gets complicated, and it’s more practical for today’s purposes to focus solely on hard-court events.)
  • Find the surface-weighted Elo rating of each opponent she played in the tournament.
  • For each opponent, calculate the odds that a player with the standard Elo rating would beat a player with the opponent’s Elo rating.
  • Calculate the difficulty for each match as one minus the odds in the previous step.
  • Sum the single-match difficulties.
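The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact code behind the numbers in this post: it assumes the conventional Elo win expectation on a 400-point logistic scale, and the opponent ratings in the example draw are invented.

```python
STANDARD_ELO = 1900  # hard-court benchmark from the first step


def win_probability(player_elo: float, opponent_elo: float) -> float:
    """Conventional Elo expectation (400-point logistic scale, assumed here)."""
    return 1 / (1 + 10 ** ((opponent_elo - player_elo) / 400))


def match_difficulty(opponent_elo: float, standard_elo: float = STANDARD_ELO) -> float:
    """One minus the odds that a standard-rated player beats this opponent."""
    return 1 - win_probability(standard_elo, opponent_elo)


def path_difficulty(opponent_elos, standard_elo: float = STANDARD_ELO) -> float:
    """Sum of the single-match difficulties across a title run."""
    return sum(match_difficulty(elo, standard_elo) for elo in opponent_elos)


# Hypothetical six-match draw: these surface-weighted ratings are made up.
example_draw = [1750, 1850, 1950, 2000, 2050, 2100]
print(round(path_difficulty(example_draw), 2))
```

An opponent rated exactly 1900 contributes 0.5 to the total, so a path gets harder both by adding matches and by facing higher-rated opponents.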

In the grand slam exercises I’ve done in the past, I’ve taken a final step of normalizing the results so that an average major title is exactly 1.0. Here, the idea of ‘average’ is more nebulous, so we’ll leave our results un-normalized.

The average difficulty of a hard-court title (excluding majors and year-end championships) is about 1.8. Bencic’s 2015 Toronto run was 3.64, and her path last week was 3.01.

It’s hotter in Miami (and Indian Wells)

One of the variables that influences path difficulty is the number of matches. Bencic played six last week (as she did at the 2015 Canadian Open), but the top eight seeds played only five. At Indian Wells and Miami, the top 32 seeds play up to six matches, but those might be expected to present more challenges than Bencic’s six in Dubai, since the round-of-64 opponent has already won a match.

Certainly it has turned out that way. Here are the top ten most difficult hard-court WTA title paths since 2000:

Year  Event          Winner             Matches  Difficulty  
2010  Miami          Kim Clijsters            6        3.80  
2011  Miami          Victoria Azarenka        6        3.78  
2007  Miami          Serena Williams          6        3.65  
2015  Canadian Open  Belinda Bencic           6        3.64  
2012  Indian Wells   Victoria Azarenka        6        3.59  
2018  Cincinnati     Kiki Bertens             6        3.54  
2000  Miami          Martina Hingis           6        3.46  
2002  Miami          Serena Williams          6        3.45  
2008  Miami          Serena Williams          6        3.37  
2013  Miami          Serena Williams          6        3.35

Seven of the ten are from Miami, an event with a grand-slam-like field. Indian Wells is similar, but featured a weaker draw for most of the 21st century because Serena and Venus Williams chose not to play there. Bencic’s Toronto run is one of only two in the top ten outside of the March sunshine swing. The other is Kiki Bertens’s path to last year’s Cincinnati title, in which she also defeated Halep, Petra Kvitova, and Elina Svitolina, albeit not in quite the same order as Bencic did last week.

Also hot in Dubai

I calculated title difficulty for about 600 hard-court champions going back to 2000. Bencic’s Dubai path doesn’t register among the very most challenging, but it still stands above most of the pack. Here are the next 25 toughest routes, including every path rated a 3.0 or above:

Year  Event         Winner              Matches  Difficulty  
2016  Wuhan         Petra Kvitova             6        3.32  
2000  Indian Wells  Lindsay Davenport         6        3.32  
2014  Beijing       Maria Sharapova           6        3.30  
2008  Olympics      Elena Dementieva          6        3.27  
2009  Indian Wells  Vera Zvonareva            6        3.27  
2007  Indian Wells  Daniela Hantuchova        6        3.23  
2002  Filderstadt   Kim Clijsters             5        3.23  
2013  Beijing       Serena Williams           6        3.21  
2018  Doha          Petra Kvitova             6        3.18  
2002  Los Angeles   Chanda Rubin              5        3.18  
2000  Los Angeles   Serena Williams           5        3.16  
2009  Miami         Victoria Azarenka         6        3.15  
2003  Miami         Serena Williams           6        3.13  
2002  Indian Wells  Daniela Hantuchova        6        3.10  
2018  Wuhan         Aryna Sabalenka           6        3.08  
2008  Indian Wells  Ana Ivanovic              6        3.08  
2012  Tokyo         Nadia Petrova             6        3.08  
2010  Sydney        Elena Dementieva          5        3.06  
2010  Indian Wells  Jelena Jankovic           6        3.03  
2000  Olympics      Venus Williams            6        3.02  
2000  Sydney        Amelie Mauresmo           4        3.02  
2019  Dubai         Belinda Bencic            6        3.01  
2009  Tokyo         Maria Sharapova           6        3.00  
2002  San Diego     Venus Williams            5        3.00  
2001  Sydney        Martina Hingis            4        2.99

There’s Belinda again, at 32nd overall. Historically, the February tournaments in the Gulf haven’t been the toughest on the calendar, at least compared with Indian Wells, Miami, and Sydney. Yet Kvitova took an even more difficult path to the title last year in Doha. (Dubai and Doha trade tournament levels each year. As a Premier 5, Doha was worth more points in 2018; Dubai took over the status and was worth more points in 2019.) Kvitova also plowed through four top-ten opponents, and she needed to beat 33rd-ranked Agnieszka Radwanska just to earn a place in the round of 16.

Strong but weaker

Again, Bencic’s Dubai title was an impressive feat. But as we’ve seen, it pales in comparison with her previous Premier title. I suppose she might have won anyway if faced with more difficult competition, but that pair of third-set tiebreaks suggests she was pushed to the limit as it was.

While the current WTA field is extremely deep, packed with very good players, the lack of one historically great superstar (or more!) shows up in the Elo ratings. Of the 35 champions shown in the two tables above, 12 had to beat a player with a surface-weighted rating of 2240 or higher, and 12 more needed to get past an opponent rated 2100 or above. Bencic’s toughest task last week was Halep, at 2054. While it isn’t easy to knock off several consecutive foes in the 2000 range, it’s not the same as including one victory over a superstar like Serena, Venus, Maria Sharapova, or Victoria Azarenka at her peak.

At the 2015 Canadian Open, Bencic counted Serena among the vanquished. Maybe in another four years, when the Swiss is due for her next odds-defying Premier title, she’ll face down a couple of new young superstars and earn a place at the top of this list.