Podcast Episode 87: Author Sasha Abramsky on Lottie Dod, the Little Wonder

Episode 87 of the Tennis Abstract Podcast features Sasha Abramsky, author of the book Little Wonder: The Fabulous Story of Lottie Dod, the World’s First Female Sports Superstar.

Our wide-ranging chat covers many aspects of the life and times of this 19th-century superstar, from her global legions of fans, to her “Battle of the Sexes”-style challenges 80 years before King-Riggs, to her unprecedented and varied string of sporting successes. We also touch on the relative dearth of tennis historiography, the chronological gap between Dod and the next generation of female athletic superstars, and whether there is a natural intersection between progressive politics and the compelling stories of tennis history.

This was a great conversation about a part of tennis history we don’t hear nearly enough about, so I hope you’ll check it out. And for the full account of Lottie Dod, be sure to pick up your copy of Sasha’s book.

Fans of the TA podcast will also want to check out Dangerous Exponents, the new Covid-19 podcast that Carl Bialik and I are doing. We released episode 3 yesterday.

(Note: this week’s episode is about 60 minutes long; in some browsers the audio player may display a different length. Sorry about that!)

Click to listen, subscribe on iTunes, or use our feed to get updates on your favorite podcast software.

Did Jimmy Connors Choke in the 1975 Wimbledon Final?

From our vantage point almost a half-century later, it’s easy to forget just how big an upset Arthur Ashe scored with his 1975 Wimbledon victory over Jimmy Connors. Connors was the top seed and defending champion, still riding high from a 1974 campaign that ranks among the best ever. Ashe was a few days short of his 32nd birthday, had a reputation for coming up short in finals, and had lost to Connors in their three previous meetings.

(For what it’s worth, my Elo algorithm thinks it was a much closer match than the bookies did at the time. It rated Ashe the second-best player in the tournament on grass courts, and gave the underdog a 39% chance of winning.)

Ashe ran away with the first two sets and held on to win in four, 6-1 6-1 5-7 6-4. Perhaps because the two men didn’t get along–apart from striking personality differences, Connors and his manager targeted Ashe with one of many lawsuits–the veteran was uncharacteristically critical of his opponent after the match. Ashe claimed that Connors missed many of his shots into the net (rather than long), a sign of choking.

Connors denied it, of course. It later came out that Jimmy was dealing with a foot problem which probably affected his play that day. In any case, fans and pundits surely had their fun debating whether Connors was a choker. I don’t know of anyone who took the question beyond simple speculation. No amount of statistical analysis can settle whether a player choked, but we can often answer adjacent questions to shed more light on the issue.

Counting errors

A couple of years ago I charted the Wimbledon final for the Match Charting Project, so we have a full count of errors–forced and unforced, serves and rallying shots, net and deep–for the entire match. We also have similar shot-by-shot stats for 25 other Connors matches for comparison. (Unfortunately, 24 of the 25 are chronologically later than the Ashe match, because there’s not much full-match footage from the early 70s.)

Here’s the tally: Excluding serves, Connors committed 13 unforced errors, 10 of them into the net. I recorded the type of error for 65 more forced errors: 32 into the net, 33 other. (Ashe was a netrusher, so many of Jimbo’s mistakes were failed passing shots.) On serve, he missed 29 first deliveries: 16 into the net, 13 otherwise. And his two second serve faults were split between one into the net and one elsewhere.

The unforced error split of 10-to-3 means that 77% of his UFEs were netted. That’s the most extreme of any of his charted matches; on average, his unforced errors were half nets, half others. While suggestive, that’s an awfully small sample from which to draw any conclusions.

Using larger samples that include forced errors and serves, the Wimbledon final doesn’t particularly stand out among other charted Connors matches. 54% of his non-serve errors (forced or unforced) in that match were netted, compared to 52% over the whole sample. 55% of his service faults against Ashe were hit into the net, versus 49% across the 26 matches. Altogether, Connors made 54% of his total errors and faults into the net in the Wimbledon final, compared to 51% in the broader sample.
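The net percentages for the Ashe match follow directly from the tallies above. Here is a minimal sketch of that arithmetic, using only the counts I reported for the final; the dictionary keys and the function name are my own labels for illustration, not anything from the Match Charting Project.

```python
# Connors's misses in the 1975 Wimbledon final, as (into the net, elsewhere).
# The counts are the ones tallied above; the key names are illustrative.
errors = {
    "unforced": (10, 3),
    "forced": (32, 33),
    "first_serve_fault": (16, 13),
    "second_serve_fault": (1, 1),
}


def net_share(*kinds):
    """Percentage of the selected error types that were hit into the net."""
    net = sum(errors[k][0] for k in kinds)
    total = sum(sum(errors[k]) for k in kinds)
    return 100 * net / total


print(round(net_share("unforced")))                                  # 77
print(round(net_share("unforced", "forced")))                        # 54
print(round(net_share("first_serve_fault", "second_serve_fault")))   # 55
print(round(net_share(*errors)))                                     # 54
```

The same function applied to the other charted matches would give the whole-sample baselines, but those per-match counts aren't reproduced here.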

Does it matter?

You’ve probably heard the tennis coaching conventional wisdom that it’s better to hit long than to hit into the net. Like most tennis shibboleths, this one has been around for a very long time. Ashe had surely heard it, which partly explains why he made the comment he did. Arthur didn’t have a printout with match stats generated by a consulting company with a gargantuan marketing budget, so he probably recalled a few key points and generalized from there.

If error types matter, we’d expect to see at least a mild correlation between results (say, percentage of points won) and error types. Let’s stay focused on the 26 charted Connors matches for today’s purposes. Here’s a version of the Ashe hypothesis, stripped of emotional content:

When Connors hits more errors than usual into the net, it’s a sign that he’s playing below his standard level.

It turns out that this theory is wrong–or, at best, possibly correct if narrowly defined. I considered five main stats as indicators of errors and faults going into the net:

  • Unforced errors (excluding double faults) into the net as a percentage of total unforced errors
  • Total rally errors (forced and unforced) into the net as a percentage of total errors
  • First serve faults into the net as a percentage of total first serve faults
  • All serve faults into the net as a percentage of all serve faults
  • All errors and faults into the net as a percentage of all errors and faults

The second (total rally errors) and last (all errors and faults) seem like the most valid of the five, because they give us a decent sample of error types for each match. There is almost exactly zero correlation between the last stat and total points won. And there is a very weak negative correlation (r^2 = 0.05) between the second stat and total points won.
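For readers who want to run this kind of check on their own charted matches, here is a rough sketch of the r^2 calculation. It assumes two equal-length lists, one value per match: the into-the-net share and the percentage of total points won. The function name is mine, and the numbers in the demo are synthetic sanity-check values, not Connors data.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Synthetic sanity checks only: a perfectly linear pair and an unrelated pair.
print(r_squared([1, 2, 3, 4], [2, 4, 6, 8]))    # 1.0
print(r_squared([1, 2, 3, 4], [1, -1, -1, 1]))  # 0.0
```

An r^2 of 0.05 sits very close to the second case: the net share explains about 5% of the match-to-match variation in points won.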

In other words, the Ashe hypothesis might be on to something very minor if our focus is on rally shots. But the correlation is so weak that no human observer would ever notice it, unless they lucked into it by watching a few confirming key moments after being primed by the conventional wisdom.

He didn’t choke like that

I said above that statistical analysis couldn’t settle issues like whether a player choked. We can study what happened, but without machines hooked up to a player’s brain, we can’t tell what was going on inside their heads that might have caused it.

So we can’t say that Connors didn’t choke in the 1975 Wimbledon final. But we have seen that his percentage of into-the-net errors wasn’t that unusual for him (except for the small sample of unforced errors), and we’ve recognized that the number of mistakes he made into the net didn’t have much to say about his level of play that day. If Connors choked, then, it didn’t have anything to do with the low trajectory of his missed shots.

I learned of Ashe’s post-match comment in Raymond Arsenault’s excellent biography, Arthur Ashe: A Life.

Roger Federer Wasn’t Clutch, But He Was Almost Clutch Enough

Italian translation at settesei.it

The stats from the Wimbledon final told a clear story. Over five sets, Roger Federer did most things slightly better than did his opponent, Novak Djokovic. Djokovic claimed a narrow victory because he won more of the most important points, something that doesn’t show up as clearly on the statsheet.

We can add to the traditional stats and quantify that sort of clutch play. A method that goes beyond simply counting break points or thinking back to obviously key moments is to use the leverage metric to assign a value to each point, according to its importance. After every point of the match, we can calculate an updated probability that each player will emerge victorious. A point such as 5-all in a tiebreak has the potential to shift the probability a great deal; 40-15 in the first game of the match does not.

Leverage quantifies that potential. The average point in a best-of-five match has a leverage of about 4%, and the most important points are several times that. Another way of saying that a player is “clutch” is that he is winning a disproportionate number of high-leverage points, even if he underwhelms at low-leverage moments.
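To make the idea concrete, here is a deliberately simplified sketch. It measures leverage within a single service game rather than a whole match: the swing in the server's chance of holding depending on whether he wins or loses the next point. The match-level figures in this post need a full win-probability model, which this is not. The function names and the constant point-win probability p are assumptions for illustration.

```python
from functools import lru_cache


def game_win_prob(p, a=0, b=0):
    """Chance the server wins the game from a-b (points won by server-returner),
    assuming he wins every point independently with probability p."""

    @lru_cache(maxsize=None)
    def wp(a, b):
        if a >= 4 and a - b >= 2:
            return 1.0
        if b >= 4 and b - a >= 2:
            return 0.0
        if a >= 3 and a == b:
            # Deuce: closed form for "win two points in a row before he does".
            return p * p / (p * p + (1 - p) * (1 - p))
        return p * wp(a + 1, b) + (1 - p) * wp(a, b + 1)

    return wp(a, b)


def game_leverage(p, a, b):
    """Swing in game-win probability riding on the next point."""
    return game_win_prob(p, a + 1, b) - game_win_prob(p, a, b + 1)


print(game_leverage(0.5, 3, 1))  # 40-15: 0.25
print(game_leverage(0.5, 2, 3))  # 30-40 (break point): 0.5
```

Even in this toy version the pattern matches intuition: the break point carries twice the leverage of 40-15.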

Leverage ratio

In my match recap at The Economist, I took that one step further. While Djokovic won fewer points than Federer did, his successes mattered more. The average leverage of Djokovic’s points won was 7.9%, compared to Federer’s 7.2%. We can represent that difference in the form of a leverage ratio (LR), by dividing 7.9% by 7.2%, for a result of 1.1. A ratio of that magnitude is not unusual. In the 700-plus men’s grand slam matches in the Match Charting Project, the average LR of the more clutch player is 1.11. Djokovic’s excellence in key moments was not particularly rare, but in a close match such as the final, it was enough to make the difference.
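The ratio itself is simple to compute once each point has a leverage value. A minimal sketch, assuming two lists holding the leverage of every point each player won (the function and argument names are hypothetical):

```python
def leverage_ratio(lev_won_a, lev_won_b):
    """Average leverage of the points A won, divided by the same figure for B."""
    mean_a = sum(lev_won_a) / len(lev_won_a)
    mean_b = sum(lev_won_b) / len(lev_won_b)
    return mean_a / mean_b


# With only the match averages quoted above (7.9% and 7.2%), the ratio is:
print(round(leverage_ratio([0.079], [0.072]), 2))  # 1.1
```

Passing single-element lists is just a shortcut for plugging in the published averages; with real point-by-point data each list would hold one entry per point won.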

Recording a leverage ratio above 1.0 is no guarantee of victory. In about 30% of these 700 best-of-five matches, a player came out on top despite winning–on average–less-important points than his opponent did. Some of the instances of low-LR winners border on the comical, such as the 2008 French Open final, in which Rafael Nadal drubbed Federer despite a LR of only 0.77. In blowouts, there just isn’t that much leverage to go around, so the number of points won matters a lot more than their timing. But un-clutch performances often translate to victory even in closer matches. Andy Murray won the 2008 US Open semi-final over Nadal in four sets despite a LR of 0.80, and in a very tight Wimbledon semi-final last year, Kevin Anderson snuck past John Isner with a LR of 0.88.

You don’t need a spreadsheet to recognize that tennis matches are decided by a mix of overall and clutch performance. The numbers I’ve shown you so far don’t advance our understanding much, at least not in a rigorous way. That’s the next step.

DR, meet BLR

Regular users of Tennis Abstract player pages are familiar with Dominance Ratio (DR), a stat invented by Carl Bialik that re-casts total points won. DR is calculated by dividing a player’s rate of return points won by his rate of service points lost (his opponent’s rate of return points won), so the DR for a player who is equal on serve and return is exactly 1.0.
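As a quick sketch of that definition, assuming both inputs are expressed as rates between 0 and 1 (the function and argument names are mine):

```python
def dominance_ratio(return_pts_won_rate, serve_pts_won_rate):
    """DR: rate of return points won divided by rate of service points lost."""
    return return_pts_won_rate / (1 - serve_pts_won_rate)


# A player who wins 65% of his service points and 35% of his return points
# loses serve points exactly as often as he wins return points, so DR is 1.0.
print(round(dominance_ratio(0.35, 0.65), 2))  # 1.0
```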

Winners are usually above 1.0 and losers below 1.0. In the Wimbledon final, Djokovic’s DR was 0.87, which is extremely low for a winner, though not unheard of. DR balances the effect of serve performance and return performance (unlike total points won, which can skew in one direction if there are many more serve points than return points, or vice versa) and gives us a single-number summary of overall performance.

But it doesn’t say anything about clutch, except that when a player wins with a low DR, we can infer that he outperformed in the big moments.

To get a similarly balanced view of high-leverage performance, we can adapt leverage ratio to equally weight clutch play on serve and return points. I’ll call that balanced leverage ratio (BLR), which is simply the average of LR on serve points and LR on return points. BLR usually doesn’t differ much from LR, just as we often get the same information from DR that we get from total points won. Djokovic’s Wimbledon final BLR was 1.11, compared to a LR of 1.10. But in cases where a disproportionate number of points occur on one player’s racket, BLR provides a necessary correction.

Leverage-adjusted DR

We can capture leverage-adjusted performance by simply multiplying these two numbers. For example, let’s take Stan Wawrinka’s defeat of Djokovic in the 2016 US Open final. Wawrinka’s DR was 0.90, better than Djokovic at Wimbledon this year but rarely good enough to win. But win he did, thanks to a BLR of 1.33, one of the highest recorded in a major final. The product of Wawrinka’s DR and his BLR–let’s call the result DR+–is 1.20. That number can be interpreted on the same scale as “regular” DR, where 1.2 is often a close victory if not a truly nail-biting one. DR+ combines a measure of how many points a player won with a measure of how well-timed those points were.
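Both steps are one-liners. The sketch below uses hypothetical function names; BLR takes the separately computed serve-point and return-point leverage ratios, and DR+ is the product of DR and BLR. The inputs in the demo are the rounded figures quoted in this post, so a result can occasionally differ by 0.01 from one computed on unrounded values.

```python
def balanced_leverage_ratio(lr_serve, lr_return):
    """BLR: plain average of the leverage ratios on serve and return points."""
    return (lr_serve + lr_return) / 2


def dr_plus(dr, blr):
    """DR+: Dominance Ratio scaled by balanced leverage ratio."""
    return dr * blr


print(round(dr_plus(0.90, 1.33), 2))  # Wawrinka, 2016 US Open final: 1.2
print(round(dr_plus(0.87, 1.11), 2))  # Djokovic, 2019 Wimbledon final: 0.97
```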

Out of 167 men’s slam finals in the Match Charting Project dataset, 14 of the winners emerged triumphant despite a “regular” DR below 1.0. In every case, the winner’s BLR was higher than 1.1. And in 13 of the 14 instances, the strength of the winner’s BLR was enough to “cancel out” the weakness of his DR, in the sense that his DR+ was above 1.0. Here are those matches, sorted by DR+:

Year  Major            Winner              DR   BLR   DR+  
2019  Wimbledon        Novak Djokovic    0.87  1.11  0.97  
1982  Wimbledon        Jimmy Connors     0.88  1.20  1.06  
2001  Wimbledon        Goran Ivanisevic  0.95  1.16  1.10  
2008  Wimbledon        Rafael Nadal      0.98  1.13  1.10  
2009  Australian Open  Rafael Nadal      0.99  1.13  1.12  
1981  Wimbledon        John McEnroe      0.99  1.16  1.15  
1992  Wimbledon        Andre Agassi      0.97  1.19  1.16  
1989  US Open          Boris Becker      0.96  1.22  1.18  
1988  US Open          Mats Wilander     0.98  1.21  1.18  
2015  US Open          Novak Djokovic    0.98  1.21  1.18  
2016  US Open          Stan Wawrinka     0.90  1.33  1.20  
1999  Roland Garros    Andre Agassi      0.98  1.25  1.23  
1990  Roland Garros    Andres Gomez      0.94  1.34  1.26  
1991  Australian Open  Boris Becker      0.99  1.30  1.29

167 slam finals, and Djokovic-Federer XLVIII was the first one in which the player with the lower DR+ ended up the winner. (Some of the unlisted champions had subpar leverage ratios and thus DR+ figures lower than their DRs, but none ended up below the 1.0 mark.) While Federer was weaker in the clutch–notably in tiebreaks and when he held match points–his overall performance in high-leverage situations wasn’t as awful as those few memorable moments would suggest. More often than not, a player who combined Federer’s DR of 1.14 with his BLR of 0.90 would conclude the Wimbledon fortnight dancing with the Ladies’ champion.

Surprisingly, 1-out-of-167 might understate the rarity of a winner with a DR+ below 1.0. Only one other best-of-five match in the Match Charting Project database (out of more than 700 in total) fits the bill. That’s the controversial 2019 Australian Open fourth-rounder between Kei Nishikori and Pablo Carreno Busta. Nishikori won with a 1.06 DR, but his BLR was a relatively weak 0.91, resulting in a DR+ of 0.97. Like the Wimbledon final, that Melbourne clash could have gone either way. Carreno Busta may have been unlucky with more than just the chair umpire’s judgments.

What does it all mean?

We knew that the Wimbledon final was close–now we have more numbers to show us how close it was. We knew that Djokovic played better when it mattered, and now we have more context that indicates how much better he was, which is not an unusually wide margin. Federer has won five of his slams despite title-match BLRs below 1.0, and two others with DRs below 1.14. He’s never won a slam with a DR+ of 1.03 or lower, but then again, there had never before been a major final that DR+ judged to be that close. Roger is no one’s idea of a clutch master, but he isn’t that bad. He just should’ve saved a couple of doses of second-set dominance for more important junctures later on.

If you’re anything like me, you’ll read this far and be left with many more questions. I’ve started looking at several, and hope to write more in this vein soon. Is Federer usually less clutch than average? (Yes.) Is Djokovic that much better? (Yes.) How about Nadal? (Also better.) Is Nadal really better, or do his leverage numbers just look good because important points are more likely to happen in the ad court? (No, he really is better.) Does Djokovic have Federer’s number? (Not really, unless you mean his mobile number. Then yes.) Did everything change after Djokovic hit that return? (No.)

There are many interesting related topics beyond the big three, as well. I started writing about leverage for subsets of matches a few years ago, prompted by another match–the 2016 Wimbledon Federer-Raonic semi-final–in which Roger got outplayed when it mattered. Just as we can look at average leverage for points won and lost, we can also estimate the importance of points in which a player struck an ace, hit a backhand unforced error, or chose to approach the net.

Matches are decided by a combination of overall performance and high-leverage play. Commonly-available stats do a pretty good job at the former, and fail to shine much light on the latter. The clutch part of the equation is often left to the speculation of pundits. As we build out a more complete dataset and have access to more and more point-by-point data (and thus leverage numbers for each point and match), we can close the gap, enabling us to better quantify the degree to which situational performance affects every player’s bottom line.

Will a Back-To-Normal Federer Backhand Be Good Enough?

Italian translation at settesei.it

After Roger Federer’s 2017 triumph over Rafael Nadal at the Australian Open, I credited his narrow victory to his backhand. He came back from the injury that sidelined him for the second half of 2016 having strengthened that wing, ready with the tactics necessary to use it against his long-time rival. Since that time, he has beaten Nadal in five out of six meetings, suggesting that the new-and-improved weapon has remained a part of his game.

The Swiss is riding high after defeating Rafa once again in the Wimbledon semi-finals on Friday. But unlike in Melbourne two-and-a-half years ago, the backhand wasn’t responsible for the victory. In the Australian Open final, Federer’s stylish one-hander earned him 11 more points than in a typical contest, enough to flip the result in his favor. On Friday, Nadal had little reason to fear a Federer backhand that was only a single point better than average. The Swiss owes his semi-final result to some stellar play, but not from his backhand.

BHP redux

I’m deriving these numbers from a stat called Backhand Potency (BHP), which uses Match Charting Project shot-by-shot data to isolate the effect of each one of a player’s shots. The formula is straightforward:

[A]dd one point for a winner or an opponent’s forced error, subtract one for an unforced error, add a half-point for a backhand that set up a winner or opponent’s error on the following shot, and subtract a half-point for a backhand that set up a winning shot from the opponent. Divide by the total number of backhands, multiply by 100, and the result is net effect of each player’s backhand.

The average player hits about 100 backhands per match, so the final step of multiplying by 100 gives us an approximate per-match figure. BHP hands out up to 1.5 “points” per tennis point, since credit is given for both a winning shot and the shot that set it up. Thus, to translate BHP (or any other potency metric, like Forehand Potency, FHP) to points, multiply by two-thirds. In the 2017 Australian Open final, Federer’s backhand was worth +17 BHP, equal to about 11 points.
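Here is a minimal sketch of the formula and the two-thirds conversion. The argument names are my own, and the counts in the first example are invented purely to show the arithmetic; only the +17 figure comes from an actual match.

```python
def backhand_potency(winners_or_forced, unforced, setups, opp_setups, total_backhands):
    """BHP per 100 backhands.

    winners_or_forced: backhands that were winners or forced an error (+1 each)
    unforced:          backhand unforced errors (-1 each)
    setups:            backhands that set up a winning next shot (+0.5 each)
    opp_setups:        backhands that set up an opponent's winning shot (-0.5 each)
    """
    net = winners_or_forced - unforced + 0.5 * setups - 0.5 * opp_setups
    return 100 * net / total_backhands


def potency_to_points(potency):
    """Rough points equivalent, assuming roughly 100 backhands in the match."""
    return potency * 2 / 3


# Invented counts, for illustration only:
print(backhand_potency(10, 5, 4, 2, 100))  # 6.0

# Federer's +17 BHP in the 2017 Australian Open final:
print(round(potency_to_points(17)))        # 11
```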

On Friday, Roger’s backhand was worth only +1 BHP. The best thing we can say about that is that it didn’t hold him back–the sort of comment we might have made as he racked up wins for the first 15 years of his career.

The semi-final performance wasn’t an outlier. In a year-to-year comparison based on the available (admittedly incomplete) MCP data, the 2019 backhand looks an awful lot like the pre-injury backhand:

Year(s)     BHP  
1998-2011  +0.1  
2012       +0.4  
2013       -1.8  
2014       -1.1  
2015       +1.3  
2016       -0.3  
2017       +3.5  
2018       +1.3  
2019       +0.8

There are still good days, like Fed’s whopping +16 BHP against Kei Nishikori in this week’s quarter-finals. But when we tally up all the noise of good and bad days, effective and ineffective opponents, and fast and slow conditions, the net result is that the backhand just doesn’t rack up points the way it did two years ago.

The backhand versus Novak

Federer’s opponent in today’s final, Novak Djokovic, is known for his own rock-solid groundstrokes. Like Nadal did for many years, Djokovic is able to expose the weaker side of Federer’s baseline game. The Serbian has won the last five head-to-head meetings, and nine of the last eleven. In most of those, he reduces Roger’s backhand to a net negative:

Year  Tournament        Result  BHP/100  
2018  Paris             L         -11.0  
2018  Cincinnati        L         -11.0  
2016  Australian Open   L         -12.6  
2015  Tour Finals (F)   L          -4.8  
2015  Tour Finals (RR)  W          +0.7  
2015  US Open           L          +0.8  
2015  Cincinnati        W          -2.2  
2015  Wimbledon         L         -13.4  
2015  Rome              L         -12.2  
2015  Indian Wells      L          -5.0  
2015  Dubai             W          -5.9  
…                                        
2014  Wimbledon         L          -3.1  
2012  Wimbledon         W          +9.6

Out of 438 charted matches, Federer’s BHP was below -10 only 27 times. On nine of those occasions–and two of the five since Fed’s 2017 comeback–the opponent was Djokovic. Incidentally, Novak would do well to study how Borna Coric dismantles the Federer backhand, as Fed suffered his two worst post-injury performances (-20 at 2018 Shanghai, and -19 at 2019 Rome) against the young Croatian.

It is probably too much to ask for Federer to figure out how to beat Djokovic at his own game. The best he can do is minimize the damages by serving big and executing on the forehand. The Swiss has a career average +9 Forehand Potency (FHP), but falls to only +4 FHP against Novak. In last year’s Cincinnati final, Djokovic reduced his opponent to an embarrassing -13 FHP, the worst of his career. It wasn’t a fluke: four of Fed’s five worst single-match FHP numbers have come against the Serb.

If Federer is to win a ninth Wimbledon title, he’ll need to rack up points on at least one wing–either his typical forehand, or the backhand in the way he did against Djokovic in the 2012 semi-final. Whichever one does the damage, he’ll also need the other one to remain steady. His forehand was plenty effective in the semi-final against Nadal, worth +12 FHP in that match. Against a player like Novak who defends even better on a fast surface, Federer will need to somehow tally similar results. It’s a lot to ask, and one thing is certain: No one would be able to complain that his 21st major title came cheaply.

The Effect of Serena’s Serve Speed

Italian translation at settesei.it

Yesterday at FiveThirtyEight, Tom Perrotta highlighted the relationship between Serena Williams’s first serve performance and her chances of winning. According to the article, Serena has won only (“only”) 74% of her first serve points over the fortnight, compared to an outlandish 87.5% when she won the title in 2010. She has never won Wimbledon while winning fewer than 75% of her first-serve points, and even the three-quarters mark is no guarantee, as she topped 77% last year en route to a second-place finish.

A lot of factors go into first-serve winning percentage, including serve placement, serve tactics, and all the shots that a player hits when the return comes back. The most obvious, though, is another category in which Serena has often topped the charts: serve speed. When Williams beat Garbine Muguruza to win the Championships in 2015, her average first serve clocked in at 113 miles per hour, the third straight match in which her typical first delivery topped 111 mph. Over her last 13 matches, she has averaged only (“only”) 106.4 mph, never exceeding 109 mph in a single contest.

How much does it matter?

It seems fair to assume that, all else equal, a faster serve is more effective than a slower one. Complicating things is the fact that all else is rarely equal: wide serves are often deadly despite requiring less raw power, more conservative serves can be easier to place, and we haven’t even scratched the surface of the effect of spin. A faster serve isn’t always better than a slower one. But on average, the basic assumption holds true.

For each of Serena’s 23 matches at Wimbledon 2014, 2015, 2018, and 2019 (she didn’t play in 2017, and I don’t have the relevant data at hand for 2016–don’t ask), I split her first serve points into quintiles, ranked from fastest serves to slowest serves. This is a crude way of controlling for the effects of different opponents and giving us an initial sense of how much Serena’s serve speed influences the outcome of first-serve points:

Quintile     1SP W%  Avg MPH  
Fastest       80.6%    116.9  
2nd fastest   73.7%    112.2  
Middle        79.5%    108.0  
2nd slowest   73.7%    103.7  
Slowest       74.9%     98.1
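The quintile split can be sketched as follows. This is my reading of the procedure rather than the exact script behind the table: rank each match's first-serve points from fastest to slowest, cut them into five roughly equal groups, then pool the groups across matches. The input format, a list of (mph, won) pairs per match, is an assumption, and the demo values are made up.

```python
def quintile_summary(matches):
    """matches: list of matches, each a list of (mph, won) first-serve points.

    Returns five (win_pct, avg_mph) tuples, fastest quintile first.
    A quintile with no points yields None.
    """
    pooled = [[] for _ in range(5)]
    for points in matches:
        ranked = sorted(points, key=lambda pt: pt[0], reverse=True)
        n = len(ranked)
        for q in range(5):
            # Integer cut points keep the five groups as even as possible.
            pooled[q].extend(ranked[q * n // 5:(q + 1) * n // 5])

    summary = []
    for group in pooled:
        if not group:
            summary.append(None)
            continue
        win_pct = 100 * sum(1 for _, won in group if won) / len(group)
        avg_mph = sum(mph for mph, _ in group) / len(group)
        summary.append((win_pct, avg_mph))
    return summary


# Made-up single match with ten first-serve points:
demo = [[(120, True), (118, True), (115, True), (113, False), (110, True),
         (108, False), (105, False), (103, True), (100, False), (98, False)]]
for row in quintile_summary(demo):
    print(row)
```

Ranking within each match, rather than across all matches at once, is what gives the crude control for opponent: every opponent contributes points to every quintile.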

Clearly, serve speed doesn’t tell the whole story. At the same time, it looks like a 117 mph serve–or even a 108 mph one–is a better bet than a 98 mph offering.

Another way to isolate the effect of serve speed is to ignore the influence of specific opponents and simply sort first serves by miles per hour. From these 23 matches, we have 43 first serves recorded at exactly 100 mph, with a corresponding winning percentage of 72.1%. Serena hit 33 first serves at 101 mph, of which she won 72.7%. While the winning percentages don’t usually move so neatly in lockstep with first serve speed, there is a general trend:

The correlation is a loose one: winning percentages at 99 mph and 103 mph are better than those at 116 mph and 117 mph, for example. We could attribute that to the possibility that the slower serves are tactically savvier, or more approximate placement of the faster deliveries, or just dumb luck, because our sample size at any specific speed isn’t that great. Still, we can draw an approximate conclusion:

Each additional two miles per hour of first-serve speed is worth an additional one percentage point to Serena’s 1st serve winning percentage.

To take it one step further: Serena usually lands about 60% of her first serves, and roughly half of total points will be on her serve, so each additional two miles per hour of first-serve speed is worth an additional 0.6 percentage points of total points won. In a close match, like her 2014 loss to Alize Cornet–in which she averaged only 104 mph on her first serves and won exactly 50% of the points played–that could be the difference.

Serena in context

The same general rule cannot be applied to all women. (Several years ago, I took a similar look at ATP serve speeds, and–perhaps foolishly–I didn’t break it down by player.) I ran the same algorithm on the recent Wimbledon records of the nine other women for whom I have at least 15 matches worth of data. The effect of serve speed varies from “quite a bit” for Johanna Konta to “not at all” for Venus Williams and “I don’t understand the question” for Caroline Wozniacki.

The following table shows two numbers for each player. The “Addl MPH =” column shows the effect of one additional mile per hour on first serve winning percentage, and the “MPH = 1% SPW” column shows how many additional miles per hour are required to increase first serve winning percentage by one percentage point:

Player               Addl MPH =  MPH = 1% SPW  
Johanna Konta             0.89%           1.1  
Angelique Kerber          0.56%           1.8  
Serena Williams           0.48%           2.1  
Garbine Muguruza          0.47%           2.1  
Simona Halep              0.41%           2.5  
Petra Kvitova             0.29%           3.5  
Agnieszka Radwanska       0.28%           3.6  
Victoria Azarenka         0.02%          50.9  
Venus Williams            0.00%             -  
Caroline Wozniacki       -0.40%             - 
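The second column is just the reciprocal of the first. A small sketch of the conversion (the function name is mine); because the table was built from unrounded slopes, recomputing from the rounded percentages can be off in the last digit, as with Halep or Azarenka.

```python
def mph_per_point(pct_per_mph):
    """Miles per hour needed to add one percentage point of first-serve
    winning percentage; None when extra speed doesn't help."""
    if pct_per_mph <= 0:
        return None
    return 1 / pct_per_mph


print(round(mph_per_point(0.89), 1))  # Konta: 1.1
print(round(mph_per_point(0.48), 1))  # Serena: 2.1
print(mph_per_point(-0.40))           # Wozniacki: None
```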

Konta’s serve speed is almost twice as important to her first-serve success as Serena’s is. Her average first-serve speed in her quarter-final loss to Barbora Strycova was 99.9 mph, her lowest at Wimbledon since a first-round loss in 2014.

At the opposite extreme, we have Victoria Azarenka and Venus, for whom serve speed doesn’t seem to matter. (Venus, for one, excels at the deadly wide serve, which she converts into aces regardless of speed.) Wozniacki apparently lulls her opponents into confusion and illogic, giving her better results on slower first serves.

Serena vs Simona

These are small effects, so even the range between Serena’s slowest serving performance this fortnight (105 mph first serves against Carla Suarez Navarro) and the 2015 final against Muguruza would only have affected Serena’s total points won by about 2.5 percentage points. Nine out of ten times Williams and Halep have gone head to head, Serena has come out on top, always with more than 52.5% of total points, usually with more than 55%. That’s an ample margin of error–or, more precisely, margin of slow serving.

On the other hand, the most recent Serena-Simona contest, the only time they’ve played since 2016, was the closest of the lot. Halep is a great returner, but she is not immune to powerful serving: her rate of return points won is affected by serve speed just as much as Williams’s serve stats are. The gap between the finalists could be narrow, and Serena’s serve speed is one of the few tools completely in her own power that she could deploy to tilt the scales in her favor.

Slow Conditions Might Just Flip the Outcome of Federer-Nadal XL

Italian translation at settesei.it

Roger Federer likes his courts fast. Rafael Nadal likes them slow. With eight Wimbledon titles to his name, Federer is the superior grass court player, but the conditions at the All England Club have been unusually slow this year, closer to those of a medium-speed hard court.

On Friday, Federer and Nadal will face off for the 40th time, their first encounter at Wimbledon since the Spaniard triumphed in their historic 2008 title-match battle. Rafa leads the head-to-head 24-15, including a straight-set victory at his favorite slam, Roland Garros, several weeks ago. But before that, Roger had won five in a row–all on hard courts–the last three without dropping a set.

Because of the contrast in styles and surface preferences, the speed of the conditions–a catch-all term for surface, balls, weather, and so on–is particularly important. Nadal is 14-2 against his rival on clay, with Federer holding a 13-10 edge on hard and grass. Another way of splitting up the results is by my surface speed metric, Simple Speed Rating (SSR). 22 of the matches have been on a court that is slower than tour average, with the other 17 at or above tour average speed:

Matches     Avg SSR  RN - RF  Unret%  <= 3 shots  Avg Rally  
SSR < 0.92     0.74     17-5   21.2%       49.5%        4.7  
SSR >= 1.0     1.14     7-10   27.0%       56.9%        4.3

At faster events–all of which are on hard or grass–fewer serves come back, more points end by the third shot, and the overall rally length is shorter. Fed has the edge, with 10 wins in 17 tries, while on slower surfaces–all of the clay matches, plus a handful of more stately hard courts–Rafa cleans up.

Rafa broke Elo

According to my surface-weighted Elo ratings, Federer is the big semi-final favorite. He leads Nadal by 300 points in the grass-only Elo ratings, which gives him a 75% chance of advancing to the final. The betting market strongly disagrees, believing that Rafa is the favorite, with a 57% chance of winning.

The collective wisdom of the punters is onto something. Elo has systematically underwhelmed when it comes to forecasting the 39 previous Fedal matches. Federer has more often been the higher-rated player, and if Roger and Rafa behaved like the algorithm expected them to, the Swiss would be narrowly leading the head-to-head, 21-18. We might reasonably conclude that, going into Friday’s semi-final, Elo is once again underestimating the King of Clay.

How big of a Fedal-specific adjustment is necessary? I fit a logit model to the previous 39 matches, using only the surface-weighted Elo forecast. The model makes a rough adjustment to account for Elo’s limitations, and reduces Roger’s chances of winning the semi-final from 74.8% all the way down to 48.5%.

Now, about those conditions

The updated 48.5% forecast takes the surface into account–that’s part of my Elo algorithm. But it doesn’t distinguish between slow grass and fast grass.

To fix that, I added SSR, my surface speed metric, to the logit model. The model’s prediction accuracy improved from 64% to 72%, its Brier score dropped slightly (a lower Brier score indicates better forecasts), and the revised model gives us a way of making surface-speed-specific forecasts for this matchup. Here are the forecasts for Federer at several surface speed ratings, from tour average (1.0) to the fastest ratings seen on the circuit:

SSR  p(Fed Wins)  
1.0        49.3%  
1.1        51.4%  
1.2        53.4%  
1.3        55.5%  
1.4        57.5%  
1.5        59.5% 
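
For readers who haven't met the Brier score mentioned above: it is the mean squared difference between each forecast probability and the actual result (1 for a win, 0 for a loss). A minimal Python sketch follows. The function name is illustrative, and nothing here reproduces the Fedal data behind the model.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    forecasts: probabilities (0.0 to 1.0) that a given player wins
    outcomes:  1 if that player won, 0 if not
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty inputs")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)
```

A forecaster who says 50% every time always scores 0.25, so a model is only adding information if it comes in below that mark.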

In the fifteen years since Rafa and Roger began their rivalry, the Wimbledon surface has averaged around 1.20, 20% quicker than tour average. In 2006, when they first met at SW19, it was 1.24, and in 2008, it was 1.15. Three times in the last decade it has topped 1.30, 30% faster than the average ATP surface. This year, it has dropped almost all the way to average, at 1.00, when both men’s and women’s results are taken into account.

As the table shows, such a dramatic difference in conditions has the potential to influence the outcome. On a faster surface, which we’ve seen as recently as 2014, Federer has the edge. At this year’s apparent level, the model narrowly favors Nadal. Rafa has said that the surface itself is unchanged, but that the balls have been heavier due to humidity. He should hope for another muggy day on Friday–the end result could depend on it.

The Grass Dies, But the Speed Lives On

Italian translation at settesei.it

Earlier this week, I trotted out some stats showing that the Wimbledon grass is playing slower this year, the latest tick in a years-long trend. Many fans suspect that by the second week, the conditions are even slower still, with huge brown spots around each baseline where the players have worn away the grass. Assuming that the dying-grass effect is similar each year, this is something we can test.

I ran my surface speed algorithm for several subsets of Wimbledon men’s singles matches: week 1, week 2, each round from 1 to 4, and the final 8. For a single year, the “week 2,” “round 4,” and “final 8” samples are too small to give us any reliable indicators. But over the course of two decades, the differences between weeks and rounds–the effect we’re interested in today–should become clear.

(Quick refresher on my surface speed method: It uses ace rate as a proxy for speed–not perfect, but functional, using a stat that is universally available–and takes into account the server and returner in each match. An average court speed is 1.0, and ratings typically range from about 0.5 for a venue like Monte Carlo to 1.5 for the fastest grass and indoor hard courts.)

For example, here are the week-by-week and round-by-round speed ratings for the 2018 Wimbledon men’s draw:

  • Week 1: 1.16
  • Week 2: 1.16
  • Round 1: 1.02
  • Round 2: 1.29
  • Round 3: 1.33
  • Round 4: 1.25
  • Last 8: 1.08

I promised noise, and there it is. Each week is equally speedy, but the first round and last few rounds are oddly slower than the rest. I don’t have a good explanation for the first round (and there might not be one–it could be random), but the last 8 often features fewer aces, even when adjusting for the players involved. We’ll come back to that in a bit.

Wimbledon, 2000-18

Here are the same numbers, averaged over the last 19 Wimbledons:

  • Week 1: 1.20
  • Week 2: 1.21
  • Round 1: 1.19
  • Round 2: 1.20
  • Round 3: 1.21
  • Round 4: 1.25
  • Last 8: 1.16

The sample of the last 8 still deviates from the rest, but with more data, the difference is much smaller. The gap between 1.20 and 1.16 is just an ace or two per match. That’s not enough to reverse the outcome of any but the very closest matches.

As usual, I must acknowledge that an ace-based metric isn’t definitive. There’s more to court speed than what aces can tell us. It’s possible that the surface behaves differently as the grass is worn away, even if it doesn’t show up in serve stats. Since net approaches are increasingly rare, the service-box grass lasts longer than the baseline grass, meaning that the speed at which serves move through the court would be relatively unchanged. On the other hand, the biggest brown spots on court are behind the baseline, so most groundstrokes also bounce on green grass, not on brown dirt.

The best versus the best

Even the small difference between the last 8 and the rest of the tournament may not have anything to do with the decaying of the surface. Since 2000, the US Open has exhibited the same trend: 1.07 for week 1, 1.06 for round 4, and 0.97 for the final 8. (The Australian Open numbers are much noisier than the other slams, perhaps due to frequent use of the roof, so I’m hesitant to use them.)

It seems safe to assume that the hard courts in Flushing don’t suddenly get slower starting on Tuesday or Wednesday of the second week. Instead, I think the answer is in the mix of players–or more precisely, how those players interact with each other. By this ace-based metric, the Tour Finals have often been rated as one of the slowest indoor hard court events–even though the official Court Pace Index (CPI) ratings disagree.

In other words, aces tend to go down when the best play the best. Maybe the elites serve more tactically when facing tough opponents? Perhaps they focus more consistently on return, rarely allowing cheap aces? Maybe the best players know each other’s games so well that they anticipate even better than usual? This seems like an interesting line of research, even if it’s not something I’m going to resolve today.

The bottom line is that partly-brown Wimbledon courts play just about as fast as totally-green Wimbledon courts do. There might be a very minor slowdown toward the end of the fortnight, but even there, we should remain skeptical. The conditions are slow this year, but at least they won’t get much slower.

Introducing Elo Ratings for Mixed Doubles

Scroll down for Wimbledon updates, including a forecast for the title match.

With Andy Murray and Serena Williams pairing up in this year’s Wimbledon mixed doubles event, more eyes than ever are on the tournament’s only mixed-gender draw. Mixed doubles is played just four times a year (plus the Olympics, the occasional exhibition, and the late Hopman Cup), so most partnerships are temporary, and it’s tough to get a sense of who is particularly good in the dual-gender format.

That’s where math comes into play. Over the last few years, I’ve deployed a variation of the Elo rating algorithm for men’s doubles. It treats each team as the average of the two members, and after every match, it adjusts each player’s rating based on the result and the quality of the opponent. Doubles Elo–D-Lo–is even better suited for mixed than for single-gender formats, because players rarely stick with the same partner. The main drawback of D-Lo for men’s or women’s doubles is that it doesn’t help us tease out the individual contributions of long-time teams such as Bob and Mike Bryan. By contrast, mixed doubles draws often look like a game of musical chairs from one major to the next.

The rating game

Let’s jump right in. The Wimbledon mixed doubles draw consists of 56 teams. Here are the 10 highest-rated of those 112 players, as of the start of the fortnight:

Rank  Player                 XD-Lo  
1     Venus Williams          1855  
2     Serena Williams         1847  
3     Bethanie Mattek-Sands   1834  
4     Jamie Murray            1809  
5     Ivan Dodig              1793  
6     Latisha Chan            1785  
7     Bruno Soares            1776  
8     Leander Paes            1771  
9     Heather Watson          1770  
10    Gabriela Dabrowski      1760

Serena and Venus Williams require a bit of an asterisk, since both are playing mixed for the first time after a long break. Venus last played at the 2016 Olympics, and Serena last competed in mixed at the 2012 French Open. Maybe they’re rusty. My XD-Lo algorithm doesn’t include any kind of adjustment for injuries or other layoffs, so it’s possible that we should expect them to perform at a lower level. On the other hand, they are among the greatest doubles players of all time, and players tend to age gracefully in doubles. Venus lost her opening match, but perhaps we should blame that on Frances Tiafoe (XD-Lo: 1,494). The sisters will probably trade places at the top of the list once Wimbledon results are incorporated.

Andy Murray’s rating is a decent but more pedestrian 1,648, so Murray/Williams is not the best team in the field. But they’re close. The strongest pair is Bethanie Mattek-Sands and Jamie Murray–3rd and 4th on the list above–followed by Ivan Dodig and Latisha Chan, 5th and 6th on the individual list. Due to the vagaries of ATP and WTA doubles rankings and the resulting seedings, Dodig/Chan entered the event as the narrow favorites, because they got a first-round bye and Murray/Mattek-Sands did not.

Here are the top ten teams in the draw:

Rank  Team                                XD-Lo  
1     Bethanie Mattek-Sands/Jamie Murray   1822  
2     Ivan Dodig/Latisha Chan              1789  
3     Bruno Soares/Nicole Melichar         1762  
4     Serena Williams/Andy Murray          1748  
5     Gabriela Dabrowski/Mate Pavic        1734  
6     Leander Paes/Samantha Stosur         1731  
7     Heather Watson/Henri Kontinen        1708  
8     Venus Williams/Frances Tiafoe        1674  
9     Abigail Spears/Marcelo Demoliner     1653  
10    Neal Skupski/Chan Hao-ching          1634

The top five have survived (though Murray/Mattek-Sands and Pavic/Dabrowski will complete their second-round match this afternoon, leaving only four), and of the last 18 teams standing, only one other–John Peers and Shuai Zhang–is rated above 1,600.

Forecasting SerAndy

Using my ratings, Murray/Williams entered the tournament with a 9.8% chance of winning. That made them fourth favorite, behind Dodig/Chan (17.1%), Murray/Mattek-Sands (16.3%), and the big-serving duo of Bruno Soares and Nicole Melichar (14.5%). I’ll update the forecast this evening, when the second round is finally complete.

Murray/Williams’s second-round match is against Fabrice Martin and Raquel Atawo. They are both excellent doubles players, though neither has excelled in mixed. Atawo, especially, has struggled. Her XD-Lo is 1,304, the third-lowest of anyone who has entered a mixed draw since 2012. (Shuai Peng is rated 1,268, and Marc Lopez owns last place with 1,252.) A player with no results at all enters the system with 1,500 points, so falling to 1,300 requires a lot of losing. The combined ratings translate into an 89% chance of a Murray/Williams victory.

The challenge comes in the third round. Soares/Melichar are the top seed, and they have already advanced to the round of 16, awaiting the winner of Murray/Williams and Martin/Atawo. Thus two of the top four teams will likely play for a place in the quarter-finals, with Soares/Melichar holding a narrow, 52% edge.
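
The arithmetic behind a forecast like this can be sketched in a few lines of Python. Team ratings are the average of the two partners, as described earlier. The win probability assumes the standard Elo logistic curve with a 400-point scale, which this post doesn't state explicitly, and the function names are illustrative.

```python
def team_rating(player_a, player_b):
    """D-Lo treats a team as the average of its two members."""
    return (player_a + player_b) / 2


def win_probability(rating, opp_rating, scale=400):
    """Standard Elo expectation; the 400-point scale is an assumption."""
    return 1 / (1 + 10 ** ((opp_rating - rating) / scale))


# Ratings quoted in this post
soares_melichar = 1762                      # team rating from the table above
murray_williams = team_rating(1648, 1847)   # Andy Murray, Serena Williams

p = win_probability(soares_melichar, murray_williams)
print(f"Murray/Williams team rating: {murray_williams:.1f}")
print(f"Soares/Melichar win probability: {p:.0%}")
```

With the published ratings, the result comes out at about 52%, in line with the narrow edge quoted above.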

Historical peaks

Generating these current ratings required amassing a lot of data, so it would be a waste to ignore the history of the mixed doubles format. Here are the top 25 female mixed doubles players, ranked by their peak XD-Lo ratings:

Rank  Player                   Peak  
1     Billie Jean King         2043  
2     Greer Stevens            2035  
3     Margaret Court           2015  
4     Rosie Casals             2000  
5     Martina Navratilova      1998  
6     Helena Sukova            1991  
7     Anne Smith               1989  
8     Betty Stove              1985  
9     Jana Novotna             1977  
10    Martina Hingis           1964  
11    Wendy Turnbull           1956  
12    Kathy Jordan             1948  
13    Elizabeth Smylie         1947  
14    Arantxa Sanchez Vicario  1946  
15    Serena Williams          1942  
16    Venus Williams           1937  
17    Francoise Durr           1934  
18    Jo Durie                 1929  
19    Kristina Mladenovic      1922  
20    Zina Garrison            1901  
21    Samantha Stosur          1898  
22    Larisa Neiland           1891  
23    Lindsay Davenport        1888  
24    Victoria Azarenka        1887  
25    Natasha Zvereva          1886 

Venus really can’t catch a break. She’s one of the best players of all time, and Serena is always just a little bit better.

And the top 25 men:

Rank  Player               Peak XD-Lo  
1     Owen Davidson              2043  
2     Bob Hewitt                 2042  
3     Marty Riessen              2016  
4     Todd Woodbridge            2000  
5     Frew McMillan              1999  
6     Kevin Curren               1997  
7     Jim Pugh                   1995  
8     Ilie Nastase               1975  
9     Tony Roche                 1962  
10    Bob Bryan                  1949  
11    Rick Leach                 1938  
12    Mahesh Bhupathi            1933  
13    Mark Woodforde             1929  
14    Justin Gimelstob           1929  
15    Max Mirnyi                 1926  
16    John Lloyd                 1922  
17    Emilio Sanchez             1918  
18    Ken Flach                  1909  
19    Jeremy Bates               1908  
20    John Fitzgerald            1906  
21    Cyril Suk                  1902  
22    Wayne Black                1889  
23    Dick Stockton              1881  
24    Jean-Claude Barclay        1879  
25    Mike Bryan                 1875

Owen Davidson won eight mixed slams with Billie Jean King, plus three more with other partners. Bob Hewitt won six, spanning 18 years from 1961 to 1979. (We can’t erase his accomplishments from the history books, but any mention of Hewitt comes with the caveat that he is a convicted rapist who has since been expelled from the International Tennis Hall of Fame.)

It is interesting to see two famous pairs represented on the men’s list. Bob Bryan ranks 10th to Mike’s 25th, and Todd Woodbridge comes in 4th to Mark Woodforde’s 13th. We probably can’t conclude from mixed doubles results that one member of the team was a superior men’s doubles player, but it is one of the few data points that allows us to compare these partners.

The ignominious Spaniards

Finally, I can’t spend this much time with mixed doubles ratings without revisiting the case of David Marrero. You may recall the 2016 Australian Open, when Marrero’s first-round match with Lara Arruabarrena triggered “suspicious betting patterns.” As I wrote at the time, the most suspicious thing about it was that Marrero–who was terrible at mixed doubles and admitted that he played differently with a woman across the net–could still find a partner.

He entered that match with an XD-Lo rating of 1,349–the worst of any man in the draw, though Anastasia Pavlyuchenkova was a few points lower–and left it at 1,342. He played his last mixed doubles match at Wimbledon that year, and–surprise!–he lost. One hopes he’ll stick to men’s doubles for the remainder of his career, leaving his XD-Lo rating at 1,326.

Marrero’s only saving grace is that he’s better than his compatriot Marc Lopez. Lopez has been active in mixed doubles more recently, entering the US Open last year with Arruabarrena. After that loss, he fell to his current rating of 1,252, the lowest mark recorded in the Open Era.

Fortunately for us, this year’s Wimbledon draw includes both Williams sisters, both Murray brothers, a healthy Mattek-Sands … and very few players as helpless in the mixed doubles format as Marrero or Lopez.

Update: Murray/Williams won their second-rounder, setting up the final 16. Mixed doubles isn’t the top scheduling priority, so it didn’t exactly work that way–by the time Muzzerena advanced, two other teams had already secured places in the quarter-finals. Ignoring those for the moment, here is the last-16 forecast:

Team                      QF     SF      F      W  
Soares/Melichar        52.5%  44.5%  33.2%  18.8%  
Murray/Williams        47.5%  39.7%  29.0%  15.8%  
Middelkoop/Yang        55.5%   9.5%   3.6%   0.8%  
Daniell/Brady          44.5%   6.3%   2.1%   0.4%  
Peers/Zhang            61.6%  36.9%  13.8%   5.2%  
Lindstedt/Ostapenko    38.4%  18.7%   5.2%   1.5%  
Skugor/Olaru           56.2%  26.3%   8.3%   2.6%  
Mektic/Rosolska        43.8%  18.0%   4.8%   1.3%  
                                                   
Team                      QF     SF      F      W  
Koolhof/Peschke        42.6%  10.1%   2.4%   0.6%  
Qureshi/Kichenok       57.4%  16.7%   4.9%   1.5%  
Sitak/Siegemund        27.4%  16.0%   5.3%   1.8%  
Pavic/Dabrowski        72.6%  57.2%  30.8%  17.5%  
Dodig/Chan             75.9%  64.6%  44.1%  28.1%  
Roger-Vasselin/Klepac  24.1%  15.5%   6.6%   2.5%  
Hoyt/Silva             54.1%  11.3%   3.5%   1.0%  
Vliegen/Zheng          45.9%   8.6%   2.5%   0.6% 

The two teams already in the quarters are Skugor/Olaru and Hoyt/Silva. Since both of their matches were close to 50/50, you can roughly double their odds, and the odds of the other teams are only a tiny bit less. The remaining six third-round matches are scheduled for Wednesday, and I’ll try to update again when those are in the books.

Update 2: Murray/Williams are out, so the number of people interested in mixed doubles has fallen from double digits back to the typical level of single digits. The departure of the singles stars also leaves one clear favorite in each half. Here is the updated forecast:

Team                    SF      F      W  
Soares/Melichar      83.4%  64.3%  36.4%  
Middelkoop/Yang      16.6%   6.7%   1.5%  
Lindstedt/Ostapenko  46.0%  12.6%   3.7%  
Skugor/Olaru         54.0%  16.4%   5.2%  
Koolhof/Peschke      37.5%   7.3%   1.8%  
Sitak/Siegemund      62.5%  17.2%   6.0%  
Dodig/Chan           84.4%  68.3%  43.3%  
Hoyt/Silva           15.6%   7.2%   2.0%

All four quarter-finals are scheduled for Thursday, so I’ll post another update tomorrow evening.

Update 3: We’re down to four teams. Of the Elo favorites in the quarter-finals, only Dodig/Chan survived, leaving them as the overwhelming pick to take the title. Here’s the full forecast:

Team                     F      W  
Middelkoop/Yang      42.3%   8.2%  
Lindstedt/Ostapenko  57.7%  14.1%  
Koolhof/Peschke      14.1%   6.3%  
Dodig/Chan           85.9%  71.4% 

Update 4: Both favorites won in Friday’s semi-finals, so we’ve got a final between Lindstedt/Ostapenko and Dodig/Chan. The first team didn’t get an opening-round bye, so they won one more match to get here. They also have a better story, since Ostapenko keeps hitting her partner in the head. Dodig/Chan entered as the 8th seeds, despite being the second-best team according to XD-Lo.

Consequently, Dodig/Chan get the edge here, with an 81% chance of winning the 2019 Wimbledon Mixed Doubles title.

Yep, Wimbledon is Playing Slower This Year

Italian translation at settesei.it

The players are right. Wimbledon’s surface–or balls, or atmosphere, or aura–has slowed down in comparison with recent years. We’ve heard comments to that effect from Roger Federer, Milos Raonic, Boris Becker, Rafael Nadal, and many others. Raonic attributes the change to the grass, and Nadal to the balls. Regardless of the reason, the numbers back up their perceptions.

Here is an overview of several surface-speed indicators for the first three rounds of singles matches at Wimbledon, 2017-19:

                     2017   2018   2019  
Aces (Men)           8.9%  10.0%   8.5%  
Aces (Women)         4.1%   4.2%   4.1%  
                                         
Unret (Men)         36.0%  36.6%  33.3%  
Unret (Women)       25.9%  27.6%  25.2%  
                                         
<= 3 Shots (Men)    65.2%  65.6%  61.9%  
<= 3 Shots (Women)  55.3%  57.9%  55.0%  
                                         
Avg Rally (Men)       3.4    3.5    3.7  
Avg Rally (Women)     4.0    3.8    4.1

The second set of rows, "Unret," is the percent of unreturned serves. The next set, "<=3 Shots," is the percent of points that ended in three shots or less. For all four of the stats shown, including aces and average rally length, men's numbers point to slower conditions. The women's numbers are less clear, but to the extent that they point in either direction, they concur.

Not just 2019

Aggregate numbers such as these usually give us an idea of what's going on. But we can do better. The numbers above do not control for the mix of players or the length of their matches. For instance, 2019's rates would be different if John Isner, instead of Mikhail Kukushkin, had played a third-round match. The surface speed might have affected that result, but if we're going to compare ace rate from one year to the next, we shouldn't compare Isner's ace rate with Kukushkin's ace rate.

That's where my surface speed metric comes in. For each tournament, I control for the mix of servers and returners (yes, returners affect ace rate, too) to boil down each event to one number, representing how the tournament's ace rate compares to tour average. While there's more to surface speed than ace rate, aces are a good proxy for many of those other indicators, and more importantly, aces are one of the few stats that are available for every match.
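
To make the idea concrete, here is a rough Python sketch of an ace-based speed rating. It is not the actual algorithm, whose details aren't spelled out here: the multiplicative adjustment for server and returner, the `Match` fields, and the function names are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Match:
    aces: int                 # aces hit by the server in this match
    serve_points: int         # service points played by that server
    server_ace_rate: float    # server's usual ace rate across the tour
    returner_ace_rate: float  # ace rate this returner usually allows


def expected_aces(match, tour_ace_rate):
    """Aces expected on a neutral court (assumed multiplicative model)."""
    rate = match.server_ace_rate * match.returner_ace_rate / tour_ace_rate
    return rate * match.serve_points


def speed_rating(matches, tour_ace_rate):
    """Actual aces divided by expected aces; 1.0 means tour-average speed."""
    actual = sum(m.aces for m in matches)
    expected = sum(expected_aces(m, tour_ace_rate) for m in matches)
    return actual / expected
```

Under these assumptions, a server who usually aces 10% of the time, facing a tour-average returner, and hitting 12 aces in 100 service points would put the conditions at 1.2.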

The resulting score usually ranges between 0.5 (50% fewer aces than average, usually on a slow clay court like Monte Carlo) and 1.5 (50% more aces than average, on a fast grass or indoor hard court, like Antalya or Metz). Over the last decade, Wimbledon's conditions have drifted from the high end of that range to the middle:

Year      Men    Women  Average  
2011     1.26     1.37     1.31  
2012     1.27     1.06     1.17  
2013     1.29     1.04     1.17  
2014     1.35     1.19     1.27  
2015     1.20     1.16     1.18  
2016     1.06     1.03     1.04  
2017     1.03     1.07     1.05  
2018     1.14     0.98     1.06  
2019     1.04     0.96     1.00 

The men's numbers are usually more reliable measurements, because they are based on many more aces, which means that the ace rate for any given match is less fluky. Ideally, we'd see the men's and women's speed ratings move in lockstep, but there is some noise in the calculation, and the ratings are also relative to that year's tour average, which depends in turn on the changing speeds of dozens of other surfaces.

Caveats aside, the direction of the trend is clear. There isn't a substantial difference between 2019 and the last few years, but the gap between the first and second half of the decade is dramatic.

What is less clear–and will require considerable further research–is how much it matters. In 2014, Nick Kyrgios upset Nadal in four sets, while last week, the result was reversed. How much of that can we attribute to the surface? Would faster conditions have allowed Isner to outlast Kukushkin? Kevin Anderson to hold off Guido Pella? Jelena Ostapenko to withstand Su Wei Hsieh?

For now, those questions remain in the speculation-only file. Now that we can conclude that the grass really has gotten slower, we can focus that speculation on the fates of several grass court savants, including Federer, Raonic, and Karolina Pliskova. By the end of the fortnight, they–like Kyrgios–might be wishing it was 2014 again.

How Good is Cori Gauff Right Now?

Italian translation at settesei.it

15-year-old sensation Cori Gauff holds a WTA ranking of No. 313. She has played only a limited number of events that are considered by the WTA’s system, so even before her impressive run began, we could’ve predicted that her ranking was an understatement. But by how much?

Gauff doesn’t show up yet on my Elo ratings list because, before Wimbledon qualies, she hadn’t played at least 20 matches at the ITF $50K level or higher in the last year. However, she still had a rating: 1,488, good for 194th place among those who had met the playing time minimum. A rating in that range translates to about a 3% chance of upsetting current top-ranked player Ashleigh Barty, and a 10% chance of beating someone around 20th, such as Donna Vekic. Given how little data we had to work with at that point, that seemed like a reasonable assessment.

Since she arrived in London, she has won six matches: Three in qualifying and three in the main draw, with wins over Venus Williams, Magdalena Rybarikova, and Polona Hercog. Not bad for a teenager who had previously won only one slam qualifying match and one tour-level main draw match in her young career!

194th place doesn’t seem like such a fair judgment anymore. Any player who comes through qualifying and reaches the fourth round at a major deserves some reassessment, and that’s even more applicable to a player about whom we knew so little two weeks ago. The tricky part is figuring out how much to adjust. Is Gauff now a top-100 player? Top 50? Top 20?

Revising with Elo

The Elo algorithm does a good job of approximating how humans make those reassessments: The more data we already have about a player, the less we will adjust her rating after a win or loss. The previous player to defeat Hercog was Simona Halep, at Eastbourne. We already have years’ worth of match results for Halep, and she was heavily favored to win the match. Thus, the fact that she recorded the victory alters our opinion of her by only a small amount. In Elo terms, it was an increase from 2,100 points to 2,102–basically nothing.

Gauff is a different story. Entering her third-round clash with Hercog, not only did we know very little about her skill level, it wasn’t even clear if she was the favorite. The result caused Elo to make a considerably larger adjustment, increasing her rating from 1,713 to 1,755, a rise 21 times greater than what Halep received after beating the same opponent. The 42-point jump caused her to leapfrog 16 players in the rankings.
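
In code, that behavior comes from a K-factor that shrinks as a player's match count grows. The sketch below is illustrative only. The K schedule (250 divided by matches-plus-5, raised to the 0.4 power) is one that has appeared in published tennis Elo systems; it is assumed here, not confirmed as the schedule behind the numbers above. The 400-point scale and the function names are assumptions as well.

```python
def expected_score(rating, opp_rating, scale=400):
    """Pre-match win probability under the standard Elo curve (assumed)."""
    return 1 / (1 + 10 ** ((opp_rating - rating) / scale))


def k_factor(matches_played):
    """Assumed schedule: big adjustments for newcomers, small for veterans."""
    return 250 / (matches_played + 5) ** 0.4


def updated_rating(rating, opp_rating, won, matches_played):
    """Move the rating in proportion to how surprising the result was."""
    surprise = (1.0 if won else 0.0) - expected_score(rating, opp_rating)
    return rating + k_factor(matches_played) * surprise
```

With hypothetical inputs, a 2,100-rated veteran of 500 matches who beats a 1,700-rated opponent gains about 2 points, while a newcomer with 10 matches who wins an even contest gains roughly 40. Those inputs are made up to show the shape of the effect, not taken from either player's record.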

Here is Gauff’s Elo progression, from the moment she arrived at Wimbledon to middle Sunday. After each match, I show her overall Elo (the numbers I’ve been discussing so far), her grass-specific Elo, and her grass-weighted Elo, a 50/50 blend of overall and grass-specific that is used for forecasting. For each of the three ratings, I also show her ranking among WTA players with at least 20 matches in the last 52 weeks.

Match          Overall   Rk  Grass   Rk  Weighted   Rk  
Pre-Wimbledon     1488  194   1350  163      1419  187  
d. Bolsova        1540  171   1405  132      1473  155  
d. Ivakhnenko     1566  157   1447  107      1507  131  
d. Minnen         1614  132   1514   57      1564   95  
d. Venus          1670  108   1578   40      1624   73  
d. Rybarikova     1713   83   1650   21      1682   41  
d. Hercog         1755   67   1686   17      1721   31

Over the course of only six matches, Gauff has jumped from 194th in the overall Elo rankings to 67th. For forecasting purposes, her grass-weighted rating has lifted her from 187th to 31st. Her current weighted rating of 1,721 is better than that of three other women in the round of 16: Karolina Muchova, Carla Suarez Navarro, and Shuai Zhang. She trails another surviving player, Elise Mertens, by only 20 points.
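
The weighted column is simple arithmetic, and a quick Python check against the table above confirms it. The function name is illustrative, and rounding half up is a choice made here to match the displayed values; the underlying ratings presumably carry decimals.

```python
import math


def grass_weighted(overall, grass):
    """50/50 blend of overall and grass-specific Elo, rounded half up."""
    return math.floor((overall + grass) / 2 + 0.5)


print(grass_weighted(1488, 1350))  # pre-Wimbledon: 1419
print(grass_weighted(1755, 1686))  # after beating Hercog: 1721
```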

So you’re telling me there’s a chance

Unfortunately, none of those relatively weak grass-court players are Gauff’s next opponent. Instead, the 15-year-old will face Halep, the third-best remaining player (behind Barty and Karolina Pliskova), and a three-time quarter-finalist at the All England Club. Halep’s weighted Elo rating is 229 points higher than Gauff’s, implying that the veteran has a 79% chance of winning on Monday. The betting market concurs, suggesting that the probability of a Halep victory is about 80%.
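
That 79% figure follows from the usual Elo formula, sketched here in Python. The standard 400-point logistic scale is an assumption, since the post doesn't spell it out, and the function name is illustrative.

```python
def elo_win_probability(rating_gap, scale=400):
    """Chance that the higher-rated player wins, given her rating edge."""
    return 1 / (1 + 10 ** (-rating_gap / scale))


print(f"{elo_win_probability(229):.0%}")  # Halep's 229-point edge: 79%
```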

It doesn’t usually have much of an effect on forecasts to update Elo ratings throughout a tournament. While anyone reaching the 4th round has a higher rating than they did before the event, the differences are typically small. And since forecasts are based on the difference between the ratings of two players, the forecast isn’t affected if both players’ ratings have increased by roughly the same amount.

As a teenager with such limited match experience, Gauff breaks the mold. Her pre-Wimbledon 1,488 Elo rating is only two weeks old, and it is already completely unrepresentative of what we know of her skill level. She’ll have ample time to prove us right or wrong in the upcoming years, but for now, we have good reason to estimate that she belongs–even more than some of the older players who have reached the second week at Wimbledon.