Around the Net, Issue 2

Around the Net is my attempt to provide a clearinghouse for tennis analytics on the web. Each week, you’ll find a summary of recent articles, podcasts, papers, and data sources, as well as trivia and the occasional bit of interesting non-tennis content. If you would like to suggest something for a future issue, drop me a line.

Articles and Papers

Multimedia

Data

  • Match Charting Project: The dataset has grown by 60 matches in the last week, from 5,083 to 5,143. Highlights include the 100th charted Petra Kvitova match, making her the 7th woman to become so well represented. We’ve also continued filling out the historical record of grand slam semi-finals, including a 1981 clash between Jimmy Connors and Bjorn Borg.

Trivia

  • Last week’s New York semi-final between John Isner and Reilly Opelka set plenty of records, the number of which is probably limited only by our imaginations. First, their 59 tiebreak points tied a best-of-three record. Unsurprisingly, Isner (and Jeremy Chardy) held the previous record as well.
  • The Isner-Opelka tilt also set the record for most aces (81) in a best-of-three match–breaking another of Isner’s marks–and was also the first best-of-three match in which both players hit at least 37 aces.
  • Marco Cecchinato has somehow won three tour-level titles (and reached a Roland Garros semi-final!) with only 33 tour-level match wins. By contrast, Julien Benneteau won 273 tour-level matches but nary a title.
  • Since 2008, Fabio Fognini has played at least part of the South American golden swing every year but one. But 2019 was the first time he suffered three straight first-round exits, despite entering each event as a top-two seed.

Beyond the net

Thanks to Peter, Jeff, and Carl for help with this week’s issue.

Forecasting the Davis Cup Finals

It took more than a year to decide on a new format, but barely a week to make the draw. With 12 countries qualifying for the inaugural Davis Cup Finals in home-and-away ties earlier this month, the field of 18 is set. Using the ITF’s own system to rank countries, the 18 teams were divided into three “pots,” then assigned to the six round-robin groups that will kick off the tournament this November in Madrid.

The new format sounds complicated, but as round-robin events go, it’s easy enough to understand. Each of the six round-robin groups will send a winning team to the quarter-finals. Two second-place sides will also advance to the final eight, as determined by matches won, then sets won, and so on as necessary, until John Isner and Ivo Karlovic stand back to back to determine which one is really taller. From that point, it’s an eight-team knock-out tournament.

Here are the groups, as determined by yesterday’s draw, with seeded countries indicated:

  • Group A: France (1), Serbia, Japan
  • Group B: Croatia (2), Spain, Russia
  • Group C: Argentina (3), Germany, Chile
  • Group D: Belgium (4), Australia, Colombia
  • Group E: Great Britain (5), Kazakhstan, Netherlands
  • Group F: United States (6), Italy, Canada

The ITF ranking system considers the last four years of Davis Cup results, so Spain’s brief exit from the World Group makes the seedings a bit wonky. As it turns out, a top team (Croatia) will have to deal with an early tie against the Spaniards, and the entire Group B trio constitutes a group of death. Russia would be an up-and-coming squad in any format, and it is clearly the most dangerous of the six lowest-ranked sides.

Madrid to Monte Carlo

Last week, I introduced a more accurate, predictive rating system for Davis Cup, involving surface-specific Elo ratings for the players likely to compete. Those rankings put Spain at the top, Croatia second, Russia fifth, and fourth-seeded Belgium 14th in the 18-team field.

Now that we have a draw, we can use those ratings to run Monte Carlo simulations of the entire Davis Cup Finals. As in my post last week, I’m estimating that singles players have a 75% chance of playing at any given opportunity and doubles players have an 85% chance. Those are just guesses–there’s no data involved in this step. Surely some teams are more fragile than others, perhaps because their stars are particularly susceptible to injury or just uninterested in the next event. I’ve excluded Andy Murray, but for the moment, I’m keeping Novak Djokovic and Alexander Zverev in the mix.

(We’re using Elo ratings for each individual player, which means the simulation is telling us what would be likely to happen if it were played today. Things will change between now and November, even if every eligible player shows up. A proper forecast that takes the time lag into account would probably give a slight boost for younger teams [whose players will have nine months to mature] and a penalty for older ones [who are more likely to be hit by injury]. And overall, it would shift all of the championship probabilities a bit toward the mean.)
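
For readers who want to tinker, here is a minimal Python sketch of how one round-robin group could be simulated. It is not the code behind the tables in this post. The team names and ratings are hypothetical, the tie format (two singles rubbers plus one doubles rubber) and the group tiebreakers (ties won, then rubbers won, then a coin flip) are simplifying assumptions, and each doubles pairing is reduced to a single rating. The 75% and 85% availability guesses come from the text above, and the win probability is the standard Elo expectation formula.

```python
import random
from itertools import combinations

P_SINGLES_PLAYS = 0.75  # guess from the text: a first-choice singles player shows up
P_DOUBLES_PLAYS = 0.85  # guess from the text: a first-choice doubles team shows up

# Hypothetical (first-choice, backup) ratings for each rubber; not real data.
TEAMS = {
    "Team A": {"singles1": (2100, 1900), "singles2": (2000, 1850), "doubles": (1950, 1800)},
    "Team B": {"singles1": (1950, 1800), "singles2": (1900, 1750), "doubles": (1850, 1700)},
    "Team C": {"singles1": (1800, 1700), "singles2": (1750, 1650), "doubles": (1700, 1600)},
}

def elo_win_prob(rating_a, rating_b):
    """Standard Elo expectation that side A beats side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def rubber_rating(team, slot, rng):
    """Use the first-choice rating if that player turns up, else the backup."""
    first_choice, backup = team[slot]
    p_plays = P_DOUBLES_PLAYS if slot == "doubles" else P_SINGLES_PLAYS
    return first_choice if rng.random() < p_plays else backup

def simulate_tie(team_a, team_b, rng):
    """Play three rubbers and return (rubbers won by A, rubbers won by B)."""
    won_a = 0
    for slot in ("singles1", "singles2", "doubles"):
        p_a = elo_win_prob(rubber_rating(team_a, slot, rng),
                           rubber_rating(team_b, slot, rng))
        if rng.random() < p_a:
            won_a += 1
    return won_a, 3 - won_a

def simulate_group(teams, rng):
    """Round-robin; rank by ties won, then rubbers won, then a coin flip."""
    ties = {name: 0 for name in teams}
    rubbers = {name: 0 for name in teams}
    for a, b in combinations(teams, 2):
        won_a, won_b = simulate_tie(teams[a], teams[b], rng)
        rubbers[a] += won_a
        rubbers[b] += won_b
        ties[a if won_a > won_b else b] += 1
    return max(teams, key=lambda n: (ties[n], rubbers[n], rng.random()))

def group_win_probs(teams, n_sims=20000, seed=1):
    """Fraction of simulated groups won by each team."""
    rng = random.Random(seed)
    wins = {name: 0 for name in teams}
    for _ in range(n_sims):
        wins[simulate_group(teams, rng)] += 1
    return {name: count / n_sims for name, count in wins.items()}

if __name__ == "__main__":
    for name, prob in group_win_probs(TEAMS).items():
        print(f"{name}: {prob:.1%}")
```

A full forecast would repeat this for all six groups, pick the two best runners-up, and then play out the knock-out rounds with the same tie simulator.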

Here are the results of 100,000 simulations of the draw, with percentages given for each country’s chance of winning their group, then reaching each of the knock-out rounds:

Country  Group     QF     SF      F      W  
ESP      46.1%  59.1%  41.9%  30.3%  19.3%  
FRA      54.2%  66.6%  40.6%  25.1%  14.6%  
AUS      74.5%  84.4%  46.0%  23.8%  12.1%  
USA      53.0%  65.5%  36.8%  19.7%  10.4%  
CRO      31.0%  43.0%  27.2%  17.8%   9.8%  
GER      52.5%  67.9%  39.7%  17.6%   7.7%  
RUS      22.9%  33.1%  19.5%  12.0%   6.1%  
SRB      33.0%  47.9%  24.1%  12.6%   6.0%  
GBR      66.8%  78.7%  35.9%  12.5%   4.4%  
ARG      39.7%  56.6%  28.6%  10.4%   3.8%  
ITA      24.3%  35.9%  14.6%   5.5%   2.1%  
CAN      22.7%  33.4%  13.1%   4.9%   1.8%  
JPN      12.8%  19.5%   7.2%   2.8%   0.9%  
BEL      20.3%  32.0%   8.5%   2.1%   0.6%  
NED      21.7%  35.5%   8.6%   1.7%   0.3%  
CHI       7.8%  12.9%   3.4%   0.6%   0.1%  
KAZ      11.5%  19.0%   3.2%   0.5%   0.1%  
COL       5.1%   8.9%   1.2%   0.1%   0.0%

Spain is our clear favorite, despite their path through the group of death. Five teams have a better chance of winning their group and reaching the quarters than the Spaniards do, but their chances in the single-elimination rounds make the difference. At the other extreme, Australia seems to be the biggest beneficiary of draw luck. My rankings put them sixth, and they landed in a group with Belgium (the lowest-rated seed) and Colombia (the weakest team in the field). Their good fortune makes them the most likely country to reach the final four, even if Spain and France have a better chance of advancing to the championship tie.

Less randomness, more Spain

What if we run the simulation one step earlier in the process? That is to say, ignore yesterday’s draw and see what each country’s chances were before their round-robin assignments were determined. For this simulation, we’ll keep the ITF’s seeds, so Spain is still a floater. Here’s how it looked ahead of the ceremony:

Country  Group     QF     SF      F      W  
ESP      63.0%  75.9%  52.9%  35.0%  22.6%  
FRA      56.8%  70.8%  43.9%  25.7%  14.5%  
CRO      55.5%  69.4%  42.2%  25.1%  13.5%  
USA      51.3%  65.6%  38.5%  19.8%  10.0%  
AUS      48.3%  62.9%  34.8%  17.7%   8.5%  
RUS      40.6%  53.5%  30.2%  15.8%   7.9%  
SRB      42.9%  55.8%  28.3%  13.5%   5.9%  
GER      42.0%  55.7%  27.3%  12.5%   5.4%  
ARG      35.9%  49.1%  20.9%   7.9%   2.8%  
ITA      33.6%  47.1%  19.2%   7.2%   2.5%  
GBR      34.9%  48.3%  20.3%   7.5%   2.5%  
CAN      24.5%  35.5%  14.3%   5.3%   1.9%  
JPN      19.8%  29.4%  10.6%   3.6%   1.1%  
BEL      20.9%  30.4%   7.5%   1.8%   0.4%  
NED       9.5%  15.5%   3.5%   0.7%   0.1%  
CHI       7.9%  13.3%   2.6%   0.4%   0.1%  
KAZ       8.4%  14.1%   2.1%   0.3%   0.0%  
COL       4.3%   7.5%   1.1%   0.2%   0.0%

With the “group of death” out of the picture, Croatia jumps from fifth to third, swapping places with Australia. The defending champs lost the most from the draw, while Spain suffered a bit as well.
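
The pre-draw scenario amounts to redrawing the groups on every simulation run. A rough sketch of that step follows, under two assumptions that the post does not state outright: that the seeds go to Groups A through F in order, and that the second and third pots are the teams listed second and third in each group above. The function names are made up for illustration.

```python
import random

# Pot 1 is the six seeds; pots 2 and 3 are inferred from the group listing.
POT_1 = ["France", "Croatia", "Argentina", "Belgium", "Great Britain", "United States"]
POT_2 = ["Serbia", "Spain", "Germany", "Australia", "Kazakhstan", "Italy"]
POT_3 = ["Japan", "Russia", "Chile", "Colombia", "Netherlands", "Canada"]

def draw_groups(rng):
    """Seeds stay in Groups A-F; pots 2 and 3 are shuffled onto them."""
    second = rng.sample(POT_2, k=len(POT_2))
    third = rng.sample(POT_3, k=len(POT_3))
    return {letter: [seed, s, t]
            for letter, seed, s, t in zip("ABCDEF", POT_1, second, third)}

def prob_same_group(team_x, team_y, n_draws=20000, seed=7):
    """Estimate how often two teams land in the same group."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        groups = draw_groups(rng)
        if any(team_x in g and team_y in g for g in groups.values()):
            hits += 1
    return hits / n_draws

if __name__ == "__main__":
    print(draw_groups(random.Random(0)))
    print(prob_same_group("Spain", "Croatia"))
```

Under these assumptions, Croatia had a one-in-six chance of drawing Spain, and the estimate from `prob_same_group` should land close to that.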

Elo in charge

Another variation is to ignore the ITF rankings and generate the entire draw based on my Elo-based ratings. In this case, the top six seeds would be Spain, Croatia, France, USA, Russia, and Australia, in that order. Argentina and Great Britain would fall to the middle group, and Belgium would drop to the bottom third. Here’s how that simulation looks:

Country  Group     QF     SF      F      W  
ESP      71.6%  82.8%  57.3%  38.0%  24.1%  
FRA      64.6%  77.6%  45.8%  26.7%  14.4%  
CRO      63.1%  76.3%  45.8%  25.6%  13.6%  
USA      59.7%  73.3%  41.1%  20.2%  10.2%  
RUS      58.6%  71.2%  37.0%  19.7%   9.5%  
AUS      57.7%  71.4%  37.7%  17.7%   8.8%  
SRB      37.1%  53.0%  26.1%  12.1%   5.3%  
GER      35.3%  52.3%  24.5%  10.9%   4.6%  
ARG      28.0%  44.2%  17.5%   6.4%   2.2%  
ITA      27.4%  43.6%  16.9%   6.2%   2.1%  
GBR      27.0%  43.1%  16.5%   6.0%   2.0%  
CAN      26.7%  41.8%  16.0%   5.8%   2.0%  
JPN      15.9%  23.6%   8.1%   2.6%   0.8%  
BEL       9.4%  15.1%   3.9%   0.9%   0.2%  
NED       6.5%  10.8%   2.3%   0.5%   0.1%  
CHI       5.3%   9.0%   1.8%   0.3%   0.1%  
KAZ       3.2%   5.8%   0.9%   0.1%   0.0%  
COL       3.1%   5.2%   0.8%   0.1%   0.0%

The big winners in the Elo scenario are the Russians, who gain a seed and avoid a round-robin encounter with either Spain or Croatia. Australia gets a seed as well, but the benefit of protection from the powerhouses isn’t as valuable as the luck that shone on the Aussies in the actual draw.

Imagine a world with no rankings

Finally, let’s see what happens if we ignore the rankings altogether. It would be unusual for the tournament to take such an approach, but if there’s ever a time to have a tennis event with no seedings, this is it. The existing rankings are far too dependent on years-old results, leaving young teams at a disadvantage. And my system, while more accurate, doesn’t quite feel appropriate either. It is based on individual player ratings, and this is a team event.

Whatever the likelihood of a ranking-free draw in the Davis Cup future, here’s what a simulation looks like with completely random assignment of nations into round-robin groups:

Country  Group     QF     SF      F      W  
ESP      62.8%  75.4%  52.4%  34.8%  22.5%  
FRA      54.8%  68.6%  42.6%  25.0%  13.9%  
CRO      53.4%  67.2%  41.0%  23.6%  13.0%  
USA      48.8%  62.9%  35.9%  19.1%   9.7%  
RUS      47.9%  61.0%  34.8%  18.5%   9.3%  
AUS      47.1%  61.1%  34.1%  17.6%   8.5%  
SRB      41.5%  54.3%  28.0%  13.5%   6.1%  
GER      40.3%  53.6%  26.7%  12.3%   5.3%  
ARG      31.9%  44.9%  18.8%   7.2%   2.6%  
ITA      31.5%  44.2%  18.6%   7.1%   2.5%  
GBR      30.7%  43.4%  17.6%   6.5%   2.3%  
CAN      30.4%  42.7%  17.4%   6.4%   2.2%  
JPN      25.9%  36.4%  13.5%   4.6%   1.4%  
BEL      17.2%  25.9%   7.2%   1.8%   0.4%  
NED      12.5%  20.0%   4.6%   0.9%   0.2%  
CHI      10.4%  16.9%   3.5%   0.6%   0.1%  
KAZ       7.0%  11.8%   1.9%   0.3%   0.0%  
COL       5.9%   9.7%   1.5%   0.2%   0.0%

Round-robin formats do a decent job of surfacing the best teams, so the fully random approach doesn’t give us wildly different results than the seeded simulations. The main effect of the no-seed version is to give the weakest sides a slightly better chance at advancing past the group stage, since there is a better chance for them to avoid strong round-robin competition.

Madrid or Maldives redux

Some top players are likely to skip the event. Zverev has said he’ll be in the Maldives, and Djokovic has hinted he may miss the tournament as well. The new three-rubber format means that teams will suffer a bit less from the absence of a singles star, assuming he isn’t also one of the best doubles options. Still, both Germany and Serbia would much rather head to the party with a top-three singles player on their side.

Here are the results of the initial simulation–based on the actual draw–but without Djokovic or Zverev:

Country  Group     QF     SF      F      W  
ESP      46.5%  59.5%  44.0%  33.2%  21.3%  
FRA      68.2%  79.3%  49.6%  30.6%  17.8%  
AUS      74.3%  84.5%  46.1%  24.2%  12.6%  
USA      53.4%  66.2%  37.5%  20.4%  10.8%  
CRO      30.3%  42.5%  28.4%  19.6%  10.8%  
RUS      23.2%  33.6%  21.1%  13.8%   7.0%  
GBR      67.0%  79.0%  40.9%  14.6%   5.2%  
ARG      52.1%  66.9%  35.5%  12.9%   4.9%  
GER      36.4%  52.3%  23.3%   7.2%   2.2%  
ITA      24.2%  35.9%  14.5%   5.7%   2.2%  
CAN      22.4%  33.2%  13.4%   5.2%   2.0%  
JPN      19.4%  31.7%  11.5%   4.8%   1.6%  
BEL      20.5%  32.4%   8.6%   2.3%   0.6%  
SRB      12.4%  21.1%   6.0%   1.9%   0.5%  
NED      21.6%  35.5%   9.8%   2.0%   0.4%  
CHI      11.4%  18.5%   4.9%   0.9%   0.2%  
KAZ      11.3%  19.1%   3.8%   0.5%   0.1%  
COL       5.2%   9.0%   1.2%   0.2%   0.0%

Germany’s chances of winning the inaugural Pique Cup would fall from 7.7% to 2.2%, and Serbia’s odds would drop from 6.0% to 0.5%. Argentina and France, the seeded teams sharing groups with Germany and Serbia, respectively, would be the biggest gainers from such high-profile absences.

Anybody’s game

I’ve been skeptical of the new Davis Cup, and while I remain unconvinced that it’s an improvement, I find myself getting excited for the weeklong tennis hootenanny in Madrid. These simulations were even more encouraging. As always, the ranking and seeding aren’t the way I’d do it, but in this format, the differences are minimal. The event format will give us a chance to see plenty of tennis from every qualifying nation, and the high level of competition from most of these countries ensures that most teams have a shot at going all the way.

Is Doubles As Entertaining As We Think?

For as long as I’ve been following tennis, there’s been a tension between the amount of doubles available to watch and the amount of doubles that fans say they want to watch. In-person spectators flock to doubles matches at grand slams and aficionados pass around GIFs of the most outrageous, acrobatic doubles points. Yet broadcasters almost always stick with singles, leaving would-be viewers chasing down online streams, often illegal ones.

There are some good reasons for that, foremost among them the marquee drawing power of the best singles players. Broadcasters are convinced that their audiences would rather watch a Fed/Rafa/Serena/Pova blowout than a potentially more entertaining one-on-one contest between unknowns, let alone a doubles match. And they’re probably right–at least, they’ve got ratings numbers to back them up. So we’re left with a small population of hipster doubles fans, confident that two-on-two is the good stuff, even if most of us rarely watch it.

It’s probably impossible to quantify entertainment value, but that doesn’t mean we shouldn’t try. What can the numbers tell us about the watchability of doubles?

Hip to be rectangular

There’s plenty of room for a diversity of preferences–one fan’s Monfils may be another fan’s Isner. But there are some general principles that seem to define entertaining tennis for most spectators. Winners are better than errors, for one. Long rallies are better than short ones, at least within reason. And you can never go wrong with more net play.

If net play were the only criterion, doubles would beat singles easily. But what about other factors? I started wondering about this while researching a recent post on gender differences in mixed doubles, when I came across a match in which every rally was four shots or fewer. For every brilliant reflex half-volley, doubles features a hefty dose of big serving and tactically high-risk returning. Especially in men’s doubles, that translates into a lot of team conferences and not very much shotmaking.

Let’s see some numbers. For each of the five main events at the 2019 Australian Open–men’s and women’s singles, men’s and women’s doubles, and mixed doubles–here is the average rally length, the percentage of points that ended in three shots or fewer, and the percentage of points that required at least ten shots:

Event            Avg Rally  ≤3 Shots  10+ Shots  
Men's Singles          3.2     72.6%       5.1%  
Women's Singles        3.4     67.9%       5.4%  
Men's Doubles          2.5     81.6%       1.1%  
Women's Doubles        2.9     76.7%       2.4%  
Mixed Doubles          2.8     74.0%       1.8%

There's a family resemblance in these numbers, but it's clear that doubles points are shorter. Men's doubles is the most extreme, at 2.5 shots per point. By comparison, only 8% of the men's singles matches in the Match Charting Project database have an average rally length lower than that. More than four out of every five men's doubles points end by the third shot, and with barely one in one hundred points lasting to ten shots, you'd be lucky to sit through an entire match and see more than one such exchange.
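
As a rough illustration of how the three columns in the rally table could be computed, here is a short Python sketch. The input list is made up, and the convention for counting shots (for instance, whether the serve counts as a shot) is an assumption rather than something specified in this post.

```python
def rally_summary(rally_lengths):
    """Average rally length plus the share of short and long points."""
    n = len(rally_lengths)
    short_points = sum(1 for r in rally_lengths if r <= 3)
    long_points = sum(1 for r in rally_lengths if r >= 10)
    return {
        "avg_rally": sum(rally_lengths) / n,
        "pct_three_or_fewer": 100.0 * short_points / n,
        "pct_ten_plus": 100.0 * long_points / n,
    }

# Hypothetical rally lengths for ten points; not real match data.
sample_points = [1, 1, 2, 3, 3, 4, 2, 1, 5, 12]

if __name__ == "__main__":
    print(rally_summary(sample_points))
```

One long rally in a small sample moves the average a lot, which is why the two percentage columns are useful alongside it.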

Quantity and quality

Shorter points are the nature of the format. Even recreational players can find it hard to keep the ball in play when half of each team is patrolling the net, looking for an easy putaway. Short-rally tennis can still be entertaining, as long as the quality of play offsets the unfavorable watching-to-waiting ratio.

I've mentioned my perception that men's doubles features a lot of unreturned serves. The numbers suggest that I spoke too soon. For the five events, here is the percentage of points in which the return doesn't come back in play:

Event            Unret%  
Men's Singles     31.7%  
Women's Singles   24.3%  
Men's Doubles     32.1%  
Women's Doubles   21.6%  
Mixed Doubles     29.3%

For men, singles and doubles are about the same. Perhaps the singles servers are a bit stronger, but the doubles returners are taking more chances, trying to avoid feeding weak returns to aggressive netmen. With women, you're more likely to see a return in play in a doubles match than in singles. Unless you're a connoisseur of powerful serves, you'll probably find higher rates of returns in play to be more enjoyable to watch.

The same applies to winners, compared to unforced errors. (Forced errors are a bit tricky–sometimes they are as exciting and indicative of quality as a winner; other times they're just an out-of-position unforced error.) Let's see what fraction of points end in various ways, for each of the five events:

Event            Unforced%  Forced%  Winner%  
Men's Singles        25.6%    16.2%    21.3%  
Women's Singles      28.9%    16.0%    23.4%  
Men's Doubles        12.8%    17.2%    29.9%  
Women's Doubles      20.9%    18.0%    32.1%  
Mixed Doubles        14.5%    17.0%    29.5%

Here, doubles is the clear winner. For both men and women, more doubles points than singles points end in winners, and fewer points end in unforced errors. Some of that reflects the much higher rate of net play, since it's easier to execute an unreturnable shot from just a few feet behind the net. There are a few more forced errors in doubles, perhaps representing failed attempts to handle volleys that almost went for winners, but no matter how we interpret them, the difference in forced errors is not enough to offset the differences in winners and unforced errors.

The hipsters weren't wrong

The numbers aren't as conclusive as I expected them to be. Yes, doubles points are shorter, but not so much so that the format is reduced to only serving and returning. (Though some men's matches are close.) As usual, our data has limitations, but the information available for each point suggests that there's plenty of high-quality, entertaining tennis to be seen on doubles courts, even if it's usually limited to four or five shots at a time.

Top Seed Upsets in ATP 250s

Italian translation at settesei.it

In a typical week, no one would notice if Fabio Fognini, Karen Khachanov, and Lucas Pouille combined to go 0-3. This week is different, as those three men held the top seeds at the ATP events in Cordoba, Sofia, and Montpellier. After their first-round byes, each of them lost in the second round, to Aljaz Bedene, Matteo Berrettini, and Marcos Baghdatis, respectively. At least two of the top seeds pushed their opponents to three sets, while Fognini lasted only 71 minutes.

This is not the first time a trio of number one seeds have suffered first-match upsets in the same week. Amazingly, it’s not even the first such occurrence in this very week on the calendar. Two years ago, when the South American event was played in Quito, the results were the same: top seeds Marin Cilic, Ivo Karlovic, and Dominic Thiem all failed to win a match. Thiem’s vanquisher, Nikoloz Basilashvili, even extended the streak the following week, heading to Memphis and handing Karlovic his second straight second-round ouster.

Predictable upsets?

Focusing on these losses, it’s natural to wonder whether top seeds are particularly fragile in this sort of tournament. There’s certainly a logic to it. The number one seed at an ATP 250 is usually ranked in the top 20, and is the sort of player who might have considered taking the week off. He knows that more ranking points are available at slams and Masters, so winning a smaller event isn’t his highest priority. His opponent, on the other hand, is competing every chance he gets, and the points on offer at a smaller event could make a big difference in his standing. Further, he has already played–and won–his first-round match, so he might be performing better than usual, or the conditions might suit him particularly well.

Let’s put it to the test. Since 2010, not counting this week’s carnage, I found 267 non-Masters events at which a top seed got a first-round bye and completed his second-round match. (Additionally, there have been three retirements and one withdrawal; only one of those resulted in a loss for the top seed.) The number one seeds had a median rank of 10, and the underdogs had a median rank of 89. Based on my surface-weighted Elo ratings at the time of each match, the favorites should have won 81.5% of the time. That’s better than this week’s trio of top-seeded losers, who were 64% (Fognini), 80% (Khachanov), and 69% (Pouille) favorites.

As it happened, the unseeded challengers were more successful than expected. The favorites won only 76.8% of those matches–a rate low enough that, if the Elo forecasts were accurate, a shortfall this large would occur by chance only about 3% of the time. It’s not an overwhelming effect–certainly not enough that we should have predicted this week’s results–but it seems that a few of the top seeds are showing up unmotivated and a handful of the underdogs are playing better than expected.
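
One way to sanity-check a figure like that 3% is a one-sided binomial test. The sketch below is an approximation, not necessarily the calculation behind the number above: it treats every match as having the same 81.5% win probability (the real forecasts vary match by match), and the count of 205 favorite wins is backed out from 76.8% of 267 rather than taken from the data.

```python
from math import comb

def binom_cdf(k, n, p):
    """Probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n_matches = 267                              # from the text
expected_rate = 0.815                        # average Elo forecast, from the text
observed_wins = round(0.768 * n_matches)     # assumption: 76.8% of 267, i.e. 205

# Chance that favorites win this few matches (or fewer) if the forecasts hold.
p_value = binom_cdf(observed_wins, n_matches, expected_rate)

if __name__ == "__main__":
    print(f"favorite wins: {observed_wins}, one-sided p-value: {p_value:.3f}")
```

With these inputs the result should come out in the neighborhood of the 3% quoted above. A more careful version would simulate each match with its own forecast probability.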

Riding the wave

What about the underdog winners? Once they’ve defeated the top seed, how many capitalize on the opportunity? Berrettini came back to beat Fernando Verdasco in his quarter-final match today, while Baghdatis and Bedene play later. My forecasts believe that, of the three, Bedene has the best chance of claiming a title, though still less than a one-in-five shot at doing so.

In our subset of 267 matches, the underdog won 66 of them. More than half the time, though, that was the end of the run. Thirty-eight of the 66 (58%) fell in the quarter-finals. Another 17 lost in the semis. Whatever works so well for these underdogs in the second round disappears afterward. In the 105 matches contested by these 66 men in the quarter-finals and beyond, Elo thinks they should have won 44.9% of them. Instead, they managed only 42.3%.

There’s still a bit of hope. Five men knocked out the top seed in the second round and went on to win the entire tournament. One of them was Victor Estrella, who knocked out Karlovic and went on to hoist the trophy in Quito two years ago. Maybe there’s some magic in week six. This week’s trio of underdogs would surely love to think so.

Bianca Andreescu’s Very, Very Good Week

Italian translation at settesei.it

WTA fans have grown accustomed to watching teenagers blast their elders off the court, but nobody expected this. 18-year-old Bianca Andreescu, ranked just outside the top 150, qualified for the season-opening Auckland event with three victories, overpowered Timea Babos in the first round, and then proceeded to knock out two former WTA No. 1s, Caroline Wozniacki and Venus Williams. She advances to the semi-final in just her fifth tour-level main draw and will jump at least a few dozen places in the rankings.

What makes Andreescu’s feat so notable is the pedigree of her opponents. Sure, Wozniacki was dealing with physical issues and Williams isn’t quite the unstoppable force she used to be, but fringe players like the Canadian teenager don’t knock out multiple former No. 1s very often.

Going back to 1984, I found just over 2,000 matches in which a top-ranked or former top-ranked player lost. Over 300 players have recorded a win against such an opponent, and elite players have accumulated a lot of these upsets. Serena Williams has beaten No. 1s or former No. 1s over 100 times, and Venus has done so 65 times, including her first-round win over Victoria Azarenka this week.

Andreescu’s achievement in Auckland was the 171st time (again, since 1984) that a player beat two or more such opponents at the same tournament, so we’ve seen it happen about five times per season. It has become more frequent in recent years, at least in part because there are so many former top-ranked players on tour, giving would-be giant-killers more opportunities. Most of the players who beat multiple No. 1s are themselves elite players: Serena accounts for 26 of the 171 tournaments, and Venus for another 9. Andreescu was the 71st different woman to pull off the feat.
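
The counting behind a figure like 171 boils down to grouping wins by tournament and winner. Here is a small Python sketch using only results mentioned in this post. It simplifies in one important way: it uses a fixed set of players who have held the top ranking, whereas a real query would need to check whether each loser had already reached No. 1 by the date of the match.

```python
from collections import defaultdict

# Players named in this post as current or former No. 1s (static simplification).
NO1_PLAYERS = {
    "Serena Williams", "Venus Williams", "Caroline Wozniacki",
    "Victoria Azarenka", "Ana Ivanovic",
}

# (event, winner, loser) results taken from the text of this post.
MATCHES = [
    ("2019 Auckland", "Bianca Andreescu", "Timea Babos"),
    ("2019 Auckland", "Bianca Andreescu", "Caroline Wozniacki"),
    ("2019 Auckland", "Bianca Andreescu", "Venus Williams"),
    ("2019 Auckland", "Venus Williams", "Victoria Azarenka"),
    ("2015 Toronto", "Belinda Bencic", "Serena Williams"),
    ("2015 Toronto", "Belinda Bencic", "Caroline Wozniacki"),
    ("2015 Toronto", "Belinda Bencic", "Ana Ivanovic"),
]

def multi_no1_weeks(matches, no1_players, minimum=2):
    """Map (event, winner) to the No. 1s beaten, keeping only multi-win weeks."""
    beaten = defaultdict(set)
    for event, winner, loser in matches:
        if loser in no1_players:
            beaten[(event, winner)].add(loser)
    return {key: sorted(names) for key, names in beaten.items()
            if len(names) >= minimum}

if __name__ == "__main__":
    for (event, winner), names in multi_no1_weeks(MATCHES, NO1_PLAYERS).items():
        print(event, winner, names)
```

Venus's single win over Azarenka is filtered out, while Andreescu's and Bencic's weeks both qualify.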

At just over 18.5 years of age, the Canadian is one of the youngest players to beat multiple former No. 1s at the same event. She’s a bit older than Belinda Bencic was when she knocked out Serena, Wozniacki, and Ana Ivanovic in Toronto in 2015, but before that we need to go back to the 2006 French Open to find a woman who recorded similar upsets at an earlier age. Here is the full list of such feats accomplished at or before Andreescu’s age:

Event                 Player              Age  
1997 French Open      Martina Hingis     16.7  
1998 Key Biscayne     Anna Kournikova    16.8  
1998 Berlin           Anna Kournikova    16.9  
2006 French Open      Nicole Vaidisova   17.1  
2004 Wimbledon        Maria Sharapova    17.2  
1999 Indian Wells     Serena Williams    17.4  
1999 Key Biscayne     Serena Williams    17.5  
1987 Key Biscayne     Steffi Graf        17.7  
1988 Boca Raton       Gabriela Sabatini  17.8  
1999 Manhattan Beach  Serena Williams    17.9  
2005 Miami            Maria Sharapova    17.9  
1999 US Open          Serena Williams    17.9  
2015 Toronto          Belinda Bencic     18.4  
1996 Tokyo            Iva Majoli         18.5  
2019 Auckland         Bianca Andreescu   18.5

She wouldn’t be the first player on this list to flame out before taking a place among the all-time greats, but in general, that’s good company for an 18-year-old qualifier.

Andreescu stands out even more when we consider that she is ranked far outside the top 100. (At least for another few days.) Of the 171 occasions when a player knocked out two current or former No. 1s, none involved a player ranked so low. The only other player to accomplish such a thing while outside the top 100 was Louisa Chirico, who beat Azarenka and Ivanovic at the 2016 Madrid event. The Canadian’s career-best week is only the 13th time that a player beat two such opponents while ranked outside the top 40, and a few of those instances came when a typically-great player’s ranking was recovering from time away:

Event                 Player              Age  Rank  
2019 Auckland         Bianca Andreescu   18.5   152  
2016 Madrid           Louisa Chirico     20.0   130  
2003 French Open      Nadia Petrova      21.0    76  
2017 Madrid           Eugenie Bouchard   23.2    60  
2007 Istanbul         Aravane Rezai      20.2    59  
2010 Australian Open  Maria Kirilenko    23.0    58  
2009 Beijing          Shuai Peng         23.7    53  
2014 Montreal         CoCo Vandeweghe    22.7    51  
2007 Beijing          Shuai Peng         21.7    49  
2005 Paris            Dinara Safina      18.8    48  
2015 Doha             Victoria Azarenka  25.6    48  
2018 Indian Wells     Naomi Osaka        20.4    44  
2014 Dubai            Venus Williams     33.7    44

Two shocking upsets are no guarantee of future success, but the demonstrated ability to defeat such elite veterans is probably more indicative of future success than winning a handful of ITF $25K titles (as she has) or winning multiple junior grand slam doubles titles (as she did). On a tour already full of promising young stars, it took Andreescu only 48 hours to establish herself as one of the WTA teenagers most worth watching.

Measuring the Impact of Break Points

Yesterday I dove deep into tiebreak luck. I explained that while better players tend to win more tiebreaks, there’s no special tiebreak skill that causes certain players to perform better at the end of sets than they do at other stages of the match. Therefore, if a player has a long stretch of excellent or dismal tiebreak results, we should discard the tempting hypothesis that he or she possesses some special tiebreak talent and assume that he or she will post more average results in the future.

The same is true of break points. In any given season, you can find players who win or lose a disproportionate number of break points, and it’s tempting to point to mental strength by way of explanation. Yet more often than not, the unusual results disappear, along with any convincing case that we’ve identified a notably steely or flimsy tennis brain.

To quantify those over- and underachievements, I’ve attempted to measure the number of break points converted compared to the “expected” number, where the expectation is defined by how often the player wins return points. (It’s a bit more complicated than looking up a player’s single-season return-points-won (RPW) rate. Instead, we consider their RPW for each match, and weight the matches according to how many break point opportunities occurred in the match.) For example, Gael Monfils converted 146 of his 317 break point chances last year, good for a 46.1% win rate. That far outstrips his weighted RPW of 38.7%. He claimed 23 more break points than expected, or an excess of 19%. Parallel to my approach with tiebreaks, I’ve named those stats, so the counting stat is Break Points Over Expectation (BPOE) and the rate stat is Break Points Overperformance Rate (BPOR).

(On average, returners win slightly fewer break points than non-break points. I’ve adjusted the “expected” level downward by 1.4% to account for this.)
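
The definitions above translate into a few lines of Python. This is a sketch under stated assumptions, not the exact code used for the tables: the per-match inputs are hypothetical, and the 1.4% adjustment is applied multiplicatively, which is one reading of "downward by 1.4%" (it could also mean percentage points).

```python
def break_point_stats(matches, adjustment=0.014):
    """Compute BPOE and BPOR from per-match (bp_chances, bp_won, rpw_rate)."""
    chances = sum(m[0] for m in matches)
    won = sum(m[1] for m in matches)
    # Weight each match's return-points-won rate by its break point chances.
    weighted_rpw = sum(m[0] * m[2] for m in matches) / chances
    # Assumption: the downward adjustment scales the expectation by (1 - 1.4%).
    expected = chances * weighted_rpw * (1 - adjustment)
    return {
        "chances": chances,
        "won": won,
        "weighted_rpw": weighted_rpw,
        "bpoe": won - expected,   # Break Points Over Expectation
        "bpor": won / expected,   # Break Points Overperformance Rate
    }

# Hypothetical three-match sample: (break point chances, converted, RPW rate).
sample_matches = [(10, 5, 0.40), (4, 1, 0.30), (6, 3, 0.45)]

# Consistency check on the published Monfils line: 146 won, BPOE of 23.4.
monfils_expected = 146 - 23.4
monfils_bpor = 146 / monfils_expected

if __name__ == "__main__":
    print(break_point_stats(sample_matches))
    print(f"Monfils implied BPOR: {monfils_bpor:.2f}")
```

The Monfils check only uses his season totals, so it confirms that the BPOE and BPOR columns agree with each other rather than reproducing the match-by-match weighting.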

Monfils was an outlier, the only player in 2018 to exceed +20 BPOE, and the only player with 40-plus matches to post a BPOR of more than 15%. Yet there was little in his past performance that would have told us what was coming. From 2009 to 2017, he had three negative seasons, two years indistinguishable from neutral, and four above average. Over the entire span, he won break points less than one percent more often than expected. The Frenchman’s pressure-point success in 2018 could be thanks to some newfound mental strength, but if history is any guide, he won’t continue to display whatever mix of luck and nerves led him to post his circuit-leading figures.

Here are the best and worst break point performances, by BPOE, posted by ATPers with at least 20 tour-level matches last year:

Player                 Chances  Won   BPOE  BPOR  
Gael Monfils               317  146   23.4  1.19  
Mackenzie McDonald         252  116   19.0  1.20  
Michael Mmoh               129   63   16.9  1.37  
Malek Jaziri               298  134   16.2  1.14  
Pierre-Hugues Herbert      297  126   16.1  1.15  
Adrian Mannarino           318  136   14.1  1.12  
Ricardas Berankis          235  103   13.8  1.15  
Sam Querrey                290  118   13.8  1.13  
Martin Klizan              313  139   13.5  1.11  
Jan Lennard Struff         272  118   13.4  1.13  
                                                  
Marton Fucsovics           414  162  -11.5  0.93  
Filip Krajinovic           238   86  -11.8  0.88  
Evgeny Donskoy             239   79  -11.9  0.87  
Stan Wawrinka              217   66  -11.9  0.85  
Aljaz Bedene               303  108  -12.9  0.89  
John Isner                 308   85  -13.0  0.87  
Mischa Zverev              347  123  -14.1  0.90  
Marin Cilic                568  209  -18.1  0.92  
Joao Sousa                 484  176  -21.6  0.89  
Novak Djokovic             617  246  -21.7  0.92

It’s striking to see Novak Djokovic at the bottom of the list, nearly as bad or unlucky as Monfils was good or fortunate. Yet Novak’s story is surprisingly similar to Gael’s. From 2009 to 2017, his overall BPOR was 0.997–almost precisely neutral–and he posted nearly as many positive seasons as negative ones.

Yep, it’s random

To give more player-specific examples would only belabor the point: A player’s performance on break points (independent of his overall return-point skill) has no relationship from one year to the next. I found 700 pairs of consecutive player-seasons between 2009 and 2018 (for example, Djokovic’s 2017 and 2018) and found that the correlation between the two seasons was effectively zero. (r^2 = 0.002)

Here’s one more illustration of the point. This table shows the ten players who recorded the highest 2017 BPOR figures of those men who played at least 20 ATP matches in both 2017 and 2018. The right-most column shows what they did the following year:

Player             2017 BPOR  2018 BPOR  
Damir Dzumhur           1.16       1.05  
Alexander Zverev        1.15       1.02  
Nicolas Kicker          1.15       1.04  
Peter Gojowczyk         1.14       0.92  
Dusan Lajovic           1.13       1.04  
Mikhail Kukushkin       1.13       0.94  
Mischa Zverev           1.13       0.90  
John Isner              1.12       0.87  
Andrey Rublev           1.12       0.96  
Thiago Monteiro         1.12       1.17  
AVERAGE                 1.14       0.99

Only Thiago Monteiro continued to be successful enough to maintain a place amid the tour leaders; John Isner’s follow-up campaign was so different that he registered as one of the tour’s worst in 2018. Taken together, five of 2017’s top ten ended 2018 below average, and the ten men combined for a BPOR just a bit worse than neutral. This is all just another way of saying we’re looking at something indistinguishable from chance.
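
The summary figures in that paragraph can be rechecked directly from the table. This is only a sketch over the rounded values as printed, so the averages may differ in the last digit from ones computed on unrounded data.

```python
from statistics import mean

# (2017 BPOR, 2018 BPOR) for the ten players in the table above
pairs = {
    "Damir Dzumhur": (1.16, 1.05),
    "Alexander Zverev": (1.15, 1.02),
    "Nicolas Kicker": (1.15, 1.04),
    "Peter Gojowczyk": (1.14, 0.92),
    "Dusan Lajovic": (1.13, 1.04),
    "Mikhail Kukushkin": (1.13, 0.94),
    "Mischa Zverev": (1.13, 0.90),
    "John Isner": (1.12, 0.87),
    "Andrey Rublev": (1.12, 0.96),
    "Thiago Monteiro": (1.12, 1.17),
}

avg_2017 = mean([first for first, _ in pairs.values()])
avg_2018 = mean([second for _, second in pairs.values()])

# Players whose follow-up season fell below the neutral mark of 1.0
below_neutral_2018 = [name for name, (_, second) in pairs.items() if second < 1.0]
```

The rounded inputs average about 1.135 for 2017 and about 0.991 for 2018, and five of the ten names fall below 1.0 in the second year.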

Putting a price tag on good fortune

We’ve established that break point performance in the present has nothing to tell us about break point performance in the future. But as I pointed out in yesterday’s post about tiebreaks, that very lack of predictiveness has value.

Monfils’s BPOE of +23 worked in his favor, helping him rack up more victories in 2018 than he otherwise would have. His break point results probably boosted his ranking and prize money tally. Reverting to neutral break point performance won’t knock him off tour, but assuming he continues to serve and return at the same level he did last year, a more pedestrian BPOE could hurt his results. But how much?

Yesterday I suggested that two additional tiebreaks are equal to one additional win. Break points are a bit more complicated–clearly a single break point is not as valuable as an entire tiebreak, both because it is a single point and because it rarely offers the player a chance to finish off an entire set or match. On the other hand, break points are more numerous, and figures like Monfils’s +23 and Djokovic’s -21 are more extreme than the most unexpected tiebreak performances.

Measuring high-leverage points

The key to measuring the impact of break points is the general concept of win probability, and the more specific notion of leverage. (Leverage is often referred to as volatility or importance; these are all the same basic idea.) Win probability is simply a measure of each player’s chances of winning the match at any given stage. Leverage is an index of how much a single point can affect that probability. Say two equal players embark on a new match. Before the first ball is struck, each has a 50% chance of emerging victorious. If winning the first point increases the server’s chance of winning to 51% while losing it decreases his probability to 49%, we would say that the leverage of the first point is 2%–the difference between the win probabilities that would result from winning or losing the point.

The more crucial the point, the higher the leverage. The typical point is well below 5%, but a truly high-pressure moment, like 5-6 in a third-set tiebreak, can be as high as 50%.
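
To make the 5-6 tiebreak example concrete, here is a minimal sketch of leverage inside a deciding-set tiebreak. It assumes a single fixed probability p that the player wins any given point, ignoring who is serving, and it assumes the tiebreak decides the match. Both are simplifications, and the function names are hypothetical.

```python
from functools import lru_cache


def tiebreak_win_prob(a, b, p=0.5):
    """Chance of winning a first-to-7, win-by-2 tiebreak from the score a-b."""
    q = 1.0 - p
    # From any tied score at 6-6 or later, the player has to get two points
    # clear before the opponent does.
    tied_late = p * p / (p * p + q * q)

    @lru_cache(maxsize=None)
    def wp(x, y):
        if x >= 7 and x - y >= 2:
            return 1.0
        if y >= 7 and y - x >= 2:
            return 0.0
        if x >= 6 and y >= 6:
            if x == y:
                return tied_late
            if x > y:
                return p + q * tied_late  # one point ahead
            return p * tied_late          # one point behind
        return p * wp(x + 1, y) + q * wp(x, y + 1)

    return wp(a, b)


def leverage(a, b, p=0.5):
    """Win probability after winning the next point minus after losing it."""
    return tiebreak_win_prob(a + 1, b, p) - tiebreak_win_prob(a, b + 1, p)


opening_point = leverage(0, 0)
down_match_point = leverage(5, 6)
```

At 5-6 with equal players, winning the point leaves a 50% chance and losing it ends the match, so the sketch returns a leverage of 0.5, far above what it gives for the opening point of the same tiebreak.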

Win probability stats depend a great deal on the inputs you choose, so there’s no single mathematically correct leverage measurement at any given moment. If you think two players are equal, your estimate of the win probability at the start of the match is very different than if you think one of the competitors is a heavy favorite. Those judgements affect the leverage of every point as well. Still, for aggregates of large numbers of matches–say, an entire season–we can get a general idea of the value of break points.

Necessary assumptions

If we make the simple but clearly wrong assumption that all players are equal, the leverage of the average point on the ATP tour last year was 4.6%, and the leverage of the average break point on tour last year was 10.5%. Those numbers are useful as a starting point, but they are clearly too high; when we accept that most matches are not contested between players of equal skill, we realize that any given break point isn’t quite that important–if Djokovic fails to convert one against Monteiro, he’ll remain almost certain to win the match.

One alternative approach is to assume that each player’s skill level is represented exactly by their performance in a given match. So if Djokovic plays Monteiro and wins 80% of service points, while Monteiro wins only 60%, we could calculate the win probability and leverage of every point using those numbers. Using that method, we get a leverage of 2.9% on the average point and 6.5% on the average break point.

The second assumption is also not exactly right, but it probably gets closer to the truth than the first. Keeping in mind that it’s an approximation, let’s use a break point leverage of 7.5%. That figure means that, on average, changing the result of a single break point affects the win probability of a single match by 7.5%. Another way of thinking about it–the one most relevant to the task at hand–is that winning a break point instead of losing it is equivalent to winning 7.5% (or about one-thirteenth) of a match.

Break points are (fractions of) wins

Returning to the concept of BPOE, we can now say that 13 additional break points are equivalent to one additional win. Monfils’s 2018 tally of +23 was good for almost two extra victories over the course of the season, and Djokovic’s count of -21 would, on average, cost him 1.5 matches. Given the multitude of other factors influencing each man’s performance, it’s unreasonable to expect either player’s won-loss record in 2019 to bounce back so predictably and precisely. (Especially since it’s impossible to win 1.5 matches.) But in the unlikely event that all else is equal, we should expect those advantages and disadvantages to disappear in the new season.
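
The conversion is simple enough to write down. This sketch uses the approximate 7.5% leverage chosen above and the rounded tallies of +23 and -21; the function name is hypothetical.

```python
BREAK_POINT_LEVERAGE = 0.075  # approximate value of one break point, in matches


def break_points_to_wins(bpoe, leverage=BREAK_POINT_LEVERAGE):
    """Convert break points over expectation into equivalent match wins."""
    return bpoe * leverage


break_points_per_win = 1 / BREAK_POINT_LEVERAGE  # a little over 13
monfils_wins = break_points_to_wins(23)          # roughly +1.7
djokovic_wins = break_points_to_wins(-21)        # roughly -1.6
```

Because the 7.5% figure is itself an approximation, the results are best read as "about two wins" and "about a win and a half" rather than as precise values.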

The range of minus-21 to plus-23 break points is a decent representation of how extreme break point luck can be. Since 2009, only four players have posted single-season numbers above +23, including the most extreme BPOE of +34, accumulated by Damir Dzumhur in 2017. (Dzumhur was hit hard by the ensuing reverse in fortune: His 2017 tour-level record was 37-24, but in 2018, when his BPOE fell to a still-lucky +8, his record dropped to 25-31.) At the opposite extreme, Dominic Thiem suffered from a tally of 28 break points below expectations in 2015. A year later, he bounced back to minus-5, and his ranking improved from 19th to 9th. Despite the roller-coaster descents and climbs of Dzumhur and Thiem, the range of the break-point-luck effect appears to be about five wins, from about minus-2 wins at the low end to plus-3 for the players most favored by fortune.

For most players in most seasons, however, break point luck is little more than a rounding error. And while it’s easy to get sucked into the measurements I’ve laid out, that’s the most important point of all: Just like there’s no special tiebreak factor, there’s no reason to think that certain players are somehow better at break points than others. The better a player’s return game, the more break points he’ll convert. Anything beyond that will eventually regress to the mean. And for players with extremely strong or weak break point performances, that regression is likely to have effects that extend to the overall won-loss record, ranking, and beyond.

The Effect of Tiebreak Luck

I’ve written several things over the years about players who win more or fewer tiebreaks than expected. (Interested readers should start here.) Fans and commentators tend to think that certain players are particularly good or bad at tiebreaks. For instance, they might explain that a big serve is uncommonly valuable at the end of a set, or that mental weakness is more harmful than ever at such times.

My research has shown that, for the vast majority of players, tiebreak results are indistinguishable from luck. Let me qualify that just a bit: Tiebreak results are dependent on each player’s overall skill, so better players tend to win more tiebreaks. But there’s no additional factor to consider. While players tend to win service points at a slightly lower rate in tiebreaks, the effect is similar for everyone. There’s no magical tiebreak factor.

However, a single season is short enough that some players will always have glittering tiebreak records, tricking us into thinking that they have some special skill. In 2017, John Isner won 42 of his 68 tiebreaks, a 62% success rate. Based on his rate of service points won and return points won against the opponents he faced in tiebreaks, we’d expect him to win only 34–exactly half. Whether by skill or by luck, he exceeded expectations by 8 tiebreaks. Armed with a monster serve and a steady emotional presence on court, Isner is the kind of guy who makes us think that he has hacked the game of tennis, that he has figured out how to win tiebreaks. But while he has beaten expectations several times throughout his career, even Big John can’t sustain such a level. In 2018, he played 73 tiebreaks, and the simple model predicts that he would win 41. He managed only 39.

For additional examples, name whichever player you’d like. Roger Federer has built a career on unshakeable service performances, yet his tiebreak performances have been roughly neutral for the last four years. In other words, he wins tiebreak serve and return points at almost exactly the same rate as he does non-tiebreak points. Robin Haase, infamous for his record streak of 17 consecutive tiebreak losses, has paralleled Federer’s tiebreak performance for the last four years. 2018 was particularly good for his high-pressure record, as he won two more breakers than expected, putting him in the top quartile of ATP players for the season.

Meaning from randomness

In short, season-by-season tiebreak performance resembles a spreadsheet full of random numbers. A player with a good tiebreak record last year may well sustain it this year, but only if it’s based on good overall play. If there is an additional secret to tiebreak excellence (beyond being good at tennis), no one has told the players about it.

But in sports statistics, every negative result has a silver lining. We might be disappointed if a stat is not predictive of future results. However, the very lack of predictiveness allows us to make a different kind of prediction. If a player has a great tiebreak year, beating expectations in that category, the odds are he just got lucky. Therefore, he’s probably not going to get similarly lucky this year, and his overall record will regress accordingly.

The player to watch in 2019 in this department is Taylor Fritz, who recorded a sterling 20-8 record in tiebreaks last season. Based on his performance in the whole of those matches, we would have expected him to win only 13 of 28. His Tiebreaks Over Expectations (TBOE) of +7 exceeded that of any other tour player last season, even though many of his peers contested far more breakers. It’s always possible that Fritz really does have the magical mix of steely nerves and impeccable tactics that translates into tiebreak wins, but it’s far more likely that he’ll post a neutral tiebreak record in 2019. In 2017, the player after Isner on the TBOE list was Jack Sock, and it’s fair to say that his 2018 campaign didn’t exactly continue in the same vein.

With that regression to the mean in mind, here are the TBOE leaders and laggards from the 2018 ATP season. The TBExp column shows the number of tiebreaks that the simple model would have predicted, and TBOR is a rate-stat version of TBOE, reflecting the percentage of tiebreaks won above or below average. Rate stats like TBOR are usually more valuable than counting stats like TBOE, but in this case the counting stat may have more to tell us, since it takes into account which players contest the most tiebreaks. Sam Querrey’s rate of underperformance isn’t quite as bad as Cameron Norrie’s, but the number of tiebreaks he plays is a result of his game style, justifying his place at the bottom of this list.

Player                 TBs  TBWon  TBExp  TBOE   TBOR  
Taylor Fritz            28     20   13.3   6.7   0.24  
Bradley Klahn           22     16   10.6   5.4   0.24  
Martin Klizan           16     13    8.1   4.9   0.31  
Kei Nishikori           22     17   12.5   4.5   0.20  
Bernard Tomic           18     14    9.6   4.4   0.24  
Alexander Zverev        23     17   13.2   3.8   0.17  
Albert Ramos            22     15   11.2   3.8   0.17  
Adrian Mannarino        25     16   12.3   3.7   0.15  
Stan Wawrinka           21     13    9.6   3.4   0.16  
Juan Martin Del Potro   32     22   18.7   3.3   0.10  
                                                       
Borna Coric             21      8   10.8  -2.8  -0.13  
Denis Shapovalov        30     12   15.0  -3.0  -0.10  
Karen Khachanov         42     20   23.4  -3.4  -0.08  
Ivo Karlovic            47     19   22.6  -3.6  -0.08  
Denis Istomin           31     13   16.7  -3.7  -0.12  
Ricardas Berankis       22      7   10.9  -3.9  -0.18  
Pablo Cuevas            21      7   11.3  -4.3  -0.20  
Andrey Rublev           18      5    9.6  -4.6  -0.26  
Fernando Verdasco       25      8   12.8  -4.8  -0.19  
Roberto Bautista Agut   26     10   14.8  -4.8  -0.19  
Cameron Norrie          22      5    9.9  -4.9  -0.22  
Sam Querrey             36     12   18.5  -6.5  -0.18
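
The columns in the table above relate to each other in a simple way. In this sketch, TBOE is tiebreaks won minus the model's expectation, and TBOR is TBOE divided by tiebreaks played. The TBOR definition is inferred from the table values rather than stated outright, and the function name is hypothetical.

```python
def tiebreak_stats(played, won, expected):
    """Return (TBOE, TBOR) for one player-season."""
    tboe = won - expected
    tbor = tboe / played  # inferred definition: excess wins per tiebreak played
    return tboe, tbor


# (TBs, TBWon, TBExp) copied from three rows of the table
rows = {
    "Taylor Fritz": (28, 20, 13.3),
    "Cameron Norrie": (22, 5, 9.9),
    "Sam Querrey": (36, 12, 18.5),
}

stats = {name: tiebreak_stats(*row) for name, row in rows.items()}
```

Run on these three rows, the sketch reproduces the printed TBOE and TBOR columns after rounding, including Querrey's larger counting-stat deficit alongside Norrie's worse rate.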

The guys at the top of this list can expect to see their tiebreak records drift back to normalcy in 2019, while the guys at the bottom have reason to hope for an improvement in their overall results this year.

Converting tiebreaks to wins

I’m sure we all agree that tiebreaks are really important, but what’s the real impact of the over- and underperformance I’m talking about here? In other words, given that Kei Nishikori won 4.5 more tiebreaks last season than expected (than he “should” have won), how did that affect his overall won-loss record? And by extension, what might it mean for his match record in 2019?

The math gets hairy*, but in the end, two additional tiebreak wins are roughly equal to one additional match win. Nishikori’s 4.5 bonus tiebreaks are equivalent to about 2.25 additional match wins. He was 48-22 last year, so with neutral tiebreak luck, he would’ve gone 46-24. Of course, that still leaves some unanswered questions; translating match record to ranking points and titles is much messier, and I won’t attempt anything of the sort. His lucky tiebreaks might have converted should-have-been-losses into wins, or they might have turned gut-busting three-setters into more routine straight-set victories. But blending all the possibilities together, each player’s TBOE has a concrete value we can convert to wins.

The exact numbers aren’t important here, but the concept is. When you see an extremely good or bad tiebreak record, you don’t need to whip out a spreadsheet and calculate the precise number of breakers the player should have won. Given neutral luck, every ATP regular should have a tiebreak record between 40% and 60%–40% for the guys at the fringe, 60% for the elites. (In 2018, Federer’s expected rate was 60.1%, and Sock’s was 40.9%.) Any number out of that range, like Richard Gasquet’s 13-of-16 in 2016, is bound to come crashing back to earth, though rarely so catastrophically as the Frenchman’s did, falling to a mere 5 wins in 17 tries.

Any given tiebreak might be determined by superlative serving, daring return tactics, or sheer mental fortitude. But over time, those effects even out, meaning that no player is consistently good or bad in breakers. The better player is more likely to win, but luck has a huge say in the outcome. In the long term, that luck usually cancels itself out.

* A quick overview of the math: In a best-of-three match, there are three possible times that the tiebreak can take place. Flipping the result of a tiebreak could change the result of the first set, the second set, or the third set. The win probability impact of flipping the first set is 50%–assuming equal players, the winner has a 75% chance of winning the match and the loser has a 25% chance. The win probability effect of reversing the second set is also 50%. Either the winner takes the match (100%) instead of forcing a third set (50%), or the winner forces a third set (50%) instead of losing the match (0%). Changing the result of the third set directly flips the outcome of the match, so the win probability effect is 100%.

Every completed match has a first and second set, but fewer than 40% of ATP matches have third sets. The weighted average of 50%, 50%, and 100% is about 58%, which would be our answer if ATPers played only best-of-three matches. The math for five-setters is more involved, but the most important thing is that best-of-five gives each of the first four sets less leverage, and by extension, it does the same to tiebreaks in the first four sets. Weighing that effect combined with the frequency of best-of-five set matches would give us a precise value to convert TBOE to wins. Rather than going further down that rabbit hole, I’m happy with the user-friendly and approximately correct figure of 50%.
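
The best-of-three arithmetic above can be checked in a few lines. The sketch uses 40% as the share of matches reaching a third set (the text says "fewer than 40%", so this is an upper bound) and assumes a tiebreak is equally likely in any set that gets played.

```python
# Win probability swing from flipping the set that a tiebreak decides,
# assuming equal players
swing = {
    "set1": 0.75 - 0.25,  # winner goes to 75% instead of 25%
    "set2": 0.50,         # 100% vs. 50%, or 50% vs. 0%
    "set3": 1.00,         # flips the match outright
}

# How often each set is played, per best-of-three match
frequency = {"set1": 1.0, "set2": 1.0, "set3": 0.40}

weighted_swing = (
    sum(swing[s] * frequency[s] for s in swing) / sum(frequency.values())
)
```

The result is about 0.58, the best-of-three figure quoted above, before any allowance for best-of-five matches.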

Are Two First Serves Ever Better Than One?

Italian translation at settesei.it

It’s one of those ideas that never really goes away. Some players have such strong first serves that we often wonder what would happen if they hit only first serves. That is, if a player went all-out on every serve, would his results be any better?

Last year, Carl Bialik answered that question: It’s a reasonably straightforward “no.”

Bialik showed that among ATP tour regulars in 2014, only Ivo Karlovic would benefit from what I’ll call the “double-first” strategy, and his gains would be minimal. When I ran the numbers for 2015–assuming for all players that their rates of making first serves and winning first-serve points would stay the same–I found that Karlovic only breaks even. Going back to 2010, 2014 Ivo was the only player-season with at least 40 matches for whom two first serves would be better than one.

Still, it’s not an open-and-shut case. What struck me is that the disadvantage of a double-first strategy would be so minimal. For Karlovic (and others, mainly big servers, such as Jerzy Janowicz, Milos Raonic, and John Isner), hitting two first serves would only slightly decrease their overall rate of service points won. For Rafael Nadal and Andy Murray, opting for double-first would reduce their rate of service points won by just under two percentage points.

Here’s a visual look at 2015 tour regulars (minimum 30 matches), showing the hypothetical disadvantage of two first serves. The diagonal line is the breakeven level; Ivo, Janowicz, and Isner are the three points nearly on the line.

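[Figure: hypothetical disadvantage of two first serves for 2015 tour regulars, with a diagonal breakeven line]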

Since some players are so close to breaking even, I started to wonder if some matchups make the double-first strategy a winning proposition. For example, Novak Djokovic is so dominant against second serves that, perhaps, opponents would be better off letting him see only first serves.

However, it remains a good idea–at least in general–to take the traditional approach against Djokovic. Hypothetically, two first serves would result in Novak raising his rate of return points won by 1.2 percentage points. Gilles Simon and Andy Murray are in similar territory, right around 1 percentage point.

Here’s the same plot, showing the disadvantage of double-first against tour-regular returners this season:

[Figure: hypothetical disadvantage of two first serves against tour-regular returners this season]

There just aren’t any returners who would cause the strategy to come as close to breaking even as some big servers do.

The match-level tactic

What happens if a nearly-breakeven server, like Karlovic, faces a not-far-from-breakeven returner, like Djokovic? If opting for double-first is almost a good idea for Ivo against the average returner, what happens when he faces someone particularly skilled at attacking second serves?

Sure enough, there are lots of matches in which two first serves would have been better than one. I found about 1300 matches between tour regulars (players with 30+ matches) this season, and for each one, I calculated each player’s actual service points won along with their estimated points won had they hit two first serves. About one-quarter of the time, double-first would have been an improvement.

This finding holds up in longer matches, too, avoiding some of the danger of tiny samples in short matches. In one-quarter of longer-than-average matches, a player would have still benefited from the double-first strategy. Here’s a look at how those matches are distributed:

[Figure: distribution of individual matches relative to the double-first breakeven line]

Finally, some action on the left side of the line! One of those outliers in the far upper right of the graph is, in fact, Ivo’s upset of Djokovic in Doha this year. Karlovic won 85% of first-serve points but only 50% of second-serve points. Had he hit only first serves, he would’ve won about 79% of his service points instead of the 75% that he recorded that day.

Another standout example is Karlovic’s match against Simon in Cincinnati. Ivo won 81% of first-serve points and only 39% of second-serve points. He won the match anyway, but if he had pursued a double-first strategy, Simon could’ve caught an earlier flight home.
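
The comparison behind these examples can be sketched as follows, under the same simplifying assumption that first-serve-in and first-serve-won rates would not change. The first-serve-in rate of 70% is a hypothetical placeholder, since the match reports above don't give it, and "second-serve points won" is assumed to include double faults as lost points. Function names are illustrative.

```python
def standard_rate(first_in, first_won, second_won):
    """Service points won with the usual first serve, then second serve."""
    return first_in * first_won + (1 - first_in) * second_won


def double_first_rate(first_in, first_won):
    """Service points won if the second ball is another first serve."""
    return first_in * first_won + (1 - first_in) * first_in * first_won


def double_first_is_better(first_in, first_won, second_won):
    # After a missed first serve, another first serve is worth
    # first_in * first_won; a normal second serve is worth second_won.
    return first_in * first_won > second_won


ASSUMED_FIRST_IN = 0.70  # hypothetical; not taken from the matches above

# First- and second-serve points won, as quoted for the two Karlovic matches
doha = (0.85, 0.50)
cincinnati = (0.81, 0.39)

doha_gain = double_first_rate(ASSUMED_FIRST_IN, doha[0]) - standard_rate(
    ASSUMED_FIRST_IN, *doha
)
cincinnati_gain = double_first_rate(
    ASSUMED_FIRST_IN, cincinnati[0]
) - standard_rate(ASSUMED_FIRST_IN, *cincinnati)
```

The breakeven test reduces to one comparison: double-first pays off only when the first-serve-in rate times the first-serve win rate exceeds the second-serve win rate. With these inputs, that holds in both matches, and by a wider margin in Cincinnati.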

Predicting double-first opportunities

Armed with all this data, we would still have a very difficult time identifying opportunities for players to take advantage of the strategy.

For each player in every match, I multiplied his “double-first disadvantage” (the number of percentage points of serve points won he would lose by hitting two first serves) by the returner’s double-first disadvantage. Ranking all matches by the resulting product puts combinations like Karlovic-Djokovic and Murray-Isner together at one extreme. If we are to find instances where we could retroactively predict an advantage from hitting two first serves, they would be here.

When we divide all these matches into quintiles, there is a strong relationship between the double-first results we would predict using season-aggregate numbers and the double-first results we see in individual matches. However, even in the most double-first-friendly quintile–the one filled with Ivo serving and Novak returning–there’s still, on average, a one-percentage-point advantage to the traditional serving tactic.

It is only at the most extreme that we could even consider recommending two first serves. When we take the 2% of matches with the smallest products–that is, the ones we would most expect to benefit from double first–26 of those 50 matches are ones in which the server would’ve done better to hit two first serves.
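
The ranking step can be sketched like this. The matchup labels and disadvantage figures are made up for illustration, and the sketch assumes both disadvantages are positive, since a product of two negative values would need separate handling.

```python
def rank_by_product(matchups):
    """Sort (label, server_disadvantage, returner_disadvantage) rows by the
    product of the two disadvantages, smallest first. Disadvantages are in
    percentage points and assumed to be positive."""
    return sorted((server * returner, label) for label, server, returner in matchups)


def split_quintiles(ranked):
    """Split a ranked list into five groups of (nearly) equal size."""
    n = len(ranked)
    return [ranked[i * n // 5:(i + 1) * n // 5] for i in range(5)]


# Hypothetical server-returner pairings, not real player data
matchups = [
    ("S1-R1", 0.2, 1.0),
    ("S2-R2", 3.0, 4.0),
    ("S3-R3", 1.5, 2.0),
    ("S4-R4", 5.0, 3.5),
    ("S5-R5", 0.5, 1.2),
    ("S6-R6", 2.5, 2.5),
    ("S7-R7", 4.0, 4.5),
    ("S8-R8", 1.0, 3.5),
    ("S9-R9", 6.0, 5.0),
    ("S10-R10", 2.0, 4.5),
]

quintiles = split_quintiles(rank_by_product(matchups))
most_friendly = [label for _, label in quintiles[0]]
```

The first quintile holds the pairings with the smallest products, which are the ones where season-level numbers suggest double-first comes closest to paying off.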

In other words, there’s a ton of variance at the individual match level, and since the margins are so slim, there are almost no situations where it would be sensible for a player to hit two first serves.

A brief coda in the real world

All of this analysis is based on some simplifying assumptions, namely that players would make their first serves at the same rate if they were hitting two instead of one, and that players would win the same number of points behind their first serves even if they were hitting them twice as often.

We can only speculate how much those assumptions mask. I suspect that if a player hit only first serves, he would be more likely to see streaks of both success and failure; without second serves to mix things up, it would be easier to find oneself repeating mechanics, whether perfect or flawed.

The second assumption is probably the more important one. If a server hit only first serves, his ability to mix things up and disguise serving patterns would be hampered. I have no idea how much that would affect the outcome of service points–but it would probably act to the advantage of the returner.

All that said, even if we can’t recommend that players hit two first serves in any but the extreme matchups, it is worth emphasizing that the margins we’re discussing are small. And since they are small, the risk of hitting big second serves isn’t that great. There may be room for players to profitably experiment with more aggressive second serving, especially when a returner starts crushing second serves.

Ceding the advantage on second-serve points to a player like Djokovic must be disheartening. If the risk of a few more double faults is tolerable, we may have stumbled on a way for servers to occasionally stop the bleeding.

Premier or International? Balancing Rewards and Draw Quality

This week, WTA players had a choice of tournaments: a Premier event in Stanford or an International in Washington. Stanford offers far more ranking points–470 to Washington’s 280 for the winner–and an even bigger difference in prize money–$120,000 to $43,000.

For the very best players, it’s an easy choice to head for the event with the biggest rewards. But further down the rankings, it’s not so clear cut. If enough top contenders gather at one tournament, there may be easier points (and dollars) for the taking elsewhere.

The pairing of Stanford and Washington provides a neat natural experiment that allows us to analyze players’ scheduling decisions. Both tournaments are in the same country and played on the same surface. The only major difference is the package of available rewards.

(Of course, for any particular player, there may be other strong reasons to choose one event or the other, such as local ties, previous success at the event, sponsorship commitments, or appearance fees. Also, some players might opt for Washington because of its closer proximity–and lack of time zone changes–to upcoming events in Montreal, Cincinnati, and New York. For the purpose of this analysis, though, we’ll have to ignore personal considerations.)

Lucie’s choice

Let’s start with an example: Lucie Safarova. The 17th-ranked Czech is the top seed in Washington. Before the draw was released, a simple ranking-based projection would’ve given her a 14% chance of winning the title, making her the favorite. Had she entered Stanford, she would’ve been the 8th seed, and a similar forecast would’ve given her a 3% chance of winning the title.

Advantage Washington? If Lucie wants prestige, a trophy, and more time on court, yes. But if she prefers ranking points and cash, she still should have gone to California.

That 14% chance of winning the Citi Open title, combined with her pre-tournament odds of reaching each preceding round, gives Safarova a weighted forecast of 87 ranking points and $11,800 in prize money. Had she opted for Stanford, her weighted expectation would be 95 ranking points and $21,170.
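
A weighted forecast of this kind can be sketched as below. Only the 280 points for the Washington title comes from the figures above; the round-by-round probabilities and the other point values are hypothetical placeholders, as is the function name.

```python
def expected_reward(reach_prob, reward):
    """Expected points (or dollars) for one player at one event.

    reach_prob[i] is the chance of reaching stage i or better, ordered from
    the first round to the title. reward[i] is the payout for finishing
    exactly at stage i.
    """
    total = 0.0
    for i, payout in enumerate(reward):
        next_prob = reach_prob[i + 1] if i + 1 < len(reach_prob) else 0.0
        # Chance of finishing exactly at this stage, times its payout
        total += (reach_prob[i] - next_prob) * payout
    return total


# Stages: first round, second round, quarterfinal, semifinal, final, title
reach = [1.0, 0.6, 0.35, 0.2, 0.1, 0.05]  # hypothetical probabilities
points = [1, 30, 60, 110, 180, 280]       # placeholders, except 280 for the title

expected_points = expected_reward(reach, points)
```

Running the same function with a prize money list in place of the points list gives the dollar forecast, and comparing the two events for one player is then a matter of two calls with different probabilities and payouts.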

Safarova’s comparison is indicative of what we find with many more players in action this week. Even with a higher chance of advancing to the final rounds in Washington, the ranking point balance tilts in Stanford’s favor, and the prize money difference is even more extreme.

California cash

The contrast between the two events is much starker in terms of dollars than in points. As we’ve seen, the champion in Stanford receives almost three times as much as her fellow trophy-winner in Washington, but not even twice as many ranking points.

Because the prize-money pot is so much bigger in Stanford, every direct-entry player in the draws of both tournaments would have expected a bigger check from Stanford. The differences run from the extreme–Agnieszka Radwanska could have expected only 38% as much prize money in Washington as in Stanford–to the less outrageous–Ekaterina Makarova, the Citi Open #2 seed, can expect 67% as much cash in DC as she would have expected in Stanford.

Still, every single player with the option to enter either event could have expected a bigger paycheck had they chosen Stanford.

Ranking point decisions

When it comes to WTA ranking points, Stanford holds much less of an edge. Of the 48 direct-entry players in the two tournaments, 11 of them can expect more ranking points in Washington than in Stanford, including Makarova, whose expected points haul is 15% greater in DC than it would’ve been at Bank of the West.  Most of the players who would’ve done better in Washington would be seeded in DC but not in Stanford, giving them the likelihood of a much easier early-round draw at the east-coast event.

Still, for the majority of players, the bigger rewards in Stanford outweigh the difficulty of the competition. 37 of the 48 direct entries would be expected to earn more points in Stanford, and for 15 of them, their expected points in Washington would be less than 80% as much as the comparable number in California. Nine of those 15 are playing Washington. In fairness, a few of those players were ranked below the cut for Stanford, so they didn’t have a choice.

On average, players in action this week could expect 15% more ranking points in Stanford than in Washington, along with double the prize money.

Smart choices

Not every player is going to maximize her chances of winning money and racking up points every week. But it is striking that, given the choices that players made this week, the balance between risks (crashing out early to a great player) and rewards (points and cash) is so out of whack.

It may be that secondary concerns, like proximity to other events, hold more importance than I am giving them credit for. It could be that, in the run-up to higher-stakes events next month, some players are interested in playing more matches. The Citi Open does offer the likelihood of that.  Also, some players commit to one event or the other before knowing much about the relative field strength–there is the possibility that players underestimated the quality of this year’s Washington draw, which has not always been so strong.

Still, it is striking to find little evidence that players made optimal choices. On average, the players who chose Stanford could expect 16% more ranking points than if they had played Washington. The players who opted for DC could have expected 14% more ranking points in Stanford–basically the same as their colleagues on the other coast.

With this much at stake, many players could’ve improved their lot simply by thinking through their options a little better. In general, if you’re likely to be seeded at one tournament and not the other, go where the seed is. If you will be seeded at both or unseeded at both, go where the higher stakes are.

For more detail on methodology, keep reading.

Continue reading Premier or International? Balancing Rewards and Draw Quality

Roger Federer’s Break Point Opportunities

Remember Roger Federer’s dreadful performance on break points against Tommy Robredo at last year’s US Open? Of course you do. He had 16 chances to break, converted only two of them, and lost the match in straight sets. Then we all cried.

Yesterday, Federer won in straight sets against James Duckworth, but his break point performance wasn’t much better.  Four breaks of serve was all he needed to cruise to victory, but the Australian saved 13 other break chances.  In his disappointing loss to Lleyton Hewitt in Brisbane, Fed only converted 1 of 10 break chances.

Is this the end? Is a lack of break point conversions the monster that will finally slay the old man?

Not so fast.

To identify how bad (or, possibly, good) Federer has been on break points, we must compare that performance to his record on other return points. Roger isn’t the same kind of master returner as Novak Djokovic or Rafael Nadal, so it would be unrealistic to expect him to convert as many break points as they do. To control for general returning ability, we must compare break point conversion rate to winning percentage on all other return points.
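Here is a minimal sketch of that comparison, assuming the index is simply the break point conversion rate divided by the winning percentage on all other return points. The function name and the season totals are hypothetical, chosen only to show the arithmetic.

```python
def break_point_index(bp_won, bp_chances, return_won, return_played):
    """Break point conversion rate relative to win rate on other return points.

    Above 1: better on break points than on other return points.
    Below 1: worse on break points than on other return points.
    """
    # Remove break points from the return totals to get "all other return points".
    other_won = return_won - bp_won
    other_played = return_played - bp_chances
    if bp_chances <= 0 or other_played <= 0 or other_won <= 0:
        raise ValueError("need break points and other return points won to compare")
    bp_rate = bp_won / bp_chances
    other_rate = other_won / other_played
    return bp_rate / other_rate


# Hypothetical season totals, not any real player's numbers.
index = break_point_index(bp_won=184, bp_chances=500,
                          return_won=2000, return_played=5000)
print(f"break point index: {index:.3f}")
```

Dividing by the player's own rate on other return points is what controls for returning ability: a great returner and a mediocre one can both score exactly 1 if each is as good on break points as he is the rest of the time.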

Sure enough, 2013 wasn’t a good year for Fed.  His break point conversion rate was 8% lower than his winning percentage on other return points.  When I ran these numbers after the Robredo match, that ranked 40th out of the ATP top 50.

Most of us, thinking back to Fed’s glory days, surely imagine that this is new.  And it’s true: 2013 was a bad year. But watch out for runaway narratives–there’s more randomness here than trend.  The graph below shows how Fed has performed each year on break point conversions.  A number above 1 is good: He’s winning more break point chances than other return points, as in 2009, when he exceeded expectations by 4.4%. Below 1 is bad: Last year was 7.8% below expectations.

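[Figure: Federer’s break point conversion rate relative to his winning percentage on other return points, by year]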

If you see a pattern here, I’m impressed. 2013 was bad, but not as bad as 2003, when 21-year-old Fed performed more than 10% worse on break point chances than on other return points. That same season, he went 78-17 and won seven tournaments, including Wimbledon and the Masters Cup, raising his ranking from #6 to #2.

Last year’s break point record was also comparable to 2007, when he converted 5.9% fewer break points than expected … and won three Grand Slams.

As with so many popular tennis stats, this one just doesn’t have that much of a relationship with winning.  Breaks matter, but missed break chances don’t. In Federer’s case, even breaks don’t always matter that much–he’s one of history’s best in tiebreaks.

The bigger picture with break point conversions

Over his career, Federer has been just a tick below average on break points, winning other return points about 1.5% more often than break points. The year-to-year fluctuations don’t appear to be terribly meaningful.

That isn’t to say that no player has strong break point tendencies.  Nadal has consistently excelled in these clutch situations, winning more break points than expected for each of the last five seasons.  He is even better when facing break point, typically winning about 7% more service points in that situation than in others.  (Some of that is due to the advantage of a lefty serving in the ad court.)

Novak Djokovic has also been a little better on break points than on return points as a whole. But last year–a season he finished within a whisker of #1–his performance in those situations was almost as poor as Federer’s.

Andy Murray is consistent when handed break point chances–consistently bad. Since 2006, he has only exceeded expectations once. In 2012–a pretty good year for him by most standards–he won 7.3% fewer break point chances than other return points.

David Ferrer? A tick below expectations. 7.7% below other return points in 2013. Juan Martin del Potro? Consistently above expectations, including an impressive +6.8% in 2011.  Stanislas Wawrinka? -7.3% in 2011, +7.8% in 2012, then in his breakthrough 2013 campaign, -3.0%.

Constant exposure to break point stats has tricked us into thinking they are particularly meaningful. There are plenty of reasons why Federer is winning fewer matches than he used to–for one thing, he’s almost as old as I am–but break point performance just isn’t that important.