A First Look at the Down-the-Line Backhand

When executed correctly and deployed at the right moment, the down-the-line backhand is one of the most devastating shots in tennis. How valuable is it, and which players use it the most effectively? These are surprisingly complicated questions, and I don’t yet have solid answers. But the preliminary work of determining how often players use the down-the-line backhand, and how often they succeed when they do, is illuminating in itself.

The Match Charting Project offers a lot of data on tactics like this. MCP charts record the type and direction of every shot. In a rally between right-handers, a down-the-line (DTL) backhand is simple to identify: a backhand from the backhand corner to the opponent’s forehand corner. (Or in MCP parlance, 3b1.) From the 2010s alone, the MCP has logged close to 100,000 DTL backhands, roughly evenly split by gender.

MCP charts also give us an idea of the bigger picture. We can identify opportunities to hit DTL backhands, in which a player might choose instead to hit a backhand in a different direction, or even to use a different shot entirely. A player who hits a lot of slices, or runs around the backhand to hit forehands, might hit a very high percentage of their backhands down the line, but those DTL backhands wouldn’t make up a high proportion of their total chances in that corner.

DTL opportunities

Let’s start with a look at those opportunities. The following table shows three rates for each tour, covering charted matches, 2010-present. The first rate is the percentage of backhand-corner opportunities that resulted in backhands. (For today’s purposes, I’m excluding slice backhands. DTL slices can be devilish, but they are an entirely different weapon. I’ve also excluded service returns, which present their own complications.) The second is the percentage of opportunities that resulted in down-the-line backhands, and the third is a combination of those two: the percentage of backhands from the backhand corner that were hit down the line.

Tour  BH/Opps  DTL BH/Opps  DTL/All BH  
ATP     63.7%        11.1%       17.4%  
WTA     73.6%        12.8%       17.4% 

Women are much more likely than men to hit a non-slice backhand from their backhand corner. There are two reasons for that. First, men, on average, hit more slices, largely because a small number of men hit a lot of slices. Second, men are somewhat more likely to run around the backhand and hit a forehand from that corner. Because men hit fewer backhands in total, they also hit fewer DTL backhands as a percentage of all shots from that corner.

However, once they choose to hit a backhand, men and women go down the line at exactly the same rate, 17.4%, or roughly once per six backhands.
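
The arithmetic that links the three rates can be sketched in a few lines of Python. This is only an illustration: the `Shot` record, its field names, and the toy sample are assumptions made for the example, not the Match Charting Project's actual data format.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """One hypothetical shot hit from the backhand corner (returns excluded)."""
    stroke: str          # "bh" (non-slice backhand), "bh_slice", or "fh"
    down_the_line: bool  # True if hit to a right-handed opponent's forehand corner


def dtl_rates(opportunities):
    """Return the three rates from the table for a list of backhand-corner shots."""
    backhands = [s for s in opportunities if s.stroke == "bh"]
    dtl = [s for s in backhands if s.down_the_line]
    return {
        "bh_per_opp": len(backhands) / len(opportunities),
        "dtl_per_opp": len(dtl) / len(opportunities),
        "dtl_per_bh": len(dtl) / len(backhands),
    }


# Toy sample (made up): 10 opportunities, 6 non-slice backhands, 1 of them DTL.
sample = (
    [Shot("bh", True)]
    + [Shot("bh", False)] * 5
    + [Shot("bh_slice", False)] * 2
    + [Shot("fh", False)] * 2
)
rates = dtl_rates(sample)

# The third rate is the second divided by the first, which is how the
# published tour-level figures fit together (11.1 / 63.7 and 12.8 / 73.6).
atp_dtl_per_bh = round(11.1 / 63.7 * 100, 1)
wta_dtl_per_bh = round(12.8 / 73.6 * 100, 1)
print(rates, atp_dtl_per_bh, wta_dtl_per_bh)
```

Both published pairs of rates work out to 17.4% once rounded, matching the third column of the table.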

DTL results

Let’s look at the results of those DTL backhands. Here’s another table with aggregate ATP and WTA numbers, showing the percentage of DTL backhands that go for winners (including shots that induce forced errors), the percentage that are unforced errors, and the percentage that lead to the most important thing–ultimately winning the point:

Tour  Winner%   UFE%  Points Won%  
ATP     22.1%  18.1%        51.6%  
WTA     26.0%  22.2%        52.4% 

Both men and women have a “positive” ratio of winners to unforced errors. But women hit a lot more of both. (As we’ll see, some women have eye-poppingly aggressive numbers.) And both genders, on average, end points more frequently with DTL backhands than they do with other shots, whether we look at all shots, or all backhands. The average non-slice backhand–counting those from every position, hit in every direction–goes for approximately 10% winners and 10% unforced errors.

The percentage of points won doesn’t look very dramatic, at 51.6% and 52.4% for men and women, respectively. Yet both numbers reverse the usual expectation for backhands. Backhands occur more often in defensive positions, so they are slightly more likely to appear in points lost than in points won, and the corresponding numbers for all backhands are below 50%. This is a good example of what makes shot analysis so difficult: Do DTL backhands result in more points won because they are a better tactical decision, or because players hit them more often in response to weak balls? It’s probably a combination of both, but more a reflection of the latter.

A lefty digression

I will get to some player-by-player numbers shortly, but first, let’s look at an interesting comparison between lefties and righties. I’ve long speculated that lefties–because they mostly face right-handed opponents–must learn to play “backwards.” While righties can whack crosscourt forehands at each other, a lefty rarely has the chance to do that. As a result, left-handers spend more time practicing unusual shots, like inside-out groundstrokes and the DTL backhand. That’s my theory, anyway.

Sure enough, left-handed men hit quite a few more DTL backhands than their right-handed peers. Righties go down the line on 16.9% of their backhands from the backhand corner, while lefties do so 21.4% of the time. Rafael Nadal plays a sizable part in this, as he represents a lot of the charted matches of left-handers, and he goes down the line 24.4% of the time, more than almost any other man. (Another lefty, Martin Klizan, is one of the few to be more extreme than Rafa, at 25.2%.) Still, a gap of several percentage points remains even if we exclude Nadal.

But this is hardly a physical law. Women show the opposite trend, in the aggregate. Right-handed women go down the line 17.6% of the time, while lefties do so 15.8% of the time. A few female lefties fit the mold of Nadal and Klizan, including Lucie Safarova (26.3%) and Ekaterina Makarova (26.1%). But in general, it is the most aggressive women–regardless of their dominant hand–who use the DTL backhand the most often. Jelena Ostapenko tops 27%, and Dayana Yastremska forces us to rescale the y-axis with a rate of 33%.

The DTL trade-off

I mentioned above a prime difficulty in evaluating shot selection. The most important measurement of any tactic is whether it results in more points won. If hitting more DTL backhands didn’t improve a player’s rate of points won, why would she do it? But if hitting more DTL backhands does improve her rate of points won, she should look to hit more of them, which means finding opportunities from slightly more challenging positions … which means winning points at a slightly lower rate. Push that logic to its extreme, and a superior tactic will no longer result in many more points won than the inferior tactic it replaces.

This problem, combined with the obvious fact that players have different skills and preferences, means that there’s not a strong relationship between a player’s rate of DTL backhands and their success, measured in points won, when they hit them. There is a very slight negative correlation (for both men and women) between the frequency with which a player hits DTL backhands and the number of DTL backhand winners he or she hits, suggesting that there are limited opportunities to swing away and hit a clean winner. For women, however, there is no relationship between the rate of DTL backhands and the rate of points won.

There is one minor exception to the barrage of non-relationships. For men, there is a weak negative correlation (r^2 = 0.13) between the rate at which the player hits DTL backhands and the rate of points won. That result tracks with the intuition described above, that as a player opts for the tactic more often, his results will decline–not because he plays worse, but because he is opting for the tactic in riskier situations. A player who goes down the line on 10% of his backhands is just picking the low-hanging fruit, while a player who does so 25% of the time is sometimes hitting an awfully low-percentage shot.

DTL by player: ATP

Thus, we might (very cautiously!) conclude that a player who is winning a high percentage of points when he hits a DTL backhand should do so even more often. Here are 25 of the most prominent ATPers, sorted by the frequency with which they hit the DTL backhand:

Player                 DTL/BH   Wnr%   UFE%  Pts Won%  
Rafael Nadal            24.5%  12.1%  11.1%     54.7%  
John Isner              22.0%  23.2%  27.3%     38.2%  
Novak Djokovic          21.2%  16.7%  16.1%     54.2%  
Jo-Wilfried Tsonga      21.0%  20.6%  27.7%     45.8%  
Denis Shapovalov        20.5%  20.1%  23.5%     49.1%  
Stan Wawrinka           19.1%  28.8%  26.8%     51.4%  
Kei Nishikori           18.8%  27.7%  19.1%     56.7%  
Dominic Thiem           18.4%  28.5%  28.2%     51.6%  
Fabio Fognini           18.3%  20.4%  23.8%     49.3%  
David Goffin            18.2%  23.5%  23.8%     49.5%  
Roger Federer           18.2%  25.5%  21.0%     53.2%  
Grigor Dimitrov         17.7%  27.4%  23.6%     50.5%  
Nick Kyrgios            17.7%  19.5%  23.5%     44.4%  
Andy Murray             16.8%  21.7%  16.5%     54.2%  
Richard Gasquet         16.6%  33.5%  23.1%     55.2%  
Juan Martin del Potro   15.5%  24.6%  15.7%     52.2%  
Alexander Zverev        15.3%  32.5%  19.0%     56.1%  
Gael Monfils            14.3%  25.9%  17.6%     54.7%  
Daniil Medvedev         14.3%  17.0%  16.9%     49.6%  
David Ferrer            14.2%  16.9%  18.1%     48.0%  
Stefanos Tsitsipas      14.1%  24.3%  22.9%     49.3%  
Borna Coric             13.6%  29.3%  24.1%     55.4%  
Kevin Anderson          13.3%  25.3%  24.9%     45.9%  
Roberto Bautista Agut   10.4%  17.3%  20.2%     46.3%  
Diego Schwartzman       10.3%  32.5%  22.3%     55.7%

If nothing else, these numbers show us that there are a lot of different ways to win tennis matches. Nadal hits a lot of backhands down the line, but he rarely ends the point that way. Only a bit further down the list, we find players who end the point with DTL backhands more than twice as often. The bottom of the table is filled with players who don’t win many points going down the line, but they are mixed with Diego Schwartzman and Borna Coric, two men who are very effective on the rare occasions they hit the more difficult shot.
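
For readers who want to probe the frequency-versus-success relationship themselves, here is a minimal sketch that computes a Pearson correlation from the 25 rows of the table above. It uses only those 25 listed players, so it is not expected to reproduce the tour-wide figure quoted earlier; the `pearson_r` helper is written out by hand for the example.

```python
from math import sqrt

# (DTL/BH %, Pts Won %) pairs copied from the 25-player ATP table above.
ATP = [
    (24.5, 54.7), (22.0, 38.2), (21.2, 54.2), (21.0, 45.8), (20.5, 49.1),
    (19.1, 51.4), (18.8, 56.7), (18.4, 51.6), (18.3, 49.3), (18.2, 49.5),
    (18.2, 53.2), (17.7, 50.5), (17.7, 44.4), (16.8, 54.2), (16.6, 55.2),
    (15.5, 52.2), (15.3, 56.1), (14.3, 54.7), (14.3, 49.6), (14.2, 48.0),
    (14.1, 49.3), (13.6, 55.4), (13.3, 45.9), (10.4, 46.3), (10.3, 55.7),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


frequency = [freq for freq, _ in ATP]
points_won = [won for _, won in ATP]
r = pearson_r(frequency, points_won)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

With so few players, a single outlier can swing the result, so any number this produces is a rough indication at best.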

DTL by player: WTA

There is no similar tour-wide correlation for women, but that doesn’t mean that each player’s shot selection is optimal. Here are the same stats for 25 prominent WTAers:

Player                DTL/BH   Wnr%   UFE%  Pts Won%  
Dayana Yastremska      33.7%  27.4%  24.7%     54.8%  
Jelena Ostapenko       27.1%  35.0%  33.6%     51.0%  
Serena Williams        25.2%  28.3%  19.6%     57.4%  
Belinda Bencic         21.6%  28.1%  14.6%     59.1%  
Aryna Sabalenka        21.2%  38.7%  25.5%     57.1%  
Madison Keys           20.4%  27.7%  39.9%     46.7%  
Simona Halep           20.1%  25.3%  21.7%     55.8%  
Venus Williams         19.2%  26.1%  19.7%     49.7%  
Bianca Andreescu       19.2%  22.6%  17.9%     59.7%  
Victoria Azarenka      19.1%  25.9%  16.2%     57.3%  
Karolina Pliskova      18.9%  26.6%  23.1%     51.6%  
Garbine Muguruza       18.1%  28.2%  18.9%     57.5%  
Maria Sharapova        18.0%  27.1%  21.4%     53.2%  
Naomi Osaka            17.9%  28.2%  27.7%     48.6%  
Johanna Konta          16.1%  33.4%  29.9%     53.6%  
Petra Kvitova          15.8%  30.9%  24.0%     54.0%  
Caroline Wozniacki     15.6%  25.5%  15.9%     56.8%  
Sloane Stephens        15.1%  25.9%  26.4%     53.2%  
Kiki Bertens           14.7%  21.6%  21.7%     49.0%  
Monica Niculescu       13.2%  29.7%  14.7%     62.9%  
Angelique Kerber       13.2%  26.7%  18.5%     56.2%  
Ashleigh Barty         13.1%  26.9%  29.0%     50.6%  
Marketa Vondrousova    11.5%  29.8%  18.5%     52.3%  
Carla Suarez Navarro   10.9%  33.1%  25.8%     55.9%  
Elina Svitolina        10.2%  27.6%  20.5%     53.9%

A dramatic example is that of Belinda Bencic, who hits more DTL backhands than almost anyone else on this list and is one of the most successful, in terms of points won, when she does so. It’s tough to avoid the hypothesis that she is squandering some opportunities to deploy this weapon. At the opposite extreme, Ostapenko and Madison Keys are extremely aggressive. Ostapenko hits almost as many errors as winners, while Keys hits considerably more errors than winners and wins fewer than half of those points.

As it says on the tin, this is just a first look at the DTL backhand. Evaluating shot selection is hard, and quantifying the effects of shot-level tactics is even harder. But we can’t do it unless we’ve pinned down some of the basics, picking out some useful metrics and doing a first pass for any correlations that might (or probably don’t) exist. While it’s a long process, we’re one baby step closer to some answers.

Match Charting Project Tactics Stats: Glossary

I’m in the process of rolling out more stats based on Match Charting Project data across Tennis Abstract. This is one of several glossaries intended to explain those stats and point interested visitors to further reading.

At the moment, the following tactics-related stats can be seen at a variety of leaderboards.

  • SnV Freq% – Serve-and-volley frequency. The percentage of service points (excluding aces) on which the server comes in behind the serve. I exclude aces because serve-and-volley attempts are less clear (and thus less consistently charted) if the server realizes immediately that he or she has hit an unreturnable serve. I realize this is a minority opinion and thus an unorthodox way to calculate the stat, but I’m sticking with it.
  • SnV W% – Serve-and-volley winning percentage. The percentage of (non-ace) serve-and-volley attempts that result in the server winning the point.
  • Net Freq – Net point frequency. The percentage of total points in which the player comes to net, including serve-and-volley points. I include points in which the player doesn’t hit any net shots (such as an approach shot that leads to a lob winner), but I do not count points ended by a winner that appears to be an approach shot.
  • Net W% – Net point winning percentage. The percentage of net points won by this player.
  • FH Wnr% – Forehand winner percentage. The percentage of topspin forehands (excluding forced errors) that result in winners or induced forced errors.
  • FH DTL Wnr% – Forehand down-the-line winner percentage. The percentage of topspin down-the-line forehands (excluding forced errors) that result in winners or induced forced errors. Here, I define “down-the-line” a bit broadly. The Match Charting Project classifies the direction of every shot in one of three categories. If a forehand is hit from the middle of the court or the player’s forehand corner and hit to the opponent’s backhand corner (or a lefty’s forehand corner), it counts as a down-the-line shot. Thus, some shots that would typically be called “off” forehands end up in this category.
  • FH IO Wnr% – Forehand inside-out winner percentage. The percentage of topspin inside-out forehands (excluding forced errors) that result in winners or induced forced errors. This one is defined more strictly, only counting forehands hit from the player’s own backhand corner to the opponent’s backhand corner (or a lefty’s forehand corner).
  • BH Wnr% – Backhand winner percentage. The percentage of topspin backhands (excluding forced errors) that result in winners or induced forced errors.
  • BH DTL Wnr% – Backhand down-the-line winner percentage. The percentage of topspin down-the-line backhands (excluding forced errors) that result in winners or induced forced errors. As with the forehand down-the-line stat, I define these a bit broadly, catching some “off” backhands as well.
  • Drop Freq – Dropshot frequency. The percentage of groundstrokes that are dropshots. This excludes dropshots hit at the net and those hit in response to an opponent’s dropshot (re-drops).
  • Drop Wnr% – Dropshot winner percentage. The percentage of dropshots that result in winners or induced forced errors. Note that this number itself isn’t a verdict on the dropshot tactic, as it doesn’t count extended points that the player who hit the dropshot went on to win.
  • RallyAgg – Rally Aggression Score. A variation of Aggression Score, a stat invented by MCP contributor Lowell West. At its simplest, any member of this family of aggression metrics is the percentage of shots that end the point–winners, unforced errors, and shots that induce forced errors. RallyAgg excludes serves and is a bit more complex, following the logic that I outlined for Return Aggression by separating winners from unforced errors. For each match, the player’s unforced error rate and winner rate are normalized relative to tour average and expressed in standard deviations above or below the mean. RallyAgg is the average of those two numbers, multiplied by 100 for the sake of readability. The higher the score, the more aggressive the player. Tour average is zero.
  • ReturnAgg – Return Aggression Score. Another variation of Aggression Score, considering only return winners and return errors. As with RallyAgg, winners and errors are separated, and each rate is normalized relative to tour average. ReturnAgg is the average of those two normalized rates, multiplied by 100 for the sake of readability. The higher the number, the more aggressive the returner, and tour average is zero.
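
The normalization behind RallyAgg and ReturnAgg can be sketched as follows. This is an illustration under stated assumptions: the function names and the tour means and standard deviations in `TOUR` are invented for the example, and the real stats are computed per match from charted data.

```python
def z_score(value, tour_mean, tour_sd):
    """Express a rate in standard deviations above or below the tour mean."""
    return (value - tour_mean) / tour_sd


def aggression_score(winner_rate, error_rate, tour):
    """Average the two normalized rates and scale by 100 for readability."""
    z_winners = z_score(winner_rate, tour["winner_mean"], tour["winner_sd"])
    z_errors = z_score(error_rate, tour["error_mean"], tour["error_sd"])
    return 100 * (z_winners + z_errors) / 2


# Hypothetical tour baseline (made-up numbers, not real tour averages).
TOUR = {
    "winner_mean": 0.10, "winner_sd": 0.02,
    "error_mean": 0.10, "error_sd": 0.02,
}

# A player one standard deviation above average on winners, average on errors.
aggressive = aggression_score(0.12, 0.10, TOUR)
# A player exactly at tour average on both rates scores zero.
average = aggression_score(0.10, 0.10, TOUR)
print(round(aggressive), average)
```

Separating winners from errors before averaging means a player cannot score as highly aggressive through errors alone by quite the same margin as one who is extreme on both counts.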

Net Play Has Declined, But This Isn’t Why

Wimbledon is here, so it’s time for another cycle of media commentary about the demise of net play, especially the serve-and-volley. The New York Times published a piece by Joel Drucker last week that covered this familiar territory, cataloguing various reasons why the game has changed. Racket and string technology, along with tweaks to the All England Club playing surface, are rightfully on the list.

But the first reason Drucker gives is the rise of the two-handed backhand and, by extension, the threat posed by players with weapons on both sides:

In May 1999, 43 of the top 100 male players in the world hit their backhands with one hand. As of June 2019, there were 15. According to Mark Kovacs, a sports science consultant and tennis coach, “Most players used to have a weaker side, usually the backhand. And the two-handed backhand changed that completely. It doesn’t give you a spot you can hit to.”

I’m more interested in the “weaker side” argument than the fortunes of the one-handed and two-handed backhands. Many players who still use one-handers, such as Stan Wawrinka, would rightly bristle at a claim that their shots are weak. In terms of effectiveness, the contemporary one-handed shot might have more in common with a two-hander of old than the all-slice, only-defensive backhand favored by many pros in the 1970s and 1980s.

Both sides, now

The “weaker side” argument can be slightly rephrased into a research question: For contemporary players, is there a smaller gap between forehand effectiveness and backhand effectiveness than there used to be?

To answer that, we need a working definition of “effectiveness.” Long-time readers may recall a stat of mine called “potency,” as in “backhand potency” (BHP) or “forehand potency” (FHP). It’s a simple stat, using data derived from the shot-by-shot records of the Match Charting Project, calculated as follows:

BHP approximates the number of points whose outcomes were affected by the backhand: add one point for a winner or an opponent’s forced error, subtract one for an unforced error, add a half-point for a backhand that set up a winner or opponent’s error on the following shot, and subtract a half-point for a backhand that set up a winning shot from the opponent.
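
The weighting scheme just described can be written out as a short Python sketch. The outcome labels below are hypothetical names chosen for the example; the weights are the ones given in the definition.

```python
# Weights from the potency definition; the label names are assumptions.
WEIGHTS = {
    "winner": 1.0,                  # clean winner
    "induced_forced_error": 1.0,    # opponent's forced error
    "unforced_error": -1.0,
    "set_up_own_point": 0.5,        # led to a winner or error on the next shot
    "set_up_opponent_point": -0.5,  # led to a winning shot from the opponent
}


def potency(outcomes):
    """Sum the weights for one player's shots of one type in a match.

    Shots with any other label (e.g. a neutral rally ball) count as zero.
    """
    return sum(WEIGHTS.get(outcome, 0.0) for outcome in outcomes)


# Made-up list of backhand outcomes from a single match.
backhands = [
    "winner", "winner", "unforced_error",
    "set_up_own_point", "neutral", "set_up_opponent_point",
]
bhp = potency(backhands)
print(bhp)  # 1 + 1 - 1 + 0.5 + 0 - 0.5
```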

The same procedure applies to forehand potency and slice potency. The weights–plus one for some shots, plus a half point for others, and so on–are not precise. But the results generally jibe with intuition. Across 3,000 charted ATP matches, an average player’s results from a single match are:

  • Forehand potency (FHP): +6.5
  • Backhand potency (BHP): +0.8
  • Slice potency (SLP): -1.3
  • Backhand side potency (BSP): -0.5

The first three stats isolate single shots, while the final one combines BHP and SLP into a single “backhand side” metric. All of these exclude net shots, and since forehand slices are so rare, I’ve left those out of today’s discussion as well.

The forehand reigns

The numbers above shouldn’t come as a surprise. The average ATP player has a stronger forehand than backhand, regardless of how many hands are on the racket for the latter shot. Novak Djokovic possesses one of the best backhands in the history of the sport, but the gap between his FHP and BSP numbers is greater than average: +11.3 per match for the forehand, and +2.5 for the backhand side, resulting in a difference of 8.8. Even a backhand master reaps more rewards on his other side.

The Match Charting Project has at least three matches’ worth of data for 299 different men across several generations, spanning from Vitas Gerulaitis to Jannik Sinner. Only 30 of them, about one in ten, gain more points on their backhands than on their forehands, and for half of that minority, the difference is less than a single point. It’s a diverse group, including Pat Cash, Jimmy Connors, Guillermo Coria, Ernests Gulbis, Daniil Medvedev, and Benoit Paire. This mixed-bag minority doesn’t provide much evidence to settle the question.

Proponents of the “weaker side” argument often point to the arrival of Lleyton Hewitt as a turning point between the net-play-was-feasible era and the approach-at-your-peril era. Others might point to Andre Agassi. As it turns out, both of these figures are surprisingly average.

The Match Charting Project has extensive records on both men. Hewitt’s forehand was worth +10.0 per match, while his backhand and slice combined for +2.9. That’s a difference of 7.1, a bit greater than average, though less than Djokovic’s. Agassi’s FHP was good for +13.0 per match, compared to a BSP of +6.8. That’s a difference of 6.2, even closer to the mean than Hewitt. Ironically, that gap is almost identical to that of Pete Sampras, whose FHP of +6.3 and BSP of -0.1 were equally spaced, even though his groundstrokes were considerably less effective.

Comparing eras

We can’t answer a general question about trends over time simply by calculating shot potencies for individual players, no matter how pivotal. Instead, we need to look at the whole population.

First, a quick note about our data: The Match Charting Project is extremely heavily weighted toward current players. Our sample of 299 players includes only 40 whose careers were mostly or entirely in the 20th century, and 30 more whose matches mostly took place in the first decade of this century. Thus, the averages mentioned above are skewed toward the 2010s. That said, the 70 “older” players in the sample are the most prominent: the guys who played in major finals and semi-finals, and Masters finals. If there has been a marked trend across decades, those players should help us reveal it.

The earlier players in our sample are, in fact, quite similar to the contemporary ones. I ranked the 299 players by the absolute difference between their FHP and their BSP, with the most balanced player ranked 1, and the least balanced ranked 299. I looked at two subgroups: the 52 oldest players in the sample, most of whose careers were fading out when Hewitt arrived; and the 78 players with the most recent matches in the sample.

  • Oldest — Average rank: 143, Average (FHP – BSP): 5.7
  • Most recent — Average rank: 155, Average (FHP – BSP): 6.5

These numbers do not indicate that players used to have a weak side, and now they don’t. They don’t really reflect any trend at all. The difference between forehand effectiveness and backhand side effectiveness has barely changed over several decades.

As further evidence, here is a selection of players who are both well-represented in the Match Charting Project data and noteworthy representatives of their eras. They’re listed in approximate chronological order. Each of the shot-potency numbers is given on a per-match basis, and the final column (“Diff”) is the difference between FHP and BSP–the gap between each player’s forehand and backhand sides.

Player              FHP    BHP   SLP   BSP  Diff  
Bjorn Borg          12.9  11.5  -0.5  11.0   2.0  
Jimmy Connors       6.5    9.1  -0.3   8.9  -2.4  
John McEnroe        2.0   -0.4  -2.1  -2.4   4.4  
Mats Wilander       7.2    6.8  -0.5   6.3   0.9  
Ivan Lendl          10.3   4.0   0.6   4.6   5.7  
Stefan Edberg       1.9    1.8  -1.1   0.7   1.1  
Boris Becker        5.9    2.1  -1.5   0.7   5.2  
Jim Courier         13.3   4.2  -0.3   3.9   9.4  
Michael Stich       2.0    2.0  -3.4  -1.4   3.4  
Michael Chang       9.7    5.0  -0.6   4.4   5.3  
                                                  
Player              FHP    BHP   SLP   BSP  Diff  
Thomas Muster       18.4   2.2  -1.1   1.1  17.3  
Pete Sampras        6.3    0.7  -0.7  -0.1   6.4  
Andre Agassi        13.0   7.2  -0.5   6.8   6.3  
Patrick Rafter      3.5    0.5  -1.6  -1.1   4.6  
Carlos Moya         9.8   -0.9  -1.4  -2.3  12.1  
Lleyton Hewitt      10.0   3.5  -0.6   2.9   7.1  
Guillermo Coria     4.7    6.3  -1.2   5.2  -0.5  
David Nalbandian    8.8    5.6  -1.7   3.9   4.9  
Nikolay Davydenko   7.2    4.4  -1.2   3.2   4.0  
Roger Federer       10.0   0.2  -0.4  -0.3  10.2  
                                                  
Player              FHP    BHP   SLP   BSP  Diff  
Rafael Nadal        15.3   2.6  -1.0   1.6  13.7  
Andy Murray         7.2    2.9  -1.8   1.1   6.1  
Novak Djokovic      11.3   3.4  -0.8   2.5   8.8  
Richard Gasquet     1.9    1.4  -1.4   0.0   1.9  
Stan Wawrinka       6.2    0.5  -1.7  -1.2   7.3  
Kei Nishikori       5.4    3.8  -1.1   2.7   2.8  
Dominic Thiem       9.3   -0.1  -1.6  -1.7  11.0  
Alexander Zverev    3.6    4.2  -1.1   3.0   0.6  
Stefanos Tsitsipas  8.3   -0.9  -2.2  -3.0  11.4  
Daniil Medvedev     1.6    3.3  -1.4   1.9  -0.3 
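
The ranking procedure described earlier (order players by the absolute FHP-minus-BSP gap, then average the ranks within a subgroup) can be sketched with the 30 players in this table. This is an assumption-laden illustration: it uses the table's three blocks as stand-in era groups and only these 30 men, not the 299-player sample or the 52- and 78-player subgroups behind the averages quoted above.

```python
# (player, Diff column, table block) copied from the table above.
PLAYERS = [
    ("Bjorn Borg", 2.0, "first"), ("Jimmy Connors", -2.4, "first"),
    ("John McEnroe", 4.4, "first"), ("Mats Wilander", 0.9, "first"),
    ("Ivan Lendl", 5.7, "first"), ("Stefan Edberg", 1.1, "first"),
    ("Boris Becker", 5.2, "first"), ("Jim Courier", 9.4, "first"),
    ("Michael Stich", 3.4, "first"), ("Michael Chang", 5.3, "first"),
    ("Thomas Muster", 17.3, "second"), ("Pete Sampras", 6.4, "second"),
    ("Andre Agassi", 6.3, "second"), ("Patrick Rafter", 4.6, "second"),
    ("Carlos Moya", 12.1, "second"), ("Lleyton Hewitt", 7.1, "second"),
    ("Guillermo Coria", -0.5, "second"), ("David Nalbandian", 4.9, "second"),
    ("Nikolay Davydenko", 4.0, "second"), ("Roger Federer", 10.2, "second"),
    ("Rafael Nadal", 13.7, "third"), ("Andy Murray", 6.1, "third"),
    ("Novak Djokovic", 8.8, "third"), ("Richard Gasquet", 1.9, "third"),
    ("Stan Wawrinka", 7.3, "third"), ("Kei Nishikori", 2.8, "third"),
    ("Dominic Thiem", 11.0, "third"), ("Alexander Zverev", 0.6, "third"),
    ("Stefanos Tsitsipas", 11.4, "third"), ("Daniil Medvedev", -0.3, "third"),
]


def rank_by_balance(players):
    """Rank 1 = smallest absolute gap between forehand and backhand side."""
    ordered = sorted(players, key=lambda row: abs(row[1]))
    return {name: position + 1 for position, (name, _, _) in enumerate(ordered)}


def average_rank_by_group(players):
    """Average the balance ranks within each group label."""
    ranks = rank_by_balance(players)
    groups = {}
    for name, _, group in players:
        groups.setdefault(group, []).append(ranks[name])
    return {group: sum(r) / len(r) for group, r in groups.items()}


ranks = rank_by_balance(PLAYERS)
group_averages = average_rank_by_group(PLAYERS)
print(group_averages)
```

If balance had improved steadily over time, the later blocks would show clearly lower average ranks than the earlier ones.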

Not weaker, but weak

These numbers cast a lot of doubt on the “weaker side” hypothesis, that it used to be easier to move forward by approaching an opponent’s less dangerous wing.

Instead, what has probably happened is that for the typical player, both sides got stronger. As a result, the weaker side was no longer flimsy enough to make approaching the net a profitable strategy. Even players with weaker-than-average backhands are now able to hit powerful topspin passing shots. This is essentially the racket-and-string-technology argument, and it seems to me to be the most valid.

There’s no question that tennis has drastically changed in the last few decades. But the conventional explanations for those trends don’t always hold up under scrutiny. In this case, while volleys have been reduced to a vestigial part of the singles game, groundstrokes–on both sides–have only gotten better.

Break Point Serve Tendencies on the ATP Tour

Every player has their “go-to” serve, their favorite option for high-pressure moments. At the same time, their opponents notice patterns, so no server can be too predictable. Let’s dive into the numbers to see who’s serving where, how it’s working out for them, and what it tells us about service strategies on the ATP tour.

Specifically, let’s look at ad-court first serves, and where servers choose to go on break points. For today’s purposes, we’ll focus on a group of 43 men, the players with at least 20 charted matches from 2010-present in the Match Charting Project dataset. For each of the players, we have at least 85 ad-court break points and another 800-plus ad-court non-break points. (I’ve excluded points in tiebreaks, because many of those are high-pressure as well, but it’s less clear cut than in other games.) For most players we’ve logged a lot more, including nearly 1,000 ad-court break points each for Novak Djokovic and Rafael Nadal.

First question: What’s everybody’s favorite break point serve? On average, these 43 men hit about 20% more “wide” first serves than “T” first serves on break points. (Body serves are a factor as well, but they make up only about 10% of total first serves, and comparing two options is way more straightforward than three.) That 20% difference isn’t quite as big as it sounds, since on non-break points in the ad court, players go wide about 10% more often. So while the wide serve is the typical favorite, it’s only a bit more common than on other ad-court points.

Tour-wide averages don’t tell us the whole story, so let’s look at individual players. Here are the ten men who favor each direction the most when choosing an ad-court first serve on break point:

Player                       BP Wide/T  
Philipp Kohlschreiber             2.58  
Pablo Cuevas                      2.46  
Denis Shapovalov                  1.94  
Rafael Nadal                      1.87  
Jack Sock                         1.84  
David Goffin                      1.78  
Nick Kyrgios                      1.69  
Alexandr Dolgopolov               1.66  
Dominic Thiem                     1.64  
Pablo Carreno Busta               1.58  
…                                       
Gilles Simon                      0.94  
Alex De Minaur                    0.94  
Gael Monfils                      0.90  
Feliciano Lopez                   0.83  
Tomas Berdych                     0.83  
Karen Khachanov                   0.82  
David Ferrer                      0.81  
Fabio Fognini                     0.77  
Diego Schwartzman                 0.69  
Borna Coric                       0.67

You’re probably as unsurprised as I was to find Rafael Nadal near the top of the list. The combination of Rafa and Denis Shapovalov suggests that lefties all follow the same pattern, but Feliciano Lopez swats away that hypothesis, as one of the players who most favor the T serve on break points. The other two lefties in our 43-player set, Adrian Mannarino and Fernando Verdasco, both hit more wide serves than average, so perhaps Feli is the odd man out here. We don’t have a lot of data on other contemporary lefties, so it’s tough to be sure.

Second question: How do break point tendencies compare to ad-court tendencies in general? We’ve already seen that players opt for wide first serves about 10% more than T deliveries in non-break point ad-court situations. That difference doubles on break points. These modest shifts lend themselves to an easy explanation: Most players serve a little better wide to the ad court, and under pressure, they’re a bit more likely to go with their most reliable option.

For some guys, though, there’s no “little” about it. We’ve already seen that Philipp Kohlschreiber goes wide every chance he gets on break points, more often than anyone else in our group. Yet on non-break points in the ad court, he splits his deliveries almost fifty-fifty. That’s a huge difference between break point and non-break point tendencies. He’s not alone. Borna Coric is similar (albeit less extreme) in the opposite direction, splitting his ad-court first serves about fifty-fifty in lower-pressure situations, then heavily favoring T serves when facing break point.

The next table shows the players who shift tactics most dramatically on break points. The first two columns show the ratio of wide serves to T serves on break points and on other ad-court points. The rightmost column shows the ratio between those two. At the top of the list are the men like Kohlschreiber, who go wide under pressure. At the bottom are the men like Coric. I’ve included the top ten in both directions, as well as the three members of the big four who aren’t in either category. Djokovic, for example, doesn’t let the situation alter his tactics, at least in this regard.

Player                 BP W/T  Other W/T  Wide BP/Other  
Philipp Kohlschreiber    2.58       1.04           2.49  
Nick Kyrgios             1.69       0.74           2.28  
Juan Martin del Potro    1.52       0.81           1.87  
Jack Sock                1.84       1.05           1.75  
Pablo Cuevas             2.46       1.50           1.64  
Kevin Anderson           1.18       0.74           1.59  
David Goffin             1.78       1.13           1.58  
John Isner               1.43       0.91           1.58  
Grigor Dimitrov          1.41       0.94           1.49  
Dominic Thiem            1.64       1.11           1.48  
…                                                        
Andy Murray              1.19       0.86           1.39  
Rafael Nadal             1.87       1.51           1.24  
Novak Djokovic           1.20       1.16           1.03  
…                                                        
Stan Wawrinka            0.99       1.15           0.87  
Roberto Bautista Agut    1.38       1.60           0.86  
Fabio Fognini            0.77       0.91           0.85  
Roger Federer            1.08       1.35           0.80  
Benoit Paire             1.36       1.73           0.78  
Adrian Mannarino         1.45       1.86           0.78  
Diego Schwartzman        0.69       0.89           0.78  
Feliciano Lopez          0.83       1.09           0.76  
Borna Coric              0.67       0.97           0.69  
Karen Khachanov          0.82       1.25           0.66

Some of the tour’s best servers feature near the top of the list. While many of them favor the ad-court T serve in general, they go wide more often under pressure. This tactic offers an explanation of why some players outperform (at least sometimes) on break points and in tiebreaks. Nick Kyrgios, for instance, is deadly serving in all directions, but in the ad court, he’s even better out wide. Overall, he wins 78.8% of his wide first serves in the ad court, against 75.8% of his T first serves. By “saving” the wide serves for big moments, he is able to defend more break points than his overall ad-court record would suggest. The same theory applies to tiebreaks, where a player could deploy their favored serve more often.

Third question: Could these tactics be improved? I usually start with the assumption that players know what they’re doing. If Kyrgios goes down the middle most of the time and then out wide more often on break points, it probably isn’t a random choice. There’s an easy rule of thumb to check whether servers are making optimal choices, which my co-podcaster Carl Bialik described a few years ago:

If your T serve is better than your wide serve, hit the T serve more. But don’t hit it 100 percent of the time because if you do, your opponent knows you’ll hit it and can stand in the middle of the court waiting for it instead of guarding against the wide serve. So how often should you hit it? Exactly as often as it takes to make it just as successful, but no more, than when you hit a wide serve. If your success rates on different choices are different, you’re not serving optimally.

For instance, facing break point in the ad court, Kyrgios wins 79.7% of his wide first serves and 76.1% of his T first serves. By Carl’s game-theory-derived logic, Kyrgios should be going wide even more often. His win rate on wide serves will go down a bit, as returners find him more predictable, but the average result of all of his break point serves will go up, as he trades a few T serves for more successful wide deliveries.

On average, our 43 players have a 4% gap between their break point win percentages on wide and T serves. Some of that is probably just noise. We’ve logged only 94 break points served by Alexandr Dolgopolov, so his 15% gap isn’t that reliable. Still, some gaps appear even for those players with considerably more data.

The following table shows the ten players with the most break points faced in the dataset. The third column–“BP Wide/T”–shows how much they favor the wide serve on break points. The next two columns show their winning percentages on break point first serves in the two primary directions. Finally, the last column shows the difference between those winning percentages, also in percentage terms. The closer the gap to 0%, the closer to an optimal strategy.

Player             BPs  BP Wide/T  Wide W%   T W%    Gap  
Novak Djokovic     973       1.20    73.1%  72.9%   0.3%  
Rafael Nadal       971       1.87    67.3%  76.7%  12.2%  
Roger Federer      865       1.08    77.1%  77.1%   0.0%  
Andy Murray        730       1.19    71.1%  72.2%   1.6%  
Alexander Zverev   493       1.04    72.4%  76.6%   5.5%  
Stan Wawrinka      379       0.99    72.7%  71.9%   1.2%  
Kei Nishikori      366       1.18    59.5%  69.6%  14.5%  
David Ferrer       347       0.81    59.7%  63.7%   6.2%  
Diego Schwartzman  338       0.69    72.2%  67.8%   6.5%  
Dominic Thiem      294       1.64    71.8%  73.9%   2.8%
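
For readers who want to check the last column, here is a minimal sketch. It assumes the gap is the absolute difference between the two winning percentages, divided by the T-serve winning percentage. That formula is an inference from the rows above, not something stated in the post, and because the inputs here are the rounded table values, the output can differ from the published column by a tenth of a point or so.

```python
def serve_gap(wide_win_pct: float, t_win_pct: float) -> float:
    """Relative gap, in percent, between wide and T first-serve win rates.

    Assumption: the gap is measured relative to the T-serve win rate.
    """
    return abs(wide_win_pct - t_win_pct) / t_win_pct * 100


# (Wide W%, T W%) pairs copied from the table above, already rounded.
players = {
    "Novak Djokovic": (73.1, 72.9),
    "Rafael Nadal": (67.3, 76.7),
    "Roger Federer": (77.1, 77.1),
    "Kei Nishikori": (59.5, 69.6),
}

for name, (wide, t) in players.items():
    print(f"{name}: {serve_gap(wide, t):.1f}%")
```

A gap near zero means the two directions are about equally successful, which is the condition the rule of thumb asks for.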

Djokovic, Roger Federer, Andy Murray, and Stan Wawrinka are close to the tactical optimum. Nadal is … not. He loves the wide serve on break points, yet he is considerably more successful when he lands his first serve down the T.

But again, we need to work from the assumption that the players know what they’re doing–especially when that player is as accomplished and otherwise strategically sound as Rafa. My focus throughout this post has been on first serves. In general, players make first serves at about the same rate regardless of which direction they choose. In the ad court, down-the-middle attempts are a bit more likely to land in than wide deliveries. But for Rafa, it’s a different story. His wide serve isn’t particularly deadly, but it is the picture of reliability. His ad-court first serve wide hits the mark 77.8% of the time, compared to a mere 59.5% down the middle. The T serve is effective when it lands in, but that in itself is not sufficient reason to make more attempts.

The same reasoning can’t save Kei Nishikori. He has an even bigger gap than Rafa’s, winning about 70% of his break point first serves down the T but only 60% when he goes wide. This is almost definitely not luck: Assuming 180 serves in each direction and the average success rate of about 65%, the chance of either number being at least five percentage points above or below the mean is about 18%. The probability that both are so extreme is roughly 3.5%, so the probability that they are extreme in opposite directions is less than 2%, or one in fifty.
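
That back-of-the-envelope calculation can be sketched with an exact binomial sum. This is one plausible way to arrive at figures like those above, not necessarily the method used for the post. It assumes the two directions are independent, treats 65% of 180 as 117 points won, and treats five percentage points as 9 points; the function name is made up for illustration.

```python
from math import comb


def prob_at_least_margin_off(n: int, expected_wins: int, margin: int) -> float:
    """Chance that the number of points won in n serves lands at least
    `margin` points above or below `expected_wins` (exact binomial)."""
    p = expected_wins / n
    total = 0.0
    for k in range(n + 1):
        if abs(k - expected_wins) >= margin:
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


# 180 serves, 65% expected win rate (117 points), five points of rate = 9 serves.
either = prob_at_least_margin_off(180, 117, 9)
both = either**2      # both directions this extreme, assuming independence
opposite = both / 2   # ...and on opposite sides of the mean

print(f"either: {either:.1%}  both: {both:.1%}  opposite: {opposite:.1%}")
```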

Like Nadal, he is one of the few players who makes a lot more first serves in one direction than the other. But unlike Nadal, his first-serve-in discrepancy makes the gap even more pronounced! In the 366 break points we’ve logged, he landed 48.8% of his break point wide first serve attempts and 62.8% of his tries down the T. He lands more first serves down the middle and those serves are more likely to result in points won. Nishikori needs to hit a lot more of his break point serves down the T. His T-specific winning percentage will probably decrease as opponents discover the more pronounced tendency, but his overall results would likely improve.

At the most basic level, players should be aware of their opponents’ serving tendencies, whether by rumor, advance scouting, or data like the Match Charting Project. Beyond that, we’ve seen that there’s even more potential in the data, showing that some men are leaving break points on the table. Most elite tennis players have a good intuitive grasp of game theory, but even elite-level intuition gets it wrong sometimes.

Petra Kvitova’s Current Status: Low Risk, High Reward

Italian translation at settesei.it

For more than a decade, Petra Kvitova has been one of the most aggressive women in tennis. She aims for the corners, hits hard, and lets the chips fall where they may. Sometimes the results are ugly, like a 6-4 6-0 loss to Monica Niculescu in the 2016 Luxembourg final, but when it works, the rewards–two Wimbledon titles, for starters–more than make up for it.

She’s currently riding another wave of winners. Her 11-match win streak–which has involved the loss of only a single set–puts her one more victory away from a third major championship. The 28-year-old Czech has gotten this far by persisting with her big-hitting style, but with a twist: In Melbourne, she’s not missing very often. While she’s ending as many points as ever on her own racket, she’s missing less often than many of her more conservative peers.

In her last five matches at the Australian Open, from the second round through the semi-finals, 7.9% of her shots (including serves) have resulted in unforced errors. In the 88 Petra matches logged by the Match Charting Project, that’s the stingiest five-match stretch of her career. In charted matches since 2010, the average WTA player hits unforced errors on 8.0% of their shots. So Kvitova, the third-most aggressive player on tour, is somehow making errors at a below-average rate. It’s high-risk, high-reward tennis … without the risk.

And it isn’t because her go-for-broke tactics have changed. In Thursday’s semi-final against Danielle Collins, her aggression score–an aggregate measure of point-ending shots including winners, induced forced errors, and unforced errors–was 30.5%, the third-highest of all of her charted matches since her 2017 return to the tour. Her overall aggression score in Melbourne, 28.2%, is also higher than her career average of 27.1%.

In other words, she’s making fewer errors, and the missing errors are turning into point-ending shots in her favor. The following graph shows five-match rolling averages of winners (and induced forced errors) per shot and unforced errors per shot for all charted matches in Kvitova’s career:

Even with the winner and error rates smoothed out by five-match rolling averages, these are still some noisy trend lines. Still, some stories are quite clear. This month, Kvitova is hitting winners at close to her best-ever rate. Her average since the second round in Melbourne has been 20.3%, as high as anything she’s posted before with the exception of her 2014 Wimbledon title. (I’ve never tried to adjust winner totals for surface; it’s possible that the difference can be explained entirely by the grass.)

And most strikingly, this is as big a gap between winner rate and error rate as she’s achieved since her 2014 Wimbledon title run. In fact, between the second round and semi-finals at that tournament, she averaged 8.1% errors and 20.0% winners. Both of her numbers in Australia this year have been a tiny bit better.

Best of all, the error rate has–for the most part–seen a steady downward trend since 2016. The recent error spike is largely due to her three losses in Singapore last October and a bumpy start to this season in Brisbane. We can’t write those off entirely–perhaps Kvitova will always suffer through weeks when her aim goes awry–but she appears to have put them solidly behind her.

None of this is a guarantee that Petra will continue to avoid errors in Saturday’s final against Naomi Osaka. I could’ve written something about her encouraging error rates before the tour finals in Singapore last fall, and she failed to win a round-robin match there. And Osaka is likely to offer a stiffer challenge than any of Kvitova’s previous six opponents in Melbourne, even if her second serve doesn’t. That said, a stingy Kvitova is a terrifying prospect, one with the potential to end the brief WTA depth era and dominate women’s tennis.

The Oddity of Naomi Osaka’s Soft Second Serves

Italian translation at settesei.it

Naomi Osaka has quickly risen to the top of the women’s game on the back of some big hitting, especially a first serve that is one of the fastest in the game. Through Thursday’s semi-final, Osaka’s average first-serve speed in Melbourne was 105 mph, faster than all but two of the other women who reached the third round. Even those two–Aryna Sabalenka and Camila Giorgi–barely edged her out, each with average speeds of 106.

Shift the view to second serves, and Osaka’s place on the list is reversed. While Sabalenka’s typical second offering last week was 90 mph and Giorgi’s was 94, Osaka’s has been a mere 78 mph, the fourth-slowest of the final 32. That mark puts her just ahead of the likes of Angelique Kerber and Sloane Stephens, both of whose average first serves are nearly 10 mph slower.

Osaka’s 27 mph gap is the biggest of anyone in this group. The next closest is Caroline Wozniacki’s 23 mph gap, between her 102 mph first serve and 79 mph second serve–both of which are less extreme than the Japanese player’s. Expressed as a ratio, Osaka’s average second serve is only 74% the speed of her typical first. That’s also the widest gap of any third-rounder in Melbourne; Wozniacki is again second-most extreme at 77%.

The following table shows first and second serve speeds, along with the gap and ratio between those two numbers, for a slightly smaller group: women for whom the Australian Open published at least four matches worth of serve-speed data:

Player          Avg 1st  Avg 2nd   Gap  Ratio  
Osaka             105.5     78.5  27.0   0.74  
Keys              105.2     85.4  19.7   0.81  
SWilliams         103.8     88.6  15.2   0.85  
Barty             102.0     88.2  13.7   0.87  
KaPliskova        101.9     80.5  21.4   0.79  
Collins           101.2     82.2  19.1   0.81  
Kvitova            99.6     91.6   8.0   0.92  
Muguruza           98.1     82.5  15.6   0.84  
Pavlyuchenkova     97.9     84.5  13.4   0.86  
Sharapova          97.9     89.6   8.2   0.92  
Svitolina          97.6     78.2  19.4   0.80  
Stephens           96.1     75.1  21.0   0.78  
Halep              95.3     80.9  14.4   0.85  
Kerber             94.0     78.4  15.7   0.83

Oddly enough, having such a slow second serve doesn’t seem to be causing any problems. In today’s semi-final against Karolina Pliskova, Osaka won 81% of first serve points and only 41% of second serve points, but her typical performance behind her second serve is better than that. And in this match, both women feasted on the other’s weaker serves: Pliskova won only 32% of her own second serves. (Though to be fair, Pliskova had the second-largest gap of the players listed above. She tends to rely more on spin than speed when her first serve misses.)

Across her six matches, Osaka has won 73.3% of her first serve points and 49.7% of her second serve points–a bit better than the average quarter-finalist in the former category, a very small amount worse than her peers in the latter. The ratio of those two numbers–68%–is almost identical to those of Danielle Collins, Petra Kvitova, Anastasia Pavlyuchenkova, and Serena Williams, all of whom have smaller gaps between their first and second serve speeds. Of the eight quarter-finalists, Kvitova has the smallest speed gap of all, yet the end result is the same as Osaka’s; she’s just a few percentage points better on both offerings.

Here are the first- and second-serve points won in Melbourne for the eight quarter-finalists, along with the ratio of those two figures and each player’s serve-speed ratio from the previous table:

QFist           1SPW%  2SPW%  W% Ratio  Speed Ratio  
Kvitova         77.9%  52.8%      0.68         0.92  
Williams        74.7%  50.0%      0.67         0.85  
Osaka           73.3%  49.7%      0.68         0.74  
Collins         72.5%  50.0%      0.69         0.81  
Barty           70.8%  55.7%      0.79         0.87  
Pliskova        70.5%  50.0%      0.71         0.79  
Pavlyuchenkova  67.0%  44.9%      0.67         0.86  
Svitolina       66.5%  48.1%      0.72         0.80 

Clearly, there’s more than one way to crack the final eight. With Kvitova, we have a server who racks up cheap points with angles instead of speed, rendering the miles-per-hour comparison a bit irrelevant. Serena’s results are close to Osaka’s, though she gets there with a bit more bite on her second serves. And then there’s Svitolina, who doesn’t serve very hard or that effectively but can beat you in other ways.

Knowing all this, should Osaka hit harder second serves? In extreme cases, like today’s 81%/41% performance against Pliskova, the answer is yes–had she simply hit nothing but first serves and succeeded at the same rate, she would’ve piled up a lot of double faults but won more total points. But the margins are usually slimmer, and as we’ve seen, her second-serve performance isn’t bad, it just might offer room for improvement. Every player is different, but faster is usually better.
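
The arithmetic behind the "nothing but first serves" idea can be sketched as follows. It assumes that a first-serve-style delivery on the second ball either misses (a double fault, point lost) or lands in and wins at the usual first-serve rate. The 60% first-serve-in rate in the example is a hypothetical placeholder, not a figure from the match.

```python
def breakeven_first_serve_in(first_serve_win: float, second_serve_win: float) -> float:
    """First-serve-in rate above which hitting a first serve on the
    second ball wins more points than the usual second serve."""
    return second_serve_win / first_serve_win


def win_rate_with_first_serve_as_second(first_serve_in: float, first_serve_win: float) -> float:
    """Share of second-serve points won if every second ball is a first
    serve: a miss is a double fault, a make wins at the first-serve rate."""
    return first_serve_in * first_serve_win


# 81% and 41% are the rates quoted for the Pliskova match.
threshold = breakeven_first_serve_in(0.81, 0.41)
print(f"break-even first-serve-in rate: {threshold:.1%}")

# Hypothetical 60% first-serve-in rate, for illustration only.
print(f"win rate at 60% in: {win_rate_with_first_serve_as_second(0.60, 0.81):.1%}")
```

Under these assumptions, any first-serve-in rate above the break-even figure makes the aggressive second serve the better choice on that day's numbers.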

A thorough analysis of that question may be possible with the available data, but it will have to wait for another day. In the meantime, Saturday’s final will offer us a glimpse of contrasting styles: Osaka’s powerful first offering and soft second ball, against Kvitova’s angles and placement on both serves. Both my forecast and the betting market see the title match as a close one–perhaps Osaka’s second serve will be the shot that makes the difference.

Dayana Yastremska Hits Harder Than You

Italian translation at settesei.it

At the 2019 Australian Open, tennis balls have more to fear than ever before. Serena Williams is back and appears to be in top form, Maria Sharapova is playing well enough to oust defending champion Caroline Wozniacki, and Petra Kvitova has followed up her Sydney title with a stress-free jaunt through the first three rounds.

And then there are the youngsters. Hyper-aggressive 20-year-old Aryna Sabalenka crashed out in the third round against an even younger threat, Amanda Anisimova. But still in the draw, facing Serena on Saturday, is the hardest hitter of all, 18-year-old Ukrainian Dayana Yastremska. Watch a couple of Sabalenka matches, and you might wonder if we’ve reached the apex of aggression on the tennis court. Nope: Yastremska turns it up to 11.

When Lowell first introduced his aggression score metric a few years ago, Kvitova was the clear leader of the pack, the player who ended points–for good or ill–most frequently with the ball on her racket. Madison Keys wasn’t far behind, with Serena coming in third among the small group of players for which we had sufficient data. Since then, two things have changed: The Match Charting Project now has a lot more data on many more players, and a new generation of ball-bashers has threatened to make the rest of the tour look like weaklings in comparison.

The aggression score metric packs a lot of explanatory power in a simple calculation: It’s the number of point-ending shots (winners, unforced errors, or shots that induce a forced error from the opponent) divided by the number of shot opportunities. The resulting statistic ranges from about 10% at the lower extreme–Sara Errani’s career average is 11.6%–to 30%* at the top end. Individual matches can be even higher or lower, but no player with at least five charted matches sits outside of that range.

* Readers with a keen memory or a penchant for following links will notice that in Lowell’s original post, Kvitova’s aggregate score was 33% and Keys was also a tick above 30%. I’m not sure whether those were flukes that have since come back down with larger samples, or whether I’m using a slightly different formula. Either way, the ordering of players has remained consistent, and that’s the important thing.
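
As a rough sketch of the calculation described above, assuming the three kinds of point-ending shots are simply added together and divided by shot opportunities (the footnote allows that the exact formula may vary), with made-up counts for illustration:

```python
def aggression_score(winners: int, induced_forced_errors: int,
                     unforced_errors: int, shot_opportunities: int) -> float:
    """Share of shot opportunities that end the point on the player's racket."""
    if shot_opportunities <= 0:
        raise ValueError("shot_opportunities must be positive")
    point_ending = winners + induced_forced_errors + unforced_errors
    return point_ending / shot_opportunities


# Hypothetical single-match counts, not taken from any charted match.
score = aggression_score(winners=30, induced_forced_errors=20,
                         unforced_errors=36, shot_opportunities=300)
print(f"aggression score: {score:.1%}")
```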

Here are the top ten most aggressive WTA tour regulars of the 2010s before Sabalenka and Yastremska came along:

Rank  Player                      Agg 
1     Petra Kvitova             27.1%  
2     Julia Goerges             26.8%  
3     Serena Williams           26.8%  
4     Jelena Ostapenko          26.5%  
5     Camila Giorgi             26.0%  
6     Madison Keys              25.9%  
7     Coco Vandeweghe           25.9%  
8     Sabine Lisicki            25.6%  
9     Anastasia Pavlyuchenkova  24.0%  
10    Maria Sharapova           23.2%

All of these women rank among the top 15% of most aggressive players. They end points more frequently on their own racket than plenty of competitors we also consider aggressive, like Venus Williams (21.9%), Karolina Pliskova (21.6%), and Johanna Konta (22.3%). Ostapenko bridges the gap between the two generations; she wasn’t part of the discussion when aggression score was first introduced, though once she started winning matches, it was immediately clear that she’d challenge Kvitova at the top of this list.

Here’s the current top ten:

Rank  Player               Agg  
1     Dayana Yastremska  28.6%  
2     Aryna Sabalenka    27.6%  
3     Petra Kvitova      27.1%  
4     Julia Goerges      26.8%  
5     Serena Williams    26.8%  
6     Jelena Ostapenko   26.5%  
7     Viktoria Kuzmova   26.0%  
8     Camila Giorgi      26.0%  
9     Madison Keys       25.9%  
10    Coco Vandeweghe    25.9%

Yastremska, Sabalenka, and even Viktoria Kuzmova have elbowed their way into the top ten. Yastremska’s and Kuzmova’s places on this list might be a little premature, since their scores are based on only seven and nine matches, respectively. But Sabalenka’s pugnaciousness is well-documented: her Petra-topping score of 27.6% is an average across almost 30 matches.

Tennis tends to swing between extremes, with one generation developing skills to counteract the abilities of the previous one. It’s not yet clear whether the aggression of these young women will catapult them to the top–after all, Sabalenka won only five games today against Anisimova, whose aggression score is a more modestly high 23.0%. Perhaps as they gain experience, they’ll develop more well-rounded games and return Kvitova to her place at the top.

In the meantime, we have the privilege of watching some of the hardest hitters in WTA history battle it out. Tomorrow, Yastremska will contest her first third round at a major in a must-watch match against Serena. There will be fireworks, but it’s safe to say there won’t be much in the way of rallies.

Mackie McDonald’s Secret Weapon

Italian translation at settesei.it

In the first round on Monday, the 23-year-old American Mackenzie McDonald defeated young Russian Andrey Rublev in four sets, 6-4 6-4 2-6 6-4. While Rublev missed part of the 2018 season due to injury and carries a ranking just inside the top 100, the victory still qualifies as a bit of an upset for McDonald, who has never come close to Rublev’s peak of No. 31.

The handful of fans who kept tabs on Court 10 were treated to an unusual display. The American relentlessly attacked Rublev’s second serve, rushing the net behind his return almost two dozen times. Many players don’t hit return approach shots that often in an entire year. What’s more, the tactic worked. Without it, the already close match would have been a coin flip.

By my count, in the log I kept for the Match Charting Project, McDonald came in behind his second serve return 22 times. Approach shot counts are never precise, because when a player hits a winner or an error, he may lean forward as if to continue toward the net, but quickly stop when he realizes it’s unnecessary. To be precise, he came in at least 22 times, and perhaps one more return winner or a couple of return errors should also be added to the total. No matter, the conclusions are similar regardless of whether the number is 22 or 24.

Rublev hit 62 second serves, but 9 of those resulted in double faults, so we’re looking at 53 playable second serves. McDonald netrushed 22 of those, winning 10. Of the other 31, he won only 11. That’s a return winning percentage of 45% on return approaches compared to 35% on other returns. Had he won all of those points at the 35% rate, it would have cost him two, perhaps three points off his overall total. He barely outscored Rublev as it was, 124 points to 118, so every little bit helped.
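
The counts above translate into the "two, perhaps three points" estimate like this. It is a simple sketch that assumes the approach points would otherwise have been won at the same rate as the non-approach returns.

```python
# Counts from the charted match: playable second serves split by tactic.
approach_points, approach_won = 22, 10
other_points, other_won = 31, 11

approach_rate = approach_won / approach_points   # return approaches
other_rate = other_won / other_points            # all other returns

# Points McDonald would expect from the 22 approach points at the lower rate.
expected_without_approach = approach_points * other_rate
points_gained = approach_won - expected_without_approach

print(f"approach: {approach_rate:.1%}, other: {other_rate:.1%}")
print(f"points gained by approaching: {points_gained:.1f}")
```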

A rarity in context

The Match Charting Project has shot-by-shot data for nearly 2,000 men’s matches from this decade, and Monday’s four-setter was the first one of those in which a player hit at least 20 second-serve return approaches. (Dustin Brown approached at a higher rate in multiple matches, including his 2015 Wimbledon upset of Rafael Nadal.) There are only ten other matches in the database in which one player hit at least ten such approaches, and Mischa Zverev accounts for three of them. More than three-quarters of the time, the total number of second-serve return approaches is zero.

McDonald is not alone in enjoying some success with the tactic: The 1500 or so second-serve return approaches in the dataset were about 14% more effective than non-approaches in the same matches. However, it’s hard to be sure what that number is telling us, since most players approach so rarely. Some of the attacks are probably on-the-fly decisions against particularly weak serves, not pre-planned plays like many of Mackie’s netrushes on Monday.

Thus, it’s difficult to know how much success most men would have with the tactic, were they to adopt it more often. The fact that they employ it so rarely might tell us all we need to know: If more players thought that attacking the net behind the second serve return would win them more points, they’d do it. But for McDonald, it doesn’t matter what his peers do; it only matters what works for him. These 22 return approaches represented a lot more aggression than he displayed in the four previous matches we’ve charted, and it paid off.

It wasn’t enough to get him a win today against Marin Cilic, but he did outperform expectations, taking a set against the 6th seed and defending finalist. Best of all, he won more than half of Cilic’s second-serve points–a better rate than he managed against Rublev, and several ticks above 46%, the fraction that the average opponent manages against Cilic. In a sport often criticized for its uniformity of tactics, McDonald is an up-and-comer worth watching.

What I Should’ve Known About Playing Styles and Upsets

In the podcast Carl Bialik and I recorded yesterday, I mentioned a pet theory I’ve had for a while, that upsets are more likely in matches between players with contrasting styles. The logic is fairly simple. If you have two counterpunchers going at it, the better counterpuncher will probably win. If two big servers face off, the better big server should have no problem. But if a big server plays a counterpuncher … then, all bets are off.

We’ve seen Rafael Nadal struggle against the likes of John Isner and Dustin Brown, and we’ve seen big servers neutralized by their opposites, as in Marin Cilic’s 1-6 record against Gilles Simon. There are upsets when similar styles clash, as well, but as untested theories go, this one is appealing and not obviously flawed.

Then, to kick off the 2019 Australian Open, Reilly Opelka knocked out Isner. Playing styles don’t come much more evenly matched, and the veteran was the heavy favorite. It was a perfect example of the kind of match I would expect to follow the script, yet the underdog came out on top. They played four tiebreaks and there were only two breaks of serve, but Opelka didn’t even need the Australian Open’s new fifth-set 10-point tiebreak. While it’s just one match, of course, it suggested that I ought to look more closely at my assumptions.

After a couple of hours playing with data this afternoon, my theory is no longer untested … and it turned out to be flawed. Fortunately, it isn’t just another negative result. Playing style is related to upset likelihood, but not in the way I predicted.

Measuring predictability

Let me explain how I tested the idea, and we’ll work our way to the results. First, I used Match Charting Project data to calculate aggression score for every ATP player with at least 10 charted matches since 2010. Aggression score is, essentially, the percentage of shots that end the point (by winner, unforced error, or inducing a forced error), and will serve as our proxy for playing style. That gives us a group of 106 players, from the conservative Simon and Yoshihito Nishioka with aggression scores around 13%, to the freewheeling Brown and Ivo Karlovic, with scores nearing 30%. I divided those 106 players into quartiles (by number of matches, not number of players, so each quartile contains between 21 and 31 players) so we could see how each general playing style fares against the others. Here are the groups:

(Aggression score conflates two things: big serving/big hitting and tactical aggression. Isner is sometimes not particularly aggressive, but because of his size and serve skill, he is able to end points so frequently that, statistically, he appears to be extremely aggressive. Accordingly, I’ll refer to “big servers” and “aggressive players” interchangeably, even though in reality, there are plenty of differences between the two groups.)

Limiting our view to these 106 men, I found just over 11,000 matches to evaluate and divided them into groups based on which quartiles the two players fell into. Each of the ten possible subsets of matches, like Q1 vs Q2, or Q4 vs Q4, contains at least 400 examples.

For every match, I used surface-adjusted Elo ratings to determine the likelihood that the favorite would win. That gives us pre-match odds that aren’t quite as accurate as what sportsbooks might offer, though they’re close.

Those pre-match odds are key to determining whether certain groups are more predictable than others. If there are 100 matches in which the favorite is given a 60% chance of winning, and the favorites win 70 of them, we’d say that the results were more predictable than expected. If the favorites win only 50, the results were less predictable.
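
One way to boil that comparison down to a single number is to divide the favorites’ actual win rate by their average pre-match win probability. Here is a minimal sketch using the 100-match example above; the function name and the list-based inputs are illustrative assumptions, not the code behind this post.

```python
def predictability_ratio(favorite_odds: list[float], favorite_won: list[bool]) -> float:
    """Actual favorite win rate divided by the average predicted win probability.

    Above 1.0: favorites won more often than forecast (more predictable).
    Below 1.0: more upsets than forecast (less predictable).
    """
    if not favorite_odds or len(favorite_odds) != len(favorite_won):
        raise ValueError("inputs must be non-empty and the same length")
    expected = sum(favorite_odds) / len(favorite_odds)
    actual = sum(favorite_won) / len(favorite_won)
    return actual / expected


# 100 matches in which the favorite is given a 60% chance of winning.
odds = [0.60] * 100
more_predictable = predictability_ratio(odds, [True] * 70 + [False] * 30)
less_predictable = predictability_ratio(odds, [True] * 50 + [False] * 50)

print(f"70 wins: {more_predictable:.2f}, 50 wins: {less_predictable:.2f}")
```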

Goodbye, pet theory

For the matches in each of the ten quartile-vs-quartile subsets, I calculated the average favorite’s chance of winning (“Fave Odds”), then compared that to the frequency with which the favorites went on to win (“Fave Win%”). The table below shows the results, along with the relationship between those two numbers (“Ratio”). A ratio of 1.0 means that matches within the subset are exactly as predictable as expected; higher ratios mean that the favorites were even better bets than the odds gave them credit for, and lower ratios indicate more upsets than expected.

[table id=1 /]

There’s a striking finding here: The largest ratio, marking the most predictable bucket of matches, is for the most conservative pairs of players, while the smallest ratio, pointing to the most frequent upsets, is for the most aggressive players.

Before analyzing the relationship, let’s check one more thing. The very best players aren’t evenly divided throughout the quartiles, since Q1 has two of the big four. Elo-based match predictions–one of the building blocks of these results–are tougher to get right for the best players and the most uneven matchups, so we need to be careful whenever the elites might be influencing our findings. Therefore, let’s look at the same numbers, but this time for only those matches in which the favorite has a 50% to 70% chance of winning. This way, we exclude many of the best players’ matchups and all of their more lopsided contests:

[table id=2 /]

We discard about 40% of our sample, but the predictability trend remains generally the same. In both the overall sample and the narrower 50%- to 70%-favorite subset, the strongest relationship I could find was between the predictability ratio and the quartile of the less aggressive player. In other words, a counterpuncher is likely to have more predictable results–regardless of whether he faces a big server, a fellow counterpuncher, or anyone in between–than a more aggressive player.

Back to basics

My initial theory is clearly wrong. I expected to find that Q1 vs Q1 matches were more predictable than average, and I was right. But by my logic, I also guessed that Q4 vs Q4 matches would go according to script, and that other pairings, like Q1 vs Q4, would be more upset-prone. I would have done better had I let the neighbor’s cat make my predictions for me.

Instead, we find that matches with more aggressive players are more likely to result in surprises. That doesn’t sound so groundbreaking, and it’s something I should’ve seen coming. Big servers tend to hold serve more often and break serve less frequently, meaning that their matches end with narrower margins, opening the door for luck to play a larger role, especially when sets and matches are determined by tiebreaks.

After all this, you might be thinking that I’ve squandered my afternoon, plus another few minutes of your attention, arriving at something obvious and unremarkable. I agree that it’s not that exciting to proclaim that big servers are more influenced by luck. But there’s still a useful–even surprising–discovery buried here.

Exponential upset potential

We know that the most one-dimensional players are more subject than others to the ups and downs of luck, thanks to the narrow margins of tiebreaks. For a man who rarely breaks serve, no match is a guaranteed win; for a man who rarely gets broken, no opponent is impossible to beat. However, I would have expected that the unpredictability of big servers was already incorporated into our match predictions, via the Elo ratings of the big servers. If a player has unusually random results, we’d expect his rating to drift toward tour average. That’s one reason that it’s very difficult for poor returners to reach the very top of the rankings.

But apparently, that isn’t quite right. The randomness-driven Elo ratings of our big servers do a nearly perfect job of predicting match outcomes against counterpunchers, and they’re only a little bit too confident against the more middle-of-the-road players in Q2 and Q3. Against each other, though, upsets run rampant. That extremely volatile fraction of results–the tiebreak-packed outcomes when the biggest servers face off–only accounts for part of these players’ ratings.

We’re accustomed to getting unpredictable results from the most aggressive players, with their big serves, inconsistent returns, and short rallies. Today’s findings give us a better idea of when these do and do not occur. Against counterpunchers, things aren’t so unpredictable after all. But when big servers play each other, we expect the unexpected–and the results are even more unpredictable than that.

Just How Aggressive is Jelena Ostapenko?

If you picked up only two stats about surprise Roland Garros champion Jelena Ostapenko, you probably heard that, first, her average forehand is faster than Andy Murray’s, and second, she hit 299 winners in her seven French Open matches. I’m not yet sure how much emphasis we should put on shot speed, and I instinctively distrust raw totals, but even with those caveats, it’s hard not to be impressed.

Compared to the likes of Simona Halep, Timea Bacsinszky, and Caroline Wozniacki, the last three women she upset en route to her maiden title, Ostapenko was practically playing a different game. Her style is more reminiscent of fellow Slam winners Petra Kvitova and Maria Sharapova, who don’t construct points so much as they destruct them. What I’d like to know, then, is how Ostapenko stacks up against the most aggressive players on the WTA tour.

Thankfully we already have a metric for this: Aggression Score, which I’ll abbreviate as AGG. This stat requires that we know three things about every point: How many shots were hit, who won it, and how. With that data, we figure out what percentage of a player’s shots resulted in winners, unforced errors, or her opponent’s forced errors. (Technically, the denominator is “shot opportunities,” which includes shots a player didn’t manage to hit after her opponent hit a winner. That doesn’t affect the results too much.) For today’s purposes, I’m calculating AGG without a player’s serves–both aces and forced return errors–so we’re capturing only rally aggression.
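The calculation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the script used for this article: the `RallyTotals` container and its field names are hypothetical, and the totals in the example are invented.

```python
from dataclasses import dataclass


@dataclass
class RallyTotals:
    """Hypothetical per-player totals, with serves already excluded."""
    shots_hit: int              # rally shots the player actually struck
    opp_winners: int            # opponent winners she never got a racket on
    winners: int                # her own winners
    unforced_errors: int        # her own unforced errors
    induced_forced_errors: int  # forced errors she drew from the opponent


def rally_agg(t: RallyTotals) -> float:
    """Share of shot opportunities that ended the point one way or another."""
    # Shot opportunities include shots she could not reach
    # because the opponent hit a winner.
    opportunities = t.shots_hit + t.opp_winners
    if opportunities == 0:
        raise ValueError("no shot opportunities recorded")
    point_enders = t.winners + t.unforced_errors + t.induced_forced_errors
    return point_enders / opportunities


# Invented totals, for illustration only.
example = RallyTotals(shots_hit=380, opp_winners=20,
                      winners=50, unforced_errors=45,
                      induced_forced_errors=25)
print(round(rally_agg(example), 3))
```

With these made-up totals, 120 of 400 shot opportunities ended the point, for a rally AGG of 0.30.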

The typical range of this version of AGG is between 0.1–very passive–and 0.3–extremely aggressive. Based on the nearly 1,600 women’s matches in the Match Charting Project dataset, Kvitova and Julia Goerges represent the aggressive end, with average AGGs around .275. We only have four Samantha Crawford matches in the database, but early signs suggest she could outpace even those women, as her average is at .312. At the other end of the spectrum, Madison Brengle is at 0.11, with Wozniacki and Sara Errani at 0.12. In the Match Charting data, there are single-day performances that rise as high as 0.44 (Serena Williams over Errani at the 2013 French Open) and fall as low as 0.06. In the final against Ostapenko, Halep’s aggression score was 0.08, half of her average of 0.16.

Context established, let’s see where Ostapenko fits in, starting with the Roland Garros final. Against Halep, her AGG was a whopping .327. That’s third highest of any player in a major final, behind Kvitova at Wimbledon in 2014 (.344) and Serena at the 2007 Australian Open (.328). (We have data for every Grand Slam final back to 1999, and most of them before that.) Using data from IBM Pointstream, which encompasses almost all matches at Roland Garros this year, Ostapenko’s aggression in the final was 7th-highest of any match in the tournament–out of 188 player-matches with the necessary data–behind two showings from Bethanie Mattek-Sands, one each from Goerges, Madison Keys, and Mirjana Lucic … and Ostapenko’s first-round win against Louisa Chirico. It was also the third-highest recorded against Halep out of more than 200 Simona matches in the Match Charting dataset.

You get the picture: The French Open final was a serious display of aggression, at least from one side of the court. That level of ball-bashing was nothing new for the Latvian, either. We have charting data for her last three matches at Roland Garros, along with two matches from Charleston and one from Prague this clay season. Of those six performances, Ostapenko’s lowest AGG was .275, against Wozniacki in the Paris quarters. Her average across the six was .303.

If those recent matches indicate what we’ll see from her in the future, she will likely score as the most aggressive rallying player on the WTA tour. Because she played less aggressively in her earlier matches on tour, her career average still trails those of Kvitova and Goerges, but not by much–and probably not for long. It’s scary to consider what might happen as she gets stronger; we’ll have to wait and see how her tactics evolve, as well.

The Match Charting Project contains at least 15 matches on 62 different players–here is the rally-only aggression score for all of them:

PLAYER                    MATCHES  RALLY AGG  
Julia Goerges                  15      0.277  
Petra Kvitova                  57      0.277  
Jelena Ostapenko               17      0.271  
Madison Keys                   35      0.261  
Camila Giorgi                  17      0.257  
Sabine Lisicki                 19      0.246  
Caroline Garcia                15      0.242  
Coco Vandeweghe                17      0.238  
Serena Williams               108      0.237  
Laura Siegemund                19      0.235  
Anastasia Pavlyuchenkova       17      0.230  
Danka Kovinic                  15      0.223  
Kristina Mladenovic            28      0.222  
Na Li                          15      0.218  
Maria Sharapova                73      0.217  
                                              
PLAYER                    MATCHES  RALLY AGG  
Eugenie Bouchard               52      0.214  
Ana Ivanovic                   46      0.211  
Garbine Muguruza               57      0.210  
Lucie Safarova                 29      0.209  
Karolina Pliskova              42      0.207  
Elena Vesnina                  20      0.207  
Venus Williams                 46      0.205  
Johanna Konta                  31      0.205  
Monica Puig                    15      0.203  
Dominika Cibulkova             38      0.198  
Martina Navratilova            25      0.197  
Steffi Graf                    39      0.196  
Anastasija Sevastova           17      0.194  
Samantha Stosur                19      0.193  
Sloane Stephens                15      0.190  
                                              
PLAYER                    MATCHES  RALLY AGG  
Ekaterina Makarova             23      0.189  
Lauren Davis                   16      0.186  
Heather Watson                 16      0.185  
Daria Gavrilova                20      0.183  
Justine Henin                  28      0.183  
Kiki Bertens                   15      0.181  
Monica Seles                   18      0.179  
Svetlana Kuznetsova            28      0.174  
Timea Bacsinszky               28      0.174  
Victoria Azarenka              55      0.170  
Andrea Petkovic                24      0.166  
Roberta Vinci                  23      0.164  
Barbora Strycova               16      0.163  
Belinda Bencic                 31      0.163  
Jelena Jankovic                24      0.162  
                                              
PLAYER                    MATCHES  RALLY AGG  
Alison Riske                   15      0.161  
Angelique Kerber               83      0.161  
Flavia Pennetta                23      0.160  
Simona Halep                  218      0.160  
Carla Suarez Navarro           31      0.159  
Martina Hingis                 15      0.157  
Chris Evert                    20      0.152  
Darya Kasatkina                18      0.148  
Elina Svitolina                46      0.141  
Yulia Putintseva               15      0.137  
Alize Cornet                   18      0.136  
Agnieszka Radwanska            90      0.130  
Annika Beck                    16      0.126  
Monica Niculescu               25      0.124  
Caroline Wozniacki             62      0.122  
Sara Errani                    23      0.121

(A few of the match counts differ slightly from what you’ll find on the MCP home page. I’ve thrown out a few matches with too much missing data or in formats that didn’t play nice with the script I wrote to calculate aggression score.)