Measuring WTA Tactics With Aggression Score

Editor’s note: Please welcome guest author Lowell! He’s a prolific contributor to the Match Charting Project, and the author of the first guest post on this blog.

The Problem

Quantifying aggression in tennis presents a quandary for the outsider. An aggressive shot and a defensive shot can occur on the same stroke at the same place on the court at the same point in a rally. To know whether one occurred, we need information on court positioning and shot speed, not only of the current shot, but the shots beforehand.

Since this data only exists for a fraction of tennis matches (via Hawkeye) and is not publicly available, using aggressive shots as a metric is untenable for public consumption. In a different era, net points may have been a suitable metric, but almost all current tennis, especially women’s tennis, revolves around baseline play.

Net points also can take on a random quality and may not actually reflect aggression. Elina Svitolina, according to data from the Match Charting Project, had 41 net points in her match against Yulia Putintseva at Roland Garros this year. However, this was not an indicator of Svitolina’s aggressive play so much as Putintseva hitting 51 drop shots in the match.

The Match Charting Project does give us some data to help with this problem, however. We can use it to get the length of rallies and whether a player finished the point, i.e. they hit a winner or unforced error or their opponent hit a forced error. If we assume an aggressive player would be more likely to finish the point and would be more likely to try to finish the point sooner rather than later in a rally, we can build a metric.

The Metric

To calculate aggression using these assumptions, we need to know how often a player finished the point and how many opportunities they had to finish the point, i.e. the number of times they had the ball in play on their side of the net. To measure the number of times a player finished the point, we add up the points where they hit a winner or unforced error or their opponent hit a forced error. For short, I will refer to these as “Points on Racquet”.

To measure how many opportunities a player had to finish the point, we calculate the number of times the ball was in play on each player’s side of the net. For service points, we add 1 to the length of each rally and divide it by 2, rounding up if the result is not an integer. For return points, we divide each rally by 2, rounding up if the result is not an integer. These adjustments allow us to accurately count how often a player had the ball in play on their side of the net. For brevity, I will call these values “Shot Opportunities”.

If we divide Points on Racquet by Shot Opportunities we will get a value between 0 and 1. If a player has a value of 0, they never finish points when the ball is on their side of the net. If the player has a value of 1, they only hit shots that end the point. As the value increases, a player is considered more aggressive. For short, I will call this measure an “Aggression Score.”
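As a rough illustration, the calculation described above can be sketched in a few lines of Python. The function names, the tuple layout and the sample points are hypothetical, and the sketch assumes rally length counts every shot including the serve, which the description above does not spell out.

```python
import math

def shot_opportunities(rally_length, serving):
    """Times the ball was in play on this player's side of the net.

    Assumes rally_length counts every shot in the point, serve included.
    """
    if serving:
        # Service points: add 1 to the rally length, halve, round up.
        return math.ceil((rally_length + 1) / 2)
    # Return points: halve the rally length, round up.
    return math.ceil(rally_length / 2)

def aggression_score(points):
    """points: list of (rally_length, serving, finished) tuples.

    finished is True when the player hit a winner or unforced error,
    or the opponent hit a forced error ("Points on Racquet").
    """
    on_racquet = sum(1 for _, _, finished in points if finished)
    opportunities = sum(shot_opportunities(n, s) for n, s, _ in points)
    return on_racquet / opportunities if opportunities else 0.0

# Four made-up points: two on serve, two on return.
sample = [(1, True, True), (4, True, False), (3, False, True), (6, False, False)]
print(round(aggression_score(sample), 4))
```

In this made-up sample the player finishes two points out of nine Shot Opportunities.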

The Data

Taking data from the latest upload of the Match Charting Project, I found women’s players with 2000 or more completed points in the database (i.e. all points that were not point penalties or missed points). Eighteen players fitted these criteria. Since the Match Charting Project is, unfortunately, a nonrandom sample of matches, I felt uncomfortable making assessments below a very large number of data points. With 2000 or more points per player, however, it would take an overwhelming amount of new data to overturn these assessments, which gives some confidence that, while bias exists, we are in the neighborhood of the true aggression values.

The Results

Below are the results from the analysis. Tables 1-3 provide the Aggression Scores for each player overall, broken down into serve and return scores and further broken down into first and second serves. They also provide differences between where we would expect the player to be more aggressive (Serve v. Return, First Serve v. Second Serve and Second Serve Return v. First Serve Return).

Table 1: Aggression Scores

Name         Overall  On Serve  On Return  S-R Spread  
S Williams     0.281    0.3114     0.2476      0.0638  
S Halep       0.1818    0.2058     0.1537      0.0521  
M Sharapova   0.2421    0.2471     0.2358      0.0113  
C Wozniacki   0.1526    0.1788     0.1185      0.0603  
P Kvitova     0.3306     0.347      0.309       0.038  
L Safarova    0.2475    0.2694     0.2182      0.0512  
A Ivanovic    0.2413     0.247     0.2335      0.0135  
Ka Pliskova    0.256    0.2898     0.2095      0.0803  
G Muguruza     0.231     0.238     0.2214      0.0166  
A Kerber      0.1766    0.2044     0.1433      0.0611  
B Bencic      0.1742    0.1784     0.1687      0.0097  
A Radwanska   0.1473    0.1688     0.1207      0.0481  
S Errani      0.1232    0.1184     0.1297     -0.0113  
E Svitolina   0.1654    0.1769     0.1511      0.0258  
M Keys        0.3017    0.3284     0.2677      0.0607  
V Azarenka    0.1892    0.1988     0.1762      0.0226  
V Williams    0.2251     0.247     0.1944      0.0526  
E Bouchard    0.2458    0.2695     0.2157      0.0538  
WTA Tour       0.209    0.2254     0.1877      0.0377

Table 2: Serve Aggression Scores

Name          Serve  First Serve  Second Serve  1-2 Spread  
S Williams   0.3114       0.3958        0.2048       0.191  
S Halep      0.2058       0.2298        0.1587      0.0711  
M Sharapova  0.2471       0.2715        0.1989      0.0726  
C Wozniacki  0.1788       0.2016         0.121      0.0806  
P Kvitova     0.347       0.3924        0.2705      0.1219  
L Safarova   0.2694       0.3079        0.1983      0.1096  
A Ivanovic    0.247       0.2961        0.1732      0.1229  
Ka Pliskova  0.2898       0.3552        0.1985      0.1567  
G Muguruza    0.238       0.2906        0.1676       0.123  
A Kerber     0.2044       0.2337        0.1384      0.0953  
B Bencic     0.1784       0.2118        0.1218        0.09  
A Radwanska  0.1688       0.2083        0.0931      0.1152  
S Errani     0.1184       0.1254        0.0819      0.0435  
E Svitolina  0.1769       0.2196         0.105      0.1146  
M Keys       0.3284       0.3958        0.2453      0.1505  
V Azarenka   0.1988       0.2257        0.1347       0.091  
V Williams    0.247       0.3033        0.1716      0.1317  
E Bouchard   0.2695       0.3043        0.2162      0.0881  
WTA Tour     0.2254       0.2578        0.1679      0.0899

Table 3: Return Aggression Scores

Name         Return  1st Return  2nd Return  Spread  
S Williams   0.2476      0.2108      0.3116  0.1008  
S Halep      0.1537      0.1399      0.1778  0.0379  
M Sharapova  0.2358      0.2133      0.2774  0.0641  
C Wozniacki  0.1185      0.1098       0.132  0.0222  
P Kvitova     0.309      0.2676      0.3803  0.1127  
L Safarova   0.2182      0.1778      0.2725  0.0947  
A Ivanovic   0.2335      0.1952      0.3027  0.1075  
Ka Pliskova  0.2095      0.1731      0.2715  0.0984  
G Muguruza   0.2214      0.1888      0.2855  0.0967  
A Kerber     0.1433      0.1127       0.191  0.0783  
B Bencic     0.1687      0.1514       0.197  0.0456  
A Radwanska  0.1207      0.1049      0.1464  0.0415  
S Errani     0.1297      0.1131      0.1613  0.0482  
E Svitolina  0.1511      0.1175      0.1981  0.0806  
M Keys       0.2677      0.2322      0.3464  0.1142  
V Azarenka   0.1762      0.1499      0.2164  0.0665  
V Williams   0.1944      0.1586       0.255  0.0964  
E Bouchard   0.2157      0.1757      0.2837   0.108  
WTA Tour     0.1877      0.1609      0.2341  0.0732

The first plot shows the relationship between serve and return aggression scores as well as the regression line with a confidence interval (note: since there are only 18 players in the sample, treat this regression line and all of the others in this post with caution).

Figure2
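For readers who want to reproduce something like the first plot, here is a minimal sketch that fits an ordinary least-squares line to the On Serve and On Return columns of Table 1 and lists each player's residual. The software and the confidence-interval method behind the plots are not stated in this post, so this is an assumption-laden stand-in: it fits the line only and does not draw a confidence band.

```python
# (On Serve, On Return) Aggression Scores copied from Table 1.
SCORES = {
    "S Williams": (0.3114, 0.2476), "S Halep": (0.2058, 0.1537),
    "M Sharapova": (0.2471, 0.2358), "C Wozniacki": (0.1788, 0.1185),
    "P Kvitova": (0.3470, 0.3090), "L Safarova": (0.2694, 0.2182),
    "A Ivanovic": (0.2470, 0.2335), "Ka Pliskova": (0.2898, 0.2095),
    "G Muguruza": (0.2380, 0.2214), "A Kerber": (0.2044, 0.1433),
    "B Bencic": (0.1784, 0.1687), "A Radwanska": (0.1688, 0.1207),
    "S Errani": (0.1184, 0.1297), "E Svitolina": (0.1769, 0.1511),
    "M Keys": (0.3284, 0.2677), "V Azarenka": (0.1988, 0.1762),
    "V Williams": (0.2470, 0.1944), "E Bouchard": (0.2695, 0.2157),
}

def fit_line(pairs):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(list(SCORES.values()))

# Residual = actual return score minus the score the serve number predicts.
residuals = {
    name: ret - (intercept + slope * srv)
    for name, (srv, ret) in SCORES.items()
}

for name, resid in sorted(residuals.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} {resid:+.4f}")
```

A positive residual means a player is more aggressive on return than her serve score would predict; Sharapova's residual comes out positive here.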

The second and third plots show the relationships between players’ aggression scores on first serves and their aggression scores on second serves for serve and return points respectively as well as the regression lines with confidence intervals.

Figure3

Figure4

The fourth and fifth plots show the relationship between the spread of serve and return aggression scores between first and second serve and the more aggressive point for the player, i.e. first serve for service points and second serve for return points as well as the regression lines with confidence intervals.

Figure5

Figure6

 

We can take away five preliminary observations.

Sara Errani knows where her money is made. The WTA is notoriously terrible at providing statistics. However, it does provide leaderboards for particular statistics, including return points and games won. Errani leads the tour in both this year. She is also the only player here with a higher Aggression Score on return points than serve points. From this information, we can hypothesize that Errani may play more aggressively on return points because she has greater confidence she can win those points or because she relies on those points more to win.

Maria Sharapova is insensitive to context; Elina Svitolina is highly sensitive to context. Sharapova falls outside of the confidence interval in all five plots. More specifically, she is consistently more aggressive on return points, second serve service points and first serve return points than her scores for service points, first serve service points and second serve return points respectively would predict. She also has lower spreads on serve and return than her more aggressive points would predict.

This result suggests that Sharapova differentiates relatively little in how she approaches points according to whether she is serving or returning or whether it is first serve or second serve. Svitolina exhibits the opposite trend from Sharapova. Considering anecdotal thoughts from watching Sharapova and Svitolina, these results make sense. Sharapova’s serve does not seem to vary between first and second and we see a lot of double faults. Svitolina can vary between aggressive shot-making and big first serves and conservative play. Hot takes are not always wrong.

Lucie Safarova, meet Eugenie Bouchard; Ana Ivanovic, meet Garbine Muguruza. Looking at the plots, it is interesting to note how Safarova and Bouchard seem to follow each other across the various measures. The same is true for Ivanovic and Muguruza. A potential application of the aggression score is that it can point us to players that are comparable and may have similar results. Players with good results against Safarova and Ivanovic may have good results against Bouchard and Muguruza, two younger players whom they are much less likely to have played.

Serena Williams and Karolina Pliskova serve like Madison Keys and Petra Kvitova, but they are very different. Serena, Pliskova, Keys and Kvitova are all players that are known for their serves as their weapons. Serena and Pliskova have the third and fourth highest Aggression Scores respectively. However, they also have wide spreads on serve and return scores and they have much lower second serve service point scores than their first serve scores would predict, whereas Keys is about where the prediction places her and Kvitova is far more aggressive than her first serve points would predict.

While Serena is still a relatively aggressive returner, she rates lower on first serve return aggression than Maria Sharapova. Pliskova falls to the middle of the pack on return aggression. Kvitova and Keys, in contrast, are both very aggressive on return points. My hypothesis for the difference is that while Serena and Pliskova are aggressive players, their scores get inflated by using their first serve as a weapon and they are only somewhat more aggressive than the players that score below them. Kvitova and Keys, on the other hand, are exceptionally aggressive players.

The WTA runs through Victoria Azarenka and Madison Keys. Oddly, the players who seemed to best capture the relationships between all of the aggression scores and spreads of aggression scores were Victoria Azarenka and Madison Keys. Neither strayed outside of the confidence interval and often ended up on the best-fit line from the regressions. They define average for the WTA top 20.

These thoughts are preliminary and any suggestions on how they could be used or improved would be helpful. I also must beseech you to help with the Match Charting Project to put more players over the 2,000 point mark and get more points for the players on this list, so that their Aggression Scores better reflect reality.

Is Serena Williams Taking Advantage of a Weak Era?

tl;dr: No.

Serena Williams is, without question, the best player in women’s tennis right now. She’s held that position off and on for over a decade, and it’s easy to make the case that she’s the best player in WTA history.

The longer one player dominates a sport, the tougher it is to distinguish between her ability level and the competitiveness of the field. Is Serena so successful right now because she is playing better than any woman in tennis history, or because by historical standards, the rest of the pack just isn’t very good?

As we’ll see, the level of play in women’s tennis has remained relatively steady over the last several decades. While there is no top player on tour these days who consistently challenges Serena as Justine Henin or peak Venus did, the overall quality of the pack is not much different than it has been at any point in the last 35 years.

Quantifying eras

Every year, a few new players break in, and a few players fade away. If the players who arrive are better than those who leave, the level of competition gets a bit harder for the players who were on tour for both seasons. That basic principle is enough to give us a rough estimate of “era strength.”

With this method, we can compare only adjacent years. But if we know that this year’s field is 1% stronger than last year’s, and last year’s field was 1% stronger than the year before that, we can calculate a comparison between this year’s field and that of two years ago.
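Chaining those adjacent-year comparisons is just a matter of multiplying ratios. Here is a minimal sketch of the idea; the function name and the year labels are hypothetical, and the 1% figures are the illustrative ones from the paragraph above, not estimates.

```python
def cumulative_strength(yearly_ratios, base_year):
    """Chain year-over-year ratios into strength relative to base_year.

    yearly_ratios maps a year to (that year's field strength) divided by
    (the previous year's field strength); every year must follow base_year.
    """
    level = 1.0
    levels = {base_year: level}
    for year in sorted(yearly_ratios):
        level *= yearly_ratios[year]
        levels[year] = level
    return levels

# Hypothetical labels: each field is 1% stronger than the one before it.
levels = cumulative_strength({2014: 1.01, 2015: 1.01}, base_year=2013)
print(round(levels[2015], 4))
```

Two consecutive 1% improvements compound to a field about 2% stronger than the one from two years earlier.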

Since 1978, the level of play has fluctuated within a range of about 10%. The 50th-best player from a strong year–1995, 1997, and 2006 stand out–would win 7% or 8% more points than the 50th-best player from a weak year, like 1982, 1991, and 2005. That’s not a huge difference. One or two key players retiring, breaking onto the scene, or missing substantial time due to injury can affect the overall level of play by a few percentage points.

The key here is that a dominant season in the mid-1980s isn’t much better or worse than a dominant season now. Perhaps Martina Navratilova faced a stiffer challenge from Chris Evert than Serena does from Maria Sharapova or Simona Halep, but that difference is at least partially balanced by a stronger pack beyond the top few players. Serena probably has to work harder to get through the early rounds of a Grand Slam than Martina did.

Direct comparisons

So, Serena’s great, and her greatness isn’t a mirage built on a weak era. Using this approach, how does she compare with the greats of the past?

Given an estimate of each season’s “pack strength,” we can rate every player-season back to 1978. For instance, if we approximate Serena’s points won in 2015 (based on games won and lost), we get a Dominance Ratio (the ratio of return points won to serve points lost) of 2.15. In layman’s terms, that means that she’s beating the 50th-ranked player in the world by a score of 6-1 6-1 or 6-1 6-2. The 2.15 number means she’s winning 115% more return points than that mid-pack opponent. If the pack were particularly strong this season, we’d adjust that number upwards to account for the level of competition.
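To make the Dominance Ratio arithmetic concrete, here is a small sketch. The function name and the sample percentages are hypothetical, picked only so the ratio lands on 2.15; they are not Serena's actual serve and return numbers.

```python
def dominance_ratio(return_points_won, serve_points_won):
    """Share of return points won divided by share of serve points lost.

    Both arguments are fractions between 0 and 1.
    """
    serve_points_lost = 1.0 - serve_points_won
    return return_points_won / serve_points_lost

# Hypothetical: she wins 70% of her serve points, so her opponent wins 30%
# of return points, while she herself wins 64.5% of return points.
dr = dominance_ratio(return_points_won=0.645, serve_points_won=0.70)
print(round(dr, 2))
```

With these made-up inputs, she wins 64.5% of return points to her opponent's 30%, which is the "115% more" described above.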

Repeat the process for every top player, and we find some interesting things.

Serena’s 2.15–the second-best of her career, behind 2.19 in 2012–is extremely good, but only the 21st-ranked season since 1978. By this metric, the best season ever was Steffi Graf’s 1995 campaign, at 2.42, with Navratilova’s 1986 and Evert’s 1981 close behind at 2.38.

Graf has seven of the top 20 seasons since 1978, Navratilova has four, and Evert has three. Venus’s 2000 ranks sixth, while Henin’s 2007 ranks tenth.

It seems to have become harder to post these extremely high single-season numbers. In the last ten years, only Serena, Henin, Sharapova, Kim Clijsters, and Lindsay Davenport have posted a season above 2.0. Serena has done so four times, making her the only player in that group to accomplish the feat more than once.

Best ever?

As we’ve seen in comparing Serena’s best seasons to those of the other greatest players in WTA history, it’s far from clear that Serena is the greatest of all time. Graf and Navratilova set an incredibly high standard, and since the greats all excelled in slightly different ways, against different peer groups, picking a GOAT may always be a matter of personal taste.

Assigning a rating to the current era, however, isn’t something we need to leave up to personal taste. I’m confident in the conclusion that Serena is not simply padding her career totals against a weak era. If anything, her own dominance–during an era when dominating the women’s game seems to be getting harder–is making her peers look weaker than they are.

Bouchard, Radwanska, and Second Serve Futility

In yesterday’s women’s semifinals, we were treated to some impressive return-of-serve performances. Li Na won almost 65% of points on Eugenie Bouchard’s serve–a higher percentage than she won on her own.

A less positive view of the situation is that we saw some dreadful serving performances. In particular, both Bouchard and Agnieszka Radwanska struggled to win any points at all on their second serves. Genie won just 5 of 27 after missing her first serve, while Aga won only 2 of 16.

You don’t need an IBM Key to the Match to realize that those numbers aren’t going to cut it.

The WTA features a more return-oriented game and more breaks of serve than the ATP does, but these numbers are far out of the ordinary, especially for a solid server such as Bouchard. Here are some circuit-wide averages, derived from about 1,000 tour-level matches played last season:

  • WTA players win 55.5% of service points: 62.3% on first serves and 44.6% on second serves.
  • When the second serve lands in play–in other words, excluding double faults–players win 51.8% of second-serve points.
  • In the average losing performance, players won 57.1% of first-serve points, 40.0% of second-serve points, and 47.2% of second-serve points in play.
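Those two second-serve figures also tell us roughly how often second serves land in. A double fault always loses the point, so the overall second-serve win rate equals the in-play rate times the win rate on serves that land in. A quick sketch of that back-of-the-envelope step, with a hypothetical function name:

```python
def implied_in_play_rate(won_overall, won_when_in_play):
    """Share of second serves landing in, implied by the two win rates.

    won_overall = in_play_rate * won_when_in_play, because every double
    fault is a lost point.
    """
    return won_overall / won_when_in_play

tour_in_play = implied_in_play_rate(0.446, 0.518)    # circuit-wide averages
loser_in_play = implied_in_play_rate(0.400, 0.472)   # average losing performance

print(f"tour double-fault rate on second serves:  {1 - tour_in_play:.1%}")
print(f"loser double-fault rate on second serves: {1 - loser_in_play:.1%}")
```

By this estimate, the typical losing player double-faults a little more often on second serve than the tour as a whole, but most of the gap in second-serve points won comes from what happens after the ball lands in.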

Then again, Li and Dominika Cibulkova–especially the Slovakian–aren’t average returners. In 16 Cibulkova wins for which I have serve statistics, she never failed to win at least half of second-serve return points. Only once did she win less than 58% of them, and her median performance was a whopping 63% of second-serve points won. In 7 of the 16 matches, she won second-serve return points at a higher rate than her own first-serve points.

Domi’s dismantling of Radwanska’s second serve still stands out, but in this context, it doesn’t look quite so unusual. When Cibulkova is hitting the ball well, you might as well be throwing batting practice once you miss your first offering.

While Li’s best return performances don’t quite stack up with Cibulkova’s, she has had little trouble neutralizing her opponents in Melbourne. In six matches, she has won more than half of second-serve return points in every match, peaking with a 12-of-15 performance in the fourth round against Ekaterina Makarova. Overall, Li has won 86 of 136 second-serve return points in the tournament, good for 63%.

On Saturday, one of these powerful forces will have to give way to the other. The last time Li and Cibulkova met, in Toronto last summer, Domi had one of her worst serving performances of the year, winning only 35.5% of second-serve points, 44.0% of those that landed in play. In that match, Cibulkova failed to display the dominating return game that has been her trademark in Australia, winning barely half of Li’s second offerings, and only 41% when excluding double faults.

But as Cibulkova showed by crushing Radwanska for only the second time in six career meetings, her performances aren’t predictable. Her all-or-nothing style guarantees that we’ll see some fireworks in the final from both servers and returners. And at the rate she’s going, Domi might set some more records in the process.

For even more detailed analysis of yesterday’s semifinals, check out the charting-based analysis of Li-Bouchard and Radwanska-Cibulkova.

Bouchard, Halep, and First-Time Quarterfinalists

Two of the final eight women in Melbourne, Eugenie Bouchard and Simona Halep, are playing in their first Grand Slam quarterfinals. Let’s take a look at how other women have done in their first appearances this late in a Slam.

In the Open era, 267 different women have reached the final eight of a Slam. At the time of their debut quarterfinal, their average age was roughly 21 years and four months. Their average WTA ranking was 42, not considering those who predated the ranking system or those who reached their first quarterfinal as an unranked player.

Of the 267, 197 (73.8%) progressed no further in their breakthrough Slam. 52 (19.5%) won one more match, losing in the semifinals; 12 (4.5%) reached the final but lost; and the remaining six players won the title when they reached their first Open-era quarter.

However small 6 of 267 sounds, such an outcome is actually even rarer. Three of those six first-time quarterfinalists don’t really count–they reached their first QF in 1968, the first year of the Open era. Billie Jean King, winner of the Australian Open that year, isn’t that great a comp for Bouchard or Halep. The only other players to win a Grand Slam in their first quarterfinal appearance are Chris O’Neil (1978 Australian), Barbara Jordan (1980 Australian), and Serena Williams (1999 US Open).

While we can’t count on Bouchard or Halep winning the tournament this week, their appearances in Slam quarterfinals at relatively young ages bode well. The earlier a player reaches her first major QF, the more QFs she is likely to reach over the course of her career. In fact, of the 22 women who have reached more than 10 Slam quarterfinals since 1984, only one of them–Jana Novotna–failed to reach her first one in her teens. She didn’t make it until the ripe old age of 20 years and 8 months.

Bouchard has just snuck in before her 20th birthday, which she’ll celebrate next month. Her most age-appropriate comp is Victoria Azarenka, who reached her first major quarterfinal–at the 2009 French Open–just a few weeks younger than Genie is now. Less than five years later, Vika will play her 12th Slam QF.

Less optimistic comparisons for Bouchard are Yanina Wickmayer and Anna Chakvetadze, both of whom reached their first major quarterfinal in the last two months of their teens. Chakvetadze made two more final eights; Wickmayer is still looking for her second.

If history is any guide, Halep’s prospects are bleaker. At 22 years and four months, she is much older than any of the players who have reached double-digit Slam quarterfinals except for Li Na, who is playing in her 10th QF this week. Li didn’t play in the final eight of a Grand Slam until she was 24 years old.

The 61 players who reached their first Slam QF at an older age than Halep did not, on average, achieve much more. They’ve totaled 81 additional QFs–well below two per person.

Of course, the age profile of the WTA is changing, so a 22-year-old debutante isn’t nearly the oddity it was a decade or two ago. It’s no coincidence that Halep’s most optimistic comp is Li, an active player. That’s the most positive outlook for the Romanian, anyway. To rack up an impressive career record, she’ll have to follow Li’s lead and overcome a late start.

The ATP final eight also features a newbie, Grigor Dimitrov. The changing age profile of the ATP is even more drastic, so age-based analysis is less meaningful. But we can take a quick look at the precedents for the Bulgarian’s first Slam quarterfinal.

There have been 329 ATP Slam quarterfinalists in the Open era, and first-timers stand a better chance in the men’s game. 32.5% of debut Slam quarterfinalists have advanced to the semis, and 13 of them (4.0%) went on to win the tournament. Then again, none of them had to beat Rafael Nadal in the quarters.

While Dimitrov is older than Halep–and as noted, 22-year-olds didn’t use to be considered so young on the ATP tour–there are some positive examples for Grigor to follow.

Michael Stich reached his first Slam QF at almost exactly the same age as Dimitrov is now, and he not only reached the semis at that event (the 1991 French Open), but qualified for the final eight in nine more majors. Jo-Wilfried Tsonga, David Ferrer, and Nikolay Davydenko all reached their first Slam QF later than Dimitrov, and each has played in the final eight at least ten times.

On average, those optimistic comps are outweighed by all the guys who made it to one or two Slam QFs later in their career. The 153 players who reached their first final eight later than Dimitrov’s current age have returned to a total of 362 additional quarterfinals–good for a little over two more appearances per player.

Despite all the hype, Dimitrov’s performance this year isn’t a drastic breakthrough. It’s only a single step in the right direction–especially considering that he reached this milestone by beating the #73 player in the world. He could be the next Tsonga, or he could be the next Robby Ginepri.

Should WTA Players Approach the Net More?

Italian translation at settesei.it

21st-century women’s tennis is a baseline game. Some players are better able to identify opportunities to approach the net than others, and some can handle themselves quite well when they get there. But if a fan from a few decades ago were dropped off at the 2014 Australian Open, she would be shocked by the rarity of net points and the clumsiness of many players when they move forward.

Since almost all television commentators were excellent players in a more net-centric era, a frequent refrain during almost any broadcast is that players should rush the net more often. “Frequent” might be understating it–in a fit of pique, I was driven to say this:

Regardless of repetition, it’s worth further investigation. It’s certainly true that a skilled netwoman could win more points by moving forward. But when pros don’t emphasize that part of their game and they gain little match experience approaching the net, do they have the skills necessary to take advantage of such an opportunity?

Enter some numbers

At this point, you might be tempted to look at the oft-collected “Net Points” stat. Resist the urge. In a baseline-oriented match, net points can have little to do with net approaches. Attempting to return a drop shot is considered a net point. Putting away a weak service return is considered a net point. In many WTA matches, more than half of “net points” do not involve an approach. The player was induced to come to the net for some reason.

Making matters worse, that non-approach segment of net points has little to do with net approaches. Given a weak, floating return, any competent player should be able to whack it for a swinging volley winner. At the other end of the spectrum, chasing down a drop shot relies on a different set of skills than picking a moment to hit an approach shot and then confidently placing a volley or two.

Fortunately, the Match Charting Project gives us some more detailed, approach-specific data.

Twenty matches in the charting database are from the first month of the 2014 WTA season, most of them from the first week in Melbourne. This data differentiates between “net approaches” and “net points.” In one of the more aggressive performances in the database, Angelique Kerber, in her loss to Tsvetana Pironkova in Sydney, won 15 of 19 net points. Of her ten net approaches, she won all ten.

(For any match report in the charting database–here’s the Kerber-Pironkova match–click one of the two “Net Points” links to see those stats. There is a different table for each player.)

Kerber’s ten net approaches are tied for the most in any of the WTA matches that have been charted this year. Last night, Garbine Muguruza also tallied ten net approaches, though she did so in a longer match.

In these twenty matches, only 27 of 40 players made even one traditional net approach. Including those who made zero, the average is just over three net approaches per match. The 27 who approached the net at least once averaged 4.7 per match.

Clearly, a lot of opportunities for offense are going unclaimed.

How they’re doing

Of the 126 net approaches we’ve tracked, the approaching player has won 84–exactly two-thirds. While that isn’t an overwhelming endorsement–many approach shots are hit in response to a weak groundstroke that already puts the opponent at a disadvantage–it certainly doesn’t count as evidence against the practice.
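The counts quoted in this section fit together, which is easy to check. A short sketch of the arithmetic, using only the totals given in the text (the variable names are mine):

```python
net_approaches = 126          # approaches tracked across the charted matches
approaches_won = 84
player_matches = 40           # twenty matches, two players in each
players_who_approached = 27   # players with at least one net approach

win_rate = approaches_won / net_approaches
per_player_match = net_approaches / player_matches
per_approaching_player = net_approaches / players_who_approached

print(f"approach points won:        {win_rate:.1%}")
print(f"approaches per player:      {per_player_match:.2f}")
print(f"approaches per net-rusher:  {per_approaching_player:.1f}")
```

The first figure is the two-thirds win rate, and the other two are the "just over three" and 4.7 per-match averages.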

In half of all net approaches, the netrusher either hits an outright winner at the net or induces a forced error with a net shot.  Only 12% of the time does the opponent hit a passing shot winner. In another 5% of these points, the opponent induces a forced error with a passing shot. In 12% of net approach points, the player who moved forward hits an unforced error at the net.

Of the 27 players in the database who approached the net at least once, only six failed to win half of those points (three of whom only came forward once), and three more won exactly half of their net approach points.

The women in this sample who seize the most opportunities to rush the net have been particularly successful, as well. Seven of the eight who moved forward the most won more than half of their approach points.  This allows us to tentatively conclude that all the other players–the ones who picked only a few spots to approach the net during their matches–could have seized more opportunities. There may be a limit in the modern game to how much netrushing is wise, but the observed maximum of ten points per match doesn’t seem to be it.

Inevitable unknowns

Whether we look at Kerber and her 10/10 net-approach performance in Sydney or Sloane Stephens and her 1/1 tally yesterday against Elina Svitolina, it’s impossible to know the results of the next approach shot–or the next five.  We can compare single-match results and see that it’s possible for a WTA player to have a perfect record on her ten net approaches, but we can’t perform lab experiments in which Sloane plays Svitolina again and comes forward ten times instead of one.

For all the success that players enjoy when they do move forward, there are plenty of reasons not to. As I said at the outset, today’s players don’t practice net skills nearly as much as baseline skills, and they certainly don’t get much in-match practice. If someone isn’t comfortable approaching the net at a certain time, is it really a good idea for her to do so?

In the abstract, both intuition and statistical analysis support the position that WTA players could move forward more. When they do approach the net, they are often successful, putting away volley winners and rarely getting passed. But I suspect this implies a long-term strategy more than the sort of thing a coach should emphasize during a changeover.

When commentators suggest that a player should move forward, what I think they really mean is this: “If this player were more comfortable with her transition game, this would be a great opportunity to take advantage of that.” Or: “Players should work harder on their approach shots on the practice court so that they’re ready for opportunities like this one.” Or simply: “Martina would have won that point ten shots ago.”

There seems to be opportunity waiting for more, well, opportunistic young players. But it isn’t one that can be generated simply by a sudden coaching change or a harangue from John McEnroe. Only when a player emerges with the baseline game to contend with the best pros and a transition/net game that exceeds most of those on the tour today will we find out just how much opportunity today’s players have wasted.

The Changing Depth of the WTA

During yesterday’s broadcast of the Australian Open match between Alison Riske and Yanina Wickmayer, commentator Elise Burgin discussed whether the depth of the WTA has increased over the years. She felt strongly that it has, and she had a very useful illustration on screen, as 55th-ranked Riske was putting on an impressive display of shotmaking en route to a 6-1 6-1 victory.

From a quantitative perspective, “depth” can be hard to pin down. If lower-ranked players are holding their own against the top five, or ten, or thirty, it could mean that the field is very deep, or it could mean that we’re in an era without all-time greats.  As Burgin pointed out, the WTA might not currently have a top five to match those of some recent eras, but there’s little doubt that today’s top two could line up with just about any of the last few decades.

It would be very difficult to settle whether today’s top ranks are good, bad, or otherwise in historical terms, so for now, let’s assume they are average.  We’ll return to that in a bit.

Let’s start by looking at how the WTA top 32 has fared against everybody else. This encompasses about 900 matches per season. The trend isn’t overwhelming, but it does seem that the top 32 is not quite as dominant as it was in some previous periods:

[Chart “depth32”: top 32 winning percentage against the rest of the field, by season]

The 2012 and 2013 winning percentages of 73.4% and 74.7% represented the lowest two-year span since 1984 (where my ranking database begins).  Aside from the outlier years of 2004 and 2007, the top 32 has won fewer than 77% of its matches against the pack for more than a decade.  In the 1980s and 1990s, the top 32 was consistently above that number.
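
The calculation behind a chart like this can be sketched as follows. This is a guess at the bookkeeping rather than the actual code used for the post: the `(year, winner_rank, loser_rank)` record layout, the function name, and the sample rows are all hypothetical, and only matches pitting exactly one top-group player against a lower-ranked opponent are counted.

```python
from collections import defaultdict

def top_vs_pack_win_pct(matches, cutoff=32):
    """Return {year: share of top-vs-pack matches won by the top group}.

    `matches` is an iterable of (year, winner_rank, loser_rank) tuples,
    a hypothetical layout chosen for this illustration. Matches between
    two top players, or between two pack players, are ignored.
    """
    wins = defaultdict(int)
    played = defaultdict(int)
    for year, winner_rank, loser_rank in matches:
        winner_top = winner_rank <= cutoff
        loser_top = loser_rank <= cutoff
        if winner_top == loser_top:
            continue  # not a top-vs-pack match
        played[year] += 1
        if winner_top:
            wins[year] += 1
    return {year: wins[year] / played[year] for year in sorted(played)}

# Made-up rows, only to show the shape of the input
sample = [
    (2013, 1, 55),    # top player beats pack player
    (2013, 70, 12),   # pack player beats top player
    (2013, 3, 8),     # top vs. top, skipped
    (2013, 20, 101),  # top player beats pack player
    (2012, 40, 5),    # pack player beats top player
]
print(top_vs_pack_win_pct(sample))
```

Passing a different `cutoff` moves the line between “the top” and “the pack” without changing anything else.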

Of course, drawing the line at the top 32 is arbitrary. Most of us would think of the 19th- or 26th-ranked player as part of the pack, not as a defining player of this generation.  Let’s see how the graph looks if we draw the line at the top 10:

[Chart “depth10”: top 10 winning percentage against the rest of the field, by season]

Looking at the top 10 against everyone else doesn’t differentiate the current era quite as much as the top 32 does, but it continues to show that the pack is quite competitive in historical terms.

Since 1984, the top 10 has won almost exactly 80% of matches against everyone else, and for the last two years, the WTA has matched that number.  However, in the very recent past, from 2009 to 2011, the pack posted the three best single-season records against the top 10, peaking in 2010, when the top 10 won only 74% of matches against others.

As I noted at the outset, comparing “the top” with “the pack” in a series of years implies that one or the other is a constant. The top–especially a small group such as the top 10–almost certainly isn’t. In 2010, that great season for the pack, Serena Williams played only 29 matches, compared to 62 in 2009 and 82 in 2013. Add another 30 or 50 Serena matches to the sample and maybe the pack wouldn’t have looked so good.

While the pack is less affected by single injuries, it probably isn’t a constant either. After all, the claim that launched this post is that the pack has improved.  Thus, we can’t entirely trust these numbers as a rating of the top based on their record against the pack, or as a rating of the pack based on their record against the top.

However, we can see broad trends and supplement them with some qualitative judgments.  If you believe that today’s top ten is a particularly weak one, the fact that the pack is winning only 20% of their matches against that group isn’t exactly an endorsement. If you think the top of the game is particularly strong, that 20% looks much better, supporting Burgin’s position that the pack is better than ever.

An alternative theory that may explain this intuition about the pack is based on injuries. WTA injury numbers (based on retirements and withdrawals, anyway) are at an all-time high, and advances in sports medicine are getting players back on court quicker than ever. Thus, there is always a pool of players whose talent level is not represented by their ranking, either because they are injury-prone and never reach that ranking, or because they’ve recently missed time and seen their ranking fall during that period.

Of course, there have always been players in the field returning from injury, but at any given time, there are probably more today than there were twenty years ago. And that means more unseeded, lower-ranked competitors with the capability of beating a top player. They usually don’t–as in the cases of Andrea Petkovic, Venus Williams, and Vera Zvonareva this week–but if you’re looking at a draw hunting for dark horses and interesting early-round matchups, those are the sorts of names that deliver.

Given all the moving parts in this sort of analysis, it’s tough to draw conclusions. If a couple of players suddenly emerge as dominant players and complement Serena and Vika at the top of the game, we could see these numbers swing in favor of the top. If Serena suddenly retires, they’ll probably swing in favor of the pack. For now, the best I can offer is that the pack–whether defined as those outside the top 10, top 32, or any number in between–is probably a bit better than the WTA’s historical average.

US Open Final: Serena Williams d. Victoria Azarenka: Recap and Detailed Stats

Today’s final was Serena Williams’s for the taking.  She didn’t seize it as boldly as she might have, but she performed just well enough to overcome both the windy conditions and a reliably dogged opponent in Victoria Azarenka.

When Serena is playing as well as she did during the third set, it’s tough to see how she ever loses. But today we saw an excellent illustration of both her assets and her liabilities.  If her opponent can hang around in rallies, there will be enough errors to swing some matches in the other direction. Most of the WTA rank and file can’t absorb her pace and stick around long enough to reap the benefits of those errors, but Vika can.

And when Azarenka is playing her best, as she did on occasion throughout this match, she can attack on one of Serena’s less penetrating shots, creating opportunities for her own winners. A player with a bigger serve would do that with her serve; Vika must try to do so within each rally.

By the numbers, it’s a bit of a miracle that Vika forced a third set.  Twice in the second set, Serena served for the match and was broken.  It was a testament to Azarenka’s stubbornness, always putting one more ball back in play, forcing Serena to overcome both the pressure and the wind.  In that second set, Williams had a hard time doing that.

It was the wind–and Serena’s difficulty dealing with it–that kept this match going as long as it did.  While it made life difficult for both players at times, especially when playing on the right side of the chair, Serena struggled much more.  She never really adjusted to the conditions, setting up early and taking big swings when the wind was likely to move the ball a bit too much for that.  Many of Serena’s errors–especially her 33 unforced errors on the backhand side alone–can be attributed to that sloppiness.

By the third set, the wind had settled down and so had Serena.  Azarenka provided some help with two crucial double faults in the fourth game of the set, including one on break point.  It wasn’t her first poorly-timed double fault of the match–four of her five came at 30-30 or later–but this one was the beginning of the end.  Unlike in the second set, Serena didn’t let up.  She consolidated the break by holding to love, with an unreturnable serve, two aces, and a running backhand lob winner.

I wrote this morning that Azarenka’s chances hinged on her serve.  She won 54.5% of her service points, a bit less than she did against Serena in Cincinnati, but better than she did in each of her last three matches in New York.  Had she limited her double faults to less important moments, 54.5% may well have been enough.

In the end, Serena was simply too strong.  Vika is the very best on tour at what she does, negating the advantage of those huge weapons, but that style leaves her very little margin for error against Serena.  That margin wasn’t quite enough for her to pull off the upset today.

Here are the point-by-point-based serve, return, and shot-type stats for the match.

Does Azarenka Have a Chance?

The last two times Serena Williams and Victoria Azarenka have met on hard courts, Azarenka has come out on top.  As much confidence as that might give her going into today’s final, it might be the only evidence suggesting she’s likely to win.

Today’s match will come down to Vika’s ability to hold serve, and while she has moved quickly through her last two rounds, she has yet to show that she can serve well enough to hold off the onslaught that is Serena’s return game.

In the semifinal against Flavia Pennetta, she lost more service points than she won, and was broken in five of her nine service games.  Against Daniela Hantuchova, she lost 47% of her service points, suffering three service breaks.  Playing Ana Ivanovic, she lost more than half of her service points, and was broken seven times.

While each of those players had a nice tournament, this is not exactly a Hall of Fame lineup that has reduced Azarenka’s service games to coin flips.  None brings anywhere near the weaponry to the return game that Serena does.  And Serena is considerably more difficult to break back.
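
To see why a service-point win rate near 50% turns a service game into a coin flip, here is a minimal sketch of the standard game-win calculation. It assumes every point on serve is independent and won with the same probability, which real matches only approximate, and the function name `hold_probability` is mine rather than anything from the original analysis.

```python
def hold_probability(p):
    """Chance of holding serve when each service point is won with probability p.

    Assumes independent, identically distributed points (a simplification).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    # Server wins to love, to 15, or to 30 (1, 4, and 10 point orderings)
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce: three points each, in any of 20 orders
    reach_deuce = 20 * p**3 * q**3
    # From deuce, win two straight points before the returner does
    win_from_deuce = p**2 / (p**2 + q**2)
    return before_deuce + reach_deuce * win_from_deuce

# Service-point win rates in the range discussed above
for rate in (0.47, 0.50, 0.53, 0.55):
    print(f"{rate:.0%} of service points -> hold chance {hold_probability(rate):.3f}")
```

At exactly 50% of service points, the hold probability under this model is also 50%; small edges in either direction are amplified at the game level.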

These numbers make it all the more surprising that the last meeting between these two players ended in Vika’s favor.  We have detailed data from that most recent matchup:  Azarenka managed to win 55% of her service points (the same figure she held Serena to) and landed 11 of 12 serves on game points, winning nine of them.

Another promising data point is last year’s US Open final, in which Serena managed to win only 44% of Azarenka’s service points.  In both of these recent contests, the difference between Vika’s first-serve and second-serve success rates is tiny–in New York last year, it was a mere two percentage points–suggesting that she needs only a slight edge at the beginning of a rally to win the point.

Azarenka has the ability to step up her game for the big matches, so the question she’ll have to answer today is: Can she serve more effectively than she has all tournament?  If she does, even at the modest level she did in Cincinnati, we’re in for a very competitive afternoon of tennis.

Check out this final preview from Tom Perrotta, in which everyone agrees that Vika will raise her level today.

If you missed it yesterday, I wrote recaps of both men’s semifinals.  Djokovic-Wawrinka here, and Nadal-Gasquet here.  In those posts you can find links to my point-by-point based stats for both matches.

Finally, don’t miss this piece from Carl Bialik, in which he looks at IBM’s not-very-predictive “predictive analytics,” otherwise known as their Keys to the Match.  Next week, I’ll offer a closer look at the details of the better-performing “Sackmann Keys,” which, it turns out, have much more value for tennis analysis than merely showing up the folks at IBM.