The Steadily Less Predictable WTA

Italian translation at settesei.it

Update: The numbers in this post summarizing the effectiveness of sElo are much too high–a bug in my code led to calculating effectiveness with post-match ratings instead of pre-match ratings. The parts of the post that don’t have to do with sElo are unaffected and–I hope–remain of interest.

One of the talking points throughout the 2017 WTA season has been the unpredictability of the field. With the absence of Serena Williams, Victoria Azarenka, and until recently, Petra Kvitova and Maria Sharapova, there is a dearth of consistently dominant players. Many of the top remaining players have been unsteady as well, due to some combination of injury (Simona Halep), extreme surface preferences (Johanna Konta), and good old-fashioned regression to the mean (Angelique Kerber).

No top seed has won a title at the Premier level or above so far this year. Last week, Stephanie Kovalchik went into more detail, quantifying how seeds have failed to meet expectations and suggesting that the official WTA ranking system–the algorithm that determines which players get those seeds–has failed.

There are plenty of problems with the WTA ranking system, especially if you expect it to have predictive value–that is, if you want it to properly reflect the performance level of players right now. Kovalchik is correct that the rankings have done a particularly poor job this year identifying the best players. However, there’s something else going on: According to much more accurate algorithms, the WTA is more chaotic than it has been for decades.

Picking winners

Let’s start with a really basic measurement: picking winners. Through Rome, there had been more than 1,100 completed WTA matches. The higher-ranked player won 62.4% of those. Since 1990, the ranking system has picked the winner of 67.9% of matches, and it topped 70% during several years in the 1990s. It never fell below 66% until 2014, and this year’s 62.4% is the worst in the 28-year time frame under consideration.

Elo does a little better. It rates players by the quality of their opponents, meaning that draw luck is taken out of the equation, and does a better job of estimating the ability level of players like Serena and Sharapova, who for various reasons have missed long stretches of time. Since 1990, Elo has picked the winner of 68.6% of matches, falling to an all-time low of 63.1% so far in 2017.

For a big improvement, we need surface-specific Elo (sElo). An effective surface-based system isn’t as complicated as I expected it to be. By generating separate rankings for each surface (using only matches on that surface), sElo has correctly predicted the winner of 76.2% of matches since 1990, almost cracking 80% back in 1992. Even sElo is baffled by 2017, falling to its lowest point, 71.0%.

(sElo for all three major surfaces is now shown on the Tennis Abstract Elo ratings report.)
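Here is a minimal sketch of the surface-specific idea: one independent Elo table per surface, updated only by matches on that surface. The starting rating, the update factor, the class name, and the player labels are all assumptions for illustration, not the values behind the numbers in this post. The forecast is read off before the ratings change, since scoring a system with post-match ratings inflates its accuracy.

```python
from collections import defaultdict

START_RATING = 1500.0  # assumed starting rating, not taken from the post
K = 32.0               # assumed update factor, not taken from the post


def win_prob(rating_a, rating_b):
    """Standard Elo expectation that the player rated rating_a wins."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


class SurfaceElo:
    """Keeps a separate Elo table for each surface."""

    def __init__(self):
        self.ratings = defaultdict(lambda: defaultdict(lambda: START_RATING))

    def play(self, surface, winner, loser):
        """Return the pre-match forecast for the eventual winner, then update."""
        table = self.ratings[surface]
        forecast = win_prob(table[winner], table[loser])  # pre-match ratings only
        change = K * (1.0 - forecast)
        table[winner] += change
        table[loser] -= change
        return forecast


selo = SurfaceElo()
first_clay = selo.play("clay", "Player A", "Player B")
second_clay = selo.play("clay", "Player A", "Player B")
first_grass = selo.play("grass", "Player A", "Player B")
```

In this toy run the second clay forecast favors Player A because of the first clay result, while the grass forecast is still a coin flip: the clay matches never touch the grass table.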

This graph shows how effectively the three algorithms picked winners. It’s clear that sElo is far better, and the graph also shows that some external factor is driving the predictability of results, affecting the accuracy of all three systems to a similar degree:

Brier scores

We see a similar effect if we use a more sophisticated method to rate the WTA ranking system against Elo and sElo. The Brier score of a collection of predictions measures not only how accurate they are, but also how well calibrated they are–that is, a player forecast to win a matchup 90% of the time really does win nine out of ten, not six out of ten, and vice versa. Brier scores average the square of the difference between each prediction and its corresponding result. Because it uses the square, very bad predictions (for instance, that a player has a 95% chance of winning a match she ended up losing) far outweigh more pedestrian ones (like a player with a 95% chance going on to win).
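The calculation itself is short. This sketch follows the definition above, with each result coded as 1 for a win and 0 for a loss; the function name and the sample forecasts are illustrative only.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and results.

    forecasts: probability that the chosen player wins each match.
    outcomes:  1 if that player won, 0 if she lost.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need one outcome per forecast, and at least one match")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


upset = brier_score([0.95], [0])              # 95% favorite loses
routine = brier_score([0.95], [1])            # 95% favorite wins
coin_flip = brier_score([0.5, 0.5], [1, 0])   # always forecasting 50%
```

The blown 95% forecast contributes 0.9025, while the routine one contributes 0.0025, which is why a handful of big upsets can swamp many correct picks. A forecaster who says 50% for every match scores exactly 0.25, a useful baseline when reading Brier scores.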

In 2017 so far, the official WTA ranking system has a Brier score of .237, compared to Elo of .226 and sElo of .187. Lower is better, since we want a system that minimizes the difference between predictions and actual outcomes. All three numbers are the highest of any season since 1990. The corresponding averages over that time span are .207 (WTA), .202 (Elo), and .164 (sElo).

As with the simpler method of counting correct predictions, we see that Elo is a bit better than the official ranking, and both of the surface-agnostic methods are crushed by sElo, even though the surface-specific method uses considerably less data. (For instance, the clay-specific Elo ignores hard and grass court results entirely.) And just like the results of picking winners, we see that the differences in Brier scores of the three methods are fairly consistent, meaning that some other factor is causing the year-to-year differences:

The takeaway

The WTA ranking system has plenty of issues, but its unusually bad performance this year isn’t due to any quirk in the algorithm. Elo and sElo are structured completely differently–the only thing they have in common with the official system is that they use WTA match results–and they show the same trends in both of the above metrics.

One factor affecting the last two years of forecasting accuracy is the absence of players like Serena, Sharapova, and Azarenka. If those three played full schedules and won at their usual clip, there would be quite a few more correct predictions for all three systems, and perhaps there would be fewer big upsets from the players who have tried to replace them at the top of the game.

But that isn’t the whole story. A bunch of no-brainer predictions don’t affect Brier score very much, and the presence of heavily favored players also makes it more likely that massively surprising results occur, such as Serena’s loss to Madison Brengle, or Sharapova’s ouster at the hands of Eugenie Bouchard. Many unexpected results are completely independent of the top ten, like Marketa Vondrousova’s recent title in Biel.

While some of the year-to-year differences in the graphs above are simply noise, the last several years look much more like a meaningful trend. It could be that we are seeing a large-scale changing of the guard, with young players (and their low rankings) regularly upsetting established stars, while the biggest names in the sport are spending more time on the sidelines. Upsets may also be somewhat contagious: When one 19-year-old aspirant sees a peer beating top-tenners, she may be more confident that she can do the same.

Whatever influences have given us the WTA’s current state of unpredictability, we can see that it’s not just a mirage created by a flawed ranking system. Upsets are more common now than at any other point in recent memory, whichever algorithm you use to pick your favorites.

Cool Down Tennis

This is a guest post by Carl Bialik.

Imagine you’re named boss of tennis. Right after being sworn in by Rod Laver and Martina Navratilova, you’re handed an empty wall calendar. You make the schedule for 2018. What’s your first move?

Mine would be to move Indian Wells and Miami earlier in the calendar, and the Australian Open later, after the two U.S. Masters tournaments.

I never wanted this more than while sweating my way around the Indian Wells grounds in search of shade last month. I wasn’t alone. The only full sections of the main stadium during day sessions were the ones protected from the sun. Around the fan-friendly venue, there are plenty of seats in the shade — under tents, or in Adirondack chairs that shade-seeking people push ever closer to the screen as the sun shifts. The players can only wait for shade to slowly descend on the court. Jack Sock needed a towel holding 50 ice cubes to cool down.

Sweating in the grass

Sure, it was unusually hot at this year’s Indian Wells tournament. But the climatological averages are clear: It’s hot in the California desert and in the Florida sunshine in March, and in the antipodean summer in January. It’d be cooler in Indian Wells, Miami and Melbourne if the two Masters events moved two months earlier and led up to the year’s first Grand Slam in March. Each of the two-week events would be, on average, 4 to 10 degrees Fahrenheit cooler each year. (The precipitation would be about the same, so Miami men’s finalist Rafael Nadal might continue to bemoan humidity, request sawdust and show more than he’d planned beneath his shorts; while women’s champ Johanna Konta might keep having to change clothes midmatch because they’ve accumulated approximately five kilograms of sweat.)

I’m using the averages because I don’t want to make too much of an unseasonably hot Indian Wells, or too little of an unusually cold March in Miami. But the averages might understate the problem because it’s precisely the outliers we’re worried about. A nudge downward of a few degrees, on average, could translate into a big drop in the probability of an unbearably hot fortnight — say, from 25 percent to 5 percent.
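To see why a small shift in the average can matter so much for the extremes, here is a rough sketch. It assumes, purely for illustration, that a fortnight's average temperature is roughly normally distributed from year to year; that assumption and the function name are mine, not something the climatological averages establish.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # assumption: temperatures are roughly normal


def cooling_needed(p_before, p_after):
    """Drop in the mean, in standard deviations, that cuts the chance of
    exceeding a fixed "unbearably hot" threshold from p_before to p_after."""
    z_before = STANDARD_NORMAL.inv_cdf(1 - p_before)
    z_after = STANDARD_NORMAL.inv_cdf(1 - p_after)
    return z_after - z_before


shift_in_sd = cooling_needed(0.25, 0.05)
```

Under that assumption the required shift is just under one standard deviation. If year-to-year variation were around 4 degrees (a made-up figure), a cooling of roughly 4 degrees would take the odds of a scorching fortnight from 25 percent to 5 percent.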

Changing the tennis calendar would also mean less daylight. That wouldn’t be so good for the nickname Sunshine Double, but it’d be good for tennis. Until more tennis stadiums adopt overhanging partial roofs (for sun, not for rain), shorter days mean less sun for fans to contend with and more reason to fill the seats. Plus, night tennis is exciting. The venues already have plenty of lights and evening sessions.

Scrambling the schedule would do more than cool down tennis. The three midyear majors’ proximity to each other helps the sport carry some momentum and mainstream buzz from one to the next. The Australian Open squanders all that in the four-month gap between its end and the start of the French Open. There’s even a month between the Aussie Open and the next big event.

The other three majors also get opening acts, to help players build up familiarity with the surface and for fans to build anticipation. The Australian Open gets two weeks at the start of the season — without so much as a 500 event on the men’s side.

The lack of buffer between the offseason and Melbourne also means it loses some players still recovering from the end of the previous season. That was the case this year with Juan Martin del Potro, who skipped this year’s first major after winning the Davis Cup with Argentina in November.

Imagine instead starting the season with Indian Wells and Miami — or Miami, then Indian Wells, while we’re scrambling things, for the convenience of travel from the sport’s power center of Europe — using the same courts and balls as Melbourne. Follow that month — or less, if one or both of the U.S. early-year Masters succumbs to the reality that they could be just a week — by Doha and Dubai, then Brisbane, Sydney and the like, before the main event in Melbourne at the start of March. We’d start the season with a real hard-court swing, ending with the first major.

From Australia, the tour could stay in the southern hemisphere. The swing through South America has a long history and a terrible spot on the current calendar. It was traditionally played on clay but some of its biggest events are moving to hard courts — first (North American) Acapulco, now, maybe, Rio, in search of Masters status — to the chagrin of Nadal and others. Too many players simply don’t think it’s worth it to compete on clay for a few weeks if that’s followed by a month of hard-court events. But move Indian Wells and Miami, and South American clay could move a month later in the calendar — while slightly tempering what Nadal bemoans as “too extreme” weather conditions by an average of 1 degree. The swing would give way seamlessly to Houston, Charleston and the European clay spell — which, by the way, would absorb Bucharest, Hamburg, Umag, Bastad and Gstaad from their awkward post-Wimbledon calendar slots. And no one would suggest Miami move to green clay.

We’d be left with a coherent calendar with five seasons of roughly equal length and importance, four with a major and one with the year-end finals: (1) Outdoor hard courts in the U.S., the Middle East and Oceania, followed by (2) clay in the Americas and Europe, (3) English and German grass (with Newport for those who want to visit the sport’s hall of fame), (4) North American and Asian outdoor hard courts, and (5) European indoor hard courts (absorbing the current winter events such as St. Petersburg and Rotterdam) culminating in wherever the tours’ multiplying year-end finals are calling home that year. And let’s play Davis Cup and Fed Cup at the same time — the tours acting in sync; what a concept! — on weekends at the edge of the five new seasons, giving hosts a wider range of sensible surfaces to choose from, and creating the option for combined venues if men and women from the same country are hosting the same round. (Prague in 2012 would’ve been tennis nirvana.) Or, hell, consider merging the events.

Could all this happen? Sure — if tennis power were centralized in a person or people who prioritize the overall good of the global game. Without a radical transformation of tennis, though, it’ll be slow going: It took years for the idea of lengthening the grass-court season by a week to become reality.

Carl Bialik has written about tennis for fivethirtyeight.com and The Wall Street Journal. He lives and plays tennis in New York City and has a Tennis Abstract page.

Playing Even Better Than Number One

Italian translation at settesei.it

Last night in Miami, Venus Williams beat newly re-minted WTA No. 1 Angelique Kerber. Venus, of course, has plenty of experience clashing with the very best in women’s tennis, with 15 Grand Slam finals and three spells at the No. 1 ranking herself.

Last night’s quarterfinal was Venus’s 37th match against a WTA No. 1 and her 15th win. Kerber became the sixth different top-ranked player to lose at the hands of the elder Williams sister.

All of these numbers are very impressive, especially when you consider that, taken as a whole, WTA No. 1s have won just over 88% of their nearly 2,300 matches since the modern ranking system was instituted. However, Venus doesn’t hold the record in any of these categories.

Records against No. 1s are a somewhat odd classification, since the best players tend to reach the top spot themselves. For example, Martina Hingis played only 11 matches against top-ranked opponents, barely one-fifth as many as the leader in that category. On the other hand, injuries and other layoffs have meant that many all-time greats have found themselves lower in the rankings for long stretches. That is particularly true of Venus and Serena Williams.

With her 37 matches played against No. 1s, Venus is approaching the top of the list, but it will take a superhuman effort to catch Arantxa Sanchez Vicario, at 51:

Rank  Player                   Matches vs No. 1
1     Arantxa Sanchez Vicario                51
2     Gabriela Sabatini                      38
3     Venus Williams                         37
4     Lindsay Davenport                      34
5     Conchita Martinez                      33
6     Helena Sukova                          31
7     Serena Williams                        28
8     Svetlana Kuznetsova                    27
-     Jana Novotna                           27
10    Amelie Mauresmo                        25
11    Maria Sharapova                        23

Wins against No. 1s is a more achievable goal. Martina Navratilova holds the current record at 18*, followed by Serena at 16, and then Lindsay Davenport and Venus at 15:

Rank  Player               Wins  Losses
1     Martina Navratilova    18*      
2     Serena Williams        16      12
3     Lindsay Davenport      15      19
-     Venus Williams         15      22
5     Steffi Graf            11       8
6     Gabriela Sabatini      10      28
7     Amelie Mauresmo         8      17
8     Svetlana Kuznetsova     7      20
-     Maria Sharapova         7      16
-     Mary Pierce             7      15
-     Justine Henin           7       9

*My database does not have rankings throughout Navratilova’s entire career, but other sources credit her with 18 wins.

Win percentage against top-ranked opponents is a bit trickier, as it depends where you set the minimum number of matches. I’ve drawn the line at five. That’s rather low, but I wanted to include Alize Cornet and Elina Svitolina, active players who have each won three of their six matches against No. 1s. By this standard, Venus ranks eighth, though equally reasonable thresholds of 8 or 10 matches would move her up two or three places:

Rank  Player             Wins  Losses   Win%
1     Steffi Graf          11       8  57.9%
2     Serena Williams      16      12  57.1%
3     Petra Kvitova         5       4  55.6%
4     Elina Svitolina       3       3  50.0%
-     Alize Cornet          3       3  50.0%
6     Lindsay Davenport    15      19  44.1%
7     Justine Henin         7       9  43.8%
8     Venus Williams       15      22  40.5%
9     Vera Zvonareva        4       7  36.4%
-     Dinara Safina         4       7  36.4%

Remember that the average player wins fewer than 12% of matches against No. 1s!
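The effect of the minimum-matches threshold is easy to check against the win-loss records in the table above. This sketch uses only those ten records; the function names are illustrative. Players outside the table all sit below Venus at the five-match cutoff, so they cannot change her position at a stricter one.

```python
# Wins and losses against No. 1s, copied from the table above.
records = {
    "Steffi Graf": (11, 8),
    "Serena Williams": (16, 12),
    "Petra Kvitova": (5, 4),
    "Elina Svitolina": (3, 3),
    "Alize Cornet": (3, 3),
    "Lindsay Davenport": (15, 19),
    "Justine Henin": (7, 9),
    "Venus Williams": (15, 22),
    "Vera Zvonareva": (4, 7),
    "Dinara Safina": (4, 7),
}


def rank_by_win_pct(min_matches):
    """Names with at least min_matches played, best win percentage first."""
    eligible = {
        name: wins / (wins + losses)
        for name, (wins, losses) in records.items()
        if wins + losses >= min_matches
    }
    return sorted(eligible, key=eligible.get, reverse=True)


def position(name, min_matches):
    """1-based place of a player on the list for a given threshold."""
    return rank_by_win_pct(min_matches).index(name) + 1


venus_places = {m: position("Venus Williams", m) for m in (5, 8, 10)}
```

Raising the cutoff to 8 drops Svitolina and Cornet (six matches each) and puts Venus sixth; raising it to 10 also drops Kvitova (nine matches) and puts her fifth.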

Finally, Venus’s defeat of Kerber gave her a win against her sixth different No. 1, moving her into second place in that department. As is so often the case, she trails only her sister, who has beaten seven. Oddly enough, there is very little overlap between Serena’s and Venus’s lists: Their only common victims are Hingis and Davenport. The full list:

Rank  Player               No. 1s defeated
1     Serena Williams                    7
2     Venus Williams                     6
3     Steffi Graf                        5
-     Kim Clijsters                      5
-     Amelie Mauresmo                    5
-     Maria Sharapova                    5
7     Petra Kvitova                      4
-     Lindsay Davenport                  4
-     Justine Henin                      4
-     Svetlana Kuznetsova                4

If Karolina Pliskova–who now stands within 1500 points of No. 1 and could further close the gap in Miami–reaches the top spot, Venus may get a chance to beat a 7th top player. Of course, Serena could get that chance, as well.

The Most Exciting Matches of the 2016 WTA Season

Italian translation at settesei.it

In my most recent piece for The Economist, I used a metric called Excitement Index (EI) to consider the implications of shortening singles matches to a format like the no-ad, super-tiebreak rules used for doubles. In my simulations, the shorter format didn’t fare well: The most gripping contests are often the longest ones, and the full-length third set is frequently the best part.

I used data from ATP tournaments in that piece, and several readers have asked how women’s matches score on the EI scale. Many matches from the 2016 season rate extremely highly, while some players we tend to think of as exciting fail to register among the best by this metric. I’ll share some of the results in a moment.

First, a quick overview of EI. We can calculate the probability that each player will win a match at any point in the contest, and using those numbers, it’s possible to determine the leverage of every point–that is, the difference between a player’s odds if she wins the next point and her odds if she loses it. At 40-0, down a break in the first set, that leverage is very low: less than 2%. In a tight third-set tiebreak, leverage can climb as high as 25%. The average point is around 5% to 6%, and as long as neither player has a substantial lead, points at 30-30 or later are higher.

EI is calculated by averaging the leverage of every point in the match. The more high-leverage points, the higher the EI. To make the results a bit more viewer-friendly, I multiply the average leverage by 1,000, so if the typical point has the potential for a 5% (0.05) swing, the EI is 50. The most boring matches, like Garbine Muguruza’s 6-1 6-0 dismantling of Ekaterina Makarova in Rome, rate below 25. The most exciting will occasionally top 100, and the average WTA match this year scored a 53.7. By comparison, the average ATP match this year rated at 48.9.
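In code, the two steps look roughly like this. The sketch assumes the win probabilities before and after each point come from some separate match model, which is not shown; the function names and the sample leverages are illustrative.

```python
def leverage(win_prob_if_point_won, win_prob_if_point_lost):
    """Swing in a player's match-win probability riding on one point."""
    return win_prob_if_point_won - win_prob_if_point_lost


def excitement_index(leverages):
    """Average leverage across all points in a match, scaled by 1,000."""
    if not leverages:
        raise ValueError("a match needs at least one point")
    return 1000 * sum(leverages) / len(leverages)


# A match in which every point carries a 5% swing.
typical = excitement_index([0.05, 0.05, 0.05, 0.05])

# A toy mix of low-leverage points and one tight tiebreak point.
mixed = excitement_index([0.02, 0.05, 0.06, 0.25])
```

Because EI is an average, not a sum, a long match does not score highly just for being long. It scores highly only if its points were, on average, more consequential.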

Of course, the number and magnitude of crucial moments isn’t the only thing that can make a tennis match “exciting.” Finals tend to be more gripping than first-round tilts, long rallies and daring net play are more watchable than error-riddled ballbashing, and Fed Cup rubbers feature crowds that can make the warmup feel like a third-set tiebreak. When news outlets make their “Best Matches of 2016” lists, they’ll surely take some of those other factors into account. EI takes a narrower view, and it is able to show us which matches, independent of context, offered the most pressure-packed tennis.

Here are the top ten matches of the 2016 WTA season, ranked by EI:

Tournament    Match                Score                    EI  
Charleston    Lucic/Mladenovic     4-6 6-4 7-6(13)       109.9  
Wimbledon     Cibulkova/Radwanska  6-3 5-7 9-7           105.0  
Wimbledon     Safarova/Cepelova    4-6 6-1 12-10         101.7  
Kuala Lumpur  Nara/Hantuchova      6-4 6-7(4) 7-6(10)    100.2  
Brisbane      CSN/Lepchenko        4-6 6-4 7-5            99.0  
Quebec City   Vickery/Tig          7-6(5) 6-7(3) 7-6(7)   98.5  
Miami         Garcia/Petkovic      7-6(5) 3-6 7-6(2)      98.1  
Wimbledon     Vesnina/Makarova     5-7 6-1 9-7            97.2  
Beijing       Keys/Kvitova         6-3 6-7(2) 7-6(5)      96.8  
Acapulco      Stephens/Cibulkova   6-4 4-6 7-6(5)         96.7

Getting to 6-6 in the final set is clearly a good way to appear on this list. The top fifty matches of the season (out of about 2,700) all reached at least 5-5 in the third. The highest-rated clash that didn’t get that far was Angelique Kerber’s 1-6 7-6(2) 6-4 defeat of Elina Svitolina, with an EI of 88.2. Svitolina’s 4-6 6-3 6-4 victory over Bethanie Mattek Sands in Wuhan, the top match on the list without any sets reaching 5-5, scored an EI of 87.3.

Wimbledon featured an unusual number of very exciting matches this year, especially compared to Roland Garros and the Australian Open, the other tournaments that forgo a tiebreak in the final set. The top-rated French Open contest was the first-rounder between Johanna Larsson and Magda Linette, which scored 95.3 and ranks 13th for the season, while the highest EI among Aussie Open matches is all the way down at 27th on the list, a 92.8 between Monica Puig and Kristyna Pliskova.

Dominika Cibulkova is the only player who appears twice on this list. That doesn’t mean she’s a sure thing for exciting matches: As we’ll see, elite players rarely are. The only year-end top-tenner who ranks among the highest average EIs is Svetlana Kuznetsova, who played as many “very exciting” matches–those rating among the top fifth of matches this season–as any other woman on tour:

Rank  Player                M  Avg EI  V. Exc  Exc %  Bor %  
1     Kristina Mladenovic  60    59.8      19  55.0%  25.0%  
2     Christina McHale     46    59.6      16  50.0%  19.6%  
3     Heather Watson       35    58.5      12  48.6%  25.7%  
4     Jelena Jankovic      43    57.6      12  55.8%  30.2%  
5     Svetlana Kuznetsova  64    57.4      21  48.4%  32.8%  
6     Venus Williams       38    57.1      10  55.3%  31.6%  
7     Yanina Wickmayer     43    56.5      13  46.5%  30.2%  
8     Alison Riske         46    56.5      10  45.7%  32.6%  
9     Caroline Garcia      62    56.4      18  43.5%  33.9%  
10    Irina-Camelia Begu   42    56.4      14  45.2%  40.5% 

(Minimum 35 tour-level matches (“M” above), excluding retirements. My data is also missing a random handful of matches throughout the season.)

The “V. Exc” column tallies how many top-quintile matches the player took part in. The “Exc %” column shows the percent of matches that rated in the top 40% of all WTA contests, while “Bor %” shows the same for the bottom 40%, the more boring matches. Big servers who reach a disproportionate number of tiebreaks and 7-5 sets do well on this list, though it is far from a perfect correspondence. Tiebreaks can create a lot of big moments, but if there were many love service games en route to 6-6, the overall picture isn’t nearly so exciting.

Unlike Kuznetsova, who played a whopping 32 deciding sets this year, most of the other top women enjoy plenty of blowouts. Muguruza, Simona Halep, and Serena Williams occupy the very last three places on the average-EI ranking, largely because when they win, they do so handily–and they win a lot. The next table shows the WTA year-end top-ten, with their ranking (out of 59) on the average-EI list:

Rank  Player        WTA#  Matches  Avg EI  V. Exc  Exc %  Bor %  
5     Kuznetsova       9       64    57.4      21  48.4%  32.8%  
13    Pliskova         6       66    55.6      19  48.5%  39.4%  
16    Keys             8       64    55.4      13  40.6%  35.9%  
23    Cibulkova        5       68    54.6      21  42.6%  42.6%  
28    Kerber           1       77    54.0      12  42.9%  41.6%  
      tour average                   53.7          40.0%  40.0%  
41    Radwanska        3       69    52.5      12  29.0%  44.9%  
51    Konta           10       67    51.2      12  34.3%  46.3%  
57    Muguruza         7       51    49.9       5  33.3%  43.1%  
58    Halep            4       59    49.6       8  30.5%  50.8%  
59    Williams         2       44    48.1       3  27.3%  50.0%

It’s a good thing that fans love Serena, because her matches rarely provide much in the way of big moments. As low as Williams and Halep rate on this measure, Victoria Azarenka scores even lower. Her Miami fourth-rounder against Muguruza was her only match this season to rank in the “exciting” category, and her average EI was a mere 44.0.

Clearly, EI isn’t much of a method for identifying the best players. Even looking at the lowest-rated competitors by EI would be misleading: In 56th place, right above Muguruza, is the otherwise unheralded Nao Hibino. EI excels as a metric for ferreting out the most riveting individual matches, whether they were broadcast worldwide or ignored entirely. And the next time someone suggests shortening matches, EI is a great tool to highlight just how much excitement would be lost by doing so.

Christina McHale’s Tokyo Marathon

At the Japan Open in Tokyo last week, Christina McHale won her first career title. It didn’t come easy. She played three sets in every one of her five matches, going all the way to third-set tiebreaks in her first two rounds. Altogether, she spent over 13 hours on court.

We need some context to appreciate just what an outlier that is. Of 50 tour-level WTA tournaments this year, no other titlist has spent more than about 11 hours and 35 minutes on court–and that includes Grand Slam winners, who play two more matches than McHale did! Before Christina’s marathon effort last week, the champion who spent the most time on court in a 32-draw event was Dominika Cibulkova, who needed “only” 9 hours and 20 minutes to win in Eastbourne.

There’s no complete source for historical WTA match-time data, so we can’t determine just how rare 13-hour efforts were in years past. We can, however, hunt for tournaments in which the winner needed to play so many sets.

Going back to 1991–encompassing almost 1,500 events–McHale’s effort marks only the second time a player has won a tournament while playing 15 sets in five matches. The only previous instance was Anastasia Pavlyuchenkova’s Paris title run in 2014. Serena Williams played five three-setters en route to the Roland Garros title last year, but of course, she played two other matches as well. Three other players–none since 2003–received first-round byes and then won tournaments by playing three sets in each of their four matches.

In general, we might expect a player who goes the distance in every round to struggle in the final. First of all, we would expect her to be tired–especially if, as is almost always the case, her opponent hasn’t spent as much time on court. Second, we might deduce that, if a player needed three sets to win early rounds, she’s in relatively weak form, compared to the typical tour-level finalist.

Sure enough, the last 25 years of WTA history give us 16 players who reached a final by playing three sets in every round. Of the 16, only four–McHale, Pavlyuchenkova, and two others who didn’t require three sets in the final–won the title. The other 12 couldn’t retain their three-set magic and lost in the final.

While 16 players don’t make up much of a sample, we get a similar result if we broaden our view to those who played three-setters in exactly three of their four matches before the final. Excluding those who faced opponents who also played so many three-setters, we’re left with 134 players, only 48 (35.8%) of whom won the title match. A simple ranking-based forecast indicates that 58 (43.3%) of those players should have won, suggesting that while these players are indeed weaker than their more-dominant opponents, their underperformance may be due partly to fatigue.

McHale spent over 10 hours on court simply reaching the Tokyo final, far more than the six-plus hours required by her opponent, Katerina Siniakova. Even when a player doesn’t spend the record-setting amount of time on court that the American did this week, competitors tend to underperform after playing so many three-setters. The fact that McHale didn’t, and that she triumphed in yet another marathon match, makes her achievement all the more impressive.

Elo-Forecasting the WTA Tour Finals in Singapore

With the field of eight divided into two round-robin groups for the WTA Tour Finals in Singapore, we can play around with some forecasts for this event. I’ve updated my Elo ratings through last week’s tournaments, and the first thing that jumps out is how different they are from the official rankings.

Here’s the Singapore field:

EloRank  Player                Elo  Group  
2        Maria Sharapova      2296    RED  
4        Simona Halep         2181    RED  
6        Garbine Muguruza     2147  WHITE  
8        Petra Kvitova        2136  WHITE  
9        Angelique Kerber     2129  WHITE  
11       Agnieszka Radwanska  2100    RED  
15       Lucie Safarova       2051  WHITE  
21       Flavia Pennetta      2004    RED

Serena Williams (#1 in just about every imaginable ranking system) chose not to play, but if Elo ruled the day, Belinda Bencic, Venus Williams, and Victoria Azarenka would be playing this week in place of Agnieszka Radwanska, Lucie Safarova, and Flavia Pennetta.

Anyway, we’ll work with what we’ve got. Maria Sharapova is, according to Elo, a huge favorite here. The ratings translate into a forecast that looks like this:

Player                  SF  Final  Title  
Maria Sharapova      83.7%  61.1%  43.6%  
Simona Halep         60.8%  35.4%  15.9%  
Garbine Muguruza     59.4%  25.7%  11.3%  
Petra Kvitova        55.2%  23.0%   9.8%  
Angelique Kerber     53.1%  21.7%   8.8%  
Agnieszka Radwanska  37.4%  17.4%   6.1%  
Lucie Safarova       32.3%   9.7%   3.1%  
Flavia Pennetta      18.1%   6.0%   1.4%

If Sharapova is really that good, the loser in today’s draw was Simona Halep. The top seed would typically benefit from having the second seed in the other group, but because Garbine Muguruza recently took over the third spot in the rankings, Pova entered the draw as a dangerous floater.

However, these ratings don’t reflect the fact that Sharapova hasn’t completed a match since Wimbledon. They don’t decline with inactivity, so Pova’s rating is the same as it was the day after she lost to Serena back in July. (My algorithm also excludes retirements, so her attempted return in Wuhan isn’t considered.)

With as little as we know about Sharapova’s health, it’s tough to know how to tweak her rating. For lack of any better ideas, I revised her Elo rating to 2132, right between Petra Kvitova and Angelique Kerber. At her best, Sharapova is better than that, but consider this a way of factoring in the substantial possibility that she’ll play much, much worse–or that she’ll get injured and her matches will be played by Carla Suarez Navarro instead. The revised forecast:

Player                  SF  Final  Title  
Simona Halep         69.9%  40.9%  24.0%  
Garbine Muguruza     59.4%  31.5%  16.5%  
Maria Sharapova      57.6%  29.5%  14.5%  
Petra Kvitova        55.6%  28.4%  14.4%  
Angelique Kerber     52.5%  26.3%  13.2%  
Agnieszka Radwanska  47.9%  22.3%   9.9%  
Lucie Safarova       32.6%  12.9%   4.9%  
Flavia Pennetta      24.7%   8.3%   2.7%

If this is a reasonably accurate estimate of Sharapova’s current ability, the Red group suddenly looks like the right place to be. Because Elo doesn’t give any particular weight to Grand Slams, it suggests that the official rankings far overestimate the current level of Safarova and Pennetta. The weakness of those two makes Halep a very likely semifinalist and also means that, in this forecast, the winner of the tournament is more likely (54% to 46%) to come from the White group.

Without Serena, and with Sharapova’s health in question, there are simply no dominant players in the field this week. If nothing else, these forecasts illustrate that we’d be foolish to take any Singapore predictions too seriously.

Forecasting the Effects of Performance Byes in Beijing

To the uninitiated, the WTA draw in Beijing this week looks a little strange. The 64-player draw includes four byes, which were given to the four semifinalists from last week’s event in Wuhan. So instead of empty places in the bracket next to the top four seeds, those free passes go to the 5th, 10th, and 15th seeds, along with one unseeded player, Venus Williams.

“Performance byes”–those given to players based on their results the previous week, rather than their seed–have occasionally featured in WTA draws over the last few years. If you’re interested in their recent history, Victoria Chiesa wrote an excellent overview.

I’m interested in measuring the benefit these byes confer on the recipients–and the negative effect they have on the players who would have received those byes had they been awarded in the usual way. I’ve written about the effects of byes before, but I haven’t contrasted different approaches to awarding them.

This week, the beneficiaries are Garbine Muguruza, Angelique Kerber, Roberta Vinci, and Venus Williams. The top four seeds–the women who were atypically required to play first-round matches–were Simona Halep, Petra Kvitova, Flavia Pennetta, and Agnieszka Radwanska.

To quantify the impact of the various possible formats of a 64-player draw, I used a variety of tools: Elo to rate players and predict match outcomes, Monte Carlo tournament simulations to consider many different permutations of each draw, and a modified version of my code to “reseed” brackets. While this is complicated stuff under the hood, the results aren’t that opaque.

Here are three different types of 64-player draws that Beijing might have employed:

  1. Performance byes to last week’s semifinalists. This gives a substantial boost to the players receiving byes, and compared to any other format, has a negative effect on top players. Not only are the top four seeds required to play a first-round match, they are a bit more likely to play last week’s semifinalists, since the byes give those players a better chance of advancing.
  2. Byes to the top four seeds. The top four seeds get an obvious boost, and everyone else suffers a bit, as they are that much more likely to face the top four.
  3. No byes: 64 players in the draw instead of 60. The clear winners in this scenario are the players who wouldn’t otherwise make it into the main draw. Unseeded players (excluding Venus) also benefit slightly, as the lack of byes means that top players are less likely to advance.

Let’s crunch the numbers. For each of the three scenarios, I ran simulations based on the field without knowing how the draw turned out. That is, Kvitova is always seeded second, but she doesn’t always play Sara Errani in the first round. This approach eliminates any biases in the actual draw. To simulate the 64-player field, I added the four top-ranked players who lost in the final round of qualifying.
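
The idea of redrawing the bracket for every simulation can be sketched as follows. This is a minimal illustration, not the code behind the numbers in this post: it assumes the standard Elo win probability, ignores seeding rules and byes entirely, and uses a made-up eight-player field. All names and ratings are hypothetical.

```python
import random


def win_prob(r_a, r_b):
    """Standard Elo win probability for A over B (400-point scale assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def play_bracket(bracket, ratings, rng):
    """Play one single-elimination bracket and return the champion.

    `bracket` lists players in draw order; its length must be a power of two.
    """
    while len(bracket) > 1:
        next_round = []
        for a, b in zip(bracket[0::2], bracket[1::2]):
            a_wins = rng.random() < win_prob(ratings[a], ratings[b])
            next_round.append(a if a_wins else b)
        bracket = next_round
    return bracket[0]


def title_odds(ratings, n_sims=20000, seed=1):
    """Estimate title probabilities by redrawing the bracket every simulation."""
    rng = random.Random(seed)
    players = list(ratings)
    titles = dict.fromkeys(players, 0)
    for _ in range(n_sims):
        draw = players[:]
        rng.shuffle(draw)  # a fresh random draw, so no single draw biases the result
        titles[play_bracket(draw, ratings, rng)] += 1
    return {p: titles[p] / n_sims for p in players}


# Hypothetical eight-player field with made-up ratings.
field = {"A": 2300, "B": 2150, "C": 2100, "D": 2050,
         "E": 2000, "F": 1950, "G": 1900, "H": 1850}
odds = title_odds(field)
for name, p in sorted(odds.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {p:.1%}")
```

A fuller version would fix each seed's section of the bracket and add bye slots, which is where the three scenarios above start to differ.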

To compare the effects of each draw type on every player, I calculated “expected points” based on their probability of reaching each round. For instance, if Halep entered the tournament with a 20% chance of winning the event with its 1,000 ranking points, she’d have 200 “expected points,” plus her expected points for the higher probabilities (and lower number of points) of reaching every round in between. It’s simply a way of combining a lot of probabilities into a single easier-to-understand number.
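
Here is a small sketch of that calculation. It weights the points for each possible result by the probability that a player's run ends exactly there. The points schedule and the round-by-round probabilities below are illustrative assumptions (only the 20% title chance and the 1,000 points for the title come from the example above), and the function name is hypothetical.

```python
def expected_points(reach_probs, points_by_result):
    """Expected ranking points from round-by-round probabilities.

    reach_probs[i]      -- probability of getting as far as stage i; the first
                           entry is 1.0 and the last is the title probability.
    points_by_result[i] -- points for a player whose run ends at stage i.
    """
    if len(reach_probs) != len(points_by_result):
        raise ValueError("need one points value per stage")
    total = 0.0
    for i, points in enumerate(points_by_result):
        p_next = reach_probs[i + 1] if i + 1 < len(reach_probs) else 0.0
        # Probability that the run ends exactly at stage i.
        total += points * (reach_probs[i] - p_next)
    return total


# Stages: lose in R64, R32, R16, QF, SF, F, or win the title.
# Assumed points schedule, for illustration only.
points = [10, 65, 120, 215, 390, 650, 1000]
# Hypothetical probabilities of reaching each stage, ending in a 20% title chance.
reach = [1.0, 0.90, 0.75, 0.55, 0.40, 0.28, 0.20]

print(round(expected_points(reach, points), 1))
```

The title contributes 200 of the total, as in the Halep example. The rest comes from the more likely but less lucrative earlier exits.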

Here are the expected points in each draw scenario (plus the actual Beijing draw) for the top four players, the four players who received performance byes, plus a couple of others (Belinda Bencic and Caroline Wozniacki) who rated particularly highly:

Player               Seed  PerfByes  TopByes  NoByes  Actual  
Simona Halep            1       323      364     330     341  
Petra Kvitova           2       276      323     290     291  
Venus Williams                  247      216     218     279  
Belinda Bencic         11       255      249     268     254  
Garbine Muguruza        5       243      202     210     227  
Angelique Kerber       10       260      224     235     227  
Caroline Wozniacki      8       208      203     205     199  
Flavia Pennetta         3       142      177     144     195  
Agnieszka Radwanska     4       185      233     192     188  
Roberta Vinci          15       120       91      94      90

As expected, the top four seeds reap far more points when given first-round byes. It’s most noticeable for Pennetta and Radwanska, who would enjoy a 20% boost in expected points if given a first-round bye. Oddly, though, the draw worked out very favorably for Flavia–Elo gave her a 95% chance of beating her first-round opponent Xinyun Han, and her draw steered her relatively clear of other dangerous players in subsequent rounds.

Similarly, the performance byes are worth a 15 to 30% advantage in expected points to the players who receive them. Vinci is the biggest winner here, as we would generally expect from the player most likely to suffer an upset without the bye.

Like Pennetta, Venus was treated very well by the way the draw turned out. The bye already gave her an approximately 15% boost compared to her expectations without a bye, and the draw tacked another 13% onto that. Both the structure of the draw and some luck on draw day made her the event’s third most likely champion, while the other scenarios would have left her in fifth.

All byes–conventional or unconventional–work to the advantage of some players and against others. However they are granted, they tend to work in favor of those who are already successful, whether that success is over the course of a year or a single week.

Performance byes are easy enough to defend: They give successful players a bit more rest between two demanding events, and from the tour’s perspective, they make it a little more likely that last week’s best players won’t pull out of this week’s tourney. And if all byes tend to make the rich a little richer, at least performance byes open the possibility of benefiting different players than usual.

How Elo Rates US Open Finalists Flavia Pennetta and Roberta Vinci

Italian translation at settesei.it

Among the many good things that have happened to Flavia Pennetta and Roberta Vinci after reaching the final of this year’s US Open, both enjoyed huge leaps in Monday’s official WTA rankings. Pennetta rose from 26th to 8th, and Vinci jumped from 43rd to 19th.

Such large changes in rankings are always a little suspicious and expose the weakness of systems that award points based on round achieved. A lucky draw or one incredible outlier of a match doesn’t mean that a player is suddenly massively better than she was a couple of weeks ago.

To put it another way: As they are, the official rankings do a decent job of representing how a player has performed. What they don’t do so well is represent how well someone is playing, or the closely related issue of how well she will play.

For that, we can turn to Elo ratings, which Carl Bialik and Benjamin Morris used at the beginning of the US Open to compare Serena Williams to other all-time greats [1]. Elo awards points based on opponent quality, not the importance of the tournament or round. As such, the system provides a better estimate of the current skill level of each player than the official rankings do.
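
To make "points based on opponent quality" concrete, here is a minimal sketch of the textbook Elo update. The K-factor of 32 is a hypothetical constant chosen for illustration, not necessarily the one used for the ratings in this post, and the function names are my own.

```python
def expected_score(r_player, r_opponent):
    """Standard Elo expectation for the player (400-point scale assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((r_opponent - r_player) / 400.0))


def update_ratings(r_winner, r_loser, k=32.0):
    """Return (new_winner, new_loser) after one match.

    k is a hypothetical constant; real systems often vary it.
    """
    # The less expected the win, the more points change hands.
    gain = k * (1.0 - expected_score(r_winner, r_loser))
    return r_winner + gain, r_loser - gain


# A routine win over a much weaker opponent barely moves the ratings...
print(update_ratings(2200, 1900))
# ...while an upset of a much stronger opponent moves them a lot.
print(update_ratings(1900, 2200))
```

Nothing in the update refers to the tournament or the round, which is why one deep run at a Slam moves an Elo rating far less than it moves an official ranking.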

Sure enough, Elo agrees with my hypothesis that Pennetta didn’t suddenly become the 8th best player in the world. Instead, she rose to 17th, just behind Garbine Muguruza (another Slam finalist overestimated by the rankings) and ahead of Elina Svitolina. Vinci didn’t really return to the top 20, either: Elo places her 34th, between Camila Giorgi and Barbora Strycova.

While her official ranking of 8th is Pennetta’s career high, Elo disagrees again. The system claims that Pennetta peaked during the US Open six years ago, after a strong summer that involved semifinal-or-better showings in four straight tournaments, plus a fourth-round win over Vera Zvonareva in New York. She’s more than 100 points below that career-high level, equivalent to the present gap between her and 7th-Elo-rated Angelique Kerber.

The current Elo rankings hold plenty of surprises like this, having little in common with the official rankings:

Rank  Player                 Elo  
1     Serena Williams       2460  
2     Maria Sharapova       2298  
3     Victoria Azarenka     2221  
4     Simona Halep          2204  
5     Petra Kvitova         2174  
6     Belinda Bencic        2144  
7     Angelique Kerber      2130  
8     Venus Williams        2126  
9     Caroline Wozniacki    2095  
10    Lucie Safarova        2084

Rank  Player                 Elo   
11    Ana Ivanovic          2078  
12    Carla Suarez Navarro  2062  
13    Agnieszka Radwanska   2054  
14    Timea Bacsinszky      2041  
15    Sloane Stephens       2031  
16    Garbine Muguruza      2031  
17    Flavia Pennetta       2030  
18    Elina Svitolina       2023  
19    Madison Keys          2019  
20    Jelena Jankovic       2016

While Victoria Azarenka is still nearly 200 points shy of her peak, Elo gives her credit for the extremely tough draws that have met her return from injury. Another player rated much higher here than in the WTA rankings is Belinda Bencic, whose defeat of Serena launched her into the top ten.

The oldest final

Pennetta and Vinci are both unusually old for Slam finalists, not to mention players who reached that milestone for the first time. Elo doesn’t consider them among the very best players active today, but next to other 32- and 33-year-olds in WTA history, they compare very well indeed.

Among players 33 or older, Pennetta’s current rating is sixth best in the last thirty-plus years [2]. As the all-time list shows, that puts her in extraordinarily good company:

Rank  Player                Age   Elo  
1     Martina Navratilova  33.4  2527  
2     Serena Williams      33.9  2480  
3     Chris Evert          33.4  2412  
4     Venus Williams       33.3  2175  
5     Nathalie Tauziat     33.9  2088  
6     Flavia Pennetta      33.5  2030  
7     Wendy Turnbull       33.1  2018  
8     Conchita Martinez    33.3  2014

In the 32-and-over category, Vinci stands out as well. Her lower rating, combined with the somewhat larger pool of players who remained competitive to that age, means that she holds 24th place in this age group. For a player who has never cracked the top ten, 24th of all time is an impressive accomplishment.

Keep an eye out for more Elo-based analysis here. Soon, I’ll be able to post and update Elo ratings on Tennis Abstract and, once a few more kinks are worked out, use them to improve the WTA tournament forecasts on the site as well.

Break Point Persistence: Why Venus is Better Than Her Ranking

Some points matter a lot more than others. A couple of clutch break point conversions or a well-played tiebreak make it possible to win a match despite winning fewer than half of the points. Even when such statistical anomalies don’t occur, one point won at the right time can erase the damage done by several other points lost.

Break points are among the most important points, and because tennis’s governing bodies track them, we can easily study them. I’ve previously looked at break point stats, with a special emphasis on Federer, in two earlier posts. Today we’ll focus on break points in the women’s game.

The first step is to put break points in context. Rather than simply looking at a percentage saved or converted, we need to compare those rates to a player’s serve or return points won in general. Serena Williams is always going to save a higher percentage of break points than Sara Errani does, but that has much more to do with her excellent service game than any special skills on break points.

Once we do that, we have two results for each player: How much better (or worse) she is when facing break point on serve, and how much better (or worse) she is with a break point on return.

For instance, this year Serena has won 2.8% more service points than average when facing break point, and 7.5% more return points than average with a break point opportunity. The latter number is particularly good–not only compared to other players, but compared to Serena’s own record over the last ten years, when she’s converted break points exactly as often as she has won other return points.
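
The comparison can be sketched in a few lines. This version measures the relative difference between a player's success rate on break points and her overall rate on the same side of the ball. Treating the gap as relative rather than as a percentage-point difference is an assumption, and the function name and the counts in the example are hypothetical.

```python
def break_point_edge(bp_won, bp_played, pts_won, pts_played):
    """Relative difference between break point success and the overall rate.

    Works for either side of the ball: pass serve totals (break points saved)
    or return totals (break points converted).
    """
    if bp_played == 0 or pts_played == 0 or pts_won == 0:
        raise ValueError("need non-zero totals to compare rates")
    bp_rate = bp_won / bp_played
    base_rate = pts_won / pts_played
    return bp_rate / base_rate - 1.0


# Hypothetical server: wins 600 of 1,000 service points overall,
# and saves 66 of the 100 break points she faces.
edge = break_point_edge(bp_won=66, bp_played=100, pts_won=600, pts_played=1000)
print(f"{edge:+.1%}")
```

A positive number means the player does better on break points than her overall serve or return numbers would predict, and a negative number means she does worse.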

Serena’s experience isn’t unusual. From one year to the next, these rates aren’t persistent, meaning that most players don’t consistently win or lose many more break points than expected. Since 2006, Maria Sharapova has converted 1% fewer break points than expected. Caroline Wozniacki has recorded exactly the same rate, while Victoria Azarenka has converted 2% fewer break points than expected.

On serve, the story is similar, with a slight twist. Inexperienced players seem to perform a little worse when trying to convert a break point against a more experienced opponent, so most top players save break points about 4% more often than they win other service points. Serena, Sharapova, Wozniacki, Azarenka, and Petra Kvitova all have career rates at about this level.

Unlike in the men’s game, there’s little evidence that left-handers have a special advantage saving break points on serve. Angelique Kerber is a few percentage points above average, but Kvitova, Lucie Safarova, and Ekaterina Makarova are all within one percentage point of neutral.

While a few marginal players are as much as ten percentage points away from neutral saving break points or converting them, the main takeaway here is that no one is building a great career on the back of consistent clutch performances on break points. Among women with at least 250 tour-level matches in the last decade, only Barbora Strycova has won more than 3% more break points (serve and return combined) than expected. Maria Kirilenko is the only player more than 3% below expected.

This analysis doesn’t tell us anything very interesting about the intrinsic skills of our favorite players, but that doesn’t mean it’s without value. If we can count on almost all players posting average numbers over the long term, we can identify short-term extremes and predict that certain players will return to normal.

And that (finally) brings us to Venus Williams. Since 2006, Venus has played break points a little bit worse than average, saving 2% more break points than typical serve points (compared to +4% for most stars) and winning break points on return 3% less often than other return points.

But this year, Venus has saved break points 17% less often than typical service points, the lowest single-season number from someone who played more than 20 tour-level matches. That’s roughly once per match this year that Venus has failed to save a break point that–in an average year–she would’ve saved.

There’s no guarantee that saving those additional break points would’ve changed many of Venus’s results this year, but given the usual strength of her service game, holding serve even a little bit more would make a difference.

This type of analysis can’t say whether a rough patch like Venus’s is due to bad luck, mental lapses, or something else entirely, but it does suggest very strongly that she will bounce back. In fact, she already has. In her successful US Open run, she’s won about 66% of service points while saving 63% of break points. That’s not nearly as good as Serena’s performance this year, but it’s much closer to her own career average.

Like so many tennis stats that fluctuate from match to match or year to year, this is another one that evens out in the end. A particularly good or bad number probably isn’t a sign of a long-term trend. Instead, it’s a signal that the short-term streak is unlikely to last.

Will the US Open First-Round Bloodbath Benefit Serena Williams?

After only two days of play, the US Open women’s draw is a shell of its former self.

Ten seeds have been eliminated, only the fifth time in the 32-seed era that the number of first-round upsets has reached double digits. Four of the top ten seeds were among the victims, marking the first time since 1994 that so many top-tenners failed to reach the second round of a Grand Slam.

Things are particularly dramatic in the top half of the draw, where Serena Williams can now reach the final without playing a single top-ten opponent. In a single day of play, my (conservative) forecast of her chances of winning the tournament rose from 42% to 47%, only a small fraction of which owed to her defeat of Vitalia Diatchenko.

However, plenty of obstacles remain. Serena could face Agnieszka Radwanska or Madison Keys in the fourth round, and then Belinda Bencic–the last player to beat her–in the quarters. A possible semifinal opponent is Elina Svitolina, a rising star who took a set from Serena at this year’s Australian Open.

The first-round carnage didn’t include most of the players who have demonstrated they can challenge the top seed. Five of the last six players to beat Serena–Bencic, Petra Kvitova, Simona Halep, Venus Williams, and Garbine Muguruza–are still alive. Only Alize Cornet, the 27th seed who holds an improbable .500 career record against Serena, is out of the picture.

What’s more, early-round bloodbaths haven’t, in the past, cleared the way for favorites. In the 59 majors since 2001, when the number of seeds increased to 32, the number of first-round upsets has had little to do with the likelihood that the top seed goes on to win the tournament.

In 18 of those 59 Slams, four or fewer seeds were upset in the first round. The top seed went on to win five times. In 22 of the 59, five or six seeds were upset in the first round, and the top seed won eight times.

In the remaining 19 Slams, in which seven or more seeds were upset in the first round, the top seed won only five times. Serena has “lost” four of those events, most recently last year’s Wimbledon, when nine seeds fell in their opening matches and Cornet defeated her in the third round.
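
Putting those three groups side by side takes only a little arithmetic. The counts below are the ones given above; the variable and function names are just for illustration.

```python
# (seeds upset in the first round, number of Slams, titles won by the top seed)
buckets = [
    ("4 or fewer", 18, 5),
    ("5 or 6", 22, 8),
    ("7 or more", 19, 5),
]


def win_rate(slams, titles):
    """Share of Slams in a group that the top seed went on to win."""
    return titles / slams


for label, slams, titles in buckets:
    rate = win_rate(slams, titles)
    print(f"{label:>10} upsets: top seed won {titles} of {slams} ({rate:.0%})")
```

The rates work out to roughly 28%, 36%, and 26%, with no sign that more first-round upsets help the top seed.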

This is necessarily a small sample, and even setting aside statistical qualms, it doesn’t tell the whole story. While Serena has failed to win four of these carnage-ridden majors, she has won three more of them when she wasn’t the top seed, including the 2012 US Open, when ten seeds lost in the first round and Williams went on to beat Victoria Azarenka in the final.

Taken together, the evidence is decidedly mixed. With the exception of Cornet, the ten defeated seeds aren’t the ones Serena would’ve chosen to remove from her path. While her odds have improved a bit on paper, the path through Keys, Bencic, Svitolina, and Halep or Kvitova in the final is as difficult as any she was likely to face.