How Elo Solves the Olympics Ranking Points Conundrum

Italian translation at settesei.it

Last week’s Olympic tennis tournament had superstars, it had drama, and it had tears, but it didn’t have ranking points. Surprise medalists Monica Puig and Juan Martin del Potro scored huge triumphs for themselves and their countries, yet they still languish at 35th and 141st in their respective tours’ rankings.

The official ATP and WTA rankings have always represented a collection of compromises, as they try to accomplish dual goals of rewarding certain behaviors (like showing up for high-profile events) and identifying the best players for entry in upcoming tournaments. Stripping the Olympics of ranking points altogether was an even weirder compromise than usual. Four years ago in London, some points were awarded and almost all the top players on both tours showed up, even though many of them could’ve won more points playing elsewhere.

For most players, the chance at Olympic gold was enough. The level of competition was quite high, so while the ATP and WTA tours treat the tournament in Rio as a mere exhibition, those of us who want to measure player ability and make forecasts must factor Olympics results into our calculations.

Elo, a rating system originally designed for chess that I’ve been using for tennis for the past year, is an excellent tool for integrating Rio results with the rest of this season’s wins and losses. Broadly speaking, it awards points to match winners and subtracts points from losers. Beating a top player is worth many more points than beating a lower-rated one. There is no penalty for not playing–for example, Stan Wawrinka’s and Simona Halep’s ratings are unchanged from a week ago.
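For readers who want to see the mechanics, here is a minimal sketch of a textbook Elo update. The K-factor of 32 and the two ratings are assumptions made for illustration; the post does not say which K-factor or other tuning the Tennis Abstract ratings use.

```python
def expected_score(rating, opponent_rating):
    """Win probability for `rating` under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update(winner, loser, k=32.0):
    """Return (new_winner, new_loser) after one match.

    k is an assumed constant, not a value taken from the article.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    # Whatever the winner gains, the loser gives up.
    return winner + gain, loser - gain


# Hypothetical ratings, for illustration only.
underdog, favorite = 2000.0, 2400.0
print(update(underdog, favorite))  # upset: a large transfer of points
print(update(favorite, underdog))  # expected result: a small transfer
```

A player who sits out a week never passes through `update`, which is why an idle player's rating stays exactly where it was.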

Unlike the ATP and WTA ranking systems, which award points based on the level of tournament and round, Elo is context-neutral. Del Potro’s Elo rating improved quite a bit thanks to his first-round upset of Novak Djokovic–the same amount it would have increased if he had beaten Djokovic in, say, the Toronto final.

Many fans object to this, on the reasonable assumption that context matters. It certainly seems like the Wimbledon final should count for more than, say, a Monte Carlo quarterfinal, even if the same player defeats the same opponent in both matches.

However, results matter for ranking systems, too. A good rating system will do two things: predict winners correctly more often than other systems, and give more accurate degrees of confidence for those predictions. (For example, in a sample of 100 matches in which the system gives one player a 70% chance of winning, the favorite should win about 70 times.) Elo, with its ignorance of context, predicts more winners and gives more accurate forecast certainties than any other system I’m aware of.
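The two tests described above can be checked with a few lines of code. This is only a sketch: the function names, the 10-point bins, and the sample data are invented for illustration, and the Brier score is one common way to measure forecast confidence, not necessarily the one used for the comparisons in this post.

```python
from collections import defaultdict


def calibration_table(forecasts, bin_pct=10):
    """Group (favorite_win_prob, favorite_won) pairs into percentage bins.

    Returns {bin_low_pct: (n_matches, mean_forecast, observed_win_rate)}.
    """
    bins = defaultdict(list)
    for prob, won in forecasts:
        pct = round(prob * 100)  # whole percents avoid float-edge surprises
        low = min(pct // bin_pct * bin_pct, 100 - bin_pct)
        bins[low].append((prob, won))
    table = {}
    for low in sorted(bins):
        rows = bins[low]
        n = len(rows)
        mean_forecast = sum(p for p, _ in rows) / n
        win_rate = sum(1 for _, w in rows if w) / n
        table[low] = (n, mean_forecast, win_rate)
    return table


def brier_score(forecasts):
    """Mean squared gap between forecast and outcome (lower is better)."""
    return sum((p - (1.0 if w else 0.0)) ** 2 for p, w in forecasts) / len(forecasts)


# Made-up sample: 100 matches forecast at 70%, of which the favorite wins 70.
sample = [(0.7, True)] * 70 + [(0.7, False)] * 30
print(calibration_table(sample))
print(brier_score(sample))
```

A well-calibrated system shows an observed win rate close to the mean forecast in every bin; comparing Brier scores across systems on the same matches shows which one gives the more accurate confidence levels.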

For one thing, it wipes the floor with the official rankings. While it’s possible that tweaking Elo with context-aware details would better the results even more, the improvement would likely be minor compared to the massive difference between Elo’s accuracy and that of the ATP and WTA algorithms.

Relying on a context-neutral system is perfect for tennis. Instead of altering the ranking system with every change in tournament format, we can always rate players the same way, using only their wins, losses, and opponents. In the case of the Olympics, it doesn’t matter which players participate, or what anyone thinks about the overall level of play. If you defeat a trio of top players, as Puig did, your rating skyrockets. Simple as that.

Two weeks ago, Puig was ranked 49th among WTA players by Elo–a dozen places lower than her WTA ranking of 37. After beating Garbine Muguruza, Petra Kvitova, and Angelique Kerber, her Elo ranking jumped to 22nd. While it’s tough, intuitively, to know just how much weight to assign to such an outlier of a result, her Elo ranking just outside the top 20 seems much more plausible than Puig’s effectively unchanged WTA ranking in the mid-30s.

Del Potro is another interesting test case, as his injury-riddled career presents difficulties for any rating system. According to the ATP algorithm, he is still outside the top 100 in the world–a common predicament for once-elite players who don’t immediately return to winning ways.

Elo has the opposite problem with players who miss a lot of time due to injury. When a player doesn’t compete, Elo assumes his level doesn’t change. That’s clearly wrong, and it has cast a lot of doubt over del Potro’s place in the Elo rankings this season. The more matches he plays, the more his rating will reflect his current ability, but his #10 position in the pre-Olympics Elo rankings seemed overly influenced by his former greatness.

(A more sophisticated Elo-based system, Glicko, was created in part to improve ratings for competitors with few recent results. I’ve tinkered with Glicko quite a bit in hopes of more accurately measuring the current levels of players like Delpo, but so far, the system as a whole hasn’t come close to matching Elo’s accuracy while also addressing the problem of long layoffs. For what it’s worth, Glicko ranked del Potro around #16 before the Olympics.)
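For context, the piece of Glicko that deals with layoffs is its rating deviation (RD), an uncertainty figure that widens while a player is idle. Below is a minimal sketch of that step as described in Glickman's published Glicko system; the constant `c`, the starting RD of 50, and the length of a rating period are illustrative assumptions, not values from the experiments mentioned above.

```python
import math

MAX_RD = 350.0  # the deviation Glicko assigns to an unrated player


def inflate_rd(rd, periods_inactive, c):
    """Widen a rating deviation after some number of idle rating periods."""
    return min(math.sqrt(rd ** 2 + c ** 2 * periods_inactive), MAX_RD)


# Illustrative choice: c is set so that an RD of 50 drifts back up to 350
# after 100 idle rating periods.
c = math.sqrt((MAX_RD ** 2 - 50.0 ** 2) / 100)
for idle in (0, 10, 50, 100):
    print(idle, round(inflate_rd(50.0, idle, c), 1))
```

The rating itself does not move during the layoff. The larger RD simply lets the first results after a comeback shift the rating faster than they would for an active player.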

Del Potro’s success in Rio boosted him three places in the Elo rankings, up to #7. While that still owes something to the lingering influence of his pre-injury results, it’s the first time his post-injury Elo rating comes close to passing the smell test.

You can see the full current lists elsewhere on the site: here are ATP Elo ratings and WTA Elo ratings.

Any rating system is only as good as the assumptions and data that go into it. The official ATP and WTA ranking systems have long suffered from improvised assumptions and conflicting goals. When an important event like the Olympics is excluded altogether, the data is incomplete as well. Now as much as ever, Elo shines as an alternative method. In addition to a more predictive algorithm, Elo can give Rio results the weight they deserve.

New at TennisAbstract: Weekly Elo Reports

Starting today, you can find weekly Elo ranking reports on the home page of Tennis Abstract. Here are the men’s ratings, and here are the women’s ratings.

Elo is a rating system originally designed for chess, and now used across a wide range of sports. It awards points based on who you beat, not when you beat them. That’s in direct contrast to the official ATP and WTA ranking systems, which award points based on tournament and round, regardless of whether you play a qualifier or the number one player in the world.

As such, there are some notable differences between Elo-based rankings and the official lists. In addition to some rearrangement in the top ten, ATP Elo ratings place last week’s champion Roberto Bautista Agut up at #12 (compared to #17 in the official ranking) and Jack Sock at #13 (instead of #23).

The shuffling is even more dramatic on the women’s side. Belinda Bencic, still outside the top ten in the official WTA ranking, is up to #5 by Elo. After her Fed Cup heroics last weekend, Bencic is a single Elo point away from drawing equal with #4 Angelique Kerber.

These new Elo reports also show peaks for every player. That way, you can see how close each player is to his or her career best. You can also spot which players–like Bencic and Bautista Agut–are currently at their peak.

Like any rating system, Elo isn’t perfect. In this simple form, it doesn’t consider surface at all. I haven’t factored Challenger, ITF, or qualifying results into these calculations, either. Elo also doesn’t make any adjustments when a player misses considerable time to injury; a player just re-assumes his or her old rating when they return.

That said, Elo is a more reliable way of comparing players and predicting match outcomes than the official ranking system. And now, you can check in on each player’s rating every week.

Elo-Forecasting the WTA Tour Finals in Singapore

With the field of eight divided into two round-robin groups for the WTA Tour Finals in Singapore, we can play around with some forecasts for this event. I’ve updated my Elo ratings through last week’s tournaments, and the first thing that jumps out is how different they are from the official rankings.

Here’s the Singapore field:

EloRank  Player                Elo  Group  
2        Maria Sharapova      2296    RED  
4        Simona Halep         2181    RED  
6        Garbine Muguruza     2147  WHITE  
8        Petra Kvitova        2136  WHITE  
9        Angelique Kerber     2129  WHITE  
11       Agnieszka Radwanska  2100    RED  
15       Lucie Safarova       2051  WHITE  
21       Flavia Pennetta      2004    RED

Serena Williams (#1 in just about every imaginable ranking system) chose not to play, but if Elo ruled the day, Belinda Bencic, Venus Williams, and Victoria Azarenka would be playing this week in place of Agnieszka Radwanska, Lucie Safarova, and Flavia Pennetta.

Anyway, we’ll work with what we’ve got. Maria Sharapova is, according to Elo, a huge favorite here. The ratings translate into a forecast that looks like this:

Player                  SF  Final  Title  
Maria Sharapova      83.7%  61.1%  43.6%  
Simona Halep         60.8%  35.4%  15.9%  
Garbine Muguruza     59.4%  25.7%  11.3%  
Petra Kvitova        55.2%  23.0%   9.8%  
Angelique Kerber     53.1%  21.7%   8.8%  
Agnieszka Radwanska  37.4%  17.4%   6.1%  
Lucie Safarova       32.3%   9.7%   3.1%  
Flavia Pennetta      18.1%   6.0%   1.4%
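A forecast of this kind can be approximated by simulating the event many times. The sketch below is not the code behind the table above: it assumes the standard Elo win-probability curve, breaks round-robin ties at random instead of applying the tour's set and game tiebreakers, and pairs each group winner with the other group's runner-up in the semifinals. Its output should land in the same neighborhood as the table but will not match it exactly.

```python
import random
from collections import Counter

# Elo ratings copied from the Singapore field table above.
RATINGS = {
    "Sharapova": 2296, "Halep": 2181, "Radwanska": 2100, "Pennetta": 2004,
    "Muguruza": 2147, "Kvitova": 2136, "Kerber": 2129, "Safarova": 2051,
}
RED = ["Sharapova", "Halep", "Radwanska", "Pennetta"]
WHITE = ["Muguruza", "Kvitova", "Kerber", "Safarova"]


def win_prob(a, b):
    """Chance that player a beats player b, from the rating gap alone."""
    return 1.0 / (1.0 + 10 ** ((RATINGS[b] - RATINGS[a]) / 400.0))


def play(a, b, rng):
    return a if rng.random() < win_prob(a, b) else b


def group_top_two(group, rng):
    """Play a full round robin and return (winner, runner_up)."""
    wins = Counter({p: 0 for p in group})
    for i, a in enumerate(group):
        for b in group[i + 1:]:
            wins[play(a, b, rng)] += 1
    # Assumption: ties on wins are broken at random.
    order = sorted(group, key=lambda p: (wins[p], rng.random()), reverse=True)
    return order[0], order[1]


def simulate(n=20000, seed=1):
    """Return {player: (P(semifinal), P(final), P(title))}."""
    rng = random.Random(seed)
    semis, finals, titles = Counter(), Counter(), Counter()
    for _ in range(n):
        r1, r2 = group_top_two(RED, rng)
        w1, w2 = group_top_two(WHITE, rng)
        semis.update([r1, r2, w1, w2])
        f1, f2 = play(r1, w2, rng), play(w1, r2, rng)
        finals.update([f1, f2])
        titles[play(f1, f2, rng)] += 1
    return {p: (semis[p] / n, finals[p] / n, titles[p] / n) for p in RATINGS}


results = simulate()
for player, (sf, final, title) in sorted(results.items(), key=lambda kv: -kv[1][2]):
    print(f"{player:<10} {sf:6.1%} {final:6.1%} {title:6.1%}")
```

Changing a single entry in `RATINGS` and rerunning is enough to test a what-if scenario, such as a lower rating for a player returning from injury.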

If Sharapova is really that good, the loser in today’s draw was Simona Halep. The top seed would typically benefit from having the second seed in the other group, but because Garbine Muguruza recently took over the third spot in the rankings, Pova entered the draw as a dangerous floater.

However, these ratings don’t reflect the fact that Sharapova hasn’t completed a match since Wimbledon. They don’t decline with inactivity, so Pova’s rating is the same as it was the day after she lost to Serena back in July. (My algorithm also excludes retirements, so her attempted return in Wuhan isn’t considered.)

With as little as we know about Sharapova’s health, it’s tough to know how to tweak her rating. For lack of any better ideas, I revised her Elo rating to 2132, right between Petra Kvitova and Angelique Kerber. At her best, Sharapova is better than that, but consider this a way of factoring in the substantial possibility that she’ll play much, much worse–or that she’ll get injured and her matches will be played by Carla Suarez Navarro instead. The revised forecast:

Player                  SF  Final  Title  
Simona Halep         69.9%  40.9%  24.0%  
Garbine Muguruza     59.4%  31.5%  16.5%  
Maria Sharapova      57.6%  29.5%  14.5%  
Petra Kvitova        55.6%  28.4%  14.4%  
Angelique Kerber     52.5%  26.3%  13.2%  
Agnieszka Radwanska  47.9%  22.3%   9.9%  
Lucie Safarova       32.6%  12.9%   4.9%  
Flavia Pennetta      24.7%   8.3%   2.7%

If this is a reasonably accurate estimate of Sharapova’s current ability, the Red group suddenly looks like the right place to be. Because Elo doesn’t give any particular weight to Grand Slams, it suggests that the official rankings far overestimate the current level of Safarova and Pennetta. The weakness of those two makes Halep a very likely semifinalist and also means that, in this forecast, the winner of the tournament is more likely (54% to 46%) to come from the White group.

Without Serena, and with Sharapova’s health in question, there are simply no dominant players in the field this week. If nothing else, these forecasts illustrate that we’d be foolish to take any Singapore predictions too seriously.

The Case for Novak Djokovic … and Roger Federer … and Rafael Nadal

Italian translation at settesei.it

By winning the US Open last weekend and increasing his career total to ten Grand Slams, Novak Djokovic has pushed himself even further into conversations about the greatest of all time. At the very least, his 2015 season is shaping up to be one of the best in tennis history.

A recent FiveThirtyEight article introduced Elo ratings into the debate, showing that Djokovic’s career peak–achieved earlier this year at the French Open–is the highest of anyone’s, just above 2007 Roger Federer and 1980 Bjorn Borg. In implementing my own Elo ratings, I’ve discovered just how close those peaks are.

Here are my results for the top 15 peaks of all time [1]:

Player                 Year   Elo  
Novak Djokovic         2015  2525  
Roger Federer          2007  2524  
Bjorn Borg             1980  2519  
John McEnroe           1985  2496  
Rafael Nadal           2013  2489  
Ivan Lendl             1986  2458  
Andy Murray            2009  2388  
Jimmy Connors          1979  2384  
Boris Becker           1990  2383  
Pete Sampras           1994  2376  
Andre Agassi           1995  2355  
Mats Wilander          1984  2355  
Juan Martin del Potro  2009  2352  
Stefan Edberg          1988  2346  
Guillermo Vilas        1978  2325

A one-point gap is effectively nothing: It means that peak Djokovic would have a 50.1% chance of beating peak Federer. The 35-point gap separating Novak from peak Rafael Nadal is considerably more meaningful, implying that the better player has a 55% chance of winning.
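The conversion from rating gap to win probability is a one-line formula. The sketch below uses the standard Elo curve (base 10, scale 400), which is consistent with the percentages quoted in this post; the function names are made up for illustration.

```python
import math


def win_probability(gap):
    """Chance that the higher-rated player wins, given a gap in Elo points."""
    return 1.0 / (1.0 + 10 ** (-gap / 400.0))


def gap_for(probability):
    """Inverse: the rating gap that corresponds to a given win probability."""
    return 400.0 * math.log10(probability / (1.0 - probability))


# Gaps mentioned in this post.
for gap in (1, 35, 100, 200, 500):
    print(gap, f"{win_probability(gap):.1%}")
```

Because the curve flattens as the gap grows, each additional 100 points adds less to the favorite's chances than the previous 100 did.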

Surface-specific Elo

If we limit our scope to hard-court matches, Djokovic is still a very strong contender, but Fed’s 2007 peak is clearly the best of all time:

Player          Year  Hard Ct Elo  
Roger Federer   2007         2453  
Novak Djokovic  2014         2418  
Ivan Lendl      1989         2370  
Pete Sampras    1997         2356  
Rafael Nadal    2014         2342  
John McEnroe    1986         2332  
Andy Murray     2009         2330  
Andre Agassi    1995         2326  
Stefan Edberg   1987         2285  
Lleyton Hewitt  2002         2262

Ivan Lendl and Pete Sampras make much better showings on this list than on the overall ranking. Still, they are far behind Fed and Novak–the roughly 100-point difference between peak Fed and peak Pete is equivalent to a 64% probability that the higher-rated player would win.

On clay, I’ll give you three guesses who tops the list–and your first two guesses don’t count. It isn’t even close:

Player           Year  Clay Ct Elo  
Rafael Nadal     2009         2550  
Bjorn Borg       1982         2475  
Novak Djokovic   2015         2421  
Ivan Lendl       1988         2408  
Mats Wilander    1984         2386  
Roger Federer    2009         2343  
Jose Luis Clerc  1981         2318  
Guillermo Vilas  1982         2316  
Thomas Muster    1996         2313  
Jimmy Connors    1980         2307

Borg was great, but Nadal is in another league entirely. Though Djokovic has pushed Nadal out of many greatest-of-all-time debates–at least for the time being–there’s little doubt that Rafa is the greatest clay court player of all time, and likely the most dominant player in tennis history on any single surface.

Djokovic is well back of both Nadal and Borg, but in his favor, he’s the only player ranked in the top three for both major surfaces.

The survivor

As the second graph in the 538 article shows, Federer stands out as the greatest player of all time at his age. Most players have retired long before their 34th birthday, and even those who stick around aren’t usually contesting Grand Slam finals. In fact, Federer’s Elo rating of 2393 after his US Open semifinal win against Stanislas Wawrinka last week would rank as the sixth-highest peak of all time, behind Lendl and just ahead of Andy Murray.

Here are the top ten Elo peaks for players aged 34 and over:

Player         Age   34+ Elo  
Roger Federer  34.1     2393  
Jimmy Connors  34.1     2234  
Andre Agassi   35.3     2207  
Rod Laver      36.6     2207  
Ken Rosewall   37.4     2195  
Tommy Haas     35.3     2111  
Arthur Ashe    35.7     2107  
Ivan Lendl     34.1     2054  
Andres Gimeno  35.0     2035  
Mark Cox       34.0     2014

The 160-point gap between Federer and Jimmy Connors implies that 34-year-old Fed would win about 70% of the time against 34-year-old Connors. No one has ever sustained this level of play–or anything close to it–for this long.

At the risk of belaboring the point, similar arguments can be made for 33-year-old Fed, all the way to 30-year-old Fed. At almost any stage in the last four years, Federer has been better than any player in history at that age [2]. Djokovic has matched many of Roger’s career accomplishments so far, especially on clay, but it would be truly remarkable if he maintained a similar level of play through the end of the decade.

Current Elo ratings

While it’s not really germane to today’s subject, I’ve got the numbers, so let’s take a look at the current ATP Elo ratings. Since Elo is new to most tennis fans, I’ve included columns to indicate each player’s chances of beating Djokovic and of beating the current #10, Milos Raonic, based on their rating. As a general rule, a 100-point gap translates to a 64% chance of winning for the favorite, a 200-point gap implies 76%, and a 500-point gap is equivalent to 95%.

Rank  Player                  Elo  Vs #1  Vs #10  
1     Novak Djokovic         2511      -     91%  
2     Roger Federer          2386    33%     84%  
3     Andy Murray            2332    26%     79%  
4     Kei Nishikori          2256    19%     71%  
5     Rafael Nadal           2256    19%     71%  
6     Stan Wawrinka          2186    13%     62%  
7     David Ferrer           2159    12%     58%  
8     Tomas Berdych          2148    11%     56%  
9     Richard Gasquet        2128    10%     54%  
10    Milos Raonic           2103     9%       -  
                                                  
Rank  Player                  Elo  Vs #1  Vs #10  
11    Gael Monfils           2084     8%     47%  
12    Jo-Wilfried Tsonga     2083     8%     47%  
13    Marin Cilic            2081     8%     47%  
14    Kevin Anderson         2074     7%     46%  
15    John Isner             2035     6%     40%  
16    David Goffin           2027     6%     39%  
17    Grigor Dimitrov        2021     6%     38%  
18    Gilles Simon           2005     5%     36%  
19    Jack Sock              1994     5%     35%  
20    Roberto Bautista Agut  1986     5%     34%  
                                                  
Rank  Player                  Elo  Vs #1  Vs #10  
21    Philipp Kohlschreiber  1982     5%     33%  
22    Tommy Robredo          1963     4%     31%  
23    Feliciano Lopez        1955     4%     30%  
24    Nick Kyrgios           1951     4%     29%  
25    Ivo Karlovic           1949     4%     29%  
26    Jeremy Chardy          1940     4%     28%  
27    Alexandr Dolgopolov    1940     4%     28%  
28    Bernard Tomic          1936     4%     28%  
29    Fernando Verdasco      1932     3%     27%  
30    Fabio Fognini          1925     3%     26%


How Elo Rates US Open Finalists Flavia Pennetta and Roberta Vinci

Italian translation at settesei.it

Among the many good things that have happened to Flavia Pennetta and Roberta Vinci after reaching the final of this year’s US Open, both enjoyed huge leaps in Monday’s official WTA rankings. Pennetta rose from 26th to 8th, and Vinci jumped from 43rd to 19th.

Such large changes in rankings are always a little suspicious and expose the weakness of systems that award points based on round achieved. A lucky draw or one incredible outlier of a match doesn’t mean that a player is suddenly massively better than she was a couple of weeks ago.

To put it another way: As they are, the official rankings do a decent job of representing how a player has performed. What they don’t do so well is represent how well someone is playing, or the closely related issue of how well she will play.

For that, we can turn to Elo ratings, which Carl Bialik and Benjamin Morris used at the beginning of the US Open to compare Serena Williams to other all-time greats [1]. Elo awards points based on opponent quality, not the importance of the tournament or round. As such, the system provides a better estimate of the current skill level of each player than the official rankings do.

Sure enough, Elo agrees with my hypothesis, that Pennetta didn’t suddenly become the 8th best player in the world. Instead, she rose to 17th, just behind Garbine Muguruza (another Slam finalist overestimated by the rankings) and ahead of Elina Svitolina. Vinci didn’t really return to the top 20, either: Elo places her 34th, between Camila Giorgi and Barbora Strycova.

While her official ranking of 8th is Pennetta’s career high, Elo disagrees again. The system claims that Pennetta peaked during the US Open six years ago, after a strong summer that involved semifinal-or-better showings in four straight tournaments, plus a fourth-round win over Vera Zvonareva in New York. She’s more than 100 points below that career-high level, equivalent to the present gap between her and 7th-Elo-rated Angelique Kerber.

The current Elo rankings hold plenty of surprises like this, having little in common with the official rankings:

Rank  Player                 Elo  
1     Serena Williams       2460  
2     Maria Sharapova       2298  
3     Victoria Azarenka     2221  
4     Simona Halep          2204  
5     Petra Kvitova         2174  
6     Belinda Bencic        2144  
7     Angelique Kerber      2130  
8     Venus Williams        2126  
9     Caroline Wozniacki    2095  
10    Lucie Safarova        2084

Rank  Player                 Elo   
11    Ana Ivanovic          2078  
12    Carla Suarez Navarro  2062  
13    Agnieszka Radwanska   2054  
14    Timea Bacsinszky      2041  
15    Sloane Stephens       2031  
16    Garbine Muguruza      2031  
17    Flavia Pennetta       2030  
18    Elina Svitolina       2023  
19    Madison Keys          2019  
20    Jelena Jankovic       2016

While Victoria Azarenka is still nearly 200 points shy of her peak, Elo gives her credit for the extremely tough draws that have met her return from injury. Another player rated much higher here than in the WTA rankings is Belinda Bencic, whose defeat of Serena launched her into the top ten.

The oldest final

Pennetta and Vinci are both unusually old for Slam finalists, not to mention players who reached that milestone for the first time. Elo doesn’t consider them among the very best players active today, but next to other 32- and 33-year-olds in WTA history, they compare very well indeed.

Among players 33 or older, Pennetta’s current rating is sixth best in the last thirty-plus years [2]. As the all-time list shows, that puts her in extraordinarily good company:

Rank  Player                Age   Elo  
1     Martina Navratilova  33.4  2527  
2     Serena Williams      33.9  2480  
3     Chris Evert          33.4  2412  
4     Venus Williams       33.3  2175  
5     Nathalie Tauziat     33.9  2088  
6     Flavia Pennetta      33.5  2030  
7     Wendy Turnbull       33.1  2018  
8     Conchita Martinez    33.3  2014

In the 32-and-over category, Vinci stands out as well. Her lower rating, combined with the somewhat larger pool of players who remained competitive to that age, means that she holds 24th place in this age group. For a player who has never cracked the top ten, 24th of all time is an impressive accomplishment.

Keep an eye out for more Elo-based analysis here. Soon, I’ll be able to post and update Elo ratings on Tennis Abstract and, once a few more kinks are worked out, use them to improve the WTA tournament forecasts on the site as well.
