Forecasting the 2016 ATP World Tour Finals

Italian translation at settesei.it

Andy Murray is the #1 seed this week in London, but as I wrote for The Economist, Novak Djokovic likely remains the best player in the world. According to my Elo ratings, he would have a 63% chance of winning a head-to-head match between the two. And with the added benefit of an easier round-robin draw, the math heavily favors Djokovic to win the tournament.

Here are the results of a Monte Carlo simulation of the draw:

Player        SF      F      W  
Djokovic   95.3%  73.9%  54.6%  
Murray     86.3%  58.3%  29.7%  
Nishikori  60.4%  24.9%   7.8%  
Raonic     50.9%  16.3%   3.3%  
Wawrinka   29.4%   7.8%   1.6%  
Monfils    33.2%   8.7%   1.4%  
Cilic      23.9%   5.8%   1.1%  
Thiem      20.7%   4.1%   0.5%

I don’t think I’ve ever seen a player favored so heavily to progress out of the group stage. Murray’s 86% chance of doing so is quite high in itself; Novak’s 95% is otherworldly. His head-to-heads against the other players in his group are backed up by major differences in Elo points–Dominic Thiem is a lowly 15th on the Elo list, given only a 7.4% chance of beating the Serb.
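To give a feel for how rating gaps map to match probabilities, here is a minimal sketch. It assumes the standard logistic Elo formula on a 400-point scale; the exact parameters behind the ratings quoted here aren't given, so treat the function names and the implied gaps as illustrative rather than as the actual model.

```python
import math

def elo_win_prob(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def rating_gap_for(prob: float) -> float:
    """Inverse of the above: rating gap (A minus B) implied by A's win probability."""
    return 400.0 * math.log10(prob / (1.0 - prob))

# Gaps implied by the two probabilities quoted above (assumes the 400-point scale).
print(round(rating_gap_for(0.63)))       # favorite at 63%
print(round(rating_gap_for(1 - 0.074)))  # underdog at 7.4%
```

Under that assumption, a 63% favorite is roughly 90 points ahead, while a 7.4% underdog trails by roughly 440.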

If Milos Raonic is unable to compete, Djokovic’s chances climb even higher. Here are the probabilities if David Goffin takes Raonic’s place in the bracket:

Player        SF      F      W  
Djokovic   96.8%  75.2%  55.4%  
Murray     86.2%  60.7%  30.6%  
Nishikori  60.7%  26.3%   8.1%  
Monfils    47.7%  12.4%   1.8%  
Wawrinka   29.3%   8.5%   1.7%  
Cilic      23.8%   6.2%   1.1%  
Thiem      29.5%   5.8%   0.7%  
Goffin     26.0%   4.9%   0.5%

The luck of the draw was on Novak’s side. I ran another simulation with Djokovic and Murray swapping groups. Here, Djokovic is still heavily favored to win the tournament, but Murray’s semifinal chances get a sizable boost:

Player        SF      F      W  
Djokovic   92.8%  75.1%  54.9%  
Murray     90.9%  58.1%  29.8%  
Nishikori  58.4%  26.9%   7.5%  
Raonic     52.3%  14.3%   3.3%  
Wawrinka   26.9%   8.4%   1.6%  
Monfils    35.3%   7.5%   1.4%  
Cilic      21.9%   6.2%   1.0%  
Thiem      21.6%   3.4%   0.5%

Elo rates Djokovic so highly that he is favored no matter what the draw. But the draw certainly helped.
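For readers curious about the mechanics, a round-robin Monte Carlo forecast of this kind can be sketched in a few dozen lines. This is not the code behind the tables above. The ratings and groups are placeholders, and ties on match wins are broken at random, which is a simplification of the tour's actual tiebreak rules.

```python
import random
from collections import Counter

def elo_win_prob(ra, rb):
    """Standard Elo expectation that the player rated ra beats the one rated rb."""
    return 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))

def play(a, b, ratings, rng):
    """Simulate one match and return the winner's name."""
    return a if rng.random() < elo_win_prob(ratings[a], ratings[b]) else b

def group_stage(group, ratings, rng):
    """Play a round robin and return (group winner, runner-up).
    Assumption: ties on match wins are broken at random."""
    wins = {p: 0 for p in group}
    for i, a in enumerate(group):
        for b in group[i + 1:]:
            wins[play(a, b, ratings, rng)] += 1
    order = sorted(group, key=lambda p: (wins[p], rng.random()), reverse=True)
    return order[0], order[1]

def simulate(groups, ratings, n=20000, seed=0):
    """Return {player: (P(semifinal), P(final), P(title))}."""
    rng = random.Random(seed)
    sf, fin, title = Counter(), Counter(), Counter()
    for _ in range(n):
        (a1, a2), (b1, b2) = [group_stage(g, ratings, rng) for g in groups]
        sf.update([a1, a2, b1, b2])
        # Each group winner plays the other group's runner-up.
        f1 = play(a1, b2, ratings, rng)
        f2 = play(b1, a2, ratings, rng)
        fin.update([f1, f2])
        title[play(f1, f2, ratings, rng)] += 1
    return {p: (sf[p] / n, fin[p] / n, title[p] / n) for p in ratings}

# Hypothetical ratings and groups, for illustration only.
ratings = {"P1": 2500, "P2": 2400, "P3": 2250, "P4": 2200,
           "P5": 2150, "P6": 2150, "P7": 2100, "P8": 2050}
groups = [["P1", "P4", "P5", "P8"], ["P2", "P3", "P6", "P7"]]

for player, (p_sf, p_f, p_w) in simulate(groups, ratings).items():
    print(f"{player}  {p_sf:6.1%} {p_f:6.1%} {p_w:6.1%}")
```

Swapping two players between the entries of `groups` and rerunning is all it takes to test a "what if the draw had gone differently" scenario like the one above.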

Doubles!

I’ve finally put together a sufficient doubles dataset to generate Elo ratings and tournament forecasts for ATP doubles. While I’m not quite ready to go into detail, I can say that, by using the Elo algorithm and rating players individually, the resulting forecasts outperform the ATP rankings about as much as singles Elo ratings do.

Here is the forecast for the doubles event at the World Tour Finals:

Team               SF      F      W  
Herbert/Mahut   76.4%  49.5%  32.1%  
Bryan/Bryan     68.7%  36.8%  19.9%  
Kontinen/Peers  55.7%  29.1%  13.8%  
Dodig/Melo      58.4%  28.1%  13.2%  
Murray/Soares   48.3%  20.8%   8.6%  
Lopez/Lopez     37.7%  16.4%   6.2%  
Klaasen/Ram     30.2%  11.9%   4.0%  
Huey/Mirnyi     24.6%   7.3%   2.2%

This distribution is more like what round-robin forecasts usually look like, without a massive gap between the top of the field and the rest. Pierre-Hugues Herbert and Nicolas Mahut are the top-rated team, followed closely by Bob Bryan and Mike Bryan. Max Mirnyi was, at his peak, one of the highest Elo-rated doubles players, but his pairing with Treat Huey is the weakest of the bunch.

The men’s doubles bracket has some legendary names, along with some players–like Herbert and Henri Kontinen–who may develop into all-time greats, but it has no competitors who loom over the rest of the field like Murray and Djokovic do in singles.

Elo-Forecasting the WTA Tour Finals in Singapore

With the field of eight divided into two round-robin groups for the WTA Tour Finals in Singapore, we can play around with some forecasts for this event. I’ve updated my Elo ratings through last week’s tournaments, and the first thing that jumps out is how different they are from the official rankings.

Here’s the Singapore field:

EloRank  Player                Elo  Group  
2        Maria Sharapova      2296    RED  
4        Simona Halep         2181    RED  
6        Garbine Muguruza     2147  WHITE  
8        Petra Kvitova        2136  WHITE  
9        Angelique Kerber     2129  WHITE  
11       Agnieszka Radwanska  2100    RED  
15       Lucie Safarova       2051  WHITE  
21       Flavia Pennetta      2004    RED

Serena Williams (#1 in just about every imaginable ranking system) chose not to play, but if Elo ruled the day, Belinda Bencic, Venus Williams, and Victoria Azarenka would be playing this week in place of Agnieszka Radwanska, Lucie Safarova, and Flavia Pennetta.

Anyway, we’ll work with what we’ve got. Maria Sharapova is, according to Elo, a huge favorite here. The ratings translate into a forecast that looks like this:

Player                  SF  Final  Title  
Maria Sharapova      83.7%  61.1%  43.6%  
Simona Halep         60.8%  35.4%  15.9%  
Garbine Muguruza     59.4%  25.7%  11.3%  
Petra Kvitova        55.2%  23.0%   9.8%  
Angelique Kerber     53.1%  21.7%   8.8%  
Agnieszka Radwanska  37.4%  17.4%   6.1%  
Lucie Safarova       32.3%   9.7%   3.1%  
Flavia Pennetta      18.1%   6.0%   1.4%

If Sharapova is really that good, the loser in today’s draw was Simona Halep. The top seed would typically benefit from having the second seed in the other group, but because Garbine Muguruza recently took over the third spot in the rankings, Pova entered the draw as a dangerous floater.

However, these ratings don’t reflect the fact that Sharapova hasn’t completed a match since Wimbledon. They don’t decline with inactivity, so Pova’s rating is the same as it was the day after she lost to Serena back in July. (My algorithm also excludes retirements, so her attempted return in Wuhan isn’t considered.)

With as little as we know about Sharapova’s health, it’s tough to know how to tweak her rating. For lack of any better ideas, I revised her Elo rating to 2132, right between Petra Kvitova and Angelique Kerber. At her best, Sharapova is better than that, but consider this a way of factoring in the substantial possibility that she’ll play much, much worse–or that she’ll get injured and her matches will be played by Carla Suarez Navarro instead. The revised forecast:

Player                  SF  Final  Title  
Simona Halep         69.9%  40.9%  24.0%  
Garbine Muguruza     59.4%  31.5%  16.5%  
Maria Sharapova      57.6%  29.5%  14.5%  
Petra Kvitova        55.6%  28.4%  14.4%  
Angelique Kerber     52.5%  26.3%  13.2%  
Agnieszka Radwanska  47.9%  22.3%   9.9%  
Lucie Safarova       32.6%  12.9%   4.9%  
Flavia Pennetta      24.7%   8.3%   2.7%

If this is a reasonably accurate estimate of Sharapova’s current ability, the Red group suddenly looks like the right place to be. Because Elo doesn’t give any particular weight to Grand Slams, it suggests that the official rankings far overestimate the current level of Safarova and Pennetta. The weakness of those two makes Halep a very likely semifinalist and also means that, in this forecast, the winner of the tournament is more likely (54% to 46%) to come from the White group.

Without Serena, and with Sharapova’s health in question, there are simply no dominant players in the field this week. If nothing else, these forecasts illustrate that we’d be foolish to take any Singapore predictions too seriously.

Forecasting the Effects of Performance Byes in Beijing

To the uninitiated, the WTA draw in Beijing this week looks a little strange. The 64-player draw includes four byes, which were given to the four semifinalists from last week’s event in Wuhan. So instead of empty places in the bracket next to the top four seeds, those free passes go to the 5th, 10th, and 15th seeds, along with one unseeded player, Venus Williams.

“Performance byes”–those given to players based on their results the previous week, rather than their seed–have occasionally featured in WTA draws over the last few years. If you’re interested in their recent history, Victoria Chiesa wrote an excellent overview.

I’m interested in measuring the benefit these byes confer on the recipients–and the negative effect they have on the players who would have received those byes had they been awarded in the usual way. I’ve written about the effects of byes before, but I haven’t contrasted different approaches to awarding them.

This week, the beneficiaries are Garbine Muguruza, Angelique Kerber, Roberta Vinci, and Venus Williams. The top four seeds, the women who were atypically required to play first-round matches, are Simona Halep, Petra Kvitova, Flavia Pennetta, and Agnieszka Radwanska.

To quantify the impact of the various possible formats of a 64-player draw, I used a variety of tools: Elo to rate players and predict match outcomes, Monte Carlo tournament simulations to consider many different permutations of each draw, and a modified version of my code to “reseed” brackets. While this is complicated stuff under the hood, the results aren’t that opaque.

Here are three different types of 64-player draws that Beijing might have employed:

  1. Performance byes to last week’s semifinalists. This gives a substantial boost to the players receiving byes, and compared to any other format, has a negative effect on top players. Not only are the top four seeds required to play a first-round match, they are a bit more likely to play last week’s semifinalists, since the byes give those players a better chance of advancing.
  2. Byes to the top four seeds. The top four seeds get an obvious boost, and everyone else suffers a bit, as they are that much more likely to face the top four.
  3. No byes: 64 players in the draw instead of 60. The clear winners in this scenario are the players who wouldn’t otherwise make it into the main draw. Unseeded players (excluding Venus) also benefit slightly, as the lack of byes means that top players are less likely to advance.

Let’s crunch the numbers. For each of the three scenarios, I ran simulations based on the field without knowing how the draw turned out. That is, Kvitova is always seeded second, but she doesn’t always play Sara Errani in the first round. This approach eliminates any biases in the actual draw. To simulate the 64-player field, I added the four top-ranked players who lost in the final round of qualifying.

To compare the effects of each draw type on every player, I calculated “expected points” based on their probability of reaching each round. For instance, if Halep entered the tournament with a 20% chance of winning the event with its 1,000 ranking points, she’d have 200 “expected points,” plus her expected points for the higher probabilities (and lower number of points) of reaching every round in between. It’s simply a way of combining a lot of probabilities into a single easier-to-understand number.
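The "expected points" calculation described above can be sketched as follows. The points table is an assumption (a plausible schedule for a 64-draw event worth 1,000 points to the champion), and the probabilities are made up for illustration rather than taken from the simulations.

```python
# Assumed points for each finishing position in a 64-player draw:
# lose in R1, R2, R3, QF, SF, F, then win the title.
POINTS = [10, 65, 120, 215, 390, 650, 1000]

def expected_points(reach_probs, points=POINTS):
    """reach_probs[k] is the probability of winning at least k+1 matches,
    i.e. of reaching R2, R3, QF, SF, F, and finally taking the title."""
    if len(reach_probs) != len(points) - 1:
        raise ValueError("need one probability per round played")
    survive = [1.0] + list(reach_probs) + [0.0]
    total = 0.0
    for k, pts in enumerate(points):
        exit_prob = survive[k] - survive[k + 1]  # chance of finishing exactly here
        total += exit_prob * pts
    return total

# Illustrative probabilities only; a 20% title chance contributes 200 points.
example = [0.90, 0.75, 0.55, 0.40, 0.28, 0.20]
print(round(expected_points(example), 1))
```

A first-round bye is handled by setting the first probability to 1.0, which is one way to compare the same player's expectations with and without the free pass.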

Here are the expected points in each draw scenario (plus the actual Beijing draw) for the top four players, the four players who received performance byes, plus a couple of others (Belinda Bencic and Caroline Wozniacki) who rated particularly highly:

Player               Seed  PerfByes  TopByes  NoByes  Actual  
Simona Halep            1       323      364     330     341  
Petra Kvitova           2       276      323     290     291  
Venus Williams                  247      216     218     279  
Belinda Bencic         11       255      249     268     254  
Garbine Muguruza        5       243      202     210     227  
Angelique Kerber       10       260      224     235     227  
Caroline Wozniacki      8       208      203     205     199  
Flavia Pennetta         3       142      177     144     195  
Agnieszka Radwanska     4       185      233     192     188  
Roberta Vinci          15       120       91      94      90

As expected, the top four seeds would reap far more points when given first-round byes. It’s most noticeable for Pennetta and Radwanska, who would enjoy a 20% boost in expected points if given a first-round bye. Oddly, though, the draw worked out very favorably for Flavia–Elo gave her a 95% chance of beating her first-round opponent Xinyun Han, and her draw steered her relatively clear of other dangerous players in subsequent rounds.

Similarly, the performance byes are worth a 15 to 30% advantage in expected points to the players who receive them. Vinci is the biggest winner here, as we would generally expect from the player most likely to suffer an upset without the bye.

Like Pennetta, Venus was treated very well by the way the draw turned out. The bye already gave her an approximately 15% boost compared to her expectations without a bye, and the draw tacked another 13% onto that. Both the structure of the draw and some luck on draw day made her the event’s third most likely champion, while the other scenarios would have left her in fifth.

All byes–conventional or unconventional–work to the advantage of some players and against others. However they are granted, they tend to work in favor of those who are already successful, whether that success is over the course of a year or a single week.

Performance byes are easy enough to defend: They give successful players a bit more rest between two demanding events, and from the tour’s perspective, they make it a little more likely that last week’s best players won’t pull out of this week’s tourney. And if all byes tend to make the rich a little richer, at least performance byes open the possibility of benefiting different players than usual.

The Pervasive Role of Luck in Tennis

Italian translation at settesei.it

No matter what the scale, from a single point to a season-long ranking–even to a career–luck plays a huge role in tennis. Sometimes good luck and bad luck cancel each other out, as is the case when two players benefit from net cord winners in the same match. But sometimes luck spawns more of the same, giving fortunate players opportunities that, in turn, make them more fortunate still.

Usually, we refer to luck only in passing, as one possible explanation for an isolated phenomenon. It’s important to examine the various forms of luck in conjunction with each other to get a better sense of just how much of a factor luck can be.

Single points

Usually, we’re comfortable saying that the results of individual points are based on skill. Occasionally, though, something happens to give the point to an undeserving player. The most obvious examples are points heavily influenced by a net cord or a bad bounce off an uneven surface, but there are others.

Officiating gets in the way, too. A bad call that the chair umpire doesn’t overturn can hand a point to the wrong player. Even if the chair umpire (or Hawkeye) does overrule a bad call, it can result in the point being replayed–even if one player was completely in control of the point.

We can go a bit further into the territory of “lucky shots,” including successful mishits, or even highlight-reel tweeners that a player could never replicate. While the line between truly lucky shots and successful low-percentage shots is an ambiguous one, we should remember that in the most extreme cases, skill isn’t the only thing determining the outcome of the point.

Lucky matches

More than 5% of matches on the ATP tour this year have been won by a player who failed to win more than half of points played. Another 25% were won by a player who failed to win more than 53% of points–a range that doesn’t guarantee victory.

Depending on what you think about clutch and momentum in tennis, you might not view some–or even any–of those outcomes as lucky. If a player converts all five of his break point opportunities and wins a match despite only winning 49% of total points, perhaps he deserved it more. The same goes for strong performance in tiebreaks, another cluster of high-leverage points that can swing a match away from the player who won more points.

But when the margins are so small that executing at just one or two key moments can flip the result–especially when we know that points are themselves influenced by luck–we have to view at least some of these tight matches as having lucky outcomes. We don’t have to decide which is which; we simply need to acknowledge that some matches aren’t won by the better player, even if we use the very loose definition of “better player that day.”

Longer-term luck

Perhaps the most obvious manifestation of luck in tennis is in the draw each week. An unseeded player might start his tournament with an unwinnable match against a top seed or with a cakewalk against a low-ranked wild card. Even seeded players can be affected by fortune, depending on which unseeded players they draw, along with which fellow seeds they will face at which points in the tournament.

Another form of long-term luck–which is itself affected by draw luck–is what we might call “clustering.” A player who goes 20-20 on a season by winning all of his first-round matches and losing all of his second-round matches will not fare nearly as well in terms of rankings or prize money as someone who goes 20-20 by winning only 10 first-round matches, but reaching the third round every time he does.

Again, this may not be entirely luck, since this sort of player would quickly be labeled “streaky.” But combined with draw luck, he might simply be facing players he can beat in clusters, instead of getting easy first-rounders and difficult second-rounders.

The Matthew effect

All of these forms of tennis-playing fortune are in some way related. The sociologist Robert Merton coined the term “Matthew effect”–alternatively known as the principle of cumulative advantage–to refer to situations where one entity with a very small advantage will, by the very nature of a system, end up with a much larger advantage.

The Matthew effect applies to a wide range of phenomena, and I think it’s instructive here. Consider the case of two players separated by only a few points in the rankings–a margin that could have come about by pure luck: for instance, when one player won a match by walkover. One of these players gets the 32nd seed at the Australian Open and the other is unseeded.

These two players–who are virtually indistinguishable, remember–face very different challenges. One is guaranteed two matches against unseeded opponents, while the other will almost definitely face a seed before the third round, perhaps even a high seed in the first. The unseeded player might get lucky, either in his draw or in his matches, cancelling out the effect of the seeding, but it’s more likely that the seeded player will walk away from the tournament with more points, solidifying the higher ranking–that he didn’t earn in the first place.

Making and breaking careers

The Matthew effect can have an impact on an even broader scale. Today’s tennis pros have been training and competing from a young age, and most of them have gotten quite a bit of help along the way, whether it’s the right coach, support from a national federation, or well-timed wild cards.

It’s tough to quantify things like the effect of a good or bad coach at age 15, but wild cards are a more easily understood example of the phenomenon. The unlucky unseeded player I discussed above at least got to enter the tournament. But when a Grand Slam-hosting federation decides which promising prospect gets a wild card, it’s all or nothing: One player gets a huge opportunity (cash and ranking points, even if they lose in the first round!) while the other one gets nothing.

This, in a nutshell, is why people like me spend so much time on our hobby horses ranting about wild cards. It isn’t the single tournament entry that’s the problem, it’s the cascading opportunities it can generate. Sure, sometimes it turns into nothing–Ryan Harrison’s career is starting to look that way–but even in those cases, we never hear about the players who didn’t get the wild cards, the ones who never had the chance to gain from the cumulative advantage of a small leg up.

Why all this luck matters

If you’re an avid tennis fan, most of this isn’t news to you. Sure, players face good and bad breaks, they get good and bad draws, and they’ve faced uneven challenges along the way.

By discussing all of these types of fortune in one place, I hope to emphasize just how much luck plays a part in our estimate of each player at any given time. It’s no accident that mid-range players bounce around the rankings so much. Some of them are truly streaky, and injuries play a part, but much of the variance can be explained by these varying forms of luck. The #30 player in the rankings is probably better than the #50 player, but it’s no guarantee. It doesn’t take much misfortune–especially when bad luck starts to breed more opportunities for bad luck–to tumble down the list.

Even if many of the forms of luck I’ve discussed are truly skill-based and, say, break point conversions are a matter of someone playing better that day, the evidence generally shows that major rises and falls in things like tiebreak winning percentage and break point conversion rates are temporary–they don’t persist from year to year. That may not be properly classed as luck, but if we’re projecting the rankings a year from now, it might as well be.

While match results, tournament outcomes, and the weekly rankings are written in stone, the way that players get there is not nearly so clear. We’d do well to accept that uncertainty.

Unlikely Davis Cup Finalists and an Early Forecast for Ghent

Among nations that have reached Davis Cup finals, neither Great Britain nor Belgium quite fits the mold.

The fortunes of the UK team depend almost entirely on Andy Murray. If you have to choose one player, you couldn’t do much better, but it’s hardly a strategy with lots of room for error. While the Belgian team is a bit more balanced, it doesn’t boast the sort of superstar singles player that most successful nations can send into battle.

Thanks to injury and apathy, the Brits and the Belgians haven’t defeated the level of competition usually required of Davis Cup finalists. Belgium hasn’t had to face any singles player better than Leonardo Mayer, and the only top-ten singles player to show up against Britain was Gilles Simon.

Measured by season-best singles rankings, these are two of the weakest Davis Cup finalists in the modern era [1]. The last time a finalist didn’t have two top-50 singles players was 1987, when the Indian team snuck past the Australians in the semifinals, only to be trounced by a powerhouse Swedish side in the final. This year, neither side has two top-50 players [2].

It’s even worse for the Belgians: David Goffin, their best singles player, has never topped 14th in the rankings. Only three times since 2000 has a nation reached the final without a top-ten player, and to find a side that won the Davis Cup without a top-tenner, we must go back to 1996, when the French team, headed by Arnaud Boetsch and Cedric Pioline, claimed the Cup.

Even when a nation reaches the final without a top-ten singles player, they typically have another singles player in the same range. Yet Belgium’s Steve Darcis has only now crept back into the top 60.

Despite a widespread belief that you can throw logic out the window in the riot that is Davis Cup, the better players still tend to win. Here are Elo-rating-based predictions for the four probable rubbers on clay:

  • Murray d. Darcis (94.3%)
  • Goffin d. GBR-2 (90.1%)
  • Murray d. Goffin (86.7%)
  • Darcis d. GBR-2 (78.1%)

Predicting the outcome of any doubles match–let alone best-of-five-setters with players yet to be determined, probably including one very good but low-ranked player in Andy Murray–is beyond me. But based on the Murray brothers’ performance against Australia and the Belgians’ lack of a true doubles specialist, the edge has to go to Britain–let’s say 65%.

If we accept these individual probabilities, Great Britain has a 65.2% chance of winning the Davis Cup. That doesn’t take into account home court advantage, which will probably be a factor and favor the Belgians [3].
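The arithmetic behind an overall figure like this can be sketched by enumerating every possible outcome of the five rubbers. The sketch assumes the rubbers are independent and that the lineups are exactly those listed above; the dictionary labels are just illustrative names.

```python
from itertools import product

# Britain's win probability in each rubber, taken from the estimates above.
gb_rubbers = {
    "Murray v Darcis": 0.943,
    "GBR-2 v Goffin": 1 - 0.901,
    "Doubles": 0.65,
    "Murray v Goffin": 0.867,
    "GBR-2 v Darcis": 1 - 0.781,
}

def tie_win_prob(probs, needed=3):
    """Chance of winning at least `needed` rubbers, treating them as independent."""
    total = 0.0
    for outcome in product((True, False), repeat=len(probs)):
        if sum(outcome) >= needed:
            p = 1.0
            for won, q in zip(outcome, probs):
                p *= q if won else 1 - q
            total += p
    return total

print(f"{tie_win_prob(list(gb_rubbers.values())):.1%}")
```

With the rounded inputs shown here, the result lands within a fraction of a percentage point of the figure quoted above. Counting dead rubbers as played doesn't change the answer, since a tie is decided as soon as one side has three wins.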

It’s a huge opportunity for the Brits, but it’s still quite a chance for Belgium, which hasn’t been this close to the Davis Cup for a century.  After all, the Cup is inscribed with country names, not judgments about that nation’s easy path to the final.


The Myth of the Tricky First Meeting

Italian translation at settesei.it

Today, both Roger Federer and Stan Wawrinka will play opponents they’ve never faced before. In Federer’s case, the challenger is Steve Darcis, a 31-year-old serve-and-volleyer playing in his 22nd Grand Slam event. Wawrinka will face Hyeon Chung, a 19-year-old baseliner in only his second Slam draw.

For all those differences, both Federer and Wawrinka will need to contend with a new opponent–slightly different spins, angles, and playing styles than they’ve seen before.  In the broadcast introduction to each match, we can expect to hear about this from the commentators. Something along the lines of, “No matter what the ranking, it’s never easy to play someone for the first time. He’s probably watched some video, but it’s different being out there on the court.”

All true, as even rec players can attest. But does it matter? After all, both players are facing a new opponent. While Darcis, for example, has surely watched a lot more video of Federer than Roger has of him, isn’t it just as different being out on the court facing Federer for the first time?

Attempting to apply common sense to the cliche will only get us so far. Let’s turn to the numbers.

Math is tricky; these matches aren’t

Usually, when we talk about “tricky first meetings,” we’re referring to these sorts of star-versus-newcomer or star-versus-journeyman battles. When two newcomers or two journeymen face off for the first time, it isn’t so notable. So, looking at data from the last fifteen years, I limited the view to matches between top-ten players and unseeded opponents.

This gives us a pretty hefty sample of nearly 7,000 matches. About 2,000 of those were first meetings. Even though the sample is limited to matches since 2000, I checked 1990s data–including Challengers–to ensure that these “first meetings” really were firsts.

Let’s start with the basics. Top-tenners have won 86.4% of these first meetings. The details of who they’re facing don’t matter too much. Their record when the new opponent is a wild card is almost identical, as is the success rate when the new opponent came through qualifying.

The first-meeting winning percentage is influenced a bit by age. When a top-tenner faces a player under the age of 24 for the first time, he wins 84.6% of matches. Against 24-year-olds and up, the equivalent rate is 88.0%. That jibes with what we’d expect: a newcomer like Chung or Borna Coric is more likely to cause problems for a top player than someone like Darcis or Joao Souza, Novak Djokovic’s first-round victim.

The overall rate of 86.4% doesn’t do justice to guys like Federer. As a top-tenner, Roger has won 95% of his matches against first-time opponents, losing just 8 of 167 meetings. Djokovic, Rafael Nadal, and Andy Murray are all close behind, each within rounding distance of 93%.

By every comparison I could devise, the first-time meeting is the easiest type of match for top players.

The broadest (though approximate) control group consists of matches between top-tenners and unseeded players they have faced before. Favorites won 76.9% of those matches. Federer and Djokovic win 91% of those matches, while Nadal wins 89% and Murray 86%. In all of these comparisons, first-time meetings are more favorable to the high-ranked player.

A more tailored control group involves first-time meetings that had at least one rematch. In those cases, we can look at the winning percentage in the first match and the corresponding rate in the second match, having removed much of the bias from the larger sample.

Against opponents they would face again, top-tenners won their first meetings 85.1% of the time. In their second meeting, that success rate fell to 80.2%. It’s tough to say exactly why that rate went down–in part, it can be explained by underdogs improving their games, or learning something in the first match–but to make a weak version of the argument, it certainly doesn’t provide any evidence that first matches are the tough ones.

It may be true that first matches–no matter the quality of the opponent–feel tricky. It’s possible it takes more time to get used to first-time opponents, and that those underdogs are more likely to take a first set, or at least push it to a tiebreak. That’s a natural thing to think when such a match turns out closer than expected.

Whether or not any of that is true, the end result is the same. Top players appear to be generally immune to whatever trickiness first meetings hold, and they win such contests at a rate higher than any comparable set of matches.

Certainly, Fed fans have little to worry about. Most of his first-meeting losses were against players who would go on to have excellent careers: Mario Ancic, Guillermo Canas, Gilles Simon, Tomas Berdych, and Richard Gasquet.

His last loss facing a new opponent was his three-tiebreak heartbreaker to Nick Kyrgios in Madrid, only his third first-meeting defeat in a decade. As a rising star, Kyrgios fits the pattern of Fed’s previous first-meeting conquerors. Darcis, however, looks like yet another opponent that Federer will find distinctly not tricky.

The Limited Value of Head-to-Head Records

Italian translation at settesei.it

Yesterday at the Australian Open, Ana Ivanovic defeated Serena Williams, despite having failed to take a set in four previous meetings. Later in the day, Tomas Berdych beat Kevin Anderson for the tenth straight time.

Commentators and bettors love head-to-head records. You’ll often hear people say, “tennis is a game of matchups,” which, I suppose, is hardly disprovable.

But how much do head-to-head records really mean?  If Player A has a better record than Player B but Player B has won the majority of their career meetings, who do you pick? To what extent does head-to-head record trump everything (or anything) else?

It’s important to remember that, most of the time, head-to-head records don’t clash with any other measurement of relative skill. On the ATP tour, head-to-head record agrees with relative ranking 69% of the time–that is, the player who is leading the H2H is also the one with the better ranking. When a pair of players have faced each other five or more times, H2H agrees with relative ranking 75% of the time.

Usually, then, the head-to-head record is right. It’s less clear whether it adds anything to our understanding. Sure, Rafael Nadal owns Stanislas Wawrinka, but would we expect anything much different from the matchup of a dominant number one and a steady-but-unspectacular number eight?

H2H against the rankings

If head-to-head records have much value, we’d expect them–at least for some subset of matches–to outperform the ATP rankings. That’s a pretty low bar–the official rankings are riddled with limitations that keep them from being very predictive.

To see if H2Hs met that standard, I looked at ATP tour-level matches since 1996. For each match, I recorded whether the winner was ranked higher than his opponent and what his head-to-head record was against that opponent. (I didn’t consider matches outside of the ATP tour in calculating head-to-heads.)

Thus, for each head-to-head record (for instance, five wins in eight career meetings), we can determine how many the H2H-favored player won, how many the higher-ranked player won, and so on.

For instance, I found 1,040 matches in which one of the players had beaten his opponent in exactly four of their previous five meetings.  65.0% of those matches went the way of the player favored by the head-to-head record, while 68.8% went to the higher-ranked player. (54.5% of the matches fell in both categories.)

Things get more interesting in the 258 matches in which the two metrics did not agree.  When the player with the 4-1 record was lower in the rankings, he won only 109 (42.2%) of those matchups. In other words, at least in this group of matches, you’d be better off going with ATP rankings than with head-to-head results.
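The 258 and 109 follow from the three percentages above. When the two indicators disagree, exactly one of them picks the winner, so the disagreements are the matches that only H2H got right plus the matches that only ranking got right. Here is a small check of that arithmetic; the function name is made up, and the counts are rebuilt from rounded percentages, so treat it as a sanity check rather than the original calculation.

```python
def split_disagreements(n_matches, h2h_rate, rank_rate, both_rate):
    """Return (h2h_only, rank_only): matches only one indicator got right."""
    both = round(n_matches * both_rate)
    h2h_only = round(n_matches * h2h_rate) - both
    rank_only = round(n_matches * rank_rate) - both
    return h2h_only, rank_only

# The 4-1 head-to-head group: 1,040 matches, 65.0% / 68.8% / 54.5%.
h2h_only, rank_only = split_disagreements(1040, 0.650, 0.688, 0.545)
disagreements = h2h_only + rank_only
print(h2h_only, disagreements, round(100 * h2h_only / disagreements, 1))
# prints: 109 258 42.2
```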

Broader view, similar conclusions

For almost every head-to-head record, the findings are the same. There were 26 head-to-head records–everything from 1-0 to 7-3–for which we have at least 100 matches worth of results, and in 20 of them, the player with the higher ranking did better than the player with the better head-to-head.  In 19 of the 26 groups, when the ranking disagreed with the head-to-head, ranking was a more accurate predictor of the outcome.

If we tally the results for head-to-heads with at least five meetings, we get an overall picture of how these two approaches perform. 68.5% of the time, the player with the higher ranking wins, while 66.0% of the time, the match goes to the man who leads in the head-to-head. When the head-to-head and the relative ranking don’t match, ranking proves to be the better indicator 56.5% of the time.

The most extreme head-to-heads–that is, undefeated pairings such as 7-0, 8-0, and so on–are the only groups in which H2H consistently tells us more than ATP ranking does.  80% of the time, these matches go to the higher-ranked player, while 81.9% of the time, the undefeated man prevails. In the 78 matches for which H2H and ranking don’t agree, H2H is a better predictor exactly two-thirds of the time.

Explanations against intuition

When you weigh a head-to-head record more heavily than a pair of ATP rankings, you’re relying on a very small sample instead of a very big one. Yes, that small sample may be much better targeted, but it is also very small.

Not only is the sample small, often it is not as applicable as you might think. When Roger Federer defeated Lleyton Hewitt in the fourth round of the 2004 Australian Open, he had beaten the Aussie only twice in nine career meetings. Yet at that point in their careers, the 22-year-old, #2-ranked Fed was clearly in the ascendancy while Hewitt was having difficulty keeping up. Even though most of their prior meetings had been on the same surface and Hewitt had won the three most recent encounters, that small subset of Roger’s performances did not account for his steady improvement.

The most recent Fed-Hewitt meeting is another good illustration. Entering the Brisbane final, Roger had won 15 of their previous 16 matches, but while Hewitt has maintained a middle-of-the-pack level for the last several years, Federer has declined. Although the two had played 26 times before the Brisbane final, none of those contests had come in the previous two years.

Whether it’s surface, recency, injury, weather conditions, or any one of dozens of other factors, head-to-heads are riddled with external factors. That’s the problem with any small sample size–the noise is much more likely to overwhelm the signal. If noise can win out in the extensive Fed-Hewitt head-to-head, most one-on-one records don’t stand a chance.

Any set of rankings, whether the ATP’s points system or my somewhat more sophisticated (and more predictive) jrank algorithm, takes into account every match both players have been involved in for a fairly long stretch of time. In most cases, having all that perspective on both players’ current levels is much more valuable than a noise-ridden handful of matches. If head-to-heads can’t beat ATP rankings, they would look even worse against a better algorithm.

Some players surely do have an edge on particular opponents or types of opponents, whether it’s Andy Murray with lefties or David Ferrer with Nicolas Almagro. But most of the time, those edges are reflected in the rankings–even if the rankings don’t explicitly set out to incorporate such things.

Next time Kevin Anderson draws Berdych, he should take heart. His odds of beating the Czech aren’t that much different from those of any other man ranked around #20 against someone in the bottom half of the top ten. Even accounting for the slight effect I’ve observed in undefeated head-to-heads, a lopsided one-on-one record isn’t fate.

Winners and Losers in the 2014 Australian Open Men’s Draw

Every draw carries with it plenty of luck, but even by Grand Slam standards, this year’s Australian Open men’s singles draw seems a bit lopsided.  The top half makes possible a Rafael Nadal-Roger Federer semifinal, at least if Federer gets past Andy Murray and Nadal beats the likes of Bernard Tomic.

While Novak Djokovic is seeded below Nadal, he gets the benefit of a projected semifinal matchup with David Ferrer.  A more substantial challenge may arise one round earlier, as a possible quarterfinal opponent is Stanislas Wawrinka, who took Djokovic to a fifth set twice in the last four Grand Slams.

As I’ve done in the past, let’s quantify each player’s draw luck.  Using my forecast, combined with a forecast generated by randomizing the bracket, we can see who were the biggest winners and losers in yesterday’s draw ceremony.

The algorithmic approach is most useful in confirming our suspicions about the draw luck of the top players.  Djokovic and Ferrer, the top seeds in the bottom half, definitely came out ahead.  While Djokovic had a respectable 28.0% chance of winning the tournament in the randomized projection, he has a 33.7% chance given the way the draw turned out.  In terms of expected ranking points, the draw gave him a 10.7% boost, from an expectation of 747 points to one of 827 points.  In percentage terms, Ferrer’s expectation jumped even more, from 312 to 368 (18.0%).

Nadal, however, had the worst draw luck of the top ten seeds.  Before the bracket was arranged, he had a 30.7% chance of winning the title, with an expectation of 763 ranking points.  Once the draw was set, his title chances fell to 24.9% and his point expectation dropped to 662.  No one else in the top ten lost more than 7% of their expected ranking points on draw day; Nadal lost 13%.

It doesn’t take an algorithm, though, to identify the draw’s worst losers.  They’re placed where you’ll always find them: right next to the top two seeds.  In the randomized projection, Tomic had a 58% chance of winning his first-round match and a 27% chance of reaching the third round.  In reality, though, he’ll play Nadal first.  His slight chance of earning a place in the second round gives him an expectation of 29 ranking points (10 of which he earns simply by showing up).  In the random projection, his ranking point expectation was 75.

Lukas Lacko, the unlucky man who will play Djokovic in the first round, didn’t suffer quite so much, if only because he didn’t have such high expectations in the first place.  Before the draw, he could expect 48 ranking points and a 15% chance of reaching the third round.  Now, his projection is a mere 24 ranking points, one of the worst in the entire draw.

The luckiest players are always those who had little chance of progressing far in the draw, but managed to draw someone equally inept.  At the Australian Open, the four luckiest guys have yet to be identified: all are qualifiers.  The luckiest man of all will be the one who is placed in the topmost qualifying spot, opposite Lucas Pouille.  At this stage, my rating system doesn’t think much of the Frenchman, so it is likely that the qualifier will be the heavy favorite entering that match.

In the randomized projection, each qualifier has a 29% chance of winning his first match and a 6% chance of winning his second, for a weighted average of 32 ranking points.  The man who plays Pouille, however, will enter the field with an expectation of 55 ranking points.  Other qualifiers with nearly the same happy outcome will be those who draw Federico Delbonis, Julian Reister, and Jan Hajek in the opening round.
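For readers who want to see how an expectation like this is put together, here is a minimal sketch. Expected ranking points are a probability-weighted sum: the points for showing up, plus the extra points for each further round multiplied by the chance of getting there. The points schedule below is an assumption (only the 10 first-round points are mentioned in this post), any additional points awarded for qualifying are ignored, and the probabilities in the example are invented, so it will not reproduce the figures quoted here.

```python
# Assumed points by number of matches won in a 128-player Grand Slam draw:
# 0 wins (first-round loss) through 7 wins (champion). Not taken from this post.
ASSUMED_POINTS = [10, 45, 90, 180, 360, 720, 1200, 2000]

def expected_points(win_at_least, points=ASSUMED_POINTS):
    """win_at_least[k-1] = probability of winning at least k matches.

    Missing later rounds are treated as probability zero.
    """
    total = float(points[0])  # guaranteed just for being in the draw
    for k, prob in enumerate(win_at_least, start=1):
        total += prob * (points[k] - points[k - 1])  # extra points for round k
    return total

# Hypothetical player: 50% to win one match, no chance after that.
print(expected_points([0.5]))  # prints 27.5
```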

Here are the pre-draw and post-draw expected ranking points of the men’s seeds, along with the percentage of pre-draw points they gained or lost:

Player                 Seed  Pre  Post  Change  
Rafael Nadal           1     763   662  -13.2%  
Novak Djokovic         2     747   827   10.7%  
David Ferrer           3     312   368   18.0%  
Andy Murray            4     473   488    3.1%  
Juan Martin Del Potro  5     421   393   -6.6%  
Roger Federer          6     411   397   -3.4%  
Tomas Berdych          7     264   317   20.2%  
Stanislas Wawrinka     8     290   279   -3.9%  

Player                 Seed  Pre  Post  Change
Richard Gasquet        9     186   186    0.1%  
Jo Wilfried Tsonga     10    151   187   23.8%  
Milos Raonic           11    223   234    5.0%  
Tommy Haas             12    207   222    7.5%  
John Isner             13    176   196   11.2%  
Mikhail Youzhny        14    190   193    1.5%  
Fabio Fognini          15    101    81  -19.3%  
Kei Nishikori          16    172   135  -21.6%  

Player                 Seed  Pre  Post  Change
Tommy Robredo          17     71    61  -13.4%  
Gilles Simon           18    116    95  -18.3%  
Kevin Anderson         19     80   107   33.9%  
Jerzy Janowicz         20     99   154   55.3%  
Philipp Kohlschreiber  21    125   132    6.2%  
Grigor Dimitrov        22    136   122  -10.1%  
Ernests Gulbis         23    125   107  -14.1%  
Andreas Seppi          24     94    49  -47.8%  

Player                 Seed  Pre  Post  Change
Gael Monfils           25    147   101  -31.4%  
Feliciano Lopez        26    100    80  -20.7%  
Benoit Paire           27     94    89   -5.5%  
Vasek Pospisil         28     82    81   -0.9%  
Jeremy Chardy          29    111   126   13.7%  
Dmitry Tursunov        30    101    80  -21.0%  
Fernando Verdasco      31    106   105   -0.8%  
Ivan Dodig             32    104   106    1.8%
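The Change column is the post-draw expectation relative to the pre-draw one. A quick sketch of that calculation follows; the function name is made up. Because the Pre and Post columns are rounded to whole points, recomputing from them can differ from the table in the last digit for some rows.

```python
def pct_change(pre, post):
    """Percentage of pre-draw expected points gained (+) or lost (-)."""
    return 100.0 * (post - pre) / pre

for name, pre, post in [("Rafael Nadal", 763, 662), ("Novak Djokovic", 747, 827)]:
    print(f"{name}: {pct_change(pre, post):+.1f}%")
# prints:
# Rafael Nadal: -13.2%
# Novak Djokovic: +10.7%
```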

Challenger Tour Finals Forecast

I wrote an extensive preview of this week’s Challenger Tour Finals for The Changeover, so you should check that out first.  (Also worth a read is the preview at Foot Soldiers of Tennis.)

Because so much less separates players at this level (compared to those at last year’s World Tour Finals), my forecast stops just short of throwing its hands up in dismay.  Coming into the event, Italian clay specialist Filippo Volandri was the favorite, with a 15.5% chance of winning the event.  He lost today to Alejandro Gonzalez, making it much less likely that he’ll progress out of the round-robin stage.

Today’s other winners were top seed Teymuraz Gabashvili, Oleksandr Nedovyesov, and Jesse Huta Galung.  My numbers now consider Huta Galung the favorite, with a better than 20% chance of winning the title.  The situation in Grupo Verde will become much clearer after tomorrow’s night match between Gabashvili and Nedovyesov.

Here is the pre-tournament forecast:

Player       3-0  2-1  1-2  0-3     SF      F      W  
Gabashvili   12%  38%  37%  13%  49.8%  24.3%  12.0%  
Volandri     15%  40%  35%  10%  55.3%  29.3%  15.5%  
Nedovyesov   14%  39%  36%  11%  53.0%  26.9%  13.7%  
Huta Galung  14%  39%  36%  11%  53.8%  28.2%  14.6%  
Gonzalez     10%  35%  40%  15%  45.0%  21.8%  10.4%  
Ungur        10%  35%  40%  15%  45.0%  20.9%   9.8%  
Martin       11%  36%  39%  14%  46.0%  22.4%  10.7%  
Clezar       13%  38%  37%  11%  52.2%  26.3%  13.3%

And here is the forecast updated with the results of today’s four matches:

Player       3-0  2-1  1-2  0-3     SF      F      W  
Gabashvili   24%  50%  26%   0%  71.5%  35.0%  17.1%  
Volandri      0%  27%  50%  23%  30.2%  16.2%   8.6%  
Nedovyesov   28%  50%  22%   0%  75.7%  38.3%  19.5%  
Huta Galung  27%  50%  23%   0%  74.7%  39.0%  20.5%  
Gonzalez     23%  50%  27%   0%  70.1%  33.7%  15.8%  
Ungur         0%  22%  50%  29%  23.1%  10.8%   5.1%  
Martin        0%  23%  50%  27%  25.1%  12.2%   5.9%  
Clezar        0%  27%  50%  23%  29.6%  14.9%   7.4%

(My algorithm doesn’t implement the details of the number-of-sets-won tiebreaker, so Guilherme Clezar, the only loser today to win a set, probably has a slightly better chance of advancing than these numbers give him credit for.)
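To get a feel for the 3-0 through 0-3 columns, here is a small sketch that computes one player’s round-robin record distribution by enumerating every win/loss combination. This is exact enumeration, not the simulation behind the tables above, and it assumes the three matches are independent; the coin-flip probabilities are chosen purely for illustration.

```python
from itertools import product

def record_distribution(win_probs):
    """Return dist where dist[k] = probability of exactly k wins.

    win_probs: the player's win probability in each round-robin match,
    treated as independent events.
    """
    dist = [0.0] * (len(win_probs) + 1)
    for outcome in product([True, False], repeat=len(win_probs)):
        p = 1.0
        for won, wp in zip(outcome, win_probs):
            p *= wp if won else 1.0 - wp
        dist[sum(outcome)] += p  # sum(outcome) counts the wins
    return dist

# Three coin-flip matches, then the same player after winning the first one.
print(record_distribution([0.5, 0.5, 0.5]))  # [0.125, 0.375, 0.375, 0.125]
print(record_distribution([1.0, 0.5, 0.5]))  # [0.0, 0.25, 0.5, 0.25]
```

The lists run from zero wins to three wins. The evenly matched case lands close to most rows of the pre-tournament table, and a first-day winner with two coin-flip matches left looks much like the updated rows for today’s winners, which is another way of seeing how little separates this field.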

Challenger charting: The most interesting match of the day–if not the cleanest–was the last one, between Nedovyesov and Clezar.  I charted it, so you can check out detailed serve, return, and shot-by-shot stats for that contest.

And if you’re really into this stuff–Challengers and/or charting–here are my stat reports from yesterday’s first-round matches in Champaign between Ram and Giron and Sandgren and Peliwo.

Rafael Nadal, Top Twosomes, and the Future

The only match that either Rafael Nadal or Novak Djokovic lost in London was the final, when Nadal fell to Djokovic.  It was a good summary of the season as a whole.  The top two weren’t undefeated for the entire season, but they might as well have been.

Between them, Rafa and Novak lost only 16 matches this year, six of them to each other.  Fittingly, they split those six matches.  No single player poses a serious threat to their dominance.  Only Juan Martin del Potro defeated both this year, and he lost his five other encounters with the top-ranked duo.  The injured Andy Murray remains only a wildcard, having split Grand Slam finals with Djokovic this year but not having played Nadal since 2011.

Barring a huge upset loss in Davis Cup, Djokovic will end the season with the best-ever winning percentage for a #2-ranked player.  His 88.9% just edges out the 88.7% posted by Nadal in 2005, when he finished second to Roger Federer.  In the last thirty years, only five other #2’s won at least 85% of their matches.

Taking these six prior pairs as the best single-year twosomes the ATP has recently produced, it’s surprising to see what happened to them the following year.  In three of those seasons, neither member of the ultra-dominant duo finished the next season at #1.  A third player overcame them both.

Here is the list of the seven most dominant twosomes of the last thirty years, along with their year-end rankings 12 months after the end of their notable seasons (Nx):

Yr  #1              W-L    Nx  #2              W-L    Nx  
83  John McEnroe    62-9    1  Mats Wilander   74-11   4  
85  Ivan Lendl      83-7    1  John McEnroe    72-10  14  
87  Ivan Lendl      70-7    2  Stefan Edberg   76-12   5  
89  Ivan Lendl      80-7    3  Boris Becker    58-8    2  
05  Roger Federer   81-4    1  Rafael Nadal    79-10   2  
12  Novak Djokovic  75-12   2  Roger Federer   74-13   6  
13  Rafael Nadal    76-7    ?  Novak Djokovic  72-9    ?
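The winning percentages behind this list come straight from the W-L columns. As a quick check, here is a sketch that recomputes them for the #2-ranked players in the table; the helper name and the data layout are my own choices for illustration.

```python
def win_pct(wins, losses):
    """Winning percentage from a win-loss record."""
    return 100.0 * wins / (wins + losses)

# (year, #2-ranked player, wins, losses), copied from the table above.
number_twos = [
    ("83", "Mats Wilander", 74, 11),
    ("85", "John McEnroe", 72, 10),
    ("87", "Stefan Edberg", 76, 12),
    ("89", "Boris Becker", 58, 8),
    ("05", "Rafael Nadal", 79, 10),
    ("12", "Roger Federer", 74, 13),
    ("13", "Novak Djokovic", 72, 9),
]

best = max(number_twos, key=lambda row: win_pct(row[2], row[3]))
print(best[1], round(win_pct(best[2], best[3]), 1))  # Novak Djokovic 88.9
```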

In 1988, Mats Wilander overcame both Ivan Lendl and Stefan Edberg to claim the #1 position.  In 1990, it was Edberg who leapfrogged Lendl and Boris Becker.  This year, of course, Nadal reclaimed the top spot from last year’s top two of Djokovic and Federer.

Those of us who watched the Tour Finals for the last week might find it hard to imagine that anyone–certainly not any of the other six men in London–would outperform either Rafa or Novak over the course of a season.  But injuries strike, slumps take hold, and–unlikely as it may seem in 2013–young players emerge and dominate. For all of the radical changes in the game since the late 80s, these precedents serve as an important reminder of the unpredictability of tennis.