Rethinking the Mental Game

Italian translation at settesei.it

Everyone seems to agree that a huge part of tennis is mental. It’s less clear exactly what that means. Pundits and fans often say that certain players are mentally strong or mentally weak, attributes that help explain the gap when there’s a mismatch between talent and results.

Here are three more adjectives you’ll hear in ‘mental game’ discussions: clutch, streaky, consistent. I’ve frequently railed against commentators’ overuse of these terms. For instance, hitting an ace facing break point is ‘clutch,’ in the sense that the player executed well in a key moment. But that doesn’t mean the player himself can be described as clutch. Just because he sometimes performs well under pressure doesn’t mean he does so any more than the average player. Same goes for ‘streaky’–humans tend to overgeneralize from small samples, so if you see a player hit three down-the-line backhand winners in a row, you’ll probably think it’s a hot streak, even though such a sequence will occasionally arise by luck alone.

Some players probably are more or less clutch, more or less streaky, or more or less consistent than their peers, even beyond what can be explained by chance. At the same time, no tour pro is so much more or less clutch that their high-leverage performance explains a substantial part of their success or failure on tour. Most players win about as many tiebreaks as you’d expect based on their non-tiebreak records and convert about as many break points as you’d predict based on their overall return stats. Nothing magical happens in these most-commonly cited pressure situations, and no player becomes either superhuman or completely hopeless.

If you’re reading my blog, you’ve probably heard most of this before, either from me or from innumerable other sports analysts. I’m not taking the extreme position that there is no clutch (or streakiness or consistency), but I am pointing out that these effects are small–so small that we are unlikely to notice them just by watching matches, and sometimes so tiny that even analysts find it difficult to differentiate them from pure randomness.

Still, we’re left with the unanimous–and appealing!–belief that tennis is a mental game. In trying to explain various simplified models, I’ll often say something like, “this is what it would look like if players were robots.” Even though some of those models are rather accurate, I think we can all agree that players aren’t robots, Milos Raonic notwithstanding.

Completely mental

An extreme version of the ‘mental game’ position is one I’ve heard attributed to James Blake, that the difference between #1 and #100 is all mental. (I’m guessing that’s an oversimplification of what Blake thinks, but I’ve heard similar opinions often enough that the general idea is worth considering.) That’s a bit hard to stomach–does anybody think that Radu Albot (the current No. 99) is as talented as Rafael Nadal? But once we backtrack a little bit from the most extreme position, we can see its appeal. At the moment, both Bernard Tomic and Ernests Gulbis are ranked between 80 and 100. Can you say with confidence that those guys aren’t as talented as top-tenners Kevin Anderson or Marin Cilic? Yet Tomic often excels in pressure situations, and Cilic is the one known to crumble.

The problem with Tomic, Gulbis, and so many of the innumerable underachievers in the history of sport, isn’t that they fall apart when the stakes are high. We can all remember matches–or sets, or other long stretches of play–in which a player seems uninterested, unmotivated, or just low-energy for no apparent reason. Even accounting for selection bias, I think the underachievers are more likely to provide these inexplicably mediocre performances. (Can you imagine Nadal appearing unmotivated? Or Maria Sharapova?) In a very broad sense, I could be talking about streakiness or consistency here, but I don’t think it’s what people usually mean by those two terms. It operates at a larger scale–an entire set of mediocrity instead of, say, three double faults in a single game–and it offers us a new way of thinking about the mental aspect of tennis.

Focus

Let’s call this new variable focus. There are millions of potential distractions, internal and external, that stand in the way of peak performance. The more a player is able to ignore, disregard, or somehow overcome those distractions, the more focused she is.

Imagine that every player has her own maximum sustainable ability level, and on a scale of 1 to 10, that’s a 10. (I’m saying ‘sustainable’ to make it clear that we’re not talking about ninja Radwanska behind-the-back drop-volley stuff, but the best level that a player can keep up. Nadal’s 10 is different from Albot’s 10.) A rating of 1, at the bottom of the scale, is something we rarely see from the pros–imagine Guillermo Coria or Elena Dementieva getting serve yips. The more focused the player, the more often she’s performing at a 10 and, while she may not be able to sustain that, the more focused player remains closer to a 10 more of the time.

This idea of ‘focus’ sounds a lot like the old notion of ‘consistency’, and maybe it’s what people really mean when they call a player consistent. But there are several reasons why I think it’s important to move away from ‘consistency.’ The first one is pedantic: ‘consistent’ isn’t necessarily good. If you tell a player to be consistent and she hits nothing but unforced errors on her forehand, she has followed your directions by being consistently bad. More seriously, ‘consistency’ is often conflated with ‘low-risk’, which is a strategy, not a positive or negative trait. A player like Petra Kvitova will never be consistent–her signature level of aggression will always result in plenty of errors, sometimes ugly ones, and occasionally in ill-timed bunches. Even an optimized strategy for a highly-focused Kvitova will appear to be inconsistent.

If you’re the type of person who thinks a lot about tennis, you probably see the limitations in my definition of consistency. I agree: The concept I’ve knocked down is a bit of a strawman. If I could do a better job of concisely defining what tennis people talk about when they talk about consistency, I would–again, part of the problem is that the term is overloaded. Even if you mean ‘focus’ when you’re saying ‘consistency,’ I think it’s valuable to use a separate term with less baggage.

Chess

Is ‘focus’ any better than the other mental-game concepts I’ve knocked down? We can objectively measure clutch effects, but it’s a lot harder to look at the data from a match or an entire season and quantify a player’s level of focus.

Nonetheless, I strongly suspect that at the elite level, focus varies more than, say, micro-level streakiness. Put another way: The difference in focus among top players has the potential to explain much of their difference in performance.

I started to think about the importance of focus–again, the ability to sustain a peak or near-peak level for long periods of time–while following last month’s World Chess Championship between Magnus Carlsen and Fabiano Caruana. (I wrote about the chess match here.) Chess is very different from tennis, of course. But because it doesn’t rely on physical strength, speed, or agility at all, it has a much stronger claim to the ‘mental game’ moniker than tennis does. While flashes of brilliance have their place in chess, classical games require sustained concentration at a level that few of us can even fathom. One blunder against an elite player, and you might as well give up and get some extra rest before the next game.

A common stereotype of a chess grandmaster is an old man, whose decades of knowledge and savvy help him brush aside younger upstarts. Yet Carlsen and Caruana, the two best chess players in the world, are in their mid-20s. The current top 30 includes only four men born before 1980. Twelve of the top 30 were born in the 1990s, two of them since 1998. The age distribution in elite chess is awfully similar to that of elite tennis.

The aging curve in tennis lends itself to easy explanations: Players can start reaching the top when they hit physical maturity in their late teens, they continue to improve throughout their 20s as they gain experience and enjoy the benefits of physical youth, and then physical deterioration creeps in, beginning to have an effect in the late 20s or early 30s and increasing in severity over time. There’s obviously some truth in that. No matter how important the mental aspect of tennis, it’s hard to compete once you’ve lost a step, and even harder with chronic back or knee pain.

Yet the chess analogy persists: If tennis were mental, with much of the variation between elites explained by focus, the aging curve would look about the same. As modern science has improved training, nutrition, and injury recovery–thus reducing the effect of physical deterioration–tennis’s aging curve has developed a flatter plateau in the late 20s and 30s. In other words, as physical risks are mitigated, the elite career trajectory of tennis looks even more like that of chess.

Thinking ahead

For now, this is just a theory. Maybe you agree with me that it’s a very appealing one, but it remains untested, and it’s possibly very difficult to test at all.

If sustained focus is such a key factor in elite tennis performance, how would we even identify it? The most direct way would be to avoid the tennis court altogether and devise experiments so that we could measure the concentration of top players. I doubt we could convince the ATP top 100 to join us in the lab for a fun day of testing. There is some long-term potential, though, as national federations could do just that with their rising stars. Some might be doing so already; some professional baseball and American football teams administer cognitive tests to potential signees as well.

Unfortunately, we can’t make the best tennis players in the world our guinea pigs. If we looked instead at match-level results, we could try to measure focus using a similar approach to what I’ve done before in the name of quantifying consistency (oops!). My earlier algorithm attempted to measure the predictability of a player’s results–that is, is the 11th best player usually losing to the top ten and beating everyone else, or are his results less predictable? That’s not what we’re interested in here, because by that definition, ‘consistency’ isn’t necessarily good.

We could work along similar lines, though. Given a year or more of results, we could estimate a player’s peak level, perhaps by taking the average of his five best results. (His absolute best result might be the result of an injured opponent, an untimely rain delay, or something else unusual.) That would indicate the level that marks a ’10’ on his personal scale of 1 to 10. Then, compare his other results to that peak. If most of his results are close to that level–like the ‘consistent’ player who loses to the top ten and beats everyone else–he appears to be focused, at least from one match to the next. If he has a lot of bad losses by comparison, he is failing to sustain a level we know he’s capable of.
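The peak-and-compare idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it presumes each match has already been reduced to a single positive performance rating (how to build such a rating is not specified here), and the function name `focus_score` and the `peak_n` parameter are hypothetical.

```python
from statistics import mean


def focus_score(results, peak_n=5):
    """Estimate a peak level and how close the other results stay to it.

    results: per-match performance ratings (assumed positive numbers on
             some common scale; the rating itself is a hypothetical input).
    peak_n:  how many of the best results define the peak.

    Returns (peak, typical_ratio), where typical_ratio is the mean of the
    non-peak results divided by the peak. A value near 1.0 would suggest
    the player usually stays near his best level.
    """
    if len(results) <= peak_n:
        raise ValueError("need more results than peak_n")
    ordered = sorted(results, reverse=True)
    peak = mean(ordered[:peak_n])   # average of the best few, not the single best
    rest = ordered[peak_n:]         # everything else gets compared to the peak
    return peak, mean(rest) / peak


# Example: five strong results and two much weaker ones.
peak, ratio = focus_score([10, 10, 10, 10, 10, 5, 5])
print(peak, ratio)
```

Averaging the best five rather than taking the single best result follows the reasoning above: one freak result should not set the top of the scale.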

That sort of approach isn’t entirely satisfying, as is often the case when working with match-level stats. Perhaps with shot-level or camera-based data, we could do even better. Using a similar approach to the above–define a peak, compare other performances to that peak–we could look at serve speed or effectiveness, putting returns in play, converting opportunities at net, and so on. It would be complicated, in part because opponent quality and surface speed always have the potential to impact those numbers, but I think it’s worth pursuing.

If I’m right about this–that tennis isn’t just a mental game, it’s a game heavily influenced by sustained concentration–the long term impact is on player development. Academies and coaches already spend plenty of time off court, talking tactics and utilizing insights from psychology. This would be a further step in that direction.

The mental side of tennis–and sports in general–remains a huge mess of unknowns. As the next generation of elite players tries to develop small technical and tactical improvements in order to find an edge, perhaps the mental side is the next frontier, one that would finally enable a new generation to sweep away the old.

Dominic Thiem In Pressure Service Games

Dominic Thiem has good reason to be frustrated.

On Tuesday night, Rafael Nadal and Dominic Thiem delivered the match of the 2018 US Open thus far. After nearly five hours of play, nothing separated them as they battled their way to 5-5 in a fifth-set tiebreak. Nadal finally crept ahead by the narrowest of margins, sealing a victory by the unlikely score of 0-6 6-4 7-5 6-7(4) 7-6(5).

Both players had plenty of chances, and while Rafa prepares for a semi-final against Juan Martin del Potro, Thiem will have plenty of time to mull over the opportunities he missed. In the second set, he failed to hold in both of his last two service games, including the final frame of the set, at 4-5. In the third set, he took the lead by breaking Nadal in the seventh game, but failed to follow up his advantage, losing serve when he attempted to serve it out at 5-4. Two games later, he proved unable to hold serve to stay in the set at 5-6, though he forced Rafa to four deuces before finally giving way.

These three missed chances are hardly the entire story of the match, but they stick out in memory. Overall, Thiem served quite well, allowing Nadal only one break per set. That’s 21 holds in 26 service games, an 81% hold rate, a significant achievement compared to the 66% that Nadal’s opponents have averaged against him on hard courts this year, or the paltry 52% that Rafa has allowed overall. The problem isn’t that the Austrian served badly–he didn’t–but that he weakened at the wrong times. Thiem broke Nadal more often than Rafa returned the favor–six to five–but because three of Thiem’s breaks came in the first, 6-0 set [editor’s note: !??!?!?!?] , Nadal’s six proved less costly than Thiem’s five.

Bad day, or just bad?

Is this something Thiem does, or is it just something that he did, perhaps nudged over the edge by one of the greatest returners of all time? Too often, viewers–along with many of those paid to talk and write about tennis–see the latter and assume the former. Does Thiem make a habit of serving strong in lower-leverage games and then wilting when the pressure ratchets up?

If he does, it would make him an exception. I looked at “serving for the set” opportunities a few years ago and found that ATP players serve almost exactly as well when a hold would earn them the set as otherwise. The difference is a mere 0.7%, meaning that the “difficulty” of serving for the set translates into one additional break per 143 opportunities. The effect wasn’t any more noticeable when I narrowed the focus to situations in which the player led by only a single break, like Thiem’s dropped service game at 5-4 in the third set last night.

Let’s look again, and pay specific attention to Thiem. My dataset of sequential point-by-point data, spanning most ATP tour matches between late 2011 and a few weeks ago, now covers over 400,000 service games, including 30,000 serving-for-the-set chances, over two-thirds of them with a lead of a single break. Over 1% of them have Thiem serving, so at least our sample size benefits from the Austrian’s strenuous schedule, even if it doesn’t do him any favors on the court. In other words, we’ve got a ton of data here, so if there is an effect, we should be able to find it.

Thiem’s missed chances included chances to both finish a set and stay in a set, so I’ve expanded our view to a variety of pressure situations. For each situation, I’ve calculated the hold rate for players in that position relative to their typical hold rate in those matches. (A player with a lot of serve-to-stay-in opportunities is probably on the losing end, with a lower hold rate than average, but this method should control for that.) A ratio of 1.0 means that the hold rate in the pressure situation is exactly the same as normal. A ratio above 1.0 means the hold rate is higher than usual, and below 1.0 signifies a lower hold rate–the lag many of us expect to see when the stakes get higher. Here are the ratios for a variety of situations, including serving for the set (plus a category for one-break leads), serving to stay in the set (also with one-break deficits identified), ties late in the set such as 4-4 and 5-5, and for comparison’s sake, low-pressure situations–“All Else”–which is a catch-all for everything not in the above categories.*

* Yes, it includes the famous seventh game, which I’ve previously shown isn’t particularly important, no matter what Bill Tilden said.

Situation          Examples  Hold% / Avg  
For-Set            5-4; 5-2        0.994  
- For-Set Close    5-3; 6-5        0.989    
To-Stay            4-5; 1-5        0.999  
- To-Stay Close    5-6; 3-5        0.969    
Tied Late          4-4; 5-5        0.953  
All Else           2-3; etc        1.003
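For readers who want to reproduce ratios like those in the table above, here is a rough Python sketch. It assumes a hypothetical input layout, one tuple per service game holding an identifier for the server in that match, a situation label, and whether he held. The "typical" hold rate is taken to be the server's hold rate across all of his service games in that match, which is one reasonable reading of the method described, not necessarily the exact calculation behind the table.

```python
from collections import defaultdict


def situation_ratios(games):
    """Compute hold rate relative to expectation for each situation.

    games: iterable of (server_match_id, situation, held) tuples, where
           server_match_id identifies one player in one match (hypothetical
           layout), situation is a label such as "For-Set", and held is a bool.
    Returns {situation: actual holds / expected holds}.
    """
    games = list(games)

    # Typical hold rate for each server in each match, over all his games.
    totals = defaultdict(lambda: [0, 0])  # id -> [holds, games served]
    for sm_id, _, held in games:
        totals[sm_id][0] += int(held)
        totals[sm_id][1] += 1
    typical = {sm_id: holds / n for sm_id, (holds, n) in totals.items()}

    # In each situation, compare actual holds with the holds we would
    # expect if every server simply performed at his match-level rate.
    actual = defaultdict(float)
    expected = defaultdict(float)
    for sm_id, situation, held in games:
        actual[situation] += int(held)
        expected[situation] += typical[sm_id]

    return {s: actual[s] / expected[s] for s in actual if expected[s] > 0}


sample = [
    ("thiem-m1", "For-Set", True),
    ("thiem-m1", "All Else", True),
    ("thiem-m1", "All Else", False),
    ("thiem-m1", "All Else", True),
]
print(situation_ratios(sample))
```

Summing expected holds game by game is what lets the comparison control for players who are on the losing end: each pressure game is judged against that server's own rate in that match.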

The “serving-for-the-set” effect is almost exactly the same as what I found three years ago: a drop of a bit more than half a percent. The impact of serving for the set with a single-break lead is a bit greater than I initially found, but it’s still small. We find servers struggling the most when serving to stay in the set while trailing by a single break–holding serve 3.1% less often than usual–and when serving at 4-4 and 5-5, when they hold almost 5% less frequently than expected. These are the most substantial effects I’ve seen, but keep in mind the magnitude–even a 5% difference means it only flips the outcome of one service game in twenty. It certainly matters, but it would be awfully hard to spot with the naked eye.

The one percent

How does Thiem compare? Here is the same set of ratios for him, with separate columns for his career numbers (subject to the limitations of my dataset, which includes few matches before 2012) and for single-season figures from 2016, 2017, and 2018:

Situation        Career   2016   2017   2018  
For-Set           0.996  1.049  1.011  0.966  
- For-Set Close   0.984  1.078  1.008  0.887  
To-Stay           1.030  1.160  1.027  0.940  
- To-Stay Close   0.984  1.148  0.957  0.964  
Tied Late         0.984  0.976  0.991  0.889  
All Else          1.004  0.994  1.009  1.030

Thiem’s career numbers reveal little, just a player who is a tiny bit worse in high-leverage situations, though perhaps a little less affected by the pressure than his peers. The concern is his numbers so far this year, which are way down across the board. Each one of the categories represents a relatively small sample–for example, I have only 42 games in which he was serving for the set with a single break advantage–but taken together, the set of sub-1.0 ratios doesn’t point in an encouraging direction. We could never have forecast before last night’s match that Thiem would serve so well in general but so much weaker in the clutch, but there were subtle hints lurking in his 2018 performance.

A puzzle

I want to show you the same set of data, but for another player. In one way, it’s the opposite of Thiem’s: many more breaks in pressure situations over the course of the player’s career, but the opposite trend in the last few years, pointing toward more service holds:

Situation        Career   2016   2017   2018  
For-Set           0.929  0.931  1.200  1.077  
- For-Set Close   0.910  0.895  1.333  1.000  
To-Stay           1.026  1.077  1.083  1.061  
- To-Stay Close   0.929  1.100  1.167  1.044  
Tied Late         0.905  1.050  1.000  1.048  
All Else          1.011  1.013  1.024  1.013

Any ideas? It’s a bit of a trick question–you’re looking at the tour serving against Rafa. From 2012-15, Nadal absolutely shut down opposing servers starting at about 4-4. (He wasn’t as good–relative to his average, anyway–late in sets on his own serve.) Very few players or seasons show effects of greater than 5% in either direction, but Rafa’s opponents saw their hold rate dip by more than twice that in some seasons. Yet the story has been different for the last year or two, with Rafa himself becoming the underperformer in his late-set return games.

Again, we shouldn’t read too much into a single year of this data: The sample size is an issue, especially for a top player’s return games, because not many guys find themselves serving for a set against him. But had we looked at Nadal’s return record in pressure situations alongside Thiem’s recent serve performance, it would have made for a more complicated picture, one less likely to predict some of the crucial moments in last night’s match. In any given contest, there are simply too few key games for us to forecast their outcome with any success, especially when a let cord, an untimely distraction, or a missed line call could reverse the result. But that doesn’t mean we shouldn’t try to understand them. Unlucky, unclutch, or whatever else, Thiem could have flipped the outcome of the entire match by holding just one of those three games. The stakes could hardly be higher.

The Victims of Tiebreak Pressure

The conventional wisdom is that tiebreaks are all about two things: serves and mental strength. Despite my previous efforts, pundits continue to promote the idea that big servers have an edge in the first-to-seven shootout. Less contestably, experts remind us that a lot is at stake in a tiebreak, and the player who can withstand the pressure will prevail.

Back in 2012, I wrote a few articles about tiebreaks, using a year’s worth of data from men’s matches at grand slams to discover that servers hold less of an advantage during shootouts. On average, more points go the direction of the returner. I also found that very few players exceeded expectations in tiebreaks–that is, a player’s performance in non-tiebreak situations did a very good job of predicting his chances of winning tiebreaks. Last, I determined that big servers were not any more likely than their weaker-serving peers to be among the small group of players who boasted stronger-than-expected results in shootouts.

I’ve dug into a much larger dataset to revisit the first of these conclusions. My collection of sequential point-by-point data allows us to look at over 15,000 tiebreaks from the ATP tour alone, compared to fewer than 400 that I used in my earlier study. The broader and deeper sample will allow us to go beyond general statements about serve or return advantages and look at how particular players fare in the jeu décisif.

Serving under pressure

First, the basics. In these 15,000 tour-level breakers, servers won 3.4% fewer points than they did in non-tiebreak situations. This is an apples-to-apples comparison: For each player in each match, I used his rate of service points won (SPW) on non-tiebreak points and his SPW on tiebreak points. To get the aggregate figure, I calculated the average of all player-matches, weighted by the number of tiebreaks in the match.*

* Initially, I weighted by the number of tiebreak points, thinking that, say, a 16-point tiebreak should be weighted more than an 8-point breaker. That gave me results that pointed to a huge improvement in SPW in tiebreaks … because of selection bias. When a tiebreak goes beyond 12 points, it often means that both players are serving well. Thus, when two servers are hot, they must play more points, increasing their weight in this calculation. It’s always possible that an extra-long tiebreak results from a lot of return points won, but in the serve-leaning men’s game, it is the much less likely scenario.
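The aggregation described above, including the weighting choice from the footnote, might look something like the following Python sketch. The tuple layout is a hypothetical one, and averaging the per-match ratio of tiebreak SPW to non-tiebreak SPW is an assumption about what exactly is being averaged.

```python
def aggregate_spw_ratio(player_matches):
    """Weighted average of tiebreak SPW relative to non-tiebreak SPW.

    player_matches: iterable of tuples
        (tb_won, tb_played, reg_won, reg_played, n_tiebreaks)
    covering one player in one match (hypothetical layout):
        tb_won / tb_played    serve points won / played in tiebreaks
        reg_won / reg_played  serve points won / played outside tiebreaks
        n_tiebreaks           tiebreaks in the match, used as the weight
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for tb_won, tb_played, reg_won, reg_played, n_tiebreaks in player_matches:
        tb_spw = tb_won / tb_played
        reg_spw = reg_won / reg_played
        # Weight by tiebreaks, not tiebreak points: long tiebreaks tend to
        # be the ones where both servers are hot, which would bias the result.
        weighted_sum += n_tiebreaks * (tb_spw / reg_spw)
        total_weight += n_tiebreaks
    return weighted_sum / total_weight


sample = [
    (3, 6, 60, 100, 1),   # 50% in the tiebreak vs 60% elsewhere
    (4, 5, 80, 100, 2),   # 80% in both, from a match with two tiebreaks
]
print(aggregate_spw_ratio(sample))
```

A result below 1.0 corresponds to servers winning fewer points in tiebreaks than in the rest of the match.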

The 3.4% decrease in serve points won means that, for instance, a server who wins 65% on his own deal in the twelve games before the tiebreak will fall to 62.8% in the breaker. Fortunately for him, his opponent probably suffers the same drop. Benefits only accrue to those players who either maintain or increase their SPW after the twelfth game of the set.

It makes sense that servers suffer a bit under the pressure. In the men’s game, at least, the returner has little to lose. Since tiebreaks are thought to be serve-dominated, every return point won seems like a lucky break. Perhaps if players knew the real numbers, the mental game would shift back in their favor. They wouldn’t have to focus on becoming superhuman, unbreakable servers; they would need only to maintain the level that got them into the tiebreak in the first place.

The less-breakables

When we split things up by player, the dataset conveniently spits out 50 players with at least 100 tiebreaks. (Well, 49, but Nicolas Mahut was next on the list, so we’ll include him also.) The guys who play the most tiebreaks are either good, lucky, or both, because they’ve managed to stick around and play so many tour matches, so the average player on this list is a little better than the average player in general.

Here are the top and bottom ten in our group of the 50 most prolific tiebreak players. The first stat, “SPW Ratio,” is the ratio between tiebreak SPW and non-tiebreak SPW, so a higher number means that the player wins more serve points in tiebreaks than otherwise. Because that stat awkwardly centers on 0.966 (the 3.4% decrease), I’ve shown another stat, called here “Ratio+,” with all numbers normalized so the average is 1.0. Again, a higher number means more serve points won in tiebreaks. The 1.09 shared by Andy Murray and John Isner at the top of the list means that each wins 9% more serve points in tiebreaks than expected, where “expected” is defined as the tour-average 3.4% drop.

Player               TBs  SPW Ratio  Ratio+  
Andy Murray          141       1.05    1.09  
John Isner           368       1.05    1.09  
Nick Kyrgios         109       1.05    1.08  
David Ferrer         132       1.01    1.05  
Alexandr Dolgopolov  116       1.01    1.05  
Lukas Rosol          100       1.01    1.05  
Jo-Wilfried Tsonga   188       1.01    1.04  
Roger Federer        175       1.01    1.04  
Nicolas Mahut         94       1.01    1.04  
Benoit Paire         139       1.00    1.04  
…                                            
Denis Istomin        120       0.94    0.98  
Viktor Troicki       104       0.94    0.97  
Tomas Berdych        181       0.93    0.96  
Nicolas Almagro      118       0.93    0.96  
Fernando Verdasco    156       0.93    0.96  
Robin Haase          123       0.93    0.96  
Adrian Mannarino     101       0.91    0.95  
Jiri Vesely          105       0.90    0.93  
Ryan Harrison        100       0.89    0.92  
Pablo Cuevas         100       0.87    0.90
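The link between the two stat columns in the table above, and the earlier 65%-to-62.8% example, comes down to a pair of one-line calculations. The Python sketch below assumes that the normalization is simply division by the tour-average ratio of 0.966; the function names are hypothetical.

```python
TOUR_AVG_SPW_RATIO = 0.966  # the tour-wide 3.4% drop in serve points won


def ratio_plus(spw_ratio, tour_avg=TOUR_AVG_SPW_RATIO):
    """Normalize an SPW Ratio so that the tour average equals 1.0."""
    return spw_ratio / tour_avg


def expected_tiebreak_spw(regular_spw, tour_avg=TOUR_AVG_SPW_RATIO):
    """Serve points won in a tiebreak for a server hit by the average drop."""
    return regular_spw * tour_avg


print(round(ratio_plus(1.05), 2))             # top of the list
print(round(ratio_plus(0.87), 2))             # bottom of the list
print(round(expected_tiebreak_spw(0.65), 3))  # the 65% server
```

Because the SPW Ratio column is itself rounded to two decimals, recomputing Ratio+ from the printed values can differ from the table by 0.01 for some rows.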

Most of the big names who aren’t shown above (Rafael Nadal, Novak Djokovic, Juan Martin del Potro, Milos Raonic) are a bit better than average, with a Ratio+ stat around 1.02. I’m not surprised to see Isner or Roger Federer near the top, as those two have traditionally won more tiebreaks than expected. Less predictable is the chart-topping Andy Murray, who apparently manages to raise his serve game in breakers as well as anyone else.

Warning: Negative result ahead

Murray, Isner, and Federer have consistently served well in tiebreaks over the last seven years, the time span of this dataset. But even they have had seasons where they just barely edged out the tour average: Murray was 9% better than his peers in 2013 and 10% better in 2016, serving better in tiebreaks than non-tiebreaks by a 5% and 6% margin, respectively, but in between, he was merely average. Isner, who was at least 10% better than tour average in each season from 2012 to 2015, served slightly worse in tiebreaks than in non-tiebreaks in 2016, and is just barely better than average in his first fifty shootouts of 2018.

These are small margins, and most players do not sustain positive or negative trends from year to year. To take another example, from 2014 to 2017, Raonic recorded single-season Ratio+ numbers of 1.11, 0.92, 1.00, and 0.98. I wouldn’t recommend putting any money on Milos’s full-season 2018 figure, let alone his tiebreak serve success in 2019.

Despite the evocative appearance of Isner, Federer, and Murray at the top of the list and some players considered to be mentally weaker near the bottom, there is no evidence that this is a skill, something that players will predictably repeat, rather than luck. As I did in my match point study earlier this week, I divided each player’s tiebreaks randomly into two groups. If tiebreak serve prowess were a skill, a player’s SPW Ratio in one random group would be reasonably predictive of his corresponding number in the other group. It is not to be: No matter where we set the minimum number of tiebreaks for inclusion, there is no correlation between the two groups.
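The random split-half check described above can be sketched as follows. This is an illustrative Python version under assumptions: each tiebreak is represented by a hypothetical tuple of serve points won and played in the tiebreak plus serve points won and played in the rest of that match, and the correlation is a plain Pearson coefficient computed by hand.

```python
import math
import random


def spw_ratio(records):
    """Tiebreak SPW divided by non-tiebreak SPW over a group of tiebreaks.

    records: list of (tb_won, tb_played, reg_won, reg_played) tuples.
    """
    tb_won = sum(r[0] for r in records)
    tb_played = sum(r[1] for r in records)
    reg_won = sum(r[2] for r in records)
    reg_played = sum(r[3] for r in records)
    return (tb_won / tb_played) / (reg_won / reg_played)


def pearson(xs, ys):
    """Pearson correlation; assumes both lists vary and have equal length."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def split_half_correlation(by_player, min_tiebreaks=100, seed=0):
    """Randomly halve each player's tiebreaks and correlate the two halves.

    by_player: {player name: list of tiebreak records} (hypothetical layout).
    """
    rng = random.Random(seed)
    first, second = [], []
    for records in by_player.values():
        if len(records) < min_tiebreaks:
            continue  # skip players below the inclusion threshold
        shuffled = list(records)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        first.append(spw_ratio(shuffled[:half]))
        second.append(spw_ratio(shuffled[half:]))
    return pearson(first, second)
```

If serving well in tiebreaks were a repeatable skill, the value returned by `split_half_correlation` on real data would be clearly positive; a value near zero is what the negative result above corresponds to.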

If you’ve gone through many of my posts, you’ve read something like this before. Handling the pressure and serving well in tiebreaks seems like something that certain players will do well and others will not. This overall finding isn’t sufficient proof to say that no players have tendencies in either direction–most tour pros simply don’t contest enough tiebreaks over their entire careers to know that for sure. But with possible exceptions like Isner, Murray, Federer, and the unfortunate Pablo Cuevas, players converge around the tour average, which means their service game becomes a little less effective in breakers. If someone posts a particularly high or low SPW Ratio for a season, it probably means luck figured heavily in their results. If you’re going to bet on something using these numbers, the smart money suggests that most players will revert to the mean.

Simona Halep’s Match Points

In the second-set tiebreak of Sunday’s Cincinnati final, Simona Halep reached match point against Kiki Bertens. She failed to convert, then Bertens claimed the tiebreak, and the third set–and the championship–went the way of the Dutchwoman. It was a bit of painful deja vu for Halep fans, who watched the top-ranked player reach match point against Su Wei Hsieh at Wimbledon only to miss her chance and crash out in the third round.

Halep has a reputation as a bit of a weak closer–not just match points, but set points and, more generally, service games with the set or match on the line. Her overall ability to finish matches is beyond the scope of a single post, but we can start by biting off the smaller chunk of, specifically, her performance on match points, and how that compares to the rest of the WTA.

Let’s start with the basics. For everyone, reaching match point is (obviously!) a really good sign that she’ll go on to win the match. Across about 16,000 WTA matches since 2011 for which I have sequential point-by-point data, players who hold match point end up winning the match a bit more than 97% of the time. That doesn’t mean that they convert on the first try, or even in the game or set of their first opportunity, but even when conversion is elusive, players manage to generate more chances until they finish the job.

If Simona really is a weak closer, we’ll need to look elsewhere for evidence. In the matches for which I possess the point-by-point sequence*, there are 251 contests in which Halep held a match point, stretching between the end of 2011 and this month’s Rogers Cup in Montreal. Of those, she eventually converted a match point 250 times. That is, with the exception of the Wimbledon match against Hsieh, she didn’t lose any matches in which she was a point away from victory.

* I don’t have the point-by-point sequence for every Halep match, but I have most of them, and the missing ones are random. The same applies to just about every WTA player. Some of the raw data is available here; I’m hoping to update with 2017 and 2018 data in the near future.

Compared to the best players, this level of MP conversion doesn’t even stand out. Among the 50 women with at least 100 matches in which they held match point, five–Serena Williams, Victoria Azarenka, Andrea Petkovic, Ekaterina Makarova, and Elena Vesnina–always converted, if not always on the first try. (Again, I’m missing some matches, but that doesn’t take away from the fact that in a random sample of 259 matches, Serena remains perfect.) Until Sunday’s Cincinnati final, Halep was one of eight more–with Petra Kvitova, Maria Sharapova, and Ana Ivanovic, among others–who failed to convert only once.

Situational performance

It’s no accident that the most dominant names in tennis are near the top of that list. Yes, the best players are most likely to win at match point, but just as important, the best players are more likely to earn several opportunities. Deep in a tiebreak, one missed chance can represent the final hope, but most of the time when Halep, Serena, or someone else of their ilk fails to convert an opportunity, they’re still leading by, say, a set and a break, making it easy to generate more chances.

That leads us to another question: How do players perform on match point itself? Does the pressure lead to fewer points won, compared to non-MP serve and return points? Or do other factors, like momentum or crowd support, cause players to do even better when one point away from victory?

It turns out that there’s no single answer; the results are a bit different depending on whether the player holding match point is serving or returning. When a player is serving to finish off a match, she is slightly less likely to win the point, compared to her serve performance up to that point. It’s not a big difference–a bit less than a 3% drop in the rate of serve points won–but it is persistent across several years of WTA results. When players are one point away from victory but are returning, there is no match-point effect. They win return points at the same rate regardless of whether a handshake is imminent.

Match points are almost evenly distributed between serve and return points–on the WTA tour, about 55% are serve points, leaving 45% return points. Thus, given the 3% drop on serve performance and the lack of change on return points, players win approximately 1.5% fewer points when one step away from victory than otherwise. One player who almost exactly parallels the average is Caroline Wozniacki–in 271 match-point matches and 474 match points, she won those MPs at a rate 1.7% lower than non-MPs.
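The blended figure is a weighted average of the serve and return effects. Here is a quick check using the rounded numbers from the text; the variable names are illustrative only.

```python
# Share of match points played on serve vs. return (rounded, from the text).
serve_share, return_share = 0.55, 0.45
# Approximate drop in points won on match point, in percent (from the text).
serve_drop, return_drop = 3.0, 0.0

# Weighted average of the two effects.
overall_drop = serve_share * serve_drop + return_share * return_drop
print(round(overall_drop, 2))
```

With a serve drop of exactly 3% the blend comes out a little above 1.5%; since the measured serve drop is a bit less than 3%, "approximately 1.5%" is a fair summary.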

Some of the players who almost always win their match-point matches aren’t any better than average when we look at individual points. For instance, Sharapova wins MPs at a rate 1.2% lower than non-MPs, and Azarenka’s success rate drops by 1.4%. Dominika Cibulkova won 198 of the 201 match-point matches in my dataset despite her success rate falling by a whopping seven percent.

Halep, however, doesn’t fit in that category. In her 251 match-point matches, she has held 420 individual match points, which she has won at a rate 4.4% higher than her non-MP rates in the same set of matches. Few players are better, though a handful are overwhelmingly so, such as Kvitova at +9.0%, and Vesnina at +13.9%. The vast majority of women are within a few percentage points of neutral: They win match points, whether serve or return, about as often as they win non-match-points.

Random results

These numbers tell us only one thing: what has happened in the past. It is tempting to use them to make predictions, or perhaps lay down a sizable wager the next time Vesnina is a point away from victory. But when most players are so close to neutral, it’s a warning that much of what we’re looking at may be random.

If players have consistent tendencies in match point situations, we would be able to identify that in the data. For instance, we might see that Kvitova converts match points at a high rate in each individual season. Since the single-season totals make for sometimes small samples, I took a slightly different approach. For players with at least 60 match-point matches, I randomly divided their matches into two separate groups, and determined how their performance at MP compared to their success rate on other points. Again, if this were a real skill, we would expect that players would be roughly the same in each of their two random groups–better than usual on MP in both groups, or worse.

Alas, for this population of 80 players with sufficient match-point samples, there is no correlation at all. If women have consistent, predictable tendencies to outperform or underperform in match-point opportunities, these inclinations are either extremely small, or they don’t persist over several years.
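The split-half check described above can be sketched roughly as follows. This is a minimal illustration, not the code behind the result: the input layout (one tuple of counts per match), the function names, and the choice to measure the match-point effect as a relative difference are all assumptions.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def mp_effect(matches):
    """Relative gap between MP win rate and non-MP win rate over a group of matches.
    Each match is a hypothetical tuple (mp_won, mp_played, other_won, other_played)."""
    mp_won, mp_played, other_won, other_played = (sum(col) for col in zip(*matches))
    return (mp_won / mp_played) / (other_won / other_played) - 1

def split_half_r(matches_by_player, seed=0):
    """Randomly split each player's matches in two and correlate the two MP effects."""
    rng = random.Random(seed)
    first, second = [], []
    for matches in matches_by_player.values():
        shuffled = list(matches)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        first.append(mp_effect(shuffled[:half]))
        second.append(mp_effect(shuffled[half:]))
    return pearson_r(first, second)
```

If match-point performance were a persistent skill, `split_half_r` would come out clearly positive; a value near zero is what "no correlation at all" looks like.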

This is a familiar refrain when looking at specific situations in tennis matches. Our hyperactive, pattern-seeking brains find it easy to identify apparent tendencies, but in general, players win points at about the same rate regardless of the context. Over the medium term, like the half-decade represented by my point-by-point dataset, some players will stick out, like Kvitova, Vesnina, and to a lesser extent, Halep. But past results are hardly a guarantee of future match-point performance. The smart prediction for any player’s upcoming results on match point is that she’ll do exactly as well as she does the rest of the time. It’s a rather boring conclusion. Thankfully, the match-point situations themselves are usually exciting enough on their own.

Measuring a Season’s Worth of Luck

In Toronto last week, Stefanos Tsitsipas was either very clutch, very lucky, or both. Against Alexander Zverev in Friday’s quarter-final, he won fewer than half of all points, claiming only 56.7% of his service points, compared to Zverev’s 61.2%. The next day, beating Kevin Anderson in the semi-final in a third-set tiebreak, he again failed to win half of total points, winning 69.9% of his service points against Anderson’s 75.5%.

Whether the Greek prospect played his best on the big points or benefited from a hefty dose of fortune, this isn’t sustainable. Running those serve- and return-points-won (SPW and RPW) numbers through my win probability model, we find that–if you take luck and clutch performance out of the mix–Tsitsipas had a 27.8% chance of beating Zverev and a 26.5% chance of beating Anderson. These two contests–perhaps the two days that have defined the youngster’s career up to this point–are the very definition of “lottery matches.” They could’ve gone either way, and over a long enough period of time, they’ll probably even out.

Or will they? Are some players more likely to come out on top in these tight matches? Are they consistently–dare I say it–clutch? Using this relatively simple approach of converting single-match SPW and RPW rates into win probabilities, we can determine which players are winning more or less often than they “should,” and whether it’s a skill that some players consistently display.
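For readers who want to try the SPW-to-win-probability conversion, here is a minimal sketch of one common approach. It is not necessarily the model behind the figures in this post. It assumes every point on a given player's serve is won independently at a fixed rate, best-of-three sets, and a tiebreak at 6-6 in every set; all function names are illustrative.

```python
from functools import lru_cache
from math import comb

def hold_prob(p):
    """P(server wins a game) if each service point is won independently with prob p."""
    q = 1 - p
    deuce = p * p / (1 - 2 * p * q)            # P(server wins from deuce)
    return p**4 * (1 + 4 * q + 10 * q * q) + 20 * p**3 * q**3 * deuce

def tiebreak_prob(pa, pb):
    """P(A wins a first-to-7 tiebreak); pa and pb are each player's SPW."""
    a, b = pa, 1 - pb                          # A's chance on own serve / on B's serve

    @lru_cache(maxsize=None)
    def f(x, y):
        if x >= 7 and x - y >= 2:
            return 1.0
        if y >= 7 and y - x >= 2:
            return 0.0
        if x == y == 6:                        # from 6-6, points come in pairs, one per server
            return a * b / (a * b + (1 - a) * (1 - b))
        a_serving = ((x + y + 1) // 2) % 2 == 0
        w = a if a_serving else b
        return w * f(x + 1, y) + (1 - w) * f(x, y + 1)

    return f(0, 0)

def set_prob(pa, pb):
    """P(A wins a tiebreak set), with A serving the first game."""
    ha, hb = hold_prob(pa), hold_prob(pb)

    @lru_cache(maxsize=None)
    def f(x, y):
        if (x >= 6 and x - y >= 2) or x == 7:
            return 1.0
        if (y >= 6 and y - x >= 2) or y == 7:
            return 0.0
        if x == y == 6:
            return tiebreak_prob(pa, pb)
        w = ha if (x + y) % 2 == 0 else 1 - hb  # A holds, or A breaks B
        return w * f(x + 1, y) + (1 - w) * f(x, y + 1)

    return f(0, 0)

def match_prob(pa, pb, best_of=3):
    """P(A wins the match), treating sets as independent."""
    s = set_prob(pa, pb)
    need = best_of // 2 + 1
    return sum(comb(need - 1 + k, k) * s**need * (1 - s)**k for k in range(need))

# Tsitsipas vs. Zverev in Toronto: SPW of 56.7% against Zverev's 61.2%.
print(round(match_prob(0.567, 0.612), 3))
```

Because the assumptions here may differ from the model used for the numbers above, the output should land in the same neighborhood as the quoted probability rather than match it exactly.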

Odds in the lottery

Let’s start with some examples. When one player wins more than 55% of points, he is virtually guaranteed to win the match. Even at 53%, his chances are extremely good. Still, a lot of matches–particularly best-of-threes on fast surfaces–end up in the range between 50% and 53%, and that’s what is most interesting from this perspective.

Here are Tsitsipas’s last 16 matches, along with his SPW and RPW rates and the implied win probability for each:

Tournament  Round  Result  Opponent     SPW    RPW  WinProb  
Toronto     F      L       Nadal      62.9%  21.1%       3%  
Toronto     SF     W       Anderson   69.9%  24.5%      27%  
Toronto     QF     W       A Zverev   56.7%  38.8%      28%  
Toronto     R16    W       Djokovic   77.2%  32.0%      85%  
Toronto     R32    W       Thiem      83.3%  30.2%      93%  
Toronto     R64    W       Dzumhur    82.8%  35.0%      98%  
Washington  SF     L       A Zverev   54.7%  25.5%       1%  
Washington  QF     W       Goffin     71.2%  32.7%      67%  
Washington  R16    W       Duckworth  80.0%  37.5%      98%  
Washington  R32    W       Donaldson  59.5%  45.5%      74%  
Wimbledon   R16    L       Isner      72.5%  18.0%      10%  
Wimbledon   R32    W       Fabbiano   64.0%  55.9%     100%  
Wimbledon   R64    W       Donaldson  70.1%  40.9%      95%  
Wimbledon   R128   W       Barrere    71.5%  39.0%      94%  
Halle       R16    L       Kudla      59.7%  28.8%       8%  
Halle       R32    W       Pouille    78.3%  42.9%      99%

More than half of the matches are at least 90% or no more than 10%. But that leaves plenty of room for luck in the remaining matches. Thanks in large part to his last two victories, the win probability numbers add up to only 9.8 wins, compared to his actual record of 12-4. All four losses were rather one-sided, but in addition to the Toronto matches against Zverev and Anderson, his wins against David Goffin in Washington and, to a lesser extent, Novak Djokovic in Toronto, were far from sure things.
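The expected-win total is simply the sum of the implied win probabilities. A short check using the rounded percentages from the table above (the variable names are illustrative):

```python
# (implied win probability in percent, actual result) for the 16 matches in the
# table, most recent first. True means Tsitsipas won.
matches = [(3, False), (27, True), (28, True), (85, True), (93, True), (98, True),
           (1, False), (67, True), (98, True), (74, True), (10, False), (100, True),
           (95, True), (94, True), (8, False), (99, True)]

expected_wins = sum(prob for prob, _ in matches) / 100
actual_wins = sum(won for _, won in matches)
ratio = actual_wins / expected_wins            # above 1 means more wins than "deserved"

print(expected_wins, actual_wins, round(ratio, 2))   # 9.8 12 1.22
```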

In the last two months, Stefanos has indeed been quite clutch, or quite lucky.

Season-wide views

When we expand our perspective to the entire 2018 season, however, the story changes a bit. In 48 tour-level matches through last week’s play (excluding retirements), Tsitsipas has gone 29-19. The same win probability algorithm indicates that he “should” have won 27.4 matches–a difference of 1.6 matches, or about five percent, which is less than the gap we saw in his last 16. In other words, for the first two-thirds of the season, his results were either unlucky or un-clutch, if only slightly. At the very least, the aggregate season numbers are less dramatic than his recent four-event run.

For two-thirds of a season, a five percent gap between actual wins and win-probability “expected” wins isn’t that big. For players with at least 30 completed tour-level matches this season, the magnitude of the clutch/luck effect extends from a 20% bonus (for Pierre Hugues Herbert) to a 20% penalty (for Sam Querrey, which he reduced a bit by beating John Isner in Cincinnati on Monday despite winning less than 49% of total points). Here are the ten extremes at each end, of the 59 ATPers who have reached the threshold so far in 2018:

Player                 Matches  Wins  Exp Wins  Ratio  
Pierre Hugues Herbert       30    16      13.2   1.22  
Nikoloz Basilashvili        34    17      14.0   1.21  
Frances Tiafoe              39    24      20.0   1.20  
Evgeny Donskoy              30    13      10.9   1.19  
Grigor Dimitrov             34    20      17.1   1.17  
Lucas Pouille               31    16      13.7   1.17  
Gael Monfils                34    21      18.3   1.15  
Daniil Medvedev             34    18      15.8   1.14  
Marco Cecchinato            33    19      16.7   1.14  
Maximilian Marterer         32    17      15.2   1.12  
…                                                      
Leonardo Mayer              37    19      20.1   0.95  
Guido Pella                 37    20      21.2   0.95  
Marin Cilic                 38    27      28.8   0.94  
Novak Djokovic              37    27      29.3   0.92  
Marton Fucsovics            30    16      17.5   0.92  
Joao Sousa                  36    18      19.8   0.91  
Dusan Lajovic               34    17      18.7   0.91  
Fernando Verdasco           43    22      24.5   0.90  
Mischa Zverev               39    18      20.7   0.87  
Sam Querrey                 30    15      18.8   0.80

A difference of three or four wins, as many of these players display between their actual and expected win totals, is more than enough to affect their standing in the rankings. The degree to which it matters depends enormously on which matches they win or lose, as Tsitsipas’s semi-final defeat of Anderson has a much greater impact on his point total than, say, Querrey’s narrow victory over Isner does for his. But in general, the guys at the top of this list are ones who have seen unexpected ranking boosts this season, while some of the guys at the bottom have gone the other way.

The last full season

Let’s take a look at an entire season’s worth of results. Last year, a few players–minimum 40 completed tour-level matches–managed at least a 20% luck/clutch bonus, but with the surprising exception of Daniil Medvedev, none of them have repeated the feat so far in 2018:

Player                 Matches  Wins  Exp Wins  Ratio  
Donald Young                43    21      16.2   1.30  
Fabio Fognini               58    35      28.5   1.23  
Jack Sock                   55    36      29.8   1.21  
Jiri Vesely                 45    22      19.3   1.14  
Daniil Medvedev             43    22      19.7   1.11  
John Isner                  57    36      32.3   1.11  
Damir Dzumhur               56    33      29.7   1.11  
Gilles Muller               48    30      27.1   1.11  
Alexander Zverev            74    53      48.1   1.10  
Juan Martin del Potro       53    37      33.6   1.10

A few of these players have had solid seasons, but posting a good luck/clutch number in 2017 is hardly a guarantee of a repeat, as the likes of Donald Young, Jack Sock, and Jiri Vesely can attest. Here is the same list, with 2018 luck/clutch ratios shown alongside last year’s figures:

Player                 2017 Ratio  2018 Ratio     
Donald Young                 1.30        0.89  *  
Fabio Fognini                1.23        1.10     
Jack Sock                    1.21        0.68  *  
Jiri Vesely                  1.14        1.08  *  
Daniil Medvedev              1.11        1.14     
John Isner                   1.11        0.96     
Damir Dzumhur                1.11        1.01     
Gilles Muller                1.11        0.84  *  
Alexander Zverev             1.10        1.06     
Juan Martin del Potro        1.10        1.07

* fewer than 30 completed tour-level matches

The average luck/clutch ratio of these ten players has fallen to a bit below 1.0.

Unsustainable luck

You can probably see where this is going. I generated full-season numbers for each year from 2008 to 2017, and identified those players who appeared in the lists for adjacent pairs of seasons. If luck/clutch ratio is a skill–that is, if it’s more clutch than luck–guys who post good numbers will tend to do so the following year, and those who post lower numbers will be more likely to remain low.

Across 325 pairs of player-seasons, that’s not what happened. There is almost no relationship between one year of luck/clutch ratio and the next. The r^2 value–the share of the variation in one season’s ratio that is explained by the previous season’s–is 0.07, meaning that the year-to-year numbers are close to random.
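The year-to-year test can be sketched as follows, assuming the season ratios are stored in a dictionary keyed by player and year. That layout and both function names are hypothetical; the sketch pairs each player-season with the same player's following season and computes r^2 across the pairs.

```python
def adjacent_pairs(ratios):
    """ratios maps (player, year) -> luck/clutch ratio.
    Returns (this_year, next_year) pairs for players present in both seasons."""
    return [(r, ratios[(player, year + 1)])
            for (player, year), r in ratios.items()
            if (player, year + 1) in ratios]

def r_squared(pairs):
    """Squared Pearson correlation between the first and second element of each pair."""
    xs, ys = zip(*pairs)
    n = len(pairs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov * cov / (vx * vy)
```

A skill that carried over from season to season would push `r_squared` well above zero; a value as low as 0.07 leaves most of the variation unexplained.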

Across sports, analysts have found plenty of similar results, and they are often quick to pronounce that “clutch doesn’t exist,” which leads to predictable rejoinders from the laity that “of course it does,” and so on. It’s boring, and I’m not particularly interested in that debate. What this specific finding shows is:

This type of luck, defined as winning more matches than implied by a player’s SPW and RPW in each match, is not sustainable.

What Tsitsipas accomplished last weekend in Toronto was “clutch” by almost any definition. What this finding demonstrates is that a few such performances–or even a season’s worth of them–don’t make it any more likely that he’ll do the same next year. Or, another possibility is that the players who stick at the top level of professional tennis are all clutch in this sense, so while Tsitsipas might be quite mentally strong in key moments, he’ll often run up against players who have similar mental skills, and he won’t be able to consistently win these close matches.

If Stefanos is to maintain a ranking in the top 20, which seems plausible, he’ll probably need to win more serve and return points than he has so far. Fortunately for him, he’s still almost eight years younger than his typical peer, so he has plenty of time to improve. The occasional lottery matches that tilt his way will need to be mere bonuses, not the linchpin of his strategy to reach the top.

Smaller Swings In Big Moments

Italian translation at settesei.it

Despite the name, unforced errors aren’t necessarily bad. Sometimes, the right tactic is to play more aggressively, and in order to hit more winners, most players will commit more errors as well. Against some opponents, increasing the unforced error count–as long as there is a parallel improvement in winners or other positive point-ending shots–might be the only way to win.

Last week, I showed that one of the causes of Angelique Kerber’s first-round loss was her disproportionate number of errors in big moments. But as my podcasting partner Carl Bialik pointed out, that isn’t the whole story. If Kerber played more aggressively on the most important points–one possible cause of more errors–it might be the case that her winner rate was higher, as well. Since the 6-2 6-2 scoreline was so heavily tilted against her, it was a safe bet that Kerber recorded more high-leverage errors than winners. Still, Carl makes a valid point, and one worth testing.

To do so, let’s revisit the data: 500 women’s singles matches from the last four majors and the first four rounds of this year’s French Open. By measuring the importance of each point, we can determine the average leverage (LEV) of every point in each match, along with the average leverage of points which ended with a player hitting an unforced error, or a winner. Last week, we found that Kerber’s UEs in her first-round loss had an average LEV of 5.5%, compared to a LEV of 3.8% on all other points. For today’s purposes, let’s use match averages as a reference point: Her average UE LEV of 5.5% also compares unfavorably to the overall match average LEV of 4.1%.

What about winners? Kerber’s 15 winners came on points with an average LEV of 3.9%, below the match average. Case closed: On more important points, Kerber was more likely to commit an error, and less likely to hit a winner.

Across the whole population, players hit more errors and fewer winners in crucial moments, but only slightly. Points ending in errors are about one percent more important than average (percent, not percentage point, so 4.14% instead of 4.1%), and points ending in winners are about two percent less important than average. In bigger moments, players increase their winner rate about 39% of the time, and they improve their W-UE ratio about 45% of the time. Point being, there are tour-wide effects on more important points, but they are quite small.

Of course, Kerber’s first-round upset isn’t indicative of how she has played at Slams in general. In my article last week, I mentioned the four players who did the best job of reducing errors at big moments: Kerber, Agnieszka Radwanska, Timea Bacsinszky, and Kiki Bertens. Kerber and Radwanska both hit fewer winners on big points as well, but Bacsinszky and Bertens manage a perfect combination, hitting slightly more winners as the pressure cranks up. Among players with more than 10 Slam matches since last year’s French, Bacsinszky is the only one to hit winners on more important points than her unforced errors over 75% of the time.

Compared to her peers, Kerber’s big-moment tactics are remarkably passive. The following table shows the 21 women for whom I have data on at least 13 matches. “UE Rt.” (“UE Ratio”) is similar to the metric I used last week, comparing the average importance of points ending in errors to average points; “W Ratio” is the same, but for points ending in winners, and “W+UE Ratio” is–you guessed it–a (weighted) combination of the two. The combined measure serves as a rough approximation of aggression on big points, where ratios below 1 are more passive than the player’s typical tactics and ratios above 1 are more aggressive.

Player                     M  UE Rt.  W Rt.  W+UE Rt.  
Angelique Kerber          20    0.92   0.85      0.88  
Alize Cornet              13    0.92   0.87      0.94  
Agnieszka Radwanska       17    0.91   0.95      0.95  
Simona Halep              19    0.93   0.94      0.95  
Samantha Stosur           13    0.95   0.98      0.96  
Timea Bacsinszky          14    0.89   1.02      0.97  
Elina Svitolina           15    1.02   0.95      0.97  
Karolina Pliskova         18    0.97   0.98      0.97  
Caroline Wozniacki        14    0.93   1.00      0.97  
Johanna Konta             13    1.00   0.97      0.98  
Caroline Garcia           14    0.94   1.02      0.98  
Svetlana Kuznetsova       17    0.96   0.98      0.99  
Garbine Muguruza          20    1.02   0.94      0.99  
Venus Williams            25    1.00   0.97      0.99  
Elena Vesnina             13    0.96   1.03      0.99  
Anastasia Pavlyuchenkova  15    1.03   0.99      0.99  
Coco Vandeweghe           13    1.08   0.95      1.01  
Madison Keys              13    1.01   1.02      1.01  
Serena Williams           27    0.99   1.05      1.02  
Carla Suarez Navarro      14    1.00   1.14      1.05  
Dominika Cibulkova        14    1.11   1.03      1.07

Kerber’s combined measure stands out from the pack. Her point-ending shots–both winners and errors, but especially winners–occur disproportionately on less important points, and the overall effect is double that of the next most passive big-moment player, Alize Cornet. Every other player is close enough to neutral that I would hesitate before drawing any conclusions about their pressure-point tactics.

Even when Kerber wins, she does so with effective defense at key points. In only two of her last 20 matches at majors did her winners occur on particularly important points. (Incidentally, one of those two was last year’s US Open final.) In general, her brand of passivity works–she won 16 of those matches. But defensive play doesn’t leave very much room for error–figuratively or literally. The tactics were familiar and proven, but against Makarova, they were poorly executed.

Angelique Kerber’s Unclutch Unforced Errors

Italian translation at settesei.it

It’s been a rough year for Angelique Kerber. Despite her No. 1 WTA ranking and place at the top of the French Open draw, she lost her opening match on Sunday against the unseeded Ekaterina Makarova. Adding insult to injury, the loss goes down in the record books as a lopsided-looking 6-2 6-2.

Andrea Petkovic chimed in with her diagnosis of Kerber’s woes:

She’s simply playing without confidence right now. It was tight, even though the scoreline was 2 and 2 but everyone who knows a thing about tennis knew that Angie made errors whenever it mattered because she’s playing without any confidence right now – errors she didn’t make last year.

This is one version of a common analysis: A player lost because she crumbled on the big points. While that probably doesn’t cover all of Kerber’s issues on Sunday–Makarova won 72 points to her 55–it is true that big points have a disproportionate effect on the end result. For every player who squanders a dozen break points yet still wins the match, there are others who falter at crucial moments and ultimately lose.

This family of theories–that a player over- or under-performed at big moments–is testable. For instance, I showed last summer that Roger Federer’s Wimbledon loss to Milos Raonic was due in part to his weaker performance on more important points. We can do the same with Kerber’s early exit.

Here’s how it works. Once we calculate each player’s probability of winning the match before each point, we can assign each point a measure of importance–I prefer to call it leverage, or LEV–that quantifies how much the single point could affect the outcome of the match. At 3-0, 40-0, it’s almost zero. At 3-3, 40-AD in the deciding set, it might be over 10%. Across an entire tournament’s worth of matches, the average LEV is around 5% to 6%.

If Petko is right, we’ll find that the average LEV of Kerber’s unforced errors was higher than on other points. (I’ve excluded points that ended with the serve, since neither player had a chance to commit an unforced error.) Sure enough, Kerber’s 13 groundstroke UEs (that is, excluding double faults) had an average LEV of 5.5%, compared to 3.8% on points that ended some other way. Her UE points were 45% more important than non-UE points.
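The comparison boils down to a ratio of two averages. Here is a minimal sketch, assuming each point is available as a (leverage, ended_in_ue) pair with serve-ending points already excluded; the function name and input layout are hypothetical.

```python
def ue_leverage_ratio(points):
    """Average leverage of points ending in an unforced error, divided by the
    average leverage of all other points. `points` holds (leverage, ended_in_ue)."""
    ue = [lev for lev, is_ue in points if is_ue]
    other = [lev for lev, is_ue in points if not is_ue]
    return (sum(ue) / len(ue)) / (sum(other) / len(other))

# Plugging in the two averages quoted above: 5.5% on UE points, 3.8% elsewhere.
print(round(5.5 / 3.8, 2))   # 1.45, i.e. UE points were about 45% more important
```

A ratio above 1 means the errors were clustered on bigger points; below 1, on smaller ones.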

Let’s put that number in perspective. Among the 86 women for whom I have point-by-point UE data for their first-round matches this week*, ten timed their errors even worse than Kerber did. Magdalena Rybarikova was the most extreme: Her eight UEs against Coco Vandeweghe were more than twice as important, on average, as the rest of the points in that match. Seven of the ten women with bad timing lost their matches, and two others–Agnieszka Radwanska and Marketa Vondrousova–committed so few errors (3 and 4, respectively), that it didn’t really matter. Only Dominika Cibulkova, whose 15 errors were about as badly timed as Kerber’s, suffered from unclutch UEs yet managed to advance.

* This data comes from the Roland Garros website. I aggregate it after each major and make it available here.

Another important reference point: Unforced errors are evenly distributed across all leverage levels. Our instincts might tell us otherwise–we might disproportionately recall UEs that came under pressure–but the numbers don’t bear it out. Thus, Kerber’s badly timed errors are just as badly timed when we compare her to the tour average.

They are also poorly timed when compared to her other recent performances at majors. Petkovic implied as much when she said her compatriot was making “errors she didn’t make last year.” Across her 19 matches at the previous four Slams, her UEs occurred on points that were 11% less important than non-UE points. Her errors caused her to lose relatively more important points in only 5 of the 19 matches, and even in those matches, her UE leverage never exceeded her non-UE leverage by more than 31%, the margin in Melbourne this year against Tsurenko. That’s still better than her performance on Sunday.

Across so many matches, a difference of 11% is substantial. Of the 30 players with point-by-point UE data for at least eight matches at the previous four majors, only three did a better job timing their unforced errors. Radwanska heads the list, at 16%, followed by Timea Bacsinszky at 14% and Kiki Bertens at 12%. The other 26 players committed their unforced errors at more important moments than Kerber did.

As is so often the case in tennis, it’s difficult to establish if a stat like this is indicative of a longer-term trend, or if it is mostly noise. We don’t have point-by-point data for most of Kerber’s matches, so we can’t take the obvious next step of checking the rest of her 2017 matches for similarly unclutch performances. Instead, we’ll have to keep tabs on how well she limits UEs at big moments on those occasions where we have the data necessary to do so.

Measuring the Clutchness of Everything

Italian translation at settesei.it

Matches are often won or lost by a player’s performance on “big points.” With a few clutch aces or un-clutch errors, it’s easy to gain a reputation as a mental giant or a choker.

Aside from the traditional break point stats, which have plenty of limitations, we don’t have a good way to measure clutch performance in tennis. There’s a lot more to this issue than counting break points won and lost, and it turns out that a lot of the work necessary to quantify clutchness is already done.

I’ve written many times about win probability in tennis. At any given point score, we can calculate the likelihood that each player will go on to win the match. Back in 2010, I borrowed a page from baseball analysts and introduced the concept of volatility, as well. (Click the link to see a visual representation of both metrics for an entire match.) Volatility, or leverage, measures the importance of each point–the difference in win probability between a player winning it or losing it.

To put it simply, the higher the leverage of a point, the more valuable it is to win. “High leverage point” is just a more technical way of saying “big point.” To be considered clutch, a player should be winning more high-leverage points than low-leverage points. You don’t have to win a disproportionate number of high-leverage points to be a very good player–Roger Federer’s break point record is proof of that–but high-leverage points are key to being a clutch player.

(I’m not the only person to think about these issues. Stephanie wrote about this topic in December and calculated a full-year clutch metric for the 2015 ATP season.)

To make this more concrete, I calculated win probability and leverage (LEV) for every point in the Wimbledon semifinal between Federer and Milos Raonic. For the first point of the match, LEV = 2.2%. Raonic could boost his match odds to 50.7% by winning it or drop to 48.5% by losing it. The highest leverage in the match was a whopping 32.8%, when Federer (twice) had game point at 1-2 in the fifth set. The lowest leverage of the match was a mere 0.03%, when Raonic served at 40-0, down a break in the third set. The average LEV in the match was 5.7%, a rather high figure befitting such a tight match.
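To see the idea in miniature, here is a sketch that computes leverage within a single game rather than a whole match. It assumes the server wins each point independently with a fixed probability, and both function names are illustrative. Match-level leverage like the figures above would need the same calculation carried through sets and the match, which this sketch does not attempt.

```python
from functools import lru_cache

def game_win_prob(p, a=0, b=0):
    """P(server wins the game from a-b in points) if each point is won with prob p."""
    @lru_cache(maxsize=None)
    def f(x, y):
        if x >= 4 and x - y >= 2:
            return 1.0
        if y >= 4 and y - x >= 2:
            return 0.0
        if x == y and x >= 3:                  # deuce: closed form avoids endless recursion
            return p * p / (p * p + (1 - p) ** 2)
        return p * f(x + 1, y) + (1 - p) * f(x, y + 1)
    return f(a, b)

def game_leverage(p, a, b):
    """Swing in the server's game-win probability between winning and losing this point."""
    return game_win_prob(p, a + 1, b) - game_win_prob(p, a, b + 1)

# For a server winning 60% of points: break point (30-40) vs. a comfortable 40-0.
print(round(game_leverage(0.6, 2, 3), 3), round(game_leverage(0.6, 3, 0), 3))
```

Points are counted 0, 1, 2, 3 for love, 15, 30, 40, so `(2, 3)` is 30-40 and `(3, 0)` is 40-0. The break point carries far more leverage than the 40-0 point, which is the same pattern the match-level numbers show.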

On average, the 166 points that Raonic won were slightly more important, with LEV = 5.85%, than Federer’s 160, at LEV = 5.62%. Without doing a lot more work with match-level leverage figures, I don’t know whether that’s a terribly meaningful difference. What is clear, though, is that certain parts of Federer’s game fell apart when he needed them most.

By Wimbledon’s official count, Federer committed nine unforced errors, not counting his five double faults, which we’ll get to in a minute. (The Match Charting Project log says Fed had 15, but that’s a discussion for another day.) There were 180 points in the match where the return was put in play, with an average LEV = 6.0%. Federer’s unforced errors, by contrast, had an average LEV nearly twice as high, at 11.0%! The typical leverage of Raonic’s unforced errors was a much less noteworthy 6.8%.

Fed’s double fault timing was even worse. Those of us who watched the fourth set don’t need a fancy metric to tell us that, but I’ll do it anyway. His five double faults had an average LEV of 13.7%. Raonic double faulted more than twice as often, but the average LEV of those points, 4.0%, means that his 11 doubles had less of an impact on the outcome of the match than Roger’s five.
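One rough way to compare the two players' double faults is to add up the leverage each one gave away. Treating summed leverage as a proxy for total impact is an assumption made for this illustration, not a metric defined in the post.

```python
# Double-fault counts and average leverage in percent, from the text.
federer_dfs, federer_avg_lev = 5, 13.7
raonic_dfs, raonic_avg_lev = 11, 4.0

# Count times average leverage gives the total leverage lost on double faults.
federer_total = federer_dfs * federer_avg_lev
raonic_total = raonic_dfs * raonic_avg_lev

print(round(federer_total, 1), round(raonic_total, 1))
```

By this measure Federer's five double faults cost more than Raonic's eleven, in line with the claim above.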

Even the famous Federer forehand looks like less of a weapon when we add leverage to the mix. Fed hit 26 forehand winners, in points with average LEV = 5.1%. Raonic’s 23 forehand winners occurred during points with average LEV = 7.0%.

Taking these three stats together, it seems like Federer saved his greatness for the points that didn’t matter as much.

The bigger picture

When we look at a handful of stats from a single match, we’re not improving much on a commentator who vaguely summarizes a performance by saying that a player didn’t win enough of the big points. While it’s nice to attach concrete numbers to these things, the numbers are only worth so much without more context.

In order to gain a more meaningful understanding of this (or any) performance with leverage stats, there are many, many more questions we should be able to answer. Were Federer’s high-leverage performances typical? Does Milos often double fault on less important points? Do higher-leverage points usually result in more returns in play? How much can leverage explain the outcome of very close matches?

These questions (and dozens, if not hundreds more) signal to me that this is a fruitful field for further study. The smaller-scale numbers, like the average leverage of points ending with unforced errors, seem to have particular potential. For instance, it may be that Federer is less likely to go for a big forehand on a high-leverage point.

Despite the dangers of small samples, these metrics allow us to pinpoint what, exactly, players did at more crucial moments. Unlike some of the more simplistic stats that tennis fans are forced to rely on, leverage numbers could help us understand the situational tendencies of every player on tour, leading to a better grasp of each match as it happens.

Winning Return Points When It Matters

In my post last week about players who have performed better than expected in tiebreaks (temporarily, anyway), I speculated that big servers may try harder in tiebreaks than in return games.

If we interpret “try harder” as “win points more frequently,” we can test it. With my point-by-point dataset, we can look at every top player in the men’s game and compare their return-point performance in tiebreaks to their return-point performance earlier in the set.

As it turns out, top players post better return numbers in tiebreaks than they do earlier in the set. I looked at every match in my dataset (most tour-level matches from the last few seasons) for the ATP top 50, and found that these players, on average, won 5.2% more return points in tiebreaks than they did earlier in those sets.

That same group of players saw their serve performance decline slightly, by 1.1%. Since the top 50 frequently play each other, it’s no surprise that the serve and return numbers point in different directions. However, the return point increase and the serve point decrease don’t cancel each other out, suggesting that the top 50 is winning a particularly large number of tiebreaks against the rest of the pack, mostly by improving their return game once the tiebreak begins.

(There’s a little bit of selection bias here, since some of the players on the edge of the top 50 got there thanks to good luck in recent tiebreaks. However, most of the top 50–especially those players who make up the largest part of this dataset–have been part of this sample of players for years, so the bias remains only minor.)

My initial speculation concerned big servers–the players who might reasonably relax during return games, knowing that they probably won’t break anyway. However, big servers aren’t any more likely than others to return better in tiebreaks. (Or, put another way, to return worse before tiebreaks.) John Isner, Ivo Karlovic, Kevin Anderson, and Roger Federer all win slightly more return points in tiebreaks than they do earlier in sets, but don’t improve as much as the 5.2% average. What’s more, Isner and Anderson improve their serve performance for tiebreaks slightly more than they do their return performance.

There are a few players who may be relaxing in return games. Bernard Tomic improves his return points won by a whopping 27% in tiebreaks, Marin Cilic improves by 16%, and Milos Raonic improves by 11%. Tomic and Raonic, in particular, are especially ineffective in return games when they have a break advantage in the set (more on that in a moment), so it’s plausible they are saving their effort for more important moments.

Despite these examples, this is hardly a clear-cut phenomenon. Kei Nishikori, for example, ups his return game in tiebreaks almost as much as Cilic does, and we would never think of him as a big server, nor do I think he often shows signs of tactically relaxing in return games. We have plenty of data for most of these players, so many of these trends are more than just statistical noise, but the results for individual players don’t coalesce into any simple, overarching narratives about tiebreak tendencies.

There is one nearly universal tendency that turned up in this research. When leading a set by one break or more, almost every player returns worse. (Conversely, when down a break, almost every player serves better.) The typical top 50 player’s return game declines by almost 5%, meaning that a player winning 35% of return points falls to 33.4%.
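
The percentages in this post are relative changes in the rate of points won, not percentage-point differences. A quick Python check of the example above makes the distinction concrete. The helper name is my own for this sketch.

```python
def relative_change(before, after):
    """Relative change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before

# The example from the text: 35% of return points won falls to 33.4%.
decline = relative_change(0.35, 0.334)
print(f"relative change: {decline:.1%}")
print(f"absolute change: {(0.334 - 0.35) * 100:.1f} percentage points")
```

A drop from 35% to 33.4% is 1.6 percentage points, which works out to a relative decline of about 4.6%, in line with the “almost 5%” figure.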

Almost every player fits this pattern. 48 of the top 50–everyone except for David Ferrer and Aljaz Bedene–win fewer return points when up a break, and 46 of 50 win more service points when down a break.

Pinning down exactly why this is the case is–as usual–more difficult than establishing that the phenomenon exists. It may be that players are relaxing on return. A one-break advantage, especially late, is often enough to win the set, so it may make sense for players to conserve their energy for their own service games. Looking at it from the server’s perspective, that one-break disadvantage might remove some pressure.

What’s clear is this: Players return worse than usual when up a break, and better than usual in tiebreaks. The changes are much more pronounced for some ATPers than others, but there’s no clear relationship with big serving. As ever, tiebreaks remain fascinating and more than a little inscrutable.

The Luck of the Tiebreak, 2015 in Review

Tiebreak outcomes are influenced by luck a lot more than most people think. All else equal, big servers aren’t any more successful than weak servers, and one season’s tiebreak king is often the next season’s tiebreak chump.

I’ve written a lot about this in the past, so I won’t repeat myself too much. (If you want to read more, here’s a good place to start.) In short, the data shows this: Good players win more tiebreaks than bad players do, but only because they’re better in general, not because they have special tiebreak skills. Very few players perform better or worse than they usually do in tiebreaks.

In the past, I’ve found that three players–Roger Federer, Rafael Nadal, and John Isner–consistently increase their level in tiebreaks. In other words, when you calculate how many tiebreaks Federer (or Nadal, or Isner) should win based on his overall rate of serve and return points won, you discover that he wins even more tiebreaks than that.

In any given year, some players score very high or very low–winning or losing far more tiebreaks than their overall level of play would suggest that they should. But the vast majority of those players regress to the mean in subsequent years.

Here’s a look at which players outperformed the most in 2015 (minimum 20 tiebreaks). TBExp is the number of tiebreaks we would expect them to win, given their usual rate of serve and return points won. TBOE (Tie Breaks Over Expectations) is the difference between the number they won and the number we’d expect them to win, and TBOR is that difference divided by total tiebreaks.

Player              TBs  TBWon  TBExp  TBOE   TBOR  
Stan Wawrinka        46     34   24.9   9.1  19.8%  
Martin Klizan        25     17   12.2   4.8  19.0%  
Marin Cilic          35     26   21.0   5.0  14.2%  
Tomas Berdych        34     24   20.0   4.0  11.7%  
John Isner           64     39   31.7   7.3  11.3%  
Feliciano Lopez      42     27   22.4   4.6  11.0%  
Jiri Vesely          28     16   13.2   2.8  10.1%  
Sam Groth            31     18   14.9   3.1  10.1%  
Gilles Muller        45     27   22.7   4.3   9.5%  
Gael Monfils         28     18   15.4   2.6   9.4%
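
The last two columns follow directly from the first three. As a rough check, here is a short Python sketch that recomputes TBOE and TBOR for a few rows of the table. The function name is mine. Because TBExp is shown rounded to one decimal place, the recomputed TBOR can differ from the published figure in the last digit.

```python
# (player, tiebreaks played, tiebreaks won, expected wins), copied from the table
rows = [
    ("Stan Wawrinka", 46, 34, 24.9),
    ("John Isner", 64, 39, 31.7),
    ("Gilles Muller", 45, 27, 22.7),
]

def tiebreaks_over_expectation(played, won, expected):
    """Return (TBOE, TBOR) for one player-season."""
    tboe = won - expected   # tiebreaks won beyond expectation
    tbor = tboe / played    # the same surplus, per tiebreak played
    return tboe, tbor

results = {}
for player, played, won, expected in rows:
    results[player] = tiebreaks_over_expectation(played, won, expected)
    tboe, tbor = results[player]
    print(f"{player:<16} TBOE {tboe:+.1f}   TBOR {tbor:+.1%}")
```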

There are a lot of big servers here (more on that later) and a lot of new faces. Federer and Nadal were roughly neutral in 2015, winning about as many tiebreaks as we’d expect. Of the tiebreak masters, only Isner remained among the leaders. He has never posted a season below +5% TBOR, and only twice has he been below +11% TBOR. Just from this leaderboard, you can tell how elite that is.

Along with Isner, we have Marin Cilic, Feliciano Lopez, Sam Groth, and Gilles Muller, all players one would reasonably consider to be big servers. As I mentioned above, big serving doesn’t typically correlate with exceeding tiebreak expectations. It may just be a fluke: Lopez was roughly neutral in 2013 and 2014, and -15% in 2012; Groth doesn’t have much of a tour-level track record, but was -5% in 2014; Muller has been up and down throughout his career; and Cilic almost always underperformed until 2013.

Adding to the “fluke” argument is the case of Ivo Karlovic. His -14% TBOR this year was one of the worst among players who contested 20 or more tiebreaks, and he’s been exactly neutral over the last decade.

Let’s take a closer look at a few players.

Stan Wawrinka: For the second year in a row, he won at least 15% more tiebreaks than expected. Whether it’s clutch, focus, or dumb luck, the shift in his tiebreak fortunes dovetails nicely with his upward career trajectory. From 2006-13, he only posted one season at neutral or better, and his overall TBOR of -9% was one of the worst in the game for that span.

Cilic’s story is similar. Before 2013, he posted only one season above expectations. Since then, he’s won 19%, 16%, and 14% more tiebreaks than expected.

While only anecdotes, these two cases contradict an idea I’ve heard quite a bit, that players weaken in the clutch as they get older. The subject often comes up in the context of Karlovic’s tiebreak futility or Federer’s break point frustrations. It’s tough to prove one way or the other, in part because there’s no generally accepted measure of clutch in tennis. (If indeed there is any persistent clutch skill.) Using a measure like TBOR is dangerous, both because it is so noisy, and because of survivorship bias–players who get worse as they get older are more likely to fall in the rankings and play fewer tour matches as a result.

Another complicating factor is worthy of further study. To estimate how many tiebreaks a player should win, we need to take our expectation from somewhere. I’m using each player’s overall rates of serve and return points won. But if a player is trying harder in tiebreaks (assuming more effort translates into better results), we would expect him to win more points in tiebreaks than those overall rates imply.
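
To make the expectation concrete, here is one way to turn serve and return rates into a tiebreak win probability. It is a sketch, not necessarily the exact calculation behind TBExp: it assumes every point is independent and won at the player’s overall rate, which is precisely the assumption that extra tiebreak effort would break. The function names are hypothetical.

```python
from functools import lru_cache

def tiebreak_win_prob(p_serve, p_return, serves_first=True):
    """Chance of winning a first-to-7, win-by-2 tiebreak.

    p_serve / p_return: probability the player wins a point on his own serve /
    on the opponent's serve. Points are assumed independent (a simplification).
    """
    def point_prob(points_played):
        # Serving order: one point for the first server, then alternating pairs.
        first_server_serving = (points_played + 1) % 4 in (0, 1)
        player_serving = first_server_serving == serves_first
        return p_serve if player_serving else p_return

    @lru_cache(maxsize=None)
    def win_from(a, b):
        if a == 7:
            return 1.0
        if b == 7:
            return 0.0
        if a == 6 and b == 6:
            # From 6-6, each further pair of points has one serve apiece, so the
            # player must win both before losing both. This is undefined only
            # in the degenerate case where neither player ever loses a serve.
            both = p_serve * p_return
            neither = (1 - p_serve) * (1 - p_return)
            return both / (both + neither)
        p = point_prob(a + b)
        return p * win_from(a + 1, b) + (1 - p) * win_from(a, b + 1)

    return win_from(0, 0)

def expected_tiebreak_wins(n_tiebreaks, p_serve, p_return):
    """Expected wins if every tiebreak were played at the same rates."""
    return n_tiebreaks * tiebreak_win_prob(p_serve, p_return)

print(round(tiebreak_win_prob(0.65, 0.35), 3))  # evenly matched players
print(tiebreak_win_prob(0.70, 0.40) > 0.5)      # the better player is favored
```

Under this model the result does not depend on who serves first, because both players serve six of the first twelve points. A fuller version would use opponent-specific rates for each tiebreak rather than a single pair of numbers.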

Isner has admitted to coasting on unimportant points, and for someone with his game style, a whole lot of return points can be classified as unimportant. Very generally speaking, the more one-dimensional the player, the more reason he has to take it easy during return games, and the more he does so, the more we would observe that he outperforms expectations in tiebreaks–simply because he sets expectations artificially low.

That might be an explanation for Isner’s consistent appearance on these leaderboards. And if we assume that players become more strategically sound as they age–or simply better at tactically conserving energy–we might have a reason why older players score higher in this metric.

Two more players worth mentioning are Milos Raonic and Kei Nishikori. They were 5th and 6th on the 2014 leaderboard, outperforming expectations by 15% and 14%, respectively. In 2015, Raonic fell to neutral, and Nishikori (in far fewer tiebreaks) dropped to -14%, nearly the bottom of the rankings. Taken together, it’s a good reminder of the volatility of these numbers. In Raonic’s case, it’s a warning that relying too much on winning tiebreaks (which, by extension, implies relying too little on one’s return game) is a poor recipe for long-term success.

Finally, some notes on the big four. Novak Djokovic and Andy Murray have never figured heavily in these discussions, both because they don’t play a ton of tiebreaks, and because they don’t persistently out- or underperform expectations. Federer and Nadal, however, were long among the best. Both have returned to the middle of the pack: Federer hasn’t posted a TBOR above 5% since 2011, and Nadal underperformed by 8.5% in 2014 before bouncing back to neutral last season.

Whatever tiebreak skill Roger and Rafa once had now eludes them. On the other hand, ten months of good tiebreak luck can happen to anyone, even a legend. If either player can recapture that tiebreak magic–even if it’s mere luck that allows them to do so–it might translate into a few more wins as they try to reclaim the top spot in the rankings.