Italian translation at settesei.it
I’ve never understood the fixation that some fans and commentators seem to have with tiebreak winning percentage. Sure, winning tiebreaks is nice, but it seems obvious that the main cause of exemplary tiebreak performance is being good at tennis. Though some players may in fact be better than others at this facet of the game, a big part of what tiebreak winning percentage tells us is about general tennis skill.
In other words, Roger Federer is very good at tiebreaks because he is very good at serving and returning, the same skills that get him so many wins, regardless of whether any of the sets go to tiebreaks.
If we ignore tiebreak winning percentage, what are we left with? It’s still tempting to wonder whether some players have a kind of special skill–calm under pressure, a particularly consistent serve–that leads them to outperform expectations in breakers.
The key word there is “expectations.” Given Federer’s general ability on the tennis court, we should expect him to win most tiebreaks. For example, two of the last three breakers he’s played came against Stanislas Wawrinka, whom he should beat regardless of the format. But our intuition will fail us if we look at Federer’s match record and try to estimate how many tiebreaks he should have won, then compare the “should” to the “did.”
Expected tiebreaks
Sounds like something computers do better than humans. Given a player’s percentage of service and return points won in a certain match, we can estimate how likely he was to win a tiebreak–on the assumption that his performance level stayed the same throughout the match.
If two players are equally matched, each one would be “expected” to win 0.5 tiebreaks. That’s nonsensical for a single match, but over the course of this season, we see that of John Isner’s 53 tiebreaks, the algorithm would expect him to win 29. In fact, he has won 38, exceeding expectations (in raw terms, anyway) more than anyone else on tour this year.
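For readers who want to try the estimate themselves, here is a minimal sketch of one way to compute it. It is not necessarily the exact algorithm behind the numbers in this post: it assumes every point is independent and played at the player's match-long serve and return rates, and the names `tiebreak_win_prob` and `expected_tiebreaks`, along with the sample rates, are invented for illustration.

```python
from functools import lru_cache


def tiebreak_win_prob(serve_pct: float, return_pct: float) -> float:
    """Chance of winning a first-to-7, win-by-2 tiebreak.

    serve_pct  -- share of service points the player won in the match
    return_pct -- share of return points the player won in the match
    Assumes every point is independent and played at those match-long rates.
    """
    s, r = serve_pct, return_pct

    # From 6-6 on, points come in pairs: one on each player's serve.
    # The player wins by taking both, loses by dropping both, and a
    # split pair returns the score to level.
    win_both = s * r
    lose_both = (1 - s) * (1 - r)
    if win_both + lose_both == 0:
        raise ValueError("neither player can ever take both points of a pair")
    from_six_all = win_both / (win_both + lose_both)

    @lru_cache(maxsize=None)
    def prob(won: int, lost: int) -> float:
        if won == 6 and lost == 6:
            return from_six_all
        if won == 7:
            return 1.0
        if lost == 7:
            return 0.0
        # Serve order: one point for the first server, then two each.
        point_number = won + lost + 1
        on_serve = (point_number // 2) % 2 == 0
        p = s if on_serve else r
        return p * prob(won + 1, lost) + (1 - p) * prob(won, lost + 1)

    return prob(0, 0)


def expected_tiebreaks(tiebreaks):
    """Sum the per-tiebreak win chances over (serve_pct, return_pct) pairs."""
    return sum(tiebreak_win_prob(s, r) for s, r in tiebreaks)


# Made-up rates for three tiebreaks, purely for illustration.
sample = [(0.75, 0.30), (0.65, 0.35), (0.70, 0.25)]
print(round(expected_tiebreaks(sample), 2))
```

Two evenly matched players, say each winning 65% of service points (so `tiebreak_win_prob(0.65, 0.35)`), come out at 0.5 apiece, which matches the intuition above. A season-long expectation is simply the sum of those per-tiebreak chances.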
This gives us two stats that offer more insight into a player’s tiebreak performance than “tiebreaks won” and “tiebreak winning percentage.” The raw number, the difference between actual tiebreaks won and expected tiebreaks won, tells us how many additional sets a player has taken because of his tiebreak performance. Call it TBOE: TieBreaks Over Expectations. A similar rate stat is derived by dividing TBOE by the number of tiebreaks, allowing us to compare players regardless of how many tiebreaks they played. Call that one TBOR: TieBreak Outperformance Rate.
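As a quick sketch of the arithmetic behind the two stats (the function name `tiebreak_outperformance` is made up for this example), here is Isner's 2012 line run through the definitions:

```python
def tiebreak_outperformance(played: int, won: int, expected: float):
    """Return (TBOE, TBOR) for one player's season."""
    tboe = won - expected   # extra tiebreaks won relative to expectation
    tbor = tboe / played    # the same, per tiebreak played
    return tboe, tbor


# Isner's 2012 line from the leaderboard: 53 played, 38 won, 28.5 expected.
tboe, tbor = tiebreak_outperformance(53, 38, 28.5)
print(f"TBOE {tboe:+.1f}, TBOR {tbor:+.2f}")  # TBOE +9.5, TBOR +0.18
```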
As we’ve seen, Isner is the 2012 king of TBOE, performing well in tiebreaks and playing far more of them than anyone else on tour. Yet three players–Steve Darcis, Andy Murray, and Jurgen Melzer–have done better by TBOR, exceeding expectations at a greater rate than Isner has. Darcis is particularly remarkable, winning 16 of his 19 tiebreaks through last week, despite his serve and return rates in those matches suggesting he should have won only 10 of them.
(And in Vienna on Monday, he won another one, extending his already untouchable lead over the pack.)
I’ll have more to say about this tomorrow, including a look at just how much meaning we can extract from TBOE and TBOR. In the meantime, look after the jump for the current 2012 leaderboard–through Shanghai, sorted by TBOR, minimum 15 tiebreaks.
Player                  TBs  TBWon   ExpW   TBOE   TBOR
Steve Darcis             19     16    9.8    6.2   0.33
Jurgen Melzer            17     12    8.3    3.7   0.22
Andy Murray              24     17   12.1    4.9   0.20
John Isner               53     38   28.5    9.5   0.18
Tommy Haas               16     11    8.4    2.6   0.16
Kevin Anderson           32     19   15.3    3.7   0.12
Janko Tipsarevic         32     21   17.4    3.6   0.11
David Ferrer             30     20   17.1    2.9   0.10
Pablo Andujar            18     11    9.3    1.7   0.10
Julien Benneteau         20     12   10.3    1.7   0.08
Radek Stepanek           18     11    9.7    1.3   0.07
Sam Querrey              28     16   14.2    1.8   0.06
Andy Roddick             21     12   10.7    1.3   0.06
Jarkko Nieminen          20     11    9.8    1.2   0.06
Paul Henri Mathieu       15      8    7.2    0.8   0.06
Andreas Seppi            23     13   11.8    1.2   0.05
Jeremy Chardy            17      9    8.1    0.9   0.05
Philipp Kohlschreiber    38     22   20.6    1.4   0.04
Denis Istomin            28     15   14.1    0.9   0.03
Milos Raonic             45     26   24.6    1.4   0.03
Roger Federer            28     18   17.3    0.7   0.03
Jo Wilfried Tsonga       31     18   17.3    0.7   0.02
Marcos Baghdatis         22     12   11.5    0.5   0.02
Gilles Muller            28     14   13.4    0.6   0.02
Yen Hsun Lu              16      8    7.7    0.3   0.02
Olivier Rochus           17      7    6.7    0.3   0.02
Ivo Karlovic             28     14   13.6    0.4   0.01
Nicolas Mahut            17      9    8.8    0.2   0.01
Ryan Harrison            19      9    8.8    0.2   0.01
Juan Monaco              18     10   10.2   -0.2  -0.01
Juan Martin Del Potro    35     20   20.5   -0.5  -0.01
Lukasz Kubot             18      8    8.4   -0.4  -0.02
Viktor Troicki           18      9    9.5   -0.5  -0.03
Tomas Berdych            28     15   15.7   -0.7  -0.03
Fernando Verdasco        21     10   10.6   -0.6  -0.03
Bernard Tomic            15      7    7.5   -0.5  -0.03
Thomaz Bellucci          17      8    8.7   -0.7  -0.04
Xavier Malisse           19      9    9.7   -0.7  -0.04
Benoit Paire             24     11   12.2   -1.2  -0.05
Mikhail Youzhny          20     10   11.0   -1.0  -0.05
Kei Nishikori            16      8    8.8   -0.8  -0.05
Grigor Dimitrov          18      9   10.0   -1.0  -0.06
Alexandr Dolgopolov      22     10   11.4   -1.4  -0.06
Sergiy Stakhovsky        28     12   13.8   -1.8  -0.07
Alejandro Falla          15      6    7.1   -1.1  -0.07
Marin Cilic              25     11   12.9   -1.9  -0.08
Albert Ramos             28     11   13.1   -2.1  -0.08
Edouard Roger Vasselin   15      6    7.3   -1.3  -0.09
Novak Djokovic           25     14   16.3   -2.3  -0.09
Nicolas Almagro          35     16   19.4   -3.4  -0.10
Igor Andreev             19      8   10.0   -2.0  -0.10
Mardy Fish               16      8    9.8   -1.8  -0.11
Lukas Rosol              17      5    7.1   -2.1  -0.12
Gilles Simon             21      8   10.8   -2.8  -0.13
Feliciano Lopez          34     13   17.6   -4.6  -0.13
Richard Gasquet          18      8   10.5   -2.5  -0.14
Stanislas Wawrinka       27     10   14.0   -4.0  -0.15
I like the new stats you’ve invented, because they do seem to carry meaning. Even so, I’m going to venture a math question – even though in doing so I expose my ignorance: If we build stats from aggregate data such as you have done here, doesn’t that ignore valuable contextual data, in particular, how tough the opponent was, what set it was, etc.? Somehow I have this hazy notion that rather than a list of stats, what I want to see are charts (curves) or some other presentation of data that shows some of these influences which I am calling context. E.g. Isner might be the king on one level, but if level of opponent were factored in, we might find that his kingdom is smaller than it appears.
I admit I can’t support this gut notion; I only put it forth as a rather uninformed, not very useful, but possibly relevant comment.
As far as opponent difficulty is concerned, it’s partly built in, because the probability of a player winning each tiebreak is estimated from his and his opponent’s performance *that day*–each player’s service winning percentage in that match. Sure, maybe X played Y on a day that Y was slumping, but the tiebreak was on that day, too, and presumably Y’s slump didn’t end right before the tiebreak and start again right after it.
(If the slump did end and begin like that, then he’d show up as someone who excels in this stat — as it should be, because he played the tiebreak better than the rest of the match. Do that consistently, and you’re Steve Darcis.)
So no, there’s no consideration of how good the opponent was over the course of the season, but that’s more important if you’re evaluating a player’s general ability. What I’m trying to get at here is the difference between his general ability and his tiebreak skills, if any.
It would be interesting to consider one set separately from the others, but there just isn’t the data to do that with.
Interesting. Lot of players with great serving stats (& poor returning) up there.
Hi Jeff,
New to this blog and thoroughly enjoyed reading a few of your interesting articles. I wonder if you have a greater wealth of these statistics? I just notice that once you get a little way down your list of players (only as far as beyond Isner) the numbers are small enough to be heavily influenced by a small number of instances. Take Haas for instance, expected to win 8.4 of 16, he actually won 11. An impressive year for the old guard in breakers for sure, but this doesn’t really constitute a pattern to me. It’s true that the players below Haas have a greater # of TBs played. Nevertheless, these statistics would gain much from having a far larger data set. Other than the bizarre case of Darcis, I would personally say only Isner and Murray have the golden combination of: 1) an impressive TBOR with, 2) enough tie-breaks played, such that it indicates a genuine ‘skill’ in breakers.
Glad you found me, and thanks for reading.
Yes, I’m writing something up for my next post that touches on the longer-term data, and I may find a way to post year-by-year tables in the near future. As we’ll see in the next post, the longer-term data confuse as much as they clarify, as very few players demonstrate anything close to consistency above or below what is expected.
Not that it makes a huge difference, but have you removed the tiebreak points when calculating the expected number of TB wins?
Nope. Wish I could, but I only have total serve and return points won for the match.
If I were able to remove them, it should amplify the effect a bit. If the player who won the tiebreak won, say, 75% of serve points and 30% of return points throughout the match, it would usually be the case that he did one or the other better during the tiebreak. So his non-tiebreak percentages would be a little worse, which would lead to a slightly lower estimate of his chance of winning the tiebreak.