Dominic Thiem played Davis Cup in Barcelona. Sort of…

This is a guest post by Peter Wetz.

Last week Dominic Thiem fought his way into the final of the Barcelona Open by beating Kyle Edmund, Daniel Evans, Yuichi Sugita, and Andy Murray. Three of these four players play under the same flag, and Thiem won against each of them. Thiem is not exactly a champion of the current Davis Cup format (he has opted out of playing for Austria several times and has a rather poor record of 2-3 when he does compete), but in Barcelona he has at least shown that he can beat several players from the same country in a short span of time. And that’s what Davis Cup is about, right?

In this post my goal is to put this statistical hiccup into some context. It is not the first time the Austrian has defeated three players of the same nationality at one event: in Buenos Aires in 2016, Thiem beat three players from Spain. However, given that Spanish players appear in draws much more frequently than Britons do, the British case deserves a closer look.

Before Thiem’s run in Barcelona, there had been only three tournaments since 1990 where a single player faced three players from Great Britain, and only one of those players won all three encounters. The following table shows the three tournaments and each match in which that player faced a Briton. Wally Masur was the only player since 1990 to defeat three players from Great Britain in a single tournament, and Thiem is now the only one to have achieved this at a tournament outside Great Britain.

Tournament     Round Winner        Loser           Score
'93 Manchester R32   Wally Masur   Ross Matheson   6-4 6-4
'93 Manchester R16   Wally Masur   Chris Wilkinson 6-3 6-7(4) 6-3
'93 Manchester QF    Wally Masur   Jeremy Bates    6-4 6-3

'97 Nottingham R32   Karol Kucera  Martin Lee      6-1 6-1
'97 Nottingham SF    Karol Kucera  Tim Henman      6-4 2-6 6-4
'97 Nottingham F     Greg Rusedski Karol Kucera    6-4 7-5

'01 Nottingham R32   Martin Lee    Lee Childs      6-4 5-7 6-0
'01 Nottingham R16   Martin Lee    Arvind Parmar   6-4 6-3
'01 Nottingham QF    Greg Rusedski Martin Lee      6-3 6-2

Obviously, there are not many chances to face three Britons in a single tournament. And when one of those opponents is likely to be Andy Murray, a player’s chances of beating all three are even slimmer.

Let’s broaden the perspective a bit and take a look at how often a player defeated three (or more) players from the same country without looking only at Great Britain. The following table displays the results of this analysis. The first column contains the country, the second column (3W) shows how often a player defeated three players of this country, the third column (3WL) shows how often a player defeated two players of this country and then lost to a player of the same country, and so on.

Country  3W  3WL  4W  4WL  5W  5WL
USA      119 179  19  30   1   4
ESP      98  157  17  18   3   2
FRA      28  45   5   2    1   0
ARG      22  26   5   3    0   0
GER      15  18   1   1    0   0
AUS      13  9    0   0    0   0
SWE      9   16   1   0    0   0
CZE      4   5    0   0    0   0
NED      4   4    0   0    0   0
RUS      4   3    0   0    0   0
ITA      2   3    1   0    0   0
BRA      1   3    1   0    0   0
GBR      1   2    0   0    0   0
CHI      1   1    0   0    0   0
SUI      1   1    0   0    0   0
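
A tally like the one above could be sketched roughly as follows. This is not the code behind the post; the tuple layout and the function names are assumptions made for illustration. It counts each (tournament, player, opponent country) combination exactly once under the label matching the number of opponents faced, and it assumes single-elimination draws, so a player's only possible loss comes last. The sample data is the set of nine matches from the Great Britain table earlier in this post.

```python
from collections import defaultdict

def same_country_records(matches):
    """Map (tournament, player, opponent country) to [wins, losses].

    Each match is a tuple in a hypothetical layout:
    (tournament, winner, winner_country, loser, loser_country).
    """
    records = defaultdict(lambda: [0, 0])
    for tourney, winner, w_ioc, loser, l_ioc in matches:
        records[(tourney, winner, l_ioc)][0] += 1  # winner beat someone from l_ioc
        records[(tourney, loser, w_ioc)][1] += 1   # loser lost to someone from w_ioc
    return records

def tally(records, minimum=3):
    """Count labels such as '3W' or '3WL' per opponent country."""
    table = defaultdict(lambda: defaultdict(int))
    for (_, _, country), (wins, losses) in records.items():
        faced = wins + losses
        if faced >= minimum:
            # In a knockout draw a loss ends the run, so "L" marks a final defeat.
            table[country][f"{faced}W" + ("L" if losses else "")] += 1
    return table

matches = [
    ("1993 Manchester", "Wally Masur", "AUS", "Ross Matheson", "GBR"),
    ("1993 Manchester", "Wally Masur", "AUS", "Chris Wilkinson", "GBR"),
    ("1993 Manchester", "Wally Masur", "AUS", "Jeremy Bates", "GBR"),
    ("1997 Nottingham", "Karol Kucera", "SVK", "Martin Lee", "GBR"),
    ("1997 Nottingham", "Karol Kucera", "SVK", "Tim Henman", "GBR"),
    ("1997 Nottingham", "Greg Rusedski", "GBR", "Karol Kucera", "SVK"),
    ("2001 Nottingham", "Martin Lee", "GBR", "Lee Childs", "GBR"),
    ("2001 Nottingham", "Martin Lee", "GBR", "Arvind Parmar", "GBR"),
    ("2001 Nottingham", "Greg Rusedski", "GBR", "Martin Lee", "GBR"),
]

print(dict(tally(same_country_records(matches))["GBR"]))  # {'3W': 1, '3WL': 2}
```

On these nine matches the sketch reproduces the GBR row of the table. Round-robin events would need separate handling, since a player can lose there and keep playing.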

As we could have imagined, USA, ESP, and FRA come out on top here, simply because for years they have had the highest density of players in the rankings. These are also the only countries whose players were faced five times by the same opponent at a single tournament. According to the data at hand, no player has ever faced six or more opponents from the same country at one event. The following table shows the most recent occasion for each country with an entry in the 5W column of the table above.

Tournament    Round Winner        Loser             Score
'91 Charlotte R32   Jaime Yzaga   Chris Garner      7-6 6-3
'91 Charlotte R16   Jaime Yzaga   Jimmy Brown       6-4 6-4
'91 Charlotte QF    Jaime Yzaga   Michael Chang     7-6 6-1
'91 Charlotte SF    Jaime Yzaga   M. Washington     7-5 6-2
'91 Charlotte F     Jaime Yzaga   Jimmy Arias       6-3 7-5
                                                 
'07 Lyon      R32   Sebastien Gr. Rodolphe Cadart   6-3 6-2
'07 Lyon      R16   Sebastien Gr. Fabrice Santoro   4-6 6-1 6-2
'07 Lyon      QF    Sebastien Gr. Julien Benneteau  6-7 6-2 7-6
'07 Lyon      SF    Sebastien Gr. Jo Tsonga         6-1 6-2
'07 Lyon      F     Sebastien Gr. Marc Gicquel      7-6 6-4
                                                  
'08 Valencia  R32   David Ferrer  Ivan Navarro      6-3 6-4
'08 Valencia  R16   David Ferrer  Pablo Andujar     6-3 6-4
'08 Valencia  QF    David Ferrer  Fernando Verdasco 6-3 1-6 7-5
'08 Valencia  SF    David Ferrer  Tommy Robredo     2-6 6-2 6-3
'08 Valencia  F     David Ferrer  Nicolas Almagro   4-6 6-2 7-6

Finally, we take a look at the Big Four. Did they ever eliminate three or more players from the same country in a single tournament? Yes, they did. In 2014 Roger Federer beat three Czech players in Dubai. In 2005, 2008, and 2013 he beat three German players in Halle. In 2009 Andy Murray beat three Spanish players in Valencia. In 2007 Novak Djokovic beat three Spanish players in Estoril. In 2013 Rafael Nadal beat three Argentinian players in both Acapulco and Sao Paulo. In 2015 he even beat four Argentinian players in Buenos Aires. And there are many other examples where Rafa beat three of his countrymen at the same tournament.

We can see that this happens fairly often, especially against players from the country hosting the tournament, because more players from that country appear in the draw thanks to wild cards and qualifying. If we exclude these cases, Federer’s streak in Dubai stands out, as does Thiem’s streak in Barcelona.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria.

Measuring the Performance of Tennis Prediction Models

With the recent buzz about Elo rankings in tennis, both at FiveThirtyEight and here at Tennis Abstract, comes the ability to forecast the results of tennis matches. It’s not far-fetched to ask which of these models performs best and, more interestingly, how they fare compared to other ‘models’, such as the ATP ranking system or betting markets.

For this, admittedly limited, investigation, we collected the (implied) forecasts of five models, that is, FiveThirtyEight, Tennis Abstract, Riles, the official ATP rankings, and the Pinnacle betting market for the US Open 2016. The first three models are based on Elo. For inferring forecasts from the ATP ranking, we use a specific formula [1] and for Pinnacle, which is one of the biggest tennis bookmakers, we calculate the implied probabilities based on the provided odds (minus the overround) [2].
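
As a rough illustration of the last two conversions, here is a minimal sketch. The ranking-points function follows the formula in footnote 1. For the bookmaker odds, the post does not say how the overround was removed, so the proportional normalisation below is an assumption (it is one common choice), as are the function names and the use of decimal odds.

```python
def implied_probabilities(odds_a, odds_b):
    """Turn two-way decimal odds into probabilities that sum to 1.

    Assumption: the overround is removed by proportional normalisation.
    """
    raw_a, raw_b = 1.0 / odds_a, 1.0 / odds_b
    overround = raw_a + raw_b  # exceeds 1 by the bookmaker's margin
    return raw_a / overround, raw_b / overround

def ranking_points_forecast(points_a, points_b, e=0.85):
    """Footnote 1: P(a) = a^e / (a^e + b^e), with a and b as ranking points."""
    pa, pb = points_a ** e, points_b ** e
    return pa / (pa + pb)

# Hypothetical inputs, not taken from the US Open data set.
p_a, p_b = implied_probabilities(1.50, 2.70)
p_rank = ranking_points_forecast(2000, 1000)  # twice the points: about 0.64
```

With the exponent below 1, a player holding twice as many ranking points is rated at roughly 64% rather than the 67% a plain points ratio would give.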

Next, we simply compare forecasts with reality for each model by asking: if player A was predicted to be the winner ($latex P(a) > 0.5$), did he really win the match? When we do that for each match and each model (ignoring retirements and walkovers), we come up with the following results.

Model		% correct
Pinnacle	76.92%
538		75.21%
TA		74.36%
ATP		72.65%
Riles		70.09%

What we see here is the percentage of predictions that were actually right. The betting model (based on the odds of Pinnacle) comes out on top, followed by the Elo models of FiveThirtyEight and Tennis Abstract. Interestingly, the Elo model of Riles is outperformed by the predictions inferred from the ATP ranking. Since there are several parameters that can be used to tweak an Elo model, Riles may still have some room left for improvement.
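
The comparison itself is simple enough to sketch in a few lines. The data layout (pairs of forecast and outcome) and the function name are assumptions for illustration. With the 117 predictions used in this analysis, Pinnacle's 76.92% corresponds to 90 correct calls.

```python
def percent_correct(forecasts):
    """Share of matches in which the predicted favourite won, in percent.

    `forecasts` holds (p_a, a_won) pairs, where p_a is the forecast
    probability for player A and a_won is True if A won. Retirements and
    walkovers are assumed to have been removed beforehand.
    """
    decided = [(p, won) for p, won in forecasts if p != 0.5]  # no favourite at 0.5
    hits = sum((p > 0.5) == won for p, won in decided)
    return 100.0 * hits / len(decided)

# Hypothetical forecasts: 90 of 117 favourites win.
sample = [(0.7, True)] * 90 + [(0.7, False)] * 27
accuracy = percent_correct(sample)
```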

However, just looking at the percentage of correctly called matches does not tell the whole story. There are more granular metrics for assessing the performance of a prediction model. Calibration, for instance, captures the ability of a model to provide forecast probabilities that are close to the true probabilities. In other words, in an ideal model, we want 70% forecasts to come true in exactly 70% of cases. Resolution measures how much the forecasts differ from the overall average. The rationale is that simply forecasting the expected average will produce a reasonably well-calibrated set of predictions, but it will not be as useful as a method that achieves the same calibration while taking current circumstances into account. In other words, the more extreme (and still correct) the forecasts are, the better.

In the following table we categorize the set of predictions into bins of different probabilities and show what percentage of the predictions were correct per bin. This also enables us to calculate Calibration and Resolution measures for each model.

Model    50-59%  60-69%  70-79%  80-89%  90-100% Cal  Res   Brier
538      53%     61%     85%     80%     91%     .003 .082  .171
TA       56%     75%     78%     74%     90%     .003 .072  .182
Riles    56%     86%     81%     63%     67%     .017 .056  .211
ATP      50%     73%     77%     84%     100%    .003 .068  .185
Pinnacle 52%     91%     71%     77%     95%     .015 .093  .172
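
The per-bin percentages in a table like this could be produced along the following lines. This is a sketch under assumptions: each forecast is stored from the favourite's point of view (probability of at least 0.5), the bin edges mirror the column headers, and the names are made up for illustration.

```python
def hit_rate_by_bin(forecasts, edges=(0.5, 0.6, 0.7, 0.8, 0.9)):
    """Return {bin lower edge: (number of forecasts, share that came true)}.

    `forecasts` holds (p_fav, fav_won) pairs with p_fav >= 0.5.
    """
    counts = {lo: [0, 0] for lo in edges}  # lower edge -> [n, hits]
    for p, won in forecasts:
        lo = max(e for e in edges if e <= p)  # highest edge not above p
        counts[lo][0] += 1
        counts[lo][1] += int(won)
    return {lo: (n, hits / n) for lo, (n, hits) in counts.items() if n}

# Hypothetical forecasts, only to show the shape of the result.
bins = hit_rate_by_bin([(0.64, True), (0.66, True), (0.62, False), (0.95, True)])
```

Keeping the count next to each rate makes it easier to spot bins that rest on only a handful of forecasts.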

As we can see, the predictions are not always perfectly in line with what the corresponding bin would suggest. Some of these deviations, for instance the fact that for the Riles model only 67% of the 90-100% forecasts were correct, can be explained by small sample size (only three forecasts in that case). However, there are two interesting cases where the sample size is larger and which caught my attention: the 60-69% bins of Riles (86%) and Pinnacle (91%). Both models seem to be strongly underconfident (to a statistically significant degree) with their 60-69% predictions. In other words, these probabilities should have been higher because, in reality, these forecasts came true 86% and 91% of the time. [3] For the betting aficionados, the fact that Pinnacle underestimates the favorites here may be really interesting, because it could reveal some value, as punters would say. For the Riles model, this might be a starting point for tweaking the model.

In the last three columns, Calibration (the lower the better), Resolution (the higher the better), and the Brier score (the lower the better) are shown. The Brier score combines Calibration and Resolution (and the uncertainty of the outcomes) into a single score for measuring the accuracy of predictions. The models of FiveThirtyEight and Pinnacle (for the subset of data used) perform essentially equally well. Then there is a slight gap before the Tennis Abstract model and the ATP ranking model come in third and fourth, respectively. The Riles model performs worst in terms of both Calibration and Resolution and hence ranks fifth in this analysis.
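
One standard way to obtain these three numbers is the Murphy decomposition, in which the Brier score equals Calibration minus Resolution plus Uncertainty. The figures in the table are consistent with that identity if the uncertainty term is about 0.25, but the post does not state the exact procedure, so the sketch below (bin count, data layout, and names included) is an assumption rather than the method actually used.

```python
from collections import defaultdict

def brier_decomposition(forecasts, n_bins=10):
    """Return (calibration, resolution, uncertainty, brier).

    `forecasts` holds (p, outcome) pairs, with outcome 1 if the forecast
    event happened and 0 otherwise.
    """
    n = len(forecasts)
    base_rate = sum(o for _, o in forecasts) / n
    bins = defaultdict(list)
    for p, o in forecasts:
        bins[min(int(p * n_bins), n_bins - 1)].append((p, o))  # p = 1.0 joins the top bin
    calibration = resolution = 0.0
    for members in bins.values():
        k = len(members)
        mean_p = sum(p for p, _ in members) / k  # average forecast in the bin
        freq = sum(o for _, o in members) / k    # observed frequency in the bin
        calibration += k * (mean_p - freq) ** 2
        resolution += k * (freq - base_rate) ** 2
    brier = sum((p - o) ** 2 for p, o in forecasts) / n
    uncertainty = base_rate * (1 - base_rate)
    return calibration / n, resolution / n, uncertainty, brier

# Hypothetical, perfectly calibrated forecasts.
demo = [(0.75, 1)] * 3 + [(0.75, 0)] + [(0.25, 0)] * 3 + [(0.25, 1)]
cal, res, unc, brier = brier_decomposition(demo)
```

The identity holds exactly only when all forecasts within a bin are identical. With real forecasts spread across a bin it is an approximation, which may explain small rounding differences.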

To conclude, I would like to show a common visual representation that is used to graphically display a set of predictions. The reliability diagram compares the observed rate of forecasts with the forecast probability (similar to the above table).

The closer one of the colored lines is to the black line, the more reliable the forecasts are. If a forecast line is above the black line, the forecasts are underconfident; in the opposite case, they are overconfident. Given that we only investigated one tournament and therefore had to work with a small sample (117 predictions), the big swings in the graph are somewhat expected. Still, we can see that the model based on ATP rankings does a really good job of avoiding overestimation, even though it is known to be outperformed by Elo in terms of prediction accuracy.

To sum up, this analysis shows how different predictive models for tennis can be compared with each other in a meaningful way. Moreover, I hope I have shown some of the areas where a model is good and where it’s bad. Obviously, this investigation could go into much more detail by, for example, comparing how well the models do for different kinds of players (e.g., based on ranking), different surfaces, etc. This is something I will save for later. For now, I’ll try to get my sleeping patterns accustomed to the schedule of play for the Australian Open, and I hope you can do the same.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria.

Footnotes

1. $latex P(a) = a^e / (a^e + b^e) $ where $latex a $ are player A’s ranking points, $latex b $ are player B’s ranking points, and $latex e $ is a constant. We use $latex e = 0.85 $ for ATP men’s singles.

2. The betting market in itself is not really a model, that is, the goal of the bookmakers is simply to balance their book. This means that the odds, more or less, reflect the wisdom of the crowd, making it a very good predictor.

3. As an example, one instance where Pinnacle was underconfident and all other models were more confident is the R32 encounter between Ivo Karlovic and Jared Donaldson. Pinnacle’s implied probability for Karlovic to win was 64%. The other models (except the also underconfident Riles model) gave 72% (ATP ranking), 75% (FiveThirtyEight), and 82% (Tennis Abstract). As it turned out, Karlovic won in straight sets. One factor at play here might be that this was the US Open, where more US citizens are likely to be confident about the US player Jared Donaldson and hence place a bet on him. As a consequence, to balance the book, Pinnacle will lower the odds on Donaldson, which results in higher odds (and a lower implied probability) for Karlovic.

Andrey Kuznetsov and Career Highs of ATP Non-Semifinalists

While following this week’s ATP 250 tournament in Winston-Salem and seeing Andrey Kuznetsov in the quarterfinals, I asked myself: will he finally make it into the first ATP semifinal of his career? As shown here, Andrey, with a ranking of 42, is currently (by far) the best-ranked player who has not reached an ATP SF. And it looks as if he will stay on top of this list for some time longer after losing to Pablo Carreno Busta 4-6 3-6 on Wednesday.

With a record of 0-10 in ATP quarterfinals, he is still pretty far away from Teymuraz Gabashvili’s streak of 0-16. Despite having lost six more quarterfinals before winning his first QF this January against a retiring Bernard Tomic, Teymuraz climbed only to a ranking of 50. Still, we could argue that Teymuraz’s QF losing streak is not really over, given that the win came against a possibly injured player.

Running the numbers can answer questions such as “Who could climb up highest in the rankings without having won an ATP quarterfinal?” Doing so will put Andrey’s number 42 into perspective and will possibly reveal some other statistical trivia.

Player                Rank            Date   On
Andrei Chesnokov        30      1986.11.03    1
Yen Hsun Lu             33      2010.11.01    1
Nick Kyrgios            34      2015.04.06    1
Adrian Voinea           36      1996.04.15    1
Paul Haarhuis           36      1990.07.09    1
Jaime Yzaga             40      1986.03.03    1
Antonio Zugarelli       41      1973.08.23    1
Bernard Tomic           41      2011.11.07    1
Omar Camporese          41      1989.10.09    1
Wayne Ferreira          41      1991.12.02    1
Andrey Kuznetsov        42      2016.08.22    0
David Goffin            42      2012.10.29    1
Mischa Zverev           45      2009.06.08    1
Alexandr Dolgopolov     46      2010.06.07    1
Andrew Sznajder         46      1989.09.25    1
Lukas Rosol             46      2013.04.08    1
Ulf Stenlund            46      1986.07.07    1
Dominic Thiem           47      2014.07.21    1
Janko Tipsarevic        47      2007.07.16    1
Paul Annacone           47      1985.04.08    1
Renzo Furlan            47      1991.06.17    1
Mike Fishbach           47      1978.01.16    0
Oscar Hernandez         48      2007.10.08    1
Ronald Agenor           48      1985.11.25    1
Gary Donnelly           48      1986.11.10    0
Francisco Gonzalez      49      1978.07.12    1
Paolo Lorenzi           49      2013.03.04    1
Boris Becker            50      1985.05.06    1
Brett Steven            50      1993.02.15    1
Dominik Hrbaty          50      1997.05.19    1
Mike Leach              50      1985.02.18    1
Patrik Kuhnen           50      1988.08.01    1
Teymuraz Gabashvili     50      2015.07.20    1
Blaine Willenborg       50      1984.09.10    0

The table shows career highs (up to #50) that players reached before they won their first ATP QF. A 0 in the last column (On) indicates that the player can still climb in this table, because he has not won a QF yet. Some retired players are also marked with a 0, because they never managed to get past a QF during their career.
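
For readers who want to reproduce a table like this, the core lookup could be sketched as follows. The data layout (ranking histories and QF win dates keyed by player, with ISO date strings) and all names are hypothetical, and whether a ranking dated on the day of the first QF win should still count is a judgment call; the sketch excludes it.

```python
def best_rank_before_first_qf_win(rankings, qf_wins):
    """Return {player: (best rank, date reached, on)}.

    rankings: {player: [(date, rank), ...]} with ISO date strings.
    qf_wins:  {player: [date, ...]} for ATP quarterfinal wins.
    `on` is 1 if the player has won a QF and 0 if he has not, as in the
    last column of the table.
    """
    result = {}
    for player, history in rankings.items():
        wins = qf_wins.get(player, [])
        cutoff = min(wins) if wins else None  # date of the first QF win, if any
        eligible = [(rank, date) for date, rank in history
                    if cutoff is None or date < cutoff]
        if eligible:
            rank, date = min(eligible)  # lowest number, earliest date on ties
            result[player] = (rank, date, 1 if wins else 0)
    return result

# Hypothetical players and dates, only to show the behaviour.
rankings = {
    "Player A": [("2015-01-05", 60), ("2015-06-01", 42), ("2016-01-04", 30)],
    "Player B": [("2014-01-06", 80), ("2015-01-05", 55)],
}
qf_wins = {"Player A": ["2015-09-01"]}
highs = best_rank_before_first_qf_win(rankings, qf_wins)
```

Player A's later ranking of 30 is ignored because it came after his first QF win, which is the distinction the table is built on.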

I wonder who had Andrei Chesnokov on the radar for this. Before winning his first ATP QF he pushed his ranking as high as 30. He later went on to have a career high of 9. Nick Kyrgios also improved his ranking quickly without needing to go as deep as a SF. His Wimbledon 2014 QF, Roland Garros 2015 R32, and Australian Open 2015 QF runs helped him climb to #34 without a single win in an ATP QF. I would also particularly like to highlight Alexandr Dolgopolov, who reached #46 before having played even a single QF.

Looking only at players who are still active and able to up their ranking without an ATP SF we get the following picture:

Player                 Rank            Date
Andrey Kuznetsov         42      2016.08.22
Rui Machado              59      2011.10.03
Tatsuma Ito              60      2012.10.22
Matthew Ebden            61      2012.10.01
Kenny De Schepper        62      2014.04.07
Pere Riba                65      2011.05.16
Tim Smyczek              68      2015.04.06
Blaz Kavcic              68      2012.08.06
Alejandro Gonzalez       70      2014.06.09

Andrey is relatively alone at the top: Rui Machado, second on the list, reached his highest ranking about five years ago. Skimming through the remainder of the table, we would be surprised if anyone came close to Andrey’s 42 soon, although a sudden, unexpected run by an up-and-coming player could still change that.

So what practical implications does this have for analyzing tennis? Hardly any, I am afraid. Still, we can infer that it is possible to get well within the top 50 without winning more than two matches at a single tournament, over a period that can even span a player’s whole career. Of course it would be interesting to see how long such players can stay in these ranking areas, which guarantee direct acceptance into ATP tournaments and, hence, a more or less regular income from R32, R16, and QF prize money. Moreover, as the case of 2015-ish Nick Kyrgios shows, the question arises how one’s ranking points are composed: performing well on the big stage of Masters or Grand Slams can be enough for a decent ranking, even with poor performances at ATP 250s. On the other hand, are there players whose ATP points breakdown reveals that they are willing to go for easier points at ATP 250s while never having deep runs at Masters or Grand Slams? These are questions I would like to answer in a future post.

Peter Wetz is a computer scientist interested in racket sports and data analytics based in Vienna, Austria. I would like to thank Jeff for being open-minded and allowing me to post these surface-scratching lines here.