What I Should’ve Known About Playing Styles and Upsets

In the podcast Carl Bialik and I recorded yesterday, I mentioned a pet theory I’ve had for a while: that upsets are more likely in matches between players with contrasting styles. The logic is fairly simple. If you have two counterpunchers going at it, the better counterpuncher will probably win. If two big servers face off, the better big server should have no problem. But if a big server plays a counterpuncher … then, all bets are off.

We’ve seen Rafael Nadal struggle against the likes of John Isner and Dustin Brown, and we’ve seen big servers neutralized by their opposites, as in Marin Cilic’s 1-6 record against Gilles Simon. There are upsets when similar styles clash, as well, but as untested theories go, this one is appealing and not obviously flawed.

Then, to kick off the 2019 Australian Open, Reilly Opelka knocked out Isner. Playing styles don’t come much more evenly matched, and the veteran was the heavy favorite. It was a perfect example of the kind of match I would expect to follow the script, yet the underdog came out on top. They played four tiebreaks and there were only two breaks of serve, but Opelka didn’t even need the Australian Open’s new fifth-set 10-point tiebreak. While it’s just one match, of course, it suggested that I ought to look more closely at my assumptions.

After a couple of hours playing with data this afternoon, my theory is no longer untested … and it turned out to be flawed. Fortunately, it isn’t just another negative result. Playing style is related to upset likelihood, but not in the way I predicted.

Measuring predictability

Let me explain how I tested the idea, and we’ll work our way to the results. First, I used Match Charting Project data to calculate an aggression score for every ATP player with at least 10 charted matches since 2010. Aggression score is, essentially, the percentage of shots that end the point (by winner, unforced error, or inducing a forced error), and it will serve as our proxy for playing style. That gives us a group of 106 players, from the conservative Simon and Yoshihito Nishioka, with aggression scores around 13%, to the freewheeling Brown and Ivo Karlovic, with scores nearing 30%. I divided those 106 players into quartiles (by number of matches, not number of players, so each quartile contains between 21 and 31 players) so we could see how each general playing style fares against the others. Here are the groups:

(Aggression score conflates two things: big serving/big hitting and tactical aggression. Isner is sometimes not particularly aggressive, but because of his size and serve skill, he is able to end points so frequently that, statistically, he appears to be extremely aggressive. Accordingly, I’ll refer to “big servers” and “aggressive players” interchangeably, even though in reality, there are plenty of differences between the two groups.)
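To make the definition concrete, here is a minimal Python sketch of the aggression score and of a quartile split weighted by matches rather than by players. It follows only the one-sentence definition above, so it is a simplification rather than the Match Charting Project's exact calculation. The function names, the input layout, and the rule for placing a player whose matches straddle a quartile boundary are all assumptions made for illustration.

```python
def aggression_score(winners, unforced_errors, induced_forced_errors, total_shots):
    """Share of a player's shots that end the point (simplified definition)."""
    if total_shots <= 0:
        raise ValueError("total_shots must be positive")
    point_ending = winners + unforced_errors + induced_forced_errors
    return point_ending / total_shots


def quartiles_by_matches(players):
    """Assign quartiles 1-4 so each holds roughly a quarter of all matches.

    `players` is a list of (name, aggression_score, n_matches) tuples.
    Q1 is the least aggressive group, Q4 the most aggressive.
    """
    ordered = sorted(players, key=lambda p: p[1])
    total_matches = sum(n for _, _, n in ordered)
    quartile = {}
    matches_seen = 0
    for name, _score, n in ordered:
        # Assumption: bucket each player by where his block of matches
        # begins in the cumulative count, so nobody is split across groups.
        quartile[name] = min(4, (4 * matches_seen) // total_matches + 1)
        matches_seen += n
    return quartile
```

Because the split is by matches, a quartile full of players with many matches will contain fewer names than one made up of players with few, which is why the group sizes are uneven.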

Limiting our view to these 106 men, I found just over 11,000 matches to evaluate and divided them into groups based on which quartiles the two players fell into. Each of the ten possible subsets of matches, like Q1 vs Q2, or Q4 vs Q4, contains at least 400 examples.
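The count of ten subsets comes from treating the pairing as unordered: Q1 vs Q4 is the same bucket as Q4 vs Q1. A short sketch, with hypothetical names, that enumerates the buckets and labels a match:

```python
from itertools import combinations_with_replacement

QUARTILES = (1, 2, 3, 4)

# Every unordered pair of quartiles, including a quartile against itself.
PAIRINGS = list(combinations_with_replacement(QUARTILES, 2))


def pairing(quartile_a, quartile_b):
    """Order-independent label: (less aggressive quartile, more aggressive quartile)."""
    return (min(quartile_a, quartile_b), max(quartile_a, quartile_b))
```

Four same-quartile buckets plus six mixed ones give the ten subsets.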

For every match, I used surface-adjusted Elo ratings to determine the likelihood that the favorite would win. That gives us pre-match odds that aren’t quite as accurate as what sportsbooks might offer, though they’re close.
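For readers unfamiliar with Elo, the conversion from two ratings to a win probability is the standard logistic formula on a 400-point scale. The sketch below assumes the ratings passed in have already been surface-adjusted; how that adjustment is made is not described here, so it is left out. The function names are illustrative.

```python
def elo_win_prob(rating_a, rating_b):
    """Standard Elo expectation that player A beats player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def favorite_prob(rating_a, rating_b):
    """Win probability of whichever player Elo favors."""
    p = elo_win_prob(rating_a, rating_b)
    return max(p, 1.0 - p)
```

Under this formula, equal ratings give 50%, and a 400-point gap makes the higher-rated player roughly a 91% favorite.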

Those pre-match odds are key to determining whether certain groups are more predictable than others. If there are 100 matches in which the favorite is given a 60% chance of winning, and the favorites win 70 of them, we’d say that the results were more predictable than expected. If the favorites win only 50, the results were less predictable.
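One way to put a number on that comparison is to divide the favorites' actual win rate by their expected win rate. A minimal sketch, with a hypothetical function name:

```python
def predictability_ratio(expected_win_prob, favorite_wins, n_matches):
    """Actual favorite win rate divided by the pre-match expectation."""
    actual_win_rate = favorite_wins / n_matches
    return actual_win_rate / expected_win_prob


# The two scenarios from the paragraph above: 100 matches with 60% favorites.
more_predictable = predictability_ratio(0.60, 70, 100)  # about 1.17
less_predictable = predictability_ratio(0.60, 50, 100)  # about 0.83
```

Values above 1 mean favorites outperformed their odds, and values below 1 mean upsets were more common than expected.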

Goodbye, pet theory

For the matches in each of the ten quartile-vs-quartile subsets, I calculated the average favorite’s chance of winning (“Fave Odds”), then compared that to the frequency with which the favorites went on to win (“Fave Win%”). The table below shows the results, along with the relationship between those two numbers (“Ratio”). A ratio of 1.0 means that matches within the subset are exactly as predictable as expected; higher ratios mean that the favorites were even better bets than the odds gave them credit for, and lower ratios indicate more upsets than expected.
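As a rough sketch of how those three columns could be produced, the function below groups matches by bucket and computes the average favorite probability, the favorite win rate, and their ratio. The input layout, a sequence of (bucket, favorite probability, favorite won) tuples, is an assumption for illustration and not the actual data format used.

```python
from collections import defaultdict


def summarize_buckets(matches):
    """Return {bucket: (fave_odds, fave_win_pct, ratio)}.

    `matches` is an iterable of (bucket, favorite_prob, favorite_won) tuples.
    """
    # Per bucket: [sum of favorite probabilities, favorite wins, match count]
    totals = defaultdict(lambda: [0.0, 0, 0])
    for bucket, prob, won in matches:
        entry = totals[bucket]
        entry[0] += prob
        entry[1] += 1 if won else 0
        entry[2] += 1

    summary = {}
    for bucket, (prob_sum, wins, n) in totals.items():
        fave_odds = prob_sum / n      # average pre-match favorite probability
        fave_win_pct = wins / n       # how often the favorite actually won
        summary[bucket] = (fave_odds, fave_win_pct, fave_win_pct / fave_odds)
    return summary
```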

[table id=1 /]

There’s a striking finding here: The largest ratio, marking the most predictable bucket of matches, is for the most conservative pairs of players, while the smallest ratio, pointing to the most frequent upsets, is for the most aggressive players.

Before analyzing the relationship, let’s check one more thing. The very best players aren’t evenly divided throughout the quartiles, since Q1 has two of the big four. Elo-based match predictions–one of the building blocks of these results–are tougher to get right for the best players and the most uneven matchups, so we need to be careful whenever the elites might be influencing our findings. Therefore, let’s look at the same numbers, but this time for only those matches in which the favorite has a 50% to 70% chance of winning. This way, we exclude many of the best players’ matchups and all of their more lopsided contests:

[table id=2 /]

We discard about 40% of our sample, but the predictability trend remains generally the same. In both the overall sample and the narrower 50%- to 70%-favorite subset, the strongest relationship I could find was between the predictability ratio and the quartile of the less aggressive player. In other words, a counterpuncher is likely to have more predictable results–regardless of whether he faces a big server, a fellow counterpuncher, or anyone in between–than a more aggressive player.
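The narrower cut and the grouping by the less aggressive player can be sketched together. This is an illustration under assumptions: the tuple layout is hypothetical, Q1 is taken to be the least aggressive quartile, and the 50% and 70% bounds are treated as inclusive, which the text does not specify.

```python
def ratio_by_less_aggressive_quartile(matches, low=0.50, high=0.70):
    """Predictability ratio keyed by the quartile of the less aggressive player.

    `matches` is an iterable of
    (favorite_quartile, underdog_quartile, favorite_prob, favorite_won) tuples.
    Only matches whose favorite probability lies in [low, high] are counted.
    """
    sums = {}  # quartile -> (sum of favorite probs, favorite wins, match count)
    for q_fav, q_dog, prob, won in matches:
        if not (low <= prob <= high):
            continue
        key = min(q_fav, q_dog)  # Q1 is assumed to be the least aggressive
        prob_sum, wins, n = sums.get(key, (0.0, 0, 0))
        sums[key] = (prob_sum + prob, wins + (1 if won else 0), n + 1)
    return {
        key: (wins / n) / (prob_sum / n)
        for key, (prob_sum, wins, n) in sums.items()
    }
```

Calling it with `low=0.5, high=1.0` would reproduce the same grouping on the full sample.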

Back to basics

My initial theory is clearly wrong. I expected to find that Q1 vs Q1 matches were more predictable than average, and I was right. But by the same logic, I also guessed that Q4 vs Q4 matches would go according to script, and that other pairings, like Q1 vs Q4, would be more upset-prone. I would have done better had I let the neighbor’s cat make my predictions for me.

Instead, we find that matches with more aggressive players are more likely to result in surprises. That doesn’t sound so groundbreaking, and it’s something I should’ve seen coming. Big servers tend to hold serve more often and break serve less frequently, meaning that their matches end with narrower margins, opening the door for luck to play a larger role, especially when sets and matches are determined by tiebreaks.
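To see why a strong serve pushes matches toward tiebreaks, consider a toy model in which the server wins every point independently with the same probability. That independence is an assumption for illustration, not something the data above establishes. Under it, the chance of holding serve has a well-known closed form:

```python
def hold_probability(p):
    """Chance the server wins a game if he wins each point with probability p.

    Toy model: points are independent and identically distributed.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    q = 1.0 - p
    # Win the game to love, 15, or 30 without ever reaching deuce.
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach 3-3 in points (deuce), then win two points in a row before losing two.
    reach_deuce = 20 * p**3 * q**3
    win_from_deuce = p**2 / (p**2 + q**2)
    return before_deuce + reach_deuce * win_from_deuce
```

In this model, raising the server's point-win rate from 60% to 70% lifts his hold rate from roughly 74% to roughly 90%. When both players sit at the high end, breaks become rare and sets are more likely to be settled by a handful of tiebreak points.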

After all this, you might be thinking that I’ve squandered my afternoon, plus another few minutes of your attention, arriving at something obvious and unremarkable. I agree that it’s not that exciting to proclaim that big servers are more influenced by luck. But there’s still a useful–even surprising–discovery buried here.

Exponential upset potential

We know that the most one-dimensional players are more subject than others to the ups and downs of luck, thanks to the narrow margins of tiebreaks. For a man who rarely breaks serve, no match is a guaranteed win; for a man who rarely gets broken, no opponent is impossible to beat. However, I would have expected that the unpredictability of big servers was already incorporated into our match predictions, via the Elo ratings of the big servers. If a player has unusually random results, we’d expect his rating to drift toward tour average. That’s one reason that it’s very difficult for poor returners to reach the very top of the rankings.

But apparently, that isn’t quite right. The randomness-driven Elo ratings of our big servers do a nearly perfect job of predicting match outcomes against counterpunchers, and they’re only a little bit too confident against the more middle-of-the-road players in Q2 and Q3. Against each other, though, upsets run rampant. That extremely volatile fraction of results–the tiebreak-packed outcomes when the biggest servers face off–only accounts for part of these players’ ratings.

We’re accustomed to getting unpredictable results from the most aggressive players, with their big serves, inconsistent returns, and short rallies. Today’s findings give us a better idea of when these do and do not occur. Against counterpunchers, things aren’t so unpredictable after all. But when big servers play each other, we expect the unexpected–and the results are even more unpredictable than that.
