Italian translation at settesei.it
A few years ago, I posted one of my most-read and most-debated articles, called The Mirage of Surface Speed Convergence. Using the ATP’s data on ace rates and breaks of serve going back to 1991, it argued that surface speeds aren’t really converging, at least to the extent we can measure them with those two tools.
One of the most frequent complaints was that I was looking at the wrong data–surface speed should really be quantified by rally length, spin rate, or any number of other things. As is so often the case with tennis analytics, we have only so much choice in the matter. At the time, I was using all the data that existed.
Thanks to the Match Charting Project–with a particular tip of the cap to Edo Salvati–a lot more data is available now. We have shot-by-shot stats for 223 Grand Slam finals, including over three-fourths of Slam finals back to 1980. While we’ll never be able to measure anything like ITF Court Pace Rating for surfaces thirty years in the past, this shot-by-shot data allows us to get closer to the truth of the matter.
Sure enough, when we take a look at a simple (but until recently, unavailable) metric such as rally length, we find that the sport’s major surfaces are playing a lot more similarly than they used to. The first graph shows a five-year rolling average* for the rally length in the men’s finals of each Grand Slam from 1985 to 2015:
* since some matches are missing, the five-year rolling averages each represent the mean of anywhere from two to five Slam finals.
Over the last decade and a half, the hard-court and grass-court slams have crept steadily upward, with average rally lengths now similar to those at Roland Garros, traditionally the slowest of the four Grand Slam surfaces. The movement is most dramatic in the Wimbledon grass, which for many years saw an average rally length of a mere two shots.
For all the advantages of rally length and shot-by-shot data, there’s one massive limitation to this analysis: It doesn’t control for player. (My older analysis, with more limited data per match, but for many more matches, was able to control for player.) Pete Sampras contributed to 15 of our data points, but none on clay. Andres Gomez makes an appearance, but only at Roland Garros. Until we have shot-by-shot data on multiple surfaces for more of these players, there’s not much we can do to control for this severe case of selection bias.
So we’re left with something of a chicken-and-egg problem. Back in the early 90’s, when Roland Garros finals averaged almost six shots per point and Wimbledon finals averaged barely two shots per point, how much of the difference was due to the surface itself, and how much to the fact that certain players reached the final? The surface itself certainly doesn’t account for everything–in 1988, Mats Wilander and Ivan Lendl averaged over seven shots per point at the US Open, and in 2002, David Nalbandian and Lleyton Hewitt topped 5.5 shots per point at Wimbledon.
Still, outliers and selection bias aside, the rally length convergence we see in the graph above reflects a real phenomenon, even if it is amplified by the bias. After all, players who prefer short points win more matches on grass because grass lends itself to short points, and in an earlier era, “short points” meant something more extreme than it does today.
The same graph for women’s Grand Slam finals shows some convergence, though not as much:
Part of the reason that the convergence is more muted is that there’s less selection bias. The all-surface dominance of a few players–Chris Evert, Martina Navratilova, and Steffi Graf–means that, if only by historical accident, there is less bias than in men’s finals.
We still need a lot more data before we can make confident statements about surface speeds in 20th-century tennis. (You can help us get there by charting some matches!) But as we gather more information, we’re able to better illustrate how the surfaces have become less unique over the years.