I got an email this week, from Peter S., asking this question. There are a few reasons.
Peter zeroed in on a first-rounder at the Temuco Challenger, between Milledge Cossu and Alafia Ayeni. Ayeni is ranked 731, Cossu in the 1600s. Betting markets had Ayeni as a heavy favorite; my forecast gave Cossu the edge. Sportsbooks, unsurprisingly, had this one right, as Ayeni needed just 68 minutes to advance.
My forecasts are based on my Elo ratings, and my Elo ratings take into account all tour-level main-draw and qualifying matches, plus all Challenger main-draw results. (For women, I consider ITFs down to the $50K level.) For most players that you’ve heard of, that means that my Elo ratings are looking at every match they play. But for the likes of Ayeni and Cossu, it’s the opposite. The majority of Ayeni’s results this year have come at the ITF level. Cossu doesn’t have many pro results, period.
Point being, my forecast for a match like that is based on too little information to be anywhere near reliable. It might agree with the betting market, but only by happenstance. Ayeni has a poor recent record in Challengers, while Cossu hasn’t played any. By the logic of Elo, even starting a newbie at a fairly low rating, that makes Cossu the favorite. Give them both a dozen more matches, and the kinks would be ironed out, but we don’t get to do that simply for the sake of science.
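To see how a newcomer can end up as the favorite, here is a minimal sketch using the textbook Elo expected-score formula. The two ratings are made-up numbers for illustration, not the values behind the actual forecast, and the real model may differ in its scale and starting rating.

```python
def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings (assumptions, not taken from the real model):
# a newcomer sitting at an assumed default starting rating, and a player
# whose rating has slipped just below that default after a run of losses.
newcomer_rating = 1200.0
slumping_rating = 1170.0

p_newcomer = elo_win_probability(newcomer_rating, slumping_rating)
print(f"Newcomer win probability: {p_newcomer:.1%}")
```

Even a small rating gap tips the forecast toward the newcomer, although the newcomer's number reflects nothing more than the default.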
Maybe I should indicate that more clearly on the forecast pages. (Or maybe I should include ITF results, too. Lots of stuff I should probably do.) In the meantime, you can check my Elo ratings leaderboard. Neither Ayeni nor Cossu even appears, indicating that neither has played ten matches in the last 52 weeks that contribute to their rating. Ayeni is close to that threshold, so his rating (1166) is probably in the ballpark. Cossu is an unknown quantity. If players aren’t on that list, their forecasts aren’t going to be as accurate as those for players who are.
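The leaderboard rule (ten rated matches in the last 52 weeks) can be sketched as a simple filter. This is a rough illustration, not the actual leaderboard code: the function name, the constants, and the assumption that each match is represented by its date are all hypothetical.

```python
from datetime import date, timedelta

MIN_RATED_MATCHES = 10          # threshold described above
WINDOW = timedelta(weeks=52)    # look-back period described above


def appears_on_leaderboard(rated_match_dates: list[date], today: date) -> bool:
    """Return True if enough rated matches fall inside the 52-week window.

    `rated_match_dates` is assumed to hold only matches that count toward
    the rating (tour-level, tour-level qualifying, Challenger main draw).
    """
    cutoff = today - WINDOW
    recent = [d for d in rated_match_dates if cutoff <= d <= today]
    return len(recent) >= MIN_RATED_MATCHES


# Illustrative dates only: nine recent rated matches fall short of the cut.
today = date(2025, 11, 24)
nine_matches = [today - timedelta(weeks=w) for w in range(1, 10)]
print(appears_on_leaderboard(nine_matches, today))
```

A check like this is a quick way to flag forecasts that rest on thin data before taking them at face value.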
Outside the model
For extreme gaps between my forecasts and betting odds, limited data is usually the answer. You’ll often find smaller–but still puzzling–gaps, even between players with extensive track records.
It’s worth considering what’s “in” the model. Elo looks at match results–period. Has the player won or lost lately, and against whom? My single substantial tweak to that is an injury/absence penalty, so if someone misses a lot of time (minimum eight weeks during the season), they get docked. The assumption is that they’ll come back rusty or still physically compromised. The size of the penalty is based on player results after past absences. Though of course not all absences are alike, and players differ in how they handle them.
For players who haven’t missed time lately, any “news” isn’t going to show up in the forecast. If word leaks out that Alcaraz is dealing with a bum ankle, betting markets will adjust, but my forecast will not.
If players have missed enough time to trigger the penalty, their Elo ratings are less reliable until they’ve gotten several matches under their belt. When Sinner came back from his doping ban, he was surely in better shape than the typical guy who had just sat out three months. Same story with Djokovic’s layoffs-by-choice this season. On the flip side, a player who comes back too soon, perhaps treating a 250 as a mere trial run, is less likely to win than his adjusted Elo rating suggests.
Surfaces
Another major factor outside my Elo model is the specifics of surface. Not all hard courts are created equal, and I don’t even differentiate between indoor and outdoor. (I know, another thing you want me to do.) My forecasts probably underestimated Sinner’s chances of breezing through the last several weeks of the season because they did not recognize that indoor Sinner is reliably better than outdoor-hard-court Sinner.
Even among outdoor courts, speed varies enormously. Some players are considerably better on faster or slower courts, even if they are the same type. When Rafael Nadal was winning 1.2 million consecutive matches at Roland Garros, my model always considered him the favorite, but not by the overwhelming margin that bettors (rightly) did. Part of the reason was that the Paris clay is reliably slow, while Nadal was more vulnerable at, say, Madrid. So my Elo ratings, tossing all of his clay results into the same bucket, saw Rafa as (barely) beatable, even though it took an act of God to dislodge him at the French.
In short, if a player is particularly well-suited to the conditions at a certain tournament, Elo isn’t going to pick that up. He’ll be underrated in the forecasts. The degree depends on the player, and on just how well-suited he is.
There can also be an issue with limited surface data, most often–but not always–during grass season. Young players on the rise might show up at Wimbledon qualies without ever having played a professional match on grass. Their overall (surface-agnostic) rating will tell us something about their level, and my model makes an adjustment for their grass inexperience. But that sort of player could be anything from a grass natural to a hopeless case. Some of that might be predictable to a savvy fan, but Elo doesn’t have a clue.
Financial advice
If you’re using my forecasts as betting advice, stop doing that. C’mon, man.
If you check my forecasts for entertainment purposes, it’s good to know exactly what you’re looking at, and what the numbers are based on. Hope this helps!








