Quantifying Fantasy Football Skill: Sleeper vs ESPN, which managers are better at Fantasy Football

Alex Cates
Oct 14, 2023

How do we know if our league is competitive?

Those of us who have played in multiple leagues know that some leagues are more competitive than others. Sometimes we have to constantly monitor waivers and news to gain the slightest edge; other times, we know our league mates only check their teams once a week, making waivers a cinch. At a base level this may not matter much (your league is your league, after all), but from a fantasy football analysis standpoint, understanding which leagues are competitive and which are not is vital.

I have previously written a number of blog posts looking at what strategies and habits improve championship odds (such as draft strategies, activity, or lineup decisions). In each of these, I made some basic attempts to control for manager skill, whether that was removing managers who start injured/bye week players or removing leagues with outlier scoring. However, I have long thought that there had to be a better way based on how "good" a manager or an entire league is at fantasy football. If I had a quantifiable league skill number, I could not only filter out leagues with skill levels too low or too high, but I could even explore if different strategies make sense at different skill levels.

As an example, I have previously written that stacking a QB and a WR on the same team does not matter in season-long fantasy football. However, in a highly competitive league, where waivers are thinner, you may want to play for the volatility of a stack (which is the conclusion Mike Leone at Establish the Run came to). I don't know.

Similarly, I have written about how different league settings affect both positional value and draft strategies. It would be great to know if different league settings make a league easier or harder to be a manager in, or how different positions are valued at different competitiveness levels (so that you can try to "Moneyball" your league based on which positions are over/undervalued).

So today, I am going to present 3 different metrics that attempt to quantify a league's skill level at drafting, roster management, and lineup decisions, respectively. All of these are at the level of the entire league, not any individual manager. And to keep things interesting, I will be comparing how competitive public leagues are on ESPN and Sleeper. I often hear that Sleeper is the more competitive platform, but is that true? Let's find out.

The Data

I will be using a combination of data taken from the 2020, 2021, and 2022 seasons for calculating and comparing these metrics. From ESPN, I have ~5600 leagues that will be quantified, while for Sleeper I have ~6500 leagues. All of the metrics are calculated in a way that tries to be agnostic to scoring and roster settings, so we should be able to compare all of them directly. I will describe how each metric is calculated as we go through them below.

Note: While the metrics try to be setting agnostic, they are not perfect, and I will discuss how they can be gamed below.

Quantifying The Draft: League Draft vs ADP

First up is the draft. While no draft is going to be perfect, I think a useful comparison is looking at the performance of who the league drafted vs who would have been drafted if everyone just followed ADP (Average Draft Position). So if the league drafted 150 players, we can compare those players to the top 150 players based on ADP. To make this comparison, I will use the season total VORP (value over replacement player) of each set of 150 players. Now, ADP may not perfectly align with each league's settings (I know I just said the metrics should be setting agnostic, but it's a start), but it does allow us to compare decisions based on the same amount of information (for instance, both sides would have taken JK Dobbins this year even though he was hurt and out for the season after week 1). We can then just divide the league's total draft VORP by ADP's total draft VORP to create our draft skill metric.
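As a rough illustration, the draft skill ratio described above could be sketched as follows. This is a minimal sketch, not the code behind the results in this post: the function name, the input structures, and the toy players and VORP values are all made up for the example, and it assumes every player has a season VORP value available.

```python
def draft_skill(league_picks, adp_ranking, season_vorp):
    """League's total drafted VORP divided by the VORP of an ADP-only draft.

    league_picks: players the league actually drafted (hypothetical input)
    adp_ranking:  players sorted by ADP, best first (hypothetical input)
    season_vorp:  dict of player -> season total VORP (hypothetical input)
    """
    n_picks = len(league_picks)
    # An ADP-only draft of the same size simply takes the top n_picks by ADP.
    adp_picks = adp_ranking[:n_picks]

    league_total = sum(season_vorp[p] for p in league_picks)
    adp_total = sum(season_vorp[p] for p in adp_picks)
    if adp_total == 0:
        raise ValueError("ADP draft has zero total VORP; ratio is undefined")
    return league_total / adp_total


# Toy data: five players, four draft picks. Negative VORP is allowed.
season_vorp = {"A": 100.0, "B": 80.0, "C": 60.0, "D": -10.0, "E": 40.0}
adp_ranking = ["A", "B", "C", "E", "D"]
league_picks = ["A", "B", "D", "E"]  # the league reached for D over C

skill = draft_skill(league_picks, adp_ranking, season_vorp)
print(skill)  # 210 / 280 = 0.75
```

In this toy case the league captured 75% of the value an ADP-only draft would have, because its one off-ADP pick returned negative VORP.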

Comparing Sleeper vs ESPN

Boxplots of each platform's draft skill. The red line marks the median league performance for each platform.

When we look at drafting abilities, there is a clear advantage among Sleeper managers. The average ESPN league is only drafting about 83% of the value it should be, whereas Sleeper leagues are getting ~95% of the value. Interestingly, both platforms are below 100%, suggesting that just following ADP may be a better strategy during the draft, though as I have previously shown, collecting ADP value is no guarantee of success. There is also an argument to be made for taking high-risk players late regardless of ADP on the off chance they become a winner for your team. Most of these players won't pay off, contributing negative VORP to the league total and lowering our draft skill metric, but you are playing for that chance of a huge positive outcome.

Quantifying Roster Management Skill: Rostering the Best Performers Each Week

To understand how I quantify roster management, let's work backwards. In a perfect league, the best performers of the week will be started and rostered each and every week. Say this perfect league has 12 teams with 1 starting QB slot. Then each week we would expect the top 12 QBs of the week to have been started. That includes starting the nobody QB who rushes for 3 TDs or the backup QB who takes over for the starter in the first quarter and ends up going off.

Now, none of us are in a perfect league with the ability to see the future. Additionally, with bench slots, even in the perfect league, the top 2 QBs on the week may be rostered by the same team, making the QB2 unavailable as a starting player for another team. We are also trying to separate roster management skill from starting decision skill (that will come later).

So to quantify Roster Management Skill, we can determine what the highest possible total score (across all the teams) would have been if all the best performers on the week were started. We can then compare this to the sum of the best possible score of all teams given who was on the roster. This would be the "bestball" score of each team, taking out any start decisions. Again, this won't be perfect, because we cannot see the future and some players who should have been started will be on waivers, but better leagues should be closer to the ideal.
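To make the comparison concrete, here is a minimal sketch of one week's roster management ratio. It is an illustration under simplifying assumptions, not the code behind the results in this post: it assumes fixed slots per position with no flex spots, it takes the ratio as summed bestball score divided by the ideal score, and every name and number in it is invented.

```python
def top_sum(scores, n):
    """Sum of the n highest values in scores."""
    return sum(sorted(scores, reverse=True)[:n])


def roster_management_skill(rosters, weekly_points, positions, slots):
    """Summed bestball score of all teams divided by the league-wide ideal.

    rosters:       dict of team -> list of rostered players
    weekly_points: dict of player -> points scored this week (all players)
    positions:     dict of player -> position
    slots:         dict of position -> starting slots per team (no flex)
    """
    n_teams = len(rosters)
    ideal = 0.0
    bestball = 0.0
    for pos, n_slots in slots.items():
        # Ideal: the week's best performers fill every starting slot.
        pool = [pts for p, pts in weekly_points.items() if positions[p] == pos]
        ideal += top_sum(pool, n_teams * n_slots)
        # Bestball: each team starts the best players it actually rostered.
        for roster in rosters.values():
            team = [weekly_points[p] for p in roster if positions[p] == pos]
            bestball += top_sum(team, n_slots)
    return bestball / ideal


# Toy week: 2 teams, 1 QB slot and 1 RB slot each.
positions = {"QB1": "QB", "QB2": "QB", "QB3": "QB", "QB4": "QB",
             "RB1": "RB", "RB2": "RB", "RB3": "RB"}
weekly_points = {"QB1": 30.0, "QB2": 25.0, "QB3": 20.0, "QB4": 10.0,
                 "RB1": 20.0, "RB2": 15.0, "RB3": 5.0}
rosters = {
    "Team X": ["QB1", "QB2", "RB3"],  # holds both top QBs, one on the bench
    "Team Y": ["QB3", "RB1"],         # RB2 and QB4 sit on waivers
}
slots = {"QB": 1, "RB": 1}

gm_skill = roster_management_skill(rosters, weekly_points, positions, slots)
print(round(gm_skill, 3))  # 75 / 90 = 0.833
```

The toy rosters show both effects discussed above: Team X's benched QB2 and the unrostered RB2 each pull the league below the ideal, even though no start/sit decision was involved.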

Comparing Sleeper vs ESPN

Boxplots of roster management skill between ESPN and Sleeper leagues. The red line represents the median of each group.

Again, we find that the Sleeper leagues are better at roster management, with roughly 93% of the ideal points available on their rosters. ESPN leagues are closer here, with around 88% of the ideal points available on their rosters. This means that waivers will be more valuable in ESPN leagues than in Sleeper leagues, simply because there are going to be more week winners available each week (though whether we can identify them beforehand is a different question). Similarly, this suggests that Sleeper managers are more skilled, with more of the best performing players rostered each week.

Quantifying Lineup Decision Skill: Coach vs Projections

Finally, to quantify lineup decision skills we will compare how the starting lineup the manager chose compares to the starting lineup that the projected points would have put out. Both of these will be compared based on the final, real results. For example, consider the following options for who to start at QB:

Player	Projected Points	Actual Points	Started By
Patrick Mahomes	20	18	Manager
Anthony Richardson	21	24	Computer

The manager plays Mahomes, trusting the past performance despite the lower projection, and only scores 18. The computer, purely following projections, plays Richardson, who was projected for 21 and scored 24. We would divide the actual points scored in the manager's lineup (18) by the computer's lineup's score (24) and get an overall coaching skill metric (in this case 75%). This is done across the entire roster to create our overall coach skill for the manager for the week. Of note, by dividing the coach's lineup score by the projection's lineup score, we can normalize for scoring and roster settings. To get the performance of the whole league, we can just average across all of the managers. Note, for those of you who use Fantasy League Report, this is one of the 2 metrics that go into your weekly coaching grade.
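The Mahomes/Richardson example can be sketched in a few lines. This is a simplified sketch rather than the Fantasy League Report implementation: it assumes fixed slots per position with no flex spots, and the function names and data structures are hypothetical.

```python
def projected_lineup(roster, projected, positions, slots):
    """Lineup the 'computer' would start: top projected players per slot."""
    lineup = []
    for pos, n_slots in slots.items():
        candidates = [p for p in roster if positions[p] == pos]
        candidates.sort(key=lambda p: projected[p], reverse=True)
        lineup.extend(candidates[:n_slots])
    return lineup


def coach_skill(started, roster, projected, actual, positions, slots):
    """Actual points of the manager's lineup over the projection lineup's."""
    manager_points = sum(actual[p] for p in started)
    computer = projected_lineup(roster, projected, positions, slots)
    computer_points = sum(actual[p] for p in computer)
    return manager_points / computer_points


def league_coach_skill(team_skills):
    """League-level metric: plain average of the managers' ratios."""
    return sum(team_skills) / len(team_skills)


# The QB decision from the table above.
roster = ["Patrick Mahomes", "Anthony Richardson"]
positions = {"Patrick Mahomes": "QB", "Anthony Richardson": "QB"}
projected = {"Patrick Mahomes": 20.0, "Anthony Richardson": 21.0}
actual = {"Patrick Mahomes": 18.0, "Anthony Richardson": 24.0}
slots = {"QB": 1}

ratio = coach_skill(["Patrick Mahomes"], roster, projected, actual,
                    positions, slots)
print(ratio)  # 18 / 24 = 0.75
```

Because both lineups are scored with the same actual results and the same league rules, the ratio stays comparable across different scoring and roster settings.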

Comparing Sleeper vs ESPN

Boxplots of ESPN and Sleeper leagues coaching performance. The red line represents the median coaching skill of each group.

Surprisingly, here we see a slight advantage for ESPN leagues. When it comes to which players on your roster to start, ESPN leagues get 98% of the points of the projected lineup, while Sleeper is around 97%. This may reflect differences in projection accuracy, it could be an effect of the poorer rosters in ESPN leagues making decisions easier, or it could be a real difference. Regardless, it seems there is one place where ESPN is equal to, or more competitive than, Sleeper.

One other note: neither of these platforms is consistently above 100% (with 100% being equal to just following the projections). As I have shown over and over again, we average human managers are not better than the projections our platforms provide. Use your fantasy football management energy elsewhere.

Limitations

As always, there are some limitations. First and foremost, while I tried to make these metrics setting agnostic, some settings will game the metrics. For example, a team with no bench players, or whose bench is full of injured players, will be scored as a "better" coach because it will consistently align with the projected lineup simply because it has no other options (this is only true as long as the projections outperform the average manager). For GM ability, having a large bench will naturally make it more likely that the relevant players are rostered and that the bestball lineup is closer to the ideal. The same goes for simply drafting more players, as the differences from ADP will matter less and less. We will dive into the actual effect of these and other settings in a future post, but these weaknesses of the proposed metrics should be kept in mind during these comparisons.

Second, because of how the data was collected, there is a chance that the Sleeper data is biased in a way the ESPN data is not. The ESPN data was collected by guessing successive league ID numbers and saving the ones that work, so there is no reason why one league should be related to another in the dataset. In contrast, each Sleeper league is guaranteed to be related to another in the dataset: every league shares a manager with at least one other league. That is because I made the dataset by cycling between the managers in a league and the leagues those managers are a part of. While the dataset is large enough that I doubt this substantially affected the results, there is some chance it may have.
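To show why that collection method links the Sleeper leagues together, here is a small sketch of a league-to-manager-to-league crawl. It is not the actual collection script and makes no real API calls: `managers_in` and `leagues_of` are hypothetical lookup functions, backed here by toy dictionaries.

```python
from collections import deque


def crawl_leagues(seed_league, managers_in, leagues_of, max_leagues):
    """Breadth-first crawl: league -> its managers -> their other leagues.

    managers_in(league) and leagues_of(manager) are hypothetical lookups
    supplied by the caller; max_leagues caps the size of the dataset.
    """
    seen_leagues = {seed_league}
    seen_managers = set()
    queue = deque([seed_league])
    while queue:
        league = queue.popleft()
        for manager in managers_in(league):
            if manager in seen_managers:
                continue
            seen_managers.add(manager)
            for other in leagues_of(manager):
                if other not in seen_leagues and len(seen_leagues) < max_leagues:
                    seen_leagues.add(other)
                    queue.append(other)
    return seen_leagues


# Toy data: L4 shares no manager with the other leagues.
league_members = {"L1": ["m1", "m2"], "L2": ["m2", "m3"],
                  "L3": ["m3"], "L4": ["m9"]}
manager_leagues = {"m1": ["L1"], "m2": ["L1", "L2"],
                   "m3": ["L2", "L3"], "m9": ["L4"]}

found = crawl_leagues("L1", league_members.get, manager_leagues.get, 100)
print(sorted(found))  # ['L1', 'L2', 'L3']
```

A crawl like this can only ever reach leagues connected to the starting point through shared managers, so an isolated league such as L4 never enters the sample. That is the source of the potential bias described above.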

Conclusions

The headline conclusion is that Sleeper leagues tend to be more skilled than ESPN leagues, especially when it comes to the draft and roster management. ESPN leagues tended to make slightly better decisions come gameday (though both left about 2 to 3% of their points off their starting lineups by not following the projections). Since most of us play with friends or people we know, I am not sure there is anything actionable here, but it does set up the possibility for a wide range of future analyses, so stay tuned.

Do you have other ideas for quantifying league or manager skill? Or do you have critiques of the metrics proposed? Please reach out and let me know!

Questions? Comments? Let me know at ac@alexcates.com. Want to read more breakdowns like this? Sign up for my newsletter. Finally, like what I do? Consider supporting me on Buy Me a Coffee or by signing up for Fantasy League Report.