ESPN Coverage: How ESPN's Tweets reflect American Sports Society
Updated: a day ago
Sports are an essential part of our society. They act as a reflection of society and what it values. The Super Bowl is the most-watched event in America every year, and for many (including myself), the NBA suspending its season last year marked the beginning of the COVID-19 pandemic. Sports can also be a catalyst for change in society. Just look at the impact of Colin Kaepernick taking a knee during the national anthem, or the Atlanta Dream coming out in support of Reverend Warnock.
For better or worse, ESPN is the go-to source for all things sports. While there are newcomers appearing on the scene now, ESPN has always touted itself as the worldwide leader in sports and, at least in the United States, that is an undisputed title. Given the impact of ESPN on the public conversations that surround sports, it is important to look at how ESPN covers sports in general and at what that coverage can tell us about ESPN's own biases and our biases as a society, and possibly to suggest ways of going forward. In this initial post, I want to establish that what ESPN tweets out is an accurate representation of the state of sports and reflects society's interests in sports on multiple time scales. In future posts, I will then try to use this tweet data to ask some more interesting questions.
For this post, we will be using all of the tweets the @espn account has sent. Since its inception on Twitter back in 2007, ESPN has sent over 96,000 tweets, which form our dataset. Specifically, using snscrape and tweepy, we can collect the date of the tweet, the actual text (including any tags and hashtags), the number of likes and retweets the tweet received, and whether the tweet itself was a retweet by the official @espn account. In order to be able to say anything, we have to determine the topic of each tweet. For this post, I will attempt to assign a league to each tweet (specifically the NBA, NFL, MLB, NHL, or MLS). We will do this by matching strings within the text of the tweet. We will look for a number of things, including official league accounts (e.g. @nba) and the official names and accounts of each team in the league (e.g. @patriots). Additionally, many of the tweets only reference a player, so we will search for any player who has made an all-star team in their respective league since 2007. With all of this, we get a sample of almost 47,000 tweets with an assigned league, or about half of all of ESPN's tweets.
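The string matching described above could look roughly like the sketch below. It is only an illustration: the keyword lists are hypothetical and heavily truncated, the names `LEAGUE_KEYWORDS` and `assign_league` are made up for this example, and leaving a tweet unassigned when it matches more than one league is an assumption rather than something the analysis specifies.

```python
import re

# Hypothetical, heavily truncated keyword lists. The real lists would hold every
# league account, team name, team account, and all-star player since 2007.
LEAGUE_KEYWORDS = {
    "NBA": ["@nba", "@lakers", "lakers", "lebron james"],
    "NFL": ["@nfl", "@patriots", "patriots", "tom brady"],
    "MLB": ["@mlb", "@yankees", "yankees", "mike trout"],
    "NHL": ["@nhl", "@penguins", "penguins", "sidney crosby"],
    "MLS": ["@mls", "@lagalaxy", "la galaxy"],
}


def compile_patterns(keywords):
    """Build one case-insensitive regex per league from its keyword list."""
    patterns = {}
    for league, words in keywords.items():
        alternatives = "|".join(re.escape(word) for word in words)
        # The lookarounds stop "@nba" from matching inside "@nbatv".
        patterns[league] = re.compile(
            rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE
        )
    return patterns


PATTERNS = compile_patterns(LEAGUE_KEYWORDS)


def assign_league(text):
    """Return the one league a tweet matches, or None if zero or several match."""
    matches = [lg for lg, pattern in PATTERNS.items() if pattern.search(text)]
    # Assumption: ambiguous tweets are left unassigned rather than guessed.
    return matches[0] if len(matches) == 1 else None
```

Calling `assign_league("What a night for @Lakers fans")` returns `"NBA"`, while a tweet like "Buzzer-beater for 3!" matches nothing and stays unassigned.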
The assignment of tweets to leagues, and specifically the nearly 49,000 tweets that are unassigned, will obviously be the biggest limitation of the following analysis. Many tweets are likely in reference to leagues or sports I am not looking at (such as tennis or the English Premier League). Some of these are likely caught in my sample by mistake (such as a basketball player who gets mentioned both in college and in the NBA), but it shouldn't be enough to change the analysis in any sizable way. Additionally, many tweets just say something like "buzzer-beater for 3", which may be NBA but may also be international or college basketball, or the tweet may describe a field goal in football or a hat trick in soccer or hockey. These tweets include the highlight, so it's easy for us as humans to figure it out, but for a machine working from the text alone, it's just not possible. There may be a way to use machine learning to work these out, but I will leave that for a future post. Importantly, there is no reason to suspect that the unassigned tweets are biased towards one league or another. Every league had the same kinds of inputs for assignment, so I will proceed under the assumption that the assigned tweets are a representative sample.
Which league reigns supreme?
The first thing to look at is which is the most popular league. An easy first look is which league does ESPN tweet about the most.
Clearly, the NBA and the NFL are dominating here (and in fact make up almost 2/3 of all the tweets identified). This is both expected (they are the two most popular leagues in America) and surprising (I expected the NFL to be on top, but I may be biased). I am also surprised by the strength of the MLB, which takes a solid third place and clearly distinguishes itself from the NHL and MLS.
But this is just volume. What about the responses to the tweets?
Median number of retweets and likes per tweet per league over the entire timeframe
This was the first real surprise of this analysis. The NBA gets a far larger response than the other leagues. This aligns with the volume (ESPN should tweet more about the league that gets the largest response), but it is impressive that even with that volume, NBA tweets maintain such a large response. I do think this reflects Twitter's user base, which skews younger, much like the NBA's fanbase. The other shocker here is the response that the MLS is receiving. Despite having nearly one-tenth the number of tweets as the NFL, it is outpacing the NFL in likes per tweet and is about even in retweets. This suggests that ESPN should be tweeting out more MLS content, since there is a market there. A third point of interest is the lack of response to MLB and NHL tweets. This suggests that the average fan of these sports is not on Twitter, or at the very least does not care about ESPN's tweets.
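For readers who want to reproduce this kind of summary, here is a minimal sketch of a per-league median using only the standard library. The rows are made-up numbers, not values from the dataset, and `median_response` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import median

# Made-up rows standing in for the labeled tweets: (league, likes, retweets).
tweets = [
    ("NBA", 1200, 300),
    ("NBA", 800, 150),
    ("NBA", 5000, 900),
    ("NFL", 400, 120),
    ("NFL", 600, 80),
    ("MLS", 700, 100),
]


def median_response(rows):
    """Return {league: (median likes, median retweets)}."""
    likes, retweets = defaultdict(list), defaultdict(list)
    for league, n_likes, n_retweets in rows:
        likes[league].append(n_likes)
        retweets[league].append(n_retweets)
    return {lg: (median(likes[lg]), median(retweets[lg])) for lg in likes}


summary = median_response(tweets)
```

With these toy rows, `summary["NBA"]` is `(1200, 300)`: the 5,000-like tweet does not pull the NBA figure upward.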
Again, these are medians. What about the best-performing tweet for each league?
Bar plot of the response to each league's best performing tweet
This is why using the median earlier was important: the NBA is in a different class when it comes to its highest-performing tweet. Additionally, the maximum response lines up with the volume of tweets as well.
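To make the point about the median concrete, here is a small worked example with made-up like counts: five ordinary tweets plus a single viral one. The numbers are purely illustrative and do not come from the dataset.

```python
from statistics import mean, median

# Made-up like counts for one league: ordinary tweets plus one viral outlier.
ordinary = [900, 1000, 1100, 1200, 1300]
with_viral = ordinary + [1_000_000]

# How much each summary statistic grows once the viral tweet is included.
mean_shift = mean(with_viral) / mean(ordinary)
median_shift = median(with_viral) / median(ordinary)
```

Here the mean grows by a factor of more than 100, while the median only moves from 1,100 to 1,150. A single outlier can dominate a mean but barely shifts a median.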
And what was that max tweet that broke this graph?
Unfortunately, it is a sad tweet, specifically a memorial to Kobe Bryant after he passed away last year. He was a legend, both on and off the court, and his death had a significant impact on society. But let's not dwell on the sad points and continue to explore the dataset.
Is there a seasonality to tweets?
Let's look at how common tweets are throughout the year. I would expect there to be a significant effect depending on whether the sport is in-season or not. I am also curious how the number of offseason tweets can speak to the non-stop sports culture.
Line plot of average tweet volume per month, showing the difference of in-season and out-of-season tweet volume.
Immediately we can see some interesting trends. First, the NBA, NFL, and MLB have a clear seasonality effect (NFL: September - January, NBA: November - June, and MLB: April - October). There seems to be a playoff bump for the NHL (April and May), but the effect is not apparent over the entire season, and there is no real difference for the MLS between in-season and offseason months. Additionally, the dominance of the NBA and NFL is clear: even in the offseason, the NBA gets as many tweets as the MLB gets in season, and the NFL is not far behind. One final note is that you can see the impact of the NFL draft, with a clear spike in the number of NFL-related tweets occurring in April every year.
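One way to compute an average monthly volume like the one plotted above is sketched below. This is a guess at the aggregation, not necessarily how the plot was produced: it assumes every year in the data is complete, and `average_monthly_volume` is a hypothetical helper name. The dates are made up.

```python
from collections import Counter
from datetime import date

# Made-up (league, tweet date) pairs standing in for the labeled dataset.
tweets = [
    ("NFL", date(2018, 9, 9)),
    ("NFL", date(2018, 9, 10)),
    ("NFL", date(2019, 9, 8)),
    ("NFL", date(2019, 4, 25)),
    ("NBA", date(2018, 6, 1)),
    ("NBA", date(2019, 6, 2)),
]


def average_monthly_volume(rows):
    """Return {(league, month): mean number of tweets per year in that month}."""
    totals = Counter((league, day.month) for league, day in rows)
    # Assumption: each year present in the data is fully covered, so a month
    # with no tweets in some year simply contributes zero to the average.
    n_years = len({day.year for _, day in rows})
    return {key: count / n_years for key, count in totals.items()}


monthly = average_monthly_volume(tweets)
```

With these toy rows, `monthly[("NFL", 9)]` is `1.5` (three September tweets over two years). An incomplete year, such as the current one, would need to be dropped or handled separately before averaging.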
Now let's break it down by league and year to see if these trends hold for different years.
Line plots of the number of tweets about each league per month. Each line represents a different year, with darker colors being more recent. Note that the y-axis is different in each graph to emphasize trends within each league rather than volume between leagues.
Here again, we can see some interesting trends. The first one that jumps out to me is the huge spike in MLS tweets in June of 2014. This is likely a bump from the 2014 World Cup, in which the US National Team played. The second is the spike in NBA tweets in August and September of this past year (2020). This is when the bubble season took place during the COVID-19 pandemic, so those months should be treated as the NBA being in-season. The third is that while the number of tweets about the NFL remains pretty steady year to year, there is an increase in tweets about the NBA and a clear decrease in tweets about the MLB and NHL. This makes sense given the lack of interactions (likes and retweets) that tweets about the MLB and NHL receive, as we saw earlier. It also reflects the general popularity of these sports in America: the NBA is certainly on the rise, while the MLB's popularity is fading. We can see this relationship better if we plot the number of tweets about each league year to year.
Line plot of the number of tweets per year per league. Note I removed 2021 from this graph because it is not complete yet.
Nothing new here, but we can clearly see the decline in tweets about the MLB (blue) and NHL (dark gray). I also think it is interesting that there was a clear spike in Twitter activity in 2015 and 2016 for both the NFL and NBA, though to be honest, I do not know why that may be.
How about tweets by day of the week?
Some sports (like the NFL) have almost all their games on Sundays, while others (like the NBA) spread their games out over the week. Will we be able to see that reflected in the tweet data?
Total volume of tweets per day of the week
The short answer is yes. The NFL (brown) has a clear spike on Sunday (game day) and Monday (the day after the game). For the other leagues, where a game can occur on any day, there isn't really an effect. I will admit I am intrigued by the NBA peaking on Wednesdays, but it's not too crazy of a peak.
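A day-of-week count like this one takes only a few lines. The sketch below uses made-up dates and a hypothetical helper name, `weekday_volume`.

```python
from collections import Counter
from datetime import date

DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]

# Made-up (league, tweet date) pairs standing in for the labeled dataset.
tweets = [
    ("NFL", date(2020, 9, 13)),  # a Sunday
    ("NFL", date(2020, 9, 14)),  # a Monday
    ("NFL", date(2020, 9, 20)),  # a Sunday
    ("NBA", date(2020, 9, 16)),  # a Wednesday
]


def weekday_volume(rows):
    """Count tweets per (league, weekday name)."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return Counter((league, DAY_NAMES[day.weekday()]) for league, day in rows)


counts = weekday_volume(tweets)
```

One caveat worth checking in the real data: if the collected timestamps are in UTC, tweets about late-evening games in the US would fall on the following calendar day, which could inflate the Monday count for the NFL. Converting to a US time zone before counting would avoid that.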
As I stated in the beginning, the biggest limitation of this analysis is simply that I am failing to categorize some tweets. While some of the tweets I am missing are going to be for other sports or events (tennis, golf, NASCAR, the Olympics, etc.), some will be tweets that were about the 5 leagues examined here. I still don't think there is a good reason why the missing tweets would be biased differently than what is shown here, but it is always possible, and if anyone has ideas on how to remedy this (without me manually going through the 96,000 tweets), please let me know.
The main conclusion here is that @espn's tweets do reflect society's general interest in sports. This holds for overall league popularity, time of year, and response to key events (such as the World Cup or the NBA bubble). With that established, future posts will try to ask more interesting questions.
All code related to this post is available on my GitHub.