This site is built from data on the closing lines available at https://www.covers.com/. I scraped the data with Python and BeautifulSoup, did some initial processing in Python, then processed it further and formatted it with PHP/MySQL.
NOTE: The current default setting covers ONLY regular season and conference tournament data (i.e. NOT March Madness). That makes the presence of Cinderella teams at the top of some of these rankings much more interesting than it might otherwise be, since their tournament runs aren't what put them there! I do have the March Madness data and will eventually include it as an option.
The specific code I used for scraping, along with the website code, is available at https://github.com/oreagan/respectindex .
Is this data any good? Let's check how it looks overall, in terms of how often teams beat the spread:
This looks basically like we might expect. For 2017-18, two standard deviations from the mean is a little under 10 (~9.8) wins or losses vs. the spread.
Nine teams (~3%) beat the spread 10 or more net times, and 10 teams (~3%) lost to it 10 or more net times.
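As a rough sanity check on the ~9.8 figure, it can be compared with a pure coin-flip baseline. The sketch below assumes every lined game is an independent 50/50 proposition, ignores pushes, and uses a hypothetical 24 lined games per team (an illustrative number, not one taken from the data); `two_sigma_net` is a made-up helper name.

```python
import math


def two_sigma_net(n_games, p=0.5):
    """Two standard deviations of (wins - losses) over n_games coin flips.

    net = wins - losses = 2 * wins - n_games, so Var(net) = 4 * n * p * (1 - p).
    """
    variance = 4 * n_games * p * (1 - p)
    return 2 * math.sqrt(variance)


# Hypothetical: a team with 24 lined games in a season.
print(round(two_sigma_net(24), 1))  # 9.8
```

Under a normal approximation, roughly 2.3% of teams would land beyond two standard deviations on each side, which is in the same neighborhood as the ~3% seen here.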
Since 2011-2012, overall stats (with averages) include:
Games played w/ lines | 28,352 |
Home team beat the spread | 13,793 |
Home team lost to the spread | 13,913 |
Pushes | 646 |
Losses if bet $110 on every game | $151,130 ($21,590 per year) |
Games both won & beat spread | 2,990 (38%) |
Teams beating spread more than not | 1051 (43.4%) (150 per year) |
Teams losing to spread more than not | 1083 (44.7%) (154 per year) |
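The loss figure in the table can be reproduced from the win/loss counts if we assume the bet is always on the home team at standard -110 pricing (risk $110 to win $100, stake returned on a push). That betting rule is an inference from the numbers rather than something stated outright, so treat this as a sketch; the function and constant names are illustrative.

```python
# Counts copied from the summary table.
HOME_BEAT = 13_793
HOME_LOST = 13_913
PUSHES = 646
SEASONS = 7  # 2011-12 through 2017-18

STAKE = 110   # assumed: amount risked per game
PAYOUT = 100  # assumed: amount won per winning bet at -110


def net_result(wins, losses, stake=STAKE, payout=PAYOUT):
    """Net profit from flat betting (negative means a loss); pushes net to zero."""
    return wins * payout - losses * stake


total_games = HOME_BEAT + HOME_LOST + PUSHES
net = net_result(HOME_BEAT, HOME_LOST)
per_season = net / SEASONS

print(total_games)  # 28352
print(net)          # -151130
print(per_season)   # -21590.0
```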
Stand-outs
All Time Best Against the Spread: The 2013-14 Wichita State Shockers, who also set any number of records with their 31-0 season. They beat the spread an astonishing 18 times in the regular season.
All Time Worst Against the Spread: The 2011-2012 Louisiana-Lafayette Ragin' Cajuns, who managed an astonishing 16 losses vs. the spread in the regular season. They went 16-16 on the season and 10-6 in conference, so they weren't awful, but they were consistently worse than expected.
Is this just for fun, or does this metric actually tell us anything about NCAA tournament success? Surprisingly, it does: percentage of net wins vs. the spread is correlated with tournament success in a statistically significant way!
If what we care about is NCAA tournament success (and we shouldn't care about that exclusively for team success!), then a relevant metric is the number of games a team qualified to play. If you get into the tournament, you qualify for 1 game; if you win, 2 games; if you win the tournament, 7 games (though of course you're the only team left and won't play game 7). We'll call this desired result games_played.
Not all games get lines, meaning not all teams get equal chances to beat the spread. We're interested in the ability to beat the spread, not in how high-profile teams are (and thus how often there's money actually at stake). Thus, we care about the percentage of wins vs. the spread rather than the raw count. We'll call this variable perc_beat.
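The two variables can be sketched as small helper functions. Several details here are assumptions rather than the site's actual definitions: a six-round bracket (First Four games ignored), teams outside the field scored as 0, and pushes left out of the perc_beat denominator.

```python
def games_played(tournament_wins, made_field=True):
    """Number of NCAA tournament games a team qualified to play."""
    if not made_field:
        return 0  # assumption: teams that miss the tournament count as zero
    if not 0 <= tournament_wins <= 6:
        raise ValueError("expected 0-6 wins in a six-round bracket")
    # Making the field qualifies you for one game; each win adds one more.
    return tournament_wins + 1


def perc_beat(beats, losses):
    """Share of decided lined games in which the team beat the spread."""
    decided = beats + losses  # assumption: pushes are excluded
    if decided == 0:
        raise ValueError("team has no decided lined games")
    return beats / decided


# Hypothetical team: 12 beats, 8 losses vs. the spread, 2 tournament wins.
print(perc_beat(12, 8))   # 0.6
print(games_played(2))    # 3
```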
Our simple question, then, is this: is the percentage of times beating the spread pre-tournament correlated with NCAA tournament games played/won?
We're looking at count variables (you can only win a full game, not a percentage of one), so let's use a negative binomial regression with a simple model games_played = perc_beat.
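For reference, a negative binomial regression of this kind is usually written with a log link. The form below is the conventional (NB2) parameterization, which is an assumption here since the exact specification isn't spelled out; the symbols \mu_i, \beta_0, \beta_1 and the dispersion k are notation introduced for illustration.

```latex
\begin{aligned}
\texttt{games\_played}_i &\sim \operatorname{NegBin}(\mu_i, k), \\
\log \mu_i &= \beta_0 + \beta_1\,\texttt{perc\_beat}_i, \\
\operatorname{Var}(\texttt{games\_played}_i) &= \mu_i + k\,\mu_i^{2}.
\end{aligned}
```

Under this form, the slope \beta_1 is on the log scale: a positive value means the expected number of tournament games rises multiplicatively as perc_beat rises.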
Year | Estimate | ChiSq | ProbChiSq | StdErr | LowerWaldCL | UpperWaldCL |
Overall | 2.6123 | 263.29 | <.0001 | 0.1610 | 2.2968 | 2.9279 |
2012 | 1.6707 | 21.47 | <.0001 | 0.3606 | 0.9640 | 2.3774 |
2013 | 2.4975 | 36.80 | <.0001 | 0.4117 | 1.6905 | 3.3044 |
2014 | 3.7635 | 57.38 | <.0001 | 0.4968 | 2.7897 | 4.7372 |
2015 | 3.6797 | 54.06 | <.0001 | 0.5005 | 2.6988 | 4.6606 |
2016 | 1.9941 | 26.78 | <.0001 | 0.3853 | 1.2389 | 2.7493 |
2017 | 2.5976 | 39.86 | <.0001 | 0.4115 | 1.7912 | 3.4041 |
2018 | 3.1457 | 47.86 | <.0001 | 0.4547 | 2.2545 | 4.0370 |
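The ChiSq and confidence-limit columns can be reproduced from Estimate and StdErr with the usual Wald formulas, which is a quick way to check the table for internal consistency. The second part of the sketch reads the overall estimate as a rate ratio; that step assumes a log link and perc_beat measured on a 0-1 scale, neither of which is confirmed above. Function names are illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile

# (estimate, std_err) pairs copied from the table.
ROWS = {
    "Overall": (2.6123, 0.1610),
    "2012": (1.6707, 0.3606),
}


def wald(estimate, std_err, z=Z_95):
    """Return (chi_square, lower_cl, upper_cl) for a single coefficient."""
    chi_sq = (estimate / std_err) ** 2
    half_width = z * std_err
    return chi_sq, estimate - half_width, estimate + half_width


def rate_ratio(estimate, delta):
    """Multiplicative change in the expected count for a `delta` change in x.

    Assumes a log link; `delta` is in the same units as perc_beat.
    """
    return math.exp(estimate * delta)


for label, (est, se) in ROWS.items():
    chi_sq, lower, upper = wald(est, se)
    print(f"{label}: ChiSq={chi_sq:.2f} CL=({lower:.4f}, {upper:.4f})")

# Assumed 0-1 scale: effect of beating the spread 10 percentage points more often.
overall_rr = rate_ratio(ROWS["Overall"][0], 0.10)
print(f"rate ratio for +0.10 perc_beat: {overall_rr:.2f}")
```

Small differences in the last digit relative to the table are expected, since the inputs here are already rounded to four decimals.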
So, how do we interpret these results? Obviously beating the spread doesn't cause tournament wins. However, it does seem like there is some combination of factors that has a meaningful effect on both how often you beat the spread and how often you win in the NCAA tournament. The effect is small, but you would expect it to be, if only because luck is such a large factor in a single-elimination tournament.
What are these things being captured by the Disrespect Index? Probably lots of intangibles: pluck, determination, being frustrated by being underestimated, the effect of media narratives and conference/team prejudices keeping fans (and gamblers) from re-evaluating teams more objectively.