College Football Data Analysis: How to Argue that Your Team Would Have Won

Posted by Brian Wong on 12/4/14 1:20 PM


For the last part of our college football playoff series, we discuss the results of our simulations and draw some conclusions about the effects of the playoff. If you haven’t tuned in to the first two posts, you can find the first post here and the second one here.

In this post, we look at the potential impact of adding an extra playoff game to the BCS system. With a single title game, as under the old BCS system, the best team is more likely to become the national champion. An extra game introduces more randomness to the playoff and is, potentially, worse at determining the best team in the country. However, this assertion depends on two assumptions: 1) the BCS always selects the best team for the title game, and 2) there is such a thing as a universal “best team.”

A true “best team” may not exist because teams are built differently to exploit different matchups, and to hide their weaknesses. A strong passing team will perform very well against a weak passing defense, for example. During the regular season, when a good number of any given team’s competition has glaring weaknesses, strong teams may be able to hide their own weaknesses by expertly exploiting the weaknesses of others. The four-team playoff increases the likelihood that the eventual champion will be tested for various weaknesses, because we can be fairly certain that they will compete against two of the best teams in the country.   

To look at the impact of an additional game, we took the last four college football seasons and crowned a “national champion” for each one based on 1,000 playoff simulations. Each simulation was a random draw from a distribution based on each team’s win probability, which in turn was based on the team’s F/+ rating and its variability in performance (calculated from the variance of the probability of winning the team’s bowl game, based on offensive and defensive statistics; see part two). The tables below show our results. The bolded team in each table is the actual BCS champion for that season.
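To make the simulation step concrete, here is a minimal Python sketch of a four-team bracket run 1,000 times. It is an illustration under stated assumptions, not the model behind the tables: the logistic `win_prob` function, its `scale` parameter, and the ratings for the placeholder teams are all hypothetical, and the performance-variability adjustment described above is left out.

```python
import math
import random
from collections import Counter


def win_prob(rating_a, rating_b, scale=0.10):
    """Hypothetical logistic link from a rating gap to P(A beats B)."""
    return 1.0 / (1.0 + math.exp(-(rating_a - rating_b) / scale))


def play(team_a, team_b, ratings, rng):
    """Simulate one game with a single random draw."""
    p = win_prob(ratings[team_a], ratings[team_b])
    return team_a if rng.random() < p else team_b


def simulate_playoff(seeds, ratings, n_sims=1000, seed=0):
    """Run a 1-vs-4, 2-vs-3 bracket n_sims times; return title shares."""
    rng = random.Random(seed)
    titles = Counter()
    for _ in range(n_sims):
        finalist_1 = play(seeds[0], seeds[3], ratings, rng)
        finalist_2 = play(seeds[1], seeds[2], ratings, rng)
        titles[play(finalist_1, finalist_2, ratings, rng)] += 1
    return {team: titles[team] / n_sims for team in seeds}


# Made-up ratings for placeholder teams, listed in seed order.
ratings = {"Team A": 0.40, "Team B": 0.30, "Team C": 0.28, "Team D": 0.20}
shares = simulate_playoff(["Team A", "Team B", "Team C", "Team D"], ratings)
for team, share in shares.items():
    print(f"{team}: {share:.1%} of simulated titles")
```

Fixing the random seed makes the run repeatable; changing it shows how much the title shares move from one batch of 1,000 simulations to the next.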

2013 Simulated Playoff Results:

Final_-_Table_1

In 2013, Florida State was the 2nd most dominant team compared to its peers in the past four college football seasons (the most dominant was the 2012 Alabama squad). The Seminoles survived the 1st round 58.2% of the time. This may seem low, but in model simulations such as these, regression to the mean is common. Florida State’s significant margin in winning percentage given selection for the playoff (last column) shows that Florida State was indeed the best team that season. The percentages in this last column may seem low, but we are projecting the probability of winning among four teams, so each team would win 25% of the time if all four were equal. Probabilities well above or well below 25% are therefore pretty significant (as is the case with Florida State’s 34% chance of winning a playoff). It makes sense that Florida State and Stanford did proportionally well, given their high F/+ ratings.
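The “given selection” figure is a conditional rate: titles won divided by the number of simulations in which the team made the playoff. The short Python sketch below shows that calculation and the 25% equal-teams baseline. The records and team labels are invented for illustration, and the function name `conditional_title_rate` is our own, not something from the original analysis.

```python
from collections import Counter


def conditional_title_rate(sims):
    """sims: list of (selected_teams, champion) pairs, one per simulation.

    Returns titles / playoff appearances for every team that was selected.
    """
    appearances = Counter()
    titles = Counter()
    for selected, champion in sims:
        appearances.update(selected)
        titles[champion] += 1
    return {team: titles[team] / appearances[team] for team in appearances}


# Invented records: which four teams were picked, and who won.
sims = [
    (("A", "B", "C", "D"), "A"),
    (("A", "B", "C", "E"), "B"),
    (("A", "B", "D", "E"), "A"),
    (("A", "C", "D", "E"), "E"),
]
rates = conditional_title_rate(sims)
baseline = 1 / 4  # four equal teams would each win a quarter of the time

for team, rate in sorted(rates.items()):
    print(f"{team}: {rate:.0%} given selection ({rate - baseline:+.0%} vs. baseline)")
```

Dividing by appearances rather than by the total number of simulations is what lets a rarely selected team be compared with one that makes the field every time.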

Final_-_Table_2

Despite Notre Dame being undefeated and Alabama having one loss heading into the 2012 season’s title game, our model shows that Alabama was clearly the better team. Alabama even won its 1st round game at a higher rate than Notre Dame, despite being seeded 2nd and facing a “stronger” 3rd seed in the 1st round. In fact, Notre Dame, due to its poor F/+ rating, made the title game less than 50% of the time. Texas A&M and South Carolina do well once in the title game, though this can be attributed to the limited number of times they reached it.

Final_-_Table_3

2011 had an interesting set of 1st round results. Since Alabama had to take on a tough Oklahoma State team in every simulation, while LSU beat up on weaker teams, LSU had a much easier time getting to the title game, at a whopping 65%. However, because Alabama was the stronger team in general, it won the championship slightly more often than LSU. Oklahoma State underperforms because of our selection criteria framework: they end up taking on Alabama every. single. time.

Final_-_Table_4

Our last commentary is on 2010, probably the best year of them all when it comes to chaos, which is a bit ironic since it’s the only year of the last four with three unbeaten teams heading into the playoff. According to the F/+ rating, Boise State was actually the best team, but its BCS rating was so bad that our fake committee only picked them twice. Besides that, the results are all over the place, with Auburn being our most common playoff champion and Oklahoma showing the best playoff performance (at least in the title game, winning an unbelievable 85% of the time while winning only 37% of semifinal games), albeit with only 40 playoff appearances. The parity in 2010 shows the weakness of the BCS model compared to the four-team model. However, our model shows Auburn winning a plurality of the championships, much to Oregon’s chagrin, because in the real matchup Michael Dyer was totally down.
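Oklahoma’s 85% is a good example of how noisy a rate becomes when it rests on a handful of games. As a rough check, the Python sketch below applies the usual binomial standard error to the figures quoted above. It assumes the 37% semifinal win rate applies to the 40 playoff appearances, which puts Oklahoma in about 15 simulated title games; that reading of the numbers is our assumption, not something stated in the table.

```python
import math


def binomial_standard_error(win_rate, n_games):
    """Standard error of a win rate estimated from n_games independent games."""
    return math.sqrt(win_rate * (1.0 - win_rate) / n_games)


appearances = 40        # playoff appearances quoted in the text
semi_win_rate = 0.37    # share of semifinals won
title_win_rate = 0.85   # share of title games won

# Assumed: title games reached = appearances * semifinal win rate.
title_games = round(appearances * semi_win_rate)
se = binomial_standard_error(title_win_rate, title_games)

print(f"title games reached: {title_games}")
print(f"standard error of the title-game win rate: {se:.3f}")
```

With so few title games, the standard error comes out at roughly nine percentage points, so the 85% figure should be read as a wide range rather than a precise estimate.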

We can see from the tables above that results can change dramatically from year to year. The inclusion of a playoff allows teams with “fluke” losses to come back and have a chance at winning it all. If nothing else, the playoff system increases the pool of teams that can possibly win the championship. However, the two-game playoff system introduces significant randomness to our outcomes. Teams that were truly statistically dominant throughout the season, like 2012 Alabama, have their championship odds essentially cut in half. If we assume that the BCS system generally included the best team in the country, then we could argue that the best team would win more frequently under the BCS system, and thus that the BCS system is better than the playoff system.
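The arithmetic behind the extra randomness is simple: a team has to win every game it plays, so each added game multiplies its title odds by another win probability. The Python sketch below assumes, purely for illustration, that the best team is always selected and beats any opponent with the same probability `p` in every game; real matchups would have different probabilities in each round.

```python
def title_odds(p_single_game, n_games):
    """Chance of winning n_games in a row at a fixed per-game win probability."""
    return p_single_game ** n_games


# Compare a one-game title format with a two-game playoff.
for p in (0.6, 0.7, 0.8, 0.9):
    one_game = title_odds(p, 1)
    two_games = title_odds(p, 2)
    print(f"p = {p:.1f}: one game {one_game:.2f}, two games {two_games:.2f}")
```

Even a team that wins any single game 70% of the time wins two in a row only 49% of the time, which is why a dominant team gives up so much ground when a semifinal is added.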

As a final point of discussion, here are our simulation results going all the way back to 2005, the first year that the F/+ ratings were available to us, showing each team’s odds of winning the championship given selection by the college football playoff committee. In 2009, there were five undefeated teams, which triggers our automatic playoff selection criteria, so we only have a five-team table for that year. For the rest, we show 10 teams as we did for 2010-2013.

Final_-_Table_5

Final_-_Table_6

Final_-_Table_7

Final_-_Table_8

Final_-_Table_9

This post was written with the help of Matt Duffy and Amy Deora.
