College Football Data Analysis: Why Your Team Should Have Won

Posted by Brian Wong on 11/4/14 11:38 AM


The college football champion will now be determined in a four-team playoff, with the teams selected by the newly created College Football Playoff Selection Committee. This 13-member committee will evaluate potential playoff teams based on each team's on-field performance throughout the season. In previous years, the Bowl Championship Series (BCS) system determined which two teams would play for the championship, which led to significant controversy.

The BCS system used the average team ranking from two polls (the Harris Interactive Poll and the Coaches' Poll), as well as the average ranking from a set of six BCS computer ranking systems. The weighted average of the polls and the computer rankings produced the final BCS rankings, and at the end of the season, the top two teams in the BCS rankings played in the national championship game. Before the BCS system was in place, the national champion was generally determined using an unscientific combination of the AP Poll and the Coaches’ Poll.

We recently wondered, “What would previous playoff brackets have looked like if this new playoff format had been used instead of the BCS methodology?” We looked back at the past four seasons (2010 through 2013) and ran 1,000 simulations to see how a more subjective evaluation of teams might have changed the national championship picture.

Determining Which Teams the Committee “Selects”

We started by defining our possible playoff contenders as the top 10 BCS-ranked teams in each year. Then, we identified the teams we thought would always have been considered for the playoff. We made these teams “immune” to any subjective evaluation and assigned them a 100% chance of being selected. We assumed that a team meeting any one of the following criteria would automatically be included in the playoff:

  • Undefeated teams from major conferences;
  • Major conference champions ranked 2nd or better in both the AP and Harris polls;
  • Major conference champions ranked no lower than 3rd in the BCS computer models, the AP poll, and the Harris poll; or
  • Undefeated teams ranked in the top 4 of every poll. 

In the past four years, two to three teams have fallen into this “automatic inclusion” category each year. We seeded these teams by the number of automatic inclusion criteria each one met: the team that met the most criteria became the #1 seed, and any other automatic qualifiers were seeded after it in descending order of criteria met.
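The four criteria and the seeding rule above can be sketched roughly in Python. This is an illustration only: the Team fields, the function names, and any teams passed in are hypothetical. The sketch also assumes that "every poll" means the AP poll, the Harris poll, and the BCS computer average, and that teams tied on criteria count keep their input order; the rules above do not spell out either point.

```python
from dataclasses import dataclass


@dataclass
class Team:
    # Hypothetical record layout, not taken from the original analysis.
    name: str
    undefeated: bool
    major_conference: bool
    conference_champion: bool
    ap_rank: int
    harris_rank: int
    computer_rank: int  # rank in the BCS computer average


def criteria_met(team: Team) -> int:
    """Count how many of the four automatic inclusion criteria a team meets."""
    major_champion = team.major_conference and team.conference_champion
    worst_rank = max(team.ap_rank, team.harris_rank, team.computer_rank)
    checks = [
        # Undefeated team from a major conference
        team.undefeated and team.major_conference,
        # Major conference champion ranked 2nd or better in AP and Harris
        major_champion and team.ap_rank <= 2 and team.harris_rank <= 2,
        # Major conference champion no lower than 3rd in computers, AP, Harris
        major_champion and worst_rank <= 3,
        # Undefeated team in the top 4 of every ranking (assumed: these three)
        team.undefeated and worst_rank <= 4,
    ]
    return sum(checks)


def automatic_seeds(teams: list[Team]) -> list[str]:
    """Return automatic qualifiers, seeded by number of criteria met."""
    qualifiers = [t for t in teams if criteria_met(t) > 0]
    # sorted() is stable, so tied teams keep their input order (an assumption).
    qualifiers = sorted(qualifiers, key=criteria_met, reverse=True)
    return [t.name for t in qualifiers]
```

A team that meets none of the criteria is left out of the returned list and falls through to the committee vote.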

For the other teams in contention, we assumed that BCS ranking is a key indicator of possible inclusion in a playoff. However, we wanted to introduce some randomness into the selection process to account for our uncertainty about how any one committee member would vote. To mimic a voting process that is influenced by the BCS score but also subject to each member’s judgment, we drew 13 samples from the set of top 10 teams, with each sample representing the vote of an individual committee member. Teams with higher BCS scores should generally be more likely to be selected, so we scaled each team’s BCS score to a “selection probability” and drew our samples with unequal probabilities proportional to those scores. As a result, teams with higher BCS scores have a greater chance of selection, but there is still some chance of selecting a lower-ranked team, as can be seen in the tables below.
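A minimal Python sketch of this voting step, under stated assumptions, might look like the following. The function names are hypothetical and the scores are placeholder values rather than real BCS data. The sketch draws the 13 votes independently and with replacement, which is one plausible reading of the description above.

```python
import random
from collections import Counter


def selection_probabilities(bcs_scores):
    """Scale raw BCS scores so they sum to 1 across the candidate pool."""
    total = sum(bcs_scores.values())
    return {team: score / total for team, score in bcs_scores.items()}


def committee_votes(bcs_scores, n_members=13, rng=None):
    """Draw one vote per committee member, weighted by BCS score."""
    rng = rng or random.Random()
    teams = list(bcs_scores)
    weights = [bcs_scores[team] for team in teams]
    # random.choices samples with replacement, so members vote independently.
    return rng.choices(teams, weights=weights, k=n_members)


# Placeholder scores for illustration only, not real BCS values.
example_scores = {"Team C": 0.85, "Team D": 0.80, "Team E": 0.72, "Team F": 0.60}
example_votes = committee_votes(example_scores, rng=random.Random(2014))
vote_tally = Counter(example_votes)  # votes received by each team
```

Because the weights are proportional to the scores, a lower-ranked team can still collect votes; it just does so less often.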

In most years, two teams made the playoff through our automatic selection criteria. For the last two playoff slots, our “committee” first selected a “3rd seed” from the top 10 teams. Then, we removed the “3rd seed” from the pool and rescaled the selection probabilities of the remaining teams so that the committee could select a “4th seed” for the playoff.
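The two-step draw could be sketched in Python as follows. The names are hypothetical, and two details are assumptions that the text does not state: the automatic qualifiers are left out of the pool before the 3rd seed is chosen, and the committee's pick is the team with the most of the 13 votes, with ties going to the team that received a vote first.

```python
import random
from collections import Counter


def committee_pick(pool, n_members=13, rng=None):
    """Return the team with the most votes (plurality rule is an assumption)."""
    rng = rng or random.Random()
    teams = list(pool)
    votes = rng.choices(teams, weights=[pool[t] for t in teams], k=n_members)
    # most_common breaks ties in favor of the team that was counted first.
    return Counter(votes).most_common(1)[0][0]


def pick_last_two_seeds(bcs_scores, automatic, rng=None):
    """Select the 3rd and 4th seeds from the teams not already in the playoff."""
    rng = rng or random.Random()
    pool = {t: s for t, s in bcs_scores.items() if t not in automatic}
    third = committee_pick(pool, rng=rng)
    # Removing the 3rd seed and redrawing is equivalent to rescaling:
    # random.choices normalizes the remaining weights on each call.
    del pool[third]
    fourth = committee_pick(pool, rng=rng)
    return third, fourth
```

Passing a seeded random.Random instance as rng makes a single bracket reproducible.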

Creating the Summit Hypothetical College Playoff Championship (SHCPC) Matchups

While it’s interesting to see the variation in playoff selections through our synthetic “committee” process, it’s more interesting to see who would actually win the SHCPC each year. In the tables below, we list each team’s probability of being selected for the playoff. We generated a playoff bracket for each season by taking one random committee selection, drawn according to our selection probabilities. We will use that random committee selection to determine our SHCPC champion in our next posts!
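One way to estimate selection probabilities like those in the tables is to repeat the committee draw many times and count how often each team makes the field. The Python sketch below is an approximation under stated assumptions, not a reproduction of the original analysis: the function name and inputs are hypothetical, automatic qualifiers are fixed at 100%, and each open slot goes to the team with the most of the 13 weighted votes.

```python
import random
from collections import Counter


def estimate_selection_probabilities(bcs_scores, automatic, n_sims=1000,
                                     n_members=13, n_slots=4, seed=0):
    """Estimate each team's chance of making the playoff by simulation."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_sims):
        # Start each simulated season with every non-automatic contender.
        pool = {t: s for t, s in bcs_scores.items() if t not in automatic}
        for _slot in range(n_slots - len(automatic)):
            teams = list(pool)
            votes = rng.choices(teams, weights=[pool[t] for t in teams],
                                k=n_members)
            # Plurality winner takes the slot (an assumption of this sketch).
            pick = Counter(votes).most_common(1)[0][0]
            counts[pick] += 1
            del pool[pick]  # later slots are drawn from the remaining teams
    probabilities = {t: counts[t] / n_sims for t in bcs_scores}
    for team in automatic:
        probabilities[team] = 1.0  # automatic qualifiers are always selected
    return probabilities
```

With two automatic qualifiers, the estimated probabilities sum to four, one for each playoff slot.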

2013 Season:

[Table 1: 2013 season]

2013 Playoffs:

[Figure 1: 2013 playoff]

2012 Season:

[Table 2: 2012 season]

2012 Playoffs:

[Figure 2: 2012 playoff]

2011 Season:

[Table 3: 2011 season]

2011 Playoffs:

[Figure 3: 2011 playoff]

2010 Season:

[Table 4: 2010 season]

2010 Playoffs:

[Figure 4: 2010 playoff]

This post was written with the help of Matt Duffy and Amy Deora.
