Notes From This Year's Data Intelligence Conference

7/7/17 8:23 AM

Last week, I attended the Data Intelligence Conference hosted by Capital One in McLean, Virginia. The three-day conference brought together a community of practitioners in machine learning and artificial intelligence modeling. The talks centered on predictive modeling methods, real-life case studies involving machine learning and predictive analytics, data governance and security issues, and data cleaning and visualization methods.

Topics: data analytics

College Football Data Analysis: How to Argue that Your Team Would Have Won

12/4/14 1:20 PM

For the last part of our college football playoff series, we will discuss the results from our simulations and draw some conclusions about the effects of the playoff. If you haven’t tuned in to the first two posts, you can find the first post here and the second one here.

Topics: data analytics

College Football Data Analytics: Who Wins the Playoff Matchup?

11/18/14 3:14 PM

Once we selected the teams for the playoffs, we took a look at how the actual games would be decided. First, we examined what factors determine whether a team wins or loses. Then, using these factors, we built a probabilistic model based on composite ratings (see below) for the current season’s games. This model relates a team’s attributes to its chance of winning. We used this model to determine how the teams in our simulated playoffs would fare.

Topics: data analytics

College Football Data Analysis: Why Your Team Should Have Won

11/4/14 11:38 AM

The college football champion at the end of the season will now be determined in a four-team playoff, where the teams are selected by the newly created College Football Playoff Selection Committee. This 13-member committee will evaluate potential playoff teams based on each team's on-field performance throughout the season. In previous years, the National Collegiate Athletic Association (NCAA) used the Bowl Championship Series (BCS) system to determine which two teams would play for the championship, which led to significant controversy.

Topics: data analytics

Predictive Analytics: Can We Predict a Star's Success?

10/22/14 9:30 AM

Today, we’ll take a last look at some movie data and apply predictive analytics to predict a big Hollywood actor’s earning power. The top grossing movie for the weekend of September 26th, 2014 was "The Equalizer," whose main star is Denzel Washington. Because he is a prolific actor (i.e. lots of data available), we chose him as our subject to see if this latest release will follow major trends for his releases.

We limited our analysis dataset to movies in which Denzel had a starring role (i.e., was the first person listed in the credits on IMDb), and to movies that have been out of theaters long enough to obtain an accurate estimate of box office earnings. In the end, we selected 29 movies released between 1990 and 2013. We also looked at data gathered for the previous two posts, including the genre of the movie, movie budget, earnings, profit percentage, and total profit.

Topics: data analytics

Zombie Wars, Part 2: Is Horror a Holiday Spirit? Statistics Say No

10/13/14 2:06 PM

In the last post, we used data analytics to figure out that marketing Zombie Wars as a horror flick will give us the most bang for our buck. This time around we’ll do some analysis to determine the best release date for our aspiring blockbuster.

Topics: data analytics

Zombie Wars (and Applied Statistics): Coming to a Theater Near You!

10/6/14 10:35 AM

For today’s post, we’ve decided to switch gears from beer and toy data to examine some messier real-world data that is in the public domain. With fall approaching, the film industry shifts into “awards mode.” Studios break out their Oscar contenders and their holiday family films. To join in the movie frenzy, we are examining when and how we would release Zombie Wars (a CGI-filled adaptation of the popular simulation from our first blog post).

Before we jump head-first into production mode, we thought it would be a good idea to look at movies released in 2013 and see if there were any trends in moviegoers’ preferences. After scraping data online and doing some cleaning, we ended up with a dataset of 148 movies in six different genres (horror, thriller/suspense, comedy, action, drama, and adventure). In addition to genre, we also pulled data on each movie’s production budget and domestic box office revenue. We used these to calculate profit percentage and total profit for each movie in our dataset. Looking at how movie sales trend in relation to genre could help us determine how we want to market Zombie Wars, whether it be as a high-action, zombie-killing adventure or a scary, twisting horror movie.
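As a rough illustration of the two derived measures, here is a minimal Python sketch. The teaser does not spell out the formulas, so the definitions below are assumptions: total profit is taken as domestic revenue minus production budget, and profit percentage as that profit divided by the budget. The function names and the dollar figures are made up for the example and do not come from the dataset.

```python
def total_profit(budget, revenue):
    """Assumed definition: domestic box office revenue minus production budget."""
    return revenue - budget


def profit_percentage(budget, revenue):
    """Assumed definition: profit expressed as a percentage of the budget."""
    return 100.0 * (revenue - budget) / budget


# Hypothetical movie (not from the dataset): $20M budget, $50M domestic revenue.
example_budget = 20_000_000
example_revenue = 50_000_000

example_profit = total_profit(example_budget, example_revenue)
example_pct = profit_percentage(example_budget, example_revenue)

print(f"Total profit: ${example_profit:,}")
print(f"Profit percentage: {example_pct:.1f}%")
```

Looking at both measures together can matter because a low-budget movie may post a high profit percentage while earning a modest total profit.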

Topics: data analytics

How Stratified Sampling Can Make You a Better Wedding Host

9/29/14 11:26 AM

In the previous post we talked about sampling, a common way to collect data, and oftentimes the first step in a predictive analytics process. This time, we’ll describe how to apply stratified sampling to a problem many soon-to-be newlyweds have encountered: How much and what kind of beer should I serve on the big day? We can use statistical sampling to help.

Let’s say you’re getting married and have invited 300 people from three different states to your wedding. Because you’re a gracious host, you want to optimize the beer selection to your guests’ tastes. Calling all 300 people to ask their preference is impractical. So, obviously, you decide to pull a statistical sample of 30 guests (10% of the population), stratified by their state of residence, because you have reason to believe that geography influences beer preference.
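The sampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three state names, the number of guests from each state, and the `stratified_sample` helper are hypothetical, and the sketch uses proportional allocation (each state contributes 10% of its guests), which may or may not match the allocation used in the full post.

```python
import random
from collections import defaultdict


def stratified_sample(guests, sampling_fraction, seed=None):
    """Draw a simple random sample within each state (stratum).

    guests: list of (name, state) tuples.
    sampling_fraction: share of each stratum to sample, e.g. 0.10.
    """
    rng = random.Random(seed)

    # Group guest names by state of residence.
    by_state = defaultdict(list)
    for name, state in guests:
        by_state[state].append(name)

    # Sample the same fraction from every state, without replacement.
    sample = {}
    for state, names in by_state.items():
        n = round(len(names) * sampling_fraction)
        sample[state] = rng.sample(names, n)
    return sample


# Hypothetical guest list: 300 guests split across three made-up state counts.
state_counts = {"Virginia": 150, "Maryland": 90, "Pennsylvania": 60}
guests = [
    (f"{state} guest {i}", state)
    for state, count in state_counts.items()
    for i in range(count)
]

sample = stratified_sample(guests, sampling_fraction=0.10, seed=42)
for state, names in sample.items():
    print(state, len(names))
```

With these made-up counts, the per-state sample sizes add up to the 30 guests mentioned above. With less convenient counts, rounding within each state can push the total slightly above or below the target, so the allocation may need a manual adjustment.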

Topics: data analytics

Predictive Analytics: How Many Humans Would Survive a Zombie Attack?

9/15/14 10:58 AM

Summit staff, including Principal Albert Lee, will be attending the Predictive Analytics World for Government conference in Washington, DC this week. Our staff are excited to learn about the many applications of predictive analytics in government. But the depth of analysis provided by predictive analytics also got us thinking about its more off-beat applications.

Topics: data analytics, R

About the Summit Blog

Complexity simplified.

Summit is a specialized analytics advisory firm that guides clients as they decode their most complex analytical challenges. Our blog highlights the strategies and techniques we use, as well as relevant topics in current events.
