It’s that time of the year again! An intense couple of weeks where basketball fans are pulled into a whirlwind of fun and excitement, where moments are defined by gasps and bated breath as they wait to see who will take a step closer to winning the bracket. In fact, this excitement prompts thousands of sports commentators, machine learning models, and professional statisticians to weigh in on the winners of each and every round.
Although we know predicting sports outcomes is a fool’s errand, here at SAP, the DataGenius team is still joining in on the fun, but by representing the perspective of the average person.
The average person probably doesn’t know much about the nuances of basketball or the complicated world of statistics. That holds true for us too! Our DataGenius #ViztheMadness team (which includes only one person who knows anything about basketball) wanted to weigh in as that “average person.” By using SAP technology, we’re opening the closed-off world of statistics and demonstrating that anyone can formulate a knowledgeable opinion of who will win the tournament.
Using Smart Predict in SAP Analytics Cloud
As complete amateurs in both machine learning and the finer points of basketball, we decided to approach the problem using SAP Analytics Cloud and a tool we didn’t have access to last year: Smart Predict. Designed to make machine learning accessible to business users without the need for a data scientist, Smart Predict augments existing business intelligence capabilities by learning from historical data to create recommendations on the next best action.
Smart Predict allows for three different predictive scenarios. Reading over the example description of each scenario, we thought that picking Classification would be the most suitable as we were looking at who would win between two teams (Team 1 or Team 2). This is a binary result.
The idea behind our model was relatively simple: the better team will win, so which team is better? Let’s decide by seeing who has the better stats! Given that our goal was to figure out which team would win when two teams are pitted against each other in the bracket, we decided to manipulate the data to look at the difference between the two selected teams’ statistics. Then, for our historical data, we used game statistics dating back to 2007 to build our model.
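To make the "difference between the two teams" idea concrete, here is a minimal Python sketch of how one training row could be built. This is not how Smart Predict works internally, and the stat names, team labels, and numbers are all made up for illustration. The only part taken from our approach is the idea itself: subtract Team 2's stats from Team 1's, then record who won as a binary label.

```python
# Hypothetical season averages. The stat names and values are invented
# for illustration and are not taken from our actual dataset.
team_stats = {
    "Team 1": {"points_per_game": 78.0, "rebounds_per_game": 36.5, "turnovers_per_game": 11.0},
    "Team 2": {"points_per_game": 72.5, "rebounds_per_game": 38.0, "turnovers_per_game": 13.0},
}


def stat_differences(team_a, team_b, stats):
    """Return team_a's value minus team_b's value for every shared statistic."""
    a, b = stats[team_a], stats[team_b]
    return {f"{name}_diff": a[name] - b[name] for name in a if name in b}


def training_row(team_a, team_b, winner, stats):
    """Build one row: the stat differences plus a binary target (1 if team_a won)."""
    row = stat_differences(team_a, team_b, stats)
    row["team_1_won"] = 1 if winner == team_a else 0
    return row


row = training_row("Team 1", "Team 2", "Team 1", team_stats)
print(row)
```

Each historical game becomes one row like this, and the binary `team_1_won` column is the kind of target a classification scenario is meant to predict.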
However, out of the long list of statistical variables we collected, only a few would have a strong impact on the results. To identify these critical variables, we fed our data through Smart Predict, looking at the difference between the opposing teams’ statistics for each game since 2007 and noting the winners.
By training our model through this process, the embedded AI in SAP Analytics Cloud was able to pick out the trends over all the games and we were able to narrow the top contributing variables to the following.
The basketball guru on our team came over, commenting that it seemed pretty obvious that the difference in Wins and Losses would affect who would win and lose. Therefore, we decided to try excluding the two variables:
Afterwards, we tested our model by trying to predict the 2018 tournament results and found that the model predicted only 9 games wrong for 86% accuracy! Furthermore, some of the incorrect picks were complete upsets that other statisticians could not foresee, reassuring us that we’ve made quite a reliable model. Not bad for a group of beginners who don’t know basketball or machine learning, even if we do have the benefit of hindsight.
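For anyone curious how 9 wrong picks turns into 86%, here is the arithmetic as a short Python sketch. It assumes the 63 games of the main 64-team bracket (not counting the play-in games); that total is our assumption for this example and is not stated above.

```python
def accuracy(total_games, wrong_picks):
    """Share of games predicted correctly."""
    return (total_games - wrong_picks) / total_games


TOTAL_GAMES = 63  # assumption: the main 64-team bracket, excluding play-in games
WRONG_PICKS = 9   # the number of misses in our 2018 test

acc = accuracy(TOTAL_GAMES, WRONG_PICKS)
print(f"{acc:.0%}")  # prints 86%
```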
SAP DataGenius #ViztheMadness Predictions
Now that the bracket has been set, here are the official SAP DataGenius predictions for the 2019 tournament.
We expect to see Michigan face off against Kentucky in the finals, with Michigan coming out victorious!
We will keep you posted on how the DataGenius model does as the tournament progresses, so make sure to check back every week. We’ll also investigate how we can further improve the model as the 2019 results roll in.
Try It Yourself!
If you have predictions of your own you’d like to share, tweet @SAPAnalytics and use the hashtag #vizthemadness. We recommend gathering up your own data and using the embedded machine learning features in SAP Analytics Cloud, such as Smart Predict, to do your analysis.