Marketers are surrounded by predictive modeling and machine learning. Whether it’s the underlying algorithm(s) powering campaigns in DSPs, suggested subject lines in marketing automation platforms, or custom models built specifically for various business objectives, predictive modeling is everywhere.
While a marketer doesn’t have much control over the stock algorithms in their platforms, they do have a say once they start entertaining custom solutions. And it wasn’t until recently that anyone outside of analytics-related roles really questioned what types of algorithms were being used in their solutions. I applaud this curiosity. Non-technical marketers upping their algorithm game will help with solution evaluation, foster more strategic discussions with a broad group of teams, and even impress a few teammates. But to really make an impact, marketers should understand if and how these different approaches impact outcomes for their brands.
Before digging into whether they impact outcomes (spoiler – they totally do), let’s do a quick crash course on some of the different algorithms. First, algorithms fall into one of two main categories: supervised or unsupervised. Supervised learning methods learn to predict a specific target that is already labeled in the data, such as whether a customer bought. Conversely, unsupervised learning methods do not require a specified target; instead, they look for structure in the data and group similar points together.
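To make the distinction concrete, here is a minimal Python sketch using only the standard library. The customer spend figures, the `bought` labels, and both helper functions (`fit_threshold`, `two_means`) are invented for illustration. They are simplified stand-ins for real supervised and unsupervised algorithms, not a production approach.

```python
from statistics import mean

# Hypothetical data: annual spend (in dollars) for eight customers.
spend = [20, 25, 30, 35, 200, 220, 240, 260]
# Labels exist only in the supervised case: 1 = bought, 0 = didn't buy.
bought = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_threshold(values, labels):
    """Supervised: use the known labels to learn a cutoff between the classes."""
    buyers = [v for v, y in zip(values, labels) if y == 1]
    non_buyers = [v for v, y in zip(values, labels) if y == 0]
    # Place the cutoff halfway between the two class averages.
    return (mean(buyers) + mean(non_buyers)) / 2


def two_means(values, iterations=10):
    """Unsupervised: group similar values into two clusters; no labels are used."""
    low, high = min(values), max(values)  # initial cluster centers
    for _ in range(iterations):
        # Assign each value to its nearest center; 1 means the higher cluster.
        groups = [int(abs(v - high) < abs(v - low)) for v in values]
        low = mean(v for v, g in zip(values, groups) if g == 0)
        high = mean(v for v, g in zip(values, groups) if g == 1)
    return groups


cutoff = fit_threshold(spend, bought)
predictions = [int(v > cutoff) for v in spend]  # supervised predictions
clusters = two_means(spend)                     # unsupervised groupings
```

In this tidy example the clusters happen to line up with the buyers and non-buyers. With real data there is no such guarantee, because the unsupervised method never sees the target.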
Some popular machine learning methodologies include:
- Logistic Regression: Estimates the log-odds of a binary response (bought/didn’t buy) as a linear combination of the predictors. A classic and commonly used algorithm, but one that doesn’t capture more complex nonlinear effects or interactions among variables unless they are added to the model explicitly.
- Decision Trees: Predict the likelihood of an action using an upside-down tree pattern. At each branch, the algorithm chooses the one predictor and splitting point that result in the purest nodes below. This method can identify distinct groups, capturing nonlinearity and interactions. There are several variations, including:
  - Random Forests – A collection of many (often hundreds of) decision trees. Each tree is trained independently on a random sample of the data, considering a small subset of randomly selected predictors at each branch. Results of the trees are averaged together, providing a very stable solution.
  - Gradient Boosted Trees – Instead of building each tree independently, this method builds a succession of trees, each trying to correct the errors of the trees before it. This is a very strong method that often provides high accuracy, though it usually needs more careful tuning to avoid overfitting.
- Support Vector Machines (SVM): Separate behaviors by constructing a hyperplane that slices through the data and provides the widest separation between the two groups. SVMs can produce strong accuracy, particularly on small to mid-sized datasets, though training can become computationally expensive as the data grows.
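Two ideas from the list above can be sketched in a few lines of Python: the log-odds that logistic regression works with, and the way a decision tree searches for the split that leaves the purest nodes. This is a simplified illustration. The function names and the `visits`/`bought` data are hypothetical, and Gini impurity is assumed as the purity measure (it is one common choice, not the only one).

```python
import math


def log_odds(p):
    """Convert a probability into log-odds, the quantity logistic regression models."""
    return math.log(p / (1 - p))


def probability(z):
    """Convert log-odds back into a probability (the logistic function)."""
    return 1 / (1 + math.exp(-z))


def gini(labels):
    """Gini impurity of a node: 0.0 means every record has the same outcome."""
    if not labels:
        return 0.0
    share = sum(labels) / len(labels)  # share of 1s (buyers) in the node
    return 2 * share * (1 - share)


def best_split(values, labels):
    """Try each candidate cutoff on one predictor and keep the purest split."""
    best_cutoff, best_impurity = None, float("inf")
    for cutoff in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= cutoff]
        right = [y for v, y in zip(values, labels) if v > cutoff]
        # Weight each child node's impurity by its share of the records.
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best_impurity:
            best_cutoff, best_impurity = cutoff, impurity
    return best_cutoff, best_impurity


# Hypothetical data: site visits last month and whether the customer bought.
visits = [1, 2, 3, 4, 5, 6]
bought = [0, 0, 0, 1, 1, 1]
cutoff, impurity = best_split(visits, bought)
```

A full decision tree would repeat this search across every available predictor and then again within each resulting node. Random forests and gradient boosted trees then combine many such trees.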
This is by no means an exhaustive list, but it provides a good starting point. The most important thing to grasp is that each method approaches the same question or problem in a slightly different way, and therefore produces slightly different predictions. Most of these methods refine themselves iteratively during training, until the model can no longer be improved. With a baseline understanding, skilled data scientists can further guide you on the nuances of each when it comes time to build your custom solution.
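For a feel of what iterative refinement looks like, the sketch below fits a one-predictor logistic model by repeatedly nudging its two parameters until the error stops improving. Gradient descent is assumed here as the refinement procedure. It is one common option rather than the way every method above is trained, and the data, step size, and tolerance are made up for illustration.

```python
import math


def log_loss(weight, bias, xs, ys):
    """Average prediction error of a one-predictor logistic model (lower is better)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1 / (1 + math.exp(-(weight * x + bias)))
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)


def fit(xs, ys, step=0.1, tolerance=1e-6, max_rounds=10_000):
    """Nudge the model's parameters repeatedly until the error stops improving."""
    weight, bias = 0.0, 0.0
    history = [log_loss(weight, bias, xs, ys)]
    for _ in range(max_rounds):
        # Prediction minus actual outcome for every record.
        errors = [1 / (1 + math.exp(-(weight * x + bias))) - y for x, y in zip(xs, ys)]
        # Move each parameter a small step in the direction that reduces the error.
        weight -= step * sum(e * x for e, x in zip(errors, xs)) / len(xs)
        bias -= step * sum(errors) / len(xs)
        history.append(log_loss(weight, bias, xs, ys))
        if history[-2] - history[-1] < tolerance:
            break  # no meaningful improvement left
    return weight, bias, history


# Hypothetical data: site visits and whether the customer bought (not perfectly separable).
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 1, 0, 1, 1]
weight, bias, history = fit(xs, ys)
print(f"error went from {history[0]:.3f} to {history[-1]:.3f} in {len(history) - 1} rounds")
```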
At this point you may be wondering: do these differences matter? The short answer: yes. So the question shifts to how a marketer ensures the best outcome. Understand that a data scientist won’t always know upfront what will work best, but discovery will light the way. The algorithms a modeler chooses are influenced by two primary considerations: (1) What are we trying to predict, or what question are we asking? (2) What data do we have access to? The answers to each may limit or expand the options available. Modelers may have a gut instinct about which methodologies will work best, but the data may tell a different story than anticipated. The marketer’s understanding of the business objective, the marketplace and more can empower data scientists to make more informed decisions throughout the development process.
Since methodologies matter, and the data holds the key, the best approach is often to explore more than one algorithm. The Alliant Data Science team uses advanced workflows to build multiple models simultaneously using different methodologies. This approach provides several benefits:
- Reduced subjectivity: the ability to see predicted performance, model fit metrics, and more across each method allows the team to see which will be strongest
- Fewer limitations: the team can bring together supervised and unsupervised methods, such as using clustering and dimension reduction to inform subsequent supervised learning steps
- More creativity: data scientists can test new and interesting ways to combine multiple methodologies into a unique solution
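The general idea of building several models on the same data and comparing them side by side can be sketched as follows. This is not Alliant’s workflow, which the article does not detail. It is a toy illustration with three deliberately simple, hypothetical "methodologies" written in plain Python, fitted one after another rather than in parallel, and scored on the same holdout data. The dataset and its "big spenders buy" rule are invented.

```python
def majority_baseline(X, y):
    """Always predict the most common outcome seen in training."""
    guess = int(sum(y) * 2 >= len(y))
    return lambda row: guess


def best_stump(X, y):
    """One-split rule: pick the predictor and cutoff with the fewest training errors."""
    best = None
    for col in range(len(X[0])):
        for cutoff in sorted({row[col] for row in X}):
            for above in (0, 1):  # which outcome to predict above the cutoff
                errors = sum((above if row[col] > cutoff else 1 - above) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, col, cutoff, above)
    _, col, cutoff, above = best
    return lambda row: above if row[col] > cutoff else 1 - above


def nearest_centroid(X, y):
    """Predict whichever class average (centroid) a record is closest to."""
    centroids = {}
    for label in set(y):
        rows = [row for row, target in zip(X, y) if target == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(row):
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))
    return predict


def accuracy(model, X, y):
    """Share of records the model classifies correctly."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def compare(candidates, X_train, y_train, X_test, y_test):
    """Fit every candidate on the same training data and score it on the same holdout."""
    scores = {name: accuracy(fit(X_train, y_train), X_test, y_test)
              for name, fit in candidates.items()}
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


# Hypothetical, deterministic toy data: [days since last visit, spend] per customer.
X = [[(i * 11) % 30, (i * 37) % 100] for i in range(200)]
y = [int(spend > 50) for _, spend in X]  # invented rule: big spenders buy
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

candidates = {
    "majority baseline": majority_baseline,
    "one-split rule": best_stump,
    "nearest centroid": nearest_centroid,
}
scores = compare(candidates, X_train, y_train, X_test, y_test)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Because every candidate sees the same training data and is judged on the same holdout records, the ranking reflects the methods themselves rather than anyone’s prior preference. In practice a team would likely swap in real algorithms from a modeling library and compare more metrics than accuracy alone.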
Whether you start with just one method or test an ensemble approach like the one introduced above, your marketing goals, data and infrastructure will illuminate which is best for you. Maintain a human element and partner cross-functionally to obtain the best results, gathering input and alignment across business, marketing, data science and technology teams.
ABOUT THE AUTHOR
Malcolm Houtz, VP of Data Science
As head of Alliant’s data science team, Malcolm balances a high volume of complex model development and analytic projects every month — while maintaining a laser focus on innovation and excellence. Malcolm is a critical thinker with an insatiable curiosity for new statistical techniques. His background as a master statistician and as an entrepreneur gives him a unique, business-oriented perspective on data mining and modeling. Prior to Alliant, Malcolm held data analyst and model development positions with Time Warner Cable, Pitney Bowes and Reed Exhibitions.