Model Development is a Team Sport

Modern marketers are no strangers to using modeling to help them understand, target, and extend their audiences. What modeling or analytics means, though, may differ by company or by which channel a team is focused on. Traditional direct marketers have leveraged customized predictive modeling for many years, while digitally focused teams have leaned largely on the out-of-the-box modeling tools within their DSPs, DMPs, CDPs and social platforms. Whichever camp you fall into, the growing need to harness 1st, 2nd and 3rd party data for competitive advantage means that more bespoke and forward-thinking solutions will be required for future success.

While hearing the term “modeling” or “analytics” may evoke the image of a data scientist developing algorithms in a quiet room somewhere, I promise you there is much more involved. As audience modeling becomes more customized, all business teams — even non-technical ones — play a role in the development of successful solutions. This does not mean workflows have to become more complex or suffer from too many cooks in the kitchen, but the best outcomes are realized when all sides understand their role.

The Model Dev Teams

Below is a quick breakdown of the teams that influence the model development process and how they impact the outcomes. These groups may all be a part of one company, or represent a mix of in-house and external teams.

Data Integration

Integration specialists are the front line of a successful solution. They ensure that data flowing in from internal and external sources is transformed, cleansed and matched appropriately. Additionally, they can call out any potential issues with the data sets and provide useful metrics or reporting for all other teams. As consumers have fragmented across channels, data integration has become more complex, but when done accurately it creates powerful analytic data sets. The data integration team may also be responsible for rolling-up granular data into useful predictors. The accuracy of this aggregation, and the types of predictors created, can make or break a modeling project.

Data Analysts

Analysts are an extension of the data science and business teams, using their understanding of the business objective(s) to compile the best data for model development. They prepare the development samples, keeping an eye out for suspicious or insufficient counts, incomplete data, unexpected match rates, data recency, undefined dependent variables and more to ensure a successful build.

Compliance/Data Governance

Data governance teams provide guidance on the types of data that are available for specific use cases. They are an essential partner to all teams, especially when building solutions that need to adhere to specific regulations such as FHA/FLA. Consulting with data governance teams early provides assurance that modeled solutions are compliant and applied ethically.

Account Teams/Strategists

Business teams communicate the objectives of the campaign or initiative and serve as a central point of contact between all teams. The more they share about the target audience, what has and hasn’t worked historically, the current state of their business/marketplace and any other influencing factors, the better armed the analysts and data scientists will be to build the best solution. And if you are on one of these teams, know that you can also influence the data science team by sharing ideas during consultations.

Data Science/Analytics

This is, ultimately, the team that builds the models. Quality data scientists seek out opportunities to speak with all teams to understand the data, the product/offer, objectives, campaign creative and success metrics. Along the way they will provide transparent details about their approach, identify areas that require additional attention, and recommend the best path forward. This consultative approach reinforces cross-team ownership, ensuring the strongest possible solution that aligns with the goals outlined upfront. The data will lead the way, but if a data scientist does not factor in all other team inputs, there will be missed opportunities.

Campaign Managers/Media Buyers

The teams activating the data assess performance in the real-world. They are a critical piece of the feedback loop that influences future model iterations. They can provide a different perspective on where things worked/didn’t work, and call out other interesting factors that may have influenced performance.

As you can see, modeling is more than just a data scientist manipulating data they are provided. If you are someone who interfaces with analytic teams, hopefully this provides you a different perspective on your role in the process. Remember, teamwork makes the dream work!

Interested in learning more about Alliant’s consultative model development process? Reach out to us and we’ll be in touch!

ABOUT THE AUTHOR

Malcolm Houtz, VP of Data Science

As head of Alliant’s data science team, Malcolm balances a high volume of complex model development and analytic projects every month — while maintaining a laser focus on innovation and excellence. Malcolm is a critical thinker with an insatiable curiosity for new statistical techniques. His background as a master statistician and as an entrepreneur gives him a unique, business-oriented perspective on data mining and modeling. Prior to Alliant, Malcolm held data analyst and model development positions with Time Warner Cable, Pitney Bowes and Reed Exhibitions.

 

The Power in Marketers Understanding Predictive Modeling Methodologies

Marketers are surrounded by predictive modeling and machine learning. Whether it’s the underlying algorithm(s) powering campaigns in DSPs, suggested subject lines in marketing automation platforms, or custom models built specifically for various business objectives, predictive modeling is everywhere.

While a marketer doesn’t have much control over the stock algorithms in their platforms, they do have a say once they start entertaining custom solutions. And it wasn’t until recently that anyone outside of analytics-related roles really questioned what types of algorithms were being used in their solutions. I applaud this curiosity. Non-technical marketers upping their algorithm game will help with solution evaluation, foster more strategic discussions with a broad group of teams, and even impress a few teammates. But to really make an impact, marketers should understand if and how these different approaches impact outcomes for their brands.

Before digging into whether they impact outcomes (spoiler: they totally do), let’s do a quick crash course on some of the different algorithms. First, algorithms fall into one of two main categories: supervised or unsupervised. Supervised learning methods aim to predict a specific target which exists in the data. Conversely, unsupervised learning methods do not require a specified target; rather, they make observations of the data and group similar points together.
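The distinction can be illustrated with a tiny, stdlib-only Python sketch. All data here is simulated and the scenario ("light" vs "heavy" buyers) is hypothetical: the supervised step learns a cut-off from labeled buyers, while the unsupervised step groups the same spend values without ever seeing a label.

```python
import random

random.seed(0)

# Toy 1-D "spend" data: light buyers near 10, heavy buyers near 50.
light = [random.gauss(10, 2) for _ in range(20)]
heavy = [random.gauss(50, 2) for _ in range(20)]

# Supervised: we KNOW the target (1 = heavy buyer), so we learn a
# decision threshold that separates the labeled groups.
labeled = [(x, 0) for x in light] + [(x, 1) for x in heavy]
threshold = sum(x for x, _ in labeled) / len(labeled)  # midpoint split
predict = lambda x: 1 if x > threshold else 0
accuracy = sum(predict(x) == y for x, y in labeled) / len(labeled)

# Unsupervised: no target at all. Simple 1-D k-means just groups
# similar spend values together; we can name the clusters afterwards.
points = light + heavy
c1, c2 = min(points), max(points)  # initial centroids
for _ in range(10):
    g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
    g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
    c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)

print(f"supervised accuracy: {accuracy:.2f}")
print(f"unsupervised centroids: {c1:.1f}, {c2:.1f}")
```

Both approaches end up separating the same two groups, but only the supervised one needed labels to do it.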

Some popular machine learning methodologies include:

  1. Logistic Regression: Estimates the log-odds of a binary response (bought/didn’t buy). A classic and commonly used algorithm, but one that doesn’t capture more complex, nonlinear effects or interactions among variables.
  2. Decision Trees: Predicts the likelihood of an action in an upside-down tree pattern. The algorithm chooses one predictor and its splitting point which results in the purest nodes below. This method can identify distinct groups, capturing nonlinearity and interactions. There are several variations, including:
    1. Random Forests – A collection of many (often hundreds of) decision trees. Each tree is trained independently, considering a small subset of randomly selected predictors at each branch. The results of the trees are averaged together, providing a very stable solution.
    2. Gradient Boosted Trees – Instead of building each tree independently, this method builds a succession of trees, each trying to improve the results of the previous tree. This is a very strong method, requiring less code and providing high accuracy.
  3. Support Vector Machines (SVM): Separates behaviors by constructing a hyperplane which slices through the data and provides the best separation between the two groups. SVMs can produce significant accuracy with less computational power.
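As one concrete instance of the list above, the log-odds idea behind logistic regression (method 1) can be sketched in a few lines of stdlib Python. The "site visits" predictor and all data are invented for illustration; the fit uses plain gradient descent on the log-loss.

```python
import math, random

random.seed(1)

# Toy data: one predictor (e.g., site visits); buyers tend to visit more.
X = [random.gauss(3, 1) for _ in range(50)] + [random.gauss(7, 1) for _ in range(50)]
y = [0] * 50 + [1] * 50

# Logistic regression models log-odds(p) = b0 + b1*x; fit by gradient
# descent on the log-loss.
b0, b1 = 0.0, 0.0
lr = 0.1
for _ in range(2000):
    g0 = g1 = 0.0
    for xi, yi in zip(X, y):
        p = 1 / (1 + math.exp(-(b0 + b1 * xi)))
        g0 += p - yi
        g1 += (p - yi) * xi
    b0 -= lr * g0 / len(X)
    b1 -= lr * g1 / len(X)

prob = lambda x: 1 / (1 + math.exp(-(b0 + b1 * x)))
print(f"P(buy | 2 visits) = {prob(2):.2f}")  # low
print(f"P(buy | 8 visits) = {prob(8):.2f}")  # high
```

The fitted model maps any visit count to a buy probability, which is exactly the linear-in-log-odds behavior (and the limitation) described in the list.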

This is by no means an extensive list but provides a good starting point. It is most important to grasp that each method aims to solve the same question or problem in slightly different ways, therefore providing slight differences in the predicted outcomes. In most cases, each will iteratively refine itself, until the model can no longer be improved. With a baseline understanding, skilled data scientists can further guide you on the nuances of each when it comes time to build your custom solution.

At this point you may be wondering: do these differences matter? The short answer is yes. So the question shifts to how a marketer ensures the best outcome. Understand that a data scientist won’t always know upfront what will work best, but discovery will light the way. The algorithms chosen by a modeler are influenced by two primary considerations: 1. What are we trying to predict, or what question are we asking? And 2. What data do we have access to? The answers to each of these may limit or expand the options available. Modelers may have a gut instinct about which methodologies will work best, but the data may tell a different story than anticipated. The marketer’s understanding of the business objective, the marketplace and more can empower data scientists to make more informed decisions throughout the development process.

Since methodologies matter, and the data holds the key, the best path is often to explore more than one approach or algorithm. The Alliant Data Science team uses advanced workflows to simultaneously build multiple models using different methodologies. This approach provides several benefits:

      • Reduced subjectivity: the ability to see predicted performance, model fit metrics, and more across each method allows the team to see which will be strongest
      • Fewer limitations: the workflow can bring together supervised and unsupervised methods, like using clustering and dimension reduction to inform subsequent supervised learning steps
      • More creativity: data scientists can test new and interesting ways to combine multiple methodologies into a unique solution
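The "reduced subjectivity" point can be illustrated with a stdlib-only toy workflow (not Alliant's actual system; all data and model forms are hypothetical): two simple candidate models are fit on the same training data, and a holdout metric, not gut instinct, picks the winner.

```python
import random

random.seed(2)

# Toy labeled data: predictor x, binary outcome y. The true pattern is a
# nonlinear "sweet spot" (mid-range x converts best), which a single
# linear-style cut handles poorly.
data = [(x, 1 if 4 < x < 8 else 0) for x in [random.uniform(0, 12) for _ in range(400)]]
train, holdout = data[:300], data[300:]

def accuracy(model, rows):
    return sum(model(x) == y for x, y in rows) / len(rows)

# Candidate 1: best single threshold ("respond if x > t").
def fit_threshold(rows):
    best_t, best_acc = None, -1.0
    for t in range(13):
        acc = accuracy(lambda x, t=t: 1 if x > t else 0, rows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return lambda x: 1 if x > best_t else 0

# Candidate 2: an interval rule, a stand-in for a depth-2 tree that can
# capture the nonlinear sweet spot.
def fit_interval(rows):
    best, best_acc = None, -1.0
    for lo in range(13):
        for hi in range(lo + 1, 13):
            acc = accuracy(lambda x, lo=lo, hi=hi: 1 if lo < x < hi else 0, rows)
            if acc > best_acc:
                best, best_acc = (lo, hi), acc
    lo, hi = best
    return lambda x: 1 if lo < x < hi else 0

# Build both, then let the holdout metric pick the winner.
candidates = {"threshold": fit_threshold(train), "interval": fit_interval(train)}
scores = {name: accuracy(m, holdout) for name, m in candidates.items()}
print(scores)  # the interval rule should win on this nonlinear pattern
```

Swap in any number of methodologies for the two candidates here; the principle of judging them all on the same held-out data is what removes the subjectivity.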

Whether you start with just one method or test an ensemble approach like the one introduced above, your marketing goals, data and infrastructure will illuminate which is best for you. Maintain a human element and partner cross-functionally to obtain the best results, gathering input and alignment across business, marketing, data science and technology teams.

Interested in learning more about how advanced custom modeling can improve your multichannel marketing? Feel free to reach out to us and our team will be in touch!


5 Reasons to Maintain a Human Element in Marketing Data

Analytic and modeling features permeate the many SaaS platforms in the marketing industry and advances in technology have made machine learning tools widely accessible. As the industry leans into data-driven marketing and predictive modeling, marketers often find themselves relying heavily on automated analytic tools without realizing the potential hazards. Whether starting off with a “magic box” lookalike modeling solution, or implementing advanced analytic workflows leveraging multichannel data, maintaining a human touch is critical to success.

 


A useful analogy for how automated analytic tools handle and activate data is to consider how a Tesla owner might use some of the advanced features of the vehicle. Auto-pilot is a very real feature that can assist them down the road, and fully automated self-driving is on the horizon, but for now the driver still needs to be present at the wheel. It may be tempting to look away for an extended time, but keeping human eyes on the road is a better guarantee of a safe arrival. Similarly, a ‘set it and forget it’ approach to data activation will never deliver the marketing results we desire. Expert data scientists and strategists can add unmatched value to your marketing execution — and ultimately, to your bottom line.

Whether you have in-house data experts, lean on partners, or are wondering if you should add a data scientist to your team, here are five important reasons why you need human involvement in your data workflows:

Reason #1: Algorithms aren’t consultative

Machine learning tools are great at ingesting enormous quantities of data and making sense of it. However, an algorithm can only make assessments based on the data it is supplied. ML tools can’t look outside of that data set and consider evolving regulations or cultural changes. They can’t help you develop the right objectives or success metrics. And they won’t be able to look at the impact of your unique business processes — a data scientist who understands the nuances of your business, from product, to creative and compliance, will be able to maximize the value of the data.

The consultation should start before a model is even estimated, beginning with defining the model development data set and the appropriate dependent variable for the chosen KPI. Aligning available data sources, like past campaign responders or various lead streams, with the campaign objective will help narrow the development data set. Consulting with a data expert can help you isolate the right dependent variable while also providing guidance on what may happen when you choose one over another. For example, modeling for higher-LTV customers will lower short-term response rates. Data scientists can also add value by suggesting a screen, or secondary model, for other lagging indicators. Doing so can help balance results, protecting one KPI from tanking while another thrives. Without a human element to your analytics, the opportunity to have strategic conversations may be missed.
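A screen of this kind is conceptually simple. The sketch below is a hypothetical Python illustration (the records, score names, and floor value are all invented): the audience is ranked on a primary LTV score, while a secondary response score acts as a floor that protects the short-term KPI.

```python
# Hypothetical scored audience: each record carries a primary LTV model
# score and a secondary short-term response score (both 0-1).
audience = [
    {"id": 1, "ltv": 0.92, "response": 0.40},
    {"id": 2, "ltv": 0.88, "response": 0.05},  # high LTV, but unlikely to respond soon
    {"id": 3, "ltv": 0.61, "response": 0.55},
    {"id": 4, "ltv": 0.30, "response": 0.70},
]

RESPONSE_FLOOR = 0.10  # the screen: protect the short-term response KPI

# Rank on the primary (LTV) model, but screen out records the secondary
# model says are very unlikely to respond in the near term.
selected = sorted(
    (r for r in audience if r["response"] >= RESPONSE_FLOOR),
    key=lambda r: r["ltv"],
    reverse=True,
)
print([r["id"] for r in selected])  # → [1, 3, 4]; id 2 is screened out
```

The point of the exercise: neither score alone makes the call, and deciding where to set the floor is exactly the kind of strategic conversation an algorithm won't start on its own.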

Reason #2: Creativity isn’t reserved solely for Marketing teams

Plug-and-play solutions are one-size-fits-all and often lack imagination when it would be most beneficial. What if you don’t have the exact data points in your model development sample to create the desired dependent variable? If you only have a “magic box” modeling tool, then you’ve hit a wall and are out of luck. A qualified analyst can evaluate the situation and potentially construct a proxy using available data. While having rich model development data is ideal, a creative approach can push you forward when you would otherwise be stuck.

A human element can also empower you to build and test several different solutions. Various test cases can evaluate different algorithms, dependent variables, screens, input data sets and more. Quick and easy modeling tools don’t synthesize new ideas or applications. If you are in need of something different or additive to an existing solution, rerunning within the same template is not going to generate different results.

Reason #3: QA won’t happen by itself

Remember the old adage, garbage in – garbage out. If flawed input data flows into the system it will be subject to all sorts of issues, and essentially rendered useless. Worse, bad data may go unnoticed and any models generated would be sub-optimal. Having a team to manage data hygiene and identify potential errors will save you many headaches during development and execution. This is especially true if you are matching data sets from different databases or silos within your organization. Being hands-on with the data early on will also provide an opportunity to evaluate which data sets will drive the best results, and which might just be noise.
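The kinds of checks described above can be sketched as a simple QA report. Everything below (field names, thresholds, the sample rows) is hypothetical, but it shows the flavor of catching flawed input before it flows into model development.

```python
# Hypothetical development sample pulled from two matched databases.
rows = [
    {"id": "a1", "email": "x@y.com", "orders": 3,  "matched": True},
    {"id": "a2", "email": None,      "orders": 0,  "matched": True},
    {"id": "a3", "email": "q@z.com", "orders": -2, "matched": False},  # suspicious
    {"id": "a4", "email": "m@n.com", "orders": 1,  "matched": True},
]

def qa_report(rows):
    """Summarize counts, match rates, nulls, and impossible values."""
    n = len(rows)
    return {
        "row_count": n,
        "match_rate": sum(r["matched"] for r in rows) / n,
        "null_email_rate": sum(r["email"] is None for r in rows) / n,
        "bad_order_counts": [r["id"] for r in rows if r["orders"] < 0],
    }

report = qa_report(rows)

# Flag issues for a human to investigate before any model is built.
issues = []
if report["match_rate"] < 0.5:
    issues.append("unexpectedly low match rate")
if report["bad_order_counts"]:
    issues.append(f"impossible order counts: {report['bad_order_counts']}")

print(report)
print(issues)
```

A report like this is cheap to run on every inbound file; the judgment about which flags matter, and what to do about them, is where the human comes in.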

Similarly, having a team that monitors and validates model results is necessary as well. This might be a new concept for those who have only used platform-based modeling, where you don’t have a chance for QA. Even with clean and correct input data, it is possible for things to go awry in processing. A trained eye will be able to evaluate QA reports to validate model outputs, and further investigate any outliers or anomalies. Let’s say you are modeling for digital buying behavior, but you decide to include customers who had also ordered offline in the model development sample to bolster the seed size. A human would assess whether the model became too biased towards the offline behavior and adjust as needed. All of this will provide you further assurance when it comes time to activate.

Reason #4: More advanced models require analytic expertise

Lookalike modeling is a powerful tool and one that Alliant often deploys for clients. But with the constant evolution of technologies and strategies, there are many powerful new data analysis and modeling techniques available. As your business evolves you will likely want to take advantage of these, predicting performance for specific KPIs or leveraging ensemble methods. For instance, multi-behavioral models can optimize for multiple consumer actions. Innovative applications like these require more than a simple upload of data into a lookalike modeling solution.
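As a hypothetical illustration of the multi-behavioral idea, one simple design blends separate model scores for two consumer actions into a single activation score. The weights and scores below are invented; choosing those weights to reflect business value is itself a human, not algorithmic, decision.

```python
# Hypothetical multi-behavioral scoring: blend separate models for two
# consumer actions (purchase, subscribe) into one activation score.
records = [
    {"id": "c1", "p_purchase": 0.80, "p_subscribe": 0.10},
    {"id": "c2", "p_purchase": 0.35, "p_subscribe": 0.90},
    {"id": "c3", "p_purchase": 0.20, "p_subscribe": 0.15},
]

# Weights reflect the assumed relative business value of each action.
W_PURCHASE, W_SUBSCRIBE = 0.7, 0.3

for r in records:
    r["blended"] = W_PURCHASE * r["p_purchase"] + W_SUBSCRIBE * r["p_subscribe"]

ranked = sorted(records, key=lambda r: r["blended"], reverse=True)
print([r["id"] for r in ranked])  # → ['c1', 'c2', 'c3']
```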

Reason #5: Things won’t always go as planned

If 2020 has taught us anything, it is that you can never be 100% sure of what will happen once you go live. Having resources available to assess the situation and make adjustments on the fly can turn potential errors into positives. In uncertain times it is unlikely you will have a data set to assist with prediction. It is ultimately up to humans to figure out how to adjust — and bring machine learning tools along for the ride.

Interested in learning more about how you can partner with Alliant’s data scientists to build custom data solutions? Contact us at any time! Our team has been on an analytic evolution, enabling the data scientists to take predictive modeling to new places and ultimately creating stronger solutions for our partners.


The Need for Speed

Driven by our own machine learning experience, the Alliant Data Science team has found ample opportunity to create efficiencies in custom model development, recently upgrading its SAS platform to include SAS VIYA. VIYA’s multi-threaded processing, new SAS procedures, and efficient sampling methods allow Alliant’s team to deliver custom models in about ¼ the time of traditional workflows, in some cases reducing processing time from days to hours, and hours to minutes. Hours and minutes drag on compared to machine learning speeds, but the human touch and hand-crafted nature of these predictors and models produce incomparable results — and now, faster than ever.

Dimension Reduction — VIYA replaces the iterative and time-consuming process of deciding which of the tens of thousands of candidate predictors to include in a model with a single procedure. Processing time for dimension reduction dropped from four hours to about 30 minutes.
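The actual procedure is SAS-side, but the underlying idea of screening a huge candidate pool can be sketched in a few lines of Python (all data simulated, and only three toy predictors standing in for tens of thousands): drop near-constant predictors, then keep those with a meaningful relationship to the target.

```python
import random, statistics

random.seed(3)
n = 200

# Hypothetical candidate predictors: one informative, one near-constant,
# one pure noise.
target = [random.random() for _ in range(n)]
candidates = {
    "informative": [t * 2 + random.gauss(0, 0.1) for t in target],
    "near_constant": [1.0 if random.random() < 0.01 else 0.99 for _ in range(n)],
    "noise": [random.random() for _ in range(n)],
}

def corr(a, b):
    """Pearson correlation, stdlib-only."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((z - mb) ** 2 for z in b) ** 0.5
    return cov / (va * vb)

# Screen 1: drop near-constant predictors (no information to give).
# Screen 2: keep predictors with a meaningful correlation to the target.
kept = [
    name for name, col in candidates.items()
    if statistics.pstdev(col) > 0.01 and abs(corr(col, target)) > 0.3
]
print(kept)  # → ['informative']
```

Real dimension reduction procedures are far more sophisticated, but the shape of the problem — thousands in, a useful handful out — is the same.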

Final Variable Selection — Choosing a model’s final variables requires a complex analysis of the relationship of each to the dependent variable — VIYA’s reduced sample sizes and parallel processing capabilities shorten this step by 85%.
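Again as a Python toy rather than the SAS implementation, final variable selection can be sketched as ranking candidates by strength against the dependent variable while pruning near-duplicates (all data below is simulated; thresholds are illustrative):

```python
import random

random.seed(4)
n = 300
y = [random.gauss(0, 1) for _ in range(n)]  # dependent variable

# Hypothetical candidates: two strong but mutually redundant predictors,
# one weaker independent one, and pure noise.
cands = {
    "strong_a": [v + random.gauss(0, 0.5) for v in y],
    "strong_b": None,  # built from strong_a below, so nearly redundant
    "weak_ind": [v + random.gauss(0, 2.0) for v in y],
    "noise":    [random.gauss(0, 1) for _ in range(n)],
}
cands["strong_b"] = [v + random.gauss(0, 0.2) for v in cands["strong_a"]]

def corr(a, b):
    """Pearson correlation, stdlib-only."""
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((z - mb) ** 2 for z in b) ** 0.5
    return cov / (va * vb)

# Greedy selection: take candidates in order of strength against the
# dependent variable, skipping any too weak or mostly duplicating a
# predictor already kept.
ranked = sorted(cands, key=lambda k: -abs(corr(cands[k], y)))
kept = []
for name in ranked:
    if abs(corr(cands[name], y)) < 0.2:
        continue  # too weak to bother with
    if all(abs(corr(cands[name], cands[k])) < 0.9 for k in kept):
        kept.append(name)
print(kept)  # one of strong_a/strong_b (not both), plus weak_ind
```

The expensive part in practice is computing those relationships across every candidate and every record, which is exactly what smaller samples and parallel processing accelerate.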


Model QA — Before a model is transferred into the Alliant production library, data scientists stage a dummy production run. All data points in the test results are compared to the development file. More efficient sampling of records, new SAS procedures and VIYA’s multi-threaded environment reduce this critical step by 88%.
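The dummy-run comparison can be illustrated with a hypothetical Python sketch (record IDs, scores, and the tolerance are all invented): score the same records through both code paths, then flag any point that drifts beyond a tolerance and any record that goes missing.

```python
# Hypothetical QA step: the same records scored by the development code
# path and by the staged production run, compared point by point.
dev_scores  = {"r1": 0.812, "r2": 0.455, "r3": 0.230, "r4": 0.901}
prod_scores = {"r1": 0.812, "r2": 0.455, "r3": 0.231, "r4": 0.901}

TOLERANCE = 0.005  # tiny numeric drift is fine; anything more is a flag

mismatches = {
    rid: (dev_scores[rid], prod_scores[rid])
    for rid in dev_scores
    if rid in prod_scores and abs(dev_scores[rid] - prod_scores[rid]) > TOLERANCE
}
missing = [rid for rid in dev_scores if rid not in prod_scores]

print("mismatches:", mismatches)  # → {} here; 0.230 vs 0.231 is within tolerance
print("missing:", missing)        # → []
```

An empty report is the green light to promote the model; anything else goes to a data scientist for investigation before activation.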

The new development environment not only offers superior data mining results, but it is dramatically improving productivity and model efficacy for DataHub Members.
