# Towards optimal personalization: synthesizing machine learning and operations research

August 30, 2016 · 17 minute read

Last post I talked about how data scientists probably ought to spend some time talking about optimization (but not too much time - I need topics for my blog posts!). While I provided a basic optimization example in that post, it may not have been so interesting, and there definitely wasn’t any machine learning involved.

Right now, I think that the most exciting industrial applications of optimization are those that synthesize machine learning and optimization in order to obtain optimal personalization at scale.

Here, I’ll talk about a more concrete use case of this synthesis that you might see at a company.

## All the ML and nowhere to go

Let’s say you start working at a Software-as-a-Service (SaaS) company, and you end up in a meeting with the Marketing team. Everybody’s talking about churn. Marketing has been trying all sorts of things - they’ve sent coupons, they’ve called customers, they’ve sent emails, and everything else in order to decrease churn. Some things work, some things are expensive, and there are lots of questions. Nobody knows SQL, so you offer to look at the data.

It turns out that there seem to be some clear differences between customers who eventually churn and customers who do not. You offer to build an algorithm to predict customer churn broken out by intervention medium (e.g. email, phone call, no intervention, etc.).

You get the green light to hack away. Of course, this takes much longer than you or Marketing expects (because pretty much all machine learning does), but in the end you’re left with multiple classification models that are well-tuned with a bunch of features.

You’re in a great place. You actually built machine learning models that work.

But what now?

You can go the common route. You write a long script that will run the churn model every so often and populate a database with the results. You tell Marketing and everybody else that this information is now available, and you hope that they will use it.

And they might.

Or, those numbers will sit there.

Or, Marketing will pick an arbitrary cutoff, target the top X% of people most likely to churn with their expensive intervention (say, a phone call), and email the rest.

None of this is optimal.
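To make that last heuristic concrete, here is a minimal sketch of the "top X%" rule. The function name, the `top_fraction` parameter, and the churn probabilities are all made up for illustration; this is not code from any real pipeline.

```python
def assign_by_top_fraction(churn_probs, top_fraction=0.1):
    """Naive heuristic: phone the riskiest customers, email everyone else.

    churn_probs: dict mapping customer id -> predicted churn probability
        (hypothetical model output).
    top_fraction: hypothetical cutoff standing in for "top X%".
    """
    # Rank customers from most to least likely to churn.
    ranked = sorted(churn_probs, key=churn_probs.get, reverse=True)
    # Number of customers who get the expensive intervention.
    n_phone = int(len(ranked) * top_fraction)
    return {
        customer: ("phone call" if rank < n_phone else "email")
        for rank, customer in enumerate(ranked)
    }


# Made-up predictions for five customers.
churn_probs = {"a": 0.91, "b": 0.15, "c": 0.62, "d": 0.40, "e": 0.05}
assignments = assign_by_top_fraction(churn_probs, top_fraction=0.4)
print(assignments)
```

Notice what the rule ignores: it looks at a single churn score per customer, so it never asks whether a phone call actually lowers that customer's churn probability more than an email would, and it never checks the total spend against a budget.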

## Optimization to the rescue

When there are lots of decisions to make and there’s a clear goal, optimization is a great friend to have. The goal here is to prevent churn. We will have some constraints (mainly money). Let’s make up some data and walk through how to solve this in Python.
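One way to write down a problem of this shape, as a sketch under assumptions rather than a definitive formulation: let $x_{ij}$ be 1 if customer $i$ receives intervention $j$ and 0 otherwise, let $p_{ij}$ be the predicted churn probability for customer $i$ under intervention $j$ (the output of the per-medium models), let $c_j$ be the price of intervention $j$ (with "no intervention" costing nothing), and let $B$ be the budget. All of these symbols are introduced here for illustration.

```latex
\begin{aligned}
\min_{x} \quad & \sum_{i} \sum_{j} p_{ij} \, x_{ij}
  && \text{(expected number of churners)} \\
\text{s.t.} \quad & \sum_{j} x_{ij} = 1 \quad \forall i
  && \text{(exactly one intervention per customer)} \\
& \sum_{i} \sum_{j} c_{j} \, x_{ij} \le B
  && \text{(stay within budget)} \\
& x_{ij} \in \{0, 1\} \quad \forall i, j
\end{aligned}
```

The decision variables are binary, so this is an integer program: the "clear goal" is the objective, and the money constraint is the budget inequality.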

## Defining (making up) the problem

We’ll assume that we have 4 different types of churn prevention messages at 4 different prices:

Media Price
Email 0.25