Marketers are constantly seeking scalable, intelligent solutions to gain an edge in increasingly competitive market conditions. It is no wonder artificial intelligence (AI) and machine learning (ML) are now being adopted en masse by brands and their marketing organizations. (To learn more about the basics of ML, check out Machine Learning 101.)

For the uninitiated, AI can be broadly described as technology in which a computer automates defined tasks that a human would otherwise do. Machine learning, a functional area within AI, is when a computer is given an end goal but must work out the best route to it on its own.

Today, we are seeing these technologies – especially machine learning – deployed across many areas of marketing, including ad fraud detection, forecasting consumer behavior, recommendation systems, creative personalization and more.

While that’s all well and good, there’s a new offshoot technology that, for marketers, is going to truly deliver on the demand that machine learning is creating. It’s called “reinforcement learning” (RL).

What Is Reinforcement Learning?

The step-change from ML to RL is more than just a letter. Most tasks handed off to machine learning involve a single step, such as “recognize this image,” “understand book content” or “catch fraud.” For a marketer, a business goal such as “attract, retain and engage users” is inherently a multi-step, long-term one, and is not easily achieved with conventional machine learning alone.

This is where reinforcement learning comes in. RL algorithms are all about optimizing for an unfolding and ever-changing journey – one where dynamic problems occur. By employing a mathematical “reward function” to score the outcome of each action, RL estimates the long-term payoff of the choices available and picks the one expected to do best over the whole journey, not just at the next step.
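The idea of looking ahead is commonly formalized as a "discounted return": the rewards an action leads to are added up, with later rewards counting slightly less than immediate ones. The sketch below is only an illustration of that arithmetic. The discount factor of 0.9 and the two reward sequences are made-up assumptions, not figures from any real campaign.

```python
def discounted_return(rewards, gamma=0.9):
    """Add up a sequence of rewards, discounting each by how far ahead it occurs."""
    total = 0.0
    # Work backwards so each step's value is: reward now + gamma * value of what follows.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total

# Hypothetical customer journeys, one reward per step (assumed numbers).
coupon_now = [1.0, 0.0, 0.0, 0.0]      # quick win, then nothing
onboard_first = [0.0, 0.0, 2.0, 2.0]   # no payoff at first, larger payoff later

value_coupon = discounted_return(coupon_now)
value_onboard = discounted_return(onboard_first)

# A single-step view would prefer the coupon; the multi-step view prefers onboarding.
better_plan = "onboard_first" if value_onboard > value_coupon else "coupon_now"
```

Under these assumed numbers, the patient plan scores roughly 3.08 against 1.0 for the quick win, which is the kind of comparison a single-step model never gets to make.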

Today, the best embodiments of this cutting-edge technology can be seen in games and self-driving cars. When Google’s AlphaGo system beat the world’s best player of the board game Go last year, its secret sauce was reinforcement learning. Whilst games have set rules, a player’s options for the route toward victory change dynamically based on the state of the board. With reinforcement learning, the system weighs how each candidate next move changes its chances of winning, rather than following a fixed script.

Similarly, a self-driving car goes on a journey in which the rules of the road and the location of the destination remain fixed, but the variables along the way – from pedestrians to roadblocks to cyclists – change dynamically. That’s why RL is an active area of research for autonomous driving, and why research organizations such as OpenAI, co-founded by Tesla’s Elon Musk, have invested in advancing RL algorithms.

Machines for Marketers

What does any of this mean for marketers?

Many marketers’ core challenges stem from the fact that business conditions change all the time. A winning campaign strategy can fall out of favor over time, while an old strategy can gain new traction. RL is a step toward mimicking the way human intelligence works: we learn from the success and failure of multiple outcomes, and use that to form a winning strategy for the future. Let me give some examples:

1. User engagement enhancement

Let’s focus on customer engagement for a restaurant chain, and a goal to multiply it tenfold over the next year. Today, a marketing campaign might involve sending a birthday greeting with a discount offer, perhaps even based on food preferences. This is linear thinking where the marketer has defined a start and end point.

In a busy world, customers’ lives are constantly changing in real time – sometimes they are more engaged, sometimes less. With reinforcement learning, a system would constantly recalibrate which tactics in the marketing armory, at any given moment, stand the best chance of moving the recipient toward the ultimate goal of 10x engagement.
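A much-simplified version of this recalibration is the "epsilon-greedy bandit": most of the time the system uses the tactic that has worked best so far, and occasionally it tries another one in case conditions have changed. The sketch below is an assumption-laden toy, not a production system. The tactic names, the response rates in `TRUE_RATES` and the simulated customers are all hypothetical, and a full RL system would also track each customer's state across multiple steps.

```python
import random

def run_epsilon_greedy(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate choosing one tactic per customer contact and learning from the response."""
    rng = random.Random(seed)
    tactics = list(true_rates)
    counts = {t: 0 for t in tactics}       # how often each tactic was used
    estimates = {t: 0.0 for t in tactics}  # running average engagement per tactic
    for _ in range(rounds):
        if rng.random() < epsilon:
            tactic = rng.choice(tactics)              # explore: try something at random
        else:
            tactic = max(tactics, key=estimates.get)  # exploit: use the best so far
        # Simulated customer: engages with a probability the system cannot see.
        reward = 1.0 if rng.random() < true_rates[tactic] else 0.0
        counts[tactic] += 1
        # Incremental update of the running average for the chosen tactic.
        estimates[tactic] += (reward - estimates[tactic]) / counts[tactic]
    return counts, estimates

# Hypothetical engagement rates, used only to drive the simulation.
TRUE_RATES = {"birthday_discount": 0.05, "menu_email": 0.10, "loyalty_points": 0.30}
counts, estimates = run_epsilon_greedy(TRUE_RATES)
most_used = max(counts, key=counts.get)
```

The small, permanent share of exploration is the design choice that matters here: it is what lets the system notice when an old tactic starts working again.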

2. Dynamic budget allocation

Now imagine an advertising scenario in which you have a $1 million budget and need to spend some of it every day until the month’s end, allocated across four different channels: TV, loyalty promotions, Facebook and Google. How can you ensure you’re spending the budget in the most effective way? The answer depends on the day, the target users, the inventory price and a host of other factors.

In reinforcement learning, historical ad outcome data would inform the reward functions that score particular spending decisions. The system would also account for real-time factors such as pricing and the likelihood of a positive reception from the target audience member. Through iterative learning, the allocation of ad spend would change dynamically throughout the month. The ultimate goal stays fixed, while RL aims to allocate the budget as well as possible under whatever scenarios arise. (For more on AI in marketing, see How Artificial Intelligence Will Revolutionize the Sales Industry.)
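The flavor of this iterative reallocation can be sketched in a few lines. The example below is a deliberately simple heuristic rather than a full RL algorithm: it paces the remaining budget evenly over the remaining days, shifts spend toward channels whose estimated return per dollar is higher, and keeps a fixed share spread evenly for exploration. The returns in `TRUE_ROI`, the noise range, the 20% exploration share and the 0.3 learning rate are all assumptions chosen for illustration.

```python
import random

CHANNELS = ["tv", "loyalty", "facebook", "google"]

def allocate_month(total_budget, days, true_roi, explore_share=0.2, seed=0):
    """Spend the whole budget over `days`, reallocating daily as estimates improve."""
    rng = random.Random(seed)
    est_roi = {c: 1.0 for c in CHANNELS}  # equal starting estimates for every channel
    spent = {c: 0.0 for c in CHANNELS}
    remaining = total_budget
    for day in range(days):
        daily = remaining / (days - day)  # pace what is left evenly over the days left
        total_est = sum(est_roi.values())
        for c in CHANNELS:
            # A fixed share is split evenly; the rest follows estimated return per dollar.
            share = (explore_share / len(CHANNELS)
                     + (1 - explore_share) * est_roi[c] / total_est)
            spent[c] += daily * share
            # Simulated, noisy return per dollar observed for today's spend.
            observed = true_roi[c] * rng.uniform(0.8, 1.2)
            # Exponential moving average keeps the estimate responsive to change.
            est_roi[c] += 0.3 * (observed - est_roi[c])
        remaining -= daily
    return spent

# Hypothetical return per dollar for each channel, used only by the simulation.
TRUE_ROI = {"tv": 0.5, "loyalty": 2.0, "facebook": 1.0, "google": 1.5}
spent = allocate_month(1_000_000, 30, TRUE_ROI)
top_channel = max(spent, key=spent.get)
```

A real system would replace the simulated returns with measured outcomes and would typically model prices, audiences and carry-over effects as part of the state it learns from.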

Coming Soon

Reinforcement learning acknowledges that marketing is complex and that people are heterogeneous. It accounts for both, improving each next action over time as the pieces of your game board change around it.

Reinforcement learning is still largely the preserve of research projects and leading-edge adopters. The mathematical concepts and techniques have been around for over 40 years, but practical deployment has only become possible relatively recently, thanks to three trends:

  1. Proliferation of computing power through high-powered graphics processing units (GPUs).

  2. Cloud computing, which makes high-end processing power available at a fraction of the cost of buying the GPUs themselves. Third parties can rent a GPU to train their RL model for several hours, days or weeks at a relatively bargain-basement price.

  3. Improvements in numerical algorithms and smart heuristics. A few critical numerical steps in an RL algorithm can now converge at a much faster pace. Without these numerical tricks, RL algorithms would still not be feasible even with today's most powerful computers.

Thinking Bigger

All of this means the new powers of reinforcement learning are soon going to be available at scale to brands and marketers. However, embracing it is going to require a shift in mindset. For a marketing manager, this technology means the ability to take their hands off the wheel.

Every business has a goal, but when you are deep in the trenches, the daily actions taken toward that goal can become fuzzy. Now RL technology will allow decision-makers to set the goal, having more confidence that the systems will plot their best course toward it.

In advertising, for instance, these days many people realize that metrics like click-through rate (CTR) are merely proxies for true business outcomes, counted only because they are countable. RL-driven marketing systems will de-emphasize such intermediary metrics and all the heavy lifting that is associated with them, allowing bosses to focus on objectives.

This will require businesses to think about their big problems in a much more proactive and long-term way. Once the technology matures, it will help them reach those goals.

Path to Adoption

Reinforcement learning isn’t ready for full-scale use by brands yet; however, marketers should take time to understand this new concept that could revolutionize the way brands do marketing, making good on some of the early promises of machine learning.

When the power arrives, it will come in marketing software with a user interface, but the tasks required by that software will be radically simplified. For staff, there will be fewer switches to flip and numbers to input, and less time spent reading analytics reports and acting on them. Behind the dashboard, the algorithm will be handling most of that.

It is unlikely that RL will match human intelligence right out of the gate. The speed of its development will depend on feedback and suggestions from marketers. We must ensure that we are asking the computer to solve the right problem, and penalizing it when it does not. Sounds like how you would teach your own child, doesn’t it?