From the company’s creation, Netflix has relied on the scalability and accuracy of machine learning to deliver content and turn profits. One way Netflix uses machine learning is to recommend movies to its users. A model that provides accurate and tailored recommendations at scale is valuable because it increases the value of Netflix subscriptions at low cost. The company decided to host a competition to find a better recommendation model, offering $1 million for the first submission to reduce prediction error by at least 10% relative to its existing system. Three years and 44,000 submissions later, they found a winner.
The Netflix Prize led to improvements in Netflix’s recommendation system, technical developments, friendships and collaborations, and even new companies. Yet it failed to deliver a model that Netflix could use. The competition incentivized performance on static data, rather than performance in deployment. The winning model was so complex and difficult to update that Netflix decided not to use it.
Netflix’s experience sounds familiar. We’ve deployed multiple projects, and each time we find new challenges. We’d like to know how others have dealt with these issues, but to our disappointment, there isn’t much out there. Try searching Google. You’ll see what we mean.
It’s important to get deployment right. An otherwise good model can fail and, more importantly, do serious harm if the deployment is not handled well. Here are just a few issues to consider:
| Area | Example Issues | Potential Consequences |
|---|---|---|
| Cost to Use | | |
When it comes to their employees, successful organizations already develop policies, assign responsibilities and authority, build and depend on trust, make cost-sensitive decisions, and monitor performance. But these things happen too rarely with machine learning. A deployed machine learning system is essentially a living thing and should be treated as such, with constant care and attention.
Given how little has been written about deploying machine learning models, we decided to write about our experience deploying the first data-driven early intervention system for police officers. In the coming weeks, we plan to release blog posts outlining the lessons we’ve learned, covering both human and technical aspects. We hope you find them useful, and we look forward to hearing how others build on them. Please check back regularly to catch the latest.