Arnab Bhadury, ML Engineer @ Flipboard, ML meetups co-organizer. Interested in Topic Modeling/Extraction, Recommender Systems, Bayesian Inference, Counterfactual Evaluation, and endless sarcasm.

A Case For Embeddings In Recommendation Problems

Once you have worked on a few different machine learning problems, most things in the field start to feel very similar: you take your raw input data, map it into a lower-dimensional latent space, and then perform your classification, regression, or clustering there. Recommender systems, new and old, are no different. In the classic collaborative filtering problem, you factorize a partially observed usage matrix to learn user-factors and item-factors, and predict a user's rating for an item with the dot product of the corresponding factors.

[Figure: Matrix Factorization]
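To make the dot-product formulation concrete, here is a minimal sketch of matrix factorization trained with stochastic gradient descent on a toy ratings matrix. The function name, hyperparameters, and the convention that 0 marks an unobserved rating are illustrative assumptions, not details from the post.

import numpy as np

# A toy sketch of matrix factorization with stochastic gradient descent.
# 0 marks an unobserved user-item pair; only observed ratings are fit.

def factorize(R, n_factors=2, lr=0.01, reg=0.05, n_epochs=500, seed=0):
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, n_factors))  # user-factors
    V = 0.1 * rng.standard_normal((n_items, n_factors))  # item-factors
    observed = np.argwhere(R > 0)
    for _ in range(n_epochs):
        for u, i in observed:
            err = R[u, i] - U[u] @ V[i]              # error of the dot-product prediction
            u_row = U[u].copy()
            U[u] += lr * (err * V[i] - reg * U[u])   # gradient step for the user-factor
            V[i] += lr * (err * u_row - reg * V[i])  # gradient step for the item-factor
    return U, V

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 5, 4]], dtype=float)

U, V = factorize(R)
print(np.round(U @ V.T, 1))  # dot products of the factors approximate the known ratings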

Simulating A/B tests offline with counterfactual inference

[Figure: Random uncorrelated graph]

While developing machine learning (ML) algorithms in production environments, we usually optimize a function or a loss that has little to do with our business goals. We generally care about metrics such as click-through rate, diversity, or ad revenue, but for computational ease our ML algorithms often minimize log-loss or root mean squared error (RMSE) of arbitrary quantities. These ML metrics frequently do not correlate with the original targets, which is why running online A/B tests is so vital in production: they let us verify our ML models against the true objectives of a task.

However, running A/B tests is expensive: it requires a productionized version of an experiment that must run for a significant amount of time to yield reliable results, during which you risk exposing a poorer system to users. This is why reliable offline evaluation is so critical; it encourages more experimentation in a sandbox environment and helps build intuition about what is worth launching and testing in the wild.
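One standard way to make this offline simulation concrete is an inverse propensity scoring (IPS) estimator, which reweights logged feedback by how likely the new policy is to repeat the logged action. The sketch below is a minimal illustration under assumed conditions (the function name, the toy data, and the requirement that the logging policy's action probabilities were recorded are all assumptions), not the exact method from the post.

import numpy as np

# A minimal sketch of counterfactual (offline) evaluation with inverse
# propensity scoring: estimate the reward (e.g. click-through rate) a new
# policy would have earned, using only data logged under the old policy.

def ips_estimate(rewards, logging_probs, target_probs):
    # rewards:       observed feedback for the logged action (click = 1, no click = 0)
    # logging_probs: probability the logging policy assigned to that action
    # target_probs:  probability the new policy assigns to the same action
    weights = target_probs / logging_probs   # importance weights
    return np.mean(weights * rewards)        # unbiased if the propensities are correct

# Toy logged data for five impressions.
rewards       = np.array([1.0, 0.0, 0.0, 1.0, 0.0])
logging_probs = np.array([0.5, 0.5, 0.2, 0.3, 0.4])
target_probs  = np.array([0.9, 0.1, 0.1, 0.8, 0.2])

print(ips_estimate(rewards, logging_probs, target_probs))  # estimated reward of the new policy

In practice the raw IPS estimate can have high variance when the two policies disagree strongly, so variants with weight clipping or self-normalization are commonly used on top of this basic form.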