How to Learn from Data (with a Wrong Model)

Machine learning (ML) and data analytics are rapidly changing today’s business landscape.  With the power of big data, Amazon can find products you like based on your purchase history, Netflix can make recommendations from a library of thousands of movies, and Uber can adjust prices instantly to ensure there is enough driver supply during peak hours.  ML’s success is mostly due to its ability to make accurate predictions, which is made possible by a combination of smart machine learning algorithms, massive datasets, and a lot of computational power.

However, while making good predictions is important, real-world business problems also require ML algorithms to make good decisions, which ultimately affect a company’s bottom line.  In applications such as displaying products, recommending movies, and setting prices, an ML algorithm needs to make a sequence of decisions as it interacts with customers.

For example, suppose you are an online fashion retailer who wants to use ML to set prices for men’s shirts. When a customer arrives at the website, a pricing algorithm suggests prices for all the shirt styles. The customer may buy one shirt, buy multiple shirts of different styles, or leave without making a purchase.  Because the customer’s choice tells you something about your prices (e.g., perhaps they are set too high) as well as something about seasonal and market trends, the ML algorithm must take the customer’s response into account. It is also important to notice that the customer’s response may change the behavior of the ML algorithm, and thus the prices offered to future customers. The whole sequence can be thought of as a “closed loop feedback” process shown below.

The Caveat of Wrong Models

When an ML algorithm makes a decision, it often relies on some model, which is a way for computers to understand our world.  In practice, managers or business software often use incorrect models that don’t precisely reflect the real business problems. This issue is known as model misspecification.

There are several reasons for model misspecification: perhaps the manager does not fully understand the problem, or perhaps the software needs to use a simplified model to save computation time.  Another reason is the sheer size of the data we have today: a large-scale dataset may contain many kinds of information, such as product characteristics, customer types, and economic conditions of the market. These features can affect customer choice in complex ways that we may never fully understand, so we inevitably use some wrong model.

The caveat of using a wrong model is that it affects the “closed loop feedback” process we just showed. The new process is shown in the picture below. The computer algorithm will keep updating the wrong model, unaware that it may be far from the truth.  In the meantime, of course, data are still generated by the true model (i.e., the real world). The discrepancy between the two models biases the learning process, which leads the ML algorithm to make poor decisions.

New closed loop process

Solving the Model Misspecification Challenge

In a recent paper published in Management Science, Dynamic learning and pricing with model misspecification, Nambiar, Simchi-Levi and Wang tackle the task of learning with misspecified models.  Even though a wrong model is sometimes unavoidable, the key question is: can an algorithm learn from data and get as close as possible to the true model?  If so, the decisions produced by the algorithm are almost as good as the optimal decisions, that is, the decisions we would have made if we knew the true model.

Using dynamic pricing as the problem context, the authors propose injecting random price shocks into their algorithm’s pricing decisions. The intuition behind these price shocks is related to the concept of “instrumental variables,” which are widely used in econometrics to correct for biased estimates.
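To make the instrumental-variable intuition concrete, here is a minimal simulation sketch. It is not the authors’ algorithm: the demand curve, the pricing rule, the shock range, and every name in the code (simulate, covariance, and so on) are assumptions chosen purely for illustration. The seller’s base price depends on a market condition that also drives demand in a nonlinear way, and the seller’s model ignores that condition. A plain regression of demand on price is then biased, while using the random shock as an instrument comes much closer to the true price sensitivity.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)


def simulate(n=20000, true_slope=-2.0, seed=0):
    """Return (naive_slope, iv_slope) estimated from simulated sales data.

    All numbers below are hypothetical and only serve the illustration.
    """
    rng = random.Random(seed)
    prices, shocks, demands = [], [], []
    for _ in range(n):
        x = rng.random()                 # market condition, ignored by the seller's model
        shock = rng.uniform(-0.5, 0.5)   # random price shock, independent of x
        price = 1.0 + x + shock          # base price rises with x, then gets perturbed
        # "True" demand: nonlinear in x, linear in price, plus noise.
        demand = 10.0 + 4.0 * x * x + true_slope * price + rng.gauss(0.0, 0.5)
        prices.append(price)
        shocks.append(shock)
        demands.append(demand)

    # Misspecified model: demand = a + b * price, fitted by least squares.
    # Because price is correlated with x, the slope absorbs the effect of x.
    naive_slope = covariance(prices, demands) / covariance(prices, prices)

    # Instrumental-variable estimate: only the price variation caused by the
    # random shock is used, and the shock is unrelated to x by construction.
    iv_slope = covariance(shocks, demands) / covariance(shocks, prices)
    return naive_slope, iv_slope


if __name__ == "__main__":
    naive, iv = simulate()
    print(f"true slope:  {-2.0:.2f}")
    print(f"naive slope: {naive:.2f}")
    print(f"IV slope:    {iv:.2f}")
```

In this particular setup the naive slope lands far from the true value of -2 (close to zero, which would suggest that customers barely react to price), whereas the shock-based estimate stays close to -2. The design point is that the shock is random, so it cannot be correlated with whatever the model leaves out.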

Their analysis shows that the random price shock algorithm does indeed exhibit strong numerical and theoretical performance. Moreover, the authors show that the algorithm is versatile and can be adapted to a number of common business settings: for example, the seller may specify business constraints on which prices are allowed to be offered.

The authors also demonstrated the real-world applicability of their algorithm through a case study in collaboration with Oracle Retail, involving a large fashion retail dataset.  Using historical data, they ran tests to gauge the performance of the random price shock algorithm.  The results are very promising and show that the algorithm can earn 8-20% more revenue than competing algorithms over a period of 35 weeks.

Read the full article at

Nambiar M, Simchi-Levi D, Wang H (2019). Dynamic learning and pricing with model misspecification. Management Science 65(11):4980-5000.