What is it about?

This paper studies interleaving, an online evaluation method for search ranking algorithms that has been shown to be orders of magnitude more sensitive than traditional A/B tests. The simplest interleaving method can produce biased, incorrect results, while existing methods that fix the bias are either less sensitive or impractical to implement in large-scale systems such as Amazon Search. We introduce a novel interleaving method that is unbiased, sensitive, and simple to implement. Based on 10 large-scale e-commerce experiments spanning billions of search queries, we report a 60x sensitivity gain of our new method over A/B testing. We analyze the theoretical and empirical properties of our method and compare it with alternative interleaving techniques in the context of large-scale experiments.
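For readers unfamiliar with how interleaving experiments work mechanically, here is a minimal sketch of classic team-draft interleaving, one of the standard methods from the interleaving literature. This is background only, not the paper's debiased balanced method; the function and variable names are illustrative. Two rankers' result lists are merged into one list shown to the user, each result is credited to the ranker that contributed it, and clicks on credited results decide the winner:

```python
import random

def team_draft_interleave(ranking_a, ranking_b, k=10):
    """Merge two rankings into one list of up to k results (team-draft style).

    Returns the interleaved list and a credit map: doc -> "A" or "B",
    recording which ranker contributed each shown document.
    """
    interleaved, credit, seen = [], {}, set()
    count = {"A": 0, "B": 0}            # picks made by each ranker so far
    pos = {"A": 0, "B": 0}              # cursor into each ranking
    rankings = {"A": ranking_a, "B": ranking_b}

    while len(interleaved) < k and (pos["A"] < len(ranking_a) or pos["B"] < len(ranking_b)):
        # The ranker with fewer picks goes next; ties are broken by a coin flip.
        if count["A"] != count["B"]:
            team = "A" if count["A"] < count["B"] else "B"
        else:
            team = random.choice(["A", "B"])
        # Skip documents already placed; if this ranker is exhausted, try the other.
        for t in (team, "B" if team == "A" else "A"):
            r = rankings[t]
            while pos[t] < len(r) and r[pos[t]] in seen:
                pos[t] += 1
            if pos[t] < len(r):
                doc = r[pos[t]]
                interleaved.append(doc)
                credit[doc] = t
                seen.add(doc)
                count[t] += 1
                break
        else:
            break  # both rankings exhausted
    return interleaved, credit

def score_clicks(credit, clicked_docs):
    """Count clicks per ranker; the ranker with more credited clicks wins the query."""
    wins = {"A": 0, "B": 0}
    for doc in clicked_docs:
        if doc in credit:
            wins[credit[doc]] += 1
    return wins
```

Because every user sees a single merged list instead of being split into two traffic buckets, each query yields a head-to-head comparison, which is the intuition behind interleaving's sensitivity advantage over A/B testing.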


Why is it important?

A/B testing has become a bottleneck for search ranking innovation because online experiments typically take several weeks and consume large fractions of search traffic to reach statistically significant conclusions. Our novel interleaving method achieves a 60x sensitivity gain over A/B testing, so we can evaluate the same ranking innovations in far less time and with far less search traffic. That also means less user exposure to potentially suboptimal rankings. Additionally, our method is practical to implement for large-scale experiments, making it a game changer for speeding up search innovation in the e-commerce setting.

Read the Original

This page is a summary of: Debiased Balanced Interleaving at Amazon Search, October 2022, ACM (Association for Computing Machinery). DOI: 10.1145/3511808.3557123.
