Enhancing A/B Testing at DoorDash with Multi-Armed Bandits
Briefly

"While experimentation is essential, traditional A/B testing can be excessively slow and expensive, according to DoorDash engineers Caixia Huang and Alex Weinstein. To address these limitations, they adopted a "multi-armed bandits" (MAB) approach to optimize their experiments. When running experiments, organizations aim to minimize the opportunity cost, or regret, caused by serving the less effective variants to a subset of the user base."
"For our purposes, this strategy allocates experimental traffic toward better-performing variants based on ongoing feedback collected during the experiment. The core idea is that an automated MAB agent continuously selects from a pool of actions, or arms, to maximize a defined reward, while simultaneously learning from user feedback in subsequent iterations. This strategy enables a balance between exploration, i.e., learning about all candidate options, and exploitation, i.e., prioritizing the best‑performing options as they emerge, until the experiment converges on the best option."
Traditional A/B testing uses fixed traffic splits and predetermined sample sizes that remain unchanged throughout experiments, causing continued exposure to inferior variants even after a clear winner emerges. Opportunity cost, or regret, compounds as the number of concurrent experiments increases, incentivizing sequential runs and slowing iteration. Multi-armed bandit (MAB) methods instead adaptively allocate traffic toward better-performing variants based on ongoing feedback. An automated MAB agent repeatedly selects among actions, or arms, to maximize a defined reward while learning from user feedback. It balances exploration and exploitation until the experiment converges on the best option, reducing waste and accelerating learning.
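To make the mechanics concrete, the sketch below shows Thompson sampling over Bernoulli rewards (e.g., whether an order was placed), one common MAB policy. It is not DoorDash's implementation: the arm names, the true conversion rates, and the simulate_feedback helper are hypothetical stand-ins for live user feedback.

```python
import random

# Minimal Thompson sampling sketch for a Bernoulli-reward bandit.
# Each arm keeps a Beta(successes + 1, failures + 1) posterior over its
# conversion rate; traffic naturally shifts toward better-performing arms.

class BetaArm:
    def __init__(self, name):
        self.name = name
        self.successes = 0
        self.failures = 0

    def sample(self):
        # Draw a plausible conversion rate from the current posterior.
        return random.betavariate(self.successes + 1, self.failures + 1)

    def update(self, reward):
        # reward is 1 (e.g., order placed) or 0.
        if reward:
            self.successes += 1
        else:
            self.failures += 1


def choose_arm(arms):
    # Exploration and exploitation in one step: serve the arm whose
    # sampled conversion rate is highest this round.
    return max(arms, key=lambda arm: arm.sample())


# Hypothetical true conversion rates, used only to simulate feedback.
TRUE_RATES = {"control": 0.05, "variant_a": 0.06, "variant_b": 0.04}

def simulate_feedback(arm):
    return 1 if random.random() < TRUE_RATES[arm.name] else 0


if __name__ == "__main__":
    arms = [BetaArm(name) for name in TRUE_RATES]
    for _ in range(10_000):  # each iteration represents one user request
        arm = choose_arm(arms)
        arm.update(simulate_feedback(arm))
    for arm in arms:
        total = arm.successes + arm.failures
        rate = arm.successes / max(total, 1)
        print(f"{arm.name}: served {total} times, observed rate {rate:.3f}")
```

Because each arm's posterior tightens as data arrives, traffic drifts toward the stronger variant automatically, which is the adaptive allocation behavior described above; a fixed-split A/B test would keep sending a third of traffic to each arm regardless.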
Read at InfoQ