From Minutes to Seconds: Uber Boosts MySQL Cluster Uptime with Consensus Architecture
Briefly

"Previously, Uber ran MySQL clusters in a single-primary, asynchronous replica model. External systems detected failures and promoted replicas, resulting in failover times measured in minutes. To reduce downtime and improve reliability, Uber adopted MySQL Group Replication, a Paxos-based consensus protocol. The new architecture embeds consensus within the database itself, forming a three-node MGR cluster."
"One node serves as primary for writes, while the other two secondaries participate in consensus without accepting direct writes. This ensures that all nodes maintain up-to-date data and can automatically elect a new primary if needed. Scalable read replicas fan out from the secondaries, separating read scaling from write availability while preserving fault tolerance."
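The quorum arithmetic behind a three-node group can be sketched in a few lines of Python. This is only an illustrative model, not Uber's tooling or MGR's internal logic: the `Member` type, the node names, and the rule of promoting the alphabetically first online secondary are all assumptions made for the example, and MGR's real election criteria are different.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Member:
    name: str      # hypothetical node name
    role: str      # "PRIMARY" or "SECONDARY"
    online: bool


def has_quorum(members: List[Member]) -> bool:
    """A majority of the group must be reachable for it to make progress."""
    online = sum(1 for m in members if m.online)
    return online > len(members) // 2


def tolerated_failures(group_size: int) -> int:
    """Largest number of members that can fail while a majority remains."""
    return (group_size - 1) // 2


def pick_new_primary(members: List[Member]) -> Optional[str]:
    """Pick an online secondary to promote, or None if there is no quorum.

    Placeholder rule (assumption): choose the alphabetically first name.
    """
    if not has_quorum(members):
        return None
    candidates = sorted(
        m.name for m in members if m.online and m.role == "SECONDARY"
    )
    return candidates[0] if candidates else None


group = [
    Member("node-a", "PRIMARY", online=False),   # the primary has failed
    Member("node-b", "SECONDARY", online=True),
    Member("node-c", "SECONDARY", online=True),
]
new_primary = pick_new_primary(group)
```

With three members the group tolerates one failure: losing the primary still leaves two of three nodes, so a secondary can be promoted, while losing two nodes leaves no majority and no promotion happens.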
"Flow control within MGR monitors transaction queues on each secondary and signals the primary to pause or throttle writes as needed, preventing nodes from falling behind. This mechanism avoids replication inconsistencies, reduces write downtime during failover, and prevents errant GTIDs from propagating outside the cluster."
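The flow-control idea can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not MGR's actual algorithm: the function name, the queue-depth dictionary, and the threshold value are hypothetical and chosen only to show the decision being made.

```python
from typing import Dict, List, Tuple


def throttle_decision(
    queue_depths: Dict[str, int],
    threshold: int = 25_000,  # illustrative threshold (assumption)
) -> Tuple[bool, List[str]]:
    """Decide whether the primary should slow down its writes.

    queue_depths maps each secondary's name to the number of
    transactions it has received but not yet applied.
    Returns (should_throttle, names of lagging secondaries).
    """
    lagging = sorted(
        name for name, depth in queue_depths.items() if depth > threshold
    )
    return (len(lagging) > 0, lagging)


# One secondary is keeping up, the other has a growing backlog.
should_throttle, lagging = throttle_decision(
    {"node-b": 1_200, "node-c": 40_000}
)
```

Throttling as soon as any one secondary exceeds the threshold keeps every member close to the primary's state, which is what allows any of them to take over quickly during failover.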
Uber replaced its single-primary asynchronous MySQL replica model with MySQL Group Replication (MGR), a Paxos-based consensus protocol embedded within the database. The new architecture uses three-node clusters in which one node serves as primary for writes while two secondaries participate in consensus and can automatically elect a new primary if needed. This eliminates external failover systems and their associated delays. Flow control mechanisms monitor transaction queues on the secondaries and throttle writes on the primary to prevent replication inconsistencies. The fleet-wide implementation includes automated onboarding, node management, rebalancing, and safeguards that preserve quorum and operational reliability. Benchmarking shows a slight increase in write latency, on the order of hundreds of microseconds, but a dramatic reduction in total write downtime.
Read at InfoQ