OpenAI Scales Single Primary PostgreSQL to Millions of Queries per Second for ChatGPT
Briefly

"As PostgreSQL load grew more than tenfold in the past year, OpenAI worked with Azure to optimize its deployment on Azure Database for PostgreSQL, enabling the system to serve 800 million ChatGPT users while maintaining a single-primary instance with sufficient headroom. Optimizations spanned both the application and database layers, including scaling up instance size, refining query patterns, and scaling out with additional read replicas. Redundant writes were reduced through application-level tuning, and new write-heavy workloads were directed to sharded systems such as Azure Cosmos DB, reserving PostgreSQL for relational workloads requiring strong consistency."
"The primary PostgreSQL instance is supported by nearly 50 geo-distributed read replicas on Azure Database for PostgreSQL. Reads are distributed across replicas to maintain p99 latency in the low double-digit milliseconds, while writes remain centralized with measures to limit unnecessary load. Lazy writes and application-level optimizations further reduce pressure on the primary instance, ensuring consistent performance even under global traffic spikes."
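The read/write split described in the quote can be illustrated with a minimal routing sketch. This is not OpenAI's implementation: the `ReadWriteRouter` class, the host names, and the `needs_fresh_read` flag are hypothetical, and the round-robin policy is an assumption chosen for simplicity. It only shows the general idea of sending plain reads to replicas while keeping writes (and reads that cannot tolerate replication lag) on the single primary.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: reads go to replicas, everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary if no replicas are configured.
        self._replica_cycle = itertools.cycle(replicas or [primary])

    def target_for(self, sql, needs_fresh_read=False):
        """Return the host that should run this statement."""
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        # Conservative rule: only plain SELECTs count as reads. Anything else
        # (INSERT, UPDATE, DELETE, DDL, WITH ...) is treated as a write.
        if verb != "SELECT" or needs_fresh_read:
            return self.primary
        # Round-robin across replicas (an assumed policy, not from the source).
        return next(self._replica_cycle)


router = ReadWriteRouter("primary", ["replica-a", "replica-b"])
print(router.target_for("SELECT * FROM conversations WHERE id = 1"))
print(router.target_for("UPDATE conversations SET title = 'x' WHERE id = 1"))
```

The `needs_fresh_read` escape hatch reflects a common trade-off with asynchronous replicas: a read that must observe a just-committed write may need to go to the primary.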
OpenAI scaled PostgreSQL to handle a more than tenfold increase in load and millions of queries per second while serving 800 million ChatGPT users. The deployment on Azure Database for PostgreSQL was optimized through larger instance sizes, refined query patterns, additional geo-distributed read replicas, and application-layer tuning. Read traffic is distributed across nearly 50 replicas to keep p99 latency in the low double-digit milliseconds, while writes remain centralized on the single primary. Redundant writes were reduced, and new write-heavy workloads were moved to sharded stores such as Azure Cosmos DB, reserving PostgreSQL for relational workloads that require strong consistency. Operational guardrails and timeouts were enforced to limit the impact of cache-miss storms and of expensive multi-table joins generated by ORMs.
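One common guardrail against cache-miss storms is request coalescing (often called "single flight"): when many callers miss on the same key at once, only one of them queries the database and the rest wait for its result. The summary does not say which mechanism OpenAI uses, so the sketch below is an assumption for illustration only; `SingleFlightCache`, `loader`, and `wait_timeout` are hypothetical names.

```python
import threading
from concurrent.futures import Future


class SingleFlightCache:
    """Hypothetical cache that lets one caller per key hit the database."""

    def __init__(self, loader, wait_timeout=1.0):
        self._loader = loader              # function that queries the database
        self._wait_timeout = wait_timeout  # max seconds a follower will wait
        self._lock = threading.Lock()
        self._values = {}
        self._inflight = {}

    def get(self, key):
        with self._lock:
            if key in self._values:
                return self._values[key]
            future = self._inflight.get(key)
            is_leader = future is None
            if is_leader:
                future = Future()
                self._inflight[key] = future
        if not is_leader:
            # Followers wait for the leader instead of querying the database.
            # Raises concurrent.futures.TimeoutError if the leader is too slow.
            return future.result(timeout=self._wait_timeout)
        try:
            value = self._loader(key)
        except BaseException as exc:
            with self._lock:
                del self._inflight[key]
            future.set_exception(exc)  # followers see the same failure
            raise
        with self._lock:
            self._values[key] = value
            del self._inflight[key]
        future.set_result(value)
        return value
```

With this shape, a burst of concurrent misses on one key results in a single call to `loader`, and the timeout bounds how long followers wait on a slow query. A real deployment would also need expiry and eviction, which this sketch omits.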
Read at InfoQ