Scaling an Embedded Database for the Cloud - Challenges and Trade-Offs

"I think the interesting part about this project, which I did, was that we tried to scale an embedded database. If you think about it, scaling an embedded database for the cloud is a contradictory notion, because an embedded database literally means your application runs in-process locally on your computer, but then we want to make it work in the cloud. How do we make that work? What's the motivation for that?"
"Currently, I'm a staff software engineer at MongoDB. Previously, I was a founding engineer at MotherDuck. The topic that we're going to discuss is mostly my experience working on building this data warehouse at MotherDuck. Before I went to build this data warehouse at MotherDuck, I was working at Google on the BigQuery product. BigQuery is Google's cloud data warehouse offering."
The project transformed an in-process embedded database into a cloud-native data warehouse, reconciling the conflicting requirements of local execution and distributed cloud operation. Embedded databases execute inside the application's process and optimize for local performance, while cloud services require remote access, multi-tenancy, persistence, and scalability. Key engineering decisions centered on trade-offs among durability, memory footprint, distribution, and orchestration. Prior experience with BigQuery and the DuckDB architecture informed choices about the execution model, storage, and scaling from zero to a production cloud service. The effort emphasized pragmatic design that preserves the strengths of an embedded database while enabling cloud-native features.
Read at InfoQ