
"Uber has redesigned its Apache Pinot query architecture to simplify execution, support richer SQL, and improve predictability for internal analytics workloads. The previous Neutrino system, which layered Presto and Pinot, has been replaced by a lightweight proxy called Cellar and uses Pinot's Multi-Stage Engine Lite Mode. The redesign aims to reduce complexity, enforce execution limits, and provide stronger isolation for multiple tenants."
"Previously, Neutrino ran as a stateless microservice combining Presto coordinator and worker processes. User-submitted PrestoSQL queries were partially pushed down to Pinot as PinotSQL, while the remaining query logic executed within Neutrino. Each query included default or user-defined limits to reduce the risk of full-table scans. Despite these safeguards, the layered architecture created complex semantics, made query plans harder to interpret, and limited isolation for tenants sharing the same proxy."
"Uber's Apache Pinot tables can reach hundreds of terabytes with billions of records, handling query rates from single digits to thousands of QPS. Multi-stage queries at this scale can easily exceed resources or latency expectations. Pinot 1.4 introduces the Multi-Stage Engine Lite Mode, which enforces configurable leaf stage record limits and uses a scatter-gather pattern. Leaf stages run on Pinot servers while other operators execute on brokers, ensuring predictable performance for complex queries."
Neutrino previously combined Presto coordinator and worker processes, pushing part of each PrestoSQL query down to Pinot and executing the remaining logic itself. Layering Presto over Pinot produced complex semantics, harder-to-interpret query plans, and limited tenant isolation, despite query limits intended to reduce full-table scans. Pinot 1.4 adds a Multi-Stage Engine Lite Mode that enforces configurable leaf-stage record limits and employs a scatter-gather pattern, running leaf stages on servers and other operators on brokers. The new lightweight Cellar proxy forwards queries directly to Pinot brokers, enabling simpler execution paths, stronger isolation, limits that are visible in the explain plan, and more predictable performance at large scale.
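
The scatter-gather pattern with a leaf-stage record limit can be illustrated with a small simulation. The sketch below is plain Python and does not use any Pinot API: the names `leaf_stage`, `broker_execute`, `LEAF_STAGE_LIMIT`, and the sample rows are all hypothetical, chosen only to show how capping the records each server returns bounds the work left for the broker.

```python
from collections import Counter

# Hypothetical per-server cap on records returned by a leaf stage.
LEAF_STAGE_LIMIT = 3


def leaf_stage(rows, predicate, limit):
    """Simulate a leaf stage on one server: scan, filter, stop at `limit` records."""
    matched = []
    for row in rows:
        if predicate(row):
            matched.append(row)
            if len(matched) >= limit:
                break  # the cap bounds how much any single server sends back
    return matched


def broker_execute(servers, predicate, limit=LEAF_STAGE_LIMIT):
    """Simulate the broker: scatter to every server, gather, then aggregate."""
    # Scatter: each server runs its leaf stage independently.
    partials = [leaf_stage(rows, predicate, limit) for rows in servers]
    # Gather: the remaining operator (here, a count per city) runs on the broker.
    counts = Counter()
    for partial in partials:
        for row in partial:
            counts[row["city"]] += 1
    return dict(counts)


def is_ok(row):
    return row["status"] == "ok"


# Two simulated servers, each holding a slice of a made-up table.
SERVERS = [
    [
        {"city": "SF", "status": "ok"},
        {"city": "SF", "status": "ok"},
        {"city": "NYC", "status": "ok"},
        {"city": "SF", "status": "ok"},
    ],
    [
        {"city": "NYC", "status": "ok"},
        {"city": "NYC", "status": "failed"},
    ],
]

if __name__ == "__main__":
    print(broker_execute(SERVERS, is_ok))
```

In this toy model the broker never processes more than `limit` records per server, so its cost is bounded by the number of servers rather than by table size. The trade-off is that results may be computed from a truncated input once the cap is reached, which is why making the limit visible to the user matters.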
Read at InfoQ