The new capabilities center on two integrated components: the Dynamo Planner Profiler and the SLO-based Dynamo Planner. Together, these tools address the "rate matching" challenge in disaggregated serving. The term describes the problem teams face when they split an inference workload into prefill operations, which process the input context, and decode operations, which generate output tokens, and run each phase on its own GPU pool. The two pools must be sized so that neither phase becomes a bottleneck for the other. Without the right tools, teams spend considerable time determining the optimal GPU allocation for each phase.
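To make the rate-matching idea concrete, here is a minimal sketch of the underlying arithmetic. It is not the Dynamo Planner's API or algorithm; the function name `rate_match` and its parameters are hypothetical, and it assumes each GPU in a pool sustains a fixed, known number of requests per second for its phase.

```python
import math


def rate_match(request_rate, prefill_rps_per_gpu, decode_rps_per_gpu):
    """Size each GPU pool so it keeps up with the incoming request rate.

    All names here are illustrative assumptions, not part of Dynamo.
    Rates are in requests per second.
    """
    if min(request_rate, prefill_rps_per_gpu, decode_rps_per_gpu) <= 0:
        raise ValueError("all rates must be positive")
    # Each pool needs enough GPUs that its total throughput
    # is at least the incoming request rate.
    prefill_gpus = math.ceil(request_rate / prefill_rps_per_gpu)
    decode_gpus = math.ceil(request_rate / decode_rps_per_gpu)
    return prefill_gpus, decode_gpus


# Hypothetical numbers: 12 requests/s overall, one GPU handles
# 4 prefill requests/s or 2 decode requests/s.
prefill_gpus, decode_gpus = rate_match(12, 4, 2)
print(prefill_gpus, decode_gpus)
```

In practice per-GPU throughput is not a constant: it shifts with input length, output length, batch size, and latency targets, which is why finding a good split by hand is time-consuming.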
The company, which is based in San Francisco and has an office in Pune, India, is targeting up to $35 million this year as it builds a royalty-driven on-device AI business. That growth has buoyed the company, which now has a post-money valuation of between $270 million and $300 million, up from around $100 million in its 2022 Series B, Kheterpal said.