#llm-aware-routing

Artificial intelligence
from InfoWorld
5 days ago

Evolving Kubernetes for generative AI inference

Kubernetes now offers native AI inference features: vLLM support, inference benchmarking, LLM-aware routing, inference gateway extensions, and accelerator scheduling.