
IBM Cloud Code Engine, the company's fully managed serverless platform, has introduced Serverless Fleets with integrated GPU support. With this capability, IBM directly addresses the challenge of running large-scale, compute-intensive workloads, such as enterprise AI, generative AI, machine learning, and complex simulations, on a simplified, pay-as-you-go serverless model. Historically, as noted in academic work, including a recent Cornell University paper, serverless platforms struggled to support these demanding, parallel workloads efficiently.
The architecture of this capability was informed by running large real-world workloads with hundreds of thousands of processors, and it is built to operate these workloads with essentially zero SRE staff. Serverless Fleets simplifies how data scientists and developers execute compute-intensive tasks by providing a single endpoint for submitting large numbers of batch jobs. In a blog post, IBM explains that Code Engine then handles the infrastructure orchestration automatically: the service provisions the necessary compute resources on demand.
Serverless Fleets adds integrated GPU support to IBM Cloud Code Engine, enabling simplified, pay-as-you-go execution of large-scale, compute-intensive workloads like enterprise AI, generative AI, machine learning, and complex simulations. The offering targets historically challenging parallel workloads that required thousands or millions of simultaneous tasks and specialized hardware. The architecture was informed by running large real-world workloads with hundreds of thousands of processors and is designed to operate with essentially zero SRE staff. Serverless Fleets provides a single endpoint for large batch submissions, automatically provisions VMs and serverless GPUs (such as NVIDIA L40), scales worker instances elastically for run-to-completion tasks, and removes resources automatically after completion.
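The execution model described above, submitting a batch of independent tasks through one entry point, fanning them out over a pool of workers that run each task to completion, and tearing the pool down afterwards, can be sketched in miniature with Python's standard library. This is a conceptual illustration of the pattern only; `simulate_task`, `run_fleet`, and the worker count are assumptions for the sketch, not part of the Code Engine API.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate_task(index: int) -> int:
    # Stand-in for one compute-intensive batch task; a real fleet
    # task might run a model inference or a simulation step.
    return index * index

def run_fleet(task_count: int, max_workers: int = 4) -> list[int]:
    # Single "endpoint": the whole batch is submitted at once.
    # The pool reuses its workers until every task has run to
    # completion, then the context manager tears the pool down,
    # mirroring how a fleet releases resources when the batch ends.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(simulate_task, range(task_count)))

if __name__ == "__main__":
    print(run_fleet(8))  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The key property the sketch shares with the fleet model is that the caller sizes the batch, not the workers: the pool schedules tasks onto however many workers are available, and no resources outlive the batch.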
Read at InfoQ