Google Cloud Run Now Offers Serverless GPUs for AI and Batch Processing
Briefly

Google Cloud has announced the general availability of NVIDIA GPU support on Cloud Run, bringing flexible, cost-efficient serverless computing to AI workloads. Key benefits include pay-per-second billing to reduce waste, automatic scaling of GPU instances down to zero to eliminate idle costs, rapid instance start-up to handle sudden spikes in demand, and full support for streaming responses. GPUs are now available to all developers without a quota request, making it faster and more cost-effective to bring AI applications to production, a point emphasized by NVIDIA's director of accelerated computing products.
Serverless GPU acceleration represents a major advancement in making cutting-edge AI computing more accessible. With seamless access to NVIDIA L4 GPUs, developers can now bring AI applications to production faster and more cost-effectively than ever before.
Cloud Run automatically scales GPU instances down to zero when inactive, eliminating idle costs, which is particularly beneficial for sporadic or unpredictable workloads.
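
To illustrate the streaming support mentioned above, the sketch below shows how a client might consume a streamed response from a model served on a Cloud Run GPU instance. The service URL, model name, and Ollama-style /api/generate endpoint are assumptions for the example, not details from the announcement.

import json
import requests  # third-party HTTP client (pip install requests)

# Hypothetical Cloud Run service URL; replace with your own deployment.
SERVICE_URL = "https://my-gpu-service-abc123-uc.a.run.app"

# Ollama-style streaming request: the server returns newline-delimited
# JSON chunks, each carrying a piece of the generated text.
with requests.post(
    f"{SERVICE_URL}/api/generate",
    json={"model": "gemma2:9b", "prompt": "Explain serverless GPUs in one sentence."},
    stream=True,
    timeout=300,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if line:
            chunk = json.loads(line)
            # Print tokens as they arrive instead of waiting for the full reply.
            print(chunk.get("response", ""), end="", flush=True)

Because the service scales to zero when idle, a client like this may see a brief cold-start delay on the first request after a quiet period, which is the trade-off for paying nothing while the GPU is unused.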
Read at InfoQ