#model-inference

Artificial intelligence
from InfoQ
1 week ago

KubeCon NA 2025 - Erica Hughberg and Alexa Griffith on Tools for the Age of GenAI

GenAI platforms require AI-native routing, token-level rate limiting, and centralized credential management, as well as observability, resilience, and failover, all enabled by Kubernetes-based tools.
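The token-level rate limiting mentioned in the summary can be sketched as a token bucket whose cost is measured in LLM tokens consumed rather than requests made. This is an illustrative sketch only; the class name and parameters are hypothetical and not taken from any of the tools discussed in the talk.

```python
import time


class TokenBudgetLimiter:
    """Hypothetical token-level rate limiter: a token bucket whose
    cost is the number of LLM tokens in a request, not the request count."""

    def __init__(self, tokens_per_second: float, burst: float):
        self.rate = tokens_per_second      # refill rate, in LLM tokens per second
        self.capacity = burst              # maximum token budget
        self.available = burst             # current token budget
        self.last = time.monotonic()

    def allow(self, token_cost: int) -> bool:
        """Return True and deduct the budget if the request fits, else False."""
        now = time.monotonic()
        # Refill the budget based on elapsed time, capped at burst capacity.
        self.available = min(self.capacity,
                             self.available + (now - self.last) * self.rate)
        self.last = now
        if token_cost <= self.available:
            self.available -= token_cost
            return True
        return False


# Usage: a 100-token burst budget; a second large request is rejected.
limiter = TokenBudgetLimiter(tokens_per_second=1.0, burst=100)
limiter.allow(60)  # fits within the budget
limiter.allow(60)  # exceeds the remaining budget
```

In practice, gateways apply this per tenant or per API key, estimating prompt tokens up front and reconciling against actual usage after the response streams back.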
from The Register
3 months ago

Alibaba looks to end reliance on Nvidia for AI inference

First reported by the Wall Street Journal on Friday, the e-commerce giant's latest chip is aimed specifically at AI inference, which refers to serving models as opposed to training them. Alibaba's T-Head division has been working on AI silicon for some time. In 2019, it introduced the Hanguang 800. However, unlike modern chips from Nvidia and AMD, that part was aimed primarily at conventional machine-learning models like ResNet, not the large language and diffusion models that power today's AI chatbots and image generators.