
"Most organisations will never train their own AI models. Instead, most customer's key challenge in AI lies in applying it to production applications and inference, with fine tuning and curation of data the core tasks. Key here are use of retrieval augmented generation (RAG) and vector databases, the ability to reuse AI prompts, and co-pilot capabilities that allow users to question corporate information in natural language."
"Most organisations won't train their own AI models because it's simply too expensive at the moment. That's because GPU hardware is incredibly costly to buy and also because it is evolving at such a rapid pace that obsolescence comes very soon. So, most organisations now tend to buy GPU capacity in the cloud for training phases. It's pointless trying to build in-house AI training farms when GPU hardware can become obsolete within a generation or two."
"Naturally, the key tasks identified fit well with areas of functionality added recently to Pure's storage hardware offer - including its recently launched Key Value Accelerator - and also with its ability to provide capacity on demand. But they also illustrate the key challenges for organisations tackling AI at this stage in its maturity, which has been called a "post-training phase". In this article, we look at what customers need from storage in AI in production phases, and with ongoing ingestion of data"
Most organisations will not train AI models in-house because GPU hardware is costly to buy and becomes obsolete rapidly. Organisations typically acquire GPU capacity in the cloud for training and avoid building in-house training farms. Core AI tasks concentrate on production deployment and inference, with ongoing fine-tuning and curation of data as primary activities. Retrieval-augmented generation (RAG), vector databases, reusable prompts, and co-pilot capabilities support natural-language access to corporate information. Storage requirements include capacity on demand, low-latency key-value acceleration, continuous ingestion pipelines, and infrastructure that supports inference workloads during the post-training phase.
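The RAG pattern the summary mentions can be sketched briefly: embed documents, store the vectors, retrieve the closest matches to a user question, and prepend them as context for the language model. A minimal illustration in plain Python follows, using a toy bag-of-words "embedding" and an in-memory list standing in for a vector database; the documents and function names are invented for illustration, and a production system would use a learned embedding model and a real vector store.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words term counts. Real systems use a
    # learned embedding model that produces dense float vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# A tiny in-memory "vector database" of invented corporate documents.
docs = [
    "GPU capacity is rented in the cloud for model training",
    "storage must support continuous ingestion of new data",
    "inference workloads need low latency key value access",
]
index = [(doc, embed(doc)) for doc in docs]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank stored documents by similarity to the query vector.
    qv = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query: str) -> str:
    # RAG: prepend the retrieved context to the user question before
    # sending the combined prompt to the model (model call omitted).
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("cloud capacity for training"))
```

The same shape carries over to real deployments: the in-memory list becomes a vector database, and `embed` becomes a call to an embedding model, but retrieval-then-prompt remains the core loop.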
#ai-production #retrieval-augmented-generation-rag #vector-databases #gpu-cloud-training #storage-infrastructure
Read at ComputerWeekly.com