
"OpenAI is dissatisfied with the speed of Nvidia's AI chips for inference tasks and has been looking for alternatives since last year. The focus is on chips with more built-in memory for faster processing, especially for software development. This is according to Reuters, based on sources. OpenAI is said to have spoken with chip startups Cerebras and Groq about faster inference solutions. Inference is the process by which an AI model such as ChatGPT responds to user queries."
"However, Nvidia signed a $20 billion licensing deal with Groq, which halted talks between OpenAI and Groq. The ChatGPT maker is looking for hardware that will ultimately provide about 10 percent of its future inference computing power. Sources say OpenAI is dissatisfied with the speed at which Nvidia's hardware generates answers to specific problems, including software development and AI-to-AI communication."
"OpenAI's search for alternatives focuses on chips with large amounts of SRAM memory on the same chip. This architecture offers speed advantages for chatbots and other AI systems that process millions of user requests. Inference requires more memory than training because the chip spends relatively more time retrieving data from memory than performing mathematical operations. Within OpenAI, the problem was particularly evident with Codex, the code generation solution. The company is aggressively marketing this tool."
OpenAI is dissatisfied with the response speed of Nvidia's GPUs for inference and has pursued alternative chips since last year. The company prioritizes architectures with large amounts of on-chip SRAM to reduce memory retrieval latency for inference workloads, benefiting chatbots and code generation. OpenAI engaged with Cerebras and Groq for faster inference solutions, but Nvidia's $20 billion licensing deal with Groq halted those talks. OpenAI aims for alternative hardware to supply roughly ten percent of its future inference computing power, with a focus on accelerating software development and AI-to-AI interactions.
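To make the memory-versus-compute point concrete, the sketch below is a rough roofline-style estimate of a single token-generation step. All numbers (model size, bandwidth, throughput) are illustrative assumptions, not figures from the article or from any vendor; the sketch only shows why low-batch decoding tends to be limited by how quickly weights can be fetched from memory, which is the bottleneck that large on-chip SRAM is meant to relieve.

```python
# Back-of-envelope check of why autoregressive inference tends to be
# memory-bandwidth-bound rather than compute-bound. Every number below is an
# illustrative assumption, not a figure from the article or any vendor.

MODEL_PARAMS = 70e9      # assumed dense model size (parameters)
BYTES_PER_PARAM = 2      # fp16/bf16 weights
BATCH_SIZE = 1           # single-user decode step

# Each generated token reads every weight once and does ~2 FLOPs per weight
# (multiply + add), ignoring the KV cache and attention for simplicity.
bytes_moved = MODEL_PARAMS * BYTES_PER_PARAM
flops = 2 * MODEL_PARAMS * BATCH_SIZE

# Arithmetic intensity: FLOPs performed per byte fetched from memory.
intensity = flops / bytes_moved  # ~1 FLOP/byte at batch size 1

# Hypothetical GPU-style accelerator fed from off-chip HBM.
hbm_bandwidth = 3.0e12   # 3 TB/s, illustrative
hbm_compute = 1.0e15     # 1 PFLOP/s dense fp16, illustrative
hbm_breakeven = hbm_compute / hbm_bandwidth   # FLOPs/byte needed to stay compute-bound

# Hypothetical accelerator that keeps weights in on-chip SRAM.
sram_bandwidth = 80e12   # tens of TB/s on-chip, illustrative
sram_compute = 1.0e15
sram_breakeven = sram_compute / sram_bandwidth

print(f"decode arithmetic intensity : {intensity:.1f} FLOPs/byte")
print(f"HBM chip is compute-bound only above  {hbm_breakeven:.0f} FLOPs/byte")
print(f"SRAM chip is compute-bound only above {sram_breakeven:.0f} FLOPs/byte")

# Per-token latency is set by whichever resource is slower.
hbm_token_time = max(bytes_moved / hbm_bandwidth, flops / hbm_compute)
sram_token_time = max(bytes_moved / sram_bandwidth, flops / sram_compute)
print(f"HBM  : ~{hbm_token_time * 1e3:.1f} ms/token")
print(f"SRAM : ~{sram_token_time * 1e3:.2f} ms/token")
```

Under these assumed numbers, per-token time on the HBM-style chip is set almost entirely by memory bandwidth, which is why moving weights into on-chip SRAM can speed up response generation even when raw compute throughput is unchanged.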
Read at Techzine Global