#gpu-scaling

Science
from Nature
2 days ago

Breakthrough computer chip tech could help meet 'monumental demand' driven by AI

A new light source enables the creation of 8 nm wide structures on silicon wafers, increasing transistor density for advanced computer chips.
#nvidia
Software development
from Ars Technica
2 days ago

Nvidia rolls out its fix for PC gaming's "compiling shaders" wait times

Nvidia's new Auto Shader Compilation feature allows automatic shader compilation during idle times to reduce load times for PC gamers.
Video games
from Engadget
3 days ago

NVIDIA's DLSS 4.5 Multi Frame Generation tech is now available to boost your Hz

NVIDIA's DLSS 4.5 enhances frame rates on RTX 50 series GPUs, enabling smoother gaming experiences with advanced AI features.
Venture
from 24/7 Wall St.
1 day ago

NVIDIA Just Made Another Big Bet-Are You Still Paying Attention?

Nvidia invested $2 billion in Marvell Technology, continuing its trend of significant investments in the AI sector.
Video games
from Gadgets 360
3 days ago

Nvidia Brings New AI Features With a New DLSS 4.5 Update

Nvidia's DLSS 4.5 update introduces 6X multi-frame generation and dynamic multi-frame generation for enhanced gaming performance.
Video games
from The Verge
3 days ago

Nvidia rolls out DLSS 4.5 update with new frame generation features

Nvidia's DLSS 4.5 update introduces AI-powered frame generation for RTX GPUs, enhancing performance and image quality in over 20 games.
Tech industry
from 24/7 Wall St.
2 days ago

Nvidia vs Broadcom: Which AI Stock Will Make You More Money

Nvidia and Broadcom reported significant AI-driven revenue growth, with Nvidia focusing on GPUs and Broadcom on custom silicon.
Software development
from Techzine Global
1 day ago

Cursor updates its platform with a focus on autonomous AI agents

Cursor 3 enhances software development by integrating AI agents for collaborative coding, reducing manual programming and streamlining workflows.
Scala
from InfoQ
1 day ago

Beyond RAG: Architecting Context-Aware AI Systems with Spring Boot

Context-Augmented Generation (CAG) enhances Retrieval-Augmented Generation (RAG) by managing runtime context for enterprise applications without requiring model retraining.
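The core idea behind a CAG layer can be sketched in a few lines: retrieved passages and live runtime context are assembled into a single prompt at request time, with no model retraining. This is an illustrative sketch only; the function name and prompt layout are made up, not the architecture from the InfoQ article.

```python
# Hypothetical sketch of Context-Augmented Generation (CAG): runtime
# context is merged with retrieved passages before the prompt reaches
# the model. All names here are illustrative.

def assemble_prompt(query, retrieved_docs, runtime_context):
    """Merge retrieved passages with live runtime context into one prompt."""
    context_lines = [f"{k}: {v}" for k, v in sorted(runtime_context.items())]
    doc_lines = [f"[doc {i}] {d}" for i, d in enumerate(retrieved_docs, 1)]
    return "\n".join(
        ["# Runtime context"] + context_lines
        + ["# Retrieved passages"] + doc_lines
        + ["# Question", query]
    )

prompt = assemble_prompt(
    "What is the user's current plan limit?",
    ["Plan limits are defined per tier in the billing service."],
    {"user_id": "u-123", "tier": "pro", "region": "eu-west-1"},
)
```

The point is that the runtime context (user, tier, region) changes per request, while the retrieval step stays unchanged, which is why no retraining is needed.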
Silicon Valley
from Silicon Canals
1 day ago

Frugal AI wants to break the global compute hierarchy before it becomes permanent

The Soliga tribe's speech AI system exemplifies a new, decentralized approach to AI that challenges existing global tech hierarchies.
#openai
Artificial intelligence
from www.businessinsider.com
1 day ago

OpenAI's CFO says the company is passing on opportunities because it does not have enough compute

OpenAI is limiting opportunities due to insufficient computing power, impacting product decisions and prioritization of core AI initiatives.
Artificial intelligence
from Futurism
5 days ago

OpenAI's Obsession With Data Centers Is Running Into Trouble

OpenAI has significantly reduced its AI infrastructure spending plans from $1.4 trillion to $600 billion amid financial pressures and market expectations.
DevOps
from The Register
1 day ago

IBM wants Arm software on its mainframes for AI support

IBM and Arm are collaborating to enhance enterprise systems for AI and data-intensive workloads using Arm chips.
#ai
from www.sitepoint.com
4 days ago
Software development

I Built a Desktop Multi-Agent System That Outperforms Codex and Claude Code

A new open-source project enables the creation of customizable AI swarms for collaborative tasks across various industries.
from The Register
5 days ago
Software development

AI software development: It works, but it's finicky

AI can write code, but expert developers are essential to correct its errors and ensure quality.
Data science
from The Register
2 days ago

TurboQuant is a big deal, but it won't end the memory crunch

TurboQuant is an AI data compression technology that reduces memory usage for KV caches but may not significantly alleviate memory shortages.
Artificial intelligence
from Engadget
4 days ago

Microsoft's research assistant can now use multiple AI models simultaneously

The upgraded Researcher tool combines ChatGPT and Claude models for improved research quality in Microsoft 365 Copilot.
Artificial intelligence
from ZDNET
4 days ago

What Google's TurboQuant can and can't do for AI's spiraling cost

Google's TurboQuant significantly reduces AI memory usage, making AI more efficient and accessible by lowering inference costs.
Gadgets
from ZDNET
4 days ago

Don't ignore your desktop PC's empty M.2 slots - they're more useful than you think

M.2 slots in desktop PCs can be utilized for various upgrades beyond storage, enhancing performance and connectivity.
European startups
from The Register
4 days ago

Rebellions eyes global expansion with rack-scale AI platform

Rebellions raised $400 million to expand globally with AI accelerators and a new compute platform for enterprises and sovereign clouds.
Tech industry
from ComputerWeekly.com
1 day ago

Marvell scales up networking to extend Nvidia AI ecosystem

Marvell Technology joins Nvidia AI ecosystem to enhance infrastructure development with a $2bn investment.
Data science
from InfoWorld
2 days ago

Why 'curate first, annotate smarter' is reshaping computer vision development

Strategic data selection and curation reduce annotation costs and enhance development productivity in computer vision teams.
DevOps
from Techzine Global
3 days ago

Harness adds four capabilities to close AI delivery gap

Harness is launching four new capabilities to enhance its Continuous Delivery platform, addressing the gap between code writing speed and release reliability.
Software development
from Ars Technica
3 days ago

Running local models on Macs gets faster with Ollama's MLX support

Ollama enhances local language model performance on Apple Silicon with MLX support and improved caching, catering to growing interest in local models.
Software development
from ZDNET
3 days ago

How AI has suddenly become much more useful to open-source developers

AI tools are becoming increasingly useful for open-source maintainers, but legal and quality issues remain.
DevOps
from InfoWorld
1 week ago

An architecture for engineering AI context

AI systems must intelligently manage context to ensure accuracy and reliability in real applications.
Graphic design
from Kotaku
2 weeks ago

Nvidia Says DLSS 5 Haters Just Don't Get How The Gen AI Works

Nvidia CEO Jensen Huang defends DLSS 5 generative AI upscaling technology against backlash, asserting developers retain full artistic control through fine-tuning capabilities at the geometry level rather than post-processing.
from The Register
2 weeks ago

Nvidia GTC: We predict an agentic AI enterprise hype fest

Gamers are probably going to feel left out since Nvidia seems to have decided renting cloud rigs to them is better than selling consumer hardware, small companies looking for AI chip compromises will be excited, and agentic AI is gonna be so hot that our Mann on the ground this week in San Jose isn't gonna need a jacket.
Silicon Valley
Gadgets
from The Verge
3 weeks ago

Nvidia's DLSS 4.5 with 6x Frame Generation is rolling out at the end of March

Nvidia launches DLSS 4.5 with 6x Multi Frame Generation on March 31st for RTX 50-series GPUs, enabling generation of five additional frames per natively rendered frame, alongside Dynamic Frame Generation for automatic multiplier adjustment.
#ai-efficiency
Artificial intelligence
from InfoWorld
1 week ago

Google targets AI inference bottlenecks with TurboQuant

TurboQuant improves AI model efficiency by compressing key-value caches, reducing memory usage and runtime without accuracy loss.
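The general mechanism behind KV-cache compression can be shown with a toy per-vector int8 scheme: each cached vector is stored as int8 codes plus one float scale, cutting memory roughly 4x versus float32. This is a generic sketch of the idea only; TurboQuant's actual algorithm is not described in this feed and is certainly more sophisticated.

```python
# Toy per-vector int8 quantization of a KV-cache entry: store int8 codes
# plus one float scale instead of full floats. Illustrative only.

def quantize(vec):
    """Map floats to int8 codes and a single scale factor."""
    scale = max(abs(x) for x in vec) / 127 or 1.0  # avoid zero scale
    q = [round(x / scale) for x in vec]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

kv = [0.02, -1.5, 0.75, 3.0]          # one cached key/value vector
q, s = quantize(kv)
restored = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(kv, restored))  # bounded by s / 2
```

With round-to-nearest, the per-element error never exceeds half the quantization step, which is why modest precision loss can buy a large memory saving.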
Tech industry
from Techzine Global
1 week ago

Arm Launches 136-Core AGI CPU for Data Centers

Arm introduces the Arm AGI CPU, designed for AI data centers with significant performance improvements and capacity requirements.
Video games
from The Verge
2 weeks ago

DLSS 5 looks like a real-time generative AI filter for video games

Nvidia's DLSS 5 uses generative AI to alter game lighting and materials, creating more realistic visuals but raising concerns about artistic intent and AI-generated quality.
#ai-infrastructure
Artificial intelligence
from Techzine Global
2 weeks ago

Dell gives AI Factory an Nvidia Vera Rubin upgrade

Dell's AI Factory has 4,000 customers achieving 2.6x ROI, addressing three critical requirements: data readiness, distributed AI infrastructure, and faster deployment through orchestration and automation.
Artificial intelligence
from Techzine Global
2 months ago

Nvidia Blackwell successor Rubin releases in 2026: significant performance boost

Rubin is a six-chip AI infrastructure platform delivering up to 10× lower cost-per-token and faster training, available via major cloud providers in H2 2026.
Venture
from TechCrunch
3 weeks ago

Thinking Machines Lab inks massive compute deal with Nvidia

Mira Murati's Thinking Machines Lab signed a multi-year strategic partnership with Nvidia involving at least one gigawatt of Vera Rubin systems deployment starting in 2027, with Nvidia also making a strategic investment in the $12 billion-valued AI research company.
Artificial intelligence
from ComputerWeekly.com
2 weeks ago

HPE taps Nvidia to transform distributed AI factories into intelligent AI grid

HPE launches AI Grid infrastructure powered by Nvidia GPUs to enable distributed, low-latency AI inference at edge locations for real-time applications across retail, manufacturing, healthcare, and telecommunications.
Tech industry
from ZDNET
2 weeks ago

Nvidia wants to own your AI data center from end to end

Nvidia expanded its AI infrastructure portfolio with five rack types, including a new LPX inference rack using Groq technology, positioning itself to control all data center processing.
Tech industry
from Techzine Global
2 weeks ago

Cisco and Nvidia lower barrier to secure, full-stack AI infrastructure

Cisco and Nvidia expanded the Cisco Secure AI Factory to deliver a complete, integrated, and secure AI stack enabling faster customer adoption of AI infrastructure.
Gadgets
from Techzine Global
3 weeks ago

AMD is giving its embedded chips 80 TOPS of AI compute

AMD's expanded Ryzen AI Embedded P100 Series delivers up to 12 Zen 5 cores and 80 system TOPS for industrial, robotics, and medical imaging applications with ROCm software support.
Tech industry
from The Register
2 weeks ago

A closer look at Nvidia's Groq-powered LPX rack systems

Nvidia acquired Groq for $20 billion primarily to accelerate time-to-market for SRAM-heavy inference chips rather than develop the technology independently, enabling faster token generation for AI reasoning workloads.
Artificial intelligence
from Medium
1 week ago

Less Compute, More Impact: How Model Quantization Fuels the Next Wave of Agentic AI

Model quantization and architectural optimization can outperform larger models, challenging the belief that more GPUs equal greater intelligence.
Data science
from TechRepublic
1 month ago

Inside the Gas Engine Strategy Powering AI's Next Wave

Gas reciprocating engines are emerging as a critical power solution for AI data centers, with manufacturers like Caterpillar securing multi-gigawatt orders to meet demand that exceeds grid and turbine capacity.
Artificial intelligence
from The Register
1 week ago

Arm rolls its own 136-core AGI CPU to chase AI hype train

Arm has unveiled its first homegrown silicon, the AGI CPU, designed for artificial general intelligence and set for deployment by Meta.
Tech industry
from Computerworld
2 weeks ago

System-level 'coopetition': Why Nvidia's DGX Rubin NVL8 runs on Intel Xeon 6

Nvidia's flagship DGX Rubin NVL8 AI systems use Intel Xeon 6 processors as host CPUs to maintain x86 compatibility and meet enterprise deployment requirements.
Tech industry
from The Register
2 weeks ago

Nvidia slaps Groq into new LPX racks for faster AI response

Nvidia integrates Groq's language processing units into Vera Rubin systems to dramatically accelerate LLM inference, enabling hundreds to thousands of tokens per second per user.
#ryzen-ai-400-series
Gadgets
from Ars Technica
1 month ago

AMD will bring its "Ryzen AI" processors to standard desktop PCs for the first time

AMD's Ryzen AI 400-series desktop processors are repackaged laptop chips with up to 8 CPU cores and Radeon 860M GPUs, targeting business desktops rather than gaming due to high DDR5 memory costs.
from ZDNET
2 months ago
Artificial intelligence

AMD's new Ryzen chipset promises faster performance, better gaming, and smarter AI

Artificial intelligence
from TechCrunch
2 weeks ago

Niv-AI exits stealth to wring more power performance out of GPUs

AI data centers waste significant power due to GPU demand surges, forcing operators to throttle performance by up to 30%, prompting startups like Niv-AI to develop precision power management solutions.
Artificial intelligence
from Computerworld
2 weeks ago

Nvidia NemoClaw promises to run OpenClaw agents securely

Nvidia introduced NemoClaw with OpenShell security features to address OpenClaw's enterprise security vulnerabilities through sandbox isolation and policy enforcement.
#meta
Tech industry
from 24/7 Wall St.
2 weeks ago

Nvidia GPU availability near zero, AI compute demand off the charts

GPU availability is near zero, indicating demand from hyperscalers and enterprises far exceeds supply, validated by Nvidia's 73% revenue growth and 75% data center revenue increase.
Artificial intelligence
from Techzine Global
2 weeks ago

Nvidia's Groq 3 LPU targets agentic AI inference at GTC 2026

Nvidia's acquisition of Groq technology produces the Groq 3 LPU, a specialized inference chip delivering 40 petabytes per second bandwidth, significantly outpacing GPU inference speeds.
from Techzine Global
2 months ago

DAWN supercomputer gets upgrade and swaps Intel for AMD

The British government is investing heavily in the national computing infrastructure. With an additional investment of approximately $49 million, the DAWN supercomputer at the University of Cambridge is being expanded. This is according to Neowin. This expansion will increase the total computing power of the system by a factor of six. The aim is to enable researchers and technology companies to compete more effectively with players from the United States and China.
UK politics
Artificial intelligence
from InfoWorld
3 weeks ago

Nvidia launches Nemotron 3 Super to power enterprise AI agents

Nemotron 3 Super's hybrid architecture combining Mamba and Transformer technologies enables enterprises to run complex AI agents more efficiently with lower costs and faster execution on existing infrastructure.
#ai-agents
Artificial intelligence
from Engadget
3 weeks ago

NVIDIA is reportedly working on its own open-source AI agent platform

NVIDIA is developing NemoClaw, an enterprise-focused open-source AI agent platform designed to work across non-NVIDIA hardware with enhanced security features.
Artificial intelligence
from WIRED
3 weeks ago

Nvidia Is Planning to Launch an Open-Source AI Agent Platform

Nvidia is launching NemoClaw, an open-source AI agent platform enabling enterprise software companies to deploy AI agents for workforce task automation, accessible regardless of chip dependency.
Artificial intelligence
from TNW | Insider
3 weeks ago

NVIDIA is reportedly building an enterprise AI agent platform

Nvidia is developing NemoClaw, an open-source enterprise AI agent platform, and pitching it to major software companies ahead of an official launch.
#intel
Artificial intelligence
from ComputerWeekly.com
4 weeks ago

Edge AI: What's working and what isn't

Edge AI deployment success depends on identifying efficient, narrow use cases with manageable risks rather than pursuing sophisticated, large-scale models across all applications.
#amd
from Engadget
2 months ago

AMD's new Ryzen AI Max+ chips and Ryzen 7 9850X3D court desktop enthusiasts at CES 2026

"Many people in the PC industry said, well, if you want graphics, it's gotta be discrete graphics because otherwise people will think it's bad graphics," Macri said at last year's CES. "What Apple showed was consumers don't care what's inside the box. They actually care what the box looks like. They care about the screen, the keyboard, the mouse. They care about what it does."
Gadgets
Artificial intelligence
from 24/7 Wall St.
1 month ago

3 NVIDIA Storylines That Matter

NVIDIA's Q1 FY2027 guidance explicitly excludes China Data Center revenue, signaling regulatory risks and balance sheet exposure from export controls totaling $95.2 billion in supply commitments.
Artificial intelligence
from 24/7 Wall St.
1 month ago

NVIDIA Cements Its Role as the Backbone of AI Infrastructure

NVIDIA's networking revenue grew 162% year-over-year to $8.2 billion, nearly tripling GPU growth, signaling a shift from chip seller to integrated infrastructure provider selling complete AI data center systems.
Tech industry
from The Register
2 months ago

How Nvidia is using emulation to turn AI FLOPS into FP64

Nvidia achieves higher FP64 throughput through software emulation on Rubin GPUs, trading hardware FP64 for emulated matrix performance up to 200 TFLOPS.
#ryzen-ai-400
Tech industry
from 24/7 Wall St.
2 months ago

Nvidia and Others Just Pulled the Curtain on New Chips

CES 2026 spotlighted physical AI and robotics, showcased Nvidia's Vera and Rubin hardware already in production, and intensified debate over AI-driven market valuations.
from The Register
2 months ago

Unpacking AMD's latest datacenter CPU and GPU announcements

AMD clarified those estimates are based on a comparison between an eight-GPU MI300X node and an MI500 rack system with an unspecified number of GPUs. The math works out to eight MI300Xs that are 1000x less powerful than X-number of MI500Xs. And since we know essentially nothing about the chip besides that it'll ship in 2027, pair TSMC's 2nm process tech with AMD's CDNA 6 compute architecture, and use HBM4e memory, we can't even begin to estimate what that 1000x claim actually means.
Artificial intelligence
from Cointelegraph
2 months ago

What Role Is Left for Decentralized GPU Networks in AI?

What we are beginning to see is that many open-source and other models are becoming compact enough and sufficiently optimized to run very efficiently on consumer GPUs.
Artificial intelligence
from The Register
2 months ago

Nvidia says DGX Spark is now 2.5x faster than at launch

Nvidia's DGX Spark and GB10 systems gain significant software-driven performance improvements and broader software integrations, boosting prefill compute performance for genAI workflows.
Artificial intelligence
from Techzine Global
1 month ago

OpenAI seeks faster alternatives to Nvidia chips

OpenAI seeks alternative inference chips with larger on-chip SRAM to improve response speed for coding and AI-to-AI communication, aiming for about 10% of future inference capacity.
from InfoQ
2 months ago

NVIDIA Dynamo Planner Brings SLO-Driven Automation to Multi-Node LLM Inference

The new capabilities center on two integrated components: the Dynamo Planner Profiler and the SLO-based Dynamo Planner. Together they address the "rate matching" challenge in disaggregated serving, where inference workloads are split so that prefill operations, which process the input context, run on a different GPU pool from the decode operations that generate output tokens. Without the right tools, teams spend considerable time determining the optimal GPU allocation for each phase.
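The "rate matching" problem above can be sketched as a small search: give prefill and decode enough GPUs each that neither stage caps end-to-end requests per second. The throughput numbers below are invented for illustration; Dynamo's actual profiler measures them per model and hardware.

```python
# Toy rate-matching calculation for disaggregated serving: pick a
# prefill/decode GPU split that maximizes the slower stage's throughput.
# All throughput figures are made-up placeholders.

def split_gpus(total_gpus, prefill_tok_s, decode_tok_s,
               prefill_tokens_per_req, decode_tokens_per_req):
    """Return (prefill_gpus, decode_gpus) balancing requests/sec per stage."""
    prefill_rps = prefill_tok_s / prefill_tokens_per_req   # req/s per GPU
    decode_rps = decode_tok_s / decode_tokens_per_req
    # End-to-end throughput is limited by the slower stage; maximize it.
    return max(
        ((p, total_gpus - p) for p in range(1, total_gpus)),
        key=lambda pd: min(pd[0] * prefill_rps, pd[1] * decode_rps),
    )

p, d = split_gpus(total_gpus=8, prefill_tok_s=40_000, decode_tok_s=2_000,
                  prefill_tokens_per_req=2_048, decode_tokens_per_req=256)
```

Because decode generates tokens far more slowly than prefill ingests them, the balanced split here gives decode the larger share of the GPUs.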
Artificial intelligence
from InfoQ
2 months ago

Intel DeepMath Introduces a Smart Architecture to Make LLMs Better at Math

DeepMath uses a Qwen3-4B Thinking agent that emits small Python executors for intermediate math steps, improving accuracy and significantly reducing output length.
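The "emit a small Python executor" pattern can be sketched simply: instead of performing arithmetic token by token in generated text, the model emits a snippet whose result is computed exactly by an interpreter. This is a generic sketch of the pattern, not DeepMath's interface; the function name and builtin whitelist are assumptions.

```python
# Sketch of executing a model-emitted math snippet instead of letting
# the model do arithmetic in-text. Only a small builtin whitelist is
# exposed to keep the executed code confined.

def run_math_step(snippet):
    """Execute a model-emitted snippet and return its `result` variable."""
    allowed = {"sum": sum, "range": range, "len": len, "abs": abs}
    scope = {}
    exec(snippet, {"__builtins__": allowed}, scope)
    return scope["result"]

# Stand-in for model output at one intermediate step:
emitted = "result = sum(k * k for k in range(1, 101))"
value = run_math_step(emitted)   # exact, no token-by-token arithmetic
```

Offloading the step also shortens the output: one snippet replaces a long chain of written-out intermediate calculations, matching the reduced output length the article describes.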
Artificial intelligence
from InfoWorld
2 months ago

Edge AI: The future of AI inference is smarter local compute

Edge AI shifts computation from cloud to devices, enabling low-latency, cost-efficient, and privacy-preserving AI inference while facing performance and ecosystem challenges.
from Techzine Global
2 months ago

Neuromorphic computers prove suitable for supercomputing

Scientists are showing that neuromorphic computers, designed to mimic the human brain, are not only useful for AI, but also for complex computational problems that normally run on supercomputers. This is reported by The Register. Neuromorphic computing differs fundamentally from the classic von Neumann architecture. Instead of a strict separation between memory and processing, these functions are closely intertwined. This limits data transport, a major source of energy consumption in modern computers. The human brain illustrates how efficient such an approach can be.
Artificial intelligence