#llm-scaling

Artificial intelligence
from ZDNET
1 week ago

How DeepSeek's new way to train advanced AI models could disrupt everything - again

Manifold-Constrained Hyper-Connections (mHCs) promise a low-cost method to scale large language models; DeepSeek delayed R2 due to performance and chip-access concerns.
Artificial intelligence
from Real Python
4 months ago

Episode #264: Large Language Models on the Edge of the Scaling Laws - The Real Python Podcast

LLM scaling is hitting diminishing returns; benchmarks are often flawed, and developer productivity gains from these models remain modest amid economically driven hiring shifts.