#model-performance

from InfoQ
1 week ago

Unsloth Tutorials Aim to Make it Easier to Compare and Fine-tune LLMs

Qwen3-Coder-480B-A35B delivers state-of-the-art results in agentic coding and code tasks, matching or outperforming Claude Sonnet 4, GPT-4.1, and Kimi K2. The 480B model achieves a 61.8% score on Aider Polyglot and supports a 256K-token context, extendable to 1M tokens.
Artificial intelligence
from HackerNoon
4 years ago

Mixture-of-Agents (MoA): Improving LLM Quality through Multi-Agent Collaboration | HackerNoon

The Mixture-of-Agents framework enhances large language model performance through collaboration among specialized models, achieving superior results without massive scaling.
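The collaboration pattern the article describes, several proposer models drafting answers and an aggregator model synthesizing them, can be sketched roughly as follows; the model names and the single aggregation layer are illustrative assumptions, not the framework's exact configuration.

```python
# Rough sketch of a Mixture-of-Agents style pipeline, assuming an
# OpenAI-compatible chat API; model names below are placeholders, and the
# real framework stacks multiple proposer/aggregator layers.
from openai import OpenAI

client = OpenAI()
PROPOSERS = ["proposer-model-a", "proposer-model-b", "proposer-model-c"]  # hypothetical
AGGREGATOR = "aggregator-model"                                           # hypothetical

def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def mixture_of_agents(question: str) -> str:
    # Layer 1: each specialized model answers independently.
    drafts = [ask(m, question) for m in PROPOSERS]
    # Layer 2: an aggregator model synthesizes the drafts into one answer.
    synthesis = (
        f"Question: {question}\n\n"
        + "\n\n".join(f"Candidate answer {i + 1}:\n{d}" for i, d in enumerate(drafts))
        + "\n\nCombine the candidates into a single, higher-quality answer."
    )
    return ask(AGGREGATOR, synthesis)
```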
#ai-models
from TechCrunch
2 weeks ago
Artificial intelligence

Sam Altman addresses 'bumpy' GPT-5 rollout, bringing 4o back, and the 'chart crime' | TechCrunch

#machine-learning
from HackerNoon
1 year ago
Artificial intelligence

The Link Between Concept Frequency and AI Performance, Seen Through Images and Words | HackerNoon

from Medium
1 month ago
Artificial intelligence

Two Indispensable Tools for Measuring the Quality of AI Systems

Artificial intelligence
from HackerNoon
1 year ago

How Dataset Diversity Impacts AI Model Performance | HackerNoon

Pretraining data diversity significantly influences model performance, particularly in generalization and predictive capabilities.
from HackerNoon
6 months ago

Contextualizing SUTRA: Advancements in Multilingual & Efficient LLMs | HackerNoon

Advancements in Large Language Models emphasize the importance of multilingual support to address global linguistic diversity.
#ai-evaluation
from HackerNoon
1 year ago
Artificial intelligence

AI Still Can't Explain a Joke - or a Metaphor - Like a Human Can | HackerNoon

Artificial intelligence
from InfoWorld
4 months ago

Vector Institute aims to clear up confusion about AI model performance

DeepSeek's and OpenAI's o1 models perform strongly in the Vector Institute's evaluations, yet even the best models still struggle with many tasks.
#ai
from InfoQ
2 months ago
Artificial intelligence

Mistral AI Releases Magistral, Its First Reasoning-Focused Language Model

from Computerworld
4 months ago
Artificial intelligence

OpenAI's new models hallucinate more than the old ones

OpenAI's newer models hallucinate more often than the models they replace, producing a higher rate of factual inaccuracies.
from HackerNoon
4 months ago
Artificial intelligence

Reconstruction Evaluations Across Varying Amounts of Training Data: Mindeye2 | HackerNoon

Model performance improves with increased training data, particularly in specialized contexts such as medical AI.
Artificial intelligence
from TechCrunch
2 months ago

DeepSeek may have used Google's Gemini to train its latest model | TechCrunch

DeepSeek's R1 model may have been trained on outputs from Google's Gemini, raising ethical concerns regarding data sourcing.
Scala
from HackerNoon
10 months ago

What Makes Code LLMs Accurate? | HackerNoon

Pass@1 rates on Lua programming tasks show that quantization level impacts model performance, with lower-bit models affected most.
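For reference, pass@1 is usually reported with the unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021); whether the cited study uses this exact estimator or a single-sample pass rate is an assumption here.

```python
# Unbiased pass@k estimator (Chen et al., 2021): generate n samples per task,
# count the c samples that pass the tests, and estimate the probability that
# at least one of k drawn samples passes.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:  # every possible draw of k samples contains a passing one
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 20 Lua completions per task, 7 of which pass the unit tests.
print(pass_at_k(n=20, c=7, k=1))  # 0.35
```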
#quantization
from HackerNoon
10 months ago

The V-Shaped Mystery of Inference Time in Low-Bit Code Models | HackerNoon

Higher precision results in longer inference times, especially for incorrect solutions.
Longer inference times do not guarantee improved performance across different models.
Online learning
from HackerNoon
1 year ago

Fine-tuned GPT-3.5 Performance for Explanatory Feedback | HackerNoon

Fine-tuning GPT-3.5 enhances its ability to identify praise in tutoring responses even with limited data.
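As a rough illustration of the workflow, fine-tuning GPT-3.5 for a praise-labeling task follows the standard OpenAI fine-tuning flow sketched below; the label set and prompt wording are assumptions for illustration, not the study's exact protocol.

```python
# Hedged sketch of fine-tuning gpt-3.5-turbo to label praise in tutor
# responses via the OpenAI fine-tuning API; the JSONL contents shown in the
# comment are illustrative, not the study's actual data.
from openai import OpenAI

client = OpenAI()

# praise_examples.jsonl holds one chat-formatted example per line, e.g.:
# {"messages": [
#   {"role": "system", "content": "Classify the tutor response: effort praise, outcome praise, or no praise."},
#   {"role": "user", "content": "Great job sticking with that tricky problem!"},
#   {"role": "assistant", "content": "effort praise"}]}
training_file = client.files.create(
    file=open("praise_examples.jsonl", "rb"),
    purpose="fine-tune",
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```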
Artificial intelligence
from HackerNoon
4 months ago

How LightCap Sees and Speaks: Mobile Magic in Just 188ms Per Image | HackerNoon

The LightCap model captions images on mobile devices in about 188 ms each, meeting the efficiency demands of practical, real-time applications.
Software development
from InfoQ
3 months ago

Windsurf Launches SWE-1 Family of Models for Software Engineering

Windsurf's SWE-1 models support diverse software engineering tasks while improving performance and user experience.
from HackerNoon
8 months ago

Where Glitch Tokens Hide: Common Patterns in LLM Tokenizer Vocabularies | HackerNoon

The study identifies a pattern of untrained tokens across various model families, revealing inefficiencies in tokenizer design.
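One common heuristic for surfacing candidate untrained ("glitch") tokens is to look for input embeddings with unusually small norms; the sketch below is a simplified version of that idea, not the paper's full detection method, and the model choice is arbitrary.

```python
# Simplified heuristic for spotting candidate under-trained tokens: rank the
# vocabulary by input-embedding norm and inspect the smallest entries. The
# cited study uses more refined indicators; this only illustrates the idea.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # arbitrary example model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

emb = model.get_input_embeddings().weight.detach()  # (vocab_size, hidden_dim)
norms = emb.norm(dim=1)

# Tokens whose embedding norm sits far below the vocabulary median are
# candidates for having been rarely or never updated during pretraining.
for idx in torch.argsort(norms)[:20].tolist():
    print(idx, repr(tok.convert_ids_to_tokens(idx)), round(float(norms[idx]), 4))
```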