#ai-performance

#qualcomm
#openai
fromZDNET
2 weeks ago
Artificial intelligence

I tested GPT-5's coding skills, and it was so bad that I'm sticking with GPT-4o (for now)

fromTechCrunch
4 months ago
Artificial intelligence

OpenAI's o3 AI model scores lower on a benchmark than the company initially implied | TechCrunch

fromTechCrunch
1 month ago

A new AI coding challenge just published its first results - and they aren't pretty | TechCrunch

A Brazilian engineer won the K Prize AI coding challenge with only 7.5% correct answers.
fromTechzine Global
1 month ago

Thinking too long makes AI models dumber

Claude models showed a notable sensitivity to irrelevant information during evaluation, leading to declining accuracy as reasoning length increased. OpenAI's models, in contrast, fixated on familiar problems.
Artificial intelligence
fromIT Pro
1 month ago

Dedicated servers are back in vogue as IT leaders scramble to meet AI, compliance requirements

Organizations are increasingly migrating workloads from public cloud to dedicated servers, highlighting a significant revival in their use. AI performance requirements drive this trend.
fromZDNET
1 month ago

5 ways to be a great AI agent manager, according to business leaders

Antony Hausdoerfer emphasized that successful AI managers must ensure AI agents deliver trusted value and safe operations, focusing on applications that yield meaningful outcomes.
Business
fromHackernoon
1 year ago

phi-3-mini's Triumph: Redefining Performance on Academic LLM Benchmarks | HackerNoon

The results for phi-3-mini on standard open-source benchmarks measure the model's reasoning ability, comparing it to phi-2 and several other notable models.
Artificial intelligence
fromTheregister
1 month ago

Microsoft Copilot falls to Atari 2600 Video Chess

Copilot struggled to beat Atari 2600 Video Chess despite confidence in its chess capabilities.
fromGadgets 360
3 months ago

Should You Buy an AI PC? In Conversation With Asus' Arnold Su

"I think it's a positive signal to Asus and to the gaming industry that end users are really eagerly waiting for the new graphics card and new chassis coming to the market," said Arnold Su.
Artificial intelligence
fromIT Pro
3 months ago

Acer's new Swift Edge 14 AI is a Copilot+ MacBook Air killer

The Swift Edge 14 AI Copilot+ PC is one of the lightest devices in its category, weighing only 0.99kg, making it a highly portable laptop choice.
Apple
fromTechzine Global
3 months ago

Microsoft expands fine-tuning capabilities in Azure AI Foundry

Reinforcement Fine-Tuning (RFT) is a new method that uses chain-of-thought reasoning and task-specific evaluation to improve model performance in specific application domains.
Artificial intelligence
fromInfoWorld
4 months ago

Learning how to measure genAI's impact

AI model improvements are often difficult to quantify accurately.
Smaller language models may outperform larger ones in practical applications.
The debate on AGI misdefines human intelligence benchmarks.
fromZDNET
4 months ago

OpenAI's Deep Research has more fact-finding stamina than you, but it's still wrong half the time

OpenAI's Deep Research technology surpasses other models and humans in web searches, but still fails nearly half the time.
fromTechCrunch
4 months ago

Meta's vanilla Maverick AI model ranks below rivals on a popular chat benchmark | TechCrunch

The incident prompted the maintainers of LM Arena to apologize, change their policies, and score the unmodified, vanilla Maverick.
Artificial intelligence