DeepSeek Open-Sources DeepSeek-V3, a 671B Parameter Mixture of Experts LLM
DeepSeek-V3 is an open-source MoE LLM with 671 billion parameters that achieves superior performance. It improves training efficiency through advances in load balancing and mixed-precision training.
AI benchmarking organization criticized for waiting to disclose funding from OpenAI | TechCrunch
Transparency concerns have arisen over the delayed disclosure of OpenAI's funding for an AI math benchmark developed by Epoch AI.
Will Smith eating spaghetti and other weird AI benchmarks that took off in 2024 | TechCrunch
Bizarre benchmarks, such as AI-generated videos of Will Smith, resonate more with the public than traditional academic measures.
Evaluating Generative AI: The Evolution Beyond Public Benchmarks
Evaluating generative AI requires a shift from public benchmarks to task-specific evaluations that better indicate performance.
Anthropic looks to fund a new, more comprehensive generation of AI benchmarks | TechCrunch
Anthropic is launching a program to fund the development of new AI benchmarks for evaluating models, with a focus on safety and societal impact.
The AI industry is obsessed with Chatbot Arena, but it might not be the best benchmark | TechCrunch
Chatbot Arena has emerged as a crucial platform for evaluating AI models, emphasizing real-world user preferences over traditional benchmarks.
AI training data has a price tag that only Big Tech can afford | TechCrunch
Training data, more than design or architecture, is the key to sophisticated AI systems.
Meta releases Llama 3, claims it's among the best open models available | TechCrunch
The Llama 3 models are a significant advancement, with high parameter counts leading to improved generative AI performance.
AI has hit human-level performance on some parameters: Stanford report
Closed-source AI models outperform their open-source counterparts by 24.2% on select benchmarks.
Anthropic claims its latest model is best-in-class | TechCrunch
Anthropic's Claude 3.5 Sonnet is a higher-performing AI model focused on efficiency, particularly in text and image analysis.