Cohere's Transcribe model is designed for tasks like note-taking and speech analysis, supporting 14 languages and optimized for consumer-grade GPUs, making it accessible for self-hosting.
I'm one of those authors whose books AI ate for lunch a few years back. At some point I might get a check to pay me for a dozen years' work on the three books it stole, but really, there's no way to compensate for the fallout. AI seems to think no, it can't think, only shuffle what real people thought that a machine can write as well as a person can.
For the first time, speech has been decoupled from consequence. We now live alongside AI systems that converse knowledgeably and persuasively-deploying claims about the world, explanations, advice, encouragement, apologies, and promises-while bearing no vulnerability for what they say. Millions of people already rely on chatbots powered by large language models, and have integrated these synthetic interlocutors into their personal and professional lives. An LLM's words shape our beliefs, decisions, and actions, yet no speaker stands behind them.
By comparing how AI models and humans map these words to numerical percentages, we uncovered significant gaps between humans and large language models. While the models do tend to agree with humans on extremes like 'impossible,' they diverge sharply on hedge words like 'maybe.' For example, a model might use the word 'likely' to represent an 80% probability, while a human reader assumes it means closer to 65%.
Every year, poor communication and siloed data bleed companies of productivity and profit. Research shows U.S. businesses lose up to $1.2 trillion annually to ineffective communication, that's about $12,506 per employee per year. This stems from breakdowns that waste an average of 7.47 hours per employee each week on miscommunications. The damage isn't only interpersonal; it's structural. Disconnected and fragmented data systems mean that employees spend around 12 hours per week just searching for information trapped in those silos.
Anthropic has released a new version of its mid-size Sonnet model, keeping pace with the company's four-month update cycle. In a post announcing the new model, Anthropic emphasized improvements in coding, instruction-following, and computer use. Sonnet 4.6 will be the default model for Free and Pro plan users. The beta release of Sonnet 4.6 will include a context window of 1 million tokens, twice the size of the largest window previously available for Sonnet.
Semantic ablation is the algorithmic erosion of high-entropy information. Technically, it is not a "bug" but a structural byproduct of greedy decoding and RLHF (reinforcement learning from human feedback). During "refinement," the model gravitates toward the center of the Gaussian distribution, discarding "tail" data - the rare, precise, and complex tokens - to maximize statistical probability. Developers have exacerbated this through aggressive "safety" and "helpfulness" tuning, which deliberately penalizes unconventional linguistic friction.
A major difference between LLMs and LTMs is the type of data they're able to synthesize and use. LLMs use unstructured data-think text, social media posts, emails, etc. LTMs, on the other hand, can extract information or insights from structured data, which could be contained in tables, for instance. Since many enterprises rely on structured data, often contained in spreadsheets, to run their operations, LTMs could have an immediate use case for many organizations.
What happens under the hood? How is the search engine able to take that simple query, look for images in the billions, trillions of images that are available online? How is it able to find this one or similar photos from all that? Usually, there is an embedding model that is doing this work behind the hood.
The dataset was created by translating non-English content from the FineWeb2 corpus into English using Gemma3 27B, with the full data generation pipeline designed to be reproducible and publicly documented. The dataset is primarily intended to improve machine translation, particularly in the English→X direction, where performance remains weaker for many lower-resource languages. By starting from text originally written in non-English languages and translating it into English, FineTranslations provides large-scale parallel data suitable for fine-tuning existing translation models.
OpenAI has released Open Responses, an open specification to standardize agentic AI workflows and reduce API fragmentation. Supported by partners like Hugging Face and Vercel and local inference providers, the spec introduces unified standards for agentic loops, reasoning visibility, and internal versus external tool execution. It aims to enable developers to easily switch between proprietary models and open-source models without rewriting integration code.
Since AlexNet5, deep learning has replaced heuristic hand-crafted features by unifying feature learning with deep neural networks. Later, Transformers6 and GPT-3 (ref. 1) further advanced sequence learning at scale, unifying structured tasks such as natural language processing. However, multimodal learning, spanning modalities such as images, video and text, has remained fragmented, relying on separate diffusion-based generation or compositional vision-language pipelines with many hand-crafted designs.
OpenAI's GPT-5.2 Pro does better at solving sophisticated math problems than older versions of the company's top large language model, according to a new study by Epoch AI, a non-profit research institute.
For this test, we're comparing the default models that both OpenAI and Google present to users who don't pay for a regular subscription- ChatGPT 5.2 for OpenAI and Gemini 3.2 Fast for Google. While other models might be more powerful, we felt this test best recreates the AI experience as it would work for the vast majority of Siri users, who don't pay to subscribe to either company's services.
Drawing on more than 22,000 LLM prompts designed to reflect the kind of questions people would ask artificial intelligence (AI)-powered chatbots, such as, "How do I apply for universal credit?", the data raises concerns about whether chatbots can be trusted to give accurate information about government services. The publication of the research follows the UK government's announcement of partnerships with Meta and Anthropic at the end of January 2026 to develop AI-powered assistants for navigating public services.
This process, becoming aware of something not working and then changing what you're doing, is the essence of metacognition, or thinking about thinking. It's your brain monitoring its own thinking, recognizing a problem, and controlling or adjusting your approach. In fact, metacognition is fundamental to human intelligence and, until recently, has been understudied in artificial intelligence systems. My colleagues Charles Courchaine, Hefei Qiu, Joshua Iacoboni, and I are working to change that.
process AI is the integration of AI and ML (with optional natural language processing (NLP) and computer vision, including optical character recognition (OCR) in one platform) into business workflows with the aim of automating tasks that need and require human-like judgment. Also straightforward to define, document AI (occasionally known as intelligent document processing) is a set of technologies designed to enable enterprise applications to ingest, interpret and contextually understand documents with human-like judgment.
But tiny 30-person startup Arcee AI disagrees. The company just released a truly and permanently open (Apache license) general-purpose, foundation model called Trinity, and Arcee claims that at 400B parameters, it is among the largest open-source foundation models ever trained and released by a U.S. company. Arcee says Trinity compares to Meta's Llama 4 Maverick 400B, and Z.ai GLM-4.5, a high-performing open-source model from China's Tsinghua University, according to benchmark tests conducted using base models (very little post training).