Data science
from InfoWorld
Addressing the challenges of unstructured data governance for AI
Enterprises must enhance data governance for unstructured data as AI transforms data management practices.
Meta is working on two proprietary frontier models: Avocado, a large language model, and Mango, a multimedia file generator. The open-source variants are expected to be made available at a later date.
Buyers no longer open ten tabs, skim through blog posts, and slowly form an opinion over weeks. Instead, they ask a single question to an AI system and receive a shortlist in return, usually two or three companies that feel familiar, credible, and safe enough to justify internally. That shortlist often becomes the entire market in the buyer's mind.
If you want to narrow your options down to bags suitable for a trip to Portland, Oregon, in May, AI Mode will start a query fan-out, which means it runs several simultaneous searches to figure out what makes a bag good for rainy weather and long journeys, and then uses those criteria to suggest waterproof options with easy-access pockets.
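The fan-out pattern described above can be sketched in a few lines: derive several sub-queries from the original question, run them in parallel, and merge the results so items matching the most criteria rank first. Everything here is hypothetical — the sub-queries, product names, scores, and the stub index stand in for a real search backend, and this is not Google's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sub-queries a fan-out might derive from the original question.
SUB_QUERIES = [
    "waterproof travel bags",
    "bags with easy-access pockets",
    "carry-on bags for long journeys",
]

# Stand-in for a real search backend; maps a sub-query to (product, score) pairs.
FAKE_INDEX = {
    "waterproof travel bags": [("RainShell 40L", 0.9), ("DryPack 30L", 0.8)],
    "bags with easy-access pockets": [("RainShell 40L", 0.7), ("QuickZip 25L", 0.9)],
    "carry-on bags for long journeys": [("DryPack 30L", 0.85), ("RainShell 40L", 0.6)],
}

def run_search(query):
    """Pretend to hit a search index for one sub-query."""
    return FAKE_INDEX.get(query, [])

def fan_out(sub_queries):
    """Run all sub-queries in parallel and merge scores per product."""
    merged = {}
    with ThreadPoolExecutor() as pool:
        for results in pool.map(run_search, sub_queries):
            for product, score in results:
                merged[product] = merged.get(product, 0.0) + score
    # Products that satisfy several criteria accumulate score and rank highest.
    return sorted(merged, key=merged.get, reverse=True)

print(fan_out(SUB_QUERIES))  # → ['RainShell 40L', 'DryPack 30L', 'QuickZip 25L']
```

The shortlist effect falls out of the merge step: an option that appears under all three criteria dominates options that excel at only one.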
That was a year or so ago, and my first brush with what generative AI could do. Like many, I started using it for fun: planning trips, finding nineteenth-century authors I could recommend to fantasy-loving students (a genre I don't read), and making a holiday card starring my dog, Harry. But as work piled up, I didn't have time for new toys, so now I use AI for work.
Generative AI is now incorporated into the workflow for many scholars across many disciplines, but the broader scientific community would benefit from taking stock of how this technology could truly benefit our work and how it might distract. We hope the symposium can provide clarity.
What happens under the hood? How can a search engine take that simple query and sift through the billions, even trillions, of images available online to find that one photo or others like it? Usually, an embedding model is doing this work behind the scenes.
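The core idea can be shown with a toy sketch: an embedding model maps each image (and the query) to a vector, and retrieval is just ranking by vector similarity. The 4-dimensional vectors and file names below are invented for illustration — real embeddings have hundreds of dimensions, and web-scale systems use approximate nearest-neighbor indexes rather than a brute-force scan.

```python
import math

# Toy 4-dimensional embeddings; a real model produces hundreds of dimensions.
IMAGE_INDEX = {
    "beach_sunset.jpg": [0.9, 0.1, 0.0, 0.2],
    "city_night.jpg":   [0.1, 0.8, 0.3, 0.0],
    "dog_park.jpg":     [0.2, 0.1, 0.9, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_embedding, index, k=2):
    """Return the k images whose embeddings are closest to the query."""
    ranked = sorted(index, key=lambda name: cosine(query_embedding, index[name]),
                    reverse=True)
    return ranked[:k]

# A query embedding that happens to lie close to the sunset photo.
print(search([0.85, 0.15, 0.05, 0.1], IMAGE_INDEX))
```

Brute-force cosine ranking is exact but linear in the index size; that is why production systems trade a little accuracy for approximate indexes.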
OpenAI has released Open Responses, an open specification to standardize agentic AI workflows and reduce API fragmentation. Supported by partners such as Hugging Face, Vercel, and local inference providers, the spec introduces unified standards for agentic loops, reasoning visibility, and internal versus external tool execution. It aims to let developers switch between proprietary and open-source models without rewriting integration code.
By comparing how AI models and humans map these words to numerical percentages, we uncovered significant gaps between humans and large language models. While the models do tend to agree with humans on extremes like 'impossible,' they diverge sharply on hedge words like 'maybe.' For example, a model might use the word 'likely' to represent an 80% probability, while a human reader assumes it means closer to 65%.
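The kind of gap the comparison surfaces is easy to make concrete. The numbers below are hypothetical placeholders, not the study's data: each table maps a hedge word to the percentage a model (or a human reader) assigns it, and the divergence is the absolute difference in percentage points.

```python
# Hypothetical calibration tables -- neither column reproduces the study's data.
MODEL_PCT = {"impossible": 2, "maybe": 50, "likely": 80, "certain": 99}
HUMAN_PCT = {"impossible": 3, "maybe": 40, "likely": 65, "certain": 97}

def divergence(model, human):
    """Absolute model-vs-human gap, in percentage points, per hedge word."""
    return {word: abs(model[word] - human[word]) for word in model}

gaps = divergence(MODEL_PCT, HUMAN_PCT)
# Extremes like "impossible" nearly agree; mid-scale hedges diverge most.
print(max(gaps, key=gaps.get))  # → likely
```

With these placeholder numbers, "likely" shows the largest gap (15 points) while "impossible" differs by only 1, mirroring the pattern the excerpt describes: agreement at the extremes, divergence in the middle.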