Data science
fromInfoWorld
2 days agoWhy 'curate first, annotate smarter' is reshaping computer vision development
Strategic data selection and curation reduce annotation costs and enhance development productivity in computer vision teams.
Every year, poor communication and siloed data bleed companies of productivity and profit. Research shows U.S. businesses lose up to $1.2 trillion annually to ineffective communication, that's about $12,506 per employee per year. This stems from breakdowns that waste an average of 7.47 hours per employee each week on miscommunications. The damage isn't only interpersonal; it's structural. Disconnected and fragmented data systems mean that employees spend around 12 hours per week just searching for information trapped in those silos.
Since AlexNet5, deep learning has replaced heuristic hand-crafted features by unifying feature learning with deep neural networks. Later, Transformers6 and GPT-3 (ref. 1) further advanced sequence learning at scale, unifying structured tasks such as natural language processing. However, multimodal learning, spanning modalities such as images, video and text, has remained fragmented, relying on separate diffusion-based generation or compositional vision-language pipelines with many hand-crafted designs.
What happens under the hood? How is the search engine able to take that simple query, look for images in the billions, trillions of images that are available online? How is it able to find this one or similar photos from all that? Usually, there is an embedding model that is doing this work behind the hood.
The title "data scientist" is quietly disappearing from job postings, internal org charts, and LinkedIn headlines. In its place, roles like "AI engineer," "applied AI engineer," and "machine learning engineer" are becoming the norm. This Data Scientist vs AI Engineer shift raises an important question for practitioners and leaders alike: what actually changes when a data scientist becomes an AI engineer, and what stays the same? More importantly, what skills matter if you want to make this transition intentionally rather than by accident?
Agentic AI workflows sit at the intersection of automation and decision-making. Unlike a standard workflow, where data flows through pre-defined steps, an agentic workflow gives a language model discretion. The model can decide when to act, when to pause, and when to invoke tools like web search, databases, or internal APIs. That flexibility is powerful - but also costly, fragile, and easy to misuse.