"'I think a lot of people think of data labeling as it relates to simplistic work, like labeling cat photos and drawing boundary marks around cars,' Chen told Lenny Rachitsky on his 'Lenny podcast.' Chen, who previously worked at Google, Twitter, and Meta, said that he's 'always hated the word data labeling.' 'Because it just paints this very simplistic picture when I think what we're doing is completely different,' he said."
"Surge AI, which Chen founded in 2020, competes in the AI data labeling space with companies like Scale AI and Mercor. Surge also has a partnership with Anthropic and runs DataAnnotation.tech, where freelancers can sign up to get paid for training AI models. These remote workers are often referred to as 'ghost workers' for their behind-the-scenes labor that is critical to AI's development."
Data labeling requires creative judgment and conveys values; it is not just simple tagging or boundary marking. The work involves teaching models nuances such as creativity, beauty, and ethical subtleties that influence model behavior. Startups in the data labeling space compete with one another while partnering with AI research labs. Their platforms connect freelancers to paid annotation work, and many remote annotators perform essential behind-the-scenes labor often described as ghost work. Annotation choices and workflows can therefore have broad effects on model outputs and on the societal norms those models reflect.
Read at Business Insider