OpenAI Secretly Trained GPT-4 With More Than a Million Hours of Transcribed YouTube Videos
OpenAI's latest text-to-video generator, Sora, may have been trained using publicly available and licensed data, including transcribed YouTube videos.
AI companies such as OpenAI and Google are training their models on data of murky provenance that may infringe copyright, prompting lawsuits and accusations of misattribution.
Microsoft Mocks NYT's AI Lawsuit As "Doomsday Futurology"
The New York Times filed a lawsuit against Microsoft and OpenAI over the use of news articles for AI training. Microsoft and OpenAI responded, claiming the lawsuit is without merit and stressing the transformative nature of using content for language models.
Are OpenAI's deals with publishers edging out the competition? | TechCrunch
OpenAI partners with French and Spanish news publishers like Le Monde and Prisa Media to bring their content to ChatGPT users.
The Information reported that OpenAI is offering publishers between $1 million and $5 million a year for access to archives to train its AI models.