OpenAI Secretly Trained GPT-4 With More Than a Million Hours of Transcribed YouTube Videos
OpenAI's latest text-to-video generator, Sora, may have been trained using publicly available and licensed data, including transcribed YouTube videos.
AI companies like OpenAI and Google are using murky and potentially copyright-infringing data to train their models, leading to lawsuits and accusations of misattributing practices. [ more ]
Microsoft Mocks NYT's AI Lawsuit As "Doomsday Futurology"
The New York Times filed a lawsuit against Microsoft and OpenAI over the use of news articles for AI training. Microsoft and OpenAI responded, claiming the lawsuit is without merit and stressing the transformative nature of using content for language models. [ more ]