#eleutherai

VentureBeat
8 months ago
Artificial intelligence

One of the world's largest AI training datasets is about to get bigger and 'substantially better'

EleutherAI, the organization that created the diverse text corpus known as the Pile, became a target of legal and ethical concerns over the use of AI training datasets.
Despite facing lawsuits, EleutherAI is collaborating with multiple organizations to build an updated version of the Pile that is expected to be bigger and 'substantially better'.
ReadWrite
2 months ago
Artificial intelligence

Apple denies using YouTube content to train Apple Intelligence

Apple denies using EleutherAI's unethically sourced 'Pile' dataset for Apple Intelligence, but confirms using it for its OpenELM models.
EleutherAI scrapes the web for datasets such as YouTube captions to democratize AI research and lower the barrier to entry for firms.
Apple's OpenELM was created for research and does not power Apple Intelligence; there are no plans to expand it.