LAION has been committed to removing illegal content from its datasets from the very beginning and has implemented appropriate measures to achieve this.
The release of Re-LAION-5B follows a December 2023 investigation by the Stanford Internet Observatory, which found that LAION-5B included at least 1,679 links to illegal images.
LAION's datasets don't, and never did, contain images. Rather, they're indexes of links to images and image alt text that LAION curated.
Re-LAION-5B is available for download in two versions, Re-LAION-5B Research and Re-LAION-5B Research-Safe, both filtered to remove thousands of links to known (and 'likely') CSAM.