We believe this practice is still lawful when collecting training data for generative AI, but the question of whether something should be illegal is different from whether it may be considered rude, gauche, or unpleasant.
Today, both academic and for-profit researchers collect training data for AI using bots that crawl the web and 'scrape up', or store, the content of each site they come across.