In my earlier post I said I am seeing an onslaught of web scrapers running through my web sites, far more than in the past. They scoop up everything: posts, uploaded images, links.

It just occurred to me that this sudden increase in crawlers and scrapers might be AI systems that are harvesting immense amounts of text for training their language systems.

In effect, the AI systems are engaged in global intellectual property theft. That thought is not original to me; others have proposed it.

This news report raises a related question: who owns the data that AI systems use for training?

Coldstreams