Amazon has reported "hundreds of thousands" of pictures of child sexual abuse material found in shared AI training data... but is refusing to tell regulators which data sets.
If you're using generative AI tools, there's a pretty good chance you're generating imagery with child porn training data behind the scenes.
https://www.bloomberg.com/news/features/2026-01-29/amazon-found-child-sex-abuse-in-ai-training-data
-
@GossiTheDog Local models are getting good enough now (and uncensored) to make this trivial even for the inept pervert. Pandora's personal paedophilia producers' box is already open, sadly.
@scottgal that doesn't mean using child sexual abuse material images to train AI is okay.
-
@GossiTheDog this sounds pretty unbelievable tbh. LAION having "thousands" was a big public thing forcing re-release of the dataset. Others just piling on after this was discovered with no detection algorithms having been used??
Amazon should really publish this information.
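For a rough sense of what the detection step could look like: a minimal Python sketch of perceptual-hash screening against a blocklist, in the spirit of PhotoDNA-style matching. The blocklist file, threshold, and paths are hypothetical stand-ins; real hash lists (e.g. from NCMEC) are not publicly distributable.

```python
# Sketch: screening a local image dump against a perceptual-hash blocklist.
# "blocklist.txt", the threshold, and the dataset path are hypothetical.
from pathlib import Path

import imagehash           # pip install imagehash
from PIL import Image      # pip install Pillow

THRESHOLD = 8  # max Hamming distance to count as a match (assumed value)

def load_blocklist(path: str) -> list[imagehash.ImageHash]:
    """One hex-encoded pHash per line."""
    return [imagehash.hex_to_hash(line.strip())
            for line in Path(path).read_text().splitlines() if line.strip()]

def flagged(image_path: Path, blocklist: list[imagehash.ImageHash]) -> bool:
    """True if the image is within THRESHOLD bits of any blocklisted hash."""
    h = imagehash.phash(Image.open(image_path))
    return any(h - banned <= THRESHOLD for banned in blocklist)

blocklist = load_blocklist("blocklist.txt")
for img in Path("dataset/").glob("**/*.jpg"):
    if flagged(img, blocklist):
        print(f"match, quarantine and report: {img}")
```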
-
@GossiTheDog reminder that recursive pollution remains a HUGE open problem with ML models.
https://berryvilleiml.com/2026/01/10/recursive-pollution-and-model-collapse-are-not-the-same/
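A back-of-envelope illustration of recursive pollution: model output (some fraction of it wrong) gets scraped back into the training pool, so the polluted share compounds generation over generation. All the rates below are made-up illustration parameters, not measurements.

```python
# Toy simulation of recursive pollution: each generation, model-generated
# items (some wrong) are scraped back into the training pool alongside the
# fixed pool of human-authored data. All numbers are illustrative only.

HUMAN_ITEMS = 1_000_000    # human-authored training items (fixed)
SCRAPED_PER_GEN = 500_000  # model outputs scraped back in each generation
ERROR_RATE = 0.05          # chance an output is wrong even on clean data

pool_total, pool_bad = HUMAN_ITEMS, 0.0
for gen in range(1, 11):
    # A model trained on a polluted pool reproduces that pollution in its
    # outputs, and adds fresh errors of its own on top.
    bad_fraction = pool_bad / pool_total
    output_bad = SCRAPED_PER_GEN * (bad_fraction + (1 - bad_fraction) * ERROR_RATE)
    pool_bad += output_bad
    pool_total += SCRAPED_PER_GEN
    print(f"gen {gen:2d}: {pool_bad / pool_total:6.1%} of training pool polluted")
```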
-
@GossiTheDog @scottgal they say they're not training on it, it was detected before training. But that's not the point. Amazon got the stuff from somewhere, and a decent person would report where it came from so that the rozzers can trace it back upstream. I flat out don't believe Amazon's claim not to know where it came from; they must know, because they must have obtained copyright clearance for making a derivative work from all that content.
-
@GossiTheDog Can’t read the article so this is speculation: Amazon admitted having lots of CSAM but refuses to tell where they downloaded it from? I thought holding on to CSAM is a crime in itself, but as usual rules do not apply to big tech. And where did the material come from? Secret access to customer data they refuse to disclose?
-
@GossiTheDog AI = CSAM
-
@GossiTheDog wasn't that confirmed to be the case years ago when all this AI bullshit started? Like even if you just scrape the clear web you'll likely scrape some of that shit.
-
As an aside, Microsoft had a publicly reported security incident a year or so ago where petabytes of data were left in a public Azure Storage Blob.
What they didn't say: those petabytes were customer photos of animals they'd classified and taken for AI work; 'twas some grads just exporting stuff. Good job everybody is preaching about Responsible AI(tm).
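For context, a minimal sketch of how one might check whether an Azure Storage container allows anonymous listing, which is the misconfiguration behind incidents like this. The account and container names are hypothetical placeholders.

```python
# Sketch: probing an Azure Storage container for anonymous listability.
# "exampleaccount" and "examplecontainer" are hypothetical placeholders.
from azure.core.exceptions import HttpResponseError
from azure.storage.blob import ContainerClient  # pip install azure-storage-blob

client = ContainerClient(
    account_url="https://exampleaccount.blob.core.windows.net",
    container_name="examplecontainer",
    credential=None,  # no credential: only works if public access is enabled
)

try:
    for blob in client.list_blobs():
        print(f"publicly listable: {blob.name} ({blob.size} bytes)")
except HttpResponseError:
    print("container does not permit anonymous listing")
```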
-
@DrHyde @GossiTheDog Oh yeah, I get that, sorry. What I don't understand is the ramifications of their possession, and of the originator's (presumably continued) possession, of now-identified CSAM material... which means they would be legally required to remove and report the user.
NO IDEA how they wouldn't have ANY moral qualms about NOT doing that, never mind what should be OBVIOUS legal liability (but corps are 'special' etc...)!
-
@GossiTheDog BUT certain types of AI obviously would be okay. THOSE need to exist in a regulated way and be made open source. Like current PII-scrubbing models: it's a public good, but I don't know any commercial company who COULD do it. Orthogonal, sorry, but it just occurred to me... how do you get those models?
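On that orthogonal question: a toy regex-based scrub shows the basic idea, though it's nothing like a real model. Production scrubbers (e.g. Microsoft Presidio) use trained NER models precisely because regexes miss names, addresses, and anything context-dependent.

```python
# Minimal regex-based PII scrub, just to illustrate the idea; real scrubbers
# use NER models because regexes can't catch context-dependent identifiers.
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "IPV4":  re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def scrub(text: str) -> str:
    """Replace each match with a [LABEL] placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(scrub("Contact jane.doe@example.com or +44 20 7946 0958 from 10.0.0.1"))
# -> Contact [EMAIL] or [PHONE] from [IPV4]
```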
-
In my country, the abbreviation CP only means cerebral palsy.
In other words, the GenAI industry is completely CP-damaged.
RE: https://cyberplace.social/@GossiTheDog/115978385132170439
-
@GossiTheDog
Another headline here might be "Amazon admits in public to possessing a huge volume of child pornography".
-
What? Hand curation of trillions of images didn't work?
I'm shocked, I tells ya, shocked!
-
@GossiTheDog I would expect that they harvest open (no auth, indexable) S3 buckets for AI training.
And you probably know what you can find there...
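To make that concrete: a minimal boto3 sketch that lists a public bucket with no credentials at all, which is exactly what makes open buckets trivially scrapeable. The bucket name is a hypothetical placeholder.

```python
# Sketch: anonymously listing a public S3 bucket. "examplebucket" is a
# hypothetical placeholder; this only works if the owner left listing open.
import boto3                          # pip install boto3
from botocore import UNSIGNED
from botocore.config import Config
from botocore.exceptions import ClientError

# UNSIGNED sends no credentials at all, so anything readable here is
# readable by the entire internet.
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

try:
    resp = s3.list_objects_v2(Bucket="examplebucket", MaxKeys=20)
    for obj in resp.get("Contents", []):
        print(f"{obj['Key']} ({obj['Size']} bytes)")
except ClientError:
    print("bucket is not publicly listable")
```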
-
@GossiTheDog@cyberplace.social Sounds like police should be arresting and charging people at Amazon, then.
-
@masek @GossiTheDog But have they plundered Amazon S3 customer data that the customers had set as private?
-
well there's your Epstein files right there!
-
@GossiTheDog I’m starting to worry that these insanely powerful black box systems have some flaws
-
@imbrium_photography I would not rule it out. But there is already plenty of "not set private but really private" data in open S3 buckets.
A colleague once found the financial data of a large part of a country in such a bucket (plus copies of ID cards).