Amazon have reported "hundreds of thousands" of pictures of child sexual abuse material found in shared AI training data... but are refusing to tell regulators which data sets.
If you're using generative AI tools, there's a pretty good chance you're generating imagery with child porn training data behind the scenes.
https://www.bloomberg.com/news/features/2026-01-29/amazon-found-child-sex-abuse-in-ai-training-data
ALT TEXT:
Bloomberg
Amazon Found ‘High Volume’ of Child Sex Abuse Material in AI Training Data.
The tech giant reported hundreds of thousands of cases of Child Sex Abuse Material but won’t say where it came from.
-
@GossiTheDog I’m starting to worry that these insanely powerful black box systems have some flaws.
-
@masek @GossiTheDog But have they plundered Amazon S3 customer data that the customers had set as private?
@imbrium_photography I would not rule it out. But there is already plenty of "not set private but really private" data in open S3 buckets.
A colleague once found the financial data of a large part of a country in such a bucket (plus a copy of their ID card).
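(For anyone wondering whether their own bucket is in that state: it's checkable in a few lines. A minimal sketch, assuming boto3 credentials are configured and using a placeholder bucket name - no substitute for AWS's own Access Analyzer:)

```python
# Check whether one of your own S3 buckets is effectively public.
# "my-bucket" is a hypothetical name.
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
bucket = "my-bucket"

# 1. Bucket-level Block Public Access settings.
try:
    bpa = s3.get_public_access_block(Bucket=bucket)
    print("Block Public Access:", bpa["PublicAccessBlockConfiguration"])
except ClientError as e:
    if e.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
        print("No Block Public Access configuration set!")
    else:
        raise

# 2. An ACL grant to the AllUsers group means anyone on the internet can read.
acl = s3.get_bucket_acl(Bucket=bucket)
for grant in acl["Grants"]:
    if grant["Grantee"].get("URI", "").endswith("/global/AllUsers"):
        print("PUBLIC ACL grant:", grant["Permission"])
```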
-
@GossiTheDog Cool. If you continue to buy from Amazon, read off Kindle, buy from Whole Foods, and obtain AWS certifications, among other Amazon-owned things, YOU ARE SUPPORTING PEDOPHILIA AND PEDOPHILES!
-
@GossiTheDog Sounds very illegal to me, knowing of a crime and keeping info from the law (who this concerns, not some vague ‘regulators’).
-
@GossiTheDog this sounds pretty unbelievable tbh. LAION having "thousands" was a big public thing that forced a re-release of the dataset. Did others just pile on after that was discovered, with no detection algorithms having been run??
Amazon should really publish this information.
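(The detection algorithms in question are usually hash matching against clearinghouse lists. A minimal sketch of the shape, assuming you hold a set of known-bad hashes; real pipelines use perceptual hashes such as PhotoDNA or PDQ that survive resizing and re-encoding, so the exact-match SHA-256 below is a stand-in only:)

```python
# Scan a dataset directory for files whose hash appears on a known-bad
# list supplied by a clearinghouse (NCMEC, IWF). SHA-256 only catches
# byte-identical copies; production systems use perceptual hashes.
import hashlib
from pathlib import Path

def sha256_file(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def scan_dataset(image_dir: str, known_bad: set[str]) -> list[Path]:
    """Return files whose hash appears on the known-bad list."""
    return [p for p in Path(image_dir).rglob("*")
            if p.is_file() and sha256_file(p) in known_bad]
```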
@troed @GossiTheDog plot twist of the year would be if the "dataset" they're talking about turned out to be "any image file uploaded to an S3 bucket between 2022 and today"

-
@GossiTheDog I didn't have "CSAM at scale is unavoidable" on my 2026 bingo card.
-
@imbrium_photography @masek @GossiTheDog I like the word that you have used: "plundered". Private data that was set to private.
-
@GossiTheDog @scottgal they say they're not training on it, that it was detected before training. But that's not the point. Amazon got the stuff from somewhere, and a decent person would report where it came from so that the rozzers can trace it back upstream. I flat out don't believe Amazon's claim not to know where it came from; they must know, because they must have got copyright clearance for making a derivative work from all that content.

@DrHyde @GossiTheDog @scottgal Or plundered data.
-
@GossiTheDog BUT for certain types of AI it obviously would be. THOSE need to exist in a regulated way and be made open source. Like current PII-scrubbing models: it's a public good, but I don't know any commercial company that COULD do it. Orthogonal, sorry, but it just occurred to me... how do you get those models?
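(For the curious: the rule-based baseline those learned PII models improve on fits in a few lines. A minimal sketch, with illustrative and far-from-complete patterns:)

```python
# Rule-based PII scrubbing - the naive baseline that trained PII models
# improve on. Patterns here are illustrative, not exhaustive.
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s()-]{7,}\d"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub(text: str) -> str:
    # Replace every match with its label, e.g. jane@example.com -> [EMAIL].
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(scrub("Mail me at jane.doe@example.com or call +1 555 867 5309"))
```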
-
If you're using ~~generative AI tools~~ applied statistics, there's a pretty good chance you're ~~generating imagery with~~ supporting the distribution of child porn ~~training data behind the scenes~~.
-
@GossiTheDog That article is full of red flags from Amazon. They claim they have a "lower threshold" so they're "overreporting", but they're not providing info on the source of the images?
That sounds like they're trying to break NCMEC's reporting system, through either malice or incompetence.
Also, it sounds like they're not keeping the provenance of the data they're using - which strongly suggests they're not obtaining that data in a legal manner.
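(For what it's worth, "keeping provenance" is not exotic. A minimal sketch of what it could look like, with illustrative field names and a hypothetical manifest file, not any real pipeline's schema:)

```python
# One manifest line per training image, recording where it came from
# and when, so a flagged file can be traced back upstream.
import hashlib
import json
import time

def provenance_record(data: bytes, source_url: str) -> dict:
    return {
        "sha256": hashlib.sha256(data).hexdigest(),  # ties the record to the exact bytes
        "source_url": source_url,                    # where the file was fetched from
        "fetched_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }

with open("manifest.jsonl", "a") as f:
    rec = provenance_record(b"...image bytes...", "https://example.com/img.jpg")
    f.write(json.dumps(rec) + "\n")
```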
-
@GossiTheDog And every one of those pictures has been seen and classified by a minimum-wage worker in the third world so that the user doesn't get to see it (at a predictable cost to said worker's mental health).
-
@GossiTheDog 'is refusing to tell regulators'?
Good luck with that if there are any datasets in the UK. Time for arrests and seizure of machines.
It should be the same in the US, but of course nothing comes before the 'mighty' dollar.
-
The sets are what they stole from billionaires and senators' sons. Even themselves.
What the fuck is wrong with people? No one gets to convince me we ain't the worst disease this planet must suffer.
-
@GossiTheDog I wonder if they found the data by crawling their user storage and don't want to tell anyone about it, to keep the patient money coming.
-
@GossiTheDog "refusing to tell regulators which data sets"
In what world is this not criminal, and why are we living in that one?
-
well there's your Epstein files right there!
@Sassinake @GossiTheDog Scraped from a DoJ server left unsecured by DOGE? Everything's possible with these people
-
@GossiTheDog Famously, generative AI has been hilariously bad at producing a picture of a glass of wine that's anything other than about half full. Ask for one that's full or nearly empty and it can only show you ones that match its training data, where all the glasses show a tasteful measure. And good luck asking for a clock face that doesn't show seven minutes past ten. It just can't extrapolate.
However, ask it what a naked child looks like and it's remarkably good at it. Why? Well, ask the people who tripped CSAM filters by downloading image training data. Dear Elon, why is Grok so good at making child porn? Did you train it on your own kids or ours?
And telling the interface not to show you the filthy kiddie pics that it's gathered is a bit like selling a porn magazine and asking customers not to look at pages 12-27 because you accidentally abused some kids when you made it.
-