Amazon have reported "hundreds of thousands" of pictures of child sexual abuse material found in shared AI training data... but are refusing to tell regulators which data sets.
If you're using generative AI tools, there's a pretty good chance you're generating imagery with child porn training data behind the scenes.
https://www.bloomberg.com/news/features/2026-01-29/amazon-found-child-sex-abuse-in-ai-training-data
-
As an aside, Microsoft had a publicly reported security incident a year or so ago where petabytes of data were left in a public Azure Storage Blob.
What they didn't say: those petabytes of data were customer photos of animals they'd classified and taken for AI work; 'twas some grads just exporting stuff. Good job everybody is preaching about Responsible AI(tm).
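An aside on the aside: that class of exposure is auditable from the tenant side. A minimal sketch, assuming the azure-storage-blob and azure-identity SDKs and a hypothetical storage account name, that flags any container with public access switched on:

# pip install azure-storage-blob azure-identity
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

# hypothetical account URL; walk every container and report public ones
svc = BlobServiceClient("https://examplestorage.blob.core.windows.net",
                        credential=DefaultAzureCredential())
for container in svc.list_containers():
    if container.public_access:  # None means private
        print(f"{container.name}: PUBLIC ({container.public_access})")

Anonymous access also has to be allowed at the account level, so a clean run here is necessary but not sufficient.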
-
@GossiTheDog @scottgal they say they're not training on it; it was detected before training. But that's not the point. Amazon got the stuff from somewhere, and a decent person would report where it came from so that the rozzers can trace it back upstream. I flat out don't believe Amazon's claim not to know where it came from. They must know, because they must have got copyright clearance for making a derivative work from all that content.

@DrHyde @GossiTheDog Oh yeah, I get that, sorry. What I don't understand is the ramifications of their possession, and of the originator's presumably continued possession, of now-identified CSAM... which means they would be legally required to remove it and report the user.
NO IDEA how they wouldn't have ANY moral qualms about NOT doing that, never mind what should be OBVIOUS legal liability (but corps are 'special' etc...)!
-
@scottgal that doesn't mean using child sexual abuse material to train AI is okay.
@GossiTheDog BUT for certain types of AI it would be, obviously. THOSE need to exist, in a regulated way, and be made open source. Like current PII-scrubbing models: it's a public good, but I don't know any commercial company who COULD do it. Orthogonal, sorry, it just occurred to me... how do you get those models?
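One concrete answer to "how do you get those models": Microsoft's open-source Presidio wires NER models and pattern recognizers into exactly that kind of PII-scrubbing pipeline. A minimal sketch, assuming the presidio-analyzer and presidio-anonymizer packages (plus the spaCy model they load by default):

# pip install presidio-analyzer presidio-anonymizer
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()      # loads NER + regex/checksum recognizers
anonymizer = AnonymizerEngine()

text = "Contact Jane Doe at jane.doe@example.com or +1 555 0100."
findings = analyzer.analyze(text=text, language="en")  # detect PII spans
print(anonymizer.anonymize(text=text, analyzer_results=findings).text)
# roughly: "Contact <PERSON> at <EMAIL_ADDRESS> or <PHONE_NUMBER>."

The open-source route sidesteps the "which commercial company could you trust with it" problem.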
-
In my country, the abbreviation CP only means cerebral palsy.
In other words, the GenAI industry is completely CP-damaged.
RE: https://cyberplace.social/@GossiTheDog/115978385132170439
-
@GossiTheDog
Another headline here might be "Amazon admits in public to possessing a huge volume of child pornography".
-
What? Hand curation of trillions of images didn't work?
I'm shocked, I tells ya, shocked!
-
@GossiTheDog I would expect that they harvest open (no auth, indexable) S3 buckets for AI training.
And you probably know what you'll find there...
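The "no auth, indexable" part is literal: a world-readable bucket can be listed with no AWS credentials at all. A minimal sketch with boto3, using a hypothetical bucket name:

import boto3
from botocore import UNSIGNED
from botocore.config import Config

# unsigned client: no credentials attached to the request
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
resp = s3.list_objects_v2(Bucket="some-public-bucket", MaxKeys=10)  # hypothetical
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])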
-
@GossiTheDog@cyberplace.social Sounds like police should be arresting and charging people at Amazon, then.
-
@masek @GossiTheDog But have they plundered Amazon S3 customer data that the customers had set as private?
-
well there's your Epstein files right there!
-
@GossiTheDog I’m starting to worry that these insanely powerful black box systems have some flaws
-
@imbrium_photography I would not rule it out. But there is already plenty of "not set private but really private" data in open S3 buckets.
A colleague once found the financial data of a large part of a country in such a bucket (plus copies of their ID cards).
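If you run your own AWS account, flagging that kind of bucket is cheap. A minimal sketch with boto3 that lists your buckets and reports any without a full public-access block; a starting point, not a complete audit:

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
for bucket in s3.list_buckets()["Buckets"]:
    name = bucket["Name"]
    try:
        cfg = s3.get_public_access_block(Bucket=name)["PublicAccessBlockConfiguration"]
        locked = all(cfg.values())  # all four block-public settings on
    except ClientError:
        locked = False  # no public-access block configured at all
    print(name, "ok" if locked else "REVIEW: may allow public access")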
-
@GossiTheDog Cool. If you continue to buy from Amazon, read off Kindle, buy from Whole Foods, and obtain AWS certifications, among other Amazon-owned things, YOU ARE SUPPORTING PEDOPHILIA AND PEDOPHILES!
-
@GossiTheDog Sounds very illegal to me, knowing of a crime and keeping info from the law (who this concerns, not some vague ‘regulators’)
-
@GossiTheDog this sounds pretty unbelievable tbh. LAION having "thousands" was a big public thing forcing re-release of the dataset. Others just piling on after this was discovered with no detection algorithms having been used??
Amazon should really publish this information.
@troed @GossiTheDog plot twist of the year would be if the "dataset" they're talking about turned out to be "any image file uploaded to an S3 bucket between 2022 and today"
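On the "detection algorithms" point: the standard screen is hash-matching against lists of known material maintained by bodies like NCMEC and the IWF, normally with perceptual hashes (PhotoDNA, PDQ) so near-duplicates still hit. A minimal sketch of the shape of it, with SHA-256 standing in for a perceptual hash and a hypothetical hashlist.txt:

import hashlib
from pathlib import Path

# hypothetical blocklist; real pipelines use perceptual hashes so
# re-encoded or resized copies still match
KNOWN_BAD = set(Path("hashlist.txt").read_text().split())

for img in Path("dataset").rglob("*.jpg"):  # hypothetical dataset dir
    if hashlib.sha256(img.read_bytes()).hexdigest() in KNOWN_BAD:
        print("match:", img)  # quarantine and report, don't silently drop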

-
@GossiTheDog I didn't have "CSAM at scale is unavoidable" on my 2026 bingo card.
-
@imbrium_photography @masek @GossiTheDog - I like the word you've used: "plundered". Private data that was set to private.
-
@DrHyde @GossiTheDog @scottgal - Or Plundered Data.
-