FARVEL BIG TECH
Amazon has reported "hundreds of thousands" of pictures of child sexual abuse material found in shared AI training data... but is refusing to tell regulators which data sets.

  • gossithedog@cyberplace.social

    Amazon has reported "hundreds of thousands" of pictures of child sexual abuse material found in shared AI training data... but is refusing to tell regulators which data sets.

    If you're using generative AI tools, there's a pretty good chance you're generating imagery with child porn training data behind the scenes.
    https://www.bloomberg.com/news/features/2026-01-29/amazon-found-child-sex-abuse-in-ai-training-data

    jmcrookston@mastodon.social wrote (#16):

    @GossiTheDog

    What? Hand curation of trillions of issues didn't work?

    I'm shocked ayes tell ya, shocked!

    • gossithedog@cyberplace.social

      As an aside, Microsoft had a publicly reported security incident a year or so ago where petabytes of data was left in a public Azure Storage Blob.

      What they didn't say - that petabytes of data was customer photos of animals they'd classified and taken for AI work, t'was some grads just exporting stuff. Good job everybody is preaching about Responsible AI(tm).

      masek@infosec.exchange wrote (#17):

      @GossiTheDog I would expect that they harvest open (no auth, indexable) S3 buckets for AI training.

      And you probably know what you find there ....

        driusan@doomscroller.social wrote (#18):

        @GossiTheDog@cyberplace.social Sounds like police should be arresting and charging people at Amazon, then.

          imbrium_photography@mastodon.social wrote (#19):

          @masek @GossiTheDog But have they plundered Amazon S3 customer data that the customers had set as private?

            sassinake@mastodon.social wrote (#20):

            @GossiTheDog

            well there's your Epstein files right there!

              lyle@cville.online wrote (#21):

              @GossiTheDog I’m starting to worry that these insanely powerful black box systems have some flaws

                masek@infosec.exchange wrote (#22):

                @imbrium_photography I would not rule it out. But there is already plenty of "not set private but really private" data in open S3 buckets.

                A colleague once found the financial data on a large part of a country in such a bucket (plus a copy of their ID card).

                  mast0d0nphan@beige.party wrote (#23):

                  @GossiTheDog Cool. If you continue to buy from Amazon, read off Kindle, buy from Whole Foods, and obtain AWS certifications, among other Amazon-owned things, YOU ARE SUPPORTING PEDOPHILIA AND PEDOPHILES!

                    photovince@mastodon.social wrote (#24):

                    @GossiTheDog Sounds very illegal to me, knowing of a crime and keeping info from the law (who this concerns, not some vague ‘regulators’)

                    • troed@swecyb.com

                      @GossiTheDog this sounds pretty unbelievable tbh. LAION having "thousands" was a big public thing forcing re-release of the dataset. Others just piling on after this was discovered with no detection algorithms having been used??

                      Amazon should really publish this information.

                      https://petapixel.com/2024/09/03/major-ai-image-dataset-is-back-online-after-being-pulled-over-csam-laion-5b/

                      wall_e@ioc.exchange wrote (#25):

                      @troed @GossiTheDog plot twist of the year would be if the "dataset" they're talking about turned out to be "any image file uploaded to an S3 bucket between 2022 and today" 😬

                        troed@swecyb.com wrote (#26):

                        @wall_e

                        _That_ I could believe!

                        @GossiTheDog

                          tehstu@hachyderm.io wrote (#27):

                          @GossiTheDog I didn't have "CSAM at scale is unavoidable" on my 2026 bingo card.

                            atlovato@mastodon.social wrote (#28):

                            @imbrium_photography @masek @GossiTheDog - I like the word that you have used: "Plundered" Private Data that was set to privacy.

                            • drhyde@fosstodon.org

                              @GossiTheDog @scottgal they say they're not training on it, it was detected before training. But that's not the point. Amazon got the stuff from somewhere, and a decent person would report where it came from so that the rozzers can trace it back upstream. I flat out don't believe Amazon's claim to not know where it came from, they must know, because they must have got copyright clearance for making a derivative work from all that content 😉

                              atlovato@mastodon.social wrote (#29):

                              @DrHyde @GossiTheDog @scottgal - Or Plundered Data.

                              • scottgal@hachyderm.io

                                @GossiTheDog BUT certain types of AI it would be obviously. THOSE need to exist in a regulated way and made open source. Like current PII scrubbing models it's a public good but I don't know any commercial company who COULD do it. Orthogonal sorry but just occurred to me...how do you get those models?

                                atlovato@mastodon.social wrote (#30):

                                @scottgal @GossiTheDog 👍

                                  cnx@awkward.place wrote (#31):

                                  If you’re using generative AI tools applied statistics, there’s a pretty good chance you’re generating imagery with supporting the distribution of child porn training data behind the scenes.

                                  FTFY, @GossiTheDog@cyberplace.social

                                    ralph@hear-me.social wrote (#32):

                                    @GossiTheDog

                                    ALT TEXT:

                                    Bloomberg
                                    Amazon Found 'High Volume' Of Child Sex Abuse Material in AI Training Data.
                                    The tech giant reported hundreds of thousands of cases of Child Sex Abuse Material but won’t say where it came from.

                                      jer@chirp.enworld.org wrote (#33):

                                      @GossiTheDog That article is full of red flags from Amazon. They claim they have a "lower threshold" so they're "overreporting" but not providing info on the source of the images?

                                      That sounds like they're trying to break NCMEC's reporting system either through malice or incompetence.

                                      Also it sounds like they're not keeping the provenance of the data they're using - which strongly suggests that they're not obtaining that data in a legal manner

                                        nxskok@cupoftea.social wrote (#34):

                                        @GossiTheDog and, every one of those pictures has been seen and classified by a minimum-wage worker in the third world so that the user doesn't get to see it (at a predictable cost to said third-world worker's mental health).

                                          syllopsium@peoplemaking.games wrote (#35):

                                          @GossiTheDog 'is refusing to tell regulators'?

                                          Good luck with that if there are any datasets in the UK. Time for arrests and seizure of machines.

                                          It should be the same in the US, but of course nothing comes before the 'mighty' dollar
