Under Attack, Please Stand By.

  • jwz@mastodon.social wrote:
    #1

    Under Attack, Please Stand By.

    My server is getting absolutely obliterated by AI scrapers today. Load is 140+ and I just manually banned 58,000 IP addresses.

    Top ten user agents: ...
    https://jwz.org/b/ykqJ
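
As a rough illustration only (not the tooling used in the post), a "top ten user agents" tally like the one linked above could be produced from a web server access log along these lines. The sketch assumes the Apache/nginx "combined" log format, where the user agent is the last double-quoted field; the function name and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Assumption: "combined" log format, so the user agent is the last
# double-quoted field on each line. Not a detail taken from the post.
_UA_RE = re.compile(r'"([^"]*)"\s*$')


def top_user_agents(lines, n=10):
    """Return the n most common user-agent strings with their hit counts."""
    counts = Counter()
    for line in lines:
        match = _UA_RE.search(line)
        if match:  # skip lines that do not end in a quoted field
            counts[match.group(1)] += 1
    return counts.most_common(n)


# Made-up sample lines using reserved documentation IP addresses.
SAMPLE = [
    '192.0.2.1 - - [01/Jan/2000:00:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "meta-externalagent/1.1"',
    '192.0.2.2 - - [01/Jan/2000:00:00:01 +0000] "GET /a HTTP/1.1" 200 128 "-" "meta-externalagent/1.1"',
    '192.0.2.3 - - [01/Jan/2000:00:00:02 +0000] "GET /b HTTP/1.1" 200 256 "-" "facebookexternalhit/1.1"',
]

if __name__ == "__main__":
    for agent, hits in top_user_agents(SAMPLE):
        print(hits, agent)
```

On a real server the `lines` argument could simply be an open log file, since iterating a file yields one line at a time.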

      jwz@mastodon.social wrote:
      #2

      Huh, why is my / disk full?

      Oh:
      -rw-------. 1 root root 1104886787 Aug 23 12:29 /var/log/php-fpm/error.log

      That's just 6 days worth of AI-scraping bots.

        jwz@mastodon.social wrote:
        #3

        Getting absolutely reamed by AI scrapers again today, and it seemed like my mitigations were failing. Why weren't these being blocked?

        Oh. Because I had Facebook's subnets on the fail2ban whitelist, so that people sharing @dnalounge links on the Zuckerweb got link previews.

        Welp. You can't whitelist "facebookexternalhit" (the link preview bot) without also whitelisting "meta-externalagent" (the AI scraper, which seems to ignore robots.txt).

        I guess link previews are gonna be a casualty.
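
To make the problem concrete, here is a minimal sketch of why an address-based allowlist (such as fail2ban's `ignoreip` setting, which matches on IP addresses and networks) cannot separate the two bots. The subnet below is a reserved documentation range standing in for a real crawler network, and the function names are hypothetical; this is not the actual configuration from the post.

```python
import ipaddress

# Hypothetical allowlist: 192.0.2.0/24 is a reserved documentation range
# used here as a stand-in for a real crawler subnet.
ALLOWED_NETS = [ipaddress.ip_network("192.0.2.0/24")]


def is_ip_allowlisted(ip):
    """True if the address falls inside any allowlisted network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ALLOWED_NETS)


def would_ban(ip, user_agent):
    """Address-level decision: user_agent is accepted but never consulted,
    which is exactly the limitation described in the post."""
    return not is_ip_allowlisted(ip)


if __name__ == "__main__":
    # Same source address, two different user agents: both get through.
    for agent in ("facebookexternalhit/1.1", "meta-externalagent/1.1"):
        print(agent, "banned" if would_ban("192.0.2.10", agent) else "allowed")
```

Filtering on the user-agent string instead would tell the two apart, but that check has to happen somewhere that sees the HTTP request, and a user-agent header is trivially spoofable.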

          jwz@mastodon.social wrote:
          #4

          Welp. I now have proof that Facebook is using the same outbound IP addresses for both A) scraping web sites for "AI" training and B) fetching images from my site when I use the official API to post to Instagram.

          This means that I can either defend myself from injuriously voracious AI scrapers, or have a functional business Instagram account, but not both.

          Not just the same subnets. The same IPs.

            pelle@veganism.social wrote:
            #5

            @jwz plz do to fb requests what you're doing to requests with hn referrer
