FARVEL BIG TECH

Web design in the early 2000s: Every 100ms of latency on page load costs visitors.

72 Posts, 43 Posters, 224 Views
This thread has been deleted. Only users with topic management privileges can see it.
  • In reply to david_chisnall@infosec.exchange:

    Web design in the early 2000s: Every 100ms of latency on page load costs visitors.

    Web design in the late 2020s: Let's add a 10-second delay while Cloudflare checks that you are capable of ticking a checkbox in front of every page load.

    wayneepoo@mastodon.social wrote (#53):

    @david_chisnall
    But I LOVE finding which of 12 images has a zebra crossing in... 😳😱🤣

    • In reply to nothacking@infosec.exchange:

      @alexskunz @david_chisnall

      The thing is, you don't need a CAPTCHA. Just three if statements on the server will do it:

      1. If the user agent is chrome, but it didn't send a "Sec-Ch-Ua" header: Send garbage.
      2. If the user agent is a known scraper ("GPTBot", etc): Send garbage.
      3. If the URL is one we generated: Send garbage.
      4. Otherwise, serve the page.

      The trick is that instead of blocking them, serve them randomly generated garbage pages.

      Each of these pages includes links that will always return garbage. Once these get into the bot's crawler queue, they will be identifiable regardless of how well they hide themselves.

      I use this on my site: after a few months, it's 100% effective. Every single scraper request gets caught. At this point, I could rate-limit the generated URLs, but I enjoy sending them unhinged junk. (... and it's actually cheaper than serving static files!)

      This won't do anything about vuln scanners and other non-crawler bots, but those are easy enough to filter out anyway. (URL starts with /wp/?)
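
      A minimal sketch of the checks described above, assuming only Python's standard library; the scraper list, the /g/ garbage prefix, and the page bodies are hypothetical placeholders rather than anything from the poster's actual setup:

      import random
      from http.server import BaseHTTPRequestHandler, HTTPServer

      KNOWN_SCRAPERS = ("GPTBot", "CCBot", "Bytespider")   # assumption: partial list
      GARBAGE_PREFIX = "/g/"          # hypothetical namespace for links we generate
      WORDS = "lorem ipsum dolor sit amet consectetur adipiscing elit".split()

      def garbage_page():
          # Random text plus links leading back into the garbage namespace, so a
          # crawler that follows them keeps identifying itself on later visits.
          text = " ".join(random.choices(WORDS, k=200))
          links = "".join(f'<a href="{GARBAGE_PREFIX}{random.getrandbits(32):x}">more</a> '
                          for _ in range(5))
          return f"<html><body><p>{text}</p>{links}</body></html>".encode()

      class Handler(BaseHTTPRequestHandler):
          def do_GET(self):
              ua = self.headers.get("User-Agent", "")
              fake_chrome = "Chrome" in ua and "Sec-Ch-Ua" not in self.headers  # rule 1
              known_scraper = any(bot in ua for bot in KNOWN_SCRAPERS)          # rule 2
              poisoned_url = self.path.startswith(GARBAGE_PREFIX)               # rule 3
              body = (garbage_page() if fake_chrome or known_scraper or poisoned_url
                      else b"<html><body>real content would go here</body></html>")
              self.send_response(200)
              self.send_header("Content-Type", "text/html")
              self.end_headers()
              self.wfile.write(body)

      if __name__ == "__main__":
          HTTPServer(("", 8080), Handler).serve_forever()

      The key property is rule 3: every garbage page links back into the /g/ namespace, so a crawler that ever follows one keeps outing itself no matter how it disguises its user agent later.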

      bertkoor@mastodon.social wrote (#54):

      @nothacking
      Wdyt of this approach?

      > Connections are dropped (status code 444), rather than sending a 4xx HTTP response.
      > Why waste our precious CPU cycles and bandwidth? Instead, let the robot keep a connection open waiting for a reply from us.

      https://codeberg.org/fisharebest/robot-tools
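
      For context, 444 is nginx's non-standard "close the connection without sending any response" code. The "let the robot keep a connection open" idea is essentially a tarpit; here is a rough asyncio sketch of that variant, with a placeholder is_bot() check and a made-up hold time, and no claim that this is what the linked repo actually does:

      import asyncio

      HOLD_SECONDS = 600            # assumption: how long to leave a bot hanging

      def is_bot(request_text: str) -> bool:
          return "GPTBot" in request_text          # placeholder detection

      async def handle(reader, writer):
          request = (await reader.read(4096)).decode("latin-1", "replace")
          if is_bot(request):
              # Tarpit: say nothing and hold the socket open; the crawler burns a
              # connection slot waiting for a reply that never comes.
              await asyncio.sleep(HOLD_SECONDS)
              writer.close()                       # then drop it, 444-style
              return
          writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
          await writer.drain()
          writer.close()

      async def main():
          server = await asyncio.start_server(handle, "0.0.0.0", 8080)
          async with server:
              await server.serve_forever()

      if __name__ == "__main__":
          asyncio.run(main())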

        keremgoart@mstdn.social wrote (#55):

        @david_chisnall yep 💯 frustrating 😞

          cubeofcheese@mstdn.social wrote (#56):

          @david_chisnall crying emoji

            • In reply to hp@mastodon.tmm.cx:

            @david_chisnall This was when the tech bros realized that it is all in comparison to everything else.

            If you just make EVERYTHING worse then it doesn't matter that you're bad.

            The real story of computing (and perhaps all consumer goods)

              grumble209@kolektiva.social wrote (#57):

            @hp @david_chisnall Sounds like finding a candidate to vote for, to be honest...

              • In reply to vendelan@mastodon.social:

                @Laberpferd @david_chisnall proof of work is such a bad CAPTCHA. Like, who thought bots couldn't evaluate JS?

                nachof@mastodon.uy wrote (#58):

              @vendelan
              The idea is not that they can't, it's that they won't.
              If you're a human visiting a website, evaluating some JS at worst costs you a few seconds. If you're a scraper bot trying to get millions of sites a second, it slows you down.

              @Laberpferd @david_chisnall
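
              For anyone wondering what those challenges roughly do: a hashcash-style scheme makes the client hunt for a nonce whose hash clears a difficulty target, while the server verifies it with a single hash. A sketch of the general idea follows (not Anubis's actual implementation; the difficulty value is made up):

              import hashlib, os

              DIFFICULTY = 20              # assumption: leading zero bits required

              def leading_zero_bits(digest: bytes) -> int:
                  bits = 0
                  for byte in digest:
                      if byte == 0:
                          bits += 8
                          continue
                      bits += 8 - byte.bit_length()
                      break
                  return bits

              def solve(challenge: bytes) -> int:
                  # Client side: roughly 2**DIFFICULTY attempts on average.
                  nonce = 0
                  while True:
                      digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
                      if leading_zero_bits(digest) >= DIFFICULTY:
                          return nonce     # a second or two for one human page view...
                      nonce += 1           # ...multiplied by millions for a crawler

              def verify(challenge: bytes, nonce: int) -> bool:
                  # Server side: one hash, regardless of difficulty.
                  digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
                  return leading_zero_bits(digest) >= DIFFICULTY

              challenge = os.urandom(16)   # issued per visit
              print(verify(challenge, solve(challenge)))

              The asymmetry is the whole point: the server does one hash per visitor, the client does about a million, which is negligible once per page but ruinous at crawler scale.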

                alexhelvetica@toot.cat wrote (#59):

                @david_chisnall and then webpages that load a dummy front end, because the real front end takes 15s to load. So then you click the search box and start typing, and the characters end up in a random order when the real search box loads.

                  nothacking@infosec.exchange wrote (#60):

                  @bertkoor Well, the advantage of sending junk is it makes crawlers trivially identifiable. That avoids the need for tricks like these:

                  > Other user-agents (hopefully all human!) get a cookie-check. e.g. Chrome, Safari, Firefox.

                  That still increases loading time. Even if the "CAPTCHA" is small, it'll still take several round trips to deliver.

                  ... of course, once they've been fed poisoned URLs, then you can start blocking.
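
                  A hypothetical sketch of that last step, assuming a combined-format access log and the same kind of /g/ poison prefix as in the earlier sketch; the hit threshold is arbitrary:

                  import re
                  from collections import Counter

                  GARBAGE_PREFIX = "/g/"    # hypothetical poison namespace
                  # assumption: combined log format, e.g.
                  # 203.0.113.7 - - [date] "GET /g/deadbeef HTTP/1.1" 200 1234 ...
                  LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST) (\S+)')

                  def poisoned_clients(log_path, min_hits=3):
                      hits = Counter()
                      with open(log_path) as log:
                          for line in log:
                              match = LINE.match(line)
                              if match and match.group(2).startswith(GARBAGE_PREFIX):
                                  hits[match.group(1)] += 1
                      # require a few hits so a curious human who clicked one odd
                      # link is spared; feed the result to a firewall or rate limiter
                      return [ip for ip, count in hits.items() if count >= min_hits]

                  if __name__ == "__main__":
                      for ip in poisoned_clients("access.log"):
                          print(ip)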

                    apz@some.apz.fi wrote (#61):

                    @david_chisnall The horrible delays were there way before Cloudflare. I use a lot of big-company web services at work daily; most of them take 10+ seconds to load even with a gigabit Internet connection and a fast computer. They're totally miserable on a mobile connection. Every time I look at the page source, I just get sad and angry at how relatively simple web GUIs have been implemented by pouring in all kinds of libraries and frameworks, causing the browser tab to suck up a gigabyte just to show me a couple of dropdowns and entry fields.

                      wobintosh@chaos.social wrote (#62):

                      @david_chisnall And another 10 seconds because somebody had the great idea that it would be smart to load something like 500 MB of JavaScript for a page with just text.

                        bamfic@autonomous.zone wrote (#63):

                        @david_chisnall we notice you are using an ad blocker

                        • In reply to jernej__s@infosec.exchange:

                          @the_wub They check user-agent and challenge anything that claims to be Mozilla (because that's what the majority of bots masquerade as).

                          Also, weird that SeaMonkey can't pass it – I just tried with Servo, and it had no problems.

                          the_wub@mastodon.social wrote (#64):

                          @jernej__s 1/n SeaMonkey is still based on an ancient version of the Firefox codebase.

                          I love the email client, and the browser has the tabs in the right place, but the browser fails to work with features of modern web sites that do not fall back gracefully.

                          I presume that this is what causes Anubis challenges to fail when using SeaMonkey.

                          I can get into Mozillazine without any Anubis challenge appearing using Netsurf, which has a limited implementation of JavaScript.

                            the_wub@mastodon.social wrote (#65):

                            @jernej__s 2/n So I installed NoScript in SeaMonkey to see if it is a javascript issue.

                            With javascript turned off I get this message.

                            "Sadly, you must enable JavaScript to get past this challenge. This is required because AI companies have changed the social contract around how website hosting works. A no-JS solution is a work-in-progress."

                            So I am now being blocked for using an add-on to protect myself from malicious scripts on websites.

                            OK so I will now whitelist Mozillazine.

                              the_wub@mastodon.social wrote (#66):

                              @jernej__s 3/n
                              Aha! A message I did not get the last time I got stuck trying to get into Mozillazine using SeaMonkey.

                              "Your browser is configured to disable cookies. Anubis requires cookies for the legitimate interest of making sure you are a valid client. Please enable cookies for this domain."

                              (But SM is set to accept all cookies.)

                              So in order for websites to protect themselves from AI scraping, users have to reduce the level of security they are prepared to accept as safe when browsing.

                                the_wub@mastodon.social wrote (#67):

                                @jernej__s n/n
                                Or to go through the process of whitelisting as safe all of the relevant sites that you wish to visit, so that Anubis can validate your browser, whilst otherwise disabling cookies and JavaScript for other sites.

                                Or just go find other sites to visit that do not assume you are a bot and block you from viewing content.

                                As regards SM being set to accept cookies and Anubis not recognising that: maybe my Pi-hole is blocking something that Anubis expects to find in a valid client?

                                  jernej__s@infosec.exchange wrote (#68):

                                  @the_wub Sadly, when you get 5000 requests per second from residential IPs (each IP doing 10-20 requests, all using user agents from legitimate browsers), there's very little else you can do. This is not an exaggeration; that's what was happening at a client with a web server hosting about 50 sites for their projects – they were getting hit with that several times per week, bringing the whole server down until we implemented Anubis.

                                    • In reply to zeborah@mastodon.nz:

                                    @hex0x93 I know nothing about Cloudflare's data practices. But I do know a lot of sites have been forced to go with Cloudflare because so many AI bots are incessantly scraping their site that the site goes down and humans can't access it - essentially AI is doing a DDOS, and when that's sustained for weeks/months/more then the Cloudflare-type system seems to be the only way to have the site actually available to humans.

                                    I hate it but those f---ing AI bots, seriously, they are ruining the net.

                                    @david_chisnall

                                    elosha@chaos.social wrote (#69):

                                    @zeborah @hex0x93 @david_chisnall Partially correct – their reason is fair, but CloudFlare is just one of the providers offering such protection, an oligopolist, not a very good service, and probably a data hoarder.

                                    Anubis would be a popular alternative, though it is self-hosted.

                                    (I can't access CF-gated sites at all because the checkbox captcha just breaks most of the time. That happens if you activate basic privacy features of your browser.)

                                      the_wub@mastodon.social wrote (#70):

                                      @jernej__s I understand the battle.

                                      The protective measures taken, though, should not make things more dangerous for users.

                                      Unless, of course, the internet is nearing the end of its path as a free and open source of information.

                                      In which case, what does it matter?

                                        the_wub@mastodon.social wrote (#71):

                                        @jernej__s OK. Here is the rub.

                                        I get challenged twice trying to get into the Mozillazine forums.

                                        1) Now it lets me pass, but I have to make sure the link from my search engine is the https and NOT the http link.

                                        2) I get to the Mozillazine landing page with a list of links to the forums.

                                        https://mozillazine.org/

                                        These links are all http.

                                        I cannot get past the Anubis challenge unless I alter the link to the https version.

                                        Now finally logged into the forums with SeaMonkey.

                                        #anubis #mozillazine

                                          the_wub@mastodon.social wrote (#72):

                                          @jernej__s The OP began this thread by noting that we now have to wait tens of seconds for Cloudflare-like challenges, and I have spent a lot more than that this morning sorting out one problem with Anubis, for one site, in one browser.

                                          • jwcph@helvede.net shared this topic