FARVEL BIG TECH

all the criticism has been said, all the takes been had.

110 posts · 60 posters · 384 views
This thread has been deleted. Only users with topic-management privileges can see it.
  • bstacey@icosahedron.website

    @jonny It's like everyone decided to take a bath in mercury and leaded gasoline.

    bipolaron@scholar.social wrote (#59):

    @bstacey @jonny with a plugged in datacenter

    • jonny@neuromatch.social

      So rsync rewriting all the tests puts the entire project in play. Now the entire protective surface has been sloshed through a layer of probability, so the loop must accelerate. Followup PRs add more carveouts with lengthy LLM justifications that sound perfectly plausible but amount to an erosion of the protective surface. We go from cumulative improvement to a random walk.

      poleguy@mastodon.social wrote (#60):

      @jonny I just lost my beer league hockey championship as the last shooter in a 14-round shootout. I'm sitting in my driveway reading your thread. I'll need to read it again in the morning.

      I don't remember why I followed you originally. But I love this thread.

      This whole rsync thing is the most interesting thing that has come out of the AI bubble.

      I've had a negative feeling about rsync ever since reading a blog post years ago that criticized its sloppy design.

      Yet I rely on it daily. I have so many questions.

        jonny@neuromatch.social wrote (#61):

        @poleguy
        RIP on the shootout, hopefully the other team bought the beer and you got to pinch the other goalie's cheek a bit. You'll get 'em next season.

          poleguy@mastodon.social wrote (#62):

          @jonny indeed, that's the right feeling!

          We have sponsorship from a brewery, so the locker room beer (and custom jerseys) are "free."

          But we sat at the bar with the other team. It is just a game after all.

          Both sides had a good time. And we had fans cheering for both sides. And kids crashing the locker room to celebrate despite the loss... We shared our NA options. Can't ask for more.

          I'd love to engage more on this thread technically... I have thoughts. Maybe Monday.

          • jonny@neuromatch.social

            I think the modal situation here is that the people are reading none or very little of what is being generated by the LLM, so the tests have a special role: Tests function as the pull arm on the slot machine, you just generate until tests pass, and that's a jackpot. Obviously that's meaningless when the tests are meaningless, so tests take on a very different meaning and role in slot machine coding.

            Previously we would write careful test conditions that were based off some real problem or an understanding of what the code under test did, and had a specific thing they were intended to protect against. Tests move slow and are designed to protect us against the things we know can go wrong. When we learn of a new wrong thing, we add a test.

            LLM tests have the form of tests but don't do the same thing. They often test nothing, and are just expressions of truisms that the probabilistic text space explored while generating. They have strongly worded names but end up actually asserting that basic language features work as expected. Because it is not us writing tests for ourselves, where we only harm ourselves by making them weak, they function instead as a passively obfuscated justification for the code that the LLM generates. The user wants the tests to pass. The LLM provides.

            The tests are theater: they are the play field for the slot machine. They are mild, surmountable, need to fail a few times to be plausible, but must eventually pass within the expected generation loop window to deliver the payout.
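To make the contrast above concrete, here is a minimal Python sketch. It is not code from the thread: `normalize_path` and both test names are hypothetical, invented only to show a test with a strongly worded name that asserts a basic language feature, next to a test tied to one specific known failure.

```python
def normalize_path(path: str) -> str:
    """Hypothetical function under test: collapse runs of slashes."""
    while "//" in path:
        path = path.replace("//", "/")
    return path


def test_path_normalization_is_robust_and_secure():
    # Strongly worded name, but normalize_path is never called.
    # This only asserts that string concatenation works, so it passes
    # no matter how broken the function under test is.
    assert "a" + "/" + "b" == "a/b"


def test_normalize_path_collapses_duplicate_slashes():
    # Tied to one concrete thing that can go wrong: runs of slashes
    # in the input. Breaking normalize_path makes this test fail.
    assert normalize_path("a//b///c") == "a/b/c"
```

Only the second test constrains the code: replacing the body of `normalize_path` with `return path` leaves the first test green and turns the second one red.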

            jens@social.finkhaeuser.de wrote (#63):

            @jonny related:

            https://finkhaeuser.de/2026-04-10-outsourcing-thought-is-going-great/

            • jonny@neuromatch.social

              @ainmosni
              Well good, keep those walls up, they are protective. I am not so lucky and rely on discipline and observation of the impacts on others. It sets my alarm bells ringing to run for cover, but to understand why the things are happening around me the only means has been to feel it for myself, and I get it.

              jens@social.finkhaeuser.de wrote (#64):

              @jonny @ainmosni Gambling (addiction) works on the so-called Variable Reinforcement Schedule.

              The TL;DR of it is, results are random enough that even though it seems there may be a pattern, there isn't. You're pulled in because "one more time will show my pattern detection was right".

              And since human brains are excellent pattern detection machines, every time this succeeds yields huge dopamine rewards.

              I'm pissed off with the pattern, which is why I stop. But I can't deny its power.

                themipper@mastodon.social wrote (#65):

                @jonny this whole thing is so bad that the only viable way seems to fork it before the LLM sloppening. It is a shame to see more and more foundational projects fall into the LLM trap.

                And as always you hit the nail on the head with your deep dive and explanations. I love reading them.

                I will use your observation that, for an LLM, what is written is the same as what is happening.

                • jonny@neuromatch.social

                  Here's an example from some code that was thrust at me this week. The rest of the tests try a bit harder to look like tests, but this one is perplexing.

                  What does it test? The function name suggests it's a smoke test. LLMs love to call things smoke tests. That would suggest this would be an early-run test that fails loudly if some basic precondition - like having ffmpeg - fails. Or, I guess we are smoke testing the ensure_ffmpeg function? Anyway, who knows. However, we first check if ffmpeg or ffprobe are present, which is exactly what ensure_ffmpeg does. If they aren't present, a warning tells us that ffmpeg/ffprobe are required for the video tests, which makes it seem like this should be a parameterizing test that controls which tests are run, which of course it does not do.

                  So the test literally does nothing and cannot possibly fail, but says it does at least two things, because to an LLM something saying it does something is the same thing as it actually doing that thing.
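The code being discussed is not included in the thread, so the following is only a hypothetical Python reconstruction from the description above. The body of `ensure_ffmpeg`, the helper `ffmpeg_available`, and the test names are all assumptions. The first test shows the can't-fail shape; the class below it shows one way the same check could actually control which tests run, using the standard library's `unittest.skipUnless`.

```python
import shutil
import unittest
import warnings


def ffmpeg_available() -> bool:
    """True only if both ffmpeg and ffprobe are found on PATH."""
    return shutil.which("ffmpeg") is not None and shutil.which("ffprobe") is not None


def ensure_ffmpeg() -> None:
    """Hypothetical stand-in for the helper named in the post."""
    if not ffmpeg_available():
        raise RuntimeError("ffmpeg/ffprobe are required")


def test_ffmpeg_smoke():
    # The shape described above: repeat the helper's own check, warn if it
    # would fail, and call the helper only when it is known to succeed.
    if not ffmpeg_available():
        warnings.warn("ffmpeg/ffprobe are required for the video tests")
        return
    ensure_ffmpeg()  # cannot raise here: the same check just passed


@unittest.skipUnless(ffmpeg_available(), "ffmpeg/ffprobe are required for the video tests")
class VideoTests(unittest.TestCase):
    # Here the check gates the tests: when the tools are missing, the whole
    # class is reported as skipped instead of silently passing.
    def test_ensure_ffmpeg_accepts_installed_tools(self):
        ensure_ffmpeg()
```

In the first function, both branches end in a pass, so the test outcome carries no information about the environment or about `ensure_ffmpeg`. In the gated version, a missing ffmpeg shows up in the test report as a skip with a reason.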

                  gunchleoc@mastodon.scot wrote (#66):

                  @jonny Of course it's a smoke test - as in "smoke and mirrors"

                  WTAF.

                    fluffy@plush.city wrote (#67):

                    @jonny also why the hell would they write tests for a C program/library in Python? It makes no sense.

                    • jonny@neuromatch.social

                      RE: https://hails.org/@hailey/116657391001259044

                      all the criticism has been said, all the takes been had. the only metaphor i have been finding consistently useful for understanding what is happening with people and "AI" is addiction, and specifically gambling addiction.

                      spitfire@mastodon.de wrote (#68):

                      @jonny holy crap this story gets worse by the day. Thank you very much for summing up this aspect of the situation for a non-sw-engineering-person like me. 🫡

                        fluffy@plush.city wrote (#69):

                        @jonny ... and why the everloving FUCK do these tests run as root

                          dahukanna@mastodon.social wrote (#70):

                          @jonny Referencing
                          1. @shauna's post, based on @DGI, about power dynamics & dysfunction between imaginary labour (iML) & interpretive labour (iNL): https://www.rethinkingpower.info/how-interpretive-labor-straddles-the-gap-between-rules-and-reality/
                          2. "Power", chapter 4 of Mary Parker Follett's Dynamic Administration: https://mastodon.social/@dahukanna/110643444784446704

                          Presuming Productivity (P) = iML/iNL
                          under dysfunctional power-over tool imposition, e.g. LLMs, factory production, etc.:
                          - Imagined abstract: 1 LLM PR / 0 review units = ∞P
                          - Interpreted reality: 1 LLM PR / >10 review units = <0.1P
                          - https://mastodon.social/@dahukanna/113230734549577353

                            technocrow@blahaj.zone wrote (#71):

                            @fluffy@plush.city @jonny@neuromatch.social running tests as root is fucking wild

                              sesamzoo@mastodon.social wrote (#72):

                              @jens, great article, thank you. Did you pull the lever "just one more time" and if so, did it get even worse?

                              @jonny, thank you for this thread and lots of your other threads on the topic.

                              Both help me feel that I'm not the ghost driver, although these days there is a lot of contraflow in my lane. Mostly at work, where the AI fanboys/believers/addicts are at least way louder than the people trying to understand their code and keep it in maintainable shape.

                                jens@social.finkhaeuser.de wrote (#73):

                                @sesamzoo @jonny No, I did not. I try not to use LLMs at all, so I really did maybe one or two queries more than described, just to get a feel for it.

                                  jens@social.finkhaeuser.de wrote (#74):

                                  @sesamzoo @jonny Also, I feel dirty every time, so I don't want to waste water crying in the shower afterwards.

                                    jetsetilly@mastodon.gamedev.place wrote (#75):

                                    @themipper @jonny
                                    > It is a shame to see more and more foundational projects fall into the LLM trap

                                    The one that breaks my heart is vim.

                                      henryk@chaos.social wrote (#76):

                                      @jonny (Un)charitable interpretation: it smoke tests whether the ensure_ffmpeg function is syntactically correct — which is a failure mode LLMs are actually concerned about.

                                        jonny@neuromatch.social wrote (#77):

                                        @fluffy
                                        Apparently all the tests for rsync are integration tests across bash rsync calls

                                        • ainmosni@social.ainmosni.eu

                                          @KalenXI @jonny if you don't want to get lazy when using AI, is there really a use for it at all? I mean, it's been proven that reviewing code is much more difficult than writing it, so I'm finding it much more taxing to review slopcode than to just write it myself.

                                          Of course, that's adding the usual disclaimer that all this is not even relevant until the ethical and environmental shitshow of AI has been fixed.

                                          robinadams@mathstodon.xyz wrote (#78):

                                          @ainmosni @KalenXI @jonny With the caveat that the ethical problems with AI mean it's absolutely not worth the cost:

                                          It looks like AIs are actually getting better at finding bugs and security holes.

                                          So do your usual testing and code reviews, and then ask Claude to find any bugs you may have missed. It will give you some false positives but also some true ones.

                                          Very different from having an LLM generate the code and a human try to fix it up.

                                          Like Cory Doctorow's example: using an AI to give a second opinion on MRI scans (a centaur) makes scans more expensive but higher quality.

                                          Having an AI analyse the scans at high speed and then getting some poor schmuck to try to spot its mistakes (reverse centaur) makes scans cheaper and lower quality, but at least there's a person with little power in the hierarchy who gets the blame for the problems.

                                          Guess which one the people pouring trillions into AI want?
