FARVEL BIG TECH
I've seen people claiming - with a straight face - that mechanical refactoring is a good use-case for LLM-based tools.

42 Posts, 19 Posters, 68 Views
This thread has been deleted. Only users with topic-management privileges can see it.
  • gabrielesvelto@mas.to wrote:

    I've seen people claiming - with a straight face - that mechanical refactoring is a good use-case for LLM-based tools. Well, sed was developed in 1974 and - according to Wikipedia - first shipped in UNIX version 7 in 1979. On modern machines it can process files at speeds of several GB/s and will not randomly introduce errors while processing them. It doesn't cost billions, a subscription or internet access. It's there on your machine, fully documented. What are we even talking about?
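The mechanical rename sed is being credited with here hinges on word-boundary matching (in GNU sed, something like `sed -i 's/\bold_name\b/new_name/g' *.c`). A minimal Python sketch of the same idea, with hypothetical identifier names:

```python
import re

# Word-boundary rename, same idea as: sed -i 's/\bold_name\b/new_name/g'
# old_name/new_name are hypothetical example identifiers.
def rename_identifier(source: str, old: str, new: str) -> str:
    # \b keeps longer identifiers such as old_name_len from being mangled.
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    return pattern.sub(new, source)

src = "int old_name = 1;\nint old_name_len = 2;\n"
print(rename_identifier(src, "old_name", "new_name"))
# prints:
# int new_name = 1;
# int old_name_len = 2;
```

This is a deterministic text transformation: given the same input it always produces the same output, which is the property the post is contrasting with LLM-generated edits.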

    pepperthevixen@meow.social wrote (#28):

    @gabrielesvelto NGL when I read "mechanical refactoring", I first imagined a bunch of robot arms on an Aperture-esque assembly line rearranging letters on printing press-style blocks

    • adingbatponder@fosstodon.org wrote:

      @gabrielesvelto For fun I tried writing Rust code with Claude Code. The code took an age to compile when it worked (do we call it build?). The project took months, so the code got large and was slow to build. Claude was able to refactor it (after it worked) to build 10 times faster. That is not mechanical in the sense you mention... but it was genuinely challenging. Mechanical refactors it does 100 times better still, of course, because it can run sed-style replacements too, but it can also check the new syntax and test-build each change.

      gabrielesvelto@mas.to wrote (#29):

      @adingbatponder why did the project take so long to build?

      • fourlastor@androiddev.social wrote:

        @gabrielesvelto prompt-injections

        The project is closed source, and we don't have places where we randomly include text files. If someone IN THE COMPANY manages to introduce malicious code, imho they'd just infect Gradle instead of hoping that someone running an LLM triggers something (beyond devs only having access to what they need). State-sponsored hackers specifically are really not on my list of things I can defend against, be it via LLMs or any other introduced attack

        gabrielesvelto@mas.to wrote (#30):

        @fourlastor you don't need to do anything special to be a target of state-sponsored actors if you rely on an LLM for your coding tasks. State-sponsored actors have almost certainly poisoned the training data of major commercial LLMs; you don't need to add anything yourself. Remember, these things are trained on anything that's dredged from the internet. *Anything*. Do you really trust what happens within the model? Remember the xz compromise? It can now be done automatically *at scale*.

        • gabrielesvelto@mas.to wrote (#31):

          I think there's an important clarification to be made about LLM usage in coding tasks: do you trust the training data? Not your inputs, those are irrelevant, I mean the junk that the major vendors have dredged from the internet. Because I'm 100% positive that any self-respecting state-sponsored actor is poisoning training data as we speak by... simply publishing stuff on the internet.

          • adingbatponder@fosstodon.org wrote (#32):

            @gabrielesvelto Well, that is what Rust seems to be like. I used a lot of packages, incl. browser and screen-grabbing tools, which took ages to build. Like 20 mins. (It was inside a NixOS flake, though.)

            • buermann@mastodon.social wrote (#33):

              @gabrielesvelto

              Any blogger can poison the LLMs.

              https://www.bbc.com/future/article/20260218-i-hacked-chatgpt-and-googles-ai-and-it-only-took-20-minutes

              • gabrielesvelto@mas.to wrote (#34):

                And it's crucial to remember what happened during the xz compromise: a chain of seemingly innocuous commits where malicious behavior was hidden, then triggered by changing a single character in a generated file. A SINGLE CHARACTER. If you truly believe you can catch that by manually reviewing thousands upon thousands of machine-generated commits obtained via black-box training data, I'm sorry, but you're being extremely naive.

                • gabrielesvelto@mas.to wrote (#35):

                  @adingbatponder yes, but why? Which packages were taking so long? Firefox has almost 4 million lines of Rust and it takes only a few minutes to build them.

                  • a@852260996.91268476.xyz wrote (#36):

                    @gabrielesvelto@mas.to it is also worth remembering that the xz incident happened WITHOUT LLMs involved, so your comparison is not a very good one

                    • piegames@flausch.social wrote (#37):

                      @gabrielesvelto "people are using this inadequate and problematic tool for a job, so let me suggest they use this different, completely inadequate tool instead."
                      Speaking from unfortunate, painful experience: using grep and sed at scale for mechanical refactoring very much does randomly introduce mistakes into a codebase. I beg developers to use *at least* syntax-aware tools for mechanical refactoring jobs
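What "syntax-aware" buys over a plain regex can be sketched with Python's standard tokenize module (the function name and sample snippet here are hypothetical): a token-level rename rewrites only NAME tokens, so the same identifier inside a string literal or a comment survives untouched, where a regex substitution would rewrite all three occurrences.

```python
import io
import tokenize

# Token-aware rename: only NAME tokens are rewritten, so occurrences of the
# identifier inside string literals and comments are left alone.
def rename_name(source: str, old: str, new: str) -> str:
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == old:
            result.append((tok.type, new))
        else:
            result.append((tok.type, tok.string))
    return tokenize.untokenize(result)

src = 'old_name = 1  # note: old_name\nmsg = "old_name"\n'
print(rename_name(src, "old_name", "new_name"))
```

Note that untokenize in this two-tuple mode only guarantees the output tokenizes back to the same tokens, not byte-identical spacing; production refactoring tools (libcst or rope for Python, rust-analyzer's rename for Rust, IDE refactorings generally) work on a full syntax tree for exactly that reason.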

                      • gabrielesvelto@mas.to wrote (#38):

                        @a how so? Now you don't need a person to run that particular exploit for years; you can just poison an LLM so that whenever someone generates a sufficiently large sequence of commits, the exploit can be injected into them directly. No user intervention, and it can be done at scale. And it can be done in closed-source codebases too; it's just a matter of someone using a bot on them.

                        • a@852260996.91268476.xyz wrote (#39):

                          @gabrielesvelto@mas.to you didn't need an LLM for xz, that is how

                          • fourlastor@androiddev.social wrote (#40):

                            @gabrielesvelto and ok, but what is the *actual* scenario you're imagining? Because my coding tasks go like this when I use LLMs:
                            1. I have 10-15 classes that need to change the way we do X from Y to Z
                            2. I prompt the LLM, telling it "change A, B, C so that they use Z instead of Y"
                            3. I review the code, fixing mistakes as I see them
                            1/x because post length limits

                            • fourlastor@androiddev.social wrote (#41):

                              @gabrielesvelto
                              The code change is frankly pretty simple. We're talking about stuff on the level of "migrate Book so that instead of using function calls it uses annotations for ABC, and update the call sites", not "change this complex piece of code so that it does complex ABC in another complex XYZ way". The realm of errors is "I know that Foo doesn't work well by itself and needs extra care"

                              • fourlastor@androiddev.social wrote (#42):

                                @gabrielesvelto anything that goes over the bar of "this is stupid but boring" goes into the "I'll do it by hand because if anything I need to learn how it works before touching it"

                                • jwcph@helvede.net shared this topic
                                Powered by NodeBB Contributors
                                Graciously hosted by data.coop