FARVEL BIG TECH
Free software people: A major goal of free software is for individuals to be able to cause software to behave in the way they want it to
LLMs: (enable that)
Free software people: Oh no not like that
317 posts · 120 posters · 0 views
This thread has been deleted. Only users with topic management privileges can see it.
  • rogersm@mastodon.social wrote (#139):

    @mjg59 I have some issues with using LLMs, but the only one specific to the free software world is license tainting: I'm not sure whether the code generated by an LLM is public domain.
  • ced@mapstodon.space wrote (#140), replying to mnl@hachyderm.io:

    > @ced @david_chisnall @mjg59 @ignaloidas neither does an LLM? We are perfectly able to deal with, say, search engine results, which are arguably more problematic than LLMs. For all intents and purposes, the books and resources I have at my disposal are also the product of random processes. I can still work with them to learn things.

    @mnl @david_chisnall @mjg59 @ignaloidas well, great for you. *I*'m not able to deal with random search results (especially now that they are often slop). And if your books were bought randomly, sure. Mine were selected because I trust the author, or because I know enough about the author's bias to be able to correct for it.
  • mnl@hachyderm.io wrote (#141):

    @ced @david_chisnall @mjg59 @ignaloidas do you not use a search engine (genuinely curious, I love building search engines and making them work well)?

    Do you think it's impossible to assign varying degrees of trust to LLM output?
  • karolherbst@chaos.social wrote (#142), replying to mjg59@nondeterministic.computer:

    > Look, coders, we are not writers. There's no way to turn "increment this variable" into life-changing prose. The creativity exists outside the code. It always has done and it always will do. Let it go.

    @mjg59 I think this understanding of art stems from a misunderstanding of what art in itself is.

    Of course writing code can be an artistic activity, and trying to argue against that just shows a deep misunderstanding of those who see it that way.

    But "art's goal" isn't even to be life-changing prose; for most art that isn't the goal at all. Most "classical" art was even seen as "just a craft".

    "Beauty" can manifest in many ways, and self-expression through code is a thing.
  • ignaloidas@not.acu.lt wrote (#143), replying to mnl@hachyderm.io:

    > @newhinton @david_chisnall @mjg59 @ignaloidas I'm not really following. Using an LLM doesn't erase my brain the minute I use it, nor is it a random number generator where you are forbidden to check the answers. These all hold for LLMs.

    @mnl@hachyderm.io @newhinton@troet.cafe @david_chisnall@infosec.exchange @mjg59@nondeterministic.computer the difference is that you can gain trust that some author knows their stuff in a specific field, and then you no longer need to cross-check every single thing that they write.

    With an LLM no such trust can be developed, because fundamentally it's just rolling dice over a modeled distribution; the fact that the LLM was right about something the previous 9 times has no influence on whether the next statement will be correct or wrong.

    It's these trust relationships that allow us to work efficiently: cross-checking everything every time is incredibly time-consuming.
  • mnl@hachyderm.io wrote (#144):

    @ignaloidas @mjg59 @david_chisnall @newhinton that's not how LLMs work, though; it being right 9 times out of 10 very much has an influence on whether the 10th time will be correct. That's literally how models are trained. There's an entire research field out there that studies it.
  • kyle@mastodon.kylerank.in wrote (#145):

    @mjg59 You will get backlash, but you are right.

    Free software folks will have to decide whether what they really wanted was for *everyone* to have the freedom to use and modify software, or only that subset of everyone who had the privilege of learning software development.

    There has always been this elitist dividing line in the community between people who contribute code and people who contribute all the other things FOSS needs to thrive. Now those people can contribute code too.
  • ced@mapstodon.space wrote (#146):

    @mnl @david_chisnall @mjg59 @ignaloidas I do use search engines, but if I don't recognize, or can't easily get context about, the sites listed, it's now nearly impossible to trust the results. It used to be possible (creating content was costly, so well-written content was usually the mark of someone at least a bit invested in the subject, though even then I used to cross-check several hits); it's not anymore.

    LLMs: without knowing the source of the answer, how could it be? 🤔 It's just plausible.
  • funhouseradio@mastodon.world wrote (#147):

    @mjg59 I'm all about running very outdated software on slightly less outdated hardware. That's the good stuff, bro.
  • ignaloidas@not.acu.lt wrote (#148):

    @mnl@hachyderm.io @mjg59@nondeterministic.computer @david_chisnall@infosec.exchange @newhinton@troet.cafe the training objective is not "be correct", so that's not what the models are trained on. They aren't trained on such an objective because there's no way to score it: if you had a system that could determine whether a statement was correct, you could just use that instead. No, what the models are trained on are globs of existing text, targeting the continuations to be the same as the text. Notably, most (all?) LLM makers don't even care whether most of that text is "correct" (in any sense of the word), and "solve" this by training on some more carefully selected globs of text. And in the end, what the model itself outputs are probabilities of a specific token (not even a sentence or something) being next. The text you get is all just dice rolls on those probabilities, again and again.

    It is a text prediction machine. A very powerful one, but it's just prediction. It picks whatever is likely, with no regard for what is correct.
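[Editor's note: the "dice rolls on those probabilities" step described above can be sketched in a few lines of Python. This is an illustrative toy, not any model's actual code; the function name and the `logits` dict shape are invented for the example.]

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Roll the dice on a softmax distribution to pick one token id.

    `logits` is a toy {token_id: score} map standing in for a model's
    output layer. The sampling step itself mirrors how LLM decoding
    works: likelihood decides the pick, not correctness.
    """
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    # Numerically stable softmax: subtract the max before exponentiating.
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    # Weighted random choice over the distribution.
    roll = random.random()
    cumulative = 0.0
    for tok, p in probs.items():
        cumulative += p
        if roll < cumulative:
            return tok
    return tok  # guard against floating-point rounding at the tail
```

Lowering `temperature` sharpens the distribution toward the most likely token, but it only changes how likelihood is weighted; nothing in the loop encodes a notion of truth.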
  • mnl@hachyderm.io wrote (#149):

    @ced @david_chisnall @mjg59 @ignaloidas which search engine do you use? I use @kagihq and it's always a pleasure.

    LLMs can provide information about sources. If they tell me that Shannon said x in his thesis on p.463, I can look it up. If they tell me that variable foo is on line X in file Y, I can easily verify it. If they think that Z compiles, I don't even need to cross-check that: the computer can do it for me. In fact, verifying certain assumptions about code might be the easiest of them all, which is why LLMs are quite effective at writing code.
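[Editor's note: the mechanical checks described above (does a claimed identifier sit on a claimed line, does a snippet even parse) are easy to automate. A minimal sketch; both function names are hypothetical, and "compiles" here is loosened to a Python syntax check via `ast.parse` rather than real compilation.]

```python
import ast

def claim_compiles(source: str) -> bool:
    """Verify an LLM's 'this compiles' claim by parsing the source.

    This only checks Python syntax, not semantics; for compiled
    languages you would invoke the real compiler instead.
    """
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

def claim_on_line(file_text: str, name: str, line_no: int) -> bool:
    """Check a claim like 'identifier `name` appears on line `line_no`'.

    `file_text` is the file's contents; line numbers are 1-based.
    """
    lines = file_text.splitlines()
    return 0 < line_no <= len(lines) and name in lines[line_no - 1]
```

The point of the sketch: claims of this shape are cheap to falsify automatically, unlike claims about prose.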
  • kaimac@sunny.garden wrote (#150), replying to mjg59@nondeterministic.computer:

    > Clearly my most unpopular thread ever, so let me add a clarification: submitting LLM-generated code you don't understand to an upstream project is absolute bullshit and you should never do that. Having an LLM turn an existing codebase into something that meets your local needs? Do it. The code may be awful, it may break stuff you don't care about, and that's what all my early patches to free software looked like. It's OK to solve your problem locally.

    @mjg59 telling people that they shouldn't care about the things they care about is generally unpopular, yes.
  • mnl@hachyderm.io wrote (#151):

    @ignaloidas @mjg59 @david_chisnall @newhinton that's also not how current LLMs work; there is a significant amount of post-training using RL being done, and that too is a whole field of research.

    Furthermore, current LLM-based tools usually do multiple rounds of inference interspersed with more traditional "tool calls" (or, as I prefer to call it, interpreting sampled tokens in a deterministic/formal manner).
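[Editor's note: the inference/tool-call interleaving described above can be sketched as a simple loop. Everything here is hypothetical scaffolding (the `agent_loop` name, the tuple protocol for model output); real agent frameworks differ in detail, but the shape is the same: sampled output is interpreted deterministically, and tool results are appended to the context for the next round of inference.]

```python
def agent_loop(model, tools, prompt, max_steps=5):
    """Interleave model inference with deterministic tool calls.

    `model` is any callable taking the context list and returning
    either ("tool", name, arg) or ("answer", text); `tools` maps
    tool names to ordinary deterministic functions.
    """
    context = [prompt]
    for _ in range(max_steps):
        kind, *rest = model(context)
        if kind == "answer":
            return rest[0]
        name, arg = rest
        # Sampled output interpreted formally: run the named tool and
        # feed its result back into the context for the next inference.
        context.append(f"{name}({arg!r}) -> {tools[name](arg)!r}")
    return None  # gave up after max_steps rounds
```

A stub `model` that requests one tool call and then echoes the result is enough to exercise the loop end to end.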
  • mutesplash@uncontrollablegas.com wrote (#152), replying to mjg59@nondeterministic.computer:

    > Personally I'm not going to literally copy code from a codebase under an incompatible license, because that is what the law says, but have I read proprietary code, learned the underlying creative aspect, and then written new code that embodies it? Yes! Anyone claiming otherwise is lying!

    @mjg59 Learning from and adapting ideas from unlicensed code into new code is an accommodation under law for humans. If you build a machine to do this at scale, however, that's a choice to leverage a humane decision into a profitable hack.
  • mnl@hachyderm.io wrote (#153):

    @ced @david_chisnall @mjg59 @ignaloidas @kagihq on the search engine thing, one reason I think they're usually more problematic to use is that there are actual incentives to make results worse. I switched to Kagi from Google/DuckDuckGo before ChatGPT because the results were already complete trash.

    Sure, I have to pay per search, but that's the only business model that at least enables non-gameable results.
  • ced@mapstodon.space wrote (#154):

    @mnl @david_chisnall @mjg59 @ignaloidas @kagihq sure, but if I have to check every sentence, because even if 99 of them are correct I can't trust that the 100th will be, doesn't that rather defeat the point? If I'm not reading a primary source, I have to be sure that I can trust the synthesis (at least to a point). With LLMs I can't.
  • ignaloidas@not.acu.lt wrote (#155):

    @mnl@hachyderm.io @mjg59@nondeterministic.computer @david_chisnall@infosec.exchange @newhinton@troet.cafe all of that training is still continuation-based, because that is what the models predict. Yes, there is a bunch of research, and honestly most of it is banging its head against fundamental issues of the model; but it is still being funded, because LLMs are, at the end of it all, quite useless if they spit out nonsense from time to time that is indistinguishable from the sensible stuff without carefully cross-checking it all.

    Tool calls are just that: tools to add stuff into the context for further prediction. They in no way ensure that the LLM output is correct, because once again everything is treated as a continuation after the tool call, and it's just predicting what's the most likely thing to do, not what's the correct thing to do.
  • boydstephensmithjr@hachyderm.io wrote (#156), replying to mjg59@nondeterministic.computer:

    > When I write code I am turning a creative idea into a mechanical embodiment of that idea. I am not creating beauty. Every line of code I write is a copy of another line of code I've read somewhere before, lightly modified to meet my needs. My code is not intended to evoke emotion. It does not change how people think about the world. The idea→code pipeline in my head is not obviously distinguishable from the prompt→code process in an LLM.

    @mjg59 When *I* code, I am creating beauty, or at least trying to.

    I hope each proof/program I write is as close as possible to the proof from "the book": at a Pareto optimum of simplicity and elegance.
  • mnl@hachyderm.io wrote (#157):

    @ced I just read the primary source when I think it's useful to do so.
  • mnl@hachyderm.io wrote (#158):

    @ignaloidas @mjg59 @david_chisnall @newhinton do you blindly trust code just because it's been written by a human? Or your own code, for that matter? I don't, and yet I am able to produce hopefully useful software. In fact I have to trust an immense amount of software without verifying it, based on vibes. For LLMs at least I can benchmark the vibes, or at least gather empirical observations more easily than with humans.
                                          Powered by NodeBB Contributors
                                          Graciously hosted by data.coop