FARVEL BIG TECH
👀 … https://sfconservancy.org/blog/2026/apr/15/eternal-november-generative-ai-llm/ … my colleague Denver Gingerich writes: newcomers' extensive reliance on LLM-backed generative AI is comparable to the Eternal September onslaught on USENET in 1993.

Uncategorized · llmopensource
310 posts · 57 posters · 0 views
This thread has been deleted. Only users with topic management privileges can see it.
  • jens@social.finkhaeuser.de:

    @cwebber @bkuhn @ossguy @richardfontana Worse IMHO is that we're putting FOSS as a movement at risk if we deskill everyone to the point where you either pay money to have code generated for you, or there is no code.

    cwebber@social.coop (#188):

    @jens @bkuhn @ossguy @richardfontana This is indeed a serious risk, though tangential to this subthread. But it's a concern I also have.

    • cwebber@social.coop:

      @trwnh @bkuhn @ossguy @richardfontana Plenty of Microsoft code has been released under "shared source" licenses and also leaks

      trwnh@mastodon.social (#189):

      @cwebber @bkuhn @ossguy @richardfontana sure, but my point is this would happen less often

      • jens@social.finkhaeuser.de (#190):

        @cwebber @bkuhn @ossguy @richardfontana Fully tangential, agreed.

        • cwebber@social.coop:

          @bkuhn @ossguy @richardfontana So let me summarize:

          - Without knowing the legal status of accepting LLM contributions, we're potentially polluting our codebases with stuff that we are going to have a HELL of a time cleaning up later
          - The idea of a copyleft-only LLM is a joke and we should not rely on it
          - We really only have two realistic scenarios: either FOSS projects cannot accept LLM based contributions legally from an international perspective, or everything is effectively in the public domain as outputted from these machines, but at least in the latter scenario we get to weaken copyright for everyone.

          That's leaving out a lot of other considerations about LLMs and the ethics of using them, which I think most of the other replies were focused on, I largely focused on the copyright implications aspects in this subthread. Because yes, I agree, it can be important to focus a conversation.

          But we can't ignore this right now.

          We're putting FOSS codebases at risk.

          richardfontana@mastodon.social (#191):

          @cwebber copyleft-only LLM is nonsensical, agreed @bkuhn @ossguy

          • cwebber@social.coop (#192):

            @richardfontana @bkuhn @ossguy Glad to hear we agree there!

            • fuzzychef@m6n.io (#193):

              @cwebber @bkuhn @ossguy @richardfontana

              Based on my following of current legal cases, I think it's entirely possible that in a year or two we'll suddenly be rolling large OSS codebases back to 2023. And won't that be fun!

              • cwebber@social.coop:

                However, it's not actually the laundering angle I am concerned with here entirely, it's whether we're turning FOSS codebases into potential legal toxic waste dumps that we will have a hell of a time cleaning up later.

                The previous Conservancy post, which @bkuhn linked upthread, indicates that Conservancy does indeed consider the matter unsettled.

                Current LLMs wouldn't "default to copyleft", since they also include all-rights-reserved code mixed in there. If the output of these systems is a slurry of inputs which somehow carry their licensing, the default licensing status of that output is a hazard.

                I note that @bkuhn and @ossguy seem to be hinting at hoping a "copyleft based LLM" with all-copyleft output is a winning scenario. I'm going to state plainly: I believe that's an impossible outcome.

                @richardfontana

                evan@cosocial.ca (#194):

                @cwebber

                Are you concerned that the LLMs generate nontrivial verbatim excerpts of copyrighted works?

                Or that there is a hidden "intellectual property" in the deep patterns that they use?

                Say, when an LLM was trained on a file I made with an interesting loop structure, and it emits code with a similar loop structure, even if the variable names, problem domain, details, or programming language differ.

                What if a court says I can demand royalties for my "IP"?

                @bkuhn @ossguy @richardfontana

                • richardfontana@mastodon.social (#195):

                  @cwebber I mean, as a practical idea worth contemplating. Could imagine it as an experiment by someone with sufficient resources. There were some highly ill-conceived efforts to create anti-copyleft models a few years ago @bkuhn @ossguy

                    • evan@cosocial.ca (#196):

                    @cwebber @bkuhn @ossguy @richardfontana

                    Like, not copyrightable, not patents, but some secret third thing, kind of what people mean when we say that someone "copied our idea".

                      • cwebber@social.coop (#197):

                      @evan @richardfontana I am saying we don't know the answer to that question, and it seems from previous posts that @bkuhn and @ossguy agree we don't know the answer to it. That lack of knowledge about the copyright implications of LLM-based contributions means we are creating a Schrödinger's licensing time bomb for our FOSS codebases.

                        • cwebber@social.coop (#198):

                        @evan @bkuhn @ossguy @richardfontana I am talking about copyright

                          • evan@cosocial.ca (#199):

                          @cwebber excellent, thanks!

                          @bkuhn @ossguy @richardfontana

                            • cwebber@social.coop (#200):

                            @evan @bkuhn @ossguy @richardfontana Say for a moment that we *did* make a model which intentionally pulled in leaked source code from various proprietary codebases.

                            What would your opinion be on the legal-hazard state of accepting that code output? Would you consider it relatively safe from a copyright perspective?

                              • cwebber@social.coop:

                              @bkuhn @ossguy @richardfontana Except, I actually believe this scenario isn't legally viable. And it's easier to understand if we scale back to the middle case.

                              Let's now look at the LLM trained on CC0 and CC BY. Because it's the BY aspect that makes everything complicated.

                              There is *NO WAY*, in current LLM technology (nor, I believe from studying how neural networks work, in any viable computationally performant LLM), that provenance can be tracked. The BY clause cannot be upheld.

                              This isn't a theoretical concern for me; someone built another vibecoded Scheme-to-WASM-GC compiler that looks an awful lot like Spritely's own Hoot compiler in places. They didn't attribute us. They probably didn't know. But like many FOSS licenses, Apache v2 does require certain levels of attribution to be upheld. Most FOSS projects do.

                              You can't uphold the CC BY requirement, as far as I can tell.

                               richardfontana@mastodon.social (#201):

                              @cwebber I think adequate compliance might be possible with good enough detection/matching tools but I don't necessarily expect such tools to be developed (let alone available to foss projects) (my assumption is that the few such tools in use today are pretty bad) @bkuhn @ossguy

                                • cwebber@social.coop (#202):

                                @richardfontana @bkuhn @ossguy That's a problem so hard it throws the "NP complete" debate out the window in favor of something brand new. Given that these models have no trouble "translating" from one language's source code into another, how on *earth* could you possibly hope to build a compliance tool around that?

                                Laughable, to anyone who tries.
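The cross-language objection in this exchange can be made concrete. A minimal sketch, with hypothetical snippets written for illustration (not taken from any real codebase or compliance tool): token-shingle matching, the basis of many code-similarity detectors, finds essentially no overlap between a function and a faithful translation of it into another language, even though the logic is identical.

```python
import re

# Hypothetical example: the same sum-of-squares logic in Python and Scheme.
python_src = """
def total(xs):
    acc = 0
    for x in xs:
        acc += x * x
    return acc
"""

scheme_src = """
(define (total xs)
  (fold-left (lambda (acc x) (+ acc (* x x))) 0 xs))
"""

def shingles(src, k=3):
    """Return the set of k-grams over a crude tokenization of the source."""
    toks = re.findall(r"[A-Za-z_]+|\d+|[^\sA-Za-z_\d]", src)
    return {tuple(toks[i:i + k]) for i in range(len(toks) - k + 1)}

a, b = shingles(python_src), shingles(scheme_src)
jaccard = len(a & b) / len(a | b)  # textual similarity, 0.0 to 1.0
print(f"Jaccard similarity: {jaccard:.2f}")  # near zero despite identical logic
```

Real provenance tools use more robust fingerprints than this toy, but they remain fundamentally textual, which is the weakness being argued here: translation across languages destroys the surface forms such matching depends on.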

                                  • richardfontana@mastodon.social (#203):

                                  @cwebber to be clear, compliance cannot somehow be built into the LLM, for the reasons you stated, but ancillary tools for LLM users to reconstruct provenance exist and conceivably could be made more useful @bkuhn @ossguy

                                    • cwebber@social.coop:

                                    @bkuhn @ossguy @richardfontana So the question is: is it safe, from a legal perspective, given the current state of uncertainty of copyright of such contributions, to encourage accepting such contributions into repositories?

                                    Now clearly, many projects are: the Linux kernel most famously is, and their recent policy document says effectively, "You can contribute AI generated code, but the onus is on you whether or not you legally could have".

                                     Which is not a very helpful handwave, I would say, since few contributors are equipped to assess such a thing. I've left myself and three others addressed in this portion of the thread, and all of us *have* done licensing work, and my suspicion, *especially* based on what's been written, is that none of us could confidently project where things are going to go.

                                     zacchiro@mastodon.xyz (#204):

                                    @cwebber @bkuhn @ossguy @richardfontana

                                     My current answer to your "is it safe" question is to answer a slightly different question. Namely: "is it any less safe than accepting code from a random employee who claims to be submitting under an inbound=outbound regime, whereas in fact they cannot?" The latter we have been doing for decades, with limited damage to the commons.

                                    (I *also* think the legal odds are more in our favor with AI-assisted contributions than in the previous case.)

                                        • cwebber@social.coop (#205):

                                      @zacchiro @bkuhn @ossguy @richardfontana While true, there is a big difference in that the previous scenario was someone out of compliance with what the community actually accepted as hygienic and acceptable contributions, and those contributions were relatively rare.

                                       Saying that we don't need to worry about the risks from these tools right now from a licensing standpoint is different: it's advising that a path is acceptable when we *don't know* whether it's generally safe practice to recommend! And most in this thread seem to agree we don't know. Even your post seems to say "it seems like it'll probably be okay and end up in our favor".

                                      I guess I feel increasingly like I am maybe the only "oldschool FOSS licensing wonk" who cares about this, and maybe that means I should just give up.

                                       But *damn*, I can't believe that at the same time people are saying "we don't know what the implications will be", we're also saying "so go ahead and say those patches are a-ok!"

                                          • cwebber@social.coop (#206):

                                        @richardfontana As said here, given the "translation between languages" aspect, I can't really see that as likely to be true https://social.coop/@cwebber/116426770262334234

                                         Which maybe means that all this stuff really is public domain, a position I am *fully willing to accept*! But I don't think it's known (especially internationally), and I don't think @bkuhn or @ossguy are eager to adopt that perspective.

                                            • evan@cosocial.ca (#207):

                                          @cwebber

                                          This is probably a healthy concern.

                                          I think there might be some good ways to hedge one's bets, though.

                                          Use LLMs for rubber ducking, code scanning and review, rather than code generation.

                                          Keep LLM code contributions minimal and unremarkable, too.

                                          Don't make them load-bearing. If the code is central to the program, it's too unique.

                                          @richardfontana @bkuhn @ossguy

Powered by NodeBB · Graciously hosted by data.coop