FARVEL BIG TECH

👀 … https://sfconservancy.org/blog/2026/apr/15/eternal-november-generative-ai-llm/ …my colleague Denver Gingerich writes: newcomers' extensive reliance on LLM-backed generative AI is comparable to the Eternal September onslaught to USENET in 1993.

llm opensource
310 posts, 57 posters, 0 views
  • richardjacton@fosstodon.org

    @cwebber @bkuhn @ossguy @richardfontana Under this view it doesn't matter how the training data was licensed, as it's a fair use defense. The outputs being uncopyrightable / effectively public domain allows people to claim they wrote it when it's convenient and they want to be able to copyright it, as it's hard to prove whether it was AI-generated or human-authored. And simultaneously to claim that it was the output of an LLM when they want to strip inconvenient licensing terms.

    bkuhn@fedi.copyleft.org wrote (#265):

    @RichardJActon
    The copyleft-ish hack I propose is *we* (FOSS community) assume that any output of an LLM-backed genAI system *is* copylefted (since we are pretty sure all such systems — at least those designed for software development assist — have been trained on copylefted codebases).
    Then, we copyleft any work that comes out of the system.
    The only threat is proprietary software in the training set, & the industry can't abide enforcing *that*!
    @cwebber @ossguy @richardfontana
    @evan
    @kees

    • cwebber@social.coop

      @bkuhn @evan @richardfontana @ossguy Probably a ton of people here think I am anti-AI-output, and that I would be upset to find out that the chardet rewrite were legal.

      Actually, I'm not! I'd be fine with the ability to copyright launder software to some degree, as long as we could do the same for proprietary software (including in binary form).

      I'm concerned about whether or not we have an *equitable* situation, though. And I'm *more concerned* that we need to advise people who are incorporating code *today*.

      bkuhn@fedi.copyleft.org wrote (#266):

      @cwebber

      We already know the situation isn't equitable & probably won't become such in our lifetimes. Microsoft already all-but-admitted they will never train Copilot on their code. No proprietary software company is going to offer training data back to other vendors.

      The goal here obviously was to LLM-wash away copyleft. *That* we must resist, and use their own tools against them: which is the very spirit that made copyleft in the first place!

      Cc: @evan @richardfontana @ossguy @kees
      @karen

      • evan@cosocial.ca

        I consider myself an expert on this process since I learned about it 45 minutes ago, but it seems like AFC follows the hierarchical layers of modern programming-in-the-large -- statements, functions, modules, packages, program. That is the stuff that LLMs handle pretty well.

        @richardfontana @cwebber @bkuhn @ossguy

        bkuhn@fedi.copyleft.org wrote (#267):

        @evan wrote:

        > “I consider myself an expert on this process since I learned about it 45 minutes ago”

        This is the second time you've made me 🤣 in this thread. Thanks for being comic relief (and I know that's not *all* you're doing, but that part is particularly helpful). Thank you!

        Cc:
        @richardfontana @cwebber @ossguy
        @karen

        • sfoskett@techfieldday.net

          @richardfontana @evan @cwebber @bkuhn @ossguy I feel like it’s 3 questions for the court:
          1. Can a non-human actor produce a copyrightable work? Likely no.
          2. Is the human prompt and review enough to apply copyright to LLM content? Maybe?
          3. Does this have implications for open source? I guess not.

          bkuhn@fedi.copyleft.org wrote (#268):

          @sfoskett

          *Thaler* is limited to the DC Circuit & very narrow. It's a registration question, & even *its* dicta hints there is no way we can know the answer on (1).

          I think (2) is a strong argument.

          As for (3), there is huge value to be extracted by applying copyleft-ish principles (and copyleft licenses themselves) to LLM-backed genAI output.

          In the worst case: a big complex mix of public domain + copylefted-human-authored stuff can't easily be separated.

          @richardfontana @evan @cwebber @ossguy

          • wwahammy@social.treehouse.systems

            @ossguy @cwebber @LordCaramac @bkuhn @richardfontana proprietary software companies extensively use GitHub and yet SFC's position is "don't use GitHub".

            There are so many things we do in free software and in the interactions with SFC and FSF that would be simpler if we used proprietary software. How many janky experiences have people been asked to tolerate to participate? Why shouldn't we use proprietary software there?

            bkuhn@fedi.copyleft.org wrote (#269):

            @wwahammy

            Indeed, SFC's position is #GiveUpGithub, but N.B. the https://giveupgithub.com/ site itself admits most people will use it & suggests a “using Github under protest” README.md.

            I use proprietary software every day. I've been convinced for ≥ 10yrs: one can't succeed in an industrialized nation at *anything* w/out sometimes doing so.

            The difficulty is figuring out when to compromise. I remain open-minded.
            Few of us will be FOSS monks.

            @ossguy @cwebber @LordCaramac @richardfontana

            • richardfontana@mastodon.social

              @evan oh I mean of course you could use LLMs to help with the analysis @cwebber @bkuhn @ossguy

              bkuhn@fedi.copyleft.org wrote (#270):

              @richardfontana wrote:
              > “oh I mean of course you could use LLMs to help with the analysis”

              I'm catching up backwards on this thread, but do you see now the monster you created by telling @evan that?

              🤣

              cc: @cwebber @ossguy @karen

              • evan@cosocial.ca

                @richardfontana @cwebber @bkuhn @ossguy Yeah, I thought my job couldn't be automated, either, and yet here we are.

                bkuhn@fedi.copyleft.org wrote (#271):

                LLM-backed genAI never makes as good jokes as you do, @evan

                But are you finally coming clean with us here today that, in fact, #EvanPoll's are all created by a genAI system?

                Cc: @richardfontana @cwebber @ossguy @karen

                • sfoskett@techfieldday.net

                  @evan @cwebber @bkuhn @ossguy @richardfontana Another major concern is that works generated by AI are not copyrightable per the US Supreme Court. So code generated by an LLM can not be licensed at all, open or closed. https://www.reuters.com/legal/government/us-supreme-court-declines-hear-dispute-over-copyrights-ai-generated-material-2026-03-02/

                  bkuhn@fedi.copyleft.org wrote (#272):

                  @sfoskett
                  I responded in detail to your later conclusions in another post, but the underlying assumption is wrong too. It's just pure FUD to say: “works generated by AI are not copyrightable per the US Supreme Court”.
                  https://sfconservancy.org/blog/2026/mar/04/scotus-deny-cert-dc-circuit-thaler-appeal-llm-ai/
                  TL;DR: *DC Circuit* held that a specific copyright registration *for a digital painting* that lists a computer program as the sole author is not eligible *at this time* for copyright *registration*. SCOTUS decided to not hear the case.

                  @evan @cwebber @richardfontana

                  • richardfontana@mastodon.social

                    @cwebber I think adequate compliance might be possible with good enough detection/matching tools but I don't necessarily expect such tools to be developed (let alone available to foss projects) (my assumption is that the few such tools in use today are pretty bad) @bkuhn @ossguy

                    bkuhn@fedi.copyleft.org wrote (#273):

                    @richardfontana

                    I'm with @cwebber, there is no way to automate compliance. But, again, we should use that to our advantage in a copyleft-ish way.

                    Cc: @ossguy

                      evan@cosocial.ca wrote (#274):

                      @bkuhn @richardfontana @cwebber @ossguy @karen thanks! I hope I wasn't too flip.

                      • cwebber@social.coop

                        @bkuhn @ossguy @richardfontana So let me summarize:

                        - Without knowing the legal status of accepting LLM contributions, we're potentially polluting our codebases with stuff that we are going to have a HELL of a time cleaning up later
                        - The idea of a copyleft-only LLM is a joke and we should not rely on it
                        - We really only have two realistic scenarios: either FOSS projects cannot accept LLM based contributions legally from an international perspective, or everything is effectively in the public domain as outputted from these machines, but at least in the latter scenario we get to weaken copyright for everyone.

                        That's leaving out a lot of other considerations about LLMs and the ethics of using them, which I think most of the other replies were focused on; I largely focused on the copyright implications in this subthread. Because yes, I agree, it can be important to focus a conversation.

                        But we can't ignore this right now.

                        We're putting FOSS codebases at risk.

                        larsmb@mastodon.online wrote (#275):

                        @cwebber @bkuhn @ossguy @richardfontana FWIW, I'd be delighted to read this as a blog post.

                        I'm still baffled that chardet just sidestepped this via 0BSD, sort of.

                        A thought that recently struck me: if code is essentially impossible to license now, will we see a resurgence in other forms of IP, like ... software patents?

                        Those *would* be defensible post-laundering ...

                          evan@cosocial.ca wrote (#276):

                          @bkuhn @richardfontana @cwebber @ossguy @karen sadly no!

                          I really don't like having anyone, including AI systems, write for me under my own name. Not least because I don't like the style and tone of ChatGPT and friends. They just write very blandly.

                            sfoskett@techfieldday.net wrote (#277):

                            @bkuhn @richardfontana @evan @cwebber @ossguy Wow, I really appreciate you weighing in here! I was thinking Naruto v. Slater for point one, not just Thaler, but I certainly defer to your expertise, especially on point 3.

                              evan@cosocial.ca wrote (#278):

                              @bkuhn @richardfontana @cwebber @ossguy @karen hahahaha sorry!

                              It wasn't till I had gone through the exercise that I realized I was doing work in a similar vein that you'd already committed to do. I hope it wasn't too monstrous.

                              • cwebber@social.coop

                                @evan @bkuhn @ossguy @richardfontana Say for a moment that we *did* make a model which intentionally pulled in leaked source code from various proprietary codebases.

                                What would your opinion be on the legal-hazard state of accepting that code output? Would you consider it relatively safe from a copyright perspective?

                                bkuhn@fedi.copyleft.org wrote (#279):

                                @cwebber

                                Wow, 2ⁿᵈ time in 2 days that I can work in quotes from ST:TNG, “Unification” (S05E07-8)!
                                To quote the Ferengi, Omag¹:
                                > Omag: “Hypothetically speaking?”
                                > Riker: “Yes.”
                                > Omag: “I never learned to speak hypothetical.”

                                IOW, E_TOO_MANY_NON_HYPOTHETICAL_PROBLEMS_WITH_AI

                                ¹ I had to look up Omag's name — my ST:TNG knowledge is not *that* encyclopedic. But see image: Google's G-E-H-munyae can't tell Klingons from Ferengi.

                                Cc: @evan @richardfontana @karen

                                #StarTrek #AI #LLM #Gemini

                                • cwebber@social.coop

                                  @evan @richardfontana I am saying we don't know the answer to that question, and it seems that @bkuhn and @ossguy agree that we don't know the answer to it, based on previous posts. The lack of knowledge about the copyright implications of LLM-based contributions means that we are creating a Schrödinger's licensing timebomb for our FOSS codebases.

                                  bkuhn@fedi.copyleft.org wrote (#280):

                                  @cwebber

                                  I don't see a plausible path where the timebomb exists: (a) likely none of these proprietary LLM-backed genAI systems are *trained* on proprietary software, & (b) even if they *are*, the proprietary industry as a whole seems very much to *want* to maintain this absolute fiction that these systems are magically always public domain, & that, if not, a fair use defense always works.

                                  We meanwhile use copyleft-ish strategies to beat them at their own game.

                                  Cc: @evan @richardfontana
                                  @zacchiro

                                  • bkuhn@fedi.copyleft.org

                                    @jedbrown

                                    A case from 2022 still not at trial in 2026 doesn't indicate unreasonable or manipulative delay by Defendants. Such cases really do take that long.

                                    Also, Doe vs. Microsoft's Github is a terribly constructed case and actually pushes us toward compulsory licensing of #FOSS works for #LLM-backed gen-#AI training— since the Plaintiff's lawyers in that case are clearly chasing their own avarice, not software freedom.

                                    Background:
                                    https://sfconservancy.org/news/2022/nov/04/class-action-lawsuit-filing-copilot/

                                    @cwebber @ossguy @richardfontana

                                    jedbrown@hachyderm.io wrote (#281):

                                    @bkuhn
                                    I had browsed the docket, but you are right that it is not for me to say whether motions are a delay, and plaintiffs also do not seem to be in a rush (e.g., joint motion to postpone deadlines). The point is that we don't know how such litigation will play out, especially in light of the volatility of public sentiment about this industry.

                                    Has anyone written an analysis of how their case pushes toward compulsory licensing?

                                    If LLM outputs routinely constitute derivative works, then it is impossible to comply with licenses (even permissive ones) without acknowledging all such training data and/or constant open-ended research quests as due diligence that each response does not infringe an unknown corpus. The companies don't want to disclose their corpus because their business relies on not acknowledging the derivative relation.

                                    @cwebber @ossguy @richardfontana

                                    • fuzzychef@m6n.io

                                      @cwebber @bkuhn @ossguy @richardfontana

                                      Based on my following of current legal cases, I think it's entirely possible that in a year or two we'll suddenly be rolling large OSS codebases back to 2023. And won't that be fun!

                                      bkuhn@fedi.copyleft.org wrote (#282):

                                      @fuzzychef

                                      Can you please cite the actual precedent?

                                      If it's ongoing, yet-undecided cases you mean, which of the 100s of cases do you mean, what rulings have occurred that lead you to this speculation, and why?

                                      I know you didn't mean to, but your post just feeds the FUD monsters.

                                      Cc: @cwebber @ossguy @richardfontana
                                      @evan

                                      • jens@social.finkhaeuser.de

                                        @cwebber @bkuhn @ossguy @richardfontana Worse IMHO is that we're putting FOSS as a movement at risk if we deskill everyone to the point where you either pay money to have code generated for you, or there is no code.

                                        bkuhn@fedi.copyleft.org wrote (#283):

                                        @jens wrote:
                                        > “we're putting FOSS as a movement at risk if we deskill everyone to the point where you either pay money to have code generated for you, or there is no code.”

                                        I agree completely. We *need* to encourage extreme discipline if LLM-backed genAI systems are used for software, to ensure that: (a) experienced developers' skills don't atrophy, and (b) new developers understand these tools aren't for neophytes b/c they lead newbies far astray.

                                        Cc: @cwebber @ossguy @richardfontana

                                          trashheap@tech.lgbt wrote (#284):

                                          @bkuhn @cwebber @evan @richardfontana @zacchiro Even if attribution issues disappear, surely it's a time bomb for projects that are intentionally not using copyleft licenses, or that use incompatible licenses?
