👀 … https://sfconservancy.org/blog/2026/apr/15/eternal-november-generative-ai-llm/ …my colleague Denver Gingerich writes: newcomers' extensive reliance on LLM-backed generative AI is comparable to the Eternal September onslaught to USENET in 1993.
-
If there was ever a time in 40+ years of #FOSS history to tell our #copyleft-hating FOSS friends that they erred in their license choices, now is the time.
If they don't switch, they're giving hand-outs to the proprietary software companies. Now, in an entirely new & #disturbing way.
I really think these cases where proprietary software ends up in #LLM training sets & actually creates risk are exceedingly rare, if not entirely hypothetical.
@bkuhn @cwebber @bwana @zacchiro @richardfontana Permissive licenses except extreme ones like maybe 0BSD don't allow LLM training or regurgitation either. They almost universally require copyright notice & attribution which slop can't do.
-
@bkuhn @cwebber @zacchiro @richardfontana We don't see any successful litigation yet stopping abusive "AI" companies from scooping up any free software, regardless of the strength of the license. It's a matter of power imbalance and who controls the legal system, not who's right or wrong. I don't think stronger license terms will save us. Only organizing to bring them down can do that.
-
@bkuhn @cwebber @bwana @zacchiro @richardfontana Permissive licenses except extreme ones like maybe 0BSD don't allow LLM training or regurgitation either. They almost universally require copyright notice & attribution which slop can't do.
Have you seen attribution lawsuits anywhere in the 50 years of the BSD license?
Of course, you have a point. But we're facing possible compulsory licensing if we don't improve our strategy. We may do the wrong thing, & if a BSD-licensed project wants to burn it down & sue a copylefted project for non-attribution infringement, let's deal with that when it comes.
& I'm speaking as one of the few who has actually been threatened with such a lawsuit.
-
@bkuhn @cwebber @zacchiro @richardfontana We don't see any successful litigation yet stopping abusive "AI" companies from scooping up any free software, regardless of the strength of the license. It's a matter of power imbalance and who controls the legal system, not who's right or wrong. I don't think stronger license terms will save us. Only organizing to bring them down can do that.
@dalias
Any complex public policy litigation takes ≥ 10 years to make its way through the Courts. *Thaler* was an anomaly precisely because Thaler narrowed the issue on purpose to a pointless degree (just to make a point, apparently).
The reason copyleft exists is that, rather than wait for the Courts & legislature to make good law, we used the existing system against itself. I propose we design a series of moves that can do the same for LLM-backed genAI.
-
@bkuhn @cwebber @zacchiro @richardfontana At the time we used the existing system against itself, it was largely a working system. Corporations actually feared "IP" law. They put © and ® and ™ all over the place meticulously. Companies had lost theirs by not doing everything by the letter of the law in the past. Nowadays, we hardly have a rule of law to begin with, and corporations just get to settle any infringement they're caught doing for 0.01% of the profit they made off the infringement. I don't see a way to use this system against itself. It's going to take more drastic things to tear it down.
-
I disagree: the system never “worked”. It was always stacked against users' rights & freedom to share.
As I have said elsewhere in this mega-thread: the Big Tech vs. Big Content battle in the Courts has so many moving parts that we'll never predict the outcome quite right. FOSS is adrift on that sea, & we have to do the best we can now with the information we have.
As I keep saying, the boycotts have not solved the problem.
What's next?
-
@RichardJActon
The copyleft-ish hack I propose is *we* (FOSS community) assume that any output of an LLM-backed genAI system *is* copylefted (since we are pretty sure all such systems — at least those designed for software development assist — have been trained on copylefted codebases).
Then, we copyleft any work that comes out of the system.
The only threat is proprietary software in the training set, & the industry can't abide enforcing *that*!
@cwebber @ossguy @richardfontana
@evan
@kees
-
@novalis
I agree with your supporting arguments but not the conclusion. It goes back to the mutually assured destruction idea: no one in the for-profit proprietary software industry is going to bring a lawsuit because they are so invested in LLM-backed AI succeeding.
That's where our commons differs widely from other creative works of expression.
I am worried about compulsory licensing for *training* (it could be a disaster), but that is unrelated to output.
@bkuhn @zacchiro @cwebber @ossguy @richardfontana Actually, I just thought of one proprietary software company that would be much happier not to have LLMs around: Salesforce. Nobody's going to buy their overpriced shit when the alternative is to vibecode something that works exactly with your business process and that you can change, yourself, any time you want at the cost of a couple hundred bucks of Claude and a few hours of work.
-
If there was ever a time in 40+ years of #FOSS history to tell our #copyleft-hating FOSS friends that they erred in their license choices, now is the time.
If they don't switch, they're giving hand-outs to the proprietary software companies. Now, in an entirely new & #disturbing way.
I really think these cases where proprietary software ends up in #LLM training sets & actually creates risk are exceedingly rare, if not entirely hypothetical.
@bkuhn This might be of interest, https://en.wikipedia.org/wiki/Open_source_license_litigation.
-
Re: “polluting”, my reply is: https://fedi.copyleft.org/@bkuhn/116426437134023846 (elsewhere in thread).
Re: “copyleft-only #LLM”: I didn't propose that. I proposed copylefting the human-modified output of LLMs.
Re: “two scenarios”: IMO you propose a false dichotomy.
I hope you come to one of #SFC's public sessions on this, as I'd be glad to talk more about it, & this discussion doesn't lend itself to online debate because it's so complex.
> Re: “copyleft-only #LLM”: I didn't propose that. I proposed copylefting the human-modified output of LLMs.
You didn't propose it, but @ossguy brought it up here: https://fedi.copyleft.org/@ossguy/116411885602822736
You two have been speaking fairly collaboratively in this thread, so I'm assuming you're relatively synced at the moment. Since I presume the goal of "only train on copylefted [and implied, compatible] software" would not be, from your end, to erase copyleft, my assumption here is that the hope was that the output would also be copylefted.
And I presume reading this that you do hope that would be the presumption: https://fedi.copyleft.org/@bkuhn/116428083639528264
-
Have you seen attribution lawsuits anywhere in the 50 years of the BSD license?
Of course, you have a point. But we're facing possible compulsory licensing if we don't improve our strategy. We may do the wrong thing, & if a BSD-licensed project wants to burn it down & sue a copylefted project for non-attribution infringement, let's deal with that when it comes.
& I'm speaking as one of the few who has actually been threatened with such a lawsuit.
@bkuhn @dalias @zacchiro @richardfontana I don't have them on hand, but I remember attribution enforcement with CC BY being pretty common, particularly making newspapers update their use of "stock photo" style headers with CC BY licensed photos. I don't remember, offhand, enforcement escalating to a court case (I feel like there may have been one), but usually when attribution requirements are pointed out, people simply comply pretty quickly, since it's not hard to comply with and rarely against their interests. LLMs are a change there, since there's not really a way for them to do so.
-
@bkuhn @dalias @zacchiro @richardfontana (As in, I remember that being common with CC BY and newspapers, when I was working at Creative Commons)
-
@bkuhn This might be of interest, https://en.wikipedia.org/wiki/Open_source_license_litigation.
I assure you there are few people more familiar with those cases than @bkuhn, even if the two of us are disagreeing right now
-
However, it's not actually the laundering angle I am concerned with here entirely, it's whether we're turning FOSS codebases into potential legal toxic waste dumps that we will have a hell of a time cleaning up later.
The previous Conservancy post, which @bkuhn linked upthread, indicates that Conservancy does indeed consider the matter unsettled.
Current LLMs wouldn't "default to copyleft", since they also include all-rights-reserved material mixed in there. If the output of these systems is a slurry of inputs which carry their licensing somehow, their default output licensing situation is one of a hazard.
I note that @bkuhn and @ossguy seem to be hinting at hoping a "copyleft based LLM" with all-copyleft output is a winning scenario. I'm going to state plainly: I believe that's an impossible outcome.
@cwebber @bkuhn @ossguy @richardfontana On top of "potential legal toxic waste dumps" we'd be making them known technical toxic waste dumps.

-
@dalias
Any complex public policy litigation takes ≥ 10 years to make its way through the Courts. *Thaler* was an anomaly precisely because Thaler narrowed the issue on purpose to a pointless degree (just to make a point, apparently).
The reason copyleft exists is that, rather than wait for the Courts & legislature to make good law, we used the existing system against itself. I propose we design a series of moves that can do the same for LLM-backed genAI.
@bkuhn @dalias @cwebber @zacchiro @richardfontana
What might those moves look like?
-
I assure you there are few people more familiar with those cases than @bkuhn, even if the two of us are disagreeing right now
it is a bit fun to read that and note I was involved in some way with more than I wasn't. Listing old patent cases is really tangential though.
-
It's solved with a new copyleft-next clause I have not pitched you yet.
Remember how I keep telling you we need to talk every week?

-
@bkuhn @dalias @cwebber @zacchiro @richardfontana
What might those moves look like?
I'm working on it. It will require good, disciplined behavior by key FOSS contributors and a better copyleft license.
@dalias @cwebber @zacchiro @richardfontana
-