@evan @cwebber @bkuhn @ossguy @richardfontana Another major concern is that works generated by AI are not copyrightable per the US Supreme Court. So code generated by an LLM cannot be licensed at all, open or closed. https://www.reuters.com/legal/government/us-supreme-court-declines-hear-dispute-over-copyrights-ai-generated-material-2026-03-02/
@sfoskett @evan @bkuhn @ossguy @richardfontana That outcome I am not worried about; code that's not copyrightable is considered to be in the public domain within the US, which means there aren't any real risks to incorporating it into FOSS projects. But the Supreme Court punted on it; they didn't rule that way.
-
@sfoskett you can incorporate public domain code into a licensed work.
-
This is a really interesting question! TIL about CA vs. Altai and the abstraction-filtration-comparison test.
I'm not sure how automatable it is. Interesting to try though!
-
… https://sfconservancy.org/blog/2026/apr/15/eternal-november-generative-ai-llm/ …my colleague Denver Gingerich writes: newcomers' extensive reliance on LLM-backed generative AI is comparable to the Eternal September onslaught to USENET in 1993. I was on USENET extensively then; I confirm the disruption was indeed similar. I urge you to read his essay, think about it, & join Denver, me, & others at the following datetimes…
$ date -d '2026-04-21 15:00 UTC'
$ date -d '2026-04-28 23:00 UTC'
…in https://bbb-new.sfconservancy.org/rooms/welcome-llm-gen-ai-users-to-foss/join
#AI #LLM #OpenSource
Sorry for interfering in the discussion out of the blue, but the topic is really interesting. I really hope that the conclusion of this will not be engineers saying they are not lawyers, and lawyers saying that it's for the courts to decide.
-
To chip in, a policy under which AI-generated code is completely unacceptable is hard, if not impossible, to enforce. It also puts a lot of pressure on reviewers, who have the difficult job of determining whether a piece of code is AI-generated. Sometimes that's easy; sometimes it's impossible under the conditions they operate in. If the code is good, it should be accepted.
-
The practical issue is that reviewers risk facing large numbers of PRs written completely with LLMs, or even by LLMs, under the name of the human who uses them. That creates enormous risk for code quality. A practical way of handling this is probably to accept PRs only from community members who have been validated with a status of “valid contributor”, basically making it more official.
-
This would put the responsibility on contributors: if they open PRs, they should carefully analyze them beforehand, even if they used AI, since AI use itself is hard to control.
-
This will avoid a situation where a reviewer ends up reviewing what an LLM has written and then becomes, in some sense and without their consent, an author of whatever the LLM outputs next, because their PR comments and suggestions for improvement turn into future prompts.
-
The sanction for not respecting that should be tied to reputation within that community and decided locally: whether to keep allowing that person to contribute, ban their contributions completely, close their PRs on sight, etc.
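To make the "valid contributor" idea concrete, here is a minimal sketch of how a project might triage incoming PRs. Everything in it is an assumption for illustration: the `PullRequest` type, the `triage` function, the three outcome labels, and the idea that the community keeps its own sets of validated and banned accounts. It is not how any existing forge works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequest:
    """Hypothetical, minimal view of an incoming PR."""
    author: str
    title: str


def triage(pr: PullRequest, valid_contributors: set[str], banned: set[str]) -> str:
    """Return 'close', 'needs-vouching', or 'review' for an incoming PR.

    Both sets are maintained locally by the community; the labels are
    made up for this sketch.
    """
    if pr.author in banned:
        # Locally decided sanction: close without spending reviewer time.
        return "close"
    if pr.author not in valid_contributors:
        # Unknown author: someone has to vouch before review starts.
        return "needs-vouching"
    # Validated contributors are trusted to have vetted their own PR.
    return "review"
```

Calling `triage(PullRequest("alice", "Fix typo"), {"alice"}, set())` returns `"review"`. The point of the design is that reviewer time is only spent once a human has taken responsibility for the PR.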
-
On the topic of proprietary code generated by LLMs and then accepted in OSS, the responsibility should be on the LLM company; the code should naturally inherit the open source license of the project it is accepted into. On the topic of LLM companies using OSS code inappropriately, the responsibility should again be on the LLM company. In both situations, courts will probably have opinions in the future, and LLM companies might consider adapting their use policies further.
-
Something in their policies might help them out, like: "You can use LLM-generated code however you please, but consider that it is trained on X, Y, Z, that it might not follow the policies of where you use it, and that you are using it at your own responsibility." But it is still on them if they train models on things they should not and the LLMs then generate code that is questionable from a policy perspective.
-
@evan @cwebber @bkuhn @ossguy @richardfontana Ok I haven’t really heard people before you guys explain that to me. So I was wondering if it was possible that it couldn’t be licensed. Thanks.
-
@richardfontana @cwebber @bkuhn @ossguy Yeah, I thought my job couldn't be automated, either, and yet here we are.
-
@richardfontana @cwebber @bkuhn @ossguy Seriously, though, a lot of the work seems like it is tractable to LLM automation?
Like, the abstraction part seems like it's just summarizing components at the function, module, and program level. This is the command-line argument parser, this is the database abstraction layer, this is the logging module. LLMs are pretty good at this!
-
@cwebber @LordCaramac @bkuhn @richardfontana Sadly it will be years before we have an answer re copyright and we can't wait for that. Outlining usage in the meantime is the best we can do, in case we need to do something with that later.
We know proprietary software companies are using these tools extensively, so this is in effect a mutually assured destruction situation. While we wait, we should make sure that we are pushing freedom on all other axes, since they won't do that part.
@ossguy @cwebber @LordCaramac @bkuhn @richardfontana proprietary software companies extensively use GitHub and yet SFC's position is "don't use GitHub".
There are so many things we do in free software and in the interactions with SFC and FSF that would be simpler if we used proprietary software. How many janky experiences have people been asked to tolerate to participate? Why shouldn't we use proprietary software there?
-
@zacchiro @cwebber @bkuhn @ossguy @richardfontana I would say it's dramatically less safe. First, there's very little incentive to go after some OSS project over an unauthorized inbound=outbound contribution. Second, if someone did, the damage would likely be a small part of a single project. Third, only a small number of parties (the employer, or maybe some other single party whose code was copied) have the ability to sue.
With LLMs, it's different. When the authors sued Anthropic, they all sued. Is a shell script that Claude generated a derivative work of, say, the romantasy novel A Court of Thorns and Roses (to pick a random thing included in Anthropic's training set)? Well, it's hard to show that it's not, in the sense that that novel is one of the zillion things that went into generating the weights that generated the shell script.
Now it happens that the authors sued Anthropic (and settled). But I don't know if their settlement covers users of Claude (and even if it did, there are two other big models). And that's only the book authors -- there are still all of the code authors in the world.
So yes, I think the risk is high. I mean, in some sense -- in another sense, it seems unlikely that Congress would say, "sorry, LLMs as code generators are toast because of some century-old laws". At most, they would set up a statutory licensing scheme for LLM providers which covers LLM outputs. Of course, Europe might go a different way, but I think they would probably do the same. Under this hypothetical scheme, if your code were used to train Claude, you would get a buck or two in the mail every year. Authors got I think $3k per book as a one-time payment, but that was a funny case because of how Anthropic got access to the books.
Still, there's a risk that Congress wouldn't act (due to standard US government dysfunction).
It seems like most people are willing to take this risk, which I think says something interesting about most people's moral intuitions.
@novalis
I agree with your supporting arguments but not the conclusion. It goes back to the mutually assured destruction idea: no one in the for-profit proprietary software industry is going to bring a lawsuit because they are so invested in LLM-backed AI succeeding.
That's where our commons differs widely from other creative works of expression.
I am worried about compulsory licensing for *training* (it could be a disaster), but that is unrelated to output.
-
For filtration, it seems like merger or scènes à faire would also be kind of automatable, maybe with human oversight. Is there a way to make a mailing daemon without a logging module? Maybe, but it's so common that everyone does it that way. Could you have a Person class without a getter and setter for the name? Probably not?
-
The comparison seems tough, but I'd put an LLM to the task. "How similar are the database abstraction layers in activitypub-bot and Fedify?" Again, I'd probably want some human review, but for that code stuff LLMs are pretty good.
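As a rough illustration of the three-step shape being discussed, here is a toy sketch in Python. It is deliberately not an LLM and not legal analysis: "abstraction" is reduced to listing function names with the standard `ast` module, "filtration" to dropping a hand-picked set of conventional names (a stand-in for merger and scènes à faire), and "comparison" to a Jaccard score. The function names, the `COMMON_NAMES` set, and the scoring are all assumptions made up for this example.

```python
import ast

# Hypothetical stand-in for "scenes a faire": names so conventional that
# nearly every program has them. A real filter would need human judgment.
COMMON_NAMES = {"main", "__init__", "get_name", "set_name", "log"}


def abstract(source: str) -> set[str]:
    """Abstraction (crude): reduce Python source to its function names."""
    tree = ast.parse(source)
    return {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def filtrate(names: set[str]) -> set[str]:
    """Filtration (crude): drop names treated here as unprotectable."""
    return names - COMMON_NAMES


def compare(a: set[str], b: set[str]) -> float:
    """Comparison (crude): Jaccard similarity of what survives filtration."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def afc_score(source_a: str, source_b: str) -> float:
    """Run the three toy steps in order on two source strings."""
    return compare(filtrate(abstract(source_a)), filtrate(abstract(source_b)))
```

In the version imagined in this thread, each of the three functions would be replaced by an LLM prompt plus human review; the sketch only shows where those pieces would plug in.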