If you ask AI to rewrite the entirety of an open-source program, do you still need to abide by the original license?
-
If you ask AI to rewrite the entirety of an open-source program, do you still need to abide by the original license? In philosophy, this problem is known as the Slop of Theseus
@lcamtuf I thought it was Slop of Thelicenseus
-
@lcamtuf actual answer: of course you do, it’s prima facie a derivative work, same as if you had rewritten the program by hand.
-
Agreed. That was my initial take on the issue:
-
This case law exists in the U.S.

There are two cases (or arguably three if you include Sega v. SNK).

Here's what you really care about:
Any author of code is judged on their own use of the existing code. Reverse engineering used to work like this: one engineer wrote down, line by line, in plain English, what to do. Then a second person sat down and wrote new code, line by line, to accomplish that task. Things have changed, but the idea that you can't literally harvest existing code still holds.
You own the #AI-made code but can't copyright it... so you can't profit from it in the same way.
-
@bgalehouse @kevinr @lcamtuf it's a tempting argument to attempt but it kinda falls apart when "the entire library was in the training corpus anyway" is a given.
The fact that it is a terrible argument is of course not really going to stop anyone from making it.
-
Assuming you used the original source code to derive the detailed spec, then yes, that too is a derivative work.
The "viral" nature of that sort of license has bothered me for a long time. It's always been simultaneously overly far reaching and impossible to realistically enforce.
-
But here's an interesting question:
If you do not execute the code - did you accept the license? Does simply reading it sufficiently to be able to write a spec bind you to that license? That seems a bit too much.
-
@lcamtuf See the Supreme Court's recent refusal to review or overturn this ruling: https://www.copyright.gov/rulings-filings/review-board/docs/a-recent-entrance-to-paradise.pdf It holds that things created by AI are neither "derivative works" nor "original works" and are not eligible for copyright protection, so no, you don't need to abide by the previous license. No one does. And if someone reverse-engineers your code, the DMCA doesn't apply either (it isn't copyrighted).
-
@lcamtuf someone or something else has done the work, not you. So whoever creates the work owns the work.
-
@SnoopJ There’s the concept of clean-room reimplementation (see the link by @bgalehouse):
one group writes the spec, possibly with access to the source. The second group has never seen the source and only gets the spec. This second group then writes the program according to the spec.
You could simulate this if you had an AI that was provably not trained on the original source.
("provably not trained" most likely means re-training from scratch)
-
@tbortels if you do not accept the license, you do not have any right to use the code. It’s "all rights reserved" then. @lcamtuf @bgalehouse @kevinr
-
@tbortels @lcamtuf @bgalehouse @kevinr if a thing has a licence, then the licence covers its use, so it could be argued that using it as a wallpaper image, a software component, or training data is covered too.
-
@ArneBab @tbortels @lcamtuf @bgalehouse
Yeah, the license applies whether you accept it or not. And whether your spec counts as a derivative work or not will depend greatly on the details of your spec.
-
@ArneBab @SnoopJ @bgalehouse @lcamtuf
And the spec would need to carefully elide certain details which would get it classed as a derivative work itself—much harder for an LLM to do than a team of humans
-
@lcamtuf There are a number of tools online which purport to strip the copyright from images by running them through an image model, and they're just as obviously bullshit
-
@kevinr and proving that the AI was not trained on the original source will be pretty hard, because FLOSS programs with compatible licenses can legally copy code from one project into the other.
You’ll likely have to exclude all code from the project and all code that’s too similar from the training data. And then train an AI from scratch. Which would be extremely expensive.
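To make the "exclude everything too similar" step concrete, here is a rough sketch of what such a filter might look like. It is only an illustration: the function names, the 5-token shingle size, and the 0.3 threshold are all assumptions made up for this example, and token overlap is a crude proxy that says nothing about what a court would consider "too similar".

```python
import re


def shingles(source: str, k: int = 5) -> set:
    """Return the set of k-token windows ("shingles") found in a source string."""
    # Crude tokenizer: identifiers, numbers, and single punctuation characters.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    if len(tokens) < k:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two shingle sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def filter_corpus(corpus: dict, protected: dict, threshold: float = 0.3) -> dict:
    """Keep only corpus files whose similarity to every protected file is below threshold.

    `corpus` and `protected` map file names to source text. The threshold is an
    arbitrary assumption for illustration, not a legally meaningful value.
    """
    protected_sets = [shingles(text) for text in protected.values()]
    kept = {}
    for name, text in corpus.items():
        candidate = shingles(text)
        if any(jaccard(candidate, p) >= threshold for p in protected_sets):
            continue  # too close to the project being reimplemented: drop it
        kept[name] = text
    return kept
```

Even a filter like this only catches near-verbatim copies. Code that was renamed, reordered, or translated to another language would slip through, and comparing every file against every protected file this way does not scale to a real training corpus, which is part of why doing this properly would be so expensive.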