GitHub Copilot is not infringing your copyright

poVoq · 3 years ago

poVoq · edited 3 years ago

Reading public code is not a copyright violation, and neither is reproducing tiny snippets from it. The latter falls under fair use and/or doesn’t even have sufficient complexity to fall under copyright in the first place, e.g., you can’t copyright “1+1=2”.

And if you use Copilot to reproduce more complex code, then it is the programmer, not the tool, who is committing the copyright violation.

Steelmanning your argument, you could claim that Copilot itself is a derivative work of the code it read, but AFAIK this isn’t the case, as it builds its own database out of that code and then only references this database. You might have a slightly stronger argument that this database is a derivative work, but as far as I can tell there is nothing in the GPL that forbids creating a code database and reading from it. If there were, then GitHub itself (a giant code database) would be in violation of the GPL.

@nutomic@lemmy.ml · 3 years ago

The GPL specifies that derived works have to be licensed under the GPL, and similarly for other licenses. Their ML model wouldn’t exist without the GPL code, ergo it’s a derived work. GitHub is not comparable at all, because the code hosted there is just data, not a core part of its functionality.

poVoq · edited 3 years ago

Feel free to disagree, but my (somewhat limited) understanding of such AI models is that the model data is not a core part of its functionality either.

Edit: It’s like saying “the internet” is a core part of Google’s search algorithm’s functionality.