GitHub Copilot is Getting Sued
Week 61 of Founding Typogram
I have written about AI coding assistants like Tabnine. Interestingly, a very similar competing product, GitHub Copilot, is getting sued! The lawsuit is led by none other than Matthew Butterick, a familiar face in the typography field.
To step back a little, GitHub Copilot is an AI coding assistant that helps users code by predicting what they are about to write and suggesting it. Sometimes all you need is to write the function name, and it auto-completes the rest of the function for you! It worked very well and got lots of compliments on Twitter.
GitHub says the Copilot AI is trained on code from public GitHub repositories, but some authors of these repositories are suing GitHub in a class action because they say this usage (training an AI on their code) violates the licenses of their projects. Some of the code in these public repositories is under licenses that require attribution of the author’s name. When Copilot suggests code to its users, it attributes it to no one, so it does not meet the condition under which the code may be used.
This lawsuit reminds me of a story that happened to me a couple of years ago. In hindsight, how I responded then is highly related to what my stance is on this lawsuit now.
I worked on a personal project called Font Playground, which explores user interface design around variable fonts; one of the UI innovations I explored is a “2d slider” that controls two variable axes at the same time.
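For readers curious about what a 2d slider does under the hood, here is a minimal sketch of the idea. It is not the actual Font Playground code: the `Axis` type, the function names, and the axis ranges are all hypothetical, chosen only for illustration. It maps one position on a square pad (x and y each between 0 and 1) to values for two variable font axes, then formats them for the CSS `font-variation-settings` property.

```typescript
// Hypothetical description of one variable font axis (illustration only).
interface Axis {
  tag: string; // four-letter axis tag, e.g. "wght" (weight) or "wdth" (width)
  min: number; // smallest value assumed for this axis
  max: number; // largest value assumed for this axis
}

// Map a normalized pad position (x and y each in 0..1) to two axis values.
// Positions outside 0..1 are clamped to the edge of the pad.
function padToAxes(
  x: number,
  y: number,
  xAxis: Axis,
  yAxis: Axis
): Record<string, number> {
  const clamp = (v: number): number => Math.min(1, Math.max(0, v));
  const lerp = (axis: Axis, t: number): number =>
    axis.min + (axis.max - axis.min) * clamp(t);
  return { [xAxis.tag]: lerp(xAxis, x), [yAxis.tag]: lerp(yAxis, y) };
}

// Build a string usable as the value of CSS font-variation-settings.
function toFontVariationSettings(values: Record<string, number>): string {
  return Object.entries(values)
    .map(([tag, value]) => `"${tag}" ${value}`)
    .join(", ");
}

// Assumed ranges; a real font declares its own axis ranges.
const weight: Axis = { tag: "wght", min: 100, max: 900 };
const width: Axis = { tag: "wdth", min: 75, max: 125 };

// Handle halfway along the x direction, fully along the y direction.
const css = toFontVariationSettings(padToAxes(0.5, 1, weight, width));
// css is: "wght" 500, "wdth" 125
```

The point of the design is that a single drag gesture changes both axes together, instead of requiring two separate one-dimensional sliders.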
I posted the project on GitHub for web hosting purposes but didn’t add any licensing info to it. This project is more of a design project than a coding project, and the code was written just for the website to run, not meant to be re-used, so I didn’t add any of the typical open-source licenses to it. My intention for the project was to spread and promote my design ideas around variable font UI.
One day, I was contacted by someone working on a commissioned project for Google Fonts. They were working on a font tool and wanted to use the slider modules that I had on Font Playground. Google Fonts wanted me to grant the usage by adding an open-source license to my project.
I told them that I welcome them to use my design ideas in Font Playground — I wanted the UI around variable fonts to improve and the project is made to promote greater adoption of my design ideas.
I hesitated to add an open-source license to my GitHub repository, as I didn’t mean for my code to be re-used. Properly maintaining an open-source project requires time and effort, and my code was not in a state to be scrutinized by third parties; it was only meant to keep the Font Playground website running and present those design ideas in the front end.
The code for the sliders that they were interested in was heavily intertwined with the rest of the app; it is not like a code library that can be easily stripped away and migrated into another project. I told them to feel free to dig into my code to study how it is implemented, and warned them that it wouldn’t be an easy copy-and-paste job. They would have to use the knowledge gained from studying my code to write a component from scratch. For studying my code, I didn’t feel the need to grant permission by adding an open-source license; they were allowed to do that by default.
I believe anyone can read any code, as long as it is legally obtained, to study how to get a job done and then do that job, without having to ask for permission to learn. A lot of people taught themselves to code this way, including me.
How is this related to the GitHub Copilot lawsuit? The only difference that I see between the Google Fonts contractor and GitHub Copilot is that one is a human, and the other is an AI. The rest is the same: they both study publicly shared code, obtain knowledge, and use that knowledge to perform a job. If a person can do it, I think an AI should be allowed to as well. It would only be wrong if GitHub gave the Copilot AI read permission to private repositories; that would be studying illegally obtained code.
Despite my current stance, I am supportive of this class action, not only because it is led by a fellow typographer (who is also a lawyer and wrote the book Typography for Lawyers) but also because I think AI needs scrutiny and regulation. This lawsuit creates an opportunity for discussions like the one in this article to happen. The class action also makes more arguments that I didn’t discuss here.
Hear from You
Many gray areas are up for debate, and none of my arguments is meant to be an assertion. I write them down as a way to organize my thoughts and form my stance.
One of the gray areas in my head is the extent of re-using code. Studying someone’s code to learn how to write a for loop is different from copying and pasting an entire code component and just changing a few variable names. Where is the line between “studying code and internalizing it as knowledge to write your own code” and “directly using someone else’s code”? Also, what if one day someone types “2d-slider” in their code editor, and GitHub Copilot suggests exactly the same code component from my project? Would I still think it is fair usage without permission? I welcome more discussion.
See you next week! If you have friends who are interested in founding startups, please consider sharing my newsletter with them!