When OpenAI launched a prototype of its chatbot in late 2022, it attracted worldwide attention. It quickly gained supporters, but alongside its obvious benefits a number of risks were identified, including those related to data processing and to the copyright of creators whose works are used by OpenAI’s algorithms. Although the chatbot is seen as a useful tool for work or study, it soon became controversial, and the first copyright infringement cases reached the media, brought by creators who realised that the chatbot may have used their works without permission.
Authors sue OpenAI
According to media reports, novelists Mona Awad (Canada) and Paul Tremblay (US) have filed a lawsuit with the San Francisco federal court against OpenAI for copyright infringement.
They claim that OpenAI is illegally using their property to “train” its tool, arguing that ChatGPT generates highly accurate summaries of their books. The lawsuit points out that OpenAI’s training data includes more than 300,000 books, including those from illegal shadow libraries that offer copyrighted books without permission.
This is not the only case of its kind. US comedian and author Sarah Silverman has joined those who have filed class action lawsuits against OpenAI and Meta. The lawsuits claim that both ChatGPT and LLaMA (a large language model developed by Meta) were trained on copyrighted material.
The plaintiffs all argue that the use of their works to train artificial intelligence infringes their copyright. They also allege other infringements, such as the unauthorised creation of advanced text algorithms based on their works and infringement of the right to dispose of their creations.
Others are also considering legal action against OpenAI
The New York Times and OpenAI have been negotiating a potential licensing agreement under which OpenAI would have the right to use NYT articles in its tools for a fee. Unfortunately, these talks have become so contentious that The New York Times is also considering legal action.
The main concern for The New York Times is that artificial intelligence technology companies are becoming direct competitors to traditional publishers, because their tools answer users’ questions with content based on original reports and articles written by journalists. The newspaper fears that this practice could reduce the value and uniqueness of its content and, in turn, its revenue.
Machine learning vs. third-party rights
The above cases highlight the growing copyright challenges at a time when algorithms are learning from huge data sets, often protected by copyright.
The Polish Copyright and Related Rights Act currently appears insufficiently adapted to protect creators whose works are used to train artificial intelligence. It treats a work solely as a manifestation of human activity. According to legal scholars and commentators, “creations generated by computer applications that imitate the human creative process do not constitute a work”, and ChatGPT is such an application.
According to dissenting views, AI-generated creations could be considered works, e.g. “derivative works of the works on which the system has learned. Such an assumption could have serious consequences for ChatGPT’s end users, as their use of AI-generated creations in their own activities could lead to infringement of the rights of creators of original works and, consequently, to direct liability.”
Therefore, the stage at which AI can be perceived as infringing intellectual property rights is the “training” of the AI, which relies on text and data mining (TDM). Under Article 23 of the Polish Copyright and Related Rights Act, other people’s works may be used on the basis of authorised use or on the basis of a licence agreement. However, it remains unclear whether the training of AI tools is covered by either basis.
A more concrete legal basis for the use of works by companies such as OpenAI will be provided by Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the digital single market and amending Directives 96/9/EC and 2001/29/EC, which is yet to be transposed into Polish law.
An analysis of the draft provisions and recitals of the above-mentioned Directive shows that AI tools operated by private parties will be able to analyse copyrighted works in the learning process for commercial purposes, without the consent of authors and other rights holders, provided that the following conditions are met:
- The works that AI tools analyse are lawfully accessible (lawful access should include access to content that is freely available on the Internet)
- Reproductions and downloaded copies are retained no longer than necessary for text and data mining purposes
- Holders of rights to the works to be analysed by AI have not explicitly stated that they refuse access to the works for the purposes of mining, e.g. in metadata or in the terms of use of a website or service
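The third condition presupposes that a mining tool can read the rights holder's refusal automatically. As a minimal sketch, assuming the refusal is published in a website's robots.txt file (one possible convention; whether it amounts to a valid reservation under the Directive is a legal question that the code does not answer), a tool could check for it before downloading anything. The crawler name ExampleAIBot, the URL and the helper mining_opt_out_declared are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt published by a rights holder. The crawler name
# "ExampleAIBot" is invented for this illustration.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def mining_opt_out_declared(robots_txt: str, crawler: str, url: str) -> bool:
    """Return True if the robots.txt text tells `crawler` not to fetch `url`.

    Assumption: a Disallow rule is read here as a refusal of text and data
    mining. This is an illustrative convention, not a statement of law.
    """
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network
    # request is made in this sketch.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(crawler, url)


if __name__ == "__main__":
    url = "https://example.com/articles/novel-excerpt.html"
    for bot in ("ExampleAIBot", "OtherBot"):
        refused = mining_opt_out_declared(ROBOTS_TXT, bot, url)
        print(f"{bot}: opt-out declared = {refused}")
```

A check of this kind addresses only the third condition. Lawful access and the retention period of copies would still have to be assessed separately.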
The steps taken by the European Commission should be seen as a sign of awareness of the nature of copyright in the European Union. The cases described provide a reasonable basis for concluding that the works of Polish artists may also be exploited by ChatGPT-type programmes. The lack of case law and the ambiguity of the Copyright and Related Rights Act create uncertainty for creators and end users with regard to ChatGPT-generated creations.
However, it appears that the Directive, once transposed into Polish law, will remove the existing uncertainty and provide a precise basis for the lawful use of new technology products and, in the event of non-compliance, a basis for legal redress.
 Kamil Nowak, Authors have filed a lawsuit against OpenAI. It is about the illegal use of their books by ChatGPT [Pisarze złożyli pozew przeciw OpenAI. Chodzi o bezprawne wykorzystanie ich książek przez ChataGPT], access date: 31.08.2023.
 The Feed, The Economic Times, OpenAI faces Lawsuit filed by US-based authors Mona Awad, Paul Tremblay, access date: 31.08.2023.
 Jack Queen, Sarah Silverman sues Meta, OpenAI for copyright infringement, access date: 31.08.2023.
 Bobby Allyn, ‘New York Times’ considers legal action Against OpenAI as copyright tensions swirl, access date: 31.08.2023.
 Act of 4 February 1994 on copyright and related rights (uniform text: Journal of Laws of 2022, item 2509, hereinafter: the “Copyright and Related Rights Act”).
 A. Niewęgłowski [in:] Copyright. Commentary [Prawo autorskie. Komentarz], Warsaw 2021, Article 1.
 Cf. P. Księżak, S. Wojtczak, Copyright in the face of artificial intelligence (an attempt at an alternative view) [Prawo autorskie wobec sztucznej inteligencji (próba alternatywnego spojrzenia)], PiP 2021, No. 2, pp. 18-33.
 Agnieszka Wachowska, Marcin Ręgorowicz, ChatGPT in practice – key legal issues [ChatGPT w praktyce – najważniejsze kwestie prawne], access date: 06.09.2023.
 Recital 14 of Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market and amending Directives 96/9/EC and 2001/29/EC (OJ L 2019 No. 130, p. 92).
 R. Markiewicz, 2.2.4. Commercial use of TDM [Użytek komercyjny TDM] [in:] Copyright in the Digital Single Market [Prawo autorskie na jednolitym rynku cyfrowym]. Directive (EU) 2019/790 of the European Parliament and of the Council, Warsaw 2021.