Beyond the code: copyright rights and liability of Artificial Intelligence

The recent rise of generative AI capable of producing original, creative content has introduced new complexities in copyright law. Conventionally, copyright infringement claims aimed to identify plagiarism by human authors, but with AI now generating content that may resemble existing copyrighted works, determining liability for copyright violation has become more complex. Many AI companies use copyrighted content from books, music, and images to train their models. This has sparked outrage among authors, musicians, and artists, who believe it leads to unfair competition: their works are used without their consent, and they are not compensated for such use. As these legal battles continue, the legality of using copyrighted content to train AI models remains uncertain.

This article explores the current legal scenario concerning AI and copyright infringement. Unlike conventional software, the development of AI software cannot be attributed to a single individual. AI-based software development companies typically employ multiple development teams and gather training data from a variety of sources to train their generative AI models to obtain the desired results. Furthermore, once an AI product is released, the data fed to the product by end users and consumers plays a crucial role in shaping how the AI learns and generates content.

Therefore, in determining liability for copyright infringement by an AI, the scope of liable parties extends beyond just the developer or programmer. The potentially liable parties may include the end user, the seller, the development/programming team, and even the AI model itself. This article examines the potential liabilities of both AI users and AI developers, analyses the fair use doctrine as a potential defence, and discusses the ongoing legal debate through a recent landmark case of copyright infringement against AI.

A Non-Human Author: Can AI own copyright?

The Copyright Act[1] protects “original works of authorship,” but current legal interpretations restrict copyright ownership to works created by humans. Courts have denied copyright to non-human “authors” such as a photo-taking monkey[2] and an AI-generated book. This “human authorship” requirement was recently upheld in a lawsuit challenging the copyrightability of AI-generated art.[3]

While works created with AI assistance could still be copyrighted, the Copyright Office is unlikely to grant protection for works solely generated by AI in response to prompts, as evident from recent proceedings.[4] The US Copyright Office’s recent refusal to register Suryast, an AI-generated painting, for lack of substantial human authorship can be considered as setting a standard for the extent of human involvement required.[5] A failure to assign authorship to AI is less problematic than a failure to assign liability to AI for infringement of copyright, the significance of which shall be dealt with in the next section.

The Risk of Infringement: AI as a Copyright Culprit

Generative AI models are trained on vast amounts of data from various sources that may include copyrighted works. The AI model identifies patterns and learns from this data, subsequently generating its own content based on the patterns learnt. This process can lead to copyright infringement in two primary ways.

Firstly, the very act of training the AI model might involve gathering copyrighted material. Web scraping is a common practice for gathering training data, and the scraped material may include copyrighted data as well, resulting in unauthorized use of protected works. Many authors have filed lawsuits against AI companies claiming that the companies’ AI training processes infringed their copyrighted works. For example, the authors Michael Chabon, Paul Tremblay, Sarah Silverman, and others, as well as the Authors Guild, filed lawsuits against OpenAI for copyright infringement.[6]

Secondly, the generated content itself could infringe upon existing copyrighted works if it significantly resembles them. Generative AI is a very powerful tool: beyond literary works, it is capable of producing music that mimics the style of a particular artist, generating text that imitates the writing of a well-known author, or creating visuals that resemble copyrighted works.

Therefore, such usage of generative AI raises the question of who should be held accountable for copyright infringement: the user who prompts the AI to generate content, or the developer who created the AI tool itself? The doctrine of secondary liability[7] gives us some answers.

The User in the Driver’s Seat: Contributory Infringement

Copyright law recognizes the concept of contributory infringement, which holds a party liable if they knowingly induce, encourage, or contribute to the infringing activity of another. In the context of AI, a user who intentionally prompts an AI model to generate content that infringes on copyrighted content may be liable for contributory infringement.[8]

Further, a party is liable for infringement if he/she is aware that their actions are likely to cause infringement, even without the specific intent to do so. For example, a user who is familiar with the capabilities of a particular AI tool and prompts it to generate content that is highly likely to infringe upon a copyrighted work could potentially be held liable for knowing infringement.

However, determining the user’s level of knowledge or intent is a challenging task. AI tools can be complex, and users may not always be fully aware of the intricacies of the underlying algorithms or the precise data used for training. For example, a user who prompts an AI to produce a song in a style similar to that of a particular composer or musician might not be aware that the generated song infringes on the artist’s copyright unless the similarities are very obvious. Thus, while a theoretical imputation of liability seems simple, its practical application is challenging.

The Developer’s Responsibility: Vicarious Infringement and Control

The doctrine of vicarious infringement provides the basis to hold AI developers liable for copyright infringement. This doctrine states that a party can be held liable for the copyright infringement by another party, if they have the right and ability to supervise the infringing activity and have a direct financial benefit or interest in such activities.[9]

AI developers arguably have the right to control and supervise the functionality and output of their software tools. They can design features that prevent users from generating content that is likely to infringe on copyrights. AI developers also undoubtedly profit from the use of their AI tools, thus satisfying both of the aforementioned requirements.

A recent case in China, decided by the Guangzhou Internet Court, is an example of vicarious infringement. The court held the operator of an AI website liable for infringing content generated by its software. The Court ruled that the Defendant, being a provider of GenAI services, had a legal obligation to undertake certain technical measures to prevent the generation of copyright-infringing content by its software.

The Court also found that the Defendant had failed to fulfil its obligations of reasonable care. These obligations included providing a software module or mechanism for users to report potential copyright infringements, issuing warnings to users regarding the infringement risks associated with the commercial usage of AI tools, and clearly labelling the content generated by the AI tool.[10] This decision suggests that AI developers may be held more accountable for the actions of their AI tools.

Fair Use: A Potential Defence

The fair use doctrine is a statutory privilege within copyright jurisprudence that permits the limited use of copyright-protected material without the prior authorization of the copyright owner. Such limited use is permitted for transformative purposes such as criticism, teaching, commentary, news reporting, or scholarly and scientific inquiry.[11]

The use of generative AI for these purposes can be considered fair use and is not penalised. For example, an AI tool used to generate satirical content that critiques a copyrighted work may benefit from the fair use doctrine.[12] Similarly, using an AI tool to create educational content that analyses or comments on a copyrighted work may also be considered fair use.[13] However, the fair use doctrine is a complex legal concept.

Several factors are taken into account while determining fair use: the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the copyrighted material used, and the effect of the use on the market for the copyrighted work.[14] Whether the fair use doctrine applies to specific AI-generated content depends on the circumstances of the case. For example, in a recent case, a U.S. district court ruled that a jury trial would be needed to determine whether the use of copyrighted material to train an AI tool was fair use, and that the Court must consider the public benefit aspect as well.[15]

New York Times v. OpenAI[16]: A Case Study in Uncertainty

The ongoing lawsuit between The New York Times (NYT) and OpenAI is an interesting case study of copyright infringement by AI. Although the court is yet to issue a verdict, the pleadings of both parties provide interesting insights. The New York Times claims that OpenAI violated its copyrights by using millions of its articles to train the GPT models underlying ChatGPT and Bing Chat. This allegedly resulted in the AI models “memorizing” and even directly copying NYT’s content, infringing its exclusive rights.

However, OpenAI argued that using publicly available information, including NYT’s articles, to train AI models is fair use protected by copyright law, and that the use of AI benefits society and innovation. OpenAI argued that its purpose is “transformative” as its model training leads to “a useful generative AI system.” OpenAI also mentioned that it provides an opt-out option for publishers concerned about their content being used, and that the New York Times’ content makes up only a tiny fraction of its training data and does not significantly impact the models’ learning.

While the decision in this case is still pending, the eventual ruling will likely set a major precedent for determining AI’s liability in copyright infringement.


[1] Copyright Law of the United States (Title 17)

[2] Naruto v. Slater, No. 16-15469 (9th Cir. 2018) <https://cdn.ca9.uscourts.gov/datastore/opinions/2018/04/23/16-15469.pdf>

[3] Thaler v. Perlmutter <https://www.courtlistener.com/docket/63356475/24/thaler-v-perlmutter/>

[4] https://fingfx.thomsonreuters.com/gfx/legaldocs/klpygnkyrpg/AI%20COPYRIGHT%20decision.pdf

[5] https://czi515.p3cdn2.secureserver.net/C/Sahni_2023_USCO.pdf

[6] Christopher T. Zirpoli, “Generative Artificial Intelligence and Copyright Law” <https://crsreports.congress.gov/product/pdf/LSB/LSB10922#:~:text=AI%20programs%20might%20also%20infringe,created%20“substantially%20similar”%20outputs.>

[7] Fonovisa, Inc. v. Cherry Auction, Inc., 76 F.3d 259 (9th Cir. 1996)

[8] Perfect 10, Inc. v. Amazon.com, Inc., 508 F.3d 1146 (9th Cir. 2007)

[9] Metro-Goldwyn-Mayer Studios Inc. v. Grokster, Ltd., 545 U.S. 913, 930, 125 S.Ct. 2764 (2005)

[10] Rouse, “The Guangzhou Internet Court Concludes First Instance Hearing Of The First Copyright Infringement Case Involving A Generative Artificial Intelligence Platform” <https://www.mondaq.com/china/copyright/1450010/the-guangzhou-internet-court-concludes-first-instance-hearing-of-the-first-copyright-infringement-case-involving-a-generative-artificial-intelligenceplatform#:~:text=The%20Guangzhou%20Internet%20Court%20found,the%20Ultraman%20works%20without%20permission.>

[11] Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569 (1994)

[12] ibid

[13] Masters & Scholars of University of Oxford v. Rameshwari Photocopy Services, 2016 SCC OnLine Del 6229

[14] 17 U.S. Code § 107

[15] Thomson Reuters Enterprise Centre GmbH v. Ross Intelligence Inc., memorandum opinion, signed by Judge Stephanos Bibas on 9/28/2023 <https://www.govinfo.gov/content/pkg/USCOURTS-ded-1_20-cv-00613/pdf/USCOURTS-ded-1_20-cv-00613-3.pdf>

[16] https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf


Author: Sharada A Kalale is a 4th year law student at National Law University Delhi.
