Newsletter
Artificial Intelligence and Copyright Laws:
Risks of Copyright Infringement associated with Generative AI
I. Introduction
In recent years, various industries have poured significant resources into developing artificial intelligence (AI) applications. ChatGPT, an AI chatbot made available to users for free in 2022, has greatly increased the accessibility of generative AI tools. Generative AI can automatically answer questions posed by users, assist with translation tasks, compile information, and even generate text, pictures, and music. Its impact has been felt throughout society, changing both how people create and how labor is divided. However, generative AI's ability to produce quality works is in fact based on training AI with collections of existing copyrighted materials. As the value of creator content has become more widely recognized, should AI tool developers who use vast amounts of copyrighted materials to train AI tools first obtain licenses from copyright holders? Can AI tool developers who have not obtained licenses invoke the fair use doctrine of the Copyright Act to argue that their use does not infringe copyright? This topic concerns the distribution of interests among copyright holders, AI tool developers, and AI tool users, and has generated much controversy.
II. Using works to train AI models may infringe copyright holders' reproduction right
While a number of infringement lawsuits have been filed by copyright holders in the US against AI tool developers, there have been no similar lawsuits in Taiwan so far.
In 2018, the Taiwan Intellectual Property Office (TIPO), the specialized agency in charge of copyright matters, issued an administrative interpretation on the copyright status of works involving AI. It stated that where an author uses AI only as a tool in the creative process, the completed work still reflects the author's "original" and "creative" input and is not purely machine- or system-generated content, and is therefore protected by copyright. In 2022 and 2023, the TIPO issued further interpretations: outside the circumstances of fair use provided in Articles 44 to 65 of the Copyright Act, developers of generative AI models who collect large amounts of data and feed it into AI models, such as ChatGPT (text generation), Midjourney (image generation), and Pictory (video generation), for training and learning purposes, and who engage in acts involving the reproduction of original works, shall first obtain the consent or license of the copyright holders. The TIPO has also made clear that Articles 44 to 63 of the Copyright Act do not apply where generative AI models use algorithms to utilize online works for learning purposes, and that the four fair use factors provided in Article 65-2 of the Copyright Act are to be applied to the facts of each individual case. In other words, there is no single set of criteria.
Article 3-1(5) of the Copyright Act defines reproduction as follows: "Reproduce" means to reproduce directly, indirectly, permanently, or temporarily a work by means of printing, reprography, sound recording, video recording, photography, handwritten notes, or otherwise. Developers of generative AI tools such as ChatGPT have yet to reveal how they build their massive corpora or how these corpora are used for AI training. If the corpora of generative AI are built with web crawlers that capture and copy all existing works accessible online via open networks, the creation of such corpora would meet the definition of reproduction as elucidated by the TIPO's administrative interpretations. However, if AI models merely browse and read existing works as human beings do, or merely compile and analyze such works to create abstract parameters for model training rather than copying and storing them, whether the Copyright Act's definition of reproduction is met merits further discussion.
III. Conclusion
Whether the fair use defense available under current law applies to the large-scale use of others' works as AI training databases requires further discussion. To date, Taiwan has not had any copyright infringement lawsuit arising from the use of another's work by generative AI. In practice, when applying the fair use factors provided in Article 65-2 of the Copyright Act, the courts comprehensively consider all factors relevant to the circumstances of the case at issue. Under what circumstances would it be considered reasonable for AI developers to use another's work for AI model training on the basis of fair use? It will be up to the courts to shed light on this issue. Meanwhile, the TIPO plans to provide guidelines that conform to international standards. In addition to the aforesaid administrative interpretations, the TIPO's latest interpretations on critical and emerging issues, such as the scope of fair use of training data, merit close attention.