Generative AI models such as ChatGPT are trained on massive amounts of data harvested from web sources, including large volumes of copyrighted material. The authors, artists and creative professionals behind those works are, however, neither credited nor compensated for their involuntary contributions to the development of those large AI models, even though the models could not have been built without their collective input. As a consequence, many content creators have taken steps to limit access to their work, as evidenced by the rapid decline of online data available for training generative AI models. This situation negatively impacts all parties, including content creators (who experience a decrease in the visibility of their work along with a reduction in their revenue streams), technology providers (who struggle to collect enough high-quality data to train their models), and the public at large (which encounters diminished access to digital content and lower-quality AI tools).
The objective of the COPY.AI project is to investigate the future of intellectual property in the light of current developments within generative AI. A particular focus will be on finding ways to safeguard the intellectual, cultural and linguistic heritage of the Nordic and Baltic countries without hampering the innovation potential of generative AI in this region.
The project will investigate six key research questions:
- How the non-consensual use of copyrighted materials for the purpose of training generative AI models should be interpreted from a legal perspective;
- How the inclusion of copyrighted materials can be technically detected in generative AI models;
- Whether AI-generated outputs may in some cases constitute a copyright infringement;
- How to adapt existing generation techniques to reduce the risk of such infringements;
- How to trace the responses of a generative AI model back to content found in its underlying training set, and thereby acknowledge the contributions of the creators of that content;
- What possible schemes could be envisioned to financially compensate content creators for those contributions, and how feasible those schemes are from a legal, economic and technological standpoint.
To address those questions, the COPY.AI project brings together an interdisciplinary consortium of researchers from the Nordic and Baltic countries, covering the fields of intellectual property law, ethics of AI, public economics, machine learning, natural language processing and computer vision.