Sean Flynn. American University, PIJIP
Cross-posted on the Education International blog.

This year’s World Intellectual Property Day is being dedicated to the theme of youth empowerment. The focus is on recognition of the role of youth “stepping up to innovation challenges, using their energy and ingenuity, their curiosity and creativity to steer a course towards a better future.” Intellectual property exclusive rights may play some role in rewarding the innovative activities of youth. But more often, intellectual property exclusive rights may work in the other direction – posing a barrier to accessing and using the protected materials that youth need to learn, innovate, and develop. This is one reason why our attention to intellectual property on this day must include the limitations of and exceptions to intellectual property as well.

Inappropriate copyright laws for the digital age: Text and Data Mining (TDM) research

The Program on Information Justice and Intellectual Property (PIJIP) is analyzing the extent to which copyright exceptions in every country can permit text and data mining: a computational research method that is used to analyze large quantities of text or data in articles, books, databases and other sources with the purpose of discovering patterns and relationships or analyzing semantics. Text and data mining (TDM) research is helping to solve some of the world’s greatest challenges, including the discovery of COVID [1], development of vaccines and treatments [2], and analyzing hate speech and disinformation on social media [3]. Unfortunately, while the benefits of TDM research become ever more apparent, so too do the legal complexities posed by copyright and other regulations.

A patchwork of laws and regulations determines whether researchers are allowed to use TDM in their research projects. To engage in TDM, researchers must create a “corpus” of material to be mined, use software to analyze the text and data in the corpus (often making temporary copies), and communicate the results of the research and underlying data to other researchers and the public. Sharing the materials with others is crucial for validation, collaboration, and dissemination of results. Often, both academic and commercial applications of TDM occur across borders, with researchers, subjects and materials located in more than one country. All of these activities with articles, books, web pages, social media content, and other subjects of research are likely governed by copyright.
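To make the three steps concrete (build a corpus, analyze it with software, share the results), here is a minimal sketch in Python. It is an illustration only, not a method used by PIJIP or by any project cited here: the three "documents", the stopword list, and the function names `tokenize` and `mine` are all invented for the example, and a real project would mine thousands of articles, books, or posts.

```python
import re
from collections import Counter
from itertools import combinations

# Step 1: the "corpus". These three short texts are hypothetical stand-ins
# for the articles, web pages, or social media content a real project mines.
CORPUS = {
    "doc1": "Vaccine trials report strong immune response to the virus.",
    "doc2": "The virus spreads quickly; vaccine research is accelerating.",
    "doc3": "Researchers study disinformation about the vaccine on social media.",
}

# A tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "to", "is", "on", "about", "a", "of", "and"}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def mine(corpus):
    """Step 2: count, per term and per term pair, how many documents contain them."""
    doc_freq = Counter()
    pair_freq = Counter()
    for text in corpus.values():
        # This in-memory word set is the kind of temporary copy TDM software makes.
        terms = sorted(set(tokenize(text)))
        doc_freq.update(terms)
        pair_freq.update(combinations(terms, 2))
    return doc_freq, pair_freq


doc_freq, pair_freq = mine(CORPUS)

# Step 3: the results that would be communicated to others.
print("documents mentioning 'vaccine':", doc_freq["vaccine"])
print("documents mentioning both 'vaccine' and 'virus':", pair_freq[("vaccine", "virus")])
```

Even this toy version copies each text into memory, and anyone validating the counts would need access to the same underlying documents, which is why the sharing step matters legally.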

Copyright exceptions for research

One way we enable research with copyrighted works is through limitations on the scope of copyright protection or exceptions from the application of those rights for some purposes. From its beginning, copyright law has recognized the need for exceptions to exclusive rights to further research and learning. The U.S. Constitution grants Congress the right to enact copyright law to promote “science,” and the primary purpose of England’s first-in-the-world copyright law was to promote “learning.” The 1886 Berne Convention for the Protection of Literary and Artistic Works explicitly protected the right of countries to adopt exceptions to copyright for “educational or scientific” purposes. And the more recent 1996 World Intellectual Property Organization Copyright Treaty records the objective to “maintain a balance between the rights of authors and the larger public interest, particularly education, research and access to information.”

PIJIP’s research shows that every country’s copyright law includes at least one exception that can be used for research or educational uses (sometimes referred to as a “private” use). The bad news is that many of the world’s laws are not fit for the purposes of the digital age.

Our research shows that a distinct minority of countries around the world – including many of the richest – permit text and data mining uses of copyrighted materials. Figure 1 shows the results using the following color scheme:

  • Green (46 countries) – only these countries have sufficiently broad research exceptions to allow collaborative TDM research (i.e. including sharing of works between researchers) by any user, with any kind of work. In all the other countries, copyright poses a barrier to at least some TDM uses.
  • Yellow (125 countries) – these countries have exceptions that allow some uses of some types of works, sufficient to permit some TDM projects. But researchers would need to carefully examine myriad restrictions on their use rights in relation to a given project. Use only with caution.
  • Red (10 countries) – these countries do not have any exceptions that permit uses of whole works for research or education. They are essentially closed to lawful TDM research except where licensed from copyright holders.
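As a quick check on the scale of the problem, the counts above can be converted into shares. The short sketch below assumes that the three categories together cover all countries in the study, so that the total is simply their sum; the variable names are illustrative.

```python
# Country counts per category, taken from the list above.
counts = {"green": 46, "yellow": 125, "red": 10}

# Assumption: the three categories cover every country in the study.
total = sum(counts.values())

# Share of countries in each category, as a percentage rounded to one decimal.
shares = {color: round(100 * n / total, 1) for color, n in counts.items()}

print(total)   # 181
print(shares)  # {'green': 25.4, 'yellow': 69.1, 'red': 5.5}
```

On these figures, roughly one country in four falls in the green category, while about three in four pose a copyright barrier to at least some TDM uses.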

PIJIP’s work analyzing educational and research rights is not finished. We are currently mapping education exceptions and are finding a similar pattern, where only a small fraction of the world permits online uses of materials for education.

Towards copyright law reform permitting Text and Data Mining and other forms of research

There are a number of measures that policy makers can take to ensure that TDM research is unambiguously authorized under intellectual property laws.

International Treaty. One way to ensure that all countries permit TDM research is through a binding international treaty. An international treaty could require that all members meet a minimum standard for limitations and exceptions in copyright and require that such exceptions apply across borders. This is a long-standing position of Education International.

Domestic reform. Laws can also be changed at the domestic level. The studies noted above show where the most glaring problems are. Advocates could use this information to target requests for reforms or clarifications of the law to permit research and other essential uses. Where they do so, a key criterion for a good law is that the exceptions be open to all works, all protected uses, and all users. Highly specific exceptions are less likely to be useful in the digital environment.

The way forward: education and research exceptions in the Digital Age

Youth can help solve the problems in intellectual property law that most affect them. Youth and their advocates and representatives should ensure that policy makers in their countries and in international forums prioritize updating laws for the digital age. Only then can we truly declare that we have done all we can to liberate the knowledge and energy of youth to promote the best vision of our shared future.


1. ^ Marc Prosser, How AI Helped Predict the Coronavirus Outbreak Before it Happened, Singularity Hub (Feb. 05, 2020), https://singularityhub.com/2020/02/05/how-ai-helped-predict-the-coronavirus-outbreak-before-it-happened/

2. ^ Will Knight, Researchers Will Deploy AI to Better Understand Coronavirus, WIRED (Mar. 17, 2020), https://www.wired.com/story/researchers-deploy-ai-better-understand-coronavirus/

3. ^ TDM Stories, OPENMINTED, http://openminted.eu/blog/ (last visited Mar. 29, 2022)

4. ^ Sean Flynn, Michael Palmedo, Andrés Izquierdo, Research Exceptions in Comparative Copyright Law (PIJIP/TLS Research Paper Series no. 72-2021), https://digitalcommons.wcl.american.edu/research/72