The government’s long-awaited report on copyright and AI was published on 18 March 2026 (www.gov.uk/government/publications/report-and-impact-assessment-on-copyright-and-artificial-intelligence). Accompanied by an impact assessment, the report is essentially a pause and reset: the government has said that it will not change copyright law yet and has stepped back from its earlier preferred AI training exception with an opt-out (see box “The report at a glance”). However, the most important part of the report is not its conclusions but the insight that it gives into the AI and copyright debate from both sides.
The report at a glance
The government’s key conclusions in its report on copyright and AI are that:
- There will be no copyright reforms until the government is confident that they would meet its economic and public-interest objectives.
- The opt-out exception is no longer the government’s preferred way forward.
- Transparency (of both training inputs and model outputs) is a focal point for future government policy.
- The AI copyright licensing market is still new and evolving, so licensing will be monitored rather than regulated at this stage.
- Future reforms may focus on specific problem areas, such as removing copyright protection for purely AI-generated works, and considering whether to create a new digital replica or personality right.
The key tension
The government consulted on AI and copyright in December 2024 (see News brief “Copyright and AI: tackling difficult choices”, www.practicallaw.com/w-045-6211). The consultation resulted in 11,520 responses from across the creative and technology sectors, whose content was summarised in the report and the impact assessment. The motivating force behind the consultation was the critical tension between the UK creative industry and AI model developers, and the fact that neither side’s interests are well served by the status quo.
AI developers need access to vast quantities of copyright-protected works to train their models. These models learn from the information provided by copyright works and translate it into statistical representations of the world. However, in the words of the report: “they would not be able to learn without human creativity, and their outputs may compete with the very creators they learn from”. To date, many AI model developers have ignored the requirement to negotiate, and pay for, the billions of licences that would technically be required for foundation model training, citing overseas regimes such as the US fair use doctrine to justify their approach.
Unsurprisingly, this activity continues to have a hugely detrimental impact on the creative industry, which produces £146 billion (or 6%) of the UK’s gross value added (GVA) every year. This can be contrasted with the AI sector’s current £12 billion GVA. Creators are left without remuneration for the economic value that their works have created, and without incentive to create further works. This is illustrated by the report “Brave New World? Justice for creators in the age of Gen AI”, which was published in January 2026 and contains evidence from over 10,000 creators demonstrating the erosion of their industry due to AI (https://societyofauthors.org/download/brave-new-world-justice-for-creators/).
The government is caught in the middle of this tension: it wants to protect the value generated by the UK’s creative industry while positioning the UK to harness the extraordinary potential of AI to drive economic growth, generate employment and raise living standards. The central problem is how to do this.
The basics of model training
The report provides a helpful grounding in some of the key principles of AI model training. It starts by explaining the foundations that underlie generative AI models: neural networks, with nodes arranged in layers from input to output and connected to one another in a way loosely analogous to biological neurons. By exposing the input layer to billions of examples of creative works, a neural network can be trained to learn language, concepts and ideas.
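The layered structure described above can be illustrated with a minimal sketch. The weights below are illustrative assumptions; in practice they would be learned by training on examples.

```python
import math

# Minimal sketch of one feed-forward neural network layer: each output node
# takes a weighted sum of the input nodes and applies a non-linearity.
# The weights here are assumed for illustration; training adjusts them.
def layer(inputs: list[float], weights: list[list[float]]) -> list[float]:
    """One layer: weighted sums of the inputs, squashed through tanh."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)))
            for row in weights]

# Two input nodes feeding a layer of three nodes.
hidden = layer([0.5, -1.0], [[0.2, 0.8], [-0.5, 0.1], [1.0, 1.0]])
print(len(hidden))  # → 3
```

Stacking several such layers, input to output, gives the deep networks the report describes.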
The report also explains some of the work involved in compiling training datasets, including tokenisation, in which creative data is broken down into standardised units that a model can digest. The report covers the techniques that model developers use to allow efficient training on huge datasets, such as identifying and storing representations of features that are common across training data.
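The tokenisation step can be sketched as follows. The toy vocabulary and greedy longest-match approach are simplifying assumptions; production systems use learned subword vocabularies such as byte-pair encoding.

```python
# Toy subword vocabulary, assumed purely for illustration; real systems
# learn vocabularies of tens of thousands of units from the training data.
VOCAB = {"copy": 0, "right": 1, "copyright": 2, " ": 3, "work": 4, "s": 5}

def tokenise(text: str) -> list[int]:
    """Break text into the longest vocabulary matches, left to right."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible match first.
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(VOCAB[text[i:j]])
                i = j
                break
        else:
            raise ValueError(f"no token for {text[i]!r}")
    return tokens

print(tokenise("copyright works"))  # → [2, 3, 4, 5]
```

The output is the sequence of numeric token identifiers that the model actually trains on, rather than the raw text of the work.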
It also covers fine-tuning, both before and after a model's release, which tends to use specific factual content rather than generalised patterns derived from large datasets. The aim is to fine-tune an existing model to deliver consistent, factually correct outputs, such as those needed for legal or medical assistance. These are the training activities that are most likely to be performed in the UK by small businesses tailoring an existing foundation model to the needs of their intended market. The report also touches on retrieval augmented generation (RAG), a process that extracts value from an external knowledge base of identifiable copyright works at the point of generation, and which is commonly used for AI search and by AI assistants.
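The RAG process can be sketched in outline: the model's prompt is augmented with the most relevant document from an external knowledge base. The documents and word-overlap scoring below are illustrative assumptions; real systems typically retrieve using vector embeddings.

```python
# Minimal sketch of retrieval augmented generation (RAG). The knowledge
# base here is assumed for illustration; in practice it would hold
# identifiable copyright works, such as news articles or legal texts.
KNOWLEDGE_BASE = [
    "Copyright in the UK generally lasts for the life of the author plus 70 years.",
    "A neural network arranges nodes in layers from input to output.",
]

def retrieve(query: str) -> str:
    """Return the document sharing the most words with the query."""
    query_words = set(query.lower().split())
    return max(KNOWLEDGE_BASE,
               key=lambda doc: len(query_words & set(doc.lower().split())))

def build_prompt(query: str) -> str:
    """Combine the retrieved context with the user's question."""
    return f"Context: {retrieve(query)}\nQuestion: {query}"

print(build_prompt("How long does copyright last in the UK?"))
```

Because the retrieved text is an identifiable work reproduced at the point of use, RAG raises different copyright questions from training on generalised patterns.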
The importance of transparency
Although it is largely acknowledged that AI model developers use unlicensed material in training (and, to an extent, in fine-tuning and RAG), given the size of the training datasets and their methods of collection, it is often unclear precisely which works have been used and how. This presents a real issue for creators, who cannot address an infringement that they do not know about.
The EU has tried to address this problem in its Artificial Intelligence Act (2024/1689/EU) by requiring high-level disclosure of which datasets were used and what type of data they contained (see feature article “Copyright owners and AI: a brave new world?”, www.practicallaw.com/w-039-8519). While this is a good start, creators want specifics; ideally, identification on a file-by-file basis using digital object identifiers and metadata. This presents huge technical difficulties for foundation model providers, whose training runs routinely involve billions of datapoints, many of them with uncertain origins. They argue that overly detailed transparency will compromise the security of AI systems and have a chilling effect on the UK technology industry, pushing small AI businesses out of the market.
Transparency is relevant not only to training inputs but also to AI-generated outputs. While there is a general consensus that outputs should be labelled, the question is how to do it. The report contains a helpful summary of the approach taken in various jurisdictions, from the EU and California to China and South Korea. Many of these regimes apply the most stringent regulations to content that is difficult to identify as AI-generated or seeks to influence public opinion. The UK government has committed to work with experts to develop best practice on both input and output transparency. Its aim in relation to input transparency is to help rights holders to assert their rights in a proportionate way that does not impose unreasonable burdens on small developers. For output transparency, its aim is to establish consumer confidence and public trust in AI outputs. While these are both laudable aims, they are also both very difficult to achieve.
Permitted use of copyright works
Even if transparency requirements are resolved, there is still a dispute over whether UK law should allow the use of copyright works for AI model training. There is general agreement that the current law requires consent before AI models can be trained using copyright works, provided that the training happens in the UK. The treatment of AI models that have been trained in other countries is another question entirely, and rights holders continue to suffer the effects of uncertainty in this area.
When the consultation was issued, the government proposed four possible future options for this area of UK copyright law. These ranged from keeping the status quo to introducing a new exception allowing the use of copyright works for model training, with or without the ability for creators to opt out. The government framed the opt-out as its preference. Among the consultation respondents, the preference split was striking: 81% wanted to keep copyright broadly as it is now. Only 3% were in favour of allowing the use of protected works for model training where creators had not opted out. Most creative respondents were clear that a requirement to opt out would place an unreasonable administrative burden on rights holders, who could face significant costs in implementing any standard across multiple platforms and in monitoring whether AI developers were respecting the opt-outs.
The government has taken this on board and concluded that, in light of the strong views in the consultation responses, the gaps in evidence and the rapidly evolving AI sector and international context, its preferred way forward is no longer a broad copyright exception with an opt-out. Instead, the government has agreed to pause any reforms until it is confident that it can “get this right”. It wants any changes to copyright law to ensure both that rights holders can be fairly rewarded for the economic value that their work creates and that AI developers can access high-quality content. Although this is a worthwhile aim, it leaves both sides unsure: rights holders are still suffering at the hands of imported models while AI developers face a nebulous copyright infringement risk.
Licensing
One of the possible answers to the above tension is licensing. However, the report indicates that the appetite for licensing on the creative side is mixed. Institutional bodies such as publishers and collecting societies, which have the catalogue size and legal resources to negotiate meaningful deals, are largely supportive. On the other hand, many individual creators are vehemently against the idea and, understandably, refuse to agree to licensing arrangements that would legitimise the widespread unauthorised use of their works. Technology sector respondents were also broadly sceptical of licensing as the primary solution. They flagged the practical issues with ascertaining who the rights holder is and negotiating a fair market value, and warned that it would chill innovation, especially among new and smaller market entrants.
The report helpfully summarises which creative sectors are seeing the most licensing deals. Between March 2023 and February 2025, most of the announced AI training deals were in news publishing (68%), then images (14%) and academic publishing (7%). Most of the licensing deals were announced by AI-specific companies such as OpenAI and Perplexity, with large technology companies basing their AI offerings on other means, such as data mining their own platforms.
Other recommendations
Two further sections of the report, each of which considers comparable international regimes, indicate where the government might legislate next. The section on computer-generated works discusses the history, and lack of economic relevance, of the UK’s computer-generated works provisions, while the section on digital replicas and deepfakes looks at the potential for a new image rights regime.
To end the story
The ending to this story seems predictable: two opposing sides, with the government mediating when perhaps it needs to pick one. However, there may be merit in co-operation. The UK is home to numerous start-ups operating in the application layer, whose business models often depend on fine-tuning and RAG datasets made up of identifiable specialist copyright works. As the government noted in the impact assessment, the success of the AI sector and that of the creative industries are intertwined. The government now just needs to find a way to keep the UK creative industries economically valuable without stifling the next generation of AI start-ups. This is no mean feat.