Today is a difficult day for the UK’s creators and creative industries.
After two failed attempts to change copyright law in favour of AI companies, the government has launched a consultation on another proposal to allow them to train on British copyrighted work without a licence. And this time, it might be determined enough to force it through, despite the untold damage it will do to the creative industries.
The government has introduced a package of proposals, to be consulted on over the next 10 weeks, that boils down to the following:
AI companies will be allowed to train commercial AI models on copyrighted work without a licence.
Rights holders will be able to ‘reserve their rights’, i.e. opt out of training.
AI companies will need to embrace some level of transparency over the data they use to train their models.
Elements of this sound like good news for creators, and that’s how you’re meant to read it. The government has called it “win-win”. But this is very far from the truth. It would be a huge coup for AI companies, and the most damaging legislation for the creative industries in decades.
Generative AI competes with its training data
Introducing a broad copyright exception that permits AI companies to train commercial AI models on copyrighted work without a licence would be disastrous for the creative industries. This is for the simple reason that generative AI competes with the work it’s trained on.
This is not the narrative that generative AI companies like to present. We in the AI industry like to talk about democratisation, about AI letting more people be creative. But the fact that generative AI competes with its training data is inescapable.
A large language model trained on short stories will be able to create competing short stories. An AI image model trained on stock images will be able to create competing stock images. An AI music model trained on music that’s licensed to TV shows will be able to create competing music to license to TV shows. And these models, even if they’re imperfect, can be used so quickly and cheaply that this competition is inevitable.
There is ample data to back this up. Research published in the Harvard Business Review shows that the introduction of ChatGPT decreased writing jobs by 30% and coding jobs by 20%, and that AI image generators decreased image-creation jobs by 17%. Some artists’ incomes fell by a third after Midjourney was trained on their work. Filmmakers are abandoning human-composed music in favour of AI music. But data is hardly needed: the fact that generative AI will compete with its training data is self-evident.
This is why a broad copyright exception that allows unlicensed training on copyrighted work is so pernicious. It would in effect hand the life’s work of the UK’s creators to AI companies, letting them use it to build highly scalable competitors to those creators with impunity.
‘Rights reservations’ — or rather, opt-outs — don’t work
Under the government proposal, rights holders will need to proactively opt out their works in order to stop AI companies training on them.
But opt-out schemes for generative AI training do not actually let rights holders opt out their works. The reason is that such schemes only cover the copies of a work that a rights holder controls; they cannot reach downstream copies, which are out in the wild and beyond the rights holder’s control. Think of a photographer’s image used in an ad, or a journalist’s article screenshotted and shared online. The creative industries are built on downstream copies.
URL-based opt-out schemes only opt out specific URLs from training, but you can only opt out works at URLs you control. Metadata-based schemes add information to files themselves, but this information is often and easily removed, and some media types (e.g. text) cannot have metadata added. The best hope for a working solution is automatic content recognition (ACR) — some centralised repository of opted-out content that is scanned at point of training — but ACR technology is woefully inadequate for these purposes, and is particularly unhelpful for copyrighted works that are themselves embedded in other works.
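To make the fragility of metadata-based reservations concrete, here is a minimal sketch, in stdlib Python, of how trivially textual metadata can be stripped from a PNG. The tEXt, iTXt and zTXt chunks are where the PNG format carries textual metadata, which is where any embedded “do not train” declaration would live; removing them leaves the image pixels untouched.

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"
TEXT_CHUNKS = (b"tEXt", b"iTXt", b"zTXt")  # PNG's textual metadata chunks


def strip_text_chunks(png: bytes) -> bytes:
    """Return the PNG with all textual metadata chunks removed.

    Pixel data (IHDR, IDAT, IEND, etc.) passes through unchanged, so
    the image looks identical; only the metadata, and any opt-out
    notice stored in it, is gone.
    """
    assert png.startswith(PNG_SIG), "not a PNG file"
    out = bytearray(PNG_SIG)
    pos = len(PNG_SIG)
    while pos < len(png):
        # Each chunk: 4-byte length, 4-byte type, data, 4-byte CRC.
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype = png[pos + 4:pos + 8]
        if ctype not in TEXT_CHUNKS:
            out += png[pos:pos + 12 + length]
        pos += 12 + length
    return bytes(out)
```

Anyone re-hosting an image can run something like this — or simply re-save the file in an image editor, which has much the same effect — which is why a metadata-based reservation travels with a work far less reliably than the rights it is supposed to protect.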
Here is one example. I’m a composer, and recordings of my music exist in various places online, out of my control. I have no way of opting these out of training using any existing or hypothetical opt-out scheme. And there are millions of examples like this across the creative industries.
People in government have said the “technology has moved on”, suggesting it will somehow be possible as a rights holder to successfully keep your works from being trained on. I’ve run opt-out schemes at generative AI companies myself, and I am confident they are mistaken. The most widely-used opt-out scheme, robots.txt, is totally unfit for purpose — it gives rights holders no control whatsoever over whether downstream copies of their works are trained on. There are others, but none come close to solving the downstream copies problem. No one has even suggested a hypothetical opt-out scheme that would solve this. At the very least, no change to copyright law should be made that relies on opt-outs until an effective opt-out system is built and rigorously tested.
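To see why, it helps to look at what robots.txt actually is: a plain-text file served at the root of a single domain, addressed to named crawlers. In this sketch the crawler names are real user agents (GPTBot is OpenAI’s, CCBot is Common Crawl’s), but the domain is hypothetical:

```
# https://my-portfolio.example/robots.txt
# Asks these crawlers not to fetch anything on THIS domain.
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```

Note the two structural limits: compliance is voluntary on the crawler’s part, and the file speaks only for my-portfolio.example. A copy of the same work hosted on any other domain is untouched by it — which is exactly the downstream copies problem.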
The government’s apparent determination to run this consultation, and presumably change the law, before such a system exists, is extremely worrying.
Rights reservations / opt-outs are hugely unfair regardless
Even if some solution to the downstream copies issue can be found — which is highly unlikely — opt-out schemes are incredibly unfair to creators and rights holders.
There are many reasons for this. There is the fact that all the data suggests fewer than 10% of people eligible to opt out actually do, because many don’t realise they have the chance (consider recent research showing that 60% of artists still don’t know about robots.txt), and those who do face a huge administrative burden. There is the fact that the effect of opting out is nowhere near immediate: opt-out schemes don’t tend to impose deadlines for existing models to be retrained or retired, meaning models often remain live for months or even years after a rights holder has opted out. And there is the fact that opt-out schemes are disproportionately unfair to small creators, who are less likely to understand their rights or have the bandwidth to go through the opt-out process, despite being precisely the people who need protection the most. I’ve gone over these and other reasons opt-outs are unfair in this essay.
The government seems determined to push this through
The government has framed this as a consultation, and Culture Secretary Lisa Nandy recently said nothing has been decided. But it is common knowledge that the government is very keen to push this legislation through.
One sign it is a foregone conclusion is the continued reappearance of the idea that there is ‘uncertainty’ around existing UK copyright law. The government’s proposal echoes comments from various government sources in recent weeks, saying the aim is to “bring legal certainty to creative and AI sectors over how copyright protected materials are used in model training”. But the suggestion that there is currently any legal uncertainty over generative AI training is, whether intentionally or not, false. It is illegal to train commercial generative AI models on copyrighted work in the UK — even AI companies agree.
Why cite legal uncertainty where there is none? A generous interpretation is a simple misunderstanding of current law. A less generous one is that it provides a helpful pretext to change copyright law.
Why is the government so keen to change copyright law?
It’s important to consider the government’s motivations here.
Previous UK governments have failed to get this legislation through twice. Now they see an incoming Trump administration, with key adviser positions populated by people keen to accelerate AI development at all costs. They see talk of a race towards artificial general intelligence between the US and China. They see a UK AI ecosystem that is admittedly lagging far behind development in the US. They observe that AI is a huge area of economic growth around the world, and they want to participate in this growth. In this context, they think changing copyright law is required in order to compete.
But I believe this is misguided. The government does not need to change copyright law to build a prosperous AI industry. We’re not behind in AI because of copyright law — we’re behind in AI for the same structural reasons that our tech startup ecosystem has lagged the US’s for years. As Ian Hogarth put it recently, this comes down to issues like a lack of encouragement for founders and a lack of ‘audacious capital’. And as I said to the Culture, Media and Sport Committee last week:
It is absolutely possible [to be a global leader in AI without damaging our creative industries]. Some AI companies like to elide all of AI together and suggest that you need to deregulate all of it if you want any progress at all, but this simply is not true.
The major economic opportunity from AI does not come from exploiting the life’s work of the world’s creators without permission. If you look at the AI work that Sir Demis Hassabis won the Nobel Prize in Chemistry for, AlphaFold, it was not trained on creative work. Not a single important scientific discovery has come from AI trained on creative work. You train on creative work you haven’t licensed if you want to replace the creative industries with AI without paying them — not if you want to cure cancer with AI.
We should invest in data centres. We should provide AI companies with access to these. We should invest in AI education and training. We should give grants and tax breaks to AI companies. We should encourage the best researchers to come here through visa schemes. We can be world leaders in AI for healthcare, defence, logistics and science.
As regards AI in the creative industries, finally, we can be the home of responsible AI development and responsible AI companies. We can do all of that. We can be a global leader in all of that without destroying our creative industries by upending copyright law.
What is the ideal outcome from the consultation?
A broad copyright exception should not be introduced for generative AI training. This proposal should be dropped, along with the rights reservation (opt-out). In its place, the government should recommit to existing copyright law, and fully embrace the existing training data licensing regime that represents the only way the two industries can work together in a way that’s fair to both sides.
It’s also worth focusing on getting the transparency requirements right. The EU AI Act is on track to get this very badly wrong: in its draft code of practice, the promised data transparency requirements for AI companies have been replaced with a requirement that training data be made available to the AI Office on request. This is all but useless to rights holders, who can’t take action over their data being used if they don’t know it’s being used. Any transparency requirements that come out of the UK’s consultation should be clear requirements to make all training data public.
What can the creative industries do?
This will be the major question of the next 10 weeks. In the face of a government that seems to have made its mind up, what can be done?
What’s more, the government has priced in backlash from the creative industries. It already knows that creators vehemently object to this change to copyright law. 37,000 people signed a statement saying so, including many of the UK’s leading figures in the arts. The government knows this. Any action on the part of the creative industries must now go beyond statements of disapproval.
It is not going to be easy for the creative industries to modify the government’s proposal and turn it into something that is fair to both sides. I think it’s possible, but it’s going to require a huge show of unity — despite sometimes competing motivations. This consultation is a generational and existential threat to the combined creative industries. The future of the livelihoods of the country’s creators rests on the shoulders of those involved in the consultation in the coming weeks.
Get involved in the consultation here: https://www.gov.uk/government/consultations/copyright-and-artificial-intelligence/copyright-and-artificial-intelligence