
Do AI generators infringe copyright?

The legality of training AI generators on large online image datasets without consent from creators will be tested in court through two separate copyright lawsuits. One was filed by Getty Images in London; the other is a class action brought by artists in California.

The Verge was able to partially generate a Getty Images watermark, suggesting Stable Diffusion used pictures from Getty’s library. Source: The Verge.

Both lawsuits are against Stable Diffusion, an AI-driven text-to-image generator that’s capable of producing photorealistic images from text prompts. Stable Diffusion – like other AI image generators such as OpenAI’s Dall-E 2 and Google’s Imagen – is trained using millions or possibly billions of images.

One of the concerns is how AI generator developers are sourcing images without consent for a commercial operation, which may soon compete with content creators.

Both lawsuits target Stable Diffusion because its training dataset is open source, meaning it’s publicly accessible. By contrast, OpenAI hasn’t released its proprietary training data, although it claims to have trained Dall-E 2 using ‘hundreds of millions of captioned images’.

Waxy, a research blog by American tech writer Andy Baio, indexed the domains of 12 million images used to train Stable Diffusion. That’s only a small fraction of the 2.3 billion images used by Stable Diffusion – just 0.5 percent – but it provides insight into where the images came from.

Lo and behold, 47 percent of images came from just 100 domains. The largest share, 8.5 percent, came from Pinterest. The second largest source was art sales platform Fine Art America, with 698K images. All the top stock agency websites were scraped, with 497K images from 123RF; 171K from Adobe Stock; 117K from PhotoShelter; 35K from Dreamstime; 23K from iStockPhoto; 15K from Getty Images; and 10K from Shutterstock.
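To put those counts in proportion, here is a rough sketch that converts the figures quoted above into shares of the 12 million image sample. The numbers are the rounded ones from this article, not Baio's exact data, and the variable and function names are made up for illustration.

```python
# Rounded figures as quoted in the article; exact counts are not given here.
SAMPLE_SIZE = 12_000_000

image_counts = {
    "Fine Art America": 698_000,
    "123RF": 497_000,
    "Adobe Stock": 171_000,
    "PhotoShelter": 117_000,
    "Dreamstime": 35_000,
    "iStockPhoto": 23_000,
    "Getty Images": 15_000,
    "Shutterstock": 10_000,
}

# Fine Art America is an art sales platform, so it is left out of the stock total.
STOCK_AGENCIES = [name for name in image_counts if name != "Fine Art America"]


def share_of_sample(count, sample_size=SAMPLE_SIZE):
    """Return count as a percentage of the indexed sample."""
    return 100 * count / sample_size


shares = {domain: share_of_sample(n) for domain, n in image_counts.items()}
stock_total = sum(image_counts[name] for name in STOCK_AGENCIES)

for domain, pct in shares.items():
    print(f"{domain}: {pct:.2f}% of the sample")
print(f"Stock agencies combined: {stock_total:,} images "
      f"({share_of_sample(stock_total):.1f}%)")
```

On these rounded figures, the seven stock agencies account for 868K images, or roughly 7.2 percent of the sample, while Fine Art America alone makes up about 5.8 percent.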

Hundreds of thousands of images were also sourced from popular photo sharing websites like SmugMug and Flickr.

Baio’s research concludes that many artists, including photographers, have without doubt had their images used to train a tool which may one day threaten their business.

Getty strikes!

Getty’s lawsuit claims Stable Diffusion ‘unlawfully copied and processed millions of images protected by copyright and the associated metadata owned or represented by Getty Images’ for commercial benefit.

It’s interesting to note that Getty filed the lawsuit in the UK, rather than on its home turf in the US. This is probably because UK copyright law is more favourable to creators than US law, under which Stable Diffusion could try to stretch the Fair Use exception to copyright infringement to cover its usage.

In the UK, as in Australia, the best available defence is Fair Dealing. This only applies in cases where copyrighted material is used for the purpose of research, private study, criticism, review, news reporting, education, satire or parody. The courts must also be satisfied that the usage is ‘fair’.

‘We don’t believe this specific deployment of Stability’s commercial offering is covered by fair dealing in the UK or fair use in the US,’ Getty Images CEO Craig Peters told The Verge. ‘The company made no outreach to Getty Images to utilise our or our contributors’ material so we’re taking an action to protect our and our contributors’ intellectual property rights.’

Peters compares the current rise of AI generators with the dawn of digital music, when online piracy through file sharing was immensely popular and still untested in the courts.

‘I think there are ways of building generative models that respect intellectual property. I equate [this to] Napster and Spotify. Spotify negotiated with intellectual property rights holders – labels and artists – to create a service. You can debate over whether they’re fairly compensated in that or not, but it’s a negotiation based on the rights of individuals and entities. And that’s what we’re looking for, rather than a singular entity benefiting off the backs of others. That’s the long term goal of this action.’

Three artists challenge the AI revolution

Artists Sarah Andersen, Kelly McKernan, and Karla Ortiz – represented by Matthew Butterick and the Joseph Saveri Law Firm – are also suing Stable Diffusion. Butterick, who is also a writer and programmer, set up a website, stablediffusionlitigation.com, to provide further background.

‘Having copied the five billion images – without the consent of the original artists – Stable Diffusion relies on a mathematical process called diffusion to store compressed copies of these training images, which in turn are recombined to derive other images. It is, in short, a 21st-century collage tool,’ he writes.

‘These resulting images may or may not outwardly resemble the training images. Nevertheless, they are derived from copies of the training images, and compete with them in the marketplace. At minimum, Stable Diffusion’s ability to flood the market with an essentially unlimited number of infringing images will inflict permanent damage on the market for art and artists.’

Plaintiff Karla Ortiz, a full-time painter, feels deeply exploited by AI media models and wishes to set a legal precedent to ‘set this right’.

‘I am proud to be one of the plaintiffs named for this class action suit,’ she wrote on Twitter. ‘I am proud to do this with fellow peers, that we’ll give a voice to potentially thousands of affected artists. I’m proud that now we fight for our rights not just in the public sphere but in the courts.’

There is something of a legal PR war going on here, with Stable Diffusion fans publicly responding to Butterick’s claims through a dedicated debunking website.
