In this series of blog posts, the team at Lewis Silkin, Ius Laboris’s UK firm, unravel the legal issues involved in the development and use of AI text and image generation tools. This initial post explores how these tools operate and considers the legal cases being brought against major AI platforms in the UK and US. Later posts in the series will consider who owns the output of AI image generators, the infringement risks involved in using output created by AI, the ethical implications of programming AI, and what regulation is on the horizon.
Over the past year there has been an explosion in the use of AI models that generate text, art and music. These new platforms showcase an astounding (and at times slightly disconcerting) level of AI technology, available to all, which promises to revolutionise the way we work, write and create as a society.
However, this development and uptake of AI has taken place against a backdrop of legal uncertainty surrounding the extent to which the creation and use of these AI models is consistent with the rights of those whose creative works have been used to train the models. Litigation is now underway in the US and in the UK which may finally provide some clarity on the legality of AI image generation tools. The questions to consider at the outset are: how exactly do these tools work and what does the law have to say about this process?
By now, you will probably have heard of ChatGPT, the AI chatbot launched by OpenAI back in November 2022 that has set the internet abuzz with its ability to automatically generate text answers similar to those of a very real, knowledgeable and quick-thinking human being in response to user prompts.
Whether you want ChatGPT to write you a degree-level history essay on the Russian Revolution, explain the theory of relativity to you as if you were an 8-year-old, draft a script for an important marketing pitch, create a healthy meal plan for the next 7 days using ingredients you have in your fridge, or write a limerick about the dangers of robot intelligence, ChatGPT will get back to you in a matter of seconds. Its unique ability to factor in the context of a particular prompt and respond conversationally is what sets the tool apart from previous iterations of AI. Users can ask the AI to tailor a response to make it funnier, more detailed or to appeal to a specific audience, all with impressive results.
At a high level, text and language AI tools like ChatGPT are built using what is called a Large Language Model (‘LLM’), trained on a massive volume of sample text in different languages and on different subjects. GPT-3, the LLM behind ChatGPT, has been trained on billions of English-language data points, some inputted directly and some scraped from the internet, as well as specific computer coding languages like JSX and Python. Once the AI has been fed this data, the tool undergoes a period of ‘supervised learning’ using question–answer pairs provided by human AI trainers, teaching the model to map inputs to the desired outputs. After further periods of monitoring, fine-tuning and rewarding the AI for correct answers using rankings and quality comparisons, the AI is gradually optimised using a reinforcement learning algorithm, as sketched below. The result is an AI that, while only in its infancy, can already pass the Wharton Business School exam with a Grade B, and which may well pass the Turing Test at some point in the near future (if it hasn’t already).
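For readers who want a more concrete picture of that ‘rewarding’ step, the sketch below shows, in Python with PyTorch, the kind of pairwise ranking loss commonly used to train a reward model for reinforcement learning from human feedback. It is a toy illustration, not OpenAI’s actual code: the linear reward model, the 768-dimensional embeddings and the random data are all placeholder assumptions.

```python
# A toy sketch (not OpenAI's actual code) of the pairwise ranking loss used
# to train a reward model for reinforcement learning from human feedback:
# answers ranked higher by human trainers should receive a higher reward.
import torch
import torch.nn.functional as F

# Placeholder reward model: maps a 768-dimensional response embedding to a
# single scalar score. In practice this is itself a large neural network.
reward_model = torch.nn.Linear(768, 1)
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

preferred = torch.randn(4, 768)  # embeddings of human-preferred responses
rejected = torch.randn(4, 768)   # embeddings of dispreferred responses

# Bradley-Terry-style loss: -log sigmoid(reward_preferred - reward_rejected)
loss = -F.logsigmoid(reward_model(preferred) - reward_model(rejected)).mean()
loss.backward()
optimizer.step()  # the trained reward model then steers the policy updates
```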
The current generation of AI image generation tools such as Stable Diffusion, Midjourney and DALL·E 2 are designed to take a text description or prompt from a user and generate an image that matches that prompt. For example, if we ask the DALL·E 2 generator for a photograph of a cat wearing a suit, it generates the four images shown below.
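For the technically curious, driving one of these tools programmatically takes only a few lines of code. The sketch below uses the open-source Stable Diffusion model via Hugging Face’s diffusers library (DALL·E 2 itself is accessed through OpenAI’s hosted API instead); the particular model checkpoint and the assumption of a CUDA-capable GPU are incidental choices for illustration, not requirements of the technique.

```python
# Minimal sketch: prompt-to-image generation with the open-source
# Stable Diffusion model via the `diffusers` library.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed model checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # assumes a CUDA-capable GPU is available

image = pipe("a photograph of a cat wearing a suit").images[0]
image.save("cat_in_suit.png")
```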
As a first step in learning to create images in this way, these tools draw upon huge sets of matched text-image pairs. This process allows the AI to learn to create connections between the structure, composition and data within an image, and the accompanying text. The Stable Diffusion AI has, for example, used a dataset called LAION-5B, a set of 5.85 billion links to images stored on websites, paired with short descriptions of each image. By analysing images from the dataset which are described as ‘cat’, the AI is able to successfully identify when an image does or does not contain a cat. An example of some of the ‘cat’ images and their descriptions from the LAION-5B dataset is shown below.
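The effect of this text-image training can be demonstrated with OpenAI’s publicly released CLIP model, which was itself trained on matched text-image pairs in much the same way. The toy sketch below asks CLIP how well an image matches the caption ‘a photo of a cat’; the local file name is hypothetical.

```python
# Toy sketch: scoring whether an image matches the caption "a photo of a cat"
# using OpenAI's publicly released CLIP model via the `transformers` library.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # hypothetical local image file
inputs = processor(
    text=["a photo of a cat", "a photo with no cat in it"],
    images=image,
    return_tensors="pt",
    padding=True,
)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # image-text similarity scores
probs = logits.softmax(dim=-1)
print(probs)  # e.g. tensor([[0.98, 0.02]]) if the image contains a cat
```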
The AI model is then trained to generate images from random noise or static through a process called diffusion. In this second training process, the AI is given an image to which small amounts of random noise are progressively added, corrupting the image and destroying its structure until it appears to be entirely random static. As this process occurs, the AI attempts to understand how the addition of noise changes the image. An example of an image undergoing this training process is shown below.
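Mathematically, this ‘forward’ noising step is simple to state. The sketch below is a toy PyTorch illustration based on the standard DDPM formulation; the noise schedule values are conventional defaults for illustration, not anything specific to Stable Diffusion.

```python
# Toy sketch of the forward (noising) diffusion process, following the
# standard DDPM formulation: x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * noise,
# where the signal retained, a_t, shrinks towards zero as the timestep grows.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # conventional noise schedule
alpha_bars = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retained

def add_noise(x0: torch.Tensor, t: int) -> torch.Tensor:
    """Corrupt a clean image x0 to noise level t in a single jump."""
    noise = torch.randn_like(x0)
    return alpha_bars[t].sqrt() * x0 + (1 - alpha_bars[t]).sqrt() * noise

x0 = torch.rand(3, 64, 64)     # a toy 64x64 RGB "image"
x_mid = add_noise(x0, T // 2)  # partially corrupted
x_end = add_noise(x0, T - 1)   # visually indistinguishable from static
```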
The AI model is then trained on a reverse process, which is designed to restore or create structure and recognisable images from random static, as shown below.
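In this reverse training step, the network’s task reduces to a guessing game: given a corrupted image, predict exactly what noise was added, and be penalised on the error. Here is a minimal, self-contained sketch of that objective; the single convolutional layer is a deliberately tiny stand-in for the large U-Net used in real systems.

```python
# Toy sketch of the reverse-process training objective: corrupt an image to a
# random timestep and train the network to predict the noise that was added.
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)

# Stand-in denoiser: real systems use a large U-Net conditioned on the
# timestep (and, for text-to-image models, on the user's prompt).
denoiser = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)
optimizer = torch.optim.Adam(denoiser.parameters(), lr=1e-4)

x0 = torch.rand(8, 3, 64, 64)                 # batch of toy "images"
t = torch.randint(0, T, (8,))                 # a random timestep per image
noise = torch.randn_like(x0)
a = alpha_bars[t].view(-1, 1, 1, 1)           # broadcast per sample
x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise  # corrupted inputs

loss = F.mse_loss(denoiser(x_t), noise)       # penalise the prediction error
loss.backward()
optimizer.step()
```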
Finally, once it has been trained, the AI model can combine these two techniques to generate entirely new images, like the cats in suits shown above. To generate an image, the AI model is given random static along with the user’s text description of the image they are seeking. The model then undertakes a diffusion process (analogous to the step of reversing image corruption shown above) and attempts to create an image matching the user’s prompt.
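Putting the pieces together, generation is just the denoising step run in a loop, starting from pure static. The sketch below shows a DDPM-style sampling loop using the same toy schedule and stand-in denoiser as above; real text-to-image systems additionally condition each step on the user’s prompt, which is omitted here for brevity.

```python
# Toy sketch of DDPM-style sampling: start from pure static and repeatedly
# subtract the denoiser's noise estimate, re-adding a little fresh noise at
# every step except the last. Text conditioning is omitted for brevity.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)
denoiser = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)  # untrained stand-in

x = torch.randn(1, 3, 64, 64)  # begin with pure random static
for t in reversed(range(T)):
    with torch.no_grad():
        eps = denoiser(x)  # the network's estimate of the noise present in x
    mean = (x - betas[t] * eps / (1 - alpha_bars[t]).sqrt()) / alphas[t].sqrt()
    x = mean + betas[t].sqrt() * torch.randn_like(x) if t > 0 else mean
```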
Recently, disputes have arisen in the UK and in the US involving companies which have developed AI image generators. In the UK, the stock image company Getty Images has begun a lawsuit against Stability AI, the company responsible for the release of the Stable Diffusion image generator. In the US a separate class action lawsuit has been filed against Stability AI and the company responsible for the Midjourney AI art tool.
In a press release announcing the UK litigation, Getty has claimed that Stability AI ‘unlawfully copied and processed millions of images protected by copyright and the associated metadata owned or represented by Getty Images’. The litigation has only just begun and the parties have not yet set out their arguments. However, if the matter proceeds to trial, it is likely that Stability AI will argue that its use of any Getty images in the training process was transient or incidental and therefore does not infringe copyright. Stability AI is likely to argue that its AI model does not contain copies of Getty’s images, and that any use of the images to train the model was analogous to a human browsing the internet or using a search engine, which the UK Supreme Court has indicated does not, in the normal course of affairs, infringe copyright. The UK courts have not previously been asked to apply this copyright exception to the AI training process, and any decision has the potential to provide useful clarity on the legality of developing and training generative AI.
Even if Stability AI avoids copyright liability, it may be bound by the terms of use of the Getty Images website, which expressly forbid ‘any data mining, robots or similar data gathering or extraction methods’. If Stability AI trained Stable Diffusion on images from the Getty website, it may be in breach of these terms of use and face the prospect of an injunction or damages award.
This litigation and the demands of the growing AI sector may also prompt the UK government to consider whether the law should be amended to facilitate training AI programmes. In June 2022 the government announced a plan to introduce a new copyright and database exception permitting text and data mining for any purpose. Text and data mining involves the computerised analysis of large amounts of information to identify patterns, trends and other useful information, and is a technique often used in training AI systems. The UK government proposals were criticised by representatives of the creative industries, and in November 2022 the plans were withdrawn by the government to allow time for further reflection. It will be interesting to see how this litigation impacts the government’s priorities in this area.