What it means to poison artificial intelligence: Nightshade shows how

A number of artists, content producers and even record labels have filed lawsuits against companies that develop generative models based on artificial intelligence. Things have now reached the point where artificial intelligences are being "poisoned" so that they cannot use content protected by intellectual property law.

Generative models can produce data, often in the form of text, images or sounds, that appears to have been created by humans. They exploit deep artificial neural networks to "learn" from a set of training data and then generate data similar to that found in the training set. During the training phase, the model learns to detect relationships, structures and patterns in the data.

ChatGPT has been around since November 2022, and its boundless success has made evident the potential of generative models built on the idea of Transformers. That idea was presented in 2017 by a group of Google engineers, but the Mountain View company was caught by surprise by its actual implementation.

For OpenAI, the copyright problem does not exist in the case of generative models: the thesis put forward by the lawyers of Sam Altman's company is that the content produced by ChatGPT, as well as by other generative models based on GPT (Generative Pre-trained Transformer), is not generated from a database containing copyright-protected information but from the "knowledge" developed by the artificial intelligence (in part from copyright-protected content). In other words, the training phase of a generative model serves to develop the "skills" needed to produce new content, whether texts, images or sounds, exactly as a real person would do by reading texts published on the Web. The works produced by "GPT and associates" would, in short, be "vastly transformative" and covered by the principles of "fair use".

Introducing Nightshade, a mechanism for poisoning artificial intelligences

Training data is the fuel on which any generative model runs. Developed by researchers at the University of Chicago, Nightshade is an open source tool that lets you alter images before publishing them on the Internet, protecting your work against unauthorized use by artificial intelligence.

The mechanism is based on altering the pixels that make up the image in a way that is imperceptible to the human eye. A person looking at the image will not notice any difference between the original and the "poisoned" version. However, models that use that image during the training phase will irreparably "dirty" the information they collect.
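The idea of a bounded, invisible pixel edit can be sketched in a few lines. This is a toy illustration only, assuming a small per-pixel budget `epsilon`; the real Nightshade crafts its perturbation through an optimization in a model's feature space, not random noise:

```python
import numpy as np

def add_imperceptible_perturbation(image, epsilon=4, seed=0):
    """Add a small, bounded perturbation to an 8-bit RGB image.

    `epsilon` caps the per-pixel change (out of 255), which keeps the
    edit invisible to the human eye. Hypothetical sketch: Nightshade's
    actual perturbation is optimized, not random.
    """
    rng = np.random.default_rng(seed)
    noise = rng.integers(-epsilon, epsilon + 1, size=image.shape)
    poisoned = np.clip(image.astype(int) + noise, 0, 255).astype(np.uint8)
    return poisoned

# A random stand-in for a photo; any uint8 RGB array works.
original = np.random.default_rng(1).integers(0, 256, (64, 64, 3), dtype=np.uint8)
poisoned = add_imperceptible_perturbation(original)
max_change = int(np.abs(poisoned.astype(int) - original.astype(int)).max())
```

No pixel moves by more than `epsilon` levels out of 255, which is well below what a viewer can perceive in a natural photo.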

Nightshade misleads artificial intelligences about the names of the objects and scenes depicted in photos. For example, it is possible to "poison" images of dogs so that, in the eyes of an AI model, they look like cats. After being trained on a certain number of these "poisoned" images, the model will begin to generate images of dogs with unusual appearances that are far from reality.
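The dog-to-cat shift can be illustrated as an anchor image nudged toward a target concept while keeping its original caption. This is a hypothetical pixel-space sketch (function names and the `epsilon` budget are my own); Nightshade performs the equivalent shift in a model's feature space so the change stays invisible:

```python
import numpy as np

def poison_toward_target(anchor_img, target_img, epsilon=8):
    """Nudge an 'anchor' image (e.g. a dog photo) toward a 'target'
    concept image (e.g. a cat photo), bounded by epsilon per pixel.

    Toy illustration: the real attack optimizes the shift in feature
    space, so the picture still looks like a dog to humans while the
    learned association "dog" drifts toward "cat".
    """
    a = anchor_img.astype(int)
    t = target_img.astype(int)
    shift = np.clip(t - a, -epsilon, epsilon)  # bounded step toward target
    return np.clip(a + shift, 0, 255).astype(np.uint8)

# Build a tiny poisoned dataset: images still captioned "dog",
# but their content is biased toward the cat image.
rng = np.random.default_rng(0)
dogs = [rng.integers(0, 256, (32, 32, 3), dtype=np.uint8) for _ in range(3)]
cat = rng.integers(0, 256, (32, 32, 3), dtype=np.uint8)
poisoned_set = [(poison_toward_target(d, cat), "dog") for d in dogs]
```

A model trained on enough such pairs learns that the caption "dog" goes with cat-leaning content, which is exactly the confusion described above.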

It is difficult to recognize whether images are poisoned

The data poisoning technique used by Nightshade, presented in the researchers' paper, is difficult to counter: developers of generative models would have to identify and remove images containing artfully modified pixels. Since the changes are introduced in a way that is unrecognizable to the human eye, even automated data-extraction tools can find themselves in serious difficulty. Any "poisoned" image already included in a training dataset would have to be identified and removed; if a model had already been trained on such images, it would likely need to be retrained, with all the consequences that entails.

Poisoning artificial intelligences

The image above shows an example of training the Stable Diffusion XL model with images edited by Nightshade. The "clean" version of the model produces an image of a dog when the user requests one. After poisoning the model with 50 Nightshade images, the dog images start to become unstable. With 100-300 Nightshade-edited images, Stable Diffusion XL produces an image of a cat even when asked for a photo of a dog.

A tug of war has begun between the holders of content rights and the promoters of increasingly capable and complex artificial intelligences.

Opening image credit: iStock.com/Just_Super
