Artists Are Fighting Back Against AI
Antagonism towards AI within the artistic community is on the rise: what began as an informal boycott by visual artists of AI-generated content has now escalated into dozens of class-action and individual lawsuits in the USA.
While the courts have yet to weigh in on the matter, artists are not sitting idle: they are turning to tools that directly contaminate and confuse AI systems, such as Glaze, Nightshade, and Kudurru. Nightshade, for example, disrupts the matching of images with textual prompts by creating a discrepancy between the image and its text, tricking the AI into pairing, say, the prompt “car” with an image of a cow. “You can think of Nightshade as adding a small poison pill inside an artwork in such a way that it’s literally trying to confuse the training model on what is actually in the image,” says Ben Zhao, who leads the research team that built the tool. Working on the same principle, Glaze subtly modifies the pixels of an artwork so that AI cannot reproduce the style of a specific artist, while Kudurru tracks scrapers’ IP addresses in order to block them or send back unsolicited content (such as an image of a middle finger).
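For readers curious about the mechanics, the sketch below illustrates the general idea behind such feature-space attacks: nudging an image’s pixels, within an imperceptible budget, so that a vision-language model embeds it near an unrelated concept. This is a simplified illustration only, not Nightshade’s or Glaze’s actual algorithm; the public CLIP checkpoint, the random placeholder image, and the perturbation budget `eps` are all assumptions made for the sketch.

```python
import torch
from transformers import CLIPModel, CLIPProcessor

# A public CLIP checkpoint stands in for a generator's encoder; the real
# tools target the feature spaces of the models actually used for training.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
for p in model.parameters():
    p.requires_grad_(False)  # only the perturbation is optimised

# Embed the *wrong* concept: the artwork shows a car, but we pull its
# image embedding towards "cow".
text = processor(text=["a photo of a cow"], return_tensors="pt")
with torch.no_grad():
    target = model.get_text_features(**text)
    target = target / target.norm(dim=-1, keepdim=True)

image = torch.rand(1, 3, 224, 224)                    # placeholder artwork
delta = torch.zeros_like(image, requires_grad=True)   # the "poison pill"
opt = torch.optim.Adam([delta], lr=1e-2)
eps = 8 / 255                                         # keep changes near-invisible

for _ in range(200):
    poisoned = (image + delta).clamp(0, 1)
    feats = model.get_image_features(pixel_values=poisoned)
    feats = feats / feats.norm(dim=-1, keepdim=True)
    loss = 1 - (feats * target).sum()                 # cosine distance to "cow"
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        delta.clamp_(-eps, eps)                       # bound the perturbation

poisoned_image = (image + delta).clamp(0, 1).detach()
```

The real tools reportedly go much further, optimising for robustness against cropping, compression, and retraining, but the core mechanism is the same: a bounded pixel perturbation steered towards a mismatched target embedding.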
These digital tools allow artists to disrupt future AI models by “poisoning” the copyright works that may end up in training datasets. Data poisoning attacks manipulate training data so as to introduce unexpected behaviour into machine learning models at training time. As such, they will not help artists “untrain” existing AI models that have already ingested vast quantities of artworks, but they may prevent future training on creative works used without permission. The idea is eventually to taint and break future AI models to such an extent that AI companies are forced either to stop training on copyright works or to seek the authors’ permission for data scraping.
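At training time, the effect of such mismatched pairs resembles classic caption-level label flipping. The toy function below, with hypothetical names invented for illustration, shows the simplest version of the idea: silently corrupting a fraction of image-caption pairs before they reach the training loop.

```python
import random

def poison_captions(pairs, source="car", target="cow", rate=0.1, seed=0):
    """Swap a concept in a fraction of captions so that a model trained on
    the result learns a corrupted "car" <-> "cow" association.
    Conceptual toy only: artist-facing tools perturb the image instead,
    so the caption stays truthful to humans but misleading to the model."""
    rng = random.Random(seed)
    out = []
    for image, caption in pairs:
        if source in caption and rng.random() < rate:
            caption = caption.replace(source, target)
        out.append((image, caption))
    return out

# Example: with rate=1.0, every "car" caption now describes a cow.
dataset = [("img_001.png", "a red car on a street"),
           ("img_002.png", "a vintage car at sunset")]
print(poison_captions(dataset, rate=1.0))
```

Even a small poisoned fraction can be enough to degrade a concept if the attack concentrates on it, which is why the tools’ authors argue that scale works in the artists’ favour.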
The above tools arrive at a moment when the debate over the use of copyright works for training purposes is intensifying between two contrasting positions: one asserted by AI developers, who rely on “fair use” or other legal grounds to train AI tools on vast amounts of copyright material without the authors’ consent, and the other voiced by authors, who argue that such use requires the triple C (Consent, Credit, Compensate). And it seems that authors are not entirely alone in their position; for example, Ed Newton-Rex, an executive at Stability AI, recently resigned over the company’s view that it is acceptable to use artwork for training purposes without the artists’ permission. He told the BBC he thought it was “exploitative” for AI developers to use creative work without consent: “I think that ethically, morally, globally, I hope we’ll all adopt this approach of saying, ‘you need to get permission to do this from the people who wrote it, otherwise, that’s not okay,’” he said.
The use of copyright materials for training purposes thus remains one of the key issues in the present IP v. AI debate. While we await the first decisions of US courts, which should finally clarify whether training on copyright works is permissible under the “fair use” principle, Europe remains silent, with no indication yet of how the question will unfold in practice. At the same time, European countries may end up with a heavily fragmented approach: certain individual countries (such as France) are considering very author-friendly rules that would require AI developers to seek prior permission, credit all individual authors, and pay a fair tax on works used for training, in contrast to the current “text and data mining” exception introduced by the EU Digital Single Market Directive, which allows scraping of copyright works for certain purposes.
The evolving AI landscape raises questions about the ethical use of creative works in AI development, emphasizing the need for a balanced and comprehensive legal framework.