This new data poisoning tool lets artists fight back against generative AI

Technology@lemmy.world – 546 points – 9 months ago

A new tool lets artists add invisible changes to the pixels in their art before they upload it online so that if it’s scraped into an AI training set, it can cause the resulting model to break in chaotic and unpredictable ways.

The tool, called Nightshade, is intended as a way to fight back against AI companies that use artists’ work to train their models without the creator’s permission.
[...]
Zhao’s team also developed Glaze, a tool that allows artists to “mask” their own personal style to prevent it from being scraped by AI companies. It works in a similar way to Nightshade: by changing the pixels of images in subtle ways that are invisible to the human eye but manipulate machine-learning models to interpret the image as something different from what it actually shows.

Obviously this is using some bug and/or weakness in the existing training process, so couldn't they just patch the mechanism being exploited?

Or at the very least you could take a bunch of images, purposely poison them, and now you have a set of poisoned images and their non-poisoned counterparts allowing you to train another model to undo it.

Sure you've set up a speedbump but this is hardly a solution.

No! It's not using an internal exploit. Rather, it's about finding a way to visually represent almost the same image while using latent features associated with different artists (e.g., ones that would confuse a DreamBooth + LoRA training run). However, the method they proposed is flawed; I commented more at https://lemmy.world/comment/4770884

Obviously this is using some bug and/or weakness in the existing training process, so couldn’t they just patch the mechanism being exploited?

I'd assume the issue is that if someone tried to patch it out, it could legally be shown they were disregarding people's copyright.

It isn't against copyright to train models on published art.

The general argument legally is that the AI has no exact memory of the copyrighted material.

But if that's the case, then these pixels shouldn't need to be patched, because it wouldn't remember the material that spawned them.

That's just the argument I assume would be used.

It's like training an artist who's never seen a banana or a fire hydrant, by passing them pictures of fire hydrants labelled "this is a banana". When you ask for a banana, you'll get a fire hydrant. Correcting that mistake doesn't mean "undoing pixels", it means teaching the AI what bananas and fire hydrants are.
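To make the analogy concrete, here is a toy sketch. It is not how Nightshade or any real image model works; the two "features" (yellowness, redness), the sample values, and the function name `train_centroids` are all made up for illustration. The "model" just remembers the average features it saw under each label, so if hydrant-looking samples arrive labelled "banana", asking for a banana returns hydrant-like features.

```python
def train_centroids(samples):
    """Toy 'model': map each label to the mean feature vector seen under it.

    samples: list of (feature_vector, label) pairs.
    """
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


# Hypothetical features: (yellowness, redness), both in 0..1.
clean_data = [
    ((0.9, 0.1), "banana"),
    ((0.8, 0.2), "banana"),
    ((0.1, 0.9), "hydrant"),
    ((0.2, 0.8), "hydrant"),
]

# Poisoned set: hydrant-looking samples carrying the label "banana".
poisoned_data = [
    ((0.1, 0.9), "banana"),
    ((0.2, 0.8), "banana"),
]

clean_model = train_centroids(clean_data)
poisoned_model = train_centroids(poisoned_data)

# "Asking for a banana" means looking up what the model associates with the label.
clean_banana = clean_model["banana"]        # mostly yellow
poisoned_banana = poisoned_model["banana"]  # mostly red, i.e. hydrant-like
```

The fix in this toy is the same as in the analogy: retrain on correctly labelled examples, rather than trying to "undo" anything in the individual samples.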

Well, I guess we'll see how that argument plays in court. I don't see how it follows, myself.

What is "patching pixels" and who would do it?

Is that not answered in the original article?

Explain

In order to violate copyright you need to copy the copyrighted material. Training an AI model doesn't do that.

Obviously, with so many different AIs, this cannot be a factor (a bug).

If you have no problem looking at the image, then an AI would not either. After all, both you and the AI are neural networks.

The neural networks of a human and of an AI operate in fundamentally different ways. They also interact with an image in fundamentally different ways.

I would not call it “fundamentally” different at all. Compared to, say, a regular computer running a program that is not based on a neural network, they are quite similar and have similar properties. They can make mistakes, hallucinate, etc.

As a person who has done machine learning and some AI training, and who has a psychotic disorder, I hate that they call it hallucinations. It’s not hallucinations. Human hallucinations and AI hallucinations are different things. One is based on limited data, bias, or a bad data set, which builds a fundamentally bad neural network connection that can be repaired. The other is something that cannot be repaired: you are not working with bad data; your brain can’t filter data correctly and you are building wrong connections. It’s like an overdrive of input and connections that are all wrong. So you’re seeing things, hearing things, or believing things that aren’t real. You make logical leaps that are irrational and not true, and reality splits for you. While similarities exist, one happens because people input data wrong, cleaned it wrong, or didn’t have enough. The other happens because the human brain has a wiring problem caused by a variety of factors. It’s insulting, and it also humanizes computers too much and degrades people with this illness.

As I understand it, healthy people hallucinate all the time, but in a different, non-psychiatric sense. A healthy brain just has this extra filter that rejects all hallucinations that do not correspond to the signal coming from reality; that is, our brain performs extra checks constantly. But we often get fooled if those checks are not done correctly. For example, you can think that you saw some animal while it was just a shadow. There is even a statement that our perception of the world is a “controlled hallucination”, because we mostly imagine the world and then best-fit it to minimize the error from external stimuli.

Of course, current ANNs do not have such extensive error checking, thus they are more prone to those “hallucinations”. But fundamentally those are very similar to what we have in those “generative suggestions” our brain generates.

Those aren’t quite the same as a hallucination. We don’t actually call them hallucinations. Hallucination is a medical term. Those are visual disturbances, not “controlled hallucinations”. Your brain filtering it out, and your ability to ignore it, makes it not a hallucination. It’s a hallucination in a colloquial sense, not a medical one.

Fundamentally, AI is not working the same way. You are experiencing a process left over from a time when every shadow was a potential danger, so seeing a threat in the shadow first and triggering fight or flight was best for you as a species. AI has no fight or flight. AI has no motivation; AI just has limited, bad, or biased data that we put there, and it spits out garbage. It is a computer with no sentience. You are not really error checking; you are processing more information, or reassessing once the fight/flight response goes down. AI doesn’t have more information to process.

Many don’t see people with psychotic disorders as equal people. They see them as dangerous, and as people to be locked away. They use their illnesses and problems as jokes and slurs. Using terms for their illness in things like this only adds to their stigma.

You are arguing about terminology. Please google "controlled hallucinations" to see how people use the term in a non-psychiatric way.

I know how it is used in a non-psychiatric way; I brought up that it can be used colloquially. That doesn’t diminish the way that it can be used to harm and stigmatize an already stigmatized group of people. There are other terms that can be used, but this is used because people want to humanize AI and do not care about dehumanizing people who have psychotic disorders.

The fact of the matter remains that AI creators are not people who specialize in human brains, but they act like computers and human brains are one and the same. Similarity doesn’t equal the same processes. They can choose different language but they do not. They could call it a processing error, a glitch, a distortion. All would be accurate, but no, they chose a term that is harmful to a minority group because no one cares about stigmatizing them.

Look at 2 and 3: https://www.merriam-webster.com/dictionary/hallucination

And I just do not see how that can stigmatize a group of people. It is like saying that the use of the word "headache" in non-medical contexts (e.g., "this homework is a headache") stigmatizes people with migraines. It just does not.

Listen, I live in a state where, whenever someone commits a violent crime, the police say before they even catch the person, “he was hallucinating, they were hearing voices”, i.e. mental illness is why they are doing this, as a way to take away more rights. Also, in this state, if you are in a conservatorship for mental illness, you are legally barred from voting. How can you say hallucination is not a loaded term? It is different from headache because people are not stigmatized for migraines. No one is taking away your voting rights for migraines. No one is saying you are a murderer for migraines.

You can use nearly any word in a derogatory sense so that it becomes offensive: "he had a headache so strong, he went crazy". Context matters. And I personally do not even associate hallucination with mental illness. If anything, I associate it with psychedelics. Words are like tools: you can harm with them, but you can also use them appropriately.

An AI doesn't see images like we do. An AI sees a matrix of RGB values and the relationships they have with each other, and creates a statistical model of the color value of each pixel for a given prompt.
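As a rough illustration of the "matrix of RGB values" point, here is a minimal sketch in plain Python. The 2x2 image and the helpers `perturb` and `max_channel_diff` are made up for this example, and the uniform shift is only a stand-in: Nightshade and Glaze compute targeted perturbations, not a constant offset. The point is just that a change of a couple of units per channel is hard for a person to notice, yet every number the model receives is different.

```python
# A hypothetical 2x2 image: each pixel is an (R, G, B) triple in 0..255.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def perturb(img, delta):
    """Shift every channel by `delta`, clamped to the valid 0..255 range."""
    return [
        [tuple(min(255, max(0, channel + delta)) for channel in pixel) for pixel in row]
        for row in img
    ]


def max_channel_diff(a, b):
    """Largest absolute per-channel difference between two same-sized images."""
    return max(
        abs(ca - cb)
        for row_a, row_b in zip(a, b)
        for px_a, px_b in zip(row_a, row_b)
        for ca, cb in zip(px_a, px_b)
    )


shifted = perturb(image, 2)
largest_change = max_channel_diff(image, shifted)
```

Here no channel moves by more than 2 out of 255, so the two images look essentially identical to a person, while the model is handed a different matrix.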
