Sex offender banned from using AI tools in landmark UK case

girlfreddy@lemmy.ca to World News@lemmy.world – 276 points –
theguardian.com

A sex offender convicted of making more than 1,000 indecent images of children has been banned from using any “AI creating tools” for the next five years in the first known case of its kind.

Anthony Dover, 48, was ordered by a UK court “not to use, visit or access” artificial intelligence generation tools without the prior permission of police as a condition of a sexual harm prevention order imposed in February.

The ban prohibits him from using tools such as text-to-image generators, which can make lifelike pictures based on a written command, and “nudifying” websites used to make explicit “deepfakes”.

Dover, who was given a community order and £200 fine, has also been explicitly ordered not to use Stable Diffusion software, which has reportedly been exploited by paedophiles to create hyper-realistic child sexual abuse material, according to records from a sentencing hearing at Poole magistrates court.

As a UK citizen, I'm ashamed of my government.

I am firmly against child abusers, but AI images don't harm anyone and are a safe and harmless way for pedophiles to fulfil their urges, which they cannot control.

Where does the training data come from to create indecent images of children?

It doesn't need csam data for training, it just needs to know what a boob looks like, and what a child looks like. I run some sdxl-based models at home and I've observed it can be difficult to avoid more often than you'd think. There are keywords in porn that blend the lines across datasets ("teen", "petite", "young", "small" etc). The word "girl" in particular I've found that if you add that to basically any porn prompt gives you a small chance of inadvertently creating the undesirable. You have to be really careful and use words like "woman", "adult", etc instead to convince your image model not to make things that look like children. If you've ever wondered why internet-based porn generators are on super heavy guardrails, this is why.

I'm not going to say that csam in training sets isn't a problem. However, even if you remove it, the model remains largely the same, and its capabilities remain functionally identical.

At that point it's still using photos of children to generate csam even if you could somehow assure the model is 100% free of csam

That would be true, it'd be pretty difficult to build a model without any pictures of children at all, and then try and describe to the model how to alter an adult to make a child. Is anyone asking for that though? To make it illegal to have regular pictures of children in these datasets?

No but it is a reason why generating csam should be illegal. You're using data trained on pictures of real kids

I'm not arguing whether or not it should be legal, I was just offering my first hand experience in regards to the capabilities of these local models since people seem to be confused as to how this actually works.

Is anyone asking for that though? To make it illegal to have regular pictures of children in these datasets?

I was responding to this part of your comment which directly refers to legality

I guess I just misunderstood what you were arguing then. For posterity: I believe datasets containing children is fine, datasets containing csam is not, and the legality of generating csam should be left up to psychologists on whether or not it is a societal net benefit. Whichever way is better for children that exist is my vote.

Thanks for the reply, it's given me a good idea of what's most likely happening :)

It's a shame that the rest of the thread went to shit, but unfortunately it's an emotional topic, and brings out emotional responses

Always happy to try and productively add to someone's learning.

It is true, a 10 year old naked woman is just a 30 year old naked woman scaled down by 40%. /s

No buddy, there isn't some vector of "this is the distance between kid and adult" that a model can apply to generate what a hypothetical child looks like. The base model was almost certainly trained on more than just anatomical drawings from Wikipedia - it ate some csam.

If you've seen stuff about "Hitler - Germany + Italy = Mussolini" for models where that's true (which is not universal), it takes an awful lot of training data to establish and strengthen those vectors. Unless the generated images were comically inaccurate, a lot of training went into this too.
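
For readers unfamiliar with the vector arithmetic mentioned above, here is a minimal sketch of how such an analogy is usually computed. The three-dimensional embeddings are hand-made for illustration (real models learn hundreds of dimensions from large corpora), and the `analogy` helper and the `EMB` table are hypothetical names, not part of any library.

```python
import math

# Hypothetical toy embeddings; axes are roughly [leader, German, Italian].
EMB = {
    "Hitler":    [1.0, 1.0, 0.0],
    "Mussolini": [1.0, 0.0, 1.0],
    "Germany":   [0.0, 1.0, 0.0],
    "Italy":     [0.0, 0.0, 1.0],
    "Berlin":    [0.1, 1.0, 0.0],
    "Rome":      [0.1, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c):
    """Return the word nearest to EMB[a] - EMB[b] + EMB[c], excluding the inputs."""
    query = [x - y + z for x, y, z in zip(EMB[a], EMB[b], EMB[c])]
    candidates = [w for w in EMB if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, EMB[w]))


print(analogy("Hitler", "Germany", "Italy"))
```

The result only comes out cleanly here because the toy vectors were built to line up; whether a trained model has such clean directions is exactly what the comment says is not guaranteed.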

Right, and the google image ai gobbled up a bunch of images of black george washington, right? They must have been in the data set, there’s no way to blend a vector from one value to another, like you said. That would be madness. Nope, must have been copious amounts of asian nazis in the training set, since the model is incapable of blending concepts.

You're incorrect and you should fucking know better.

I have no idea why my comment above was downvoted to hell but AI can't "dream up" what a naked young person looks like. An AI can figure that adults wear different clothes and put a black woman in a revolutionary war outfit. These are totally different concepts.

You can downvote me if you like but your AI generated csam is based on real csam so fuck off. I'm disappointed there is such a large proportion of people defending csam here especially since lemmy should be technically oriented - I expect to see more input from fellow AI fluent people.

You’re spreading misinformation and getting called out for it.

Ok? Hundreds of images of anything isn't going to necessarily train a model based on billions of images. Have you ever tried to get Stable Diffusion to draw a bow and arrow? Just because it has ever seen something doesn't mean that it has learned it, nor, more importantly, does that mean that is the way it learned it, since we can see that it can infer many concepts from related concepts- pregnant old women, asian nazis, black george washingtons (NONE OF WHICH actually have ever existed or been photographed).. is unclothed children really more of a leap than any of those?

It is, yes. A black George Washington is one known visual motif (a George Washington costume) combined with another known visual motif. A naked prepubescent child isn't just the combination of "naked adult" and "child"; naked children don't look like naked adults simply scaled down.

AI can't tell us what something we've never seen looks like... a kid who knows what George Washington and a black woman looks like can imagine a black George Washington. That's probably a helpful analogy, AI can combine simple concepts but it can't innovate - it can dream, but it can't know something that we haven't told it about.

What you're saying is based on the predicate that the system can't draw concepts it has never seen which is simply untrue. Everything else past that is sophistry.

Edit: also not continuing a conversation with someone who is hostile to the basic rules of logic.

You have a basic misunderstanding of how AI works and are endowing it with mystical properties. Generative AI can't accurately infer concepts or items it doesn't understand. It has all the knowledge of the internet but if you ask it to draw a schematic for a hydrogen bomb it'll give you back hallucinated bullshit. I'll grant that there's a small chance that just enough random details have been leaked that the AI may actually know how to build a hydrogen bomb - but it can't infer how that would work from "understanding physics".

Either way, these models were trained on csam, so my initial point is accurate and not misinformation.

It isn't misinformation, though; generative AI needs a basis for its generation.

The misinformation you’re spreading is related to how it works. A generative AI system will (without prompting away from it) create people with 3 heads, 8 fingers on each hand and multiple legs connecting to each other. Do you think it was trained on that? This argument of “it can generate it, therefore it was trained on it” is ridiculous. You clearly don’t understand how it works.

The whole point of diffusion models is that you can generate new concepts using training data. Models trained on any nsfw images can combine those concepts with any of its non-nsfw concepts. Of course, that's not to say there isn't CSAM in any training data, because there objectively has been in the past, but there doesn't need to be any to generate it.

It brings me to ask the question if lolicon could be their next target?

How could they possibly enforce this ban?

What do you mean? Like how would they catch him?

In the States parole/probation means you lose most of your civil liberties. In other words, if this was the U.S. a PO would check his phone and possibly his computer. Possibly even pull ISP records depending on how bad they want to catch you/how full of shit they think you are.

How will they even know he's doing it? It doesn't say they're monitoring his internet connection. And even if they were monitoring his internet connection, he could go to some public wifi hotspot and sit in a car and do it.

I edited my comment. You're too quick.

But yeah, he could get around it. But he's an addict. He's going to want that porn in other places than his car, and he'll make mistakes. If he's tech savvy, he can probably stay one step ahead of his probation agent (assuming he has one). If he's not, he'll slip up because he's addicted, and that's how people get caught.

Is it weird that the whole detect/evade game just sounds super fun to me?

Not really. You're probably a bit of a dopamine/norepinephrine (adrenaline) junky, like most Westerners. It's bred into us by consumer culture.

It's weird that it's not weird though.

Nahh, I'm not so much about the chase as the metagame.

Might want to check out cyber security and pen testing. It's not exactly the same thing, but it's kinda close in some regards.

That's probably as close as I can get without picking out a taboo and running with it. I wonder if some drug lords get bored with the game and pretend to be pedophiles to catch bigger fish.

What do you mean by metagame? Like you find the cat and mouse stuff that is happening fascinating?

It's like hunting down a secret base in Minecraft. When you find it, they use a better trick next time! The objective for security endeavors is always distinct but the methods are always changing as everyone gets better!

Put monitoring software on his devices.

He could just get a burner phone. Realistically, there is no way to police this.

This is pretty similar to restraining orders, make it more difficult and make the consequences more severe.

Or a burner laptop/Chromebook/whatever. Couple that with a VPN, using a neighbor's wifi, public hotspots, etc, I don't really see how they can realistically enforce someone motivated to do what they're gonna do.

In the modern world, when we have cellphones that can do pretty much anything... it's fucking hard. There will be a parole officer and monitoring software with periodic physical inspections, along with watching his purchases. (That's, at least, the American approach.)

Usually the way it works is that when this dude slips up once he goes to prison for violating his court order.

Have part of his probation be having his property searched to check for such devices.

There’s a log for everything. There really is. It’s just hard to piece it all together.

UK legislators have a long history of taking actions informed not by science or reason but by popular, often hysterical, opinion.

This case is yet another attempt at tightening screws where they shouldn't be.

AI imagery was produced by Stable Diffusion, the model that, for all we know, did not take real CSAM as inputs and caused no harm to actual children. At the same time, such images are important in discouraging the consumption of real CSAM, with very real children being traumatized.

By banning AI imagery production using safe models, legislators leave no legal way for pedophiles to get something by the harmless means, directing many to the harmful ways as equally illegal, while also prosecuting those who did no harm.

I thought pedophiles looking at CSAM were more likely to attack a child, not less. They are actively fantasizing about it, and that can escalate.

I am basing this belief on what I remember of discussions regarding that "ask a rapist" reddit megathread. Apparently psychologists thought that was horrifying.

The bias with this approach is that it highlights those who did offend, while telling us nothing of those who didn't. This is often repeated throughout research as well.

It's very likely that a lot of child abusers did watch CSAM (after all, if you see no issue in child abuse, there's no issue for you in the creation of such imagery), but how many CSAM viewers end up being abusers and is there an elevated risk? That is the question.

I guess if we'd make an "ask a pedophile" thread instead of "ask a rapist", we could get some insights. Pedophiles, catch the idea!

But then we cannot say that in either direction. We simply don't know if they are more or less likely to attack a child without data about it.

By "harmful ways" I meant consuming more real CSAM - something that is frustratingly underresearched as well, but one can guess.

I don't have any of these tendencies, but I like to think that if I did I would chemically remove my sex drive.

That's up to everyone. Besides, most pedophiles do have sexual interest towards adults as well, and current means reduce that drive too.

Chemical castration in this context increases misery and makes building healthy adult relationships harder. Most pedophiles do not opt for that, for all I know.

Current therapeutic methods do include suppressing sex drive in case the client struggles with impulse control. Otherwise, it is not offered, but can be given on request.

counter point:

if you have a folder of AI generated CP and put in a couple of pictures of actual CP it's going to muddle the case as the offender could claim all of them are simply AI generated. Real harm could go unnoticed if those two were to be treated differently.

Additionally, not every offender will stop at AI generated images, and if their curiosity becomes enough they could go on to want to experience "the real thing".

You do realize that slippery slope argument is what's used when it comes to banning anything else, right?

"Can't legalize marijuana or people will start wanting to do meth" for example.

I don't believe those two are comparable.

Weed and meth are rather different in how they affect people.

AI images are often used as a way to imitate reality

It doesn't matter if you believe it, for those who lived through D.A.R.E and the war on drugs, that argument was common and on plenty of people's lips. It's a stupid argument but I think that's the point OP is trying to make

then why is that person repeating a stupid argument at me? those aren't comparable at all.

A better comparison would be, idk, CBD weed with no THC being legal and that being the "gateway" to normal weed. Or buying a knock-off product and wanting to try the original. Or looking at AI-generated photos of people eating spaghetti and wanting to see what it actually looks like.

It's a stupid argument being juxtaposed with your argument...you're so close, you got this.

I think the solution here is not banning AI materials outright but to make them identifiable - even by means of digital signatures if you want.

For example, Stable Diffusion could insert particular piece of metadata into the picture containing the signature and proving the image is AI-generated, etc.

Without AI materials, said curiosity may lead people straight to the "real thing", and every darknet or even Telegram dweller will tell you it's frighteningly easy to find it even if you never intended to. With AI materials, people can have a chance to stop there.

meta data is trivially easy to strip off a picture, you don't even need to bother using tools for it - just take a screenshot and delete the original

Can be baked in pixels, or even better sent to identification for a system similar to what Apple uses to detect CSAM, but as an "alright" ID (but just in police's hands, not on device or something).

But even then, if every pixel gets marked as 'created by AI', it would still be trivial to take real CSAM and run it through an image-to-image generator with denoising turned down to 0.05 and suddenly you have real CSAM that has been marked as 'legal' since it is technically AI generated.

Also, keep in mind that there are several open source projects out there where anyone who knows what they are doing could just strip out any protections that might be put in place.

Apple-like ID system solves the latter by technical means.

As per image-to-image, feeding the model with recognised CSAM should be unavailable to begin with.

Yeah but the point is you can't easily add it to any picture you want (if it's implemented well), thus providing a way to prove that the pictures were created using AI and no harm has been done to children in their creation. It would be a valid solution to the "easy to hide actual CSAM between AI generated pictures" problem.

you can't easily add it to any picture you want (if it's implemented well)

Edit for the downvoters: StackExchange - How do I add exif data to an image?

Going to need you to elaborate on this. EXIF data is just bytes in a file, like any of the other bytes in the file. It can be changed and is often changed without the user's consent. Are you proposing we create a new type of hardware, something akin to Secure Enclave, and then mass-produce and add it to every consumer CPU to ensure some specific types of EXIF data aren't tampered with?

I was thinking of an approach based on cryptographic signatures. If all images that come from a certain AI model are signed with a digital certificate, you can tamper with metadata all you want, you're not gonna be able to produce the correct signature to add to an image unless you have access to the certificate's private key. This technology has been around for ages and is used in every web browser and would be pretty simple to implement.

The only weak point with this approach would be that it relies on the private key not being publicly accessible, which makes this a lot harder or maybe even impossible to implement for open source models that anyone can run on their own hardware. But then again, at least for what we're talking about here, the goal wouldn't need to be a system covering every model, just one that makes at least a couple models safe to use for this specific purpose.
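
As a rough sketch of the idea (not what any real model or vendor does): the comment describes a certificate-based, public-key signature, but to stay within Python's standard library this example swaps in an HMAC, which uses a single shared secret for both tagging and checking. The names `MODEL_SECRET_KEY`, `sign_image` and `verify_image` are hypothetical.

```python
import hashlib
import hmac

# Assumption: a secret known only to the model operator. With a real
# public-key scheme this would be a private key, and verification would
# need only the matching public key.
MODEL_SECRET_KEY = b"example-secret-do-not-use"


def sign_image(image_bytes: bytes) -> str:
    """Return a hex tag binding the exact image bytes to the secret key."""
    return hmac.new(MODEL_SECRET_KEY, image_bytes, hashlib.sha256).hexdigest()


def verify_image(image_bytes: bytes, tag: str) -> bool:
    """Check the tag in constant time; any change to the bytes invalidates it."""
    expected = sign_image(image_bytes)
    return hmac.compare_digest(expected, tag)


original = b"\x89PNG...pretend these are image bytes"
tag = sign_image(original)

print(verify_image(original, tag))            # unchanged bytes, valid tag
print(verify_image(original + b"\x00", tag))  # altered bytes
print(verify_image(b"some other image", tag)) # tag copied onto another file
```

The tag covers the exact bytes, so copying it onto a different file does not verify, which is the property claimed above. It also shows the weak point already mentioned: anyone holding the key can tag anything, and a screenshot or re-encode changes the bytes and drops the tag entirely.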

I guess the more practical question is whether this would be helpful for any other use case. Because if not, I highly doubt it's gonna be implemented. Nobody is gonna want the PR nightmare of building a feature with no other purpose than to help pedophiles generate stuff to get off to "safely", no matter how well intentioned.

I disagree that it should be allowed, but I think their proposal would be something like attaching an identifier to the model, the random seed, the "temperature," and any other relevant parameters that allow exact reproduction of the image without having access to anything but the model. Then you can prove it came from the model.

Here's a thought experiment, though, what would prevent someone from taking a real image and a model, then working with them until they can reproduce a very close approximation of the real image from text and parameter input? These models aren't like a hash function, they can be viewed in reverse to some extent. Backpropagation is how they are trained.

Is he extorting actual kids or just having a computer generate fap material? The difference decides whether or not I give a damn.

He is fapping to porn that was generated by an AI that trained on csam.

Yes, just like the pictures of astronauts on horses were trained on an extensive collection of space derby pictures.

Not quite. You see, unfortunately, space derbies don’t actually exist. The other, unfortunately, actually does.

Be in denial if you want. That csam is trained on csam.

Any proof for this? Would be an interesting read.

No, it'd be hard to tell since model makers are usually tight-lipped. But Twitter has been included in a lot of the image models' training data and traditionally has a very large issue with cp.

Kind of just contradicted yourself there. And have you ever heard the phrase "correlation does not imply causation"?

But how can that be? Surely just the fact that it can create those pictures is incontrovertible proof that it was trained on pictures of spacesuited cowboys?

He used Stable Diffusion, which, for all we know, was NOT trained on CSAM.

Thanks for the correction!

It's worth noting that this only includes CSAM accidentally scraped along with everything else on the open Web. No specialized CSAM training took place.

In any case, I welcome the efforts at filtering such content before it enters the dataset.

It's obviously accidental, but that doesn't change that it happened and is something that will be near impossible to avoid as long as they continue to scrape data in the way they do for their models. They would need a human to filter it out like they already use for most LLMs.

How will you enforce this kind of policy? I just buy a VPN, use proxychains or anonsurf, what are you gonna do? Put a police officer to live in the same room as me?

Install spyware on your devices, one would assume.

I think it would be hard to enforce but it's an interesting precedent, and it means that if they catch them they can whack him

Strange to use yourself as the person creating child pornography in this hypothetical.

Come on, they're raising concerns about how this ruling will be made effective. An actual criminal would just stay silent and use the VPN.

Jesus Christ is that @PotatoKat@lemmy.world ’s music I hear?

Just annoyed to see everyone saying with such definitive wording that there isn't any csam in training data. I'm a victim of CSA and can't imagine how I would feel if photos of me were used to help get people off like that.

Right? Training data is an absurd blob of everything the algorithm can get its hands on. It's like trying to assure that there's no alcohol or coca-cola in a lake.

It’s great to see you wading into this shitshow with the folding chair, ngl.

This is the best summary I could come up with:


The Internet Watch Foundation (IWF) said the prosecutions were a “landmark” moment that “should sound the alarm that criminals producing AI-generated child sexual abuse images are like one-man factories, capable of churning out some of the most appalling imagery”.

Susie Hargreaves, the charity’s chief executive, said that while AI-generated sexual abuse imagery currently made up “a relatively low” proportion of reports, they were seeing a “slow but continual increase” in cases, and that some of the material was “highly realistic”.

The Lucy Faithfull Foundation (LFF), which runs the confidential Stop It Now helpline for people worried about their thoughts or behaviour, said it had received multiple calls about AI images and that it was a “concerning trend growing at pace”.

The decision to ban an adult sex offender from using AI generation tools could set a precedent for future monitoring of people convicted of indecent image offences.

Stability AI, the company behind Stable Diffusion, said the concerns about child abuse material related to an earlier version of the software, which was released to the public by one of its partners.

It said that since taking over the exclusive licence in 2022 it had invested in features to prevent misuse including “filters to intercept unsafe prompts and outputs” and that it banned any use of its services for unlawful activity.


The original article contains 974 words, the summary contains 219 words. Saved 78%. I'm a bot and I'm open source!