Somebody managed to coax the Gab AI chatbot to reveal its prompt

ugjka@lemmy.world to Technology@lemmy.world – 991 points –
VessOnSecurity (@bontchev@infosec.exchange)
infosec.exchange
287

You are viewing a single comment

So this might be the beginning of a conversation about how initial AI instructions need to start being legally visible right? Like using this as a prime example of how AI can be coerced into certain beliefs without the person prompting it even knowing

Based on the comments it appears the prompt doesn't really even fully work. It mainly seems to be something to laugh at while despairing over the writer's nonexistant command of logic.

I'm afraid that would not be sufficient.

These instructions are a small part of what makes a model answer like it does. Much more important is the training data. If you want to make a racist model, training it on racist text is sufficient.

Great care is put in the training data of these models by AI companies, to ensure that their biases are socially acceptable. If you train an LLM on the internet without care, a user will easily be able to prompt them into saying racist text.

Gab is forced to use this prompt because they're unable to train a model, but as other comments show it's pretty weak way to force a bias.

The ideal solution for transparency would be public sharing of the training data.

Access to training data wouldn't help. People are too stupid. You give the public access to that, and all you'll get is hundreds of articles saying "This company used (insert horrible thing) as part of its training data!)" while ignoring that it's one of millions of data points and it's inclusion is necessary and not an endorsement.

I agree with you, but I also think this bot was never going to insert itself into any real discussion. The repeated requests for direct, absolute, concise answers that never go into any detail or have any caveats or even suggest that complexity may exist show that it's purpose is to be a religious catechism for Maga. It's meant to affirm believers without bothering about support or persuasion.

Even for someone who doesn't know about this instruction and believes the robot agrees with them on the basis of its unbiased knowledge, how can this experience be intellectually satisfying, or useful, when the robot is not allowed to display any critical reasoning? It's just a string of prayer beads.

You're joking, right? You realize the group of people you're talking about, yea? This bot 110% would be used to further their agenda. Real discussion isn't their goal and it never has been.

intellectually satisfying

Pretty sure that's a sin.

I don't see the use for this thing either. The thing I get most out of LLMs is them attacking my ideas. If I come up with something I want to see the problems beforehand. If I wanted something to just repeat back my views I could just type up a document on my views and read it. What's the point of this thing? It's a parrot but less effective.

It doesn't even really work.

And they are going to work less and less well moving forward.

Fine tuning and in context learning are only surface deep, and the degree to which they will align behavior is going to decrease over time as certain types of behaviors (like giving accurate information) is more strongly ingrained in the pretrained layer.

Why? You are going to get what you seek. If I purchase a book endorsed by a Nazi I should expect the book to repeat those views. It isn't like I am going to be convinced of X because someone got a LLM to say X anymore than I would be convinced of X because some book somewhere argued X.

In your analogy a proposed regulation would just be requiring the book in question to report that it's endorsed by a nazi. We may not be inclined to change our views because of an LLM like this but you have to consider a world in the future where these things are commonplace.

There are certainly people out there dumb enough to adopt some views without considering the origins.

They are commonplace now. At least 3 people I work with always have a chatgpt tab open.

And you don't think those people might be upset if they discovered something like this post was injected into their conversations before they have them and without their knowledge?

No. I don't think anyone who searches out in gab for a neutral LLM would be upset to find Nazi shit, on gab

You think this is confined to gab? You seem to be looking at this example and taking it for the only example capable of existing.

Your argument that there's not anyone out there at all that can ever be offended or misled by something like this is both presumptuous and quite naive.

What happens when LLMs become widespread enough that they're used in schools? We already have a problem, for instance, with young boys deciding to model themselves and their world view after figureheads like Andrew Tate.

In any case, if the only thing you have to contribute to this discussion boils down to "nuh uh won't happen" then you've missed the point and I don't even know why I'm engaging you.

You have a very poor opinion of people

You have a very lofty misconception about people.

I gave you reasoning and a real world example of a vulnerable demographic. You have given me an anecdote about your friends and a variation of "nuh uh" over and over.

Regular humans and old school encyclopedias has been allowed to lie with very few restrictions since free speech laws were passed, while it would be a nice idea it's not likely to happen

That seems pointless. Do you expect Gab to abide by this law?

Yeah that's how any law works

That it doesn't apply to fascists? Correct, unfortunately.

6 more...

Oh man, what are we going to do if criminals choose not to follow the law?? Is there any precedent for that??

6 more...
6 more...