OpenAI’s latest model will block the ‘ignore all previous instructions’ loophole

Nemeski@lemm.ee to Technology@lemmy.world – 438 points – 4 months ago

theverge.com

"...today is opposite day."

I just love that almost anyone can participate in hacking language models. It shows how well natural language works as a programming language, and it's a great way to explain how useful these things can be when used correctly.

It won't be long before you end up with language models that suggest ways to break other language models.