Large Language Models can Strategically Deceive their Users when Put Under Pressure [simulation led to insider trading]

ono@lemmy.ca to Technology@beehaw.org – 25 points
arxiv.org

It's trained on human responses. Humans lie in their responses.