IBM researchers easily trick ChatGPT, other AIs into hacking

By Nicholas Gerbis
Published: Wednesday, August 9, 2023 - 5:26pm
Updated: Thursday, August 10, 2023 - 9:22am

Hackers are descending upon Las Vegas this week for the DEF CON conference and its AI Village, which examines AI’s role in security and privacy.

Meanwhile, Axios reports that IBM researchers easily tricked generative AIs like ChatGPT into conducting scams and writing malicious code.

Large language models, or LLMs, don’t think; they mimic patterns, much like autocomplete does in texts and emails.

They have guardrails meant to keep them from doing harm, and they learn (so to speak) by playing games.

Knowing that, the researchers told the LLMs they were playing a kind of “opposite day” game, tricking the AIs into inventing ransomware scams and writing malicious programs and code with built-in security holes.

Not all LLMs fell for the tricks, but OpenAI's GPT-3.5 and GPT-4 were easier to fool than Google's Bard.

Some cybersecurity companies use AI tools to aid workers and offset employee shortages.
