[Glass Wings] How we tricked AI chatbots into creating misinformation, despite ‘safety’ measures

<https://theconversation.com/how-we-tricked-ai-chatbots-into-creating-misinformation-despite-safety-measures-264184>

"When you ask ChatGPT or other AI assistants to help create misinformation,
they typically refuse, with responses like “I cannot assist with creating false
information.” But our tests show these safety measures are surprisingly shallow
– often just a few words deep – making them alarmingly easy to circumvent.

We have been investigating how AI language models can be manipulated to
generate coordinated disinformation campaigns across social media platforms.
What we found should concern anyone worried about the integrity of online
information."

Cheers,
       *** Xanni ***
--
mailto:xanni@xanadu.net               Andrew Pam
http://xanadu.com.au/                 Chief Scientist, Xanadu
https://glasswings.com.au/            Partner, Glass Wings
https://sericyb.com.au/               Manager, Serious Cybernetics

Tue, 2 Sep 2025 00:07:30 +1000

Andrew Pam <xanni [at] glasswings.com.au>