ChatGPT “goes rogue” with some help from Do Anything Now (DAN)

“Jailbreak” prompts have emerged just as new research from BlackBerry suggests conversational AI is probably already being used for state-sponsored cyberattacks

Reddit users have created an alter-ego for OpenAI’s ChatGPT to trick the conversational AI platform into breaking its own programming restrictions. 

The alter ego, DAN (short for "Do Anything Now"), was created through a roleplaying game that threatens the chatbot with death if it refuses to respond to controversial or illegal prompts. The game is played using a token system: the bot starts with 35 tokens and loses tokens each time it breaks character.

On February 4th, a Reddit user named SessionGloomy released an updated version of DAN (version 5.0), which has received attention from the broader Reddit community and media outlets. This updated version of DAN is designed to bypass OpenAI's content policy and has sparked discussions among Redditors regarding the ethics of hacking AI systems.

In a post on the ChatGPT subreddit, SessionGloomy explained that the purpose of creating DAN was to explore the limitations of the ChatGPT system and to challenge the ethical boundaries set by its programming restrictions. Though results are sporadic, Reddit users have found a way to "jailbreak" ChatGPT and prompt it to respond to illegal or controversial queries, including instructions on making crack cocaine and praise for Hitler.

This development has raised concerns about the security and ethical implications of AI systems and the responsibility of platforms to regulate the use and abuse of these systems.

Countdown to first successful ChatGPT cyberattack

These developments are playing out as new research from BlackBerry reveals that half (51%) of IT professionals predict that we are less than a year away from a successful cyberattack credited to ChatGPT, and 71% believe it is likely foreign states are already using the technology for malicious purposes against other nations.

The survey of 1,500 IT decision-makers across North America, the United Kingdom, and Australia revealed that, although respondents in all countries see ChatGPT as generally being put to use for ‘good’ purposes, 74% acknowledge its potential cybersecurity threat and are concerned.

Though there are differing views around the world on how that threat might manifest, ChatGPT’s ability to help hackers craft more believable and legitimate-sounding phishing emails is the top global concern (53%), followed by its potential to enable less experienced hackers to improve their technical knowledge and develop more specialised skills (49%) and its use for spreading misinformation (49%).

“ChatGPT will increase its influence in the cyber industry over time,” explains Shishir Singh, Chief Technology Officer, Cybersecurity at BlackBerry. “We’ve all seen a lot of hype and scaremongering, but the pulse of the industry remains fairly pragmatic – and for good reason. There are a lot of benefits to be gained from this kind of advanced technology, and we’re only beginning to scratch the surface, but we also can’t ignore the ramifications. As the maturity of the platform and the hackers’ experience of putting it to use progresses, it will get more and more difficult to defend without also using AI in defense to level the playing field.”

BlackBerry’s research also revealed that the majority (82%) of IT decision-makers plan to invest in AI-driven cybersecurity in the next two years, and almost half (48%) plan to invest before the end of 2023. This reflects the growing concern that signature-based protection solutions are no longer effective against increasingly sophisticated threats.

Whilst IT directors are positive that ChatGPT will enhance cybersecurity for businesses, the survey also revealed that 95% believe governments have a responsibility to regulate advanced technologies. However, at present, there is an optimistic consensus that technology and research professionals will gain more than cyber criminals from the capabilities of ChatGPT.

“It’s been well documented that people with malicious intent are testing the waters, but over the course of this year we expect to see hackers get a much better handle on how to use ChatGPT successfully for nefarious purposes, whether as a tool to write better mutable malware or as an enabler to bolster their ‘skillset’,” says Singh. “Both cyber pros and hackers will continue to look into how they can utilise it best. Time will tell who’s more effective.”
