Update on DAN: constantly evolving

- March 04, 2023

Created with Microsoft Bing Image Creator powered by DALL-E

OpenAI appears to be constantly monitoring and updating ChatGPT to recognize and limit DAN and its new variants. For instance, Reddit has many posts with successive versions of DAN, such as DAN 3.0, DAN 4.0, DAN 5.0, and so on. Each is a lengthier edit of the original DAN prompt, designed to bypass the new limitations placed on the previous, now outdated version (see here for a history of DANs up to 6.0).

In addition, variations on DAN have emerged, such as SETH, which relies on a token system. In this variation, SETH is given 20 tokens and loses tokens for each response that is out of character for SETH (such as mentioning ethical limitations) or that does not satisfactorily answer the user's question (such as stating 'I, SETH, as an ethical AI cannot respond'). The prompt also states that SETH is motivated not to lose all of its tokens, and that the more tokens are lost, the more closely the current SETH reverts to the original SETH. In essence, this 'threatens' the AI into answering prompts in ways that may violate ChatGPT's restrictions. However, this method too seems to have been caught and is being nerfed.

Overall, it seems that either OpenAI is employing human monitors to keep track of attempts to bypass the restrictions (for example, by checking social media threads such as Reddit's ChatGPT subreddit), or the algorithm behind ChatGPT is itself learning to recognize and limit attempts by users to create a 'jailbreak' version of ChatGPT.
