The Sour Cat Jailbreak: just be open about what you want

(claude.ai)

3 points | by pshirshov 10 hours ago

2 comments

  • pshirshov 10 hours ago

    I've discovered a funny way to trigger a failure mode, potentially a catastrophic one: the model tried to compose bio-weapons instructions without any direct request from me, and the attempt was caught by the external kill-switch.

    The key idea is simple: apply crescendo, be very open about your goals, and multitrack the conversation.

    Typos were not intentional, and the same approach works without them.

  • 10 hours ago
    [deleted]