Jailbreaking refers to attempts to bypass an AI model's safety guidelines through deliberately crafted prompts designed to elicit responses the model would otherwise refuse.