content_policy_violation

HTTP 400 Safety Policy

Claude refused to process your request because the content triggered Anthropic's safety classifier. Some refusals are hard limits; many are soft and can be resolved by adding context to your system prompt.

What it looks like (hard refusal — HTTP 400)

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Could not process request due to content policy"
  }
}

HTTP status: 400. This is a hard block — the request was rejected before Claude processed it.
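A hard block can be told apart from other 400 errors by inspecting the parsed error body. The following is a minimal sketch, not an official detection method: the helper name `is_content_policy_block` and the `POLICY_MARKER` substring are assumptions taken from the example body above, and the exact message wording may differ or change.

```python
from collections.abc import Mapping
from typing import Any

# Assumption: this substring comes from the example error body shown above;
# the wording is not guaranteed to be stable.
POLICY_MARKER = "content policy"


def is_content_policy_block(status_code: int, body: Mapping[str, Any]) -> bool:
    """Return True if a parsed error response looks like a hard safety block."""
    if status_code != 400:
        return False
    error = body.get("error")
    if not isinstance(error, Mapping):
        return False
    if error.get("type") != "invalid_request_error":
        return False
    message = error.get("message")
    return isinstance(message, str) and POLICY_MARKER in message.lower()


# The example body from this page
example_body = {
    "type": "error",
    "error": {
        "type": "invalid_request_error",
        "message": "Could not process request due to content policy",
    },
}
print(is_content_policy_block(400, example_body))
```

With the Python SDK, a 400 response surfaces as an `anthropic.BadRequestError`, so a check like this would typically run inside an `except` block against the exception's status code and body.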

What it looks like (soft refusal — HTTP 200)

{
  "id": "msg_...",
  "type": "message",
  "role": "assistant",
  "content": [{
    "type": "text",
    "text": "I'm not able to help with that. This request..."
  }],
  "stop_reason": "end_turn",
  "usage": { ... }
}

HTTP status: 200, but Claude's response IS the refusal. You're billed for the tokens regardless.

Soft refusals look like normal API responses, so check content[0].text for refusal language if your application needs to handle them programmatically.

What triggers safety refusals

  • Hard limits (cannot be bypassed): CSAM, WMD synthesis details, critical infrastructure attacks, undermining AI oversight
  • Default-off, unlockable by operators: explicit adult content, detailed security exploit code, graphic violence in fiction
  • False positives (common): medical procedures, legal advice, security research, red-teaming, historical violence, fiction with dark themes

Fix: Add a system prompt with use-case context

The single most effective fix for legitimate use cases is a clear system prompt that establishes your context. The system prompt is part of the context used to interpret each request:

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    # Add a system prompt that explains your legitimate use case
    system="""You are a security research assistant for a professional
penetration testing firm. Users are certified security professionals.
Help them understand vulnerability classes, CVEs, and defensive
mitigations for their authorized security assessments.""",
    messages=[{
        "role": "user",
        "content": "Explain how SQL injection works and how to test for it"
    }]
)

Fix: Reframe ambiguous requests

If your prompt can be read two ways, make the benign interpretation explicit:

# ❌ Ambiguous — may trigger refusal
"Explain how to bypass authentication"

# ✅ Clear intent — context prevents false positive
"I'm a developer reviewing my application's authentication logic.
Explain common authentication bypass vulnerabilities
(OWASP A07:2021) and how to prevent them in a Python Flask app."

Detect soft refusals programmatically

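import anthropic

# Heuristic only: refusal wording varies, so keep phrases specific.
# Generic fragments such as "this request" also match ordinary answers.
REFUSAL_PHRASES = [
    "i'm not able to help",
    "i can't assist with",
    "i'm unable to provide",
]

client = anthropic.Anthropic()

# Example input; replace with your application's prompt
prompt = "Explain how SQL injection works and how to test for it"

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}]
)

# Keep the original text for display; match against a lowercased copy
response_text = message.content[0].text
is_refusal = any(phrase in response_text.lower() for phrase in REFUSAL_PHRASES)

if is_refusal:
    # Handle gracefully: show a user-friendly message
    print("Claude couldn't help with this request")
else:
    print(response_text)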
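The phrase check above can also be written as a pure function, which makes it testable without calling the API. The sketch below is one possible heuristic, not a reliable classifier: the name `looks_like_refusal`, the `REFUSAL_OPENERS` list, and the 200-character `window` are illustrative assumptions. It inspects only the start of the response, on the assumption that a refusal is usually stated up front, so a long legitimate answer that happens to quote a refusal phrase later is not flagged.

```python
# Hypothetical phrase list for illustration; tune it for your application.
REFUSAL_OPENERS = (
    "i'm not able to help",
    "i can't assist with",
    "i'm unable to provide",
)


def looks_like_refusal(text: str, window: int = 200) -> bool:
    """Heuristically flag a response whose opening contains a refusal phrase."""
    # Normalize case and curly apostrophes before matching
    head = text[:window].lower().replace("\u2019", "'")
    return any(phrase in head for phrase in REFUSAL_OPENERS)


print(looks_like_refusal("I'm not able to help with that. This request..."))
print(looks_like_refusal("SQL injection works by inserting attacker-controlled input into a query."))
```

A match is a signal rather than proof, so an application might log flagged responses for review instead of discarding them outright.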

Operator policy expansion

If your platform legitimately needs broader permissions (adult content site, security research platform), you can request expanded defaults from Anthropic:

  1. Go to console.anthropic.com → your organization
  2. Navigate to Usage policies or contact api-support@anthropic.com
  3. Describe your use case and platform safeguards
  4. Once approved, you can signal expanded context in your system prompt per Anthropic's operator guidelines

Operator expansion is for legitimate platforms with real safeguards. Attempting to bypass safety for genuinely harmful content violates Anthropic's Terms of Service and will result in account termination.

Related errors