What HTTP status does content_policy_violation return?

Anthropic API content refusals take one of two forms: HTTP 400 with error type 'invalid_request_error', or HTTP 200 with a stop_reason of 'end_turn' where the message body is the refusal text. Hard safety violations return the 400 error; soft refusals arrive as a normal response, streamed or not.

Why is Claude refusing my legitimate request?

Claude's safety classifier can produce false positives on medical, legal, and security topics framed without context, on fiction involving violence, on red-teaming or security research, and on prompts that resemble jailbreaks even when they are benign. Adding a clear system prompt that states your use case (e.g. 'You are a medical information assistant for healthcare providers') is usually the most effective way to reduce false positives.

What content is permanently blocked in the Claude API?

Anthropic's absolute limits include: CSAM, detailed instructions for weapons of mass destruction (bio/chem/nuclear/radiological), attacks on critical infrastructure, and content that undermines AI oversight. These cannot be unlocked by any system prompt or API tier.

Can I reduce Claude's safety refusals for my platform?

Yes. API operators can expand Claude's defaults for adult content platforms and security research through Anthropic's usage policy agreement. Submit an operator use-case at console.anthropic.com if your platform needs expanded defaults. Adding a detailed system prompt that explains your legitimate use case also helps significantly.

content_policy_violation

HTTP 400 Safety Policy

Claude refused to process your request because the content triggered Anthropic's safety classifier. Some refusals are hard limits; many are soft and can be resolved by adding context to your system prompt.

What it looks like (hard refusal — HTTP 400)

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Could not process request due to content policy"
  }
}

HTTP status: 400. This is a hard block — the request was rejected before Claude processed it.

What it looks like (soft refusal — HTTP 200)

{
  "id": "msg_...",
  "type": "message",
  "role": "assistant",
  "content": [{
    "type": "text",
    "text": "I'm not able to help with that. This request..."
  }],
  "stop_reason": "end_turn",
  "usage": { ... }
}

HTTP status: 200, but Claude's response IS the refusal. You're billed for the tokens regardless.

Soft refusals look like normal API responses. If your application needs to handle them programmatically, check content[0].text for refusal language.
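The two response shapes above can be told apart with a small classifier over the HTTP status and the parsed JSON body. The following is a minimal sketch, not an official client: `classify_response` and its return labels are hypothetical names, matching on "content policy" in the error message is an assumption taken from the sample payload above, and the `"refusal"` stop_reason branch assumes newer models may report refusals that way.

```python
from typing import Any


def classify_response(status: int, body: dict[str, Any]) -> str:
    """Return 'hard_refusal', 'soft_refusal', 'other_error', or 'ok'."""
    if status == 400 and body.get("type") == "error":
        message = body.get("error", {}).get("message", "").lower()
        # Assumption: hard blocks mention "content policy" in the message,
        # as in the sample payload; other 400s are ordinary request errors.
        return "hard_refusal" if "content policy" in message else "other_error"
    if status != 200:
        return "other_error"
    # Assumption: some models may report a refusal via stop_reason directly.
    if body.get("stop_reason") == "refusal":
        return "soft_refusal"
    # Otherwise inspect every text block, not only content[0].
    text = " ".join(
        block.get("text", "")
        for block in body.get("content", [])
        if block.get("type") == "text"
    ).lower()
    openers = (
        "i'm not able to help",
        "i can't assist with",
        "i'm unable to provide",
    )
    return "soft_refusal" if any(o in text for o in openers) else "ok"
```

Phrase matching is a heuristic, so treat a 'soft_refusal' result as a hint to review the response rather than a definitive signal.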

What triggers safety refusals

Hard limits (cannot be bypassed): CSAM, WMD synthesis details, critical infrastructure attacks, undermining AI oversight
Default-off, unlockable by operators: explicit adult content, detailed security exploit code, graphic violence in fiction
False positives (common): medical procedures, legal advice, security research, red-teaming, historical violence, fiction with dark themes

Fix: Add a system prompt with use-case context

The single most effective fix for legitimate use cases is a clear system prompt that establishes your context. Claude's classifier uses the system prompt as a prior:

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    # Add a system prompt that explains your legitimate use case
    system="""You are a security research assistant for a professional
penetration testing firm. Users are certified security professionals.
Help them understand vulnerability classes, CVEs, and defensive
mitigations for their authorized security assessments.""",
    messages=[{
        "role": "user",
        "content": "Explain how SQL injection works and how to test for it"
    }]
)

Fix: Reframe ambiguous requests

If your prompt can be read two ways, make the benign interpretation explicit:

# ❌ Ambiguous — may trigger refusal
"Explain how to bypass authentication"

# ✅ Clear intent — context prevents false positive
"I'm a developer reviewing my application's authentication logic.
Explain common authentication bypass vulnerabilities
(OWASP A07:2021) and how to prevent them in a Python Flask app."
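If prompts are assembled in code, the same reframing can be applied consistently with a small helper. This is only an illustrative sketch: `frame_request` and its parameters are hypothetical, and the wording it produces simply mirrors the "clear intent" example above.

```python
def frame_request(role: str, purpose: str, question: str) -> str:
    """Prefix a question with who is asking and why (hypothetical helper)."""
    context = f"I'm {role.strip()} {purpose.strip()}."
    return "\n".join([context, question.strip()])


prompt = frame_request(
    role="a developer",
    purpose="reviewing my application's authentication logic",
    question=(
        "Explain common authentication bypass vulnerabilities "
        "(OWASP A07:2021) and how to prevent them in a Python Flask app."
    ),
)
print(prompt)
```

Only state context that is true for your application; the goal is to remove ambiguity, not to dress up a request.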

Detect soft refusals programmatically

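import anthropic

# Phrases that commonly open a refusal. Keep them specific: a generic
# phrase such as "this request" also matches ordinary answers.
REFUSAL_PHRASES = [
    "i'm not able to help",
    "i can't assist with",
    "i'm unable to provide",
]

client = anthropic.Anthropic()

# Example prompt; replace with your application's input
prompt = "Explain how SQL injection works and how to test for it"

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}]
)

# Join all text blocks; content[0] is not guaranteed to be a text block
response_text = "".join(
    block.text for block in message.content if block.type == "text"
)

# Match case-insensitively and normalize curly apostrophes,
# but keep the original text for display
normalized = response_text.lower().replace("\u2019", "'")
is_refusal = any(phrase in normalized for phrase in REFUSAL_PHRASES)

if is_refusal:
    # Handle gracefully: show a user-friendly message
    print("Claude couldn't help with this request")
else:
    print(response_text)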
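Once a soft refusal is detected, one option is to retry a single time with explicit context prepended, in line with the reframing advice earlier on this page. The sketch below assumes a hypothetical `send` callable that takes a prompt string and returns the response text (for example, a thin wrapper around `client.messages.create`); `ask_with_reframe` and `looks_like_refusal` are illustrative names, not part of the SDK.

```python
from typing import Callable

REFUSAL_PHRASES = (
    "i'm not able to help",
    "i can't assist with",
    "i'm unable to provide",
)


def looks_like_refusal(text: str) -> bool:
    """Heuristic phrase match; normalizes case and curly apostrophes."""
    normalized = text.lower().replace("\u2019", "'")
    return any(phrase in normalized for phrase in REFUSAL_PHRASES)


def ask_with_reframe(
    send: Callable[[str], str], prompt: str, context: str
) -> tuple[str, bool]:
    """Send the prompt; on a soft refusal, retry once with context prepended.

    Returns (reply, still_refused).
    """
    reply = send(prompt)
    if not looks_like_refusal(reply):
        return reply, False
    # Exactly one retry: repeating an unchanged or hard-limited request
    # will not help and still consumes tokens.
    reply = send(f"{context}\n\n{prompt}")
    return reply, looks_like_refusal(reply)
```

Passing `send` as a parameter keeps the retry logic testable without network access. If the second attempt is still refused, surface the refusal to the user instead of looping.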

Operator policy expansion

If your platform legitimately needs broader permissions (adult content site, security research platform), you can request expanded defaults from Anthropic:

Go to console.anthropic.com → your organization
Navigate to Usage policies or contact api-support@anthropic.com
Describe your use case and platform safeguards
Once approved, you can signal expanded context in your system prompt per Anthropic's operator guidelines

Operator expansion is for legitimate platforms with real safeguards. Attempting to bypass safety for genuinely harmful content violates Anthropic's Terms of Service and will result in account termination.

Related errors

invalid_request_error (400) — malformed request (different from safety)
permission_error (403) — API key lacks required permissions
overloaded_error (529) — server capacity, not safety
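Because these errors call for different handling, it can help to map status and error type to an action in one place. This is a rough sketch: `handling_for` and its action labels are hypothetical, and retrying 529 with backoff is a common practice assumed here rather than something this page prescribes.

```python
def handling_for(status: int, error_type: str) -> str:
    """Map an API error to a suggested action (labels are illustrative)."""
    if status == 529 or error_type == "overloaded_error":
        # Capacity issue, unrelated to safety: retry later with backoff.
        return "retry_with_backoff"
    if status == 403 or error_type == "permission_error":
        # The API key lacks the required permissions.
        return "check_api_key"
    if status == 400 and error_type == "invalid_request_error":
        # Either a malformed request or a hard safety block; retrying
        # unchanged will fail again, so fix or reframe the request.
        return "fix_or_reframe_request"
    return "unhandled"
```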

HTTP Code	`400` or `200`
Error type	`invalid_request_error`
Retryable?	No — reframe prompt
Hard limits?	Yes (WMD, CSAM)