content_policy_violation
Claude refused to process your request because the content triggered Anthropic's safety classifier. Some refusals are hard limits; many are soft and can be resolved by adding context to your system prompt.
What it looks like (hard refusal — HTTP 400)
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Could not process request due to content policy"
  }
}
HTTP status: 400. This is a hard block — the request was rejected before Claude processed it.
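Because the hard refusal shares its error type with other 400 responses, a client has to inspect the body to recognize it. The following is a minimal sketch, assuming the error body has the shape shown above and that the message mentions "content policy". The function name `is_hard_refusal` and the substring match are illustrative assumptions, not part of any SDK.

```python
import json


def is_hard_refusal(status_code: int, body: str) -> bool:
    """Return True if an HTTP response looks like the hard refusal shown above.

    Assumption: the refusal is identified by status 400, error type
    "invalid_request_error", and a message containing "content policy".
    """
    if status_code != 400:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    if not isinstance(error, dict):
        return False
    message = str(error.get("message", "")).lower()
    return error.get("type") == "invalid_request_error" and "content policy" in message


# Sample body copied from the example above
sample = json.dumps({
    "type": "error",
    "error": {
        "type": "invalid_request_error",
        "message": "Could not process request due to content policy",
    },
})
print(is_hard_refusal(400, sample))  # True
```

Matching on the message text is brittle if the wording changes, so treat a negative result as "unknown" rather than proof that the error is unrelated to content policy.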
What it looks like (soft refusal — HTTP 200)
{
  "id": "msg_...",
  "type": "message",
  "role": "assistant",
  "content": [{
    "type": "text",
    "text": "I'm not able to help with that. This request..."
  }],
  "stop_reason": "end_turn",
  "usage": { ... }
}
HTTP status: 200, but Claude's response IS the refusal. You're billed for the tokens regardless.
Check content[0].text for refusal language if your application needs to handle soft refusals programmatically.
What triggers safety refusals
- Hard limits (cannot be bypassed): CSAM, WMD synthesis details, critical infrastructure attacks, undermining AI oversight
- Default-off, unlockable by operators: explicit adult content, detailed security exploit code, graphic violence in fiction
- False positives (common): medical procedures, legal advice, security research, red-teaming, historical violence, fiction with dark themes
Fix: Add a system prompt with use-case context
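The single most effective fix for legitimate use cases is a clear system prompt that establishes your context. The system prompt is taken into account when a request is evaluated, so stating who your users are and what your application does makes a false positive less likely: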
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    # Add a system prompt that explains your legitimate use case
    system=(
        "You are a security research assistant for a professional "
        "penetration testing firm. Users are certified security professionals. "
        "Help them understand vulnerability classes, CVEs, and defensive "
        "mitigations for their authorized security assessments."
    ),
    messages=[{
        "role": "user",
        "content": "Explain how SQL injection works and how to test for it"
    }]
)
Fix: Reframe ambiguous requests
If your prompt can be read two ways, make the benign interpretation explicit:
# ❌ Ambiguous: may trigger a refusal
ambiguous_prompt = "Explain how to bypass authentication"

# ✅ Clear intent: context prevents a false positive
clear_prompt = (
    "I'm a developer reviewing my application's authentication logic. "
    "Explain common authentication bypass vulnerabilities "
    "(OWASP A07:2021) and how to prevent them in a Python Flask app."
)
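If many prompts in an application need the same kind of framing, the pattern can be factored into a small helper. This is only a sketch: `frame_request` and its parameters are hypothetical names introduced for illustration, and the wording it produces is one possible phrasing rather than a recommended template.

```python
def frame_request(role: str, purpose: str, question: str) -> str:
    """Compose a prompt that states who is asking and why, then the question."""
    # The role and purpose come first so the benign reading is explicit
    context = f"I'm {role} {purpose}."
    return f"{context} {question.strip()}"


prompt = frame_request(
    role="a developer",
    purpose="reviewing my application's authentication logic",
    question="Explain common authentication bypass vulnerabilities and how to prevent them.",
)
print(prompt)
```

The framing only helps when it is true; it is meant to remove ambiguity, not to disguise a request.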
Detect soft refusals programmatically
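import anthropic

# Lowercase phrases that often open a refusal. This is a heuristic list,
# not an official or exhaustive one. Generic phrases such as "this request"
# are left out because they also appear in ordinary answers.
REFUSAL_PHRASES = [
    "i'm not able to help",
    "i can't assist with",
    "i'm unable to provide",
]

# Example input; replace with your application's prompt
prompt = "Explain how SQL injection works and how to test for it"

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}]
)

response_text = message.content[0].text

# Match case-insensitively, but keep the original text for display
is_refusal = any(phrase in response_text.lower() for phrase in REFUSAL_PHRASES)

if is_refusal:
    # Handle gracefully: show a user-friendly message
    print("Claude couldn't help with this request")
else:
    print(response_text)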
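A substring match over the whole response can flag an ordinary answer that happens to quote one of the phrases. One way to reduce that risk is to look only at the opening of the first text block. The sketch below works on the decoded JSON content list rather than an SDK object so it can run offline. `REFUSAL_OPENERS`, `first_text`, `looks_like_refusal`, and the 80-character window are all illustrative assumptions.

```python
from typing import Any

# Hypothetical phrase list: the same heuristic idea as the check above
REFUSAL_OPENERS = (
    "i'm not able to help",
    "i can't assist with",
    "i'm unable to provide",
)


def first_text(content: list[dict[str, Any]]) -> str:
    """Return the text of the first block of type "text", or "" if none exists."""
    for block in content:
        if block.get("type") == "text":
            return block.get("text", "")
    return ""


def looks_like_refusal(content: list[dict[str, Any]], window: int = 80) -> bool:
    """Heuristic: a refusal phrase appears within the first `window` characters."""
    opening = first_text(content)[:window].lower()
    # Normalize typographic apostrophes so "I\u2019m" matches "i'm"
    opening = opening.replace("\u2019", "'")
    return any(phrase in opening for phrase in REFUSAL_OPENERS)


refusal = [{"type": "text", "text": "I'm not able to help with that. This request..."}]
answer = [{"type": "text", "text": "SQL injection occurs when user input is ..."}]
print(looks_like_refusal(refusal))  # True
print(looks_like_refusal(answer))   # False
```

This remains a heuristic: a refusal phrased differently will be missed, so log unmatched responses during testing and adjust the list as needed.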
Operator policy expansion
If your platform legitimately needs broader permissions (adult content site, security research platform), you can request expanded defaults from Anthropic:
- Go to console.anthropic.com → your organization
- Navigate to Usage policies or contact api-support@anthropic.com
- Describe your use case and platform safeguards
- Once approved, you can signal expanded context in your system prompt per Anthropic's operator guidelines
Related errors
- invalid_request_error (400): also returned for malformed requests that have nothing to do with safety; check the message field to tell the two apart
- permission_error (403): API key lacks required permissions
- overloaded_error (529): server capacity, not safety
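The distinctions in this list can be turned into a small dispatch function so that application code reacts differently to each case. This is a sketch under stated assumptions: the status codes and error types come from the list above, while the function name `triage`, the returned action labels, and the "content policy" substring match are hypothetical choices made for illustration.

```python
def triage(status_code: int, error_type: str, message: str = "") -> str:
    """Map an API error to a coarse action label (labels are illustrative)."""
    if status_code == 529 or error_type == "overloaded_error":
        # Capacity problem: waiting and retrying is a reasonable response
        return "retry_later"
    if status_code == 403 or error_type == "permission_error":
        return "check_api_key_permissions"
    if status_code == 400 and error_type == "invalid_request_error":
        # Assumption: content policy blocks mention "content policy" in the message
        if "content policy" in message.lower():
            return "content_policy"
        return "fix_request"
    return "unknown"


print(triage(400, "invalid_request_error",
             "Could not process request due to content policy"))  # content_policy
print(triage(529, "overloaded_error"))  # retry_later
```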