overloaded_error
Claude's servers are temporarily at capacity and cannot accept your request. This is not a problem with your code — it's a transient Anthropic server state. Wait and retry.
What the error looks like
{
  "type": "error",
  "error": {
    "type": "overloaded_error",
    "message": "Overloaded"
  }
}
HTTP status: 529. The response body always contains "type": "overloaded_error".
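For clients that call the API over raw HTTP instead of through an SDK, the check described above can be sketched as a small helper. This is a minimal sketch, assuming the response body has already been read as text; `is_overloaded` is a hypothetical name and not part of any SDK.

```python
import json


def is_overloaded(status_code, body_text):
    """Return True when a response matches the overloaded_error shape."""
    if status_code != 529:
        return False
    try:
        error = json.loads(body_text).get("error", {})
    except (json.JSONDecodeError, AttributeError):
        # Body is not a JSON object: rely on the 529 status alone.
        return True
    # Body parsed: confirm the error type as well as the status.
    return isinstance(error, dict) and error.get("type") == "overloaded_error"
```

Checking the status first keeps the common path cheap; the body check only guards against a proxy or gateway reusing the status code for something else.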
Why it happens
- Anthropic's inference servers are at peak capacity
- Spikes in global API usage (model releases, viral demos, batch jobs)
- Your request hit a server that was already at its queue limit
Unlike rate_limit_error, this is not caused by your own usage; it affects all API users. The SDK retries it by default.
Fix: Exponential Backoff (Python)
The Anthropic Python SDK retries 529 automatically (default: 2 retries). Raise max_retries for high-availability workloads:
import time

import anthropic

client = anthropic.Anthropic(
    api_key="your-key",
    max_retries=5,  # SDK handles backoff automatically
)

# Manual backoff if you need full control.
# Note: each call below still goes through the SDK's own retries;
# pass max_retries=0 to the client if this loop should be the only retry layer.
def call_with_backoff(prompt, max_attempts=6):
    delay = 30  # seconds to wait before the first retry
    for attempt in range(max_attempts):
        try:
            return client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}],
            )
        except anthropic.APIStatusError as e:
            # Retry only on 529, and only while attempts remain
            if e.status_code == 529 and attempt < max_attempts - 1:
                print(f"Overloaded, waiting {delay}s (attempt {attempt + 1})")
                time.sleep(delay)
                delay = min(delay * 2, 300)  # cap at 5 min
            else:
                raise
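To see how long the manual loop above can wait in the worst case, the delays can be computed ahead of time. This is a minimal sketch; `backoff_delays` is a hypothetical helper that mirrors the doubling-with-cap logic, using the 30-second start and 300-second cap from the example.

```python
def backoff_delays(max_attempts=6, initial=30, cap=300):
    """List the sleep durations (in seconds) between attempts."""
    delays = []
    delay = initial
    # One wait happens after each failed attempt except the last,
    # so there are max_attempts - 1 waits in total.
    for _ in range(max_attempts - 1):
        delays.append(delay)
        delay = min(delay * 2, cap)
    return delays


delays = backoff_delays()
print(delays)       # [30, 60, 120, 240, 300]
print(sum(delays))  # 750
```

With the defaults, a request that never succeeds spends 750 seconds (12.5 minutes) sleeping before the final error is raised, not counting time spent inside the SDK's own retries.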
Fix: Exponential Backoff (TypeScript)
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
  maxRetries: 5, // SDK retries 529 automatically
});

// Manual control if needed:
async function callWithBackoff(prompt: string, maxAttempts = 6) {
  let delay = 30_000; // 30s
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await client.messages.create({
        model: "claude-sonnet-4-6",
        max_tokens: 1024,
        messages: [{ role: "user", content: prompt }],
      });
    } catch (err: any) {
      // Retry only on 529, and only while attempts remain
      if (err?.status === 529 && attempt < maxAttempts - 1) {
        console.log(`Overloaded, waiting ${delay / 1000}s`);
        await new Promise((r) => setTimeout(r, delay));
        delay = Math.min(delay * 2, 300_000); // cap 5 min
      } else {
        throw err;
      }
    }
  }
  // Only reachable when the loop body never runs
  throw new Error("callWithBackoff: maxAttempts must be at least 1");
}
Check the Retry-After header
import time

import requests

response = requests.post(
    "https://api.anthropic.com/v1/messages",
    headers={"x-api-key": "...", "anthropic-version": "2023-06-01"},
    json={...},  # request body elided
)
if response.status_code == 529:
    # Fall back to 60 seconds when the header is absent
    retry_after = int(response.headers.get("Retry-After", 60))
    print(f"Retry after {retry_after} seconds")
    time.sleep(retry_after)
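HTTP allows Retry-After to carry either a number of seconds or an HTTP date, and a plain `int(...)` raises on the date form. The sketch below handles both; `retry_after_seconds` is a hypothetical helper, the 60-second default is taken from the snippet above, and whether a given 529 response includes the header at all is not guaranteed here.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=60.0, now=None):
    """Convert a Retry-After header value into seconds to wait."""
    if value is None:
        return default
    value = value.strip()
    try:
        # Form 1: delay in whole seconds, e.g. "120"
        return max(0.0, float(int(value)))
    except ValueError:
        pass
    try:
        # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable header: use the default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no wait is needed.
    return max(0.0, (when - now).total_seconds())
```

It would be called as `time.sleep(retry_after_seconds(response.headers.get("Retry-After")))`. The `now` parameter exists only so the date branch can be tested against a fixed clock.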
FAQ
Does the SDK retry overloaded_error automatically?
Yes. The official Anthropic Python and TypeScript SDKs retry 529 errors up to 2 times with backoff by default. Set max_retries=5 (maxRetries: 5 in TypeScript) for production workloads.
overloaded_error vs rate_limit_error: what's the difference?
overloaded_error (529) = Anthropic's global capacity is full, affects all users. rate_limit_error (429) = you specifically exceeded your account's RPM/TPM limit. Both need backoff.
How long until overloaded_error resolves?
Usually 30–120 seconds. Sustained overload during major incidents can last longer; check status.anthropic.com for active incidents.
Can I avoid overloaded_error by using a different model?
Overload is per-model. If claude-sonnet is overloaded, claude-haiku may be available. Add a fallback model if uptime is critical.
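A fallback of this kind might look like the sketch below. It assumes the caller supplies a `send` function that takes a model name and performs the request; `call_with_fallback` and `send` are hypothetical names, and the only SDK detail relied on is that the raised error exposes a `status_code` attribute, as `anthropic.APIStatusError` does in the Python example earlier.

```python
def call_with_fallback(send, models):
    """Try each model in order, moving on only when one is overloaded."""
    last_error = None
    for model in models:
        try:
            return send(model)
        except Exception as err:
            # Only a 529 triggers the fallback; every other error propagates.
            if getattr(err, "status_code", None) != 529:
                raise
            last_error = err
    if last_error is None:
        raise ValueError("models must not be empty")
    # Every model was overloaded: surface the most recent error.
    raise last_error
```

With the SDK client from the Python section, `send` could be a lambda that calls `client.messages.create` with `model=model`, and `models` would list the preferred model first. A fallback model may answer differently from the primary, so this suits workloads where availability matters more than identical output.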