timeout_error
The request to Claude exceeded the configured timeout before a response was returned. Either generation genuinely took too long (in which case use streaming), or your client timeout is shorter than the request needs.
What it looks like (SDK exception)
# Python SDK
anthropic.APITimeoutError: Request timed out after 600.0 seconds
# Node.js SDK
APIConnectionTimeoutError: Request timed out.
# Raw HTTP (if no SDK)
# The connection hangs, then drops — no JSON body returned
Why it happens
- High max_tokens with a large model: Claude Opus 4.7 with max_tokens=8192 on a complex prompt can take 60–120 seconds
- Upstream proxy timeout: Vercel Edge, Cloudflare Workers, and API gateways enforce their own timeouts (typically 30–60s), which are often shorter than the SDK default
- Framework or HTTP client defaults: requests, fetch, and axios each have their own timeout behavior, which may conflict with the SDK setting
- Network instability: transient; retry with backoff
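To judge which of these causes applies, it can help to compare a rough estimate of generation time against the tightest timeout in the request path. The sketch below is only back-of-the-envelope arithmetic. The helper names (`estimate_seconds`, `tightest_timeout`, `needs_streaming`) are hypothetical, and the 80 tokens-per-second throughput is an assumed placeholder chosen to land roughly in the 60–120 second range quoted above, not a measured figure.

```python
from typing import Optional


def estimate_seconds(max_tokens: int, tokens_per_second: float = 80.0,
                     overhead_seconds: float = 5.0) -> float:
    """Rough worst-case generation time if the model uses every output token.

    tokens_per_second and overhead_seconds are assumed placeholder values;
    measure your own workload before relying on them.
    """
    return overhead_seconds + max_tokens / tokens_per_second


def tightest_timeout(*timeouts: Optional[float]) -> Optional[float]:
    """Return the smallest timeout in the chain (client, proxy, platform).

    None means "no limit at this hop" and is ignored.
    """
    known = [t for t in timeouts if t is not None]
    return min(known) if known else None


def needs_streaming(max_tokens: int, *timeouts: Optional[float]) -> bool:
    """True when the estimated time exceeds the tightest timeout in the path."""
    limit = tightest_timeout(*timeouts)
    return limit is not None and estimate_seconds(max_tokens) > limit


# Example: SDK timeout of 600s behind a proxy with a 60s limit.
print(estimate_seconds(8192))              # 107.4
print(needs_streaming(8192, 600.0, 60.0))  # True
print(needs_streaming(1024, 600.0, 60.0))  # False
```

The point of the comparison is that the shortest timeout anywhere between your code and the API decides the outcome, regardless of what the SDK itself is configured to allow.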
Fix 1: Use streaming (best practice for long responses)
Streaming keeps the connection alive and delivers tokens as they arrive. No idle-connection timeout fires because data keeps flowing.
import anthropic

client = anthropic.Anthropic()

# ✅ Stream long responses: no timeout issues
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=[{"role": "user", "content": "Write a detailed analysis..."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    # Or collect the full message after streaming
    final_message = stream.get_final_message()
Fix 2: Increase timeout in the SDK
import anthropic
import httpx

# Per-client (applies to all requests from this client)
client = anthropic.Anthropic(
    timeout=httpx.Timeout(
        connect=10.0,  # connection timeout
        read=300.0,    # read timeout (waiting for response)
        write=10.0,    # request write timeout
        pool=10.0,     # connection pool timeout
    )
)

# Or a single float (sets all timeouts to this value)
client = anthropic.Anthropic(timeout=300.0)

# Per-request override
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=[{"role": "user", "content": "..."}],
    timeout=300.0,  # override just for this call
)
Fix 2 (TypeScript / Node.js)
import Anthropic from "@anthropic-ai/sdk";

// Per-client
const client = new Anthropic({
  timeout: 300 * 1000, // 300 seconds in ms
});

// Per-request
const message = await client.messages.create(
  {
    model: "claude-opus-4-7",
    max_tokens: 4096,
    messages: [{ role: "user", content: "..." }],
  },
  { timeout: 300 * 1000 }
);
Fix 3: Reduce response length
Lower max_tokens to reduce generation time, or split a large generation task into smaller sequential requests:
original_prompt = "Write a detailed analysis..."

# Instead of one 8192-token request, ask for a shorter first part:
part1 = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{"role": "user", "content": original_prompt}],
)

# Feed part1 back as the assistant turn and ask for the continuation
part2 = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[
        {"role": "user", "content": original_prompt},
        {"role": "assistant", "content": part1.content[0].text},
        {"role": "user", "content": "Continue from where you left off."},
    ],
)
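The two-step split above can be generalized into a loop that keeps asking for more until the model stops on its own. The following is a minimal sketch under stated assumptions: `generate_in_parts` and the `GenerateFn` adapter are hypothetical names, and `fake_generate` is a stand-in so the example runs offline. The only API detail relied on is that a response cut off by the token limit reports a stop reason of "max_tokens".

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]
# Hypothetical adapter: takes the message list, returns (text, stop_reason).
GenerateFn = Callable[[List[Message]], Tuple[str, str]]


def generate_in_parts(prompt: str, generate: GenerateFn, max_parts: int = 4) -> str:
    """Request a long answer as several short calls and join the pieces."""
    messages: List[Message] = [{"role": "user", "content": prompt}]
    parts: List[str] = []
    for _ in range(max_parts):
        text, stop_reason = generate(messages)
        parts.append(text)
        # Any stop reason other than "max_tokens" means the model finished.
        if stop_reason != "max_tokens":
            break
        # Feed the partial answer back and ask for the next part.
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": "Continue from where you left off."})
    return "".join(parts)


# Stand-in for a real API call so the sketch runs offline.
def fake_generate(messages: List[Message]) -> Tuple[str, str]:
    canned = [
        ("Part one. ", "max_tokens"),
        ("Part two. ", "max_tokens"),
        ("Done.", "end_turn"),
    ]
    return canned[len(messages) // 2]


print(generate_in_parts("Write a detailed analysis...", fake_generate))
# Part one. Part two. Done.
```

With the real SDK, the adapter would presumably wrap `client.messages.create(...)` and return `(message.content[0].text, message.stop_reason)`. The `max_parts` cap keeps a response that never finishes from looping indefinitely.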
Retry on timeout
import anthropic
import time

client = anthropic.Anthropic()


def call_with_retry(prompt: str, max_attempts: int = 3) -> str:
    for attempt in range(max_attempts):
        try:
            message = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=2048,
                messages=[{"role": "user", "content": prompt}],
                timeout=120.0,
            )
            return message.content[0].text
        except anthropic.APITimeoutError:
            if attempt == max_attempts - 1:
                raise
            wait = 2 ** attempt * 5  # 5s, 10s, 20s
            print(f"Timeout, retrying in {wait}s...")
            time.sleep(wait)
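A fixed schedule like the one above can cause many clients to retry at the same instant. A common variation, sketched here, adds random jitter and a cap on the wait. `retry_with_jitter` and `backoff_delays` are hypothetical helpers and `flaky` is a stand-in for the API call, not part of any SDK. Note also that the SDK client retries some failures on its own (the `max_retries` client option), so a manual loop multiplies with that unless `max_retries` is set to 0.

```python
import random
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delays(max_attempts: int, base: float = 5.0, cap: float = 60.0) -> List[float]:
    """Upper bounds for the wait before each retry: base, 2*base, ... capped."""
    return [min(cap, base * 2 ** i) for i in range(max_attempts - 1)]


def retry_with_jitter(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
    max_attempts: int = 3,
    base: float = 5.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on the given exceptions with full-jitter backoff."""
    delays = backoff_delays(max_attempts, base, cap)
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise
            # Full jitter: wait a random time between 0 and the bound.
            sleep(random.uniform(0, delays[attempt]))
    raise ValueError("max_attempts must be at least 1")


# Stand-in for an API call that times out twice, then succeeds.
attempts = {"count": 0}


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


print(retry_with_jitter(flaky, sleep=lambda s: None))  # ok
print(backoff_delays(4))                               # [5.0, 10.0, 20.0]
```

To use this with the SDK, one would pass `retry_on=(anthropic.APITimeoutError,)` and a `call` that wraps `client.messages.create(...)`. The injectable `sleep` exists only so the sketch can be exercised without real waiting.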
Vercel / serverless timeout workaround
Vercel functions have a plan-dependent execution limit: Hobby functions default to about 10s, and paid plans allow a longer configured maximum. Exact limits change over time, so check Vercel's current documentation for your plan. For Claude generation that takes longer than your limit, use one of these strategies:
- Edge functions with streaming: Vercel Edge supports streaming responses. Stream Claude output directly to the client to avoid the function timeout
- Background job queue: accept the user request, enqueue a background job (Inngest, Upstash QStash), and poll for the result
- Upgrade to Vercel Pro: raises the maximum function duration you can configure (up to 300s)
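The background-job strategy follows a submit-then-poll shape, which the sketch below illustrates with an in-memory table and a thread. This is an assumption-laden stand-in: `submit_job`, `get_job`, and `wait_for_job` are hypothetical names, and a real deployment would replace the dictionary and thread with a durable queue such as the services named above, since serverless functions do not keep memory or threads alive between invocations.

```python
import threading
import time
import uuid
from typing import Callable, Dict, Optional

# In-memory job table; a real deployment would use a durable queue and store.
_jobs: Dict[str, dict] = {}
_lock = threading.Lock()


def submit_job(work: Callable[[], str]) -> str:
    """Start `work` in the background and return a job id immediately."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"status": "pending", "result": None, "error": None}

    def run() -> None:
        try:
            result = work()
            update = {"status": "done", "result": result, "error": None}
        except Exception as exc:  # record the failure for the poller
            update = {"status": "failed", "result": None, "error": str(exc)}
        with _lock:
            _jobs[job_id] = update

    threading.Thread(target=run, daemon=True).start()
    return job_id


def get_job(job_id: str) -> Optional[dict]:
    """Return a copy of the job record, or None for an unknown id."""
    with _lock:
        job = _jobs.get(job_id)
        return dict(job) if job else None


def wait_for_job(job_id: str, poll_interval: float = 0.05, max_wait: float = 5.0) -> dict:
    """Poll until the job leaves the pending state or max_wait elapses."""
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        job = get_job(job_id)
        if job is None:
            return {"status": "unknown", "result": None, "error": None}
        if job["status"] != "pending":
            return job
        time.sleep(poll_interval)
    return {"status": "pending", "result": None, "error": None}


def slow_generation() -> str:
    time.sleep(0.2)  # stands in for a long Claude request
    return "generated text"


job_id = submit_job(slow_generation)
print(wait_for_job(job_id)["result"])  # generated text
```

In a web setting, `submit_job` would correspond to the request handler that returns the job id right away, and `get_job` to a status endpoint the client polls, so no single request has to outlive the platform's execution limit.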
Related errors
- overloaded_error (529) — server busy, also retryable
- api_error (500) — server error mid-response
- rate_limit_error (429) — too many requests