timeout_error

Network Retryable Config

The request to Claude exceeded the configured timeout before a response was returned. Either the response genuinely takes a long time to generate (use streaming), or your client timeout is too short for the size and complexity of the request.

What it looks like (SDK exception)

# Python SDK
anthropic.APITimeoutError: Request timed out after 600.0 seconds

# Node.js SDK
APIConnectionTimeoutError: Request timed out.

# Raw HTTP (if no SDK)
# The connection hangs, then drops — no JSON body returned

Why it happens

  • High max_tokens with a large model — Claude Opus 4.7 with max_tokens=8192 on a complex prompt can take 60–120 seconds
  • Upstream proxy timeout — Vercel Edge, Cloudflare Workers, and API gateways each enforce their own timeout (typically 30–60s), which is shorter than the SDK default
  • Default client timeout in your framework — requests, fetch, and axios each have their own timeout behavior, which may conflict with the SDK's
  • Network instability — transient; retry with backoff
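
Because every layer between your code and the API enforces its own limit, the shortest one decides when the request dies. A minimal sketch of that reasoning follows; the function name `first_timeout` and the layer names and numbers are illustrative assumptions, not measured values.

```python
def first_timeout(layers: dict[str, float]) -> tuple[str, float]:
    """Return the (name, seconds) of the layer whose timeout fires first."""
    if not layers:
        raise ValueError("at least one layer is required")
    # The layer with the smallest limit cuts the request off before the others.
    name = min(layers, key=layers.get)
    return name, layers[name]


# Hypothetical chain: SDK default, an HTTP client wrapper, and a proxy.
layers = {"sdk": 600.0, "http_client": 120.0, "proxy": 30.0}
print(first_timeout(layers))  # ('proxy', 30.0)
```

If the layer that fires first is not the SDK, raising the SDK timeout alone will not help; the limit has to be raised (or avoided via streaming) at that layer.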

Fix 1: Use streaming (best practice for long responses)

Streaming keeps the connection alive and delivers tokens as they arrive. No idle-connection timeout fires because data keeps flowing.

import anthropic

client = anthropic.Anthropic()

# ✅ Stream long responses — no timeout issues
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=[{"role": "user", "content": "Write a detailed analysis..."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

    # Or collect the full message while the stream is still open
    final_message = stream.get_final_message()

Fix 2: Increase timeout in the SDK

import anthropic
import httpx

# Per-client (applies to all requests from this client)
client = anthropic.Anthropic(
    timeout=httpx.Timeout(
        connect=10.0,    # connection timeout
        read=300.0,      # read timeout (waiting for response)
        write=10.0,      # request write timeout
        pool=10.0,       # connection pool timeout
    )
)

# Or a single float (sets all timeouts to this value)
client = anthropic.Anthropic(timeout=300.0)

# Per-request override
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=[{"role": "user", "content": "..."}],
    timeout=300.0,  # override just for this call
)
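
One way to choose a read timeout is to derive it from max_tokens instead of hard-coding it. The sketch below is a rough heuristic, not an SDK feature: `estimate_read_timeout` is a hypothetical helper, and its default throughput, safety factor, and overhead are assumptions loosely based on the 60 to 120 second figure quoted earlier for 8192 tokens. Measure your own workload before relying on it.

```python
def estimate_read_timeout(
    max_tokens: int,
    tokens_per_second: float = 60.0,  # assumed throughput; measure your own
    safety_factor: float = 1.5,       # headroom for slower-than-usual responses
    overhead_seconds: float = 10.0,   # assumed delay before generation starts
) -> float:
    """Rough upper bound, in seconds, for a non-streaming response."""
    if max_tokens <= 0 or tokens_per_second <= 0:
        raise ValueError("max_tokens and tokens_per_second must be positive")
    generation_seconds = max_tokens / tokens_per_second
    return overhead_seconds + generation_seconds * safety_factor


read_timeout = estimate_read_timeout(4096)
print(f"read timeout: {read_timeout:.0f}s")
```

The result can be passed as the per-request `timeout=` value shown above, so that small requests fail fast while large ones get enough headroom.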

Fix 2 (TypeScript / Node.js)

import Anthropic from "@anthropic-ai/sdk";

// Per-client
const client = new Anthropic({
  timeout: 300 * 1000, // 300 seconds in ms
});

// Per-request
const message = await client.messages.create(
  {
    model: "claude-opus-4-7",
    max_tokens: 4096,
    messages: [{ role: "user", content: "..." }],
  },
  { timeout: 300 * 1000 }
);

Fix 3: Reduce response length

Lower max_tokens to reduce generation time, or split a large generation task into smaller sequential requests:

import anthropic

client = anthropic.Anthropic()
original_prompt = "Write a detailed analysis..."

# Instead of one 8192-token request, cap each request at 2048 tokens:
part1 = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    messages=[{"role": "user", "content": original_prompt}],
)

# If part1 was cut off at the cap, feed it back and ask for a continuation:
if part1.stop_reason == "max_tokens":
    part2 = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=2048,
        messages=[
            {"role": "user", "content": original_prompt},
            {"role": "assistant", "content": part1.content[0].text},
            {"role": "user", "content": "Continue from where you left off."},
        ],
    )
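
The continuation pattern above generalizes to a loop that keeps asking for more until the model stops on its own. The following is a sketch under assumptions: `generate_in_chunks` is a hypothetical helper, `create` is expected to behave like `client.messages.create`, and joining the chunks with no separator is a simplification that may need adjusting for your content.

```python
from typing import Any, Callable


def generate_in_chunks(
    create: Callable[..., Any],  # e.g. client.messages.create
    prompt: str,
    model: str,
    chunk_tokens: int = 2048,
    max_chunks: int = 4,
) -> str:
    """Request up to max_chunks short responses and join their text."""
    messages = [{"role": "user", "content": prompt}]
    parts: list[str] = []
    for _ in range(max_chunks):
        response = create(model=model, max_tokens=chunk_tokens, messages=messages)
        text = response.content[0].text
        parts.append(text)
        # Any stop reason other than "max_tokens" means the model finished.
        if response.stop_reason != "max_tokens":
            break
        # Feed the partial answer back and ask for the next chunk.
        messages = messages + [
            {"role": "assistant", "content": text},
            {"role": "user", "content": "Continue from where you left off."},
        ]
    return "".join(parts)
```

A call would look like `generate_in_chunks(client.messages.create, original_prompt, "claude-opus-4-7")`. Each request stays short enough to finish inside the timeout, at the cost of resending the growing conversation as input on every round.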

Retry on timeout

import anthropic
import time

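# max_retries=0 turns off the SDK's built-in retries (2 by default) so they
# don't stack with the manual loop below
client = anthropic.Anthropic(max_retries=0)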

def call_with_retry(prompt: str, max_attempts: int = 3) -> str:
    for attempt in range(max_attempts):
        try:
            message = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=2048,
                messages=[{"role": "user", "content": prompt}],
                timeout=120.0,
            )
            return message.content[0].text
        except anthropic.APITimeoutError:
            if attempt == max_attempts - 1:
                raise
            wait = 2 ** attempt * 5  # 5s, then 10s (doubles on each retry)
            print(f"Timeout, retrying in {wait}s...")
            time.sleep(wait)
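
When many clients time out at once, fixed delays make them all retry at the same moment. A common refinement is to cap the delay and add random jitter. The helper below is a sketch: `backoff_delays` and its defaults (5s base, 60s cap, 25% jitter) are assumptions chosen to match the retry loop above, not values from the SDK.

```python
import random
from typing import Optional


def backoff_delays(
    max_attempts: int,
    base: float = 5.0,
    cap: float = 60.0,
    jitter: float = 0.25,
    rng: Optional[random.Random] = None,
) -> list[float]:
    """Delays to sleep between attempts: base * 2**n, capped, with +/- jitter."""
    rng = rng or random.Random()
    delays = []
    # There is one fewer wait than there are attempts.
    for attempt in range(max(max_attempts - 1, 0)):
        delay = min(base * 2 ** attempt, cap)
        spread = delay * jitter
        delays.append(delay + rng.uniform(-spread, spread))
    return delays


print(backoff_delays(4, jitter=0.0))  # [5.0, 10.0, 20.0]
```

In the retry loop, `time.sleep(wait)` would then take its value from this list instead of computing `2 ** attempt * 5` inline.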

Vercel / serverless timeout workaround

Vercel functions are capped by plan: Hobby has a 10s execution limit, and Pro can be configured up to 300s. For Claude generation that takes longer than your limit, use one of these strategies:

  • Edge functions with streaming — Vercel Edge supports streaming responses. Stream Claude output directly to the client to avoid the function timeout
  • Background job queue — accept the user request, enqueue a background job (Inngest, Upstash QStash), and poll for the result
  • Upgrade to Vercel Pro — increases function timeout to 300s

Related errors