Troubleshooting

OpenClaw API Rate Limiting and 429 Error Handling

11 min read

Problem Description

When using OpenClaw, if multiple users send messages simultaneously or message volume is high, the following errors may appear in the logs:

[openclaw:gateway] Error calling model API: 429 Too Many Requests
[openclaw:gateway] Rate limit exceeded. Retry after 20 seconds.
[openclaw:gateway] Headers: x-ratelimit-remaining-requests: 0, x-ratelimit-reset-requests: 20s

On the user side, this manifests as noticeably increased bot response times or an error message:

Bot reply: The service is temporarily busy. Please try again later.

A 429 error indicates that your API requests have exceeded the model provider's rate limits. Different AI model providers have different rate limiting policies, typically including requests per minute (RPM), tokens per minute (TPM), and daily total call limits.
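When a 429 arrives, the response headers tell you how long to wait. The sketch below shows one way a client might interpret them, using the header names from the log above; the parsing logic is illustrative, not OpenClaw's internal code.

```python
import re

def parse_reset(value: str) -> float:
    """Parse a reset value like '20s', '500ms', or '1m30s' into seconds."""
    total = 0.0
    for amount, unit in re.findall(r"([\d.]+)(ms|s|m|h)", value):
        total += float(amount) * {"ms": 0.001, "s": 1, "m": 60, "h": 3600}[unit]
    return total

def seconds_until_retry(headers: dict) -> float:
    """Decide how long to wait before retrying after a 429.

    Prefers the standard Retry-After header; falls back to the
    x-ratelimit-reset-requests header shown in the log above.
    """
    if "retry-after" in headers:
        return float(headers["retry-after"])
    if headers.get("x-ratelimit-remaining-requests") == "0":
        return parse_reset(headers.get("x-ratelimit-reset-requests", "1s"))
    return 0.0
```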

Provider Rate Limits

Common limits (for free/lower tiers):

Provider               RPM    TPM      Daily Limit
OpenAI (Tier 1)        500    30,000   None
Anthropic (Build)      50     40,000   None
Google Gemini (Free)   15     32,000   1,500

Actual limits depend on your account tier and the specific model being used.

Diagnostic Steps

Check OpenClaw's rate limit statistics:

openclaw stats rate-limit

Example output:

Provider    Model           RPM Used    RPM Limit    Status
openai      gpt-4o          45/60       60           ⚠️ Warning
anthropic   claude-sonnet   12/50       50           ✓ OK

Enable verbose API logging to track request frequency:

DEBUG=openclaw:api* openclaw start

Check request logs from the past hour:

openclaw logs --filter "api" --since 1h

Solutions

Solution 1: Enable the Built-in Rate Limiter

OpenClaw has a built-in rate limiter that proactively controls request frequency before sending, preventing you from hitting provider limits:

{
  "api": {
    "rateLimiter": {
      "enabled": true,
      "strategy": "sliding-window",
      "limits": {
        "openai": {
          "requestsPerMinute": 50,
          "tokensPerMinute": 25000
        },
        "anthropic": {
          "requestsPerMinute": 40,
          "tokensPerMinute": 35000
        }
      }
    }
  }
}

It is recommended to set limits to 80% of the provider's actual quota to leave a buffer.
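The "sliding-window" strategy counts requests in a rolling time span rather than fixed minute buckets. Here is a minimal sketch of the idea; the real OpenClaw limiter also tracks tokens per minute, and the class and method names here are illustrative.

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests in any rolling `window`-second span."""

    def __init__(self, limit: int, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.timestamps: deque = deque()

    def try_acquire(self, now: float = None) -> bool:
        """Return True and record the request if under the limit."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have fallen out of the rolling window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

Following the 80% rule above, you would construct it with, e.g., `SlidingWindowLimiter(limit=int(500 * 0.8))` for OpenAI Tier 1.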

Solution 2: Configure Automatic Retry with Backoff

When a 429 error is encountered, have OpenClaw automatically wait and retry:

{
  "api": {
    "retry": {
      "enabled": true,
      "maxRetries": 3,
      "backoff": {
        "type": "exponential",
        "initialDelay": 5000,
        "maxDelay": 60000,
        "factor": 2
      },
      "retryOn": [429, 500, 502, 503]
    }
  }
}

With this configuration, when a 429 response is received, OpenClaw will wait 5 seconds before the first retry, 10 seconds before the second, and 20 seconds before the third.
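The delay schedule follows directly from the config: each retry multiplies the previous delay by `factor`, capped at `maxDelay`. A small sketch of that arithmetic (function name is illustrative):

```python
def backoff_delays(max_retries: int = 3, initial_delay_ms: int = 5000,
                   max_delay_ms: int = 60000, factor: float = 2.0) -> list:
    """Wait time in ms before each retry: initial * factor**attempt,
    capped at max_delay_ms -- mirroring the exponential config above."""
    return [min(int(initial_delay_ms * factor ** attempt), max_delay_ms)
            for attempt in range(max_retries)]
```

Production retry loops often add random jitter to these delays so that many clients do not all retry at the same instant.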

Solution 3: Configure Per-User Rate Limits

Limit each user's message sending frequency to reduce API calls at the source:

{
  "users": {
    "rateLimit": {
      "messagesPerMinute": 10,
      "cooldownMessage": "You are sending messages too frequently. Please wait {remaining} seconds before trying again."
    }
  }
}
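Conceptually, the per-user limit is the same sliding-window idea keyed by user ID, and the `{remaining}` placeholder is the time until the oldest message leaves the window. A sketch under those assumptions (names are illustrative, not OpenClaw internals):

```python
class UserThrottle:
    """Track per-user message timestamps over a 60-second window."""

    def __init__(self, messages_per_minute: int = 10):
        self.limit = messages_per_minute
        self.history: dict = {}

    def check(self, user_id: str, now: float) -> float:
        """Return 0.0 if the message is allowed, else seconds to wait."""
        stamps = [t for t in self.history.get(user_id, []) if now - t < 60.0]
        if len(stamps) < self.limit:
            stamps.append(now)
            self.history[user_id] = stamps
            return 0.0
        # The oldest message leaves the window first.
        return 60.0 - (now - stamps[0])
```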

Solution 4: Multiple API Key Rotation

Configure multiple API keys for rotation to multiply your available quota:

{
  "providers": {
    "openai": {
      "apiKeys": [
        "sk-key-1-xxxxx",
        "sk-key-2-xxxxx",
        "sk-key-3-xxxxx"
      ],
      "keyRotation": "round-robin"
    }
  }
}

OpenClaw supports two rotation strategies:

  • round-robin: Use each key in sequential order
  • least-used: Prioritize the key with the lowest current usage

If a key encounters a 429 error, OpenClaw will automatically skip that key and use the next available one.
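The behavior described above can be sketched as a round-robin cursor that skips keys flagged as rate-limited. This is an assumption-level illustration (in a real rotator, blocked keys would be unblocked after a cooldown):

```python
from itertools import cycle

class KeyRotator:
    """Round-robin over API keys, skipping rate-limited ones."""

    def __init__(self, keys: list):
        self.keys = keys
        self._order = cycle(range(len(keys)))
        self.blocked: set = set()

    def next_key(self):
        """Return the next usable key, or None if all are blocked."""
        for _ in range(len(self.keys)):
            i = next(self._order)
            if i not in self.blocked:
                return self.keys[i]
        return None  # every key is currently rate-limited

    def mark_rate_limited(self, key: str) -> None:
        """Flag a key that just returned 429 so it gets skipped."""
        self.blocked.add(self.keys.index(key))
```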

Solution 5: Set Up Model Fallback

When the primary model is rate-limited, automatically fall back to an alternative model:

{
  "models": {
    "default": {
      "provider": "openai",
      "model": "gpt-4o",
      "fallback": [
        {
          "provider": "anthropic",
          "model": "claude-sonnet-4-20250514"
        },
        {
          "provider": "openai",
          "model": "gpt-4o-mini"
        }
      ]
    }
  }
}

When gpt-4o returns a 429 error, OpenClaw will automatically try the next model in the fallback list to handle the request.
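The fallback order amounts to a simple try-each-in-turn loop. In the sketch below, `call_model` is a hypothetical callable standing in for the real provider request; only the fallback logic is the point.

```python
class RateLimited(Exception):
    """Stand-in for an HTTP 429 response from a provider."""

def call_with_fallback(models: list, call_model, prompt: str) -> str:
    """Try each model in order; on a 429, fall through to the next."""
    last_error = None
    for model in models:
        try:
            return call_model(model, prompt)
        except RateLimited as err:
            last_error = err  # this model is saturated; try the next one
    raise last_error or RuntimeError("no models configured")
```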

Solution 6: Configure a Request Queue

Queue concurrent requests and process them sequentially to avoid instantaneous request spikes:

{
  "api": {
    "queue": {
      "enabled": true,
      "concurrency": 3,
      "maxQueueSize": 100,
      "timeout": 120000
    }
  }
}

concurrency controls the number of simultaneous API requests; requests exceeding this limit will be queued.

Monitoring Recommendations

Set up dashboards or alerts to monitor API usage:

# View real-time API usage statistics
openclaw stats api --watch

# Export a usage report
openclaw stats api --export csv --period 7d > api-usage.csv

Regularly reviewing API usage reports helps you plan quotas appropriately and upgrade your account tier before limits become a problem.
