Introduction
OpenClaw is a long-running AI gateway daemon, so its resource consumption directly impacts server costs and service stability. This article systematically covers how to optimize OpenClaw's resource usage across four dimensions — memory, CPU, network, and storage — so your service runs efficiently within limited resources.
1. Memory Optimization
1.1 Understanding Memory Consumption Sources
OpenClaw's memory consumption primarily comes from:
- Conversation history cache: Context for each active user's conversations
- Channel connection pool: WebSocket persistent connections to various messaging platforms
- Skill runtime: Loaded skill modules
- Request queue: Message buffer awaiting processing
Check current memory usage:
```bash
# Via the health check endpoint
curl -s http://localhost:18789/health/detail | jq '.memory'

# Example output
# {
#   "heapUsed": "128MB",
#   "heapTotal": "256MB",
#   "rss": "310MB",
#   "external": "15MB"
# }
```
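To act on these numbers programmatically, a minimal sketch could compute heap utilization from the payload above (assuming the `MB`-suffixed strings shown in the example output; the helper names are ours, not part of OpenClaw):

```javascript
// Sketch: compute heap utilization from the /health/detail payload shown above.
// Assumes memory values are "NNNMB" strings as in the example output.
const parseMB = (s) => Number(String(s).replace(/MB$/i, ""));

function heapUsagePercent(memory) {
  const used = parseMB(memory.heapUsed);
  const total = parseMB(memory.heapTotal);
  return Math.round((used / total) * 100);
}

// Example: the payload above → 128 / 256 = 50%
const pct = heapUsagePercent({ heapUsed: "128MB", heapTotal: "256MB" });
console.log(`heap usage: ${pct}%`); // alert in your monitoring if this stays high
```

A value persistently above ~85% of the heap limit is a signal to raise `--max-old-space-size` or shrink the caches described below.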
1.2 Limiting Node.js Heap Memory
Control the V8 engine's maximum heap memory via the NODE_OPTIONS environment variable:
```json5
// ~/.config/openclaw/openclaw.json5
{
  "runtime": {
    "nodeOptions": "--max-old-space-size=384" // in MB
  }
}
```
Or set it via environment variable:
```bash
export NODE_OPTIONS="--max-old-space-size=384"
openclaw restart
```
Recommended values:
| Server Memory | Suggested Heap Limit | Use Case |
|---|---|---|
| 512MB | 256MB | Personal use, 1-2 channels |
| 1GB | 384MB | Small team, 3-5 channels |
| 2GB | 512MB | Medium load, multiple channels |
| 4GB+ | 1024MB | High-load production |
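For scripting, the table can be expressed as a small helper (illustrative only; the function name is ours):

```javascript
// Illustrative helper mirroring the table above: pick a --max-old-space-size
// value (in MB) from total server memory (in MB).
function suggestedHeapLimitMB(serverMemoryMB) {
  if (serverMemoryMB <= 512) return 256;
  if (serverMemoryMB <= 1024) return 384;
  if (serverMemoryMB <= 2048) return 512;
  return 1024;
}

console.log(suggestedHeapLimitMB(1024)); // 384, matching the 1GB row
```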
1.3 Controlling Conversation History Length
Conversation history is a major memory consumer. The longer each user's history, the greater the token consumption and memory usage per API call:
```json5
// ~/.config/openclaw/openclaw.json5
{
  "conversation": {
    // Maximum conversation turns to keep
    "maxHistory": 20,
    // Maximum token limit (older messages are truncated when exceeded)
    "maxTokens": 8000,
    // Conversation timeout (history cleared after this idle period)
    "idleTimeout": "30m"
  }
}
```
For resource-constrained environments, you can compress further:
```json5
{
  "conversation": {
    "maxHistory": 10,
    "maxTokens": 4000,
    "idleTimeout": "15m",
    // Enable history summary compression: automatically summarizes old conversations when limits are exceeded
    "summarize": true
  }
}
```
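To make the trimming behavior concrete, here is a rough sketch of how a `maxHistory` / `maxTokens` policy can be applied. This is an illustration, not OpenClaw's actual implementation, and tokens are crudely estimated as characters / 4:

```javascript
// Sketch of maxHistory / maxTokens trimming (not OpenClaw's internal code).
// Token counts are roughly estimated as chars / 4 — an assumption for illustration.
const estimateTokens = (text) => Math.ceil(text.length / 4);

function trimHistory(messages, { maxHistory, maxTokens }) {
  // Keep only the most recent maxHistory turns…
  let kept = messages.slice(-maxHistory);
  // …then drop the oldest remaining turns until under the token budget.
  while (
    kept.length > 1 &&
    kept.reduce((n, m) => n + estimateTokens(m), 0) > maxTokens
  ) {
    kept.shift();
  }
  return kept;
}
```

With `summarize: true`, the dropped turns would be condensed into a summary message instead of being discarded outright.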
1.4 Memory Cache Strategy
```json5
{
  "cache": {
    // User profile cache size
    "userProfileMaxSize": 100,
    // Skill result cache TTL
    "skillResultTTL": "5m",
    // Maximum cache entries
    "maxEntries": 500
  }
}
```
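The policy above — a per-entry TTL plus a hard cap on entry count — can be sketched as a small cache class (illustrative, not OpenClaw's internal code):

```javascript
// Minimal sketch of the cache policy above: TTL per entry plus a hard cap on
// entry count, evicting the oldest entry when full. Not OpenClaw's internal code.
class TTLCache {
  constructor({ maxEntries, ttlMs }) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.map = new Map(); // Map preserves insertion order, so the first key is oldest
  }
  set(key, value, now = Date.now()) {
    if (this.map.size >= this.maxEntries && !this.map.has(key)) {
      this.map.delete(this.map.keys().next().value); // evict oldest entry
    }
    this.map.set(key, { value, expires: now + this.ttlMs });
  }
  get(key, now = Date.now()) {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (now > entry.expires) {
      this.map.delete(key); // lazily expire on read
      return undefined;
    }
    return entry.value;
  }
}
```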
2. CPU Optimization
2.1 Limiting Concurrent Requests
Processing too many requests simultaneously causes CPU spikes and increased response latency. Control concurrency via queuing:
```json5
{
  "gateway": {
    // Maximum concurrent model requests
    "maxConcurrentRequests": 5,
    // Maximum request queue length
    "queueMaxSize": 50,
    // Queue wait timeout
    "queueTimeout": "30s"
  }
}
```
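The queueing behavior described above can be sketched as a small limiter: at most `maxConcurrent` tasks run at once, up to `queueMaxSize` wait their turn, and anything beyond that is rejected immediately (illustrative, not OpenClaw's gateway code):

```javascript
// Sketch of the gateway concurrency/queue policy described above.
// Illustrative only — not OpenClaw's actual gateway implementation.
function createLimiter({ maxConcurrent, queueMaxSize }) {
  let running = 0;
  const queue = [];
  const next = () => {
    if (running >= maxConcurrent || queue.length === 0) return;
    running++;
    const { task, resolve, reject } = queue.shift();
    task().then(resolve, reject).finally(() => {
      running--;
      next(); // let the next queued task start
    });
  };
  return (task) =>
    new Promise((resolve, reject) => {
      if (queue.length >= queueMaxSize) {
        return reject(new Error("queue full")); // back-pressure: reject excess load
      }
      queue.push({ task, resolve, reject });
      next();
    });
}
```

Rejecting early when the queue is full keeps latency bounded: a user gets a fast "busy" answer instead of a reply that arrives after they have given up.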
2.2 Optimizing Skill Loading
Unnecessary skills consume CPU and memory resources. Only load the skills you actually use:
```json5
{
  "skills": {
    // Explicitly specify enabled skills instead of loading all
    "enabled": ["weather", "reminder", "rss"],
    // Skill execution timeout in ms
    "timeout": 10000,
    // Skill result caching (reduces redundant computation)
    "cache": true
  }
}
```
2.3 Using Systemd CPU Quotas
If you manage OpenClaw via Systemd, you can hard-limit CPU usage:
```ini
# /etc/systemd/system/openclaw.service
[Service]
# Limit to at most 50% of one CPU core
CPUQuota=50%
# Scheduling weight (the default is 100)
CPUWeight=80
```

Note that systemd unit files do not allow trailing comments on the same line as a directive, so keep comments on their own lines. After editing, apply the change with `systemctl daemon-reload && systemctl restart openclaw`.
2.4 PM2 Cluster Mode
On multi-core servers, use PM2's cluster mode to distribute the load:
```javascript
// openclaw-ecosystem.config.js
module.exports = {
  apps: [{
    name: "openclaw",
    script: "openclaw",
    args: "up",
    instances: 2,          // Start 2 instances
    exec_mode: "cluster",
    max_memory_restart: "400M"
  }]
};
```
Note: Cluster mode requires OpenClaw to support multi-instance operation. Make sure shared session storage is configured.
3. Model Invocation Optimization
3.1 Choosing the Right Model
Model selection directly affects response speed and cost:
```json5
{
  "model": {
    // Use a lightweight model for everyday conversations
    "default": "claude-3-5-haiku",
    // Use an advanced model for complex tasks
    "advanced": "claude-3-5-sonnet",
    // Automatically switch based on message complexity
    "autoSwitch": true,
    "autoSwitchThreshold": 100 // Use advanced model when input exceeds 100 characters
  }
}
```
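The `autoSwitch` routing amounts to a simple length check. The sketch below illustrates the behavior the comments describe; the function itself is ours, not OpenClaw's:

```javascript
// Sketch of autoSwitch routing: short messages go to the lightweight default
// model, longer ones to the advanced model. Illustrative only.
function pickModel(message, cfg) {
  if (!cfg.autoSwitch) return cfg.default;
  return message.length > cfg.autoSwitchThreshold ? cfg.advanced : cfg.default;
}

const cfg = {
  default: "claude-3-5-haiku",
  advanced: "claude-3-5-sonnet",
  autoSwitch: true,
  autoSwitchThreshold: 100,
};
console.log(pickModel("What's the weather?", cfg)); // short input → default model
```

Message length is a blunt proxy for complexity, but it is cheap to compute and errs toward the inexpensive model, which is usually the right default for cost control.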
3.2 Enabling Streaming Responses
Streaming responses reduce time-to-first-byte latency and improve user experience:
```json5
{
  "model": {
    "stream": true,
    // Streaming chunk send interval in ms (for channels that don't support streaming)
    "streamChunkInterval": 500
  }
}
```
3.3 Setting Request Timeouts
Prevent abnormal requests from occupying resources for extended periods:
```json5
{
  "model": {
    "timeout": 30000,   // Single request timeout: 30 seconds
    "maxRetries": 2,    // Maximum retry count
    "retryDelay": 2000  // Retry interval: 2 seconds
  }
}
```
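The timeout-plus-retry policy can be sketched as follows (an illustration of the pattern; OpenClaw's real client code may differ):

```javascript
// Sketch of the timeout + retry policy above: each attempt races a deadline,
// and failures are retried up to maxRetries times with retryDelay between
// attempts. Illustrative, not OpenClaw's actual client.
function withTimeout(promise, ms) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("timeout")), ms);
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

async function callWithRetry(fn, { timeout, maxRetries, retryDelay }) {
  let lastErr;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await withTimeout(fn(), timeout);
    } catch (err) {
      lastErr = err;
      if (attempt < maxRetries) {
        await new Promise((r) => setTimeout(r, retryDelay));
      }
    }
  }
  throw lastErr; // all attempts exhausted
}
```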
4. Storage and Network Optimization
4.1 Log File Control
Log file growth can consume significant disk space:
```json5
{
  "log": {
    "level": "info", // Don't use debug in production
    "rotation": {
      "maxSize": "30MB",
      "maxAge": 14, // Retain for 14 days
      "compress": true
    }
  }
}
```
4.2 Session Data Cleanup
Periodically clean up expired session data:
```bash
# Check session data size
du -sh ~/.openclaw/sessions/

# Clean up sessions inactive for more than 30 days
openclaw session cleanup --older-than 30d

# Set up auto-cleanup
openclaw config set session.autoCleanup true
openclaw config set session.cleanupInterval "24h"
openclaw config set session.maxAge "30d"
```
4.3 Network Connection Optimization
```json5
{
  "gateway": {
    // HTTP Keep-Alive timeout in ms
    "keepAliveTimeout": 30000,
    // Connection pool size
    "maxSockets": 20,
    // Enable gzip compression
    "compression": true
  }
}
```
5. Performance Monitoring and Benchmarking
5.1 Built-in Performance Metrics
```bash
# View runtime performance statistics
curl -s http://localhost:18789/health/stats | jq .

# Example output
# {
#   "avgResponseTime": "1.8s",
#   "p95ResponseTime": "3.2s",
#   "messagesPerMinute": 12,
#   "activeConnections": 3,
#   "queueLength": 0
# }
```
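A figure like `p95ResponseTime` is derived from raw latency samples. A minimal nearest-rank percentile looks like this (the sample data is made up for illustration):

```javascript
// Nearest-rank percentile over a set of latency samples, as used to derive
// figures like p95ResponseTime. Sample data below is illustrative.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

const latenciesMs = [800, 1200, 1500, 1800, 2100, 2600, 3200, 4000];
console.log(percentile(latenciesMs, 95)); // the slowest sample here: 4000
```

p95 is usually more telling than the average: a healthy average can hide a long tail of slow requests that users definitely notice.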
5.2 Resource Usage Trends
Combine with Prometheus metrics to monitor resource usage trends:
```bash
# View memory trend over the past 24 hours
openclaw stats --metric memory --period 24h

# View response time trend
openclaw stats --metric latency --period 7d
```
5.3 Verifying Optimization Results
After each configuration adjustment, it's recommended to observe the following metrics for at least 24 hours:
| Metric | Healthy Range | Requires Attention |
|---|---|---|
| Heap memory usage | < 70% | > 85% |
| Average response time | < 3s | > 5s |
| Error rate | < 1% | > 5% |
| CPU usage | < 40% | > 70% |
| Queue length | 0-5 | > 20 |
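For alerting scripts, the thresholds above can be encoded as a small checker (illustrative; the names are ours):

```javascript
// Encode the healthy/attention thresholds from the table above.
// "watch" covers the grey zone between the two thresholds. Illustrative only.
function checkMetric(value, { healthyBelow, attentionAbove }) {
  if (value < healthyBelow) return "healthy";
  if (value > attentionAbove) return "attention";
  return "watch";
}

const thresholds = {
  heapPct:        { healthyBelow: 70, attentionAbove: 85 },
  avgResponseSec: { healthyBelow: 3,  attentionAbove: 5 },
  errorRatePct:   { healthyBelow: 1,  attentionAbove: 5 },
  cpuPct:         { healthyBelow: 40, attentionAbove: 70 },
};

console.log(checkMetric(90, thresholds.heapPct)); // well past 85% → attention
```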
6. Optimization for Low-Spec Servers
For entry-level VPS instances with 512MB of memory, the following aggressive optimization configuration is recommended:
```json5
{
  "runtime": {
    "nodeOptions": "--max-old-space-size=256"
  },
  "conversation": {
    "maxHistory": 8,
    "maxTokens": 3000,
    "idleTimeout": "10m",
    "summarize": true
  },
  "gateway": {
    "maxConcurrentRequests": 2,
    "queueMaxSize": 20
  },
  "model": {
    "default": "claude-3-5-haiku",
    "stream": true,
    "timeout": 20000,
    "maxRetries": 1
  },
  "skills": {
    "enabled": [],
    "cache": true
  },
  "log": {
    "level": "warn",
    "rotation": {
      "maxSize": "10MB",
      "maxAge": 7
    }
  }
}
```
With the optimizations above, OpenClaw can run stably on a server with 512MB of memory while maintaining acceptable response speeds. Gradually adjust the parameters based on your actual load to find the best balance between performance and resources.