Introduction
Understanding OpenClaw's resource consumption is key to ensuring service stability. This article covers how to comprehensively monitor OpenClaw's resource usage — including memory, CPU, network connections, and message throughput — and configure automated alert notifications when metrics become abnormal.
1. Built-in Resource Monitoring
1.1 The openclaw stats Command
OpenClaw provides a built-in statistics command for a quick overview of resource usage:
# View current resource usage
openclaw stats
# Example output
# ┌─────────────────────────────────────────┐
# │ OpenClaw Runtime Statistics │
# ├─────────────────────────────────────────┤
# │ Uptime: 3d 12h 45m │
# │ Process PID: 12345 │
# │ Memory (Heap): 168MB / 512MB (32%) │
# │ Memory (RSS): 245MB │
# │ CPU (1m avg): 2.3% │
# │ Active Channels: 3 / 3 │
# │ Today's Messages: 342 │
# │ Today's Tokens: 125,800 │
# │ Today's Cost: $1.85 │
# │ Avg Response: 1.8s │
# │ Error Rate: 0.9% │
# └─────────────────────────────────────────┘
1.2 Real-Time Monitoring Dashboard
# Start real-time monitoring (similar to the top command)
openclaw stats --live
# Custom refresh interval (seconds)
openclaw stats --live --interval 5
The real-time dashboard continuously updates the following metrics:
- Memory usage trend graph (ASCII chart)
- Messages processed per minute
- Current active connections
- API call latency
- Error count
1.3 Historical Statistics Queries
# View message statistics for the past 24 hours
openclaw stats --period 24h
# View resource trends for the past 7 days
openclaw stats --period 7d --metric memory
# Export statistics as CSV
openclaw stats --period 30d --format csv > openclaw-stats.csv
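The exported CSV can be post-processed with any tooling. As a sketch, a few lines of Python can total messages, tokens, and cost per export (the column names here are assumptions for illustration, not a documented schema):

```python
import csv
import io

# Sample rows in the shape `openclaw stats --format csv` might emit.
# Column names are assumptions for illustration, not a documented schema.
SAMPLE = """date,messages,tokens,cost
2026-03-12,310,118400,1.72
2026-03-13,298,109900,1.61
2026-03-14,342,125800,1.85
"""

def summarize(csv_text: str) -> dict:
    """Aggregate message, token, and cost totals from an exported stats CSV."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {
        "days": len(rows),
        "messages": sum(int(r["messages"]) for r in rows),
        "tokens": sum(int(r["tokens"]) for r in rows),
        "cost": round(sum(float(r["cost"]) for r in rows), 2),
    }

print(summarize(SAMPLE))  # {'days': 3, 'messages': 950, 'tokens': 354100, 'cost': 5.18}
```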
2. HTTP API Monitoring Endpoints
2.1 Retrieving Runtime Metrics
# Basic runtime metrics
curl -s http://localhost:18789/health/stats | jq .
Response data:
{
"uptime": 302400,
"memory": {
"heapUsed": 168000000,
"heapTotal": 536870912,
"rss": 257000000,
"external": 15000000
},
"cpu": {
"user": 125000,
"system": 45000,
"percent": 2.3
},
"messages": {
"today": 342,
"thisHour": 28,
"total": 15680
},
"tokens": {
"today": {
"input": 89500,
"output": 36300
}
},
"responseTime": {
"avg": 1800,
"p50": 1500,
"p95": 3200,
"p99": 5100
},
"errors": {
"today": 3,
"rate": 0.003
}
}
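The raw byte and count fields map directly onto the percentages that `openclaw stats` displays. A minimal sketch of the derivation, using the field names from the sample response above (the helper function itself is illustrative, not part of OpenClaw):

```python
# Derived metrics from a /health/stats payload. Field names mirror the sample
# response above; the derive() helper is illustrative, not part of OpenClaw.
stats = {
    "memory": {"heapUsed": 168_000_000, "heapTotal": 536_870_912},
    "messages": {"today": 342},
    "errors": {"today": 3},
}

def derive(s: dict) -> dict:
    """Compute heap usage percent and today's error rate from raw fields."""
    heap_pct = s["memory"]["heapUsed"] / s["memory"]["heapTotal"] * 100
    error_rate = s["errors"]["today"] / max(s["messages"]["today"], 1)
    return {"heapPercent": round(heap_pct, 1), "errorRate": round(error_rate, 4)}

print(derive(stats))  # {'heapPercent': 31.3, 'errorRate': 0.0088}
```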
2.2 Channel-Level Statistics
# Get message statistics per channel
curl -s http://localhost:18789/health/channels | jq .
{
"channels": [
{
"name": "whatsapp",
"status": "connected",
"uptime": 302400,
"messagesReceived": 180,
"messagesSent": 175,
"avgResponseTime": 1650,
"errors": 1
},
{
"name": "telegram",
"status": "connected",
"uptime": 302400,
"messagesReceived": 120,
"messagesSent": 118,
"avgResponseTime": 1950,
"errors": 2
}
]
}
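Per-channel counters are useful for spotting a channel that receives messages but stops replying. A small sketch that flags channels whose sent/received ratio drops or whose error count climbs (the thresholds are arbitrary examples, not OpenClaw defaults):

```python
# Flag unhealthy channels from a /health/channels payload (shape as shown
# above). The 0.95 ratio and 5-error thresholds are arbitrary examples.
channels = [
    {"name": "whatsapp", "messagesReceived": 180, "messagesSent": 175, "errors": 1},
    {"name": "telegram", "messagesReceived": 120, "messagesSent": 118, "errors": 2},
]

def flag_channels(chs: list, min_ratio: float = 0.95, max_errors: int = 5) -> dict:
    """Return {channel_name: healthy?} based on reply ratio and error count."""
    result = {}
    for ch in chs:
        ratio = ch["messagesSent"] / max(ch["messagesReceived"], 1)
        result[ch["name"]] = ratio >= min_ratio and ch["errors"] < max_errors
    return result

print(flag_channels(channels))  # {'whatsapp': True, 'telegram': True}
```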
3. Prometheus Metrics Collection
3.1 Enabling the Prometheus Endpoint
// ~/.config/openclaw/openclaw.json5
{
"monitoring": {
"prometheus": {
"enabled": true,
"port": 9191,
"path": "/metrics"
}
}
}
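With the endpoint enabled, Prometheus needs a matching scrape job. A typical fragment for `prometheus.yml` (the job name and scrape interval are just examples; the port and path match the configuration above):

```yaml
scrape_configs:
  - job_name: "openclaw"
    scrape_interval: 30s
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:9191"]
```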
3.2 Key Prometheus Metrics
Key metrics exported by OpenClaw:
Message Processing Metrics:
| Metric Name | Type | Description |
|---|---|---|
| `openclaw_messages_received_total` | Counter | Total received messages (labeled by channel) |
| `openclaw_messages_sent_total` | Counter | Total sent messages |
| `openclaw_messages_failed_total` | Counter | Failed message count |
Model Call Metrics:
| Metric Name | Type | Description |
|---|---|---|
| `openclaw_model_requests_total` | Counter | Total model API calls |
| `openclaw_model_errors_total` | Counter | Model API error count |
| `openclaw_model_duration_seconds` | Histogram | Model response time distribution |
| `openclaw_model_tokens_total` | Counter | Total token usage (input/output labels) |
Resource Metrics:
| Metric Name | Type | Description |
|---|---|---|
| `openclaw_memory_heap_bytes` | Gauge | Heap memory usage |
| `openclaw_memory_rss_bytes` | Gauge | Resident set size |
| `openclaw_active_connections` | Gauge | Active connection count |
| `openclaw_queue_length` | Gauge | Request queue length |
3.3 Useful PromQL Queries
# Messages processed per minute
rate(openclaw_messages_received_total[5m]) * 60
# Message volume by channel
sum by (channel) (increase(openclaw_messages_received_total[24h]))
# Model call P95 latency
histogram_quantile(0.95, rate(openclaw_model_duration_seconds_bucket[5m]))
# Error rate
rate(openclaw_model_errors_total[5m]) / rate(openclaw_model_requests_total[5m])
# Memory usage percentage
openclaw_memory_heap_bytes / openclaw_memory_heap_max_bytes * 100
# Token consumption rate (per hour)
rate(openclaw_model_tokens_total[1h]) * 3600
4. Alert Rule Configuration
4.1 Threshold-Based Alerts
Set alert rules in the OpenClaw configuration:
{
"alerts": {
"enabled": true,
"rules": [
{
"name": "High Memory Usage",
"condition": "memory.heapPercent > 85",
"duration": "10m",
"severity": "warning",
"message": "Memory usage exceeds 85%, current: {value}%"
},
{
"name": "Slow Response",
"condition": "responseTime.p95 > 5000",
"duration": "5m",
"severity": "warning",
"message": "P95 response time exceeds 5 seconds, current: {value}ms"
},
{
"name": "High Error Rate",
"condition": "errors.rate > 0.05",
"duration": "5m",
"severity": "critical",
"message": "Error rate exceeds 5%, current: {value}"
},
{
"name": "Channel Disconnected",
"condition": "channels.disconnected > 0",
"duration": "3m",
"severity": "critical",
"message": "{value} channel(s) disconnected"
},
{
"name": "Queue Backlog",
"condition": "queue.length > 30",
"duration": "2m",
"severity": "warning",
"message": "Request queue backlog: {value} items"
}
]
}
}
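The `duration` field means a condition must hold continuously for that long before the alert fires; a momentary spike that clears itself never notifies anyone. A minimal sketch of that semantics (an illustration of the idea, not OpenClaw's actual evaluator):

```python
from datetime import datetime, timedelta

class Rule:
    """Fires only after `condition` has been continuously true for `duration`."""
    def __init__(self, name: str, condition, duration: timedelta):
        self.name = name
        self.condition = condition   # callable: metrics dict -> bool
        self.duration = duration
        self._breach_start = None

    def check(self, metrics: dict, now: datetime) -> bool:
        if not self.condition(metrics):
            self._breach_start = None      # condition cleared; reset the timer
            return False
        if self._breach_start is None:
            self._breach_start = now       # first breach observed
        return now - self._breach_start >= self.duration

rule = Rule("High Memory Usage",
            lambda m: m["heapPercent"] > 85,
            timedelta(minutes=10))

t0 = datetime(2026, 3, 14, 12, 0)
print(rule.check({"heapPercent": 90}, t0))                          # False: just breached
print(rule.check({"heapPercent": 91}, t0 + timedelta(minutes=10)))  # True: sustained 10m
```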
4.2 Alert Notification Channels
{
"alerts": {
"notifications": [
{
"type": "telegram",
"botToken": "YOUR_BOT_TOKEN",
"chatId": "YOUR_CHAT_ID",
// Only receive critical-level alerts
"minSeverity": "critical"
},
{
"type": "webhook",
"url": "https://hooks.slack.com/services/xxx",
"minSeverity": "warning"
},
{
"type": "email",
"to": "[email protected]",
"minSeverity": "critical"
}
],
// Alert throttling: minimum interval between identical alerts
"throttle": "15m",
// Send notification on resolution
"notifyOnResolve": true
}
}
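The `throttle` setting suppresses repeats of the same alert inside the window, so a flapping metric produces one notification every 15 minutes instead of one per check. The idea, sketched (again illustrative, not OpenClaw internals):

```python
from datetime import datetime, timedelta

class Throttle:
    """Allow a given alert name through at most once per `window`."""
    def __init__(self, window: timedelta):
        self.window = window
        self._last_sent = {}               # alert name -> last send time

    def allow(self, alert_name: str, now: datetime) -> bool:
        last = self._last_sent.get(alert_name)
        if last is not None and now - last < self.window:
            return False                   # identical alert inside the window
        self._last_sent[alert_name] = now
        return True

th = Throttle(timedelta(minutes=15))
t0 = datetime(2026, 3, 14, 12, 0)
print(th.allow("High Memory Usage", t0))                         # True
print(th.allow("High Memory Usage", t0 + timedelta(minutes=5)))  # False (throttled)
print(th.allow("High Memory Usage", t0 + timedelta(minutes=20))) # True
```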
4.3 Grafana Alert Rules
If you run Grafana (or Prometheus with Alertmanager), you can define more flexible alerts using Prometheus-style rule files:
# Prometheus-style alert rules (evaluated by Prometheus or Grafana)
groups:
- name: openclaw
rules:
- alert: OpenClawHighMemory
expr: openclaw_memory_heap_bytes > 400 * 1024 * 1024
for: 10m
labels:
severity: warning
annotations:
summary: "OpenClaw memory usage too high"
description: "Heap memory: {{ $value | humanizeBytes }}"
- alert: OpenClawMessageBacklog
expr: openclaw_queue_length > 20
for: 5m
labels:
severity: warning
annotations:
summary: "OpenClaw message queue backlog"
- alert: OpenClawDown
expr: up{job="openclaw"} == 0
for: 1m
labels:
severity: critical
annotations:
summary: "OpenClaw service unavailable"
5. Message Volume and Cost Statistics
5.1 Daily Report
# View today's summary statistics
openclaw stats --period today --summary
# Output
# Today's Statistics (2026-03-14)
# ──────────────────────
# Total Messages: 342
# Token Usage: 125,800 (input: 89,500 / output: 36,300)
# Estimated Cost: $1.85
# Avg Response: 1.8s
# Slowest Response: 6.2s
# Errors: 3 (0.9%)
# Active Users: 28
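The estimated cost is just token counts multiplied by per-token prices. With the day's counts from the report above and hypothetical per-million-token rates (the actual rates depend on the model you use):

```python
# Hypothetical pricing; substitute your model's real rates.
INPUT_PER_M = 10.0    # $ per 1M input tokens (assumed)
OUTPUT_PER_M = 25.0   # $ per 1M output tokens (assumed)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Token counts times per-million-token prices, rounded to cents."""
    return round(input_tokens / 1e6 * INPUT_PER_M
                 + output_tokens / 1e6 * OUTPUT_PER_M, 2)

# Using the day's counts from the report above:
print(estimate_cost(89_500, 36_300))  # 1.8
```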
5.2 Cost Forecasting
# View this month's cost trend and forecast
openclaw stats --cost --period month
# Output
# Monthly Cost Statistics
# ──────────────────────
# Spent: $42.50
# Daily Average: $3.04
# Month-End Forecast: $94.20
# Highest Cost Channel: WhatsApp ($22.30)
# Highest Cost User: user_abc ($8.50)
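The month-end forecast is simple linear extrapolation: average daily spend so far, projected over the full month. Reconstructing the numbers above (the exact formula OpenClaw uses is an assumption; small differences come from rounding):

```python
import calendar
from datetime import date

def month_end_forecast(spent: float, today: date) -> float:
    """Linearly extrapolate month-to-date spend to the end of the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_avg = spent / today.day
    return round(daily_avg * days_in_month, 2)

# $42.50 over the first 14 days of March (31 days) projects to ~$94.
print(month_end_forecast(42.50, date(2026, 3, 14)))  # 94.11
```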
6. Monitoring Architecture Recommendations
Choose a monitoring approach that matches your deployment scale:
Personal / Small Team (1-5 users):
- Use `openclaw stats` for manual checks
- cron + watchdog scripts for basic health checks
- Telegram alert notifications
Medium Scale (5-50 users):
- Enable Prometheus metrics collection
- Deploy a Grafana dashboard
- Configure multi-level alert rules
- Regularly review cost reports
Large Scale / Enterprise:
- Full Prometheus + Grafana + Alertmanager stack
- Integrate with enterprise monitoring platforms (Datadog/New Relic)
- Centralized log collection (ELK/Loki)
- SLA monitoring and automated operations
Don't over-engineer, but don't leave blind spots on critical metrics either. Continuous monitoring is the foundation of a stable OpenClaw deployment.