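Health Check Overview
The health check API is a set of built-in endpoints exposed by the OpenClaw Gateway that report the service's running state. Load balancers, monitoring systems, and container orchestration platforms use it to decide whether the service is working properly.
Basic Health Check
By default, OpenClaw serves a health check at the /health path: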
curl http://localhost:3000/health
Response:
{
  "status": "ok",
  "timestamp": "2026-03-11T08:00:00Z"
}
HTTP status codes:
200: the service is healthy
503: the service is unhealthy
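The status codes and the response body above can be combined into a single verdict. The following is a minimal sketch, not part of OpenClaw: the function names fetch_health and interpret_health are hypothetical, and treating anything other than HTTP 200 with "status": "ok" as unhealthy is an assumption.

```python
import json
import urllib.error
import urllib.request


def fetch_health(url: str, timeout: float = 5.0) -> tuple[int, str]:
    """Fetch a health endpoint and return (status_code, body_text)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx; a 503 still carries a usable status code.
        return err.code, err.read().decode("utf-8", errors="replace")


def interpret_health(status_code: int, body: str) -> bool:
    """Return True only for HTTP 200 with a JSON body whose status is "ok"."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"
```

A caller would typically chain the two, for example interpret_health(*fetch_health("http://localhost:3000/health")). Connection failures (gateway not running) surface as urllib.error.URLError and are left for the caller to handle.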
Detailed Health Check
To get more detailed health information:
curl http://localhost:3000/health/detailed
Response:
{
  "status": "ok",
  "version": "1.2.3",
  "uptime": 86400,
  "timestamp": "2026-03-11T08:00:00Z",
  "checks": {
    "gateway": {
      "status": "ok",
      "responseTime": "2ms"
    },
    "providers": {
      "openai": {
        "status": "ok",
        "lastCheck": "2026-03-11T07:59:30Z",
        "responseTime": "180ms"
      },
      "anthropic": {
        "status": "ok",
        "lastCheck": "2026-03-11T07:59:30Z",
        "responseTime": "210ms"
      }
    },
    "channels": {
      "telegram": {
        "status": "ok",
        "connected": true
      },
      "discord": {
        "status": "degraded",
        "connected": true,
        "warning": "High latency detected"
      }
    },
    "memory": {
      "status": "ok",
      "used": "128MB",
      "limit": "512MB",
      "percentage": 25
    },
    "disk": {
      "status": "ok",
      "used": "2.1GB",
      "available": "47.9GB"
    }
  }
}
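Note that in the sample response the top-level status is "ok" even though the discord channel reports "degraded", so a monitor may want to inspect the individual checks. The sketch below walks the "checks" object and lists every component whose status is not "ok". The function name find_unhealthy is hypothetical, and the traversal assumes the nesting shown in the sample (a component is any object that carries a "status" key).

```python
def find_unhealthy(checks: dict, prefix: str = "") -> list[tuple[str, str]]:
    """Return (dotted_path, status) for every component whose status is not "ok"."""
    problems = []
    for name, node in checks.items():
        if not isinstance(node, dict):
            continue  # scalar fields such as "uptime" carry no status
        path = f"{prefix}.{name}" if prefix else name
        if "status" in node:
            # Leaf component, e.g. checks.gateway or checks.channels.discord
            if node["status"] != "ok":
                problems.append((path, node["status"]))
        else:
            # Grouping object, e.g. checks.providers or checks.channels
            problems.extend(find_unhealthy(node, path))
    return problems
```

Applied to the "checks" object of the sample response, this yields a single entry for channels.discord with status "degraded".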
Configuring Health Checks
{
  "gateway": {
    "healthCheck": {
      "enabled": true,
      "path": "/health",
      "detailedPath": "/health/detailed",
      "interval": 30000,
      "timeout": 5000,
      "checks": {
        "providers": true,
        "channels": true,
        "memory": true,
        "disk": true
      },
      "thresholds": {
        "memoryWarning": 80,
        "memoryCritical": 95,
        "diskWarning": 85,
        "diskCritical": 95,
        "providerTimeout": 5000
      }
    }
  }
}
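The thresholds block suggests a two-level classification of resource usage. The sketch below shows one plausible reading, under the assumptions that the memory and disk thresholds are percentages and that a value equal to a threshold already counts as reaching it. The source does not state either point, and classify_usage is a hypothetical helper, not an OpenClaw function.

```python
def classify_usage(percent: float, warning: float, critical: float) -> str:
    """Map a usage percentage to "ok", "warning", or "critical"."""
    if percent >= critical:
        return "critical"
    if percent >= warning:
        return "warning"
    return "ok"


# Thresholds copied from the configuration above.
MEMORY_WARNING, MEMORY_CRITICAL = 80, 95

# The sample detailed response reports memory at 25 percent.
memory_state = classify_usage(25, MEMORY_WARNING, MEMORY_CRITICAL)
```

With the configured values, 25 percent classifies as "ok", 85 percent as "warning", and 95 percent as "critical".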
Readiness and Liveness Checks
OpenClaw distinguishes between two types of health check:
Liveness check: is the service process running?
curl http://localhost:3000/health/live
# Returns 200 as long as the process is running
Readiness check: is the service ready to accept requests?
curl http://localhost:3000/health/ready
# Returns 200 only after all dependencies (model providers, channels) are ready
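Because the readiness endpoint returns 200 only once dependencies are up, a deployment script can poll it before sending traffic. The following is a minimal sketch of such a polling loop. The name wait_until_ready and its parameters are hypothetical, and the probe is passed in as a callable that returns an HTTP status code so the loop stays independent of any particular HTTP client.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], int],
    attempts: int = 10,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `attempts` times; return True on the first 200."""
    for attempt in range(attempts):
        if probe() == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # back off before the next readiness probe
    return False
```

In practice the probe would issue a GET against /health/ready and return the status code; a liveness probe would use the same loop against /health/live.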
Kubernetes Integration
Use the health checks in a Kubernetes Pod spec:
containers:
  - name: openclaw
    image: openclaw/openclaw:latest
    ports:
      - containerPort: 3000
    livenessProbe:
      httpGet:
        path: /health/live
        port: 3000
      initialDelaySeconds: 10
      periodSeconds: 15
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /health/ready
        port: 3000
      initialDelaySeconds: 20
      periodSeconds: 10
      failureThreshold: 3
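Kubernetes acts on a probe only after failureThreshold consecutive failures, so the probe settings above imply a rough detection window. The arithmetic below is an approximation that ignores the per-probe timeout and scheduling jitter; detection_window_seconds is a hypothetical helper used only for illustration.

```python
def detection_window_seconds(period_seconds: int, failure_threshold: int) -> int:
    """Approximate upper bound on the time from failure onset to Kubernetes acting."""
    return period_seconds * failure_threshold


# Liveness: periodSeconds 15, failureThreshold 3 -> container restart after about 45 s
liveness_window = detection_window_seconds(15, 3)

# Readiness: periodSeconds 10, failureThreshold 3 -> removed from endpoints after about 30 s
readiness_window = detection_window_seconds(10, 3)
```

Lowering periodSeconds or failureThreshold shortens the window at the cost of more probe traffic and a higher chance of reacting to transient failures.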
Docker Health Check
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
CMD curl -f http://localhost:3000/health || exit 1
Docker Compose:
services:
  openclaw:
    image: openclaw/openclaw:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 20s
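Both Docker examples rely on curl being installed in the image. Where it is not, a small script can play the same role. The sketch below approximates the behavior of curl -f with the Python standard library, assuming a Python interpreter is available in the container (this is an assumption; the source does not say what the image contains). The function name probe is hypothetical.

```python
import http.client
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> int:
    """Return 0 if the URL answers with a 2xx status, 1 otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 0 if 200 <= resp.status < 300 else 1
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # HTTP errors (e.g. 503), refused connections, timeouts, malformed URLs
        return 1
```

A health check script would end with sys.exit(probe("http://localhost:3000/health")) so that Docker sees exit code 0 for healthy and 1 for unhealthy.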
External Monitoring Integration
Uptime Robot / Better Uptime
Configure an HTTP monitor and set the URL to:
https://gateway.example.com/health
Expected status code: 200
Prometheus Metrics
Enable the Prometheus metrics endpoint:
{
  "gateway": {
    "metrics": {
      "enabled": true,
      "path": "/metrics",
      "format": "prometheus"
    }
  }
}
curl http://localhost:3000/metrics
Output:
# HELP openclaw_requests_total Total number of requests
# TYPE openclaw_requests_total counter
openclaw_requests_total{channel="telegram"} 1520
openclaw_requests_total{channel="discord"} 980
# HELP openclaw_response_time_seconds Response time in seconds
# TYPE openclaw_response_time_seconds histogram
openclaw_response_time_seconds_bucket{le="0.1"} 500
openclaw_response_time_seconds_bucket{le="0.5"} 1200
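For quick ad hoc inspection, the text output above can be parsed without a Prometheus server. The following is a deliberately simplified sketch: it handles plain "name{labels} value" sample lines like the ones shown, skips comment lines, and does not handle escaped quotes in label values or trailing timestamps. parse_metrics is a hypothetical helper.

```python
import re

# metric_name, optional {label="value",...} block, then the sample value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Parse exposition text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and # HELP / # TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # unsupported line shape; skipped in this sketch
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

Summing the openclaw_requests_total samples from the output above gives 2500 requests across the two channels.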
Alert Configuration
Configure health check alerts in OpenClaw:
{
  "gateway": {
    "healthCheck": {
      "alerts": {
        "enabled": true,
        "channels": ["telegram-admin"],
        "conditions": [
          {"check": "provider", "status": "down", "for": "2m"},
          {"check": "memory", "above": 90, "for": "5m"},
          {"check": "channel", "status": "disconnected", "for": "1m"}
        ]
      }
    }
  }
}
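Each condition pairs a predicate with a "for" duration. One plausible reading, borrowed from how alerting rules commonly work, is that the alert fires only when the predicate has held continuously for that long. The source does not define the semantics, so the sketch below is an assumption; parse_duration and should_fire are hypothetical helpers, and the "s" and "h" units are guesses beyond the "m" seen in the configuration.

```python
from typing import Callable

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert a duration such as "2m" into seconds."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


def should_fire(
    observations: list[tuple[int, float]],
    predicate: Callable[[float], bool],
    for_seconds: int,
) -> bool:
    """observations: (timestamp_seconds, value) pairs in ascending time order.

    Fire when the predicate has held on every observation from some point
    up to the latest one, and that stretch spans at least for_seconds.
    """
    started = None
    for timestamp, value in observations:
        if predicate(value):
            if started is None:
                started = timestamp  # start of the current breaching stretch
        else:
            started = None  # any healthy observation resets the timer
    if started is None:
        return False
    return observations[-1][0] - started >= for_seconds
```

For the memory condition above, the predicate would be "value above 90" and for_seconds would be parse_duration("5m"), which is 300.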
Custom Health Checks
Add custom check logic:
{
  "gateway": {
    "healthCheck": {
      "custom": [
        {
          "name": "redis",
          "type": "tcp",
          "host": "redis.example.com",
          "port": 6379,
          "timeout": 2000
        },
        {
          "name": "external-api",
          "type": "http",
          "url": "https://api.example.com/health",
          "expectedStatus": 200,
          "timeout": 5000
        }
      ]
    }
  }
}
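To make the "tcp" check type concrete, the sketch below shows what such a check conceptually does: attempt a TCP connection and report whether it succeeded within the timeout. This is an illustration, not OpenClaw's implementation. The function name tcp_check is hypothetical, and interpreting "timeout" as milliseconds is an assumption based on the magnitude of the configured values.

```python
import socket


def tcp_check(host: str, port: int, timeout_ms: int) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_ms / 1000):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and timeouts
        return False
```

The redis entry above would correspond to tcp_check("redis.example.com", 6379, 2000). An "http" check would follow the same pattern, comparing the response status against expectedStatus.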
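Summary
The health check API is a core component of running OpenClaw in production. With liveness checks, readiness checks, and alerting configured appropriately, service failures can be detected quickly and recovered from automatically, keeping the AI assistant highly available.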