LiteLLM Introduction
LiteLLM is an open-source AI model proxy layer that unifies 100+ LLM provider APIs behind an OpenAI-compatible format. Deploying LiteLLM Proxy lets OpenClaw reach every configured provider's models through a single endpoint, while adding load balancing, cost tracking, and API key management.
Deploy LiteLLM Proxy
```bash
docker run -d \
  --name litellm \
  -p 4000:4000 \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml
```
Create litellm_config.yaml:

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-your-openai-key
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: sk-ant-your-key
  - model_name: llama-3
    litellm_params:
      model: ollama/llama3.1
      # localhost here refers to the LiteLLM container itself; when Ollama
      # runs on the Docker host, use http://host.docker.internal:11434
      api_base: http://localhost:11434

general_settings:
  master_key: sk-litellm-master-key
```
Configure in OpenClaw
```json
{
  "providers": {
    "litellm": {
      "type": "openai",
      "baseUrl": "http://localhost:4000/v1",
      "apiKey": "{{LITELLM_API_KEY}}",
      "models": ["gpt-4o", "claude-sonnet", "llama-3"]
    }
  }
}
```
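Because the proxy speaks the OpenAI wire format, any OpenAI-style client can talk to it directly. As a minimal sketch using only the Python standard library (the URL and master key mirror the config examples above):

```python
import json
import urllib.request

# Any OpenAI-style chat payload works; "claude-sonnet" is the model_name
# alias defined in litellm_config.yaml, not a provider-native model ID.
payload = {
    "model": "claude-sonnet",
    "messages": [{"role": "user", "content": "Hello"}],
}
req = urllib.request.Request(
    "http://localhost:4000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer sk-litellm-master-key",
        "Content-Type": "application/json",
    },
)
# Sending the request requires the proxy to be running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

LiteLLM translates this one request shape to whichever provider backs the alias, which is why OpenClaw can treat all three models identically.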
Load Balancing
LiteLLM supports load balancing across multiple deployments of the same model:
```yaml
model_list:
  # Two entries share the alias gpt-4o; the router spreads traffic across them
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-key-1
  - model_name: gpt-4o
    litellm_params:
      model: azure/gpt-4o-deployment
      api_key: azure-key-1
      api_base: https://your-resource.openai.azure.com

router_settings:
  # other strategies include simple-shuffle, usage-based-routing,
  # and latency-based-routing
  routing_strategy: least-busy
  num_retries: 3
Failover and Cost Tracking
LiteLLM provides built-in failover (fallback model lists with automatic retries) and per-key, per-model cost tracking that can be persisted to a Postgres database.
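As an illustrative sketch in litellm_config.yaml, fallbacks reuse the model aliases defined earlier, and the connection string shown is a placeholder (verify the exact schema against the LiteLLM docs for your version):

```yaml
router_settings:
  num_retries: 3
  fallbacks:
    # if gpt-4o still fails after retries, retry the request on claude-sonnet
    - gpt-4o: ["claude-sonnet"]

general_settings:
  master_key: sk-litellm-master-key
  # enables persisted spend logs and key management in Postgres
  database_url: "postgresql://user:password@localhost:5432/litellm"
```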
Best Practices with OpenClaw
- Manage all provider API keys in LiteLLM; OpenClaw then only needs a single LiteLLM key
- Leverage LiteLLM caching to reduce duplicate requests
- Set budget caps to prevent cost overruns
- Use Docker Compose to manage LiteLLM and OpenClaw together
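The Compose setup in the last point might look like the following sketch; the OpenClaw image name and its environment variable are assumptions for illustration, so substitute your actual deployment details:

```yaml
# docker-compose.yml
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    command: ["--config", "/app/config.yaml"]
    volumes:
      - ./litellm_config.yaml:/app/config.yaml
    ports:
      - "4000:4000"

  openclaw:
    image: openclaw/openclaw:latest  # hypothetical image name
    environment:
      - LITELLM_API_KEY=sk-litellm-master-key
    depends_on:
      - litellm
```

Inside the Compose network, OpenClaw would reach the proxy at http://litellm:4000/v1 rather than localhost.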
Summary
LiteLLM is an ideal companion for OpenClaw's multi-model management. It abstracts away provider differences and provides enterprise-grade features like load balancing, failover, and cost control, greatly simplifying operational complexity in multi-model environments.