Cloud-based AI models are convenient, but every call costs money and your data travels to third-party servers. If privacy is a priority or you want to run an AI assistant at zero cost, running open-source models locally through Ollama is an excellent option. OpenClaw's self-hosted AI assistant has native Ollama support, and the setup is remarkably simple.
What Is Ollama?
Ollama is a local large language model runtime that wraps model downloading, management, and inference into a clean command-line interface and API service. It supports a wide range of open-source models — including Meta's Llama series, Alibaba's Qwen series, and Google's Gemma series — all completely free.
Step 1: Install Ollama
Choose the installation method for your operating system.
Linux (recommended):
curl -fsSL https://ollama.com/install.sh | sh
macOS:
Download the installer from ollama.com, or install via Homebrew:
brew install ollama
Windows:
Download the Windows installer from ollama.com and run it.
After installation, verify it's working:
ollama --version
Step 2: Download an AI Model
Ollama uses the pull command to download models. Here are some recommended choices:
Llama 3.1 8B (general-purpose English model, great for getting started):
ollama pull llama3.1
Qwen2.5 7B (strong Chinese language support, recommended for Chinese users):
ollama pull qwen2.5
Gemma 2 9B (by Google, well-balanced performance):
ollama pull gemma2
You can download multiple models and switch between them in OpenClaw at any time. To see which models you've downloaded:
ollama list
Once a model is downloaded, give it a quick test:
ollama run qwen2.5 "Hello, please introduce yourself"
If the model returns a response, Ollama is ready to go.
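If you maintain several models, re-running `pull` on one you already have just wastes a check. A small guarded sketch (assuming the `ollama` CLI is on your PATH; `ensure_model` is a hypothetical helper name, not an Ollama command):

```shell
# Pull a model only if "ollama list" does not already show it.
ensure_model() {
  model=$1
  if ollama list 2>/dev/null | grep -q "^$model"; then
    echo "$model already downloaded"
  else
    # Guarded so the script keeps going even if the CLI is missing.
    ollama pull "$model" || echo "pull failed (is ollama installed?)"
  fi
}

ensure_model qwen2.5
```

Call `ensure_model` once per model you want available, e.g. in a provisioning script for a fresh VPS.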
Step 3: Make Sure Ollama Is Running
Ollama runs as a background service, listening on http://localhost:11434 by default. Check the service status:
curl http://localhost:11434/api/tags
If you get back JSON data with a model list, the service is running. On Linux, Ollama typically runs as a systemd service that starts automatically:
sudo systemctl status ollama
sudo systemctl enable ollama # Enable auto-start on boot
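The same HTTP service that answers `/api/tags` also handles inference, which is what OpenClaw will use behind the scenes. A minimal sketch of a direct request (assuming the default port and that llama3.1 has been pulled; the call is guarded in case the service isn't up yet):

```shell
# Build a request for Ollama's /api/generate endpoint; "stream": false
# asks for one complete JSON response instead of a token stream.
payload='{"model": "llama3.1", "prompt": "Why is the sky blue?", "stream": false}'

# --max-time keeps the call from hanging if the service is down;
# "|| true" lets the script continue either way.
response=$(curl -s --max-time 5 http://localhost:11434/api/generate -d "$payload" || true)
echo "${response:-no response from Ollama}"
```

If this returns JSON with a `response` field, any remaining problems are on the OpenClaw side, not Ollama's.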
Step 4: Configure Ollama in OpenClaw
Open the OpenClaw configuration file:
nano ~/.config/openclaw/openclaw.json5
Add the Ollama configuration under the providers section:
{
providers: {
ollama: {
enabled: true,
baseUrl: "http://localhost:11434",
// Ollama doesn't require a real API key, but the field can't be empty
apiKey: "ollama",
}
},
// Set the default model
defaultModel: "ollama/qwen2.5",
}
The apiKey field can be any non-empty string — Ollama itself doesn't require authentication, but OpenClaw's configuration validation requires the field to exist.
Save the configuration and restart the Gateway:
openclaw gateway restart
Step 5: Verify and Test
Use the diagnostic tool to confirm Ollama is connected properly:
openclaw doctor
Look for the Ollama entry in the output and confirm its status is healthy. Then send a test message through one of your configured channels (such as Telegram or Discord).
You can also monitor the model's responses through the Dashboard:
openclaw dashboard
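If `openclaw doctor` flags the Ollama entry as unhealthy, it helps to rule OpenClaw out by calling Ollama's chat endpoint directly. A guarded sketch, assuming qwen2.5 has been pulled:

```shell
# Ask /api/chat for a single non-streaming reply; --max-time and "|| true"
# keep the script from hanging or aborting if the service is unreachable.
reply=$(curl -s --max-time 120 http://localhost:11434/api/chat -d '{
  "model": "qwen2.5",
  "messages": [{"role": "user", "content": "Reply with one word: ready"}],
  "stream": false
}' || true)

# An empty reply points at Ollama itself; a valid JSON reply means the
# problem is in the OpenClaw configuration instead.
echo "${reply:-no response from Ollama}"
```

This splits the debugging in half: Ollama reachable but OpenClaw unhealthy usually means a wrong `baseUrl` or model name in `openclaw.json5`.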
Hardware Requirements
Running AI models locally places real demands on your hardware, and RAM is usually the binding constraint. Here's a reference table for different model sizes:
| Model Size | Minimum RAM | Recommended RAM | Recommended GPU VRAM | Example Models |
|---|---|---|---|---|
| 3B params | 4 GB | 8 GB | 4 GB | llama3.2:3b |
| 7-8B params | 8 GB | 16 GB | 8 GB | qwen2.5, llama3.1 |
| 13B params | 16 GB | 32 GB | 12 GB | llama2:13b |
| 70B params | 64 GB | 128 GB | 48 GB+ | llama3.1:70b |
A few things to keep in mind:
- You don't need a dedicated GPU — Ollama will automatically fall back to CPU inference, though it will be noticeably slower
- NVIDIA GPU users should ensure they have the latest CUDA drivers installed
- Apple Silicon (M1/M2/M3/M4) devices work very well with Ollama, leveraging unified memory for efficient inference
- For VPS deployments, choose an instance with at least 8 GB of RAM to run 7B models
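To turn the table above into a quick rule of thumb, here is a small helper whose thresholds mirror the "Recommended RAM" column (the function name and cutoffs are this sketch's own, not anything Ollama ships):

```shell
# Map available system RAM (in GiB) to the largest model tier from the
# table that fits comfortably at the recommended-RAM level.
recommend_model() {
  ram_gb=$1
  if   [ "$ram_gb" -ge 128 ]; then echo "llama3.1:70b"
  elif [ "$ram_gb" -ge 32 ];  then echo "llama2:13b"
  elif [ "$ram_gb" -ge 16 ];  then echo "qwen2.5"
  else                             echo "llama3.2:3b"
  fi
}

recommend_model 16   # prints qwen2.5
```

On Linux you could feed it real numbers with something like `recommend_model "$(free -g | awk '/^Mem:/ {print $2}')"`.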
Switching Between Models
OpenClaw supports configuring multiple models and switching between them as needed. You can set up both cloud and local models side by side:
{
providers: {
anthropic: {
enabled: true,
apiKey: "sk-ant-xxxxx",
},
ollama: {
enabled: true,
baseUrl: "http://localhost:11434",
apiKey: "ollama",
}
},
defaultModel: "ollama/qwen2.5",
// You can switch to other models during conversations
}
This way you can use free local models for everyday tasks and switch to a cloud model when you need more power.
Wrapping Up
With Ollama, you can enjoy the convenience of an AI assistant without spending a dime or sending your data to external servers. For Chinese language users, the Qwen2.5 series is particularly recommended for its strong comprehension and generation capabilities. For more model options and advanced configuration, refer to the OpenClaw official documentation. If you run into issues, check the OpenClaw GitHub repository for solutions.