Cloud-based AI models are convenient, but every call costs money and your data travels to third-party servers. If privacy is a priority or you want to run an AI assistant at zero cost, running open-source models locally through Ollama is an excellent option. OpenClaw's self-hosted AI assistant has native Ollama support, and the setup is remarkably simple.
What Is Ollama?
Ollama is a local large language model runtime that wraps model downloading, management, and inference into a clean command-line interface and API service. It supports a wide range of open-source models — including Meta's Llama series, Alibaba's Qwen series, and Google's Gemma series — all completely free.
Step 1: Install Ollama
Choose the installation method for your operating system.
Linux (recommended):
curl -fsSL https://ollama.com/install.sh | sh
macOS:
Download the installer from ollama.com, or install via Homebrew:
brew install ollama
Windows:
Download the Windows installer from ollama.com and run it.
After installation, verify it's working:
ollama --version
Step 2: Download an AI Model
Ollama uses the pull command to download models. Here are some recommended choices:
Llama 3.1 8B (general-purpose English model, great for getting started):
ollama pull llama3.1
Qwen2.5 7B (strong Chinese language support, recommended for Chinese users):
ollama pull qwen2.5
Gemma 2 9B (by Google, well-balanced performance):
ollama pull gemma2
You can download multiple models and switch between them in OpenClaw at any time. To see which models you've downloaded:
ollama list
Once a model is downloaded, give it a quick test:
ollama run qwen2.5 "Hello, please introduce yourself"
If the model returns a response, Ollama is ready to go.
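If you maintain several models, re-running `pull` on one you already have just wastes a check. A small guarded sketch (assuming the `ollama` CLI is on your PATH; `ensure_model` is a hypothetical helper name, not an Ollama command):

```shell
# Pull a model only if "ollama list" does not already show it.
ensure_model() {
  model=$1
  if ollama list 2>/dev/null | grep -q "^$model"; then
    echo "$model already downloaded"
  else
    # Guarded so the script keeps going even if the CLI is missing.
    ollama pull "$model" || echo "pull failed (is ollama installed?)"
  fi
}

ensure_model qwen2.5
```

Call `ensure_model` once per model you want available, e.g. in a provisioning script for a fresh VPS.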
Step 3: Make Sure Ollama Is Running
Ollama runs as a background service, listening on http://localhost:11434 by default. Check the service status:
curl http://localhost:11434/api/tags
If you get back JSON data with a model list, the service is running. On Linux, Ollama typically runs as a systemd service that starts automatically:
sudo systemctl status ollama
sudo systemctl enable ollama # Enable auto-start on boot
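The same HTTP service that answers `/api/tags` also handles inference, which is what OpenClaw will use behind the scenes. A minimal sketch of a direct request (assuming the default port and that llama3.1 has been pulled; the call is guarded in case the service isn't up yet):

```shell
# Build a request for Ollama's /api/generate endpoint; "stream": false
# asks for one complete JSON response instead of a token stream.
payload='{"model": "llama3.1", "prompt": "Why is the sky blue?", "stream": false}'

# --max-time keeps the call from hanging if the service is down;
# "|| true" lets the script continue either way.
response=$(curl -s --max-time 5 http://localhost:11434/api/generate -d "$payload" || true)
echo "${response:-no response from Ollama}"
```

If this returns JSON with a `response` field, any remaining problems are on the OpenClaw side, not Ollama's.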
Step 4: Configure Ollama in OpenClaw
Open the OpenClaw configuration file:
nano ~/.config/openclaw/openclaw.json5
Add the Ollama configuration under the providers section:
{
providers: {
ollama: {
enabled: true,
baseUrl: "http://localhost:11434",
// Ollama doesn't require a real API key, but the field can't be empty
apiKey: "ollama",
}
},
// Set the default model
defaultModel: "ollama/qwen2.5",
}
The apiKey field can be any non-empty string — Ollama itself doesn't require authentication, but OpenClaw's configuration validation requires the field to exist.
Save the configuration and restart the Gateway:
openclaw gateway restart
Step 5: Verify and Test
Use the diagnostic tool to confirm Ollama is connected properly:
openclaw doctor
Look for the Ollama entry in the output and confirm its status is healthy. Then send a test message through one of your configured channels (such as Telegram or Discord).
You can also monitor the model's responses through the Dashboard:
openclaw dashboard
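If `openclaw doctor` flags the Ollama entry as unhealthy, it helps to rule OpenClaw out by calling Ollama's chat endpoint directly. A guarded sketch, assuming qwen2.5 has been pulled:

```shell
# Ask /api/chat for a single non-streaming reply; --max-time and "|| true"
# keep the script from hanging or aborting if the service is unreachable.
reply=$(curl -s --max-time 120 http://localhost:11434/api/chat -d '{
  "model": "qwen2.5",
  "messages": [{"role": "user", "content": "Reply with one word: ready"}],
  "stream": false
}' || true)

# An empty reply points at Ollama itself; a valid JSON reply means the
# problem is in the OpenClaw configuration instead.
echo "${reply:-no response from Ollama}"
```

This splits the debugging in half: Ollama reachable but OpenClaw unhealthy usually means a wrong `baseUrl` or model name in `openclaw.json5`.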
Hardware Requirements
Running AI models locally places real demands on your hardware, and RAM is usually the binding constraint. Here's a reference table for different model sizes:
| Model Size | Minimum RAM | Recommended RAM | Recommended GPU VRAM | Example Models |
|---|---|---|---|---|
| 3B params | 4 GB | 8 GB | 4 GB | llama3.2:3b |
| 7-8B params | 8 GB | 16 GB | 8 GB | qwen2.5, llama3.1 |
| 13B params | 16 GB | 32 GB | 12 GB | llama2:13b |
| 70B params | 64 GB | 128 GB | 48 GB+ | llama3.1:70b |
A few things to keep in mind:
- You don't need a dedicated GPU — Ollama will automatically fall back to CPU inference, though it will be noticeably slower
- NVIDIA GPU users should ensure they have the latest CUDA drivers installed
- Apple Silicon (M1/M2/M3/M4) devices work very well with Ollama, leveraging unified memory for efficient inference
- For VPS deployments, choose an instance with at least 8 GB of RAM to run 7B models
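To turn the table above into a quick rule of thumb, here is a small helper whose thresholds mirror the "Recommended RAM" column (the function name and cutoffs are this sketch's own, not anything Ollama ships):

```shell
# Map available system RAM (in GiB) to the largest model tier from the
# table that fits comfortably at the recommended-RAM level.
recommend_model() {
  ram_gb=$1
  if   [ "$ram_gb" -ge 128 ]; then echo "llama3.1:70b"
  elif [ "$ram_gb" -ge 32 ];  then echo "llama2:13b"
  elif [ "$ram_gb" -ge 16 ];  then echo "qwen2.5"
  else                             echo "llama3.2:3b"
  fi
}

recommend_model 16   # prints qwen2.5
```

On Linux you could feed it real numbers with something like `recommend_model "$(free -g | awk '/^Mem:/ {print $2}')"`.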
Switching Between Models
OpenClaw supports configuring multiple models and switching between them as needed. You can set up both cloud and local models side by side:
{
providers: {
anthropic: {
enabled: true,
apiKey: "sk-ant-xxxxx",
},
ollama: {
enabled: true,
baseUrl: "http://localhost:11434",
apiKey: "ollama",
}
},
defaultModel: "ollama/qwen2.5",
// You can switch to other models during conversations
}
This way you can use free local models for everyday tasks and switch to a cloud model when you need more power.
Wrapping Up
With Ollama, you can enjoy the convenience of an AI assistant without spending a dime or sending your data to external servers. For Chinese language users, the Qwen2.5 series is particularly recommended for its strong comprehension and generation capabilities. For more model options and advanced configuration, refer to the OpenClaw official documentation. If you run into issues, check the OpenClaw GitHub repository for solutions.