Hugging Face Introduction
Hugging Face is the world's largest open-source AI model community, hosting hundreds of thousands of pretrained models. Through the Hugging Face Inference API, you can easily call these models from OpenClaw without building your own inference infrastructure.
Get an API Token
- Register and log in at huggingface.co
- Go to Settings -> Access Tokens
- Click "New token"
- Select the "Read" role, which is sufficient for inference (including on private models you have access to; "Write" is only needed if you also upload models)
- Copy the generated token
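Before storing the token, a quick local sanity check can catch copy-paste mistakes. This is an illustrative helper (not part of OpenClaw), assuming only that Hugging Face access tokens currently begin with the `hf_` prefix; it checks the shape of the string, not whether the token is actually valid.

```python
def looks_like_hf_token(token: str) -> bool:
    """Cheap shape check for a Hugging Face access token.

    Catches truncated or mis-pasted strings; only the API itself can
    confirm the token is valid.
    """
    return token.startswith("hf_") and len(token) > 3 and " " not in token


# A well-formed token shape passes; a string missing the prefix does not.
print(looks_like_hf_token("hf_abc123XYZ"))  # True
print(looks_like_hf_token("my-token"))      # False
```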
Basic Configuration
{
  "providers": {
    "huggingface": {
      "type": "openai",
      "baseUrl": "https://api-inference.huggingface.co/models/",
      "apiKey": "{{HF_API_TOKEN}}",
      "models": ["mistralai/Mistral-7B-Instruct-v0.3"]
    }
  }
}
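With a `baseUrl` ending in `/models/`, the raw Inference API route is formed by appending the model id to the base URL. A minimal sketch of that URL construction (a hypothetical helper for illustration, not OpenClaw's actual internals):

```python
def raw_inference_url(base_url: str, model: str) -> str:
    """Join the provider baseUrl and a model id into the raw
    Inference API request URL (illustrative sketch)."""
    return base_url.rstrip("/") + "/" + model


print(raw_inference_url(
    "https://api-inference.huggingface.co/models/",
    "mistralai/Mistral-7B-Instruct-v0.3",
))
# → https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.3
```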
openclaw secrets set HF_API_TOKEN "hf_your_token_here"
Using Inference Endpoints
For production, consider Hugging Face Inference Endpoints for more stable performance and lower latency:
{
  "providers": {
    "hf-endpoint": {
      "type": "openai",
      "baseUrl": "https://your-endpoint-id.us-east-1.aws.endpoints.huggingface.cloud/v1",
      "apiKey": "{{HF_API_TOKEN}}",
      "models": ["tgi"]
    }
  }
}
Recommended Models
| Model | Parameters | Use Case |
|---|---|---|
| mistralai/Mistral-7B-Instruct-v0.3 | 7B | General conversation |
| meta-llama/Llama-3.1-8B-Instruct | 8B | General conversation |
| microsoft/Phi-3-mini-4k-instruct | 3.8B | Lightweight conversation |
| Qwen/Qwen2.5-72B-Instruct | 72B | Chinese language scenarios |
Using TGI Format
Hugging Face's Text Generation Inference (TGI) service exposes an OpenAI-compatible API, so it can be configured as an OpenAI-type provider:
{
  "providers": {
    "hf-tgi": {
      "type": "openai",
      "baseUrl": "https://api-inference.huggingface.co/v1",
      "apiKey": "{{HF_API_TOKEN}}",
      "models": ["meta-llama/Llama-3.1-8B-Instruct"]
    }
  }
}
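Unlike the raw route above, the OpenAI-compatible `/v1` route keeps the model id out of the URL and puts it in the JSON request body instead. A sketch of the request a `type: "openai"` provider entry resolves to (hypothetical helper names; OpenClaw handles this internally):

```python
import json


def build_chat_request(base_url: str, api_key: str, model: str, user_message: str):
    """Assemble the URL, headers, and body of an OpenAI-style
    chat-completions request (illustrative sketch)."""
    url = base_url.rstrip("/") + "/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,  # model id travels in the body, not the URL
        "messages": [{"role": "user", "content": user_message}],
    }
    return url, headers, json.dumps(payload)


url, headers, body = build_chat_request(
    "https://api-inference.huggingface.co/v1",
    "hf_example_token",
    "meta-llama/Llama-3.1-8B-Instruct",
    "Hello!",
)
print(url)  # → https://api-inference.huggingface.co/v1/chat/completions
```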
Common Questions
Q: Does the free API have limits? The free Inference API has rate limits of approximately 30 requests per minute. For production, consider a Pro subscription or Inference Endpoints.
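If you stay on the free tier, a client-side guard can keep you under the quota instead of relying on 429 responses. A sliding-window limiter sketch (the ~30 requests/minute figure above is the assumption; tune `max_calls`/`period` to your actual quota):

```python
import time
from collections import deque


class RateLimiter:
    """Sliding-window rate limiter: allows at most max_calls
    requests per period seconds (illustrative sketch)."""

    def __init__(self, max_calls=30, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable for testing
        self.calls = deque()

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False
```

Call `allow()` before each request and wait (or queue) when it returns `False`.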
Q: Why are model responses slow? Free API models may be in a cold-start state. The first request may take tens of seconds to load the model. Inference Endpoints keep the model in memory.
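Because a cold model can fail or stall on the first request, retrying with exponential backoff is a common client-side workaround. A sketch, assuming only that `fn` is any callable that raises on a transient failure such as an HTTP 503 "model loading" response (hypothetical helper, not an OpenClaw API):

```python
import time


def call_with_retry(fn, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry fn with exponential backoff: 1s, 2s, 4s, ... between
    attempts; re-raise if every attempt fails."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```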
Summary
Hugging Face provides a rich selection of open-source models. Connecting them to OpenClaw via the Inference API or Inference Endpoints gives you flexible model choices for different scenarios while avoiding the complexity of self-hosted inference servers.