The Challenge of Context Management
AI models understand conversation history through the context window. A longer context yields more coherent answers, but also increases token consumption. OpenClaw provides multiple strategies to balance quality and cost.
Basic Configuration
{
  "sessions": {
    "maxHistory": 20,
    "maxTokens": 50000,
    "contextStrategy": "sliding-window"
  }
}
Context Strategies
Sliding Window
Keeps the most recent N turns of conversation, discarding older history:
{
  "sessions": {
    "contextStrategy": "sliding-window",
    "maxHistory": 20
  }
}
Pros: Simple, predictable memory usage
Cons: May lose important early context
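The sliding-window behavior can be sketched in a few lines of Python. This is an illustrative helper, not part of the OpenClaw API:

```python
def sliding_window(history, max_history):
    """Keep only the most recent max_history turns; older turns are discarded."""
    return history[-max_history:]

# With maxHistory = 3, only the last three turns survive:
turns = ["t1", "t2", "t3", "t4", "t5"]
print(sliding_window(turns, 3))  # ['t3', 't4', 't5']
```

The upside is obvious from the code: memory usage is bounded and fully predictable, but anything before the window, such as an early user preference, is gone.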
Smart Trim
Retains context based on relevance:
{
  "sessions": {
    "contextStrategy": "smart-trim",
    "maxTokens": 50000,
    "relevanceThreshold": 0.5
  }
}
How it works: Computes the relevance of each historical message to the current conversation, keeps highly relevant messages, and trims irrelevant ones from the middle.
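A minimal sketch of this selection logic, assuming each message already carries a relevance score (how the score is computed, e.g. by embedding similarity to the current turn, is out of scope here; the helper and field names are hypothetical):

```python
def smart_trim(history, max_tokens, threshold=0.5):
    """Keep messages whose relevance meets the threshold, newest first,
    until the token budget is exhausted; low-relevance messages in the
    middle of the conversation are dropped."""
    kept, used = [], 0
    for msg in reversed(history):  # walk from newest to oldest
        if msg["relevance"] >= threshold and used + msg["tokens"] <= max_tokens:
            kept.append(msg)
            used += msg["tokens"]
    kept.reverse()  # restore chronological order
    return kept

history = [
    {"text": "greeting",      "relevance": 0.2, "tokens": 10},
    {"text": "key decision",  "relevance": 0.9, "tokens": 40},
    {"text": "small talk",    "relevance": 0.1, "tokens": 15},
    {"text": "current topic", "relevance": 0.8, "tokens": 30},
]
print([m["text"] for m in smart_trim(history, max_tokens=100)])
# ['key decision', 'current topic']
```

Note that the important early "key decision" survives even though two newer but irrelevant messages are dropped, which is exactly what a sliding window cannot do.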
Summary Mode
Automatically generates a summary when the conversation exceeds a threshold:
{
  "sessions": {
    "contextStrategy": "summary",
    "summaryAfter": 15,
    "summaryModel": "fast",
    "summaryPrompt": "Summarize the following conversation into concise bullet points, preserving key information and user preferences.",
    "keepRecent": 5
  }
}
How it works:
- A summary is triggered once the conversation reaches 15 turns
- A lightweight model generates the conversation summary
- The summary replaces the older conversation history
- The 5 most recent turns are kept verbatim
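The steps above can be sketched as follows, with a stand-in function in place of the actual model call (all names here are illustrative, not OpenClaw internals):

```python
def summarize_history(history, summarize, summary_after=15, keep_recent=5):
    """Once the conversation exceeds summary_after turns, replace the
    older turns with one summary message and keep the most recent
    keep_recent turns verbatim. `summarize` stands in for the
    lightweight model call configured via summaryModel."""
    if len(history) <= summary_after:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [{"role": "system", "text": summarize(old)}] + recent

history = [f"turn {i}" for i in range(1, 17)]  # 16 turns, past the threshold
compressed = summarize_history(history, lambda msgs: f"summary of {len(msgs)} turns")
print(compressed[0])   # {'role': 'system', 'text': 'summary of 11 turns'}
print(len(compressed)) # 6
```

Sixteen turns collapse to six entries: one summary plus the five most recent turns, which keeps the immediate conversational flow intact while bounding cost.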
Hybrid Mode
Combines multiple strategies:
{
  "sessions": {
    "contextStrategy": "hybrid",
    "phases": [
      {"maxMessages": 10, "strategy": "full"},
      {"maxMessages": 30, "strategy": "smart-trim"},
      {"maxMessages": 999, "strategy": "summary"}
    ]
  }
}
Full history for the first 10 turns, smart trimming for turns 10-30, and summarization beyond 30.
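Phase selection presumably picks the first phase whose maxMessages covers the current conversation length; a minimal sketch of that lookup (hypothetical helper, not the actual OpenClaw implementation):

```python
def pick_strategy(n_messages, phases):
    """Return the strategy of the first phase whose maxMessages bound
    covers the current message count."""
    for phase in phases:
        if n_messages <= phase["maxMessages"]:
            return phase["strategy"]
    return phases[-1]["strategy"]  # fall back to the last phase

phases = [
    {"maxMessages": 10,  "strategy": "full"},
    {"maxMessages": 30,  "strategy": "smart-trim"},
    {"maxMessages": 999, "strategy": "summary"},
]
print(pick_strategy(8, phases))    # full
print(pick_strategy(25, phases))   # smart-trim
print(pick_strategy(120, phases))  # summary
```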
System Prompt Optimization
System prompts also consume context space and should be kept concise:
{
  "agents": {
    "main": {
      "systemPrompt": "You are an AI assistant. Be concise; elaborate only when necessary.",
      "dynamicPrompt": {
        "base": "You are an AI assistant.",
        "additions": [
          {"condition": "tools_available", "text": "You can use the following tools: ..."},
          {"condition": "user_is_new", "text": "This is a new user. Please guide them kindly."}
        ]
      }
    }
  }
}
Dynamic prompts only add extra content when needed, reducing unnecessary token consumption.
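Conceptually, assembling a dynamic prompt is just conditional concatenation; a sketch under that assumption (the helper and the session-state dict are illustrative):

```python
def build_prompt(base, additions, state):
    """Assemble the system prompt from the base plus every addition
    whose condition flag is true in the current session state."""
    parts = [base]
    parts += [a["text"] for a in additions if state.get(a["condition"])]
    return " ".join(parts)

additions = [
    {"condition": "tools_available", "text": "You can use the following tools: ..."},
    {"condition": "user_is_new", "text": "This is a new user. Please guide them kindly."},
]
print(build_prompt("You are an AI assistant.", additions, {"user_is_new": True}))
# You are an AI assistant. This is a new user. Please guide them kindly.
```

A returning user with no tools pays only for the base prompt; the additions cost tokens only in the sessions that actually need them.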
Combining with Vector Memory
Store long-term information in vector memory, freeing it from the context:
{
  "sessions": {
    "autoMemorize": {
      "enabled": true,
      "triggerWords": ["remember", "from now on", "always"],
      "extractFacts": true
    },
    "autoRecall": {
      "enabled": true,
      "topK": 3,
      "threshold": 0.8
    }
  }
}
When a user says "Remember that I prefer concise answers," this preference is stored in vector memory and no longer needs to be kept in the conversation history.
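The trigger-word side of autoMemorize can be sketched as a simple substring check (the real detection may be more sophisticated; this helper is purely illustrative):

```python
def should_memorize(message, trigger_words):
    """Detect whether a user message should be written to vector memory,
    based on the configured triggerWords list."""
    text = message.lower()
    return any(word in text for word in trigger_words)

triggers = ["remember", "from now on", "always"]
print(should_memorize("Remember that I prefer concise answers", triggers))  # True
print(should_memorize("What's the weather today?", triggers))               # False
```

On later turns, autoRecall would then retrieve the top 3 stored facts whose similarity to the current message clears the 0.8 threshold and inject them into the context.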
Context Budget
Allocate token budgets for different components:
{
  "sessions": {
    "tokenBudget": {
      "total": 100000,
      "systemPrompt": 2000,
      "memory": 3000,
      "history": 90000,
      "tools": 5000
    }
  }
}
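One way to reason about such a budget is to check each component's usage against its allocation; a hypothetical checker, not an OpenClaw API:

```python
def check_budget(usage, budget):
    """Return the components that exceed their per-component allocation,
    and whether total usage still fits within the overall budget."""
    over = {k: usage[k] - budget[k]
            for k in budget if k != "total" and usage.get(k, 0) > budget[k]}
    total_used = sum(usage.values())
    return over, total_used <= budget["total"]

budget = {"total": 100000, "systemPrompt": 2000, "memory": 3000,
          "history": 90000, "tools": 5000}
usage = {"systemPrompt": 1800, "memory": 3500, "history": 60000, "tools": 4000}
over, within_total = check_budget(usage, budget)
print(over)          # {'memory': 500}
print(within_total)  # True
```

Here memory runs 500 tokens over its 3,000-token share even though the session as a whole is comfortably under the 100,000-token total, which is exactly the kind of imbalance a per-component budget surfaces.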
Monitor Context Usage
openclaw sessions context-stats
Context Usage Statistics:
Avg context length: 12,500 tokens
Max context length: 85,000 tokens
Avg history messages: 15
Context overflow events: 3 (auto-trimmed)
Estimated context cost: $1.25/day
Per-Channel Customization
Different channels can use different context strategies:
{
  "channels": {
    "telegram-main": {
      "session": {
        "maxHistory": 30,
        "contextStrategy": "hybrid"
      }
    },
    "whatsapp-quick": {
      "session": {
        "maxHistory": 5,
        "contextStrategy": "sliding-window"
      }
    }
  }
}
Customer support scenarios benefit from longer history to understand the full issue, while quick Q&A scenarios need only minimal context.
Manual Management
Users can manage their own conversation context through commands:
User: /clear
AI: Conversation history cleared.
User: /context
AI: The current conversation contains 15 messages, using 8,500 tokens.
{
  "channels": {
    "telegram-main": {
      "commands": {
        "/clear": "session.clear",
        "/context": "session.info"
      }
    }
  }
}
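The command mapping above amounts to routing a chat command to a session action; a minimal dispatch sketch, with hypothetical action names mirroring the config (not the actual OpenClaw dispatcher):

```python
def handle_command(cmd, session, commands):
    """Route a chat command like /clear to its configured session action."""
    action = commands.get(cmd)
    if action == "session.clear":
        session["history"].clear()
        return "Conversation history cleared."
    if action == "session.info":
        n = len(session["history"])
        toks = sum(m["tokens"] for m in session["history"])
        return f"The current conversation contains {n} messages, using {toks} tokens."
    return "Unknown command."

commands = {"/clear": "session.clear", "/context": "session.info"}
session = {"history": [{"tokens": 500}, {"tokens": 300}]}
print(handle_command("/context", session, commands))
# The current conversation contains 2 messages, using 800 tokens.
```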
Summary
Context management is a critical part of OpenClaw performance and cost optimization. Choose the right strategy for each scenario -- sliding window for simple chats, summary mode for long-running services, vector memory for unlimited long-term recall -- to maintain a great conversational experience while effectively controlling token costs.