Introduction
AI models have limited context windows, and conversation history keeps growing with each interaction. How to retain the most relevant information within a limited context while controlling API call costs is a problem every AI gateway must solve. OpenClaw provides flexible history length controls and automatic compression mechanisms, helping you find the right balance between conversation quality and resource consumption.
Why Control History Length
Not controlling history length leads to the following problems:
- Cost Spiral: Every API call sends the full context history — the longer the history, the more tokens consumed
- Slower Responses: Models take longer to process longer contexts
- Information Noise: Conversation content from long ago may be irrelevant to the current topic, actually interfering with the model's judgment
- Exceeding Limits: Directly exceeding the model's context window causes API errors
Basic Configuration
Configure history length limits in the sessions section of openclaw.json:
{
"sessions": {
"maxHistoryMessages": 50,
"maxHistoryTokens": 8000
}
}
Dual Limit Mechanism
OpenClaw uses a dual limit on both message count and token count, whichever threshold is reached first:
- maxHistoryMessages: Maximum number of messages retained in context. Both user inputs and AI responses count toward this limit.
- maxHistoryTokens: Maximum number of tokens retained in context. OpenClaw uses the corresponding model's tokenizer for a precise count.
For example, with maxHistoryMessages: 50 and maxHistoryTokens: 8000, if the token count reaches 8000 when only 30 messages are stored, truncation or compression begins even though the message limit has not been reached.
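The dual limit amounts to a simple "whichever comes first" check. Below is an illustrative sketch, not OpenClaw's actual code; in particular, `count_tokens` is a stand-in for the model's real tokenizer, which OpenClaw would call instead:

```python
def count_tokens(message: str) -> int:
    # Placeholder: OpenClaw uses the model's own tokenizer here.
    # A crude approximation is ~1 token per 4 characters of text.
    return max(1, len(message) // 4)

def needs_compaction(messages, max_messages=50, max_tokens=8000):
    """Return True once either limit is hit -- whichever comes first."""
    total_tokens = sum(count_tokens(m) for m in messages)
    return len(messages) >= max_messages or total_tokens >= max_tokens
```

With the defaults above, 30 short messages pass both checks, while 10 very long messages trip the token limit first.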
Differentiated Configuration by Channel Type
The usage patterns of DMs and group chats differ significantly. DMs are typically continuous long conversations that benefit from retaining more context; group chats tend to be fragmented short interactions where historical value is relatively lower. OpenClaw supports setting different limits by channel type:
{
"sessions": {
"maxHistoryMessages": 50,
"maxHistoryTokens": 8000,
"channelOverrides": {
"dm": {
"maxHistoryMessages": 100,
"maxHistoryTokens": 16000
},
"group": {
"maxHistoryMessages": 20,
"maxHistoryTokens": 4000
}
}
}
}
Supported Channel Types
| Type ID | Description | Suggested Configuration |
|---|---|---|
| dm | Direct message / one-on-one conversation | Longer history (50-100 messages) |
| group | Group chat / multi-person conversation | Shorter history (10-30 messages) |
| channel | Channel messages (e.g., Discord channels) | Medium history (20-50 messages) |
| thread | Topics / threads | Medium history (20-50 messages) |
Per-Platform Configuration
You can also set independent limits for specific messaging platforms:
{
"sessions": {
"channelOverrides": {
"dm": {
"maxHistoryMessages": 100
},
"group": {
"maxHistoryMessages": 20
}
},
"platformOverrides": {
"telegram": {
"maxHistoryMessages": 80,
"maxHistoryTokens": 12000
},
"discord": {
"maxHistoryMessages": 30,
"maxHistoryTokens": 6000
}
}
}
}
Priority order: platformOverrides > channelOverrides > global defaults.
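The priority order can be illustrated with a small resolution function. This is a sketch under the assumption that overrides are shallow-merged in priority order; the field names mirror the config examples above:

```python
def resolve_limits(config: dict, platform: str, channel_type: str) -> dict:
    """Merge limits: platformOverrides > channelOverrides > global defaults."""
    sessions = config["sessions"]
    # Start from the global defaults (if present)...
    limits = {k: sessions[k]
              for k in ("maxHistoryMessages", "maxHistoryTokens")
              if k in sessions}
    # ...then let the channel-type override win over the defaults...
    limits.update(sessions.get("channelOverrides", {}).get(channel_type, {}))
    # ...and finally let the platform override win over everything.
    limits.update(sessions.get("platformOverrides", {}).get(platform, {}))
    return limits
```

With the config above, a Telegram DM would resolve to the telegram values (maxHistoryMessages: 80), since the platform override outranks the dm override.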
Automatic Compression Mechanisms
When conversation history approaches the limit, OpenClaw provides two handling strategies: simple truncation and intelligent compression.
Truncation Strategy
{
"sessions": {
"autoCompaction": true,
"compactionStrategy": "truncate"
}
}
Simply discards the oldest messages, retaining only the most recent N messages. This approach is simple and efficient with no additional API calls, but completely loses information from earlier conversations.
Workflow:
Original history: [msg1, msg2, msg3, ..., msg50, msg51]
After truncation: [msg22, msg23, ..., msg50, msg51] (keeping the most recent 30)
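The truncation loop can be sketched as follows (illustrative only; `count_tokens` is passed in as a stand-in for the model's tokenizer):

```python
def truncate_history(messages, max_messages, max_tokens, count_tokens):
    """Drop the oldest messages until both limits are satisfied."""
    kept = list(messages)
    while kept and (len(kept) > max_messages
                    or sum(count_tokens(m) for m in kept) > max_tokens):
        kept.pop(0)  # discard the oldest message first
    return kept
```

Because only the tail is kept, this needs no extra API calls, but anything dropped is gone for good.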
Summary Strategy
{
"sessions": {
"autoCompaction": true,
"compactionStrategy": "summary",
"compactionThreshold": 0.8,
"summaryMaxTokens": 500
}
}
When context usage reaches the compactionThreshold (default 80%), OpenClaw automatically sends the older conversation history to the model for summarization. The summary result is inserted as a system message at the beginning of the context, replacing the original history messages.
Workflow:
Step 1: Context usage detected at 80%
Step 2: Send the first 30 messages to the model, requesting a summary
Step 3: Model returns the summary text
Step 4: Replace the original 30 messages with the summary
Final context: [summary message, msg31, msg32, ..., msg51]
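The steps above can be sketched in a few lines. This is a simplified illustration, not OpenClaw's implementation: `summarize` stands in for the extra API call that asks the model for a summary, and `count_tokens` for the tokenizer:

```python
def compact_with_summary(messages, summarize, count_tokens,
                         threshold=0.8, context_limit=8000,
                         preserve_last_n=10):
    """Once usage crosses the threshold, replace older messages with a
    model-written summary inserted as a system message."""
    used = sum(count_tokens(m["content"]) for m in messages)
    if used < threshold * context_limit:
        return messages                      # below threshold: keep as-is
    old = messages[:-preserve_last_n]        # Step 2: older messages...
    recent = messages[-preserve_last_n:]     # ...and the tail to keep
    summary_text = summarize(old)            # Steps 2-3: one extra API call
    # Step 4: the summary replaces the old messages at the front.
    return [{"role": "system", "content": summary_text}] + recent
```

Note the trade-off: the summary costs one extra (usually cheap) API call, but every later request sends far fewer tokens.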
Summary Configuration Details
{
"sessions": {
"compaction": {
"strategy": "summary",
"threshold": 0.8,
"summaryMaxTokens": 500,
"summaryModel": "default",
"summaryPrompt": "Please compress the following conversation history into a concise summary, retaining key information and user preferences:",
"preserveSystemMessages": true,
"preserveLastN": 10
}
}
}
| Parameter | Description |
|---|---|
| threshold | Context usage ratio that triggers compression |
| summaryMaxTokens | Maximum token count for the summary text |
| summaryModel | Model used to generate summaries; "default" uses the current model |
| summaryPrompt | Custom summarization instruction |
| preserveSystemMessages | Whether to keep original system messages out of compression |
| preserveLastN | Always keep the most recent N messages out of compression |
Rolling Compression
For extremely long conversations, compression may trigger multiple times. Each pass merges the previous summary with the next batch of older messages to produce an updated summary. This process is called "rolling compression":
First compression: [msg1-30] → Summary A
Second compression: [Summary A, msg31-50] → Summary B
Third compression: [Summary B, msg51-70] → Summary C
As compressions accumulate, the earliest information is condensed further and further, though key information is typically preserved.
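The rolling scheme above can be sketched as a loop that folds the previous summary into each new batch (a simplified illustration; `summarize` again stands in for the summarization API call, and `batch` for however many old messages each pass compresses):

```python
def rolling_compact(history, summarize, batch=20):
    """Repeatedly fold the previous summary plus the next batch of old
    messages into an updated summary: A -> B -> C ..."""
    summary = None
    while len(history) > batch:
        chunk, history = history[:batch], history[batch:]
        # The previous summary (if any) is summarized together with the
        # next batch, so earlier context keeps flowing into each pass.
        inputs = ([summary] if summary else []) + chunk
        summary = summarize(inputs)
    return ([summary] if summary else []) + history
```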
Cost Estimation
Different history limit strategies have a significant impact on API costs. Here is a rough estimate:
| Configuration | Avg tokens per request | Monthly cost (100 messages/day) |
|---|---|---|
| maxHistoryMessages: 10 | ~2,000 | Lower |
| maxHistoryMessages: 50 | ~8,000 | Medium |
| maxHistoryMessages: 100 | ~15,000 | Higher |
| summary compression | ~3,000 + compression overhead | Medium-low |
Actual costs depend on model pricing, average message length, and usage frequency.
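The table's order-of-magnitude comparison follows from simple arithmetic. The sketch below assumes a hypothetical rate of $0.002 per 1K input tokens purely for illustration; substitute your model's actual pricing:

```python
def monthly_cost(avg_tokens_per_request, requests_per_day=100,
                 price_per_1k_tokens=0.002, days=30):
    """Rough monthly input-token cost:
    tokens/request * requests/day * days * price per 1K tokens."""
    total_tokens = avg_tokens_per_request * requests_per_day * days
    return total_tokens / 1000 * price_per_1k_tokens

# At the assumed $0.002 per 1K input tokens:
# maxHistoryMessages: 10  -> monthly_cost(2000)  == 12.0   ($12/month)
# maxHistoryMessages: 100 -> monthly_cost(15000) == 90.0   ($90/month)
```

Output tokens, summarization calls, and cached-prompt discounts are deliberately ignored here; the point is that cost scales roughly linearly with retained context.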
Manual History Management
In addition to automatic mechanisms, you can manually manage session history via the command line:
# View history statistics for a session
openclaw session stats --session telegram_123456
# Manually trigger compression
openclaw session compact --session telegram_123456
# Trim history but keep the session
openclaw session trim --session telegram_123456 --keep 10
# Completely clear a session
openclaw session clear --session telegram_123456
Users can also use built-in commands in chat to manage their own history:
/clear - Clear current conversation history
/compact - Manually trigger conversation compression
/history - View current history message count and token usage
Recommended Configurations
Personal Use (Cost Priority)
{
"sessions": {
"maxHistoryMessages": 20,
"maxHistoryTokens": 4000,
"autoCompaction": true,
"compactionStrategy": "truncate"
}
}
Daily Use (Balanced Approach)
{
"sessions": {
"maxHistoryMessages": 50,
"maxHistoryTokens": 8000,
"autoCompaction": true,
"compactionStrategy": "summary",
"channelOverrides": {
"dm": { "maxHistoryMessages": 80 },
"group": { "maxHistoryMessages": 20 }
}
}
}
Deep Conversations (Quality Priority)
{
"sessions": {
"maxHistoryMessages": 150,
"maxHistoryTokens": 32000,
"autoCompaction": true,
"compactionStrategy": "summary",
"compaction": {
"summaryMaxTokens": 1000,
"preserveLastN": 30
}
}
}
Summary
Conversation history length control is a core tool for cost management and conversation quality optimization in OpenClaw. By configuring differentiated history limits by channel type, combined with summary compression or simple truncation strategies, you can find the optimal balance for different scenarios. It is recommended to start with a moderate configuration and gradually tune based on actual token usage and conversation quality feedback.