Introduction
Conversation data is one of the most valuable assets in AI Agent operations. By analyzing conversation records, you can understand users' real needs, identify agent shortcomings, improve prompt effectiveness, and track costs. OpenClaw stores all session data in JSONL format and provides flexible export tools with rich analysis dimensions.
This article covers the JSONL session file structure in detail, export methods, and practical data analysis scenarios.
JSONL Session File Structure
File Location
OpenClaw's session data is stored by default at:
~/.openclaw/agents/<agentId>/sessions/
├── session-abc123.jsonl
├── session-def456.jsonl
└── session-ghi789.jsonl
Message Format
Each line is an independent JSON object representing a single message:
{"id":"msg_001","parentId":null,"role":"user","content":"Hello","timestamp":1710400000,"metadata":{"userId":"user_123","platform":"telegram","channelType":"dm"}}
{"id":"msg_002","parentId":"msg_001","role":"assistant","content":"Hello! How can I help you?","timestamp":1710400002,"metadata":{"model":"claude-sonnet-4-20250514","inputTokens":45,"outputTokens":12}}
{"id":"msg_003","parentId":"msg_002","role":"user","content":"Write me some Python code","timestamp":1710400010,"metadata":{"userId":"user_123"}}
{"id":"msg_004","parentId":"msg_003","role":"assistant","content":"Sure, here's a sample...","timestamp":1710400015,"metadata":{"model":"claude-sonnet-4-20250514","inputTokens":120,"outputTokens":280,"toolsUsed":["run_code"]}}
Message Field Reference
| Field | Type | Description |
|---|---|---|
| id | string | Unique message ID |
| parentId | string/null | Parent message ID, forming a tree structure |
| role | string | Role: user / assistant / system / tool |
| content | string | Message content |
| timestamp | number | Unix timestamp (seconds) |
| type | string | Special type: compaction / edit / branch |
| metadata | object | Metadata (see fields below) |
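Because each record carries a parentId, the flat JSONL file can be rebuilt into the conversation tree described above. A minimal sketch (the helper names are mine, not part of OpenClaw):

```python
from collections import defaultdict

def build_tree(messages):
    """Group messages by parentId so branches can be traversed."""
    children = defaultdict(list)
    for msg in messages:
        children[msg.get("parentId")].append(msg)
    return children  # children[None] holds the root message(s)

def walk(children, parent_id=None, depth=0):
    """Depth-first walk over one conversation tree, printing each message."""
    for msg in children.get(parent_id, []):
        print("  " * depth + f'{msg["role"]}: {msg["content"][:40]}')
        walk(children, msg["id"], depth + 1)
```

A parent with more than one child marks a branch point (for example, after an edit or regeneration).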
Metadata Fields
{
"metadata": {
"userId": "user_123",
"platform": "telegram",
"channelType": "dm",
"channelId": "chat_456",
"model": "claude-sonnet-4-20250514",
"inputTokens": 150,
"outputTokens": 320,
"totalTokens": 470,
"latencyMs": 2340,
"toolsUsed": ["search", "run_code"],
"costUsd": 0.0023
}
}
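When costUsd is recorded, spend can be aggregated straight from the metadata. A small sketch that sums cost per model, assuming the metadata layout shown above:

```python
from collections import defaultdict

def cost_by_model(messages):
    """Sum costUsd per model across assistant messages."""
    totals = defaultdict(float)
    for msg in messages:
        meta = msg.get("metadata", {})
        if msg.get("role") == "assistant" and "costUsd" in meta:
            totals[meta.get("model", "unknown")] += meta["costUsd"]
    return dict(totals)
```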
Exporting Conversation Records
Command-Line Export
# Export all sessions for an Agent
openclaw export --agent my-agent --output ./export/
# Export a specific time range
openclaw export --agent my-agent \
--from "2026-03-01" --to "2026-03-14" \
--output ./export/march.jsonl
# Export as CSV format
openclaw export --agent my-agent \
--format csv --output ./export/conversations.csv
# Export as JSON format (with full tree structure)
openclaw export --agent my-agent \
--format json --output ./export/conversations.json
# Export only a specific user's conversations
openclaw export --agent my-agent \
--user-id "user_123" --output ./export/user123.jsonl
# Export only conversations from a specific platform
openclaw export --agent my-agent \
--platform telegram --output ./export/telegram.jsonl
API Export
# Export via API
curl -X GET "http://localhost:3000/api/v1/export/conversations" \
-H "Authorization: Bearer sk-openclaw-xxx" \
-G -d "agentId=my-agent" \
-d "from=2026-03-01" \
-d "to=2026-03-14" \
-d "format=jsonl" \
-o conversations.jsonl
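The same endpoint can be scripted from Python with only the standard library. A sketch based on the curl call above (host, path, and parameters are taken from that example; adjust for your deployment):

```python
import urllib.parse
import urllib.request

def build_export_url(agent_id, start, end, base_url="http://localhost:3000"):
    """Build the export API URL with its query parameters."""
    query = urllib.parse.urlencode({
        "agentId": agent_id, "from": start, "to": end, "format": "jsonl",
    })
    return f"{base_url}/api/v1/export/conversations?{query}"

def export_conversations(agent_id, start, end, token, dest="conversations.jsonl"):
    """Download a JSONL export (requires a running OpenClaw server)."""
    req = urllib.request.Request(
        build_export_url(agent_id, start, end),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp, open(dest, "wb") as f:
        f.write(resp.read())
```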
Dashboard Export
The OpenClaw Web Dashboard provides a visual export interface:
- Go to Dashboard > Session Management
- Set filter criteria (time range, Agent, platform, etc.)
- Click the "Export" button
- Choose the format (JSONL / CSV / JSON)
- Download the file
Data Analysis in Practice
Python Analysis Scripts
Basic Data Loading
import json
from datetime import datetime
from collections import Counter, defaultdict
def load_sessions(filepath):
messages = []
with open(filepath, "r", encoding="utf-8") as f:
for line in f:
if line.strip():
messages.append(json.loads(line))
return messages
messages = load_sessions("./export/conversations.jsonl")
print(f"Total messages: {len(messages)}")
Analysis 1: Token Consumption Statistics
def analyze_token_usage(messages):
total_input = 0
total_output = 0
daily_usage = defaultdict(lambda: {"input": 0, "output": 0})
for msg in messages:
if msg["role"] == "assistant" and "metadata" in msg:
meta = msg["metadata"]
input_t = meta.get("inputTokens", 0)
output_t = meta.get("outputTokens", 0)
total_input += input_t
total_output += output_t
date = datetime.fromtimestamp(msg["timestamp"]).strftime("%Y-%m-%d")
daily_usage[date]["input"] += input_t
daily_usage[date]["output"] += output_t
print(f"Total input tokens: {total_input:,}")
print(f"Total output tokens: {total_output:,}")
print(f"Daily average input tokens: {total_input // max(len(daily_usage), 1):,}")
return daily_usage
usage = analyze_token_usage(messages)
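If older records lack a costUsd field, spend can still be estimated from the token counts. The per-million-token prices below are placeholder assumptions; substitute your provider's actual rates:

```python
# Hypothetical per-million-token prices; replace with your provider's real rates.
PRICES = {"claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00}}

def estimate_cost(messages, prices=PRICES):
    """Estimate USD spend from token counts in message metadata."""
    total = 0.0
    for msg in messages:
        meta = msg.get("metadata", {})
        rates = prices.get(meta.get("model"))
        if msg.get("role") == "assistant" and rates:
            total += meta.get("inputTokens", 0) / 1e6 * rates["input"]
            total += meta.get("outputTokens", 0) / 1e6 * rates["output"]
    return total
```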
Analysis 2: User Activity
def analyze_user_activity(messages):
user_messages = Counter()
for msg in messages:
if msg["role"] == "user" and "metadata" in msg:
uid = msg["metadata"].get("userId", "unknown")
user_messages[uid] += 1
print("Top 10 active users:")
for uid, count in user_messages.most_common(10):
print(f" {uid}: {count} messages")
return user_messages
activity = analyze_user_activity(messages)
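The same fields also support a daily-active-users view. A sketch that counts distinct userIds per UTC day:

```python
from collections import defaultdict
from datetime import datetime, timezone

def daily_active_users(messages):
    """Count distinct userIds per UTC day."""
    users_by_day = defaultdict(set)
    for msg in messages:
        if msg.get("role") == "user" and "metadata" in msg:
            uid = msg["metadata"].get("userId")
            if uid:
                day = datetime.fromtimestamp(
                    msg["timestamp"], tz=timezone.utc
                ).strftime("%Y-%m-%d")
                users_by_day[day].add(uid)
    return {day: len(users) for day, users in sorted(users_by_day.items())}
```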
Analysis 3: Topic Categorization
def categorize_topics(messages):
"""Simple keyword-based categorization"""
categories = {
"Technical Issues": ["code", "bug", "error", "deploy", "config"],
"Product Inquiries": ["price", "feature", "compare", "trial"],
"Usage Help": ["how to", "tutorial", "steps", "guide"],
"Feedback": ["suggest", "wish", "improve", "doesn't work"]
}
results = Counter()
for msg in messages:
if msg["role"] == "user":
content = msg["content"]
for category, keywords in categories.items():
if any(kw in content for kw in keywords):
results[category] += 1
break
return results
topics = categorize_topics(messages)
for topic, count in topics.most_common():
print(f" {topic}: {count}")
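Plain substring matching can misfire (for example, "trial" also matches "industrial"). A word-boundary variant using regular expressions, taking the same categories dict as input:

```python
import re
from collections import Counter

def categorize_topics_regex(messages, categories):
    """Keyword categorization with word-boundary matching."""
    patterns = {
        cat: re.compile(r"\b(" + "|".join(map(re.escape, kws)) + r")\b", re.I)
        for cat, kws in categories.items()
    }
    results = Counter()
    for msg in messages:
        if msg.get("role") == "user":
            for cat, pat in patterns.items():
                if pat.search(msg["content"]):
                    results[cat] += 1
                    break
    return results
```

Note that word boundaries suit space-delimited languages; for languages written without spaces, substring or tokenizer-based matching remains the practical choice.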
Analysis 4: Response Latency Distribution
def analyze_latency(messages):
latencies = []
for msg in messages:
if msg["role"] == "assistant" and "metadata" in msg:
latency = msg["metadata"].get("latencyMs")
if latency:
latencies.append(latency)
if latencies:
latencies.sort()
print(f"Average latency: {sum(latencies)/len(latencies):.0f}ms")
print(f"P50 latency: {latencies[len(latencies)//2]:.0f}ms")
print(f"P95 latency: {latencies[int(len(latencies)*0.95)]:.0f}ms")
print(f"P99 latency: {latencies[int(len(latencies)*0.99)]:.0f}ms")
analyze_latency(messages)
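The hand-rolled percentiles above can be cross-checked against the standard library. A sketch using statistics.quantiles:

```python
import statistics

def latency_percentiles(latencies):
    """P50/P95/P99 via statistics.quantiles (99 cut points for n=100)."""
    qs = statistics.quantiles(latencies, n=100)
    return {"p50": qs[49], "p95": qs[94], "p99": qs[98]}
```

Unlike the index-based approach, quantiles interpolates between neighboring samples, so results can differ slightly on small datasets.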
Analysis 5: Tool Usage Statistics
def analyze_tool_usage(messages):
tool_counts = Counter()
for msg in messages:
if msg["role"] == "assistant" and "metadata" in msg:
tools = msg["metadata"].get("toolsUsed", [])
for tool in tools:
tool_counts[tool] += 1
print("Tool usage frequency:")
for tool, count in tool_counts.most_common():
print(f" {tool}: {count} times")
analyze_tool_usage(messages)
Data Visualization
Plotting with matplotlib
import matplotlib.pyplot as plt
def plot_daily_usage(daily_usage):
dates = sorted(daily_usage.keys())
input_tokens = [daily_usage[d]["input"] for d in dates]
output_tokens = [daily_usage[d]["output"] for d in dates]
fig, ax = plt.subplots(figsize=(12, 6))
ax.bar(dates, input_tokens, label="Input Tokens", alpha=0.7)
ax.bar(dates, output_tokens, bottom=input_tokens,
label="Output Tokens", alpha=0.7)
ax.set_xlabel("Date")
ax.set_ylabel("Tokens")
ax.set_title("Daily Token Consumption Trend")
ax.legend()
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig("daily_token_usage.png", dpi=150)
plot_daily_usage(usage)
Privacy and Compliance
Anonymized Export
# Export with automatic anonymization
openclaw export --agent my-agent \
--anonymize \
--redact-patterns "phone,email,id_card" \
--output ./export/anonymized.jsonl
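If you need to post-process an already-exported file yourself, a redaction pass might look like the sketch below. The patterns are illustrative only; the built-in --anonymize flag above is the supported path:

```python
import json
import re

# Illustrative patterns; tune for your locale and data.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\+?\d[\d\s-]{7,}\d"), "[PHONE]"),
]

def redact_line(line):
    """Redact PII-like substrings in one JSONL record's content field."""
    record = json.loads(line)
    content = record.get("content", "")
    for pattern, label in REDACTIONS:
        content = pattern.sub(label, content)
    record["content"] = content
    return json.dumps(record, ensure_ascii=False)
```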
Data Retention Policy
{
storage: {
retention: {
// Session data retention in days
sessionTTL: 90, // Auto-cleanup after 90 days
// Export data retention in days
exportTTL: 30,
// Whether to auto-archive before deletion
archiveBeforeDelete: true,
archiveDir: "./archive/"
}
}
}
Summary
OpenClaw's JSONL session storage format is simple but powerful: one message per line in easy-to-parse JSON, with a parentId-based tree structure that fully captures conversation branches. After exporting data via the command line, API, or Dashboard, you can use Python and other tools for in-depth analysis of token consumption, user activity, topic distribution, and response latency, giving you a comprehensive view of your AI Agent's operational status. These insights will help you continuously optimize your Agent's performance and cost efficiency.