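OpenClaw Conversation Export and Data Analysis

Introduction

Conversation data is one of the most valuable assets in operating an AI agent. Analyzing conversation logs lets you understand what users actually need, identify where the agent falls short, tune prompt effectiveness, and calculate cost. OpenClaw stores all session data in JSONL format and provides flexible export tools and a rich set of analysis dimensions.

This article covers the JSONL session file structure, export methods, and practical data analysis scenarios.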

JSONL Session File Structure

File Location

By default, OpenClaw stores session data in the following location:

~/.openclaw/agents/<agentId>/sessions/
├── session-abc123.jsonl
├── session-def456.jsonl
└── session-ghi789.jsonl
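
Given that layout, the session files for one agent can be enumerated with the standard library. The following is a minimal sketch, assuming the default `~/.openclaw` root shown above; `iter_session_files` and its `root` parameter are illustrative names, not part of OpenClaw.

```python
from pathlib import Path

def iter_session_files(agent_id, root=None):
    """Yield session JSONL paths for one agent, sorted by file name."""
    # Assumed layout (from the listing above): <root>/agents/<agentId>/sessions/*.jsonl
    base = Path(root) if root is not None else Path.home() / ".openclaw"
    sessions_dir = base / "agents" / agent_id / "sessions"
    if not sessions_dir.is_dir():
        return  # no sessions directory for this agent: yield nothing
    yield from sorted(sessions_dir.glob("*.jsonl"))

# Prints one file name per session; prints nothing if the directory is absent.
for path in iter_session_files("my-agent"):
    print(path.name)
```

Passing `root` explicitly makes the helper usable against an exported or archived copy of the directory.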

Message Format

Each line is a standalone JSON object that represents one message:

{"id":"msg_001","parentId":null,"role":"user","content":"Hello","timestamp":1710400000,"metadata":{"userId":"user_123","platform":"telegram","channelType":"dm"}}
{"id":"msg_002","parentId":"msg_001","role":"assistant","content":"Hello! How can I help you?","timestamp":1710400002,"metadata":{"model":"claude-sonnet-4-20250514","inputTokens":45,"outputTokens":12}}
{"id":"msg_003","parentId":"msg_002","role":"user","content":"Write me some Python code","timestamp":1710400010,"metadata":{"userId":"user_123"}}
{"id":"msg_004","parentId":"msg_003","role":"assistant","content":"Sure, here's a sample...","timestamp":1710400015,"metadata":{"model":"claude-sonnet-4-20250514","inputTokens":120,"outputTokens":280,"toolsUsed":["run_code"]}}

Message Field Reference

Field	Type	Description
`id`	string	Unique message ID
`parentId`	string/null	Parent message ID; links messages into a tree
`role`	string	Role: user / assistant / system / tool
`content`	string	Message content
`timestamp`	number	Unix timestamp (seconds)
`type`	string	Special type: compaction / edit / branch
`metadata`	object	Metadata
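
Because each message points at its parent through `parentId`, a session can be rebuilt as a tree, and any branch can be read by walking from a leaf back to the root. The sketch below assumes only the `id` and `parentId` fields described above; `build_children_index` and `path_to_root` are hypothetical helper names.

```python
from collections import defaultdict

def build_children_index(messages):
    """Map each parentId (None for root messages) to its children, in file order."""
    children = defaultdict(list)
    for msg in messages:
        children[msg.get("parentId")].append(msg)
    return children

def path_to_root(messages, leaf_id):
    """Return the message ids from the root down to leaf_id by following parentId."""
    by_id = {m["id"]: m for m in messages}
    path, seen = [], set()
    current = by_id.get(leaf_id)
    # The seen set guards against a malformed file whose parent links form a cycle.
    while current is not None and current["id"] not in seen:
        seen.add(current["id"])
        path.append(current["id"])
        current = by_id.get(current.get("parentId"))
    return path[::-1]

sample = [
    {"id": "msg_001", "parentId": None, "role": "user"},
    {"id": "msg_002", "parentId": "msg_001", "role": "assistant"},
    {"id": "msg_003", "parentId": "msg_001", "role": "assistant"},  # a second branch
]
children = build_children_index(sample)
print([m["id"] for m in children["msg_001"]])  # ['msg_002', 'msg_003']
print(path_to_root(sample, "msg_003"))         # ['msg_001', 'msg_003']
```

A parent with more than one child marks a point where the conversation branched.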

Metadata Fields

{
  "metadata": {
    "userId": "user_123",
    "platform": "telegram",
    "channelType": "dm",
    "channelId": "chat_456",
    "model": "claude-sonnet-4-20250514",
    "inputTokens": 150,
    "outputTokens": 320,
    "totalTokens": 470,
    "latencyMs": 2340,
    "toolsUsed": ["search", "run_code"],
    "costUsd": 0.0023
  }
}
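
These fields are enough to roll up usage and cost per model. The following is a sketch under two assumptions that the source does not confirm: that `costUsd` is recorded per assistant message, and that `totalTokens` may be missing, in which case the sum of `inputTokens` and `outputTokens` is used. `summarize_cost_by_model` is an illustrative name.

```python
from collections import defaultdict

def summarize_cost_by_model(messages):
    """Aggregate call count, tokens, and cost per model from assistant messages."""
    totals = defaultdict(lambda: {"calls": 0, "tokens": 0, "costUsd": 0.0})
    for msg in messages:
        meta = msg.get("metadata") or {}
        model = meta.get("model")
        if msg.get("role") != "assistant" or model is None:
            continue
        tokens = meta.get("totalTokens")
        if tokens is None:
            # Assumption: fall back to input + output when totalTokens is absent.
            tokens = meta.get("inputTokens", 0) + meta.get("outputTokens", 0)
        entry = totals[model]
        entry["calls"] += 1
        entry["tokens"] += tokens
        entry["costUsd"] += meta.get("costUsd", 0.0)
    return dict(totals)

sample = [
    {"role": "assistant", "metadata": {"model": "claude-sonnet-4-20250514",
                                       "inputTokens": 150, "outputTokens": 320,
                                       "totalTokens": 470, "costUsd": 0.0023}},
]
for model, stats in summarize_cost_by_model(sample).items():
    print(model, stats["calls"], stats["tokens"], f"{stats['costUsd']:.4f}")
```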

Exporting Conversation History

Command-Line Export

# Export all sessions for an agent
openclaw export --agent my-agent --output ./export/

# Export a specific time range
openclaw export --agent my-agent \
  --from "2026-03-01" --to "2026-03-14" \
  --output ./export/march.jsonl

# Export as CSV
openclaw export --agent my-agent \
  --format csv --output ./export/conversations.csv

# Export as JSON (includes the full tree structure)
openclaw export --agent my-agent \
  --format json --output ./export/conversations.json

# Export only one user's conversations
openclaw export --agent my-agent \
  --user-id "user_123" --output ./export/user123.jsonl

# Export only one platform's conversations
openclaw export --agent my-agent \
  --platform telegram --output ./export/telegram.jsonl
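
When a full export already exists locally, a similar time-range filter can be applied afterwards in Python. This sketch is not how the CLI works internally. It assumes dates are interpreted as UTC midnight and that the range is half-open (start included, end excluded), neither of which the source specifies for `--from` and `--to`. `to_epoch` and `filter_by_time` are hypothetical names.

```python
from datetime import datetime, timezone

def to_epoch(date_str):
    """Convert a YYYY-MM-DD date to a Unix timestamp at 00:00 UTC (an assumption)."""
    parsed = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return parsed.timestamp()

def filter_by_time(messages, start, end):
    """Keep messages whose timestamp (seconds) falls in [start, end)."""
    lo, hi = to_epoch(start), to_epoch(end)
    return [m for m in messages if lo <= m["timestamp"] < hi]

sample = [{"id": "msg_001", "timestamp": 1710400000}]  # 2024-03-14 07:06:40 UTC
print(len(filter_by_time(sample, "2024-03-14", "2024-03-15")))  # 1
print(len(filter_by_time(sample, "2024-03-15", "2024-03-16")))  # 0
```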

API Export

# Export through the API
curl -X GET "http://localhost:3000/api/v1/export/conversations" \
  -H "Authorization: Bearer sk-openclaw-xxx" \
  -G -d "agentId=my-agent" \
  -d "from=2026-03-01" \
  -d "to=2026-03-14" \
  -d "format=jsonl" \
  -o conversations.jsonl
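
The same request can be issued from Python with the standard library. The sketch below reuses the endpoint, query parameters, and bearer header from the curl command above. The helper names `build_export_request` and `download_export` are illustrative, and the response is assumed to be the raw export file.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def build_export_request(base_url, api_key, agent_id, date_from, date_to, fmt="jsonl"):
    """Build the GET request for the export endpoint without sending it."""
    query = urlencode({"agentId": agent_id, "from": date_from,
                       "to": date_to, "format": fmt})
    url = f"{base_url}/api/v1/export/conversations?{query}"
    return Request(url, headers={"Authorization": f"Bearer {api_key}"}, method="GET")

def download_export(request, dest_path, timeout=30):
    """Send the request and stream the response body to dest_path in chunks."""
    with urlopen(request, timeout=timeout) as resp, open(dest_path, "wb") as out:
        for chunk in iter(lambda: resp.read(65536), b""):
            out.write(chunk)

req = build_export_request("http://localhost:3000", "sk-openclaw-xxx",
                           "my-agent", "2026-03-01", "2026-03-14")
print(req.full_url)
# To actually download (requires a running server):
# download_export(req, "conversations.jsonl")
```

Separating request construction from the download keeps the URL logic easy to check without a running server.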

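Dashboard Export

The OpenClaw Web Dashboard provides a visual export interface:

1. Go to Dashboard > Session Management
2. Set the filters (time range, agent, platform, and so on)
3. Click the "Export" button
4. Choose a format (JSONL / CSV / JSON)
5. Download the file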

Data Analysis in Practice

Python Analysis Scripts

Basic Data Loading

import json
from datetime import datetime
from collections import Counter, defaultdict

def load_sessions(filepath):
    messages = []
    with open(filepath, "r", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                messages.append(json.loads(line))
    return messages

messages = load_sessions("./export/conversations.jsonl")
print(f"Total messages: {len(messages)}")
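
The loader above raises on the first malformed line, because `json.loads` throws `json.JSONDecodeError`. For exports that may contain a truncated or corrupted line, a more tolerant variant can collect errors instead of stopping. This is a sketch, and `parse_jsonl` is a hypothetical name. It accepts any iterable of lines, including an open file.

```python
import json

def parse_jsonl(lines):
    """Parse JSONL lines; return (messages, errors) with errors as (line_number, reason)."""
    messages, errors = [], []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines, as the basic loader does
        try:
            messages.append(json.loads(line))
        except json.JSONDecodeError as exc:
            errors.append((lineno, exc.msg))
    return messages, errors

raw = [
    '{"id":"msg_001","role":"user","content":"Hello"}',
    '',
    '{"id":"msg_002","role":"assistant","content":',  # truncated line
]
parsed, problems = parse_jsonl(raw)
print(len(parsed), [lineno for lineno, _ in problems])  # 1 [3]
```

With a file, the call becomes `parse_jsonl(open(path, encoding="utf-8"))` inside a `with` block.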

Analysis 1: Token Consumption Statistics

def analyze_token_usage(messages):
    total_input = 0
    total_output = 0
    daily_usage = defaultdict(lambda: {"input": 0, "output": 0})

    for msg in messages:
        if msg["role"] == "assistant" and "metadata" in msg:
            meta = msg["metadata"]
            input_t = meta.get("inputTokens", 0)
            output_t = meta.get("outputTokens", 0)
            total_input += input_t
            total_output += output_t

            date = datetime.fromtimestamp(msg["timestamp"]).strftime("%Y-%m-%d")
            daily_usage[date]["input"] += input_t
            daily_usage[date]["output"] += output_t

    print(f"Total input tokens: {total_input:,}")
    print(f"Total output tokens: {total_output:,}")
    print(f"Daily average input tokens: {total_input // max(len(daily_usage), 1):,}")

    return daily_usage

usage = analyze_token_usage(messages)

Analysis 2: User Activity

def analyze_user_activity(messages):
    user_messages = Counter()

    for msg in messages:
        if msg["role"] == "user" and "metadata" in msg:
            uid = msg["metadata"].get("userId", "unknown")
            user_messages[uid] += 1

    print("Top 10 active users:")
    for uid, count in user_messages.most_common(10):
        print(f"  {uid}: {count} messages")

    return user_messages

activity = analyze_user_activity(messages)

Analysis 3: Topic Classification

def categorize_topics(messages):
    """Simple keyword-based classification; the first matching category wins."""
    categories = {
        "Technical issues": ["code", "bug", "error", "deploy", "config"],
        "Product inquiries": ["price", "feature", "compare", "trial"],
        "Usage help": ["how to", "tutorial", "steps", "guide"],
        "Feedback": ["suggest", "wish", "improve", "doesn't work"]
    }

    results = Counter()
    for msg in messages:
        if msg["role"] == "user":
            # Lowercase so that "Bug" and "How to" match the lowercase keywords
            content = msg["content"].lower()
            for category, keywords in categories.items():
                if any(kw in content for kw in keywords):
                    results[category] += 1
                    break

    return results

topics = categorize_topics(messages)
for topic, count in topics.most_common():
    print(f"  {topic}: {count}")

Analysis 4: Response Latency Distribution

def analyze_latency(messages):
    latencies = []
    for msg in messages:
        if msg["role"] == "assistant" and "metadata" in msg:
            latency = msg["metadata"].get("latencyMs")
            if latency is not None:  # keep 0 ms readings, skip only missing values
                latencies.append(latency)

    if latencies:
        latencies.sort()
        print(f"Average latency: {sum(latencies)/len(latencies):.0f}ms")
        print(f"P50 latency: {latencies[len(latencies)//2]:.0f}ms")
        print(f"P95 latency: {latencies[int(len(latencies)*0.95)]:.0f}ms")
        print(f"P99 latency: {latencies[int(len(latencies)*0.99)]:.0f}ms")

analyze_latency(messages)

Analysis 5: Tool Usage Statistics

def analyze_tool_usage(messages):
    tool_counts = Counter()
    for msg in messages:
        if msg["role"] == "assistant" and "metadata" in msg:
            tools = msg["metadata"].get("toolsUsed", [])
            for tool in tools:
                tool_counts[tool] += 1

    print("Tool usage frequency:")
    for tool, count in tool_counts.most_common():
        print(f"  {tool}: {count} times")

analyze_tool_usage(messages)

Data Visualization

Plotting Charts with matplotlib

import matplotlib.pyplot as plt

def plot_daily_usage(daily_usage):
    dates = sorted(daily_usage.keys())
    input_tokens = [daily_usage[d]["input"] for d in dates]
    output_tokens = [daily_usage[d]["output"] for d in dates]

    fig, ax = plt.subplots(figsize=(12, 6))
    ax.bar(dates, input_tokens, label="Input Tokens", alpha=0.7)
    ax.bar(dates, output_tokens, bottom=input_tokens,
           label="Output Tokens", alpha=0.7)
    ax.set_xlabel("Date")
    ax.set_ylabel("Tokens")
    ax.set_title("Daily Token Consumption Trend")
    ax.legend()
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.savefig("daily_token_usage.png", dpi=150)

plot_daily_usage(usage)

Privacy and Compliance

Anonymized Export

# Export with automatic anonymization applied
openclaw export --agent my-agent \
  --anonymize \
  --redact-patterns "phone,email,id_card" \
  --output ./export/anonymized.jsonl
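
For data that was exported without `--anonymize`, a comparable pass can be approximated in Python. This sketch does not reproduce OpenClaw's own redaction rules, which the source does not describe. The regular expressions are deliberately crude (the phone pattern also matches other long digit runs), and `redact`, `pseudonymize`, and `anonymize_message` are hypothetical names.

```python
import hashlib
import re

# Illustrative patterns only; real-world PII detection needs stricter rules.
REDACT_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text, kinds=("email", "phone")):
    """Replace each match of the selected patterns with a [KIND] placeholder."""
    for kind in kinds:
        text = REDACT_PATTERNS[kind].sub(f"[{kind.upper()}]", text)
    return text

def pseudonymize(user_id, salt):
    """Map a user id to a stable token so per-user analysis still works."""
    digest = hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()
    return "u_" + digest[:12]

def anonymize_message(msg, salt):
    """Return a copy of msg with redacted content and a pseudonymized userId."""
    out = dict(msg)
    out["content"] = redact(msg.get("content", ""))
    meta = dict(msg.get("metadata") or {})
    if "userId" in meta:
        meta["userId"] = pseudonymize(meta["userId"], salt)
    out["metadata"] = meta
    return out

print(redact("Reach me at jane.doe@example.com or +1 415-555-0100"))
# Reach me at [EMAIL] or [PHONE]
```

Keeping the salt secret and constant across exports keeps pseudonyms consistent while making them harder to reverse.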

Data Retention Policy

{
  storage: {
    retention: {
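      // Number of days to retain session data
      sessionTTL: 90,  // cleaned up automatically after 90 days
      // Number of days to retain exported data
      exportTTL: 30,
      // Whether to archive automatically before deletion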
      archiveBeforeDelete: true,
      archiveDir: "./archive/"
    }
  }
}
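
To preview which files a TTL such as `sessionTTL: 90` would affect, the age of each session file can be checked before cleanup runs. The sketch below assumes age is measured by file modification time, which may differ from how OpenClaw itself decides expiry. `find_expired` is a hypothetical name.

```python
import time
from pathlib import Path

def find_expired(directory, ttl_days, now=None):
    """Return *.jsonl files in directory whose mtime is older than ttl_days."""
    now = time.time() if now is None else now
    cutoff = now - ttl_days * 86400  # 86400 seconds per day
    return sorted(p for p in Path(directory).glob("*.jsonl")
                  if p.stat().st_mtime < cutoff)

# Example call (the directory must exist):
# for path in find_expired("./export/", ttl_days=30):
#     print("would expire:", path.name)
```

The optional `now` argument makes the cutoff reproducible, which helps when testing a policy against a fixed date.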

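Summary

OpenClaw's JSONL session storage format is simple yet powerful: one message per line, easily parsed JSON, and a tree structure that fully captures conversation branches. After exporting data through the command line, the API, or the Dashboard, you can use tools such as Python to analyze token consumption, user activity, topic distribution, and response latency, and so build a comprehensive picture of how your AI agent is operating. These insights help you keep improving the agent's performance and cost efficiency.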