Advanced

OpenClaw Headless Browser Integration Guide


Introduction

AI Agents often need to interact with web pages: scraping content, filling out forms, capturing screenshots, and running more complex browser automation tasks. OpenClaw ships with built-in headless browser integration based on the Chromium engine and exposes it through the Pi Agent SDK's browser tool.

This article covers how to configure and use OpenClaw's headless browser features, as well as how the Bridge URL mechanism works.

Enabling the Browser Tool

Basic Configuration

{
  agents: {
    "my-agent": {
      tools: {
        browser: {
          enabled: true,
          // Browser engine
          engine: "chromium",
          // Headless mode
          headless: true,
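          // Browser launch arguments (--no-sandbox turns off Chromium's own process sandbox; it is usually only needed when running as root in a container)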
          launchArgs: [
            "--no-sandbox",
            "--disable-gpu",
            "--disable-dev-shm-usage"
          ],
          // Default viewport size
          viewport: {
            width: 1280,
            height: 720
          },
          // Page load timeout (milliseconds)
          timeout: 30000
        }
      }
    }
  }
}

Special Configuration for Docker Environments

When running in Docker, make sure Chromium and the fonts it needs (here, CJK fonts) are installed in the image:

FROM openclaw/gateway:latest

# Install Chromium dependencies
RUN apt-get update && apt-get install -y \
    chromium \
    fonts-noto-cjk \
    --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

Or use a pre-built image with browser support:

services:
  openclaw:
    image: openclaw/gateway:latest-browser
    environment:
      - OPENCLAW_BROWSER_ENABLED=true

Browser Tool Features

The AI Agent can perform the following operations through the browser tool:

1. Page Navigation and Content Extraction

User: Show me today's trending articles on Hacker News

AI internally executes:
→ browser.navigate("https://news.ycombinator.com")
→ browser.extract({ selector: ".titleline > a", fields: ["text", "href"] })
→ Returns the extracted results to the user
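
To make the extract step concrete, here is a minimal Python sketch of what a call such as browser.extract({ selector: ".titleline > a", fields: ["text", "href"] }) might do with the page HTML. It is not OpenClaw's implementation. The TitleLinkExtractor class is hypothetical, relies only on the standard library's html.parser, and handles just this one selector shape: an a element that is a direct child of an element with class titleline. It also assumes well-formed markup in which every non-void element is closed.

```python
from html.parser import HTMLParser


class TitleLinkExtractor(HTMLParser):
    """Collect text and href for <a> elements that are direct children of .titleline."""

    VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.stack = []      # one bool per open element: does it carry class "titleline"?
        self.current = None  # record being filled while inside a matching <a>
        self.results = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # ".titleline > a": the parent (top of the stack) must carry the class.
        if tag == "a" and self.stack and self.stack[-1]:
            self.current = {"text": "", "href": attrs.get("href")}
        if tag not in self.VOID_TAGS:
            classes = (attrs.get("class") or "").split()
            self.stack.append("titleline" in classes)

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS:
            return
        if tag == "a" and self.current is not None:
            self.results.append(self.current)
            self.current = None
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.current is not None:
            self.current["text"] += data


# Hypothetical markup, loosely shaped like a news listing.
html = """
<span class="titleline"><a href="https://example.com/a">First story</a>
  <span class="sitebit"><a href="from?site=example.com">example.com</a></span>
</span>
<span class="titleline"><a href="https://example.com/b">Second story</a></span>
"""

parser = TitleLinkExtractor()
parser.feed(html)
for item in parser.results:
    print(item["text"], item["href"])
```

The nested link inside the sitebit span is skipped because its parent does not carry the titleline class, which is the difference between the child combinator (>) and a plain descendant selector.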

2. Page Screenshots

User: Take a screenshot of the GitHub trending page

AI internally executes:
→ browser.navigate("https://github.com/trending")
→ browser.screenshot({ fullPage: false })
→ Sends the image to the user

3. Form Interaction

User: Search for "OpenClaw" on that website

AI internally executes:
→ browser.navigate("https://example.com")
→ browser.fill({ selector: "#search-input", value: "OpenClaw" })
→ browser.click({ selector: "#search-button" })
→ browser.waitForNavigation()
→ browser.extract({ selector: ".results" })

4. JavaScript Execution

AI internally executes:
→ browser.evaluate("document.title")
→ browser.evaluate("document.querySelectorAll('a').length")

Bridge URL Mechanism

Bridge URL is a unique OpenClaw feature that allows an AI Agent to "bridge" a browser session to the user, letting the user interact with the web page directly in their own browser.

How It Works

1. AI Agent opens a page in the headless browser
2. OpenClaw generates a Bridge URL
3. The user clicks the Bridge URL to open it in their own browser
4. The user sees a live mirror of the Agent's browser session
5. The user can interact with the page, and actions sync to the Agent's browser

Configuring Bridge URL

{
  agents: {
    "my-agent": {
      tools: {
        browser: {
          bridge: {
            enabled: true,
            // Base URL for Bridge sessions
            baseUrl: "https://openclaw.example.com/bridge",
            // Session expiration time (seconds)
            sessionTTL: 600,  // 10 minutes
            // Access authentication
            auth: {
              required: true,
              // One-time token authentication
              method: "token"
            }
          }
        }
      }
    }
  }
}
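
The combination of sessionTTL and one-time token authentication can be pictured with a short Python sketch. This is an illustration under assumptions, not OpenClaw's code: the BridgeSessions class, its method names, and the in-memory storage are all hypothetical. Only the base URL and the 600-second TTL come from the configuration above.

```python
import secrets
import time


class BridgeSessions:
    """In-memory store of one-time Bridge tokens (hypothetical, for illustration)."""

    def __init__(self, base_url, session_ttl, clock=time.monotonic):
        self.base_url = base_url.rstrip("/")
        self.session_ttl = session_ttl  # seconds, as in sessionTTL
        self.clock = clock
        self._expires = {}              # token -> expiry time

    def create(self):
        """Issue a new Bridge URL containing an unguessable token."""
        token = secrets.token_urlsafe(16)
        self._expires[token] = self.clock() + self.session_ttl
        return f"{self.base_url}/{token}"

    def redeem(self, token):
        """Return True at most once per token, and only before it expires."""
        expiry = self._expires.pop(token, None)  # pop makes the token single-use
        return expiry is not None and self.clock() < expiry


# A fake clock keeps the demonstration deterministic.
now = [0.0]
sessions = BridgeSessions("https://openclaw.example.com/bridge", 600, clock=lambda: now[0])

url = sessions.create()
token = url.rsplit("/", 1)[1]
first = sessions.redeem(token)    # accepted: fresh and unused
second = sessions.redeem(token)   # rejected: already used

late_token = sessions.create().rsplit("/", 1)[1]
now[0] = 601.0                    # move past the 600-second TTL
third = sessions.redeem(late_token)  # rejected: expired

print(first, second, third)  # True False False
```

A shorter TTL narrows the window in which a leaked link is useful, at the cost of users occasionally needing a fresh link.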

Use Cases

Scenario 1: Websites Requiring User Login

User: Check my bank account balance

AI: This requires logging into the bank website. I've opened the login page.
    Please click the link below to complete the login:
    https://openclaw.example.com/bridge/abc123

    Let me know once you've logged in and I'll continue.

User: [Clicks link, completes login] Done

AI: [Detects successful login]
→ browser.extract({ selector: ".balance" })
→ Your account balance is $12,345.67

Scenario 2: CAPTCHA Handling

The AI cannot solve a CAPTCHA on its own. When one appears, it can send a Bridge URL so that the user completes the challenge manually:

AI: A CAPTCHA has appeared on the page. Please click the link to complete the verification:
    https://openclaw.example.com/bridge/def456
    I'll continue once you're done.

Scenario 3: Complex Interaction Confirmation

Let the user confirm via Bridge URL before executing sensitive operations:

AI: I've filled out the form. Please review and submit it through the following link:
    https://openclaw.example.com/bridge/ghi789

Advanced Configuration

Proxy Settings

If you need to access websites through a proxy:

{
  tools: {
    browser: {
      proxy: {
        server: "http://proxy.example.com:8080",
        username: "user",
        password: "pass",
        // Domains that bypass the proxy
        bypass: ["localhost", "*.internal.com"]
      }
    }
  }
}
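
For orientation, Chromium itself takes proxy settings as command-line flags: --proxy-server for the proxy address and --proxy-bypass-list for a semicolon-separated list of hosts that skip it. The Python sketch below shows one way the proxy block above could be translated into such flags. How OpenClaw actually applies the settings is not stated in this guide, and the proxy_launch_args function is hypothetical.

```python
def proxy_launch_args(proxy):
    """Translate a proxy config dict into Chromium command-line flags."""
    args = [f"--proxy-server={proxy['server']}"]
    bypass = proxy.get("bypass", [])
    if bypass:
        # Chromium expects the bypass entries separated by semicolons.
        args.append("--proxy-bypass-list=" + ";".join(bypass))
    return args


proxy_config = {
    "server": "http://proxy.example.com:8080",
    "username": "user",
    "password": "pass",
    "bypass": ["localhost", "*.internal.com"],
}

args = proxy_launch_args(proxy_config)
print(args)
# ['--proxy-server=http://proxy.example.com:8080', '--proxy-bypass-list=localhost;*.internal.com']
```

The username and password are deliberately left out of the flags: Chromium does not accept proxy credentials on the command line, so they have to be supplied separately, typically in response to the proxy's authentication challenge.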

Cookie and Session Management

{
  tools: {
    browser: {
      // Persistent cookie storage
      persistCookies: true,
      cookieDir: "./data/browser-cookies",
      // Default cookies
      defaultCookies: [
        {
          name: "lang",
          value: "zh-CN",
          domain: ".example.com"
        }
      ]
    }
  }
}

User-Agent and Request Headers

{
  tools: {
    browser: {
      userAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
      extraHeaders: {
        "Accept-Language": "zh-CN,zh;q=0.9"
      }
    }
  }
}

Resource Blocking

To improve performance and reduce bandwidth consumption, you can block unnecessary resources:

{
  tools: {
    browser: {
      resourceBlocking: {
        // Block images (set to true for pure content extraction)
        blockImages: false,
        // Block fonts
        blockFonts: true,
        // Block ads and tracking scripts
        blockPatterns: [
          "*google-analytics*",
          "*doubleclick*",
          "*facebook.com/tr*"
        ]
      }
    }
  }
}
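
The blockPatterns entries look like shell-style wildcards matched against the request URL. Assuming that interpretation, the decision for each outgoing request could be sketched in Python as follows. The should_block function is hypothetical, and detecting fonts by file extension is a simplification: a real browser would more likely decide by resource type.

```python
from fnmatch import fnmatchcase

BLOCK_PATTERNS = ["*google-analytics*", "*doubleclick*", "*facebook.com/tr*"]
FONT_SUFFIXES = (".woff", ".woff2", ".ttf", ".otf")


def should_block(url, block_fonts=True, patterns=BLOCK_PATTERNS):
    """Return True if a request to this URL would be dropped."""
    path = url.split("?", 1)[0].lower()  # ignore the query string for the suffix check
    if block_fonts and path.endswith(FONT_SUFFIXES):
        return True
    return any(fnmatchcase(url, pattern) for pattern in patterns)


print(should_block("https://www.google-analytics.com/analytics.js"))  # True
print(should_block("https://www.facebook.com/tr?id=1"))               # True
print(should_block("https://example.com/fonts/a.woff2?v=3"))          # True
print(should_block("https://example.com/app.js"))                     # False
```

Because the patterns start and end with a wildcard, they match anywhere in the URL, including the query string, so a broad pattern can block more than intended.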

Security Considerations

Sandbox Isolation

OpenClaw's browser tool runs in a sandboxed environment:

{
  tools: {
    browser: {
      sandbox: {
        // Network access whitelist
        allowedDomains: [
          "*.github.com",
          "*.stackoverflow.com",
          "news.ycombinator.com"
        ],
        // Block file downloads
        blockDownloads: true,
        // Block access to local files
        blockFileUrls: true,
        // Maximum concurrent pages
        maxPages: 5,
        // Maximum runtime per page (presumably milliseconds, as with timeout)
        maxPageTime: 60000
      }
    }
  }
}
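
It helps to check how wildcard entries in allowedDomains behave before relying on them. The Python sketch below assumes plain glob matching against the URL's hostname. That is an assumption about OpenClaw's semantics, and the is_allowed function is hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

ALLOWED_DOMAINS = ["*.github.com", "*.stackoverflow.com", "news.ycombinator.com"]


def is_allowed(url, allowed=ALLOWED_DOMAINS, block_file_urls=True):
    """Return True if navigation to this URL would be permitted."""
    parts = urlsplit(url)
    if parts.scheme == "file":
        return not block_file_urls
    host = (parts.hostname or "").lower()
    return any(fnmatchcase(host, pattern) for pattern in allowed)


print(is_allowed("https://api.github.com/repos"))   # True
print(is_allowed("https://news.ycombinator.com/"))  # True
print(is_allowed("https://github.com/trending"))    # False: "*.github.com" needs a subdomain
print(is_allowed("https://evilgithub.com/"))        # False
print(is_allowed("file:///etc/passwd"))             # False
```

Under this interpretation the bare domain github.com is rejected, because "*.github.com" only matches hosts with something before the dot. If the actual matcher behaves the same way, listing "github.com" explicitly next to the wildcard entry avoids surprises with pages such as https://github.com/trending.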

Content Filtering

{
  tools: {
    browser: {
      contentFilter: {
        // Redact sensitive information (e.g., if extracted pages contain password fields)
        redactPatterns: [
          "password",
          "credit.?card",
          "ssn"
        ],
        // Maximum length of extracted content
        maxExtractLength: 10000
      }
    }
  }
}
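
The redactPatterns entries read as regular expressions ("credit.?card" matches both "creditcard" and "credit card"). A minimal Python sketch of such a filter follows. The guide does not say what exactly gets redacted, so this sketch assumes case-insensitive matching and replaces the whole line that contains a match, since the sensitive value usually sits next to its label. The filter_content function is hypothetical.

```python
import re

REDACT_PATTERNS = ["password", "credit.?card", "ssn"]
MAX_EXTRACT_LENGTH = 10000


def filter_content(text, patterns=REDACT_PATTERNS, max_length=MAX_EXTRACT_LENGTH):
    """Redact lines matching any pattern, then cap the overall length."""
    combined = re.compile("|".join(f"(?:{p})" for p in patterns), re.IGNORECASE)
    lines = [
        "[REDACTED]" if combined.search(line) else line
        for line in text.splitlines()
    ]
    return "\n".join(lines)[:max_length]


page_text = "Name: Ada\nCredit card: 4111 1111\nPassword: hunter2\nCity: London"
print(filter_content(page_text))
# Name: Ada
# [REDACTED]
# [REDACTED]
# City: London
```

Short unanchored patterns can over-match: "ssn" also occurs inside the word "classname", so that line would be redacted too. Adding word boundaries, as in "\bssn\b", is one way to tighten a pattern if the matcher supports it.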

Performance Optimization

Browser Pool

For high-concurrency scenarios, OpenClaw supports browser instance pooling:

{
  tools: {
    browser: {
      pool: {
        // Minimum browser instances
        min: 2,
        // Maximum browser instances
        max: 10,
        // How long an instance may sit idle before it is released (presumably milliseconds)
        idleTimeout: 300000
      }
    }
  }
}
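
The interplay of min, max, and idleTimeout is easier to see in a toy model. The Python sketch below is not OpenClaw's pool. BrowserPool and its methods are hypothetical, the "browsers" are plain strings produced by a factory function, and the idle timeout is expressed in seconds for readability.

```python
import time


class BrowserPool:
    """Toy pool: pre-warms min_size instances, caps the total, reaps idle ones."""

    def __init__(self, factory, min_size=2, max_size=10, idle_timeout=300.0,
                 clock=time.monotonic):
        self.factory = factory
        self.min_size = min_size
        self.max_size = max_size
        self.idle_timeout = idle_timeout
        self.clock = clock
        self.idle = [(factory(), clock()) for _ in range(min_size)]  # (instance, idle since)
        self.in_use = 0

    def acquire(self):
        if self.idle:
            instance, _ = self.idle.pop()
        elif self.in_use < self.max_size:
            instance = self.factory()
        else:
            raise RuntimeError("pool exhausted")
        self.in_use += 1
        return instance

    def release(self, instance):
        self.in_use -= 1
        self.idle.append((instance, self.clock()))

    def reap(self):
        """Drop instances idle longer than idle_timeout, keeping at least min_size."""
        now = self.clock()
        total = self.in_use + len(self.idle)
        kept = []
        for instance, since in self.idle:
            if total > self.min_size and now - since > self.idle_timeout:
                total -= 1  # a real pool would close the browser process here
            else:
                kept.append((instance, since))
        self.idle = kept


now = [0.0]
ids = iter(range(100))
pool = BrowserPool(lambda: f"browser-{next(ids)}", min_size=2, max_size=3,
                   idle_timeout=300.0, clock=lambda: now[0])

borrowed = [pool.acquire() for _ in range(3)]  # two pre-warmed, one created on demand
for instance in borrowed:
    pool.release(instance)

now[0] = 301.0
pool.reap()
print(len(pool.idle))  # 2
```

After the burst, the pool shrinks back to min_size once the extra instance has been idle longer than the timeout. A fourth acquire while three instances are borrowed raises an error in this sketch, whereas a production pool would more likely queue the request.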

Page Caching

{
  tools: {
    browser: {
      pageCache: {
        enabled: true,
        // Cache expiration time (seconds)
        ttl: 300,  // 5 minutes
        // Maximum cached pages
        maxPages: 50
      }
    }
  }
}
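
A cache with a ttl and a maxPages limit needs two rules: when an entry is too old to serve, and which entry to drop when the cache is full. The guide does not state the eviction policy, so the Python sketch below assumes least-recently-used eviction. The PageCache class is hypothetical.

```python
import time
from collections import OrderedDict


class PageCache:
    """TTL cache with a size cap and least-recently-used eviction."""

    def __init__(self, ttl=300.0, max_pages=50, clock=time.monotonic):
        self.ttl = ttl              # seconds, as in the config above
        self.max_pages = max_pages
        self.clock = clock
        self._entries = OrderedDict()  # url -> (stored_at, content)

    def get(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, content = entry
        if self.clock() - stored_at >= self.ttl:
            del self._entries[url]  # expired: treat as a miss
            return None
        self._entries.move_to_end(url)  # mark as recently used
        return content

    def put(self, url, content):
        self._entries[url] = (self.clock(), content)
        self._entries.move_to_end(url)
        while len(self._entries) > self.max_pages:
            self._entries.popitem(last=False)  # evict the least recently used entry


now = [0.0]
cache = PageCache(ttl=300.0, max_pages=2, clock=lambda: now[0])
cache.put("https://example.com/a", "page A")
cache.put("https://example.com/b", "page B")
cache.get("https://example.com/a")            # touch A so that B becomes the oldest
cache.put("https://example.com/c", "page C")  # cache is full: B is evicted

print(cache.get("https://example.com/b"))  # None
print(cache.get("https://example.com/a"))  # page A

now[0] = 300.0
print(cache.get("https://example.com/a"))  # None (expired)
```

Caching suits pages that change slowly. For pages behind a login or with per-user content, serving a cached copy can return stale or wrong data, so those are better left uncached.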

Debugging

# Start in non-headless mode (for local debugging)
OPENCLAW_BROWSER_HEADLESS=false openclaw start

# View browser tool invocation logs
openclaw logs --filter "browser"

# Save a screenshot locally
openclaw browser screenshot https://example.com --output ./screenshot.png

Conclusion

OpenClaw's headless browser integration lets AI Agents see and interact with web pages. Through the browser tool, Agents can navigate pages, extract content, fill in forms, and capture screenshots. The Bridge URL mechanism hands control to the user for steps that need a human, such as logins and CAPTCHAs. Combined with sandbox isolation, resource blocking, and browser pooling, the setup is designed to run safely and efficiently in production environments.
