selenium-browser

Usage

The skill triggers on any message that contains Chrome, browser, Selenium, screenshot, or open.

selenium-browser <URL> [--headless] [--proxy=<url>]

Command flow

Launch Chrome (or Chromium) under Selenium.
Navigate to <URL>.
Take a screenshot of the loaded page.
Save the image in /home/main/clawd/diffusion_pdfs/ and report the path back to the chat.
If anything fails, send an error message.

Scripts

scripts/launch_browser.py

#!/usr/bin/env python3
import os
import sys
import time
import base64
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options

# CLI parsing
import argparse
parser = argparse.ArgumentParser(description="Launch Selenium Chrome and take a screenshot.")
parser.add_argument("url", help="URL to open")
parser.add_argument("--headless", action="store_true", help="Run Chrome headless")
parser.add_argument("--proxy", help="Proxy URL (e.g., http://proxy:3128)")
args = parser.parse_args()

# Prepare Chrome options
chrome_options = Options()
if args.headless:
    chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")
if args.proxy:
    chrome_options.add_argument(f"--proxy-server={args.proxy}")

# Locate binaries
chrome_bin = os.getenv("CHROME_BIN", "/usr/bin/google-chrome")
chromedriver_path = os.getenv("CHROMEDRIVER_PATH", "/usr/local/bin/chromedriver")

service = Service(executable_path=chromedriver_path)

# Start browser
try:
    driver = webdriver.Chrome(service=service, options=chrome_options)
except Exception as e:
    print(f"❌ Failed to start Chrome: {e}", file=sys.stderr)
    sys.exit(1)

# Navigate and wait for page load
try:
    driver.get(args.url)
    time.sleep(5)  # simple wait; can replace with WebDriverWait for better reliability
except Exception as e:
    print(f"❌ Navigation error: {e}", file=sys.stderr)
    driver.quit()
    sys.exit(1)

# Take screenshot
screenshot_path = os.path.join(os.getenv("HOME", "/tmp"), "screenshot.png")
try:
    driver.save_screenshot(screenshot_path)
except Exception as e:
    print(f"❌ Screenshot error: {e}", file=sys.stderr)
    driver.quit()
    sys.exit(1)

# Clean up
driver.quit()

# Output a JSON object that OpenClaw can parse for the reply
print({"status": "ok", "screenshot": screenshot_path})

scripts/_env.sh

# Optional: set paths to Chrome/Chromedriver if not in standard locations
# export CHROME_BIN="/opt/google/chrome/google-chrome"
# export CHROMEDRIVER_PATH="/usr/local/bin/chromedriver"

References

How the skill reports

The skill runs the Python script and captures its stdout as a JSON payload. OpenClaw parses the JSON and sends a message back:

✅ Screenshot saved: /home/main/clawd/diffusion_pdfs/screenshot.png

If the script prints an error, the skill forwards the error text.

Installation notes

Make sure chromedriver is in /usr/local/bin/chromedriver or set CHROMEDRIVER_PATH.
Make sure google-chrome (or chromium) is in /usr/bin/google-chrome or set CHROME_BIN.
Install Python dependencies: pip install selenium (inside the virtual env you use for the skill).

pip install selenium

Logging & timeouts

The script uses a 5‑second static wait after navigation; replace with Selenium's WebDriverWait for dynamic waits.

If you encounter timeouts, adjust the time.sleep(5) value or use WebDriverWait(driver, 20).until(...).

Feel free to tweak the script to fit your environment (proxy, authentication, etc.).