Browser Automation Skill
Guidance for effective browser automation in Claude Code. Complements the dev-browser plugin.
Prerequisites
This skill requires the dev-browser plugin. To install it:
/plugin marketplace add sawyerhood/dev-browser
/plugin install dev-browser@sawyerhood/dev-browser
When to Use Browser Automation
Good use cases:
- Testing local development (localhost, staging)
- Verifying UI changes after code modifications
- Debugging visual issues or user flows
- Extracting data from web pages
- Automating repetitive browser tasks
Poor use cases:
- Tasks that require authenticated sessions you can't access
- High-frequency scraping (use APIs instead)
- Actions on production systems without explicit approval
Core Patterns
- Persistent Page Sessions
Dev-browser maintains page state across interactions. Use this for multi-step workflows:
- Navigate once to the page
- Inspect → identify elements
- Interact → click, type, verify
- Don't reload unless necessary
- LLM-Friendly DOM Inspection
Use DOM snapshots over screenshots when possible:
- Snapshots are structured and searchable
- Screenshots require visual interpretation
- Combine both for complex debugging
Pattern:
snapshot → identify element refs → interact with refs
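The snapshot → refs → interact pattern can be sketched as follows. This is only an illustration, not the plugin's API: the snapshot text format (a role, a quoted label, and a `[ref=...]` marker) and the helper name `find_refs` are assumptions made for the example, and the real snapshot output may look different.

```python
import re
from typing import List, Optional

# Hypothetical snapshot text; the real dev-browser snapshot format may differ.
SNAPSHOT = """\
- heading "Sign in" [ref=h-1]
- textbox "Email" [ref=input-1]
- textbox "Password" [ref=input-2]
- button "Submit" [ref=btn-42]
"""

# Matches lines such as: - button "Submit" [ref=btn-42]
LINE_RE = re.compile(
    r'^\s*-\s+(?P<role>\w+)\s+"(?P<label>[^"]*)"\s+\[ref=(?P<ref>[^\]]+)\]'
)


def find_refs(
    snapshot: str, role: Optional[str] = None, label: Optional[str] = None
) -> List[str]:
    """Return refs whose role matches exactly and whose label contains `label`."""
    refs = []
    for line in snapshot.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not element entries
        if role is not None and match["role"] != role:
            continue
        if label is not None and label.lower() not in match["label"].lower():
            continue
        refs.append(match["ref"])
    return refs
```

Searching structured text like this is what makes snapshots easier to work with than screenshots: the ref returned by `find_refs(SNAPSHOT, role="button")` can be passed straight to a click.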
- Step-by-Step for Exploration
When exploring unknown pages:
- Take snapshot to understand structure
- Identify interactive elements
- Take one action
- Verify result with new snapshot
- Repeat
- Full Scripts for Known Flows
When you know the exact flow:
- Write complete interaction sequence
- Execute in one script
- Verify final state
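A known flow can be written down as data and executed in one pass, with a single check at the end. The sketch below assumes a hypothetical `call_tool` callable that dispatches a tool by name; `run_flow`, `LOGIN_FLOW`, and the argument names (`url`, `ref`, `text`) are illustrative and not confirmed by the plugin.

```python
from typing import Any, Callable, Dict, List, Tuple

# One step = (tool name, arguments). Tool names are taken from this skill;
# the argument names are illustrative assumptions.
Step = Tuple[str, Dict[str, Any]]

LOGIN_FLOW: List[Step] = [
    ("browser_navigate", {"url": "http://localhost:3000/login"}),
    ("browser_click", {"ref": "submit-btn"}),
    ("browser_wait_for", {"text": "Dashboard"}),
]


def run_flow(
    call_tool: Callable[[str, Dict[str, Any]], Any],
    steps: List[Step],
    verify: Callable[[], bool],
) -> bool:
    """Run every step in order, then verify the final state once."""
    for name, args in steps:
        call_tool(name, args)  # an exception here aborts the whole flow
    return verify()
```

Keeping the steps as a plain list makes the sequence easy to review before it runs, which suits flows that are already well understood.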
Common Operations
Navigation
- browser_navigate: Go to URL
- browser_navigate_back: Go back
- browser_snapshot: Get page structure (preferred)
- browser_take_screenshot: Visual capture
Interaction
- browser_click: Click element by ref
- browser_type: Type into element
- browser_fill_form: Fill multiple fields
- browser_select_option: Select from dropdown
- browser_press_key: Keyboard input
Waiting
- browser_wait_for: Wait for text/element/time
- Always wait after navigation or actions that trigger loading
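The idea behind an explicit wait is to poll until a condition holds or a timeout expires, rather than sleeping for a fixed time. The sketch below illustrates that idea only; `wait_for_text` and the `get_page_text` callable are hypothetical names, and this is not how browser_wait_for is implemented.

```python
import time
from typing import Callable


def wait_for_text(
    get_page_text: Callable[[], str],
    text: str,
    timeout: float = 5.0,
    interval: float = 0.1,
) -> bool:
    """Poll get_page_text() until `text` appears; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if text in get_page_text():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # back off briefly before checking again
```

Returning `False` instead of raising lets the caller decide whether a timeout means "take a new snapshot and reassess" or "fail the flow".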
Debugging
- browser_console_messages: Check for errors
- browser_network_requests: Inspect API calls
Best Practices
- Reference-Based Interaction
Always use element refs from snapshots, not CSS selectors:
snapshot → find ref="btn-42" → click ref="btn-42"
- Explicit Waits
After actions that cause page changes:
click → wait_for text="Success" → continue
- Error Recovery
If an action fails:
- Take new snapshot
- Verify page state
- Adjust approach
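The recovery steps above can be sketched as a small retry helper that always acts on a fresh snapshot. The names `with_recovery`, `action`, and `take_snapshot` are hypothetical and stand in for whatever tool calls are in use; the retry limit of three is an arbitrary choice for the example.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_recovery(
    action: Callable[[str], T],
    take_snapshot: Callable[[], str],
    max_attempts: int = 3,
) -> T:
    """Call action(snapshot); on failure take a new snapshot and try again."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        snapshot = take_snapshot()  # always act on the current page state
        try:
            return action(snapshot)
        except Exception as error:  # broad on purpose: any failure triggers recovery
            last_error = error
    raise RuntimeError(f"action failed after {max_attempts} attempts") from last_error
```

The `action` receives the new snapshot on every attempt, so it can look up refs again instead of reusing ones that may have gone stale.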
- Form Filling
Use browser_fill_form for multiple fields:
fill_form([
  {name: "email", type: "textbox", ref: "...", value: "..."},
  {name: "password", type: "textbox", ref: "...", value: "..."}
])
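When several fields are filled at once, building the entries with a small helper keeps the shape consistent. The sketch below assumes the four keys shown above (name, type, ref, value); the helper `form_field` is hypothetical, and the refs are the placeholder values used in the example workflow later in this skill.

```python
from typing import Dict, List


def form_field(
    name: str, ref: str, value: str, field_type: str = "textbox"
) -> Dict[str, str]:
    """Build one field entry with the keys name, type, ref, and value."""
    if not ref:
        # Refs must come from a snapshot; an empty ref cannot be resolved.
        raise ValueError(f"field {name!r} needs a ref taken from a snapshot")
    return {"name": name, "type": field_type, "ref": ref, "value": value}


fields: List[Dict[str, str]] = [
    form_field("email", ref="input-1", value="test@example.com"),
    form_field("password", ref="input-2", value="testpass"),
]
```

Rejecting an empty ref early surfaces a missing snapshot lookup before any browser call is made.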
- Verification Pattern
After completing a flow:
- Take final snapshot or screenshot
- Verify expected elements present
- Check console for errors
- Report success/failure with evidence
Integration with Glean Workflows
Testing Agent-Generated Content
- Build agent in Glean
- Navigate to Glean in browser
- Test agent responses
- Verify output format and accuracy
Verifying Customer Deployments
- Navigate to customer's Glean instance (if accessible)
- Test specific agent or search functionality
- Document results with screenshots
Local Development Testing
- Start local dev server
- Navigate to localhost
- Test changes iteratively
- Verify before committing
Example Workflow
Testing a login flow:
- browser_navigate("http://localhost:3000/login")
- browser_snapshot() → identify form elements
- browser_fill_form([
    {name: "email", type: "textbox", ref: "input-1", value: "test@example.com"},
    {name: "password", type: "textbox", ref: "input-2", value: "testpass"}
  ])
- browser_click(ref: "submit-btn")
- browser_wait_for(text: "Dashboard")
- browser_snapshot() → verify logged in state
- Report: "Login successful - dashboard loaded"
Skill version: 1.0.0
Requires: dev-browser plugin
-- Axon | 2026-01-01