Browse — Browser Automation for Agents
How it works
browse is a CLI that wraps Playwright behind a persistent daemon on a Unix socket. The daemon cold-starts in ~3s on first use, then every command runs in sub-200ms. Session state (cookies, localStorage, auth tokens) persists across commands within a session.
All output is plain text. Objects are JSON-stringified. Commands return non-zero on failure with an error message.
Important constraints:
- Commands are sequential — do not run multiple
browsecommands in parallel. The daemon handles one command at a time. - Run
browse helpfor the full command list, orbrowse help <command>for detailed usage and flags.
The ref system — read this first
Refs (@e1, @e2, ...) are how you target elements. They replace CSS selectors for most interactions.
Rules:
- Always
browse snapshotbefore interacting. Refs only exist after a snapshot. - Refs are ephemeral. Every
snapshotcall regenerates them. Old refs are invalid. - Refs go stale after navigation. Any
gotoor click that changes the page invalidates refs. You'll get a clear error — justbrowse snapshotagain.
Core interaction loop:
browse snapshot # see what's on the page — get refs
browse fill @e3 "test" # fill the search field
browse click @e4 # click a button
browse snapshot # re-snapshot after the page changes
Workflow
The standard pattern for any browser task:
- Navigate:
browse goto <url> - Observe:
browse snapshotfor page structure (interactive elements with refs). Usebrowse snapshot -ito include structural elements (headings, text), or-ffor the full accessibility tree. - Check for errors:
browse console --level errorafter navigation. - Interact:
browse fill @eN "value",browse click @eN,browse hover @eN,browse press Tab,browse select @eN "option",browse scroll @eN(scroll into view).- Use
browse press <key>for keyboard navigation (Tab, Escape, Enter, ArrowDown, Shift+Tab, etc.). Multiple keys:browse press Tab Tab Tab. - Use
browse scroll down/upto page through content,browse scroll top/bottomto jump to extremes. - After clicks that trigger SPA navigation, use
browse wait url /path,browse wait text "Expected", orbrowse wait visible .selectorbefore snapshotting.
- Use
- Verify:
browse snapshotorbrowse screenshotafter each interaction to confirm the result. - Repeat: Move through pages and flows.
For configured applications, browse healthcheck gives a quick pass/fail across key pages.
Key commands by category
| Category | Commands |
|---|---|
| Navigate | goto <url>, url, back, forward, reload [--hard], text, version, quit, wipe |
| Observe | snapshot, screenshot (--diff, --threshold), console, network |
| Interact | click @eN, hover @eN [--duration ms], press <key> [key ...], fill @eN "value", select @eN "option", upload @eN <file> [file ...], attr @eN [attribute], scroll down/up/top/bottom/@eN/x y |
| Wait | wait url <str>, wait text <str>, wait visible <sel>, wait hidden <sel>, wait network-idle, wait <ms> |
| Viewport | viewport, goto --viewport/--device/--preset |
| Evaluate | eval <expr> (in-page JS), page-eval <expr> (Playwright page API) |
| Auth | login --env <name>, auth-state save/load <path> |
| Tabs | tab list/new/switch/close |
| Assert | assert visible/text-contains/url-contains/... |
| Accessibility | a11y (full page), a11y @eN (element), a11y --standard wcag2aa, a11y --json |
| Flows | flow list, flow <name> --var key=value (--reporter junit, --dry-run, --stream), healthcheck (--reporter junit, --parallel, --concurrency) |
| Sessions | session list/create/close, --session <name> on any command |
| Tracing | trace start (--screenshots, --snapshots), trace stop --out <path>, trace status |
| Tooling | init, report --out <path>, screenshots list/clean/count, completions bash/zsh/fish, status --json |
Run browse help <command> for flags and detailed usage — don't guess at flags.
Named sessions
Use named sessions to run multiple independent page groups:
browse session create worker-1 # shared context (same cookies/storage)
browse session create worker-2 --isolated # isolated context (separate cookies/storage)
browse --session worker-1 goto https://a.com
browse --session worker-2 goto https://b.com
browse session list
browse session close worker-1
By default, sessions share the browser context. Use --isolated for fully separate cookies, storage, and permissions.
Authentication
Configured login (preferred — uses browse.config.json):
browse login --env staging
Manual login:
browse goto https://app.example.com/login
browse snapshot
browse fill @e1 "user@example.com"
browse fill @e2 "password123"
browse click @e3
browse snapshot # verify redirect / dashboard loaded
Session reuse — save after login, load in future sessions:
browse auth-state save /tmp/auth.json
browse auth-state load /tmp/auth.json
Use browse wipe to clear all session data before switching accounts or at the end of a session.
Visual diff
Compare screenshots against a baseline to detect visual regressions:
browse screenshot current.png --diff baseline.png
browse screenshot current.png --diff baseline.png --threshold 5
Output includes similarity percentage, diff pixel count, and a path to the diff image (changed pixels highlighted in red).
Headed mode
Launch the browser visibly for debugging (set before the daemon starts):
BROWSE_HEADED=1 browse goto https://example.com
Timeout control
Any command accepts --timeout <ms> (default 30s). Use for slow pages:
browse goto https://slow-page.example.com --timeout 60000
Error recovery
| Error | Fix |
|---|---|
"element is outside of the viewport" | Run browse scroll @eN to scroll it into view, then retry |
"Refs are stale" / "Unknown ref" | Run browse snapshot to refresh refs |
"Daemon connection lost" | Re-run the command — CLI auto-restarts the daemon |
"Command timed out after Nms" | Use --timeout 60000, or check the URL |
"Daemon crashed and recovery failed" | Run browse quit, then retry |
"Unknown command" for a valid command | Stale daemon — run browse quit, then retry |
"Unknown flag" | Check browse help <cmd> for valid flags |
| Login fails | Check env vars, verify login URL, browse screenshot to see the page |