agent-browser

Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "agent-browser" with this command: npx skills add hjewkes/agent-skills/hjewkes-agent-skills-agent-browser

Browser Automation with agent-browser

Core Workflow

Every browser automation follows this pattern:

  1. Navigate: agent-browser open <url>
  2. Snapshot: agent-browser snapshot -i (get element refs like @e1, @e2)
  3. Interact: Use refs to click, fill, select
  4. Re-snapshot: After navigation or DOM changes, get fresh refs
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result

Essential Commands

# Navigation
agent-browser open <url>              # Navigate
agent-browser close                   # Close browser

# Snapshot
agent-browser snapshot -i             # Interactive elements with refs (recommended)
agent-browser snapshot -i -C          # Include cursor-interactive elements

# Interaction (use @refs from snapshot)
agent-browser click @e1               # Click
agent-browser fill @e2 "text"         # Clear and type
agent-browser select @e1 "option"     # Select dropdown
agent-browser check @e1               # Checkbox
agent-browser press Enter             # Key press

# Wait
agent-browser wait @e1                # Wait for element
agent-browser wait --load networkidle # Wait for network idle

# Capture
agent-browser screenshot              # Screenshot
agent-browser screenshot --full       # Full page

Ref Lifecycle (Important)

Refs (@e1, @e2, etc.) are invalidated when the page changes. Always re-snapshot after clicking links/buttons that navigate, form submissions, or dynamic content loading.

Deep-Dive Documentation

ReferenceWhen to Use
references/commands.mdFull command reference with all options
references/snapshot-refs.mdRef lifecycle, invalidation rules, troubleshooting
references/session-management.mdParallel sessions, state persistence, concurrent scraping
references/authentication.mdLogin flows, OAuth, 2FA handling, state reuse
references/video-recording.mdRecording workflows for debugging and documentation
references/proxy-support.mdProxy configuration, geo-testing, rotating proxies
references/common-patterns.mdForm submission, auth, data extraction, parallel sessions, iOS simulator
references/semantic-locators.mdAlternatives to refs
references/javascript-evaluation.mdeval rules, --stdin/-b explanation

Ready-to-Use Templates

TemplateDescription
templates/form-automation.shForm filling with validation
templates/authenticated-session.shLogin once, reuse state
templates/capture-workflow.shContent extraction with screenshots

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

github-pr

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

managing-github-issues

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

code-review

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

subagent-driven-development

No summary provided by upstream source.

Repository SourceNeeds Review