agentic-browser

Browser automation for AI agents via inference.sh. Uses Playwright under the hood with a simple @e ref system for element interaction.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "agentic-browser" with this command: npx skills add inference-sh/skills@agentic-browser

Agentic Browser

Quick Start

Install CLI

curl -fsSL https://cli.inference.sh | sh && infsh login

Open a page and get interactive elements

infsh app run agentic-browser --function open --input '{"url": "https://example.com"}' --session new

Core Workflow

Every browser automation follows this pattern:

  • Open - Navigate to URL, get @e refs for elements

  • Interact - Use refs to click, fill, drag, etc.

  • Re-snapshot - After navigation/changes, get fresh refs

  • Close - End session (returns video if recording)

1. Start session

RESULT=$(infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com/login" }')
SESSION_ID=$(echo "$RESULT" | jq -r '.session_id')

Elements: @e1 [input] "Email", @e2 [input] "Password", @e3 [button] "Sign In"

2. Fill and submit

infsh app run agentic-browser --function interact --session $SESSION_ID --input '{ "action": "fill", "ref": "@e1", "text": "user@example.com" }'
infsh app run agentic-browser --function interact --session $SESSION_ID --input '{ "action": "fill", "ref": "@e2", "text": "password123" }'
infsh app run agentic-browser --function interact --session $SESSION_ID --input '{ "action": "click", "ref": "@e3" }'

3. Re-snapshot after navigation

infsh app run agentic-browser --function snapshot --session $SESSION_ID --input '{}'

4. Close when done

infsh app run agentic-browser --function close --session $SESSION_ID --input '{}'
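Every step above repeats the same invocation with a different function, session, and input. A minimal sketch of a wrapper for that pattern follows; `browser_call` is a hypothetical helper name, not part of the CLI, and it assumes `infsh` is installed and logged in.

```shell
# Hypothetical wrapper around the invocation pattern used in the steps above.
browser_call() {
  # $1 = function (open, snapshot, interact, screenshot, execute, close)
  # $2 = session id, or "new" for the first call
  # $3 = JSON input; defaults to an empty object when omitted
  input=${3:-}
  [ -n "$input" ] || input='{}'
  infsh app run agentic-browser --function "$1" --session "$2" --input "$input"
}
```

For example, `browser_call snapshot "$SESSION_ID"` would be equivalent to the re-snapshot command in step 3.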

Functions

Function     Description
open         Navigate to URL, configure browser (viewport, proxy, video recording)
snapshot     Re-fetch page state with @e refs after DOM changes
interact     Perform actions using @e refs (click, fill, drag, upload, etc.)
screenshot   Take page screenshot (viewport or full page)
execute      Run JavaScript code on the page
close        Close session, returns video if recording was enabled

Interact Actions

Action     Description                    Required Fields
click      Click element                  ref
dblclick   Double-click element           ref
fill       Clear and type text            ref, text
type       Type text (no clear)           text
press      Press key (Enter, Tab, etc.)   text
select     Select dropdown option         ref, text
hover      Hover over element             ref
check      Check checkbox                 ref
uncheck    Uncheck checkbox               ref
drag       Drag and drop                  ref, target_ref
upload     Upload file(s)                 ref, file_paths
scroll     Scroll page                    direction (up/down/left/right), scroll_amount
back       Go back in history             (none)
wait       Wait milliseconds              wait_ms
goto       Navigate to URL                url
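The required fields listed above can be encoded as a lookup, which makes it easy to check an input before sending it. This is only a sketch: `required_fields` is a hypothetical helper name, and the mapping is transcribed from the list above rather than read from the app itself.

```shell
# Print the required input fields for an interact action, space-separated.
# Mapping transcribed from the Interact Actions list; not queried from the CLI.
required_fields() {
  case "$1" in
    click|dblclick|hover|check|uncheck) echo "ref" ;;
    fill|select)                        echo "ref text" ;;
    type|press)                         echo "text" ;;
    drag)                               echo "ref target_ref" ;;
    upload)                             echo "ref file_paths" ;;
    scroll)                             echo "direction scroll_amount" ;;
    wait)                               echo "wait_ms" ;;
    goto)                               echo "url" ;;
    back)                               echo "" ;;
    *) echo "unknown action: $1" >&2; return 1 ;;
  esac
}
```

Calling `required_fields drag` prints `ref target_ref`; an action not in the list makes the function return a non-zero status.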

Element Refs

Elements are returned with @e refs:

@e1 [a] "Home" href="/"
@e2 [input type="text"] placeholder="Search"
@e3 [button] "Submit"
@e4 [select] "Choose option"
@e5 [input type="checkbox"] name="agree"

Important: Refs are invalidated after navigation. Always re-snapshot after:

  • Clicking links/buttons that navigate

  • Form submissions

  • Dynamic content loading
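One way to avoid acting on stale refs is to pair every navigating click with an immediate snapshot. The sketch below assumes `infsh` is installed; `click_and_resnapshot` is a hypothetical helper name.

```shell
# Click an element, then return a fresh snapshot so the caller only ever
# sees refs that are valid for the page state after the click.
click_and_resnapshot() {
  # $1 = session id, $2 = element ref such as @e3
  infsh app run agentic-browser --function interact --session "$1" \
    --input "{\"action\": \"click\", \"ref\": \"$2\"}" >/dev/null || return 1
  infsh app run agentic-browser --function snapshot --session "$1" --input '{}'
}
```

The output of the click itself is discarded on purpose, so nothing from before the navigation is printed.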

Features

Video Recording

Record browser sessions for debugging or documentation:

Start with recording enabled (optionally show cursor indicator)

SESSION=$(infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com", "record_video": true, "show_cursor": true }' | jq -r '.session_id')

... perform actions ...

Close to get the video file

infsh app run agentic-browser --function close --session $SESSION --input '{}'

Returns: {"success": true, "video": <File>}

Cursor Indicator

Show a visible cursor in screenshots and video (useful for demos):

infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com", "show_cursor": true, "record_video": true }'

The cursor appears as a red dot that follows mouse movements and shows click feedback.

Proxy Support

Route traffic through a proxy server:

infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com", "proxy_url": "http://proxy.example.com:8080", "proxy_username": "user", "proxy_password": "pass" }'

File Upload

Upload files to file inputs:

infsh app run agentic-browser --function interact --session $SESSION --input '{ "action": "upload", "ref": "@e5", "file_paths": ["/path/to/file.pdf"] }'
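When uploading several files, the `file_paths` array can be built from shell arguments instead of being written by hand. A minimal sketch, assuming paths contain no newlines or control characters; `json_escape` and `upload_payload` are hypothetical helper names.

```shell
json_escape() {
  # Escapes backslashes and double quotes only.
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the interact input for an upload action.
upload_payload() {
  # $1 = element ref of the file input; remaining args = local file paths
  ref=$1
  shift
  paths=
  for p in "$@"; do
    # Fail early if a file is missing rather than sending a bad path.
    [ -f "$p" ] || { echo "missing file: $p" >&2; return 1; }
    paths="${paths:+$paths, }\"$(json_escape "$p")\""
  done
  printf '{"action": "upload", "ref": "%s", "file_paths": [%s]}\n' "$ref" "$paths"
}
```

The result can be passed directly as `--input "$(upload_payload @e5 /path/to/file.pdf)"`.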

Drag and Drop

Drag elements to targets:

infsh app run agentic-browser --function interact --session $SESSION --input '{ "action": "drag", "ref": "@e1", "target_ref": "@e2" }'

JavaScript Execution

Run custom JavaScript:

infsh app run agentic-browser --function execute --session $SESSION --input '{ "code": "document.querySelectorAll(\"h2\").length" }'

Returns: {"result": "5", "screenshot": <File>}
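JavaScript snippets often contain double quotes, which must be escaped before they are embedded in the JSON input. The sketch below builds the `code` payload from a raw snippet; `json_escape` and `execute_payload` are hypothetical helper names, and the escaping covers only backslashes and double quotes.

```shell
json_escape() {
  # Escapes backslashes first, then double quotes. Newlines and control
  # characters are not handled in this sketch.
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Wrap a JavaScript snippet in the input object expected by `execute`.
execute_payload() {
  # $1 = JavaScript source to run on the page
  printf '{"code": "%s"}\n' "$(json_escape "$1")"
}
```

With this in place the snippet can be written naturally in single quotes, for example `--input "$(execute_payload 'document.querySelectorAll("h2").length')"`.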

Deep-Dive Documentation

Reference                          Description
references/commands.md             Full function reference with all options
references/snapshot-refs.md        Ref lifecycle, invalidation rules, troubleshooting
references/session-management.md   Session persistence, parallel sessions
references/authentication.md       Login flows, OAuth, 2FA handling
references/video-recording.md      Recording workflows for debugging
references/proxy-support.md        Proxy configuration, geo-testing

Ready-to-Use Templates

Template                             Description
templates/form-automation.sh         Form filling with validation
templates/authenticated-session.sh   Login once, reuse session
templates/capture-workflow.sh        Content extraction with screenshots

Examples

Form Submission

SESSION=$(infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com/contact" }' | jq -r '.session_id')

Get elements: @e1 [input] "Name", @e2 [input] "Email", @e3 [textarea], @e4 [button] "Send"

infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e1", "text": "John Doe"}'
infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e2", "text": "john@example.com"}'
infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e3", "text": "Hello!"}'
infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "click", "ref": "@e4"}'

infsh app run agentic-browser --function snapshot --session $SESSION --input '{}'
infsh app run agentic-browser --function close --session $SESSION --input '{}'

Search and Extract

SESSION=$(infsh app run agentic-browser --function open --session new --input '{ "url": "https://google.com" }' | jq -r '.session_id')

infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e1", "text": "weather today"}'
infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "press", "text": "Enter"}'
infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "wait", "wait_ms": 2000}'

infsh app run agentic-browser --function snapshot --session $SESSION --input '{}'
infsh app run agentic-browser --function close --session $SESSION --input '{}'

Screenshot with Video

SESSION=$(infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com", "record_video": true }' | jq -r '.session_id')

Take full page screenshot

infsh app run agentic-browser --function screenshot --session $SESSION --input '{ "full_page": true }'

Close and get video

RESULT=$(infsh app run agentic-browser --function close --session $SESSION --input '{}')
echo "$RESULT" | jq '.video'

Sessions

Browser state persists within a session. Always:

  • Start with --session new on first call

  • Use returned session_id for subsequent calls

  • Close session when done
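The three rules above can be enforced by a small wrapper that opens a session, runs one command with the session id, and closes the session whether or not that command succeeds. This is a sketch under stated assumptions: `with_session` is a hypothetical helper, `infsh` and `jq` are installed, the open call returns JSON with a `session_id` field as in the examples, and the URL needs no JSON escaping.

```shell
# Open a session, run a command with the session id appended as its last
# argument, then close the session and propagate the command's exit status.
# The body runs in a subshell so its variables do not leak to the caller.
with_session() (
  url=$1
  shift
  result=$(infsh app run agentic-browser --function open --session new \
    --input "{\"url\": \"$url\"}") || exit 1
  session_id=$(printf '%s' "$result" | jq -r '.session_id')

  status=0
  "$@" "$session_id" || status=$?

  # Always close, even when the command above failed.
  infsh app run agentic-browser --function close --session "$session_id" --input '{}' >/dev/null
  exit "$status"
)
```

A caller would define its own step function that takes the session id as its argument and then run `with_session https://example.com my_steps`, where `my_steps` is a placeholder name.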

Related Skills

Web search (for research + browse)

npx skills add inference-sh/skills@web-search

LLM models (analyze extracted content)

npx skills add inference-sh/skills@llm-models

Documentation

  • inference.sh Sessions - Session management

  • Multi-function Apps - How functions work

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
