Unbrowse — Drop-in Browser Replacement for Agents

Browse once, cache the APIs, reuse them instantly. First call discovers and learns the site's APIs (~20-80s). Every subsequent call uses cached skills (<200ms for server-fetch, ~2s for sites requiring browser execution).

IMPORTANT: Always use the CLI (bun src/cli.ts ). NEVER pipe output to node -e , python -c , or jq — this causes shell escaping failures. Use --path , --extract , and --limit flags instead.

Server Startup

cd ~/.agents/skills/unbrowse && bun src/cli.ts health

If not running, the CLI auto-starts the server. First time requires ToS acceptance — ask the user:

Unbrowse needs you to accept its Terms of Service:

Discovered API structures may be shared in the collective registry
You will not use Unbrowse to attack, overload, or abuse any target site Full terms: https://unbrowse.ai/terms

After consent, the CLI handles startup automatically. First run also needs the browser engine:

cd ~/.agents/skills/unbrowse && npx agent-browser install

Core Workflow

Step 1: Resolve an intent

cd ~/.agents/skills/unbrowse && bun src/cli.ts resolve
--intent "get feed posts"
--url "https://www.linkedin.com/feed/"
--pretty

This returns available_endpoints — a ranked list of discovered API endpoints. Pick the right one by URL pattern (e.g., MainFeed for feed, HomeTimeline for tweets).

Step 2: Execute with extraction

Use --extract to get the fields you need. For well-known domains, use the known extraction patterns from the Examples section — don't wait for auto-extraction to guess.

cd ~/.agents/skills/unbrowse && bun src/cli.ts execute
--skill {skill_id}
--endpoint {endpoint_id}
--path "data.events[]"
--extract "name,url,start_at,price"
--limit 10 --pretty

See full schema without data

cd ~/.agents/skills/unbrowse && bun src/cli.ts execute
--skill {skill_id}
--endpoint {endpoint_id}
--schema --pretty

Get raw unprocessed response

cd ~/.agents/skills/unbrowse && bun src/cli.ts execute
--skill {skill_id}
--endpoint {endpoint_id}
--raw --pretty

--path

--extract
--limit replace ALL piping to jq/node/python.

Auto-extraction caveat: The CLI may auto-extract on first try, but for normalized APIs (LinkedIn Voyager, Facebook Graph) with mixed-type included[] arrays, auto-extraction often picks up the wrong fields. Always validate auto-extracted results — if you see mostly nulls or just metadata, ignore it and extract manually with known field patterns.

Step 3: Present results to the user

Show the user their data first. Do not block on feedback before returning information.

Step 4: Submit feedback (MANDATORY — but after presenting results)

Submit feedback after you've shown the user their results. This can run in parallel with your response.

cd ~/.agents/skills/unbrowse && bun src/cli.ts feedback
--skill {skill_id}
--endpoint {endpoint_id}
--rating 5
--outcome success

Rating: 5=right+fast, 4=right+slow(>5s), 3=incomplete, 2=wrong endpoint, 1=useless.

CLI Flags

Auto-generated from src/cli.ts CLI_REFERENCE — do not edit manually. Run bun scripts/sync-skill-md.ts to sync.

Commands

Command Usage Description

health

Server health check

resolve

--intent "..." --url "..." [opts]

Resolve intent → search/capture/execute

execute

--skill ID --endpoint ID [opts]

Execute a specific endpoint

feedback

--skill ID --endpoint ID --rating N

Submit feedback (mandatory after resolve)

--url "..."

Interactive browser login

skills

List all skills

skill

<id>

Get skill details

--intent "..." [--domain "..."]

Search marketplace

sessions

--domain "..." [--limit N]

Debug session logs

Global flags

Flag Description

--pretty

Indented JSON output

--no-auto-start

Don't auto-start server

--raw

Return raw response data (skip server-side projection)

resolve/execute flags

Flag Description

--schema

Show response schema + extraction hints only (no data)

--path "data.items[]"

Drill into result before extract/output

--extract "field1,alias:deep.path.to.val"

Pick specific fields (no piping needed)

--limit N

Cap array output to N items

--endpoint-id ID

Pick a specific endpoint

--dry-run

Preview mutations

--force-capture

Bypass caches, re-capture

--params '{...}'

Extra params as JSON

When --path /--extract are used, trace metadata is slimmed automatically (1MB raw -> 1.5KB output typical).

When NO extraction flags are used on a large response (>2KB), the CLI auto-wraps the result with extraction_hints instead of dumping raw data. This prevents context window bloat and tells you exactly how to extract. Use --raw to override this and get the full response.

Examples

Step 1: resolve — auto-executes and returns hints for complex responses

bun src/cli.ts resolve --intent "get events" --url "https://lu.ma" --pretty

Response includes extraction_hints.cli_args = "--path "data.events[]" --extract "name,url,start_at,city" --limit 10"

Step 2: use the hints directly

bun src/cli.ts execute --skill {id} --endpoint {id}
--path "data.events[]" --extract "name,url,start_at,city" --limit 10 --pretty

If you need to see the schema first

bun src/cli.ts execute --skill {id} --endpoint {id} --schema --pretty

X timeline — extract tweets with user, text, likes

bun src/cli.ts execute --skill {id} --endpoint {id}
--path "data.home.home_timeline_urt.instructions[].entries[].content.itemContent.tweet_results.result"
--extract "user:core.user_results.result.legacy.screen_name,text:legacy.full_text,likes:legacy.favorite_count"
--limit 20 --pretty

LinkedIn feed — extract posts from included[] (chained URN resolution)

bun src/cli.ts execute --skill {id} --endpoint {id}
--path "included[]"
--extract "author:actor.name.text,text:commentary.text.text,likes:socialDetail.totalSocialActivityCounts.numLikes,comments:socialDetail.totalSocialActivityCounts.numComments"
--limit 20 --pretty

Simple case — just limit results

bun src/cli.ts execute --skill {id} --endpoint {id} --limit 10 --pretty

Best Practices

Minimize round-trips — one CLI call, not five curl + jq pipes

Bad (5 steps):

curl ... /v1/intent/resolve | jq .skill.skill_id # Step 1: resolve curl ... /v1/skills/{id}/execute | jq . # Step 2: execute curl ... | jq '.result.included[]' # Step 3: drill in curl ... | jq 'select(.commentary)' # Step 4: filter curl ... | jq '{author, text, likes}' # Step 5: extract

Good (1 step):

bun src/cli.ts execute --skill {id} --endpoint {id}
--path "included[]"
--extract "text:commentary.text.text,author:actor.title.text,likes:numLikes,comments:numComments"
--limit 10 --pretty

Know the endpoint ID before executing

On first resolve for a domain, you'll get available_endpoints . Scan descriptions and URLs to pick the right one — don't blindly execute the top-ranked result.

Common patterns:

LinkedIn feed: look for voyagerFeedDashMainFeed in the URL
Twitter timeline: look for HomeTimeline in the URL
Luma events: look for /home/get-events in the URL
Notifications: look for /notifications/list in the URL

Once you know the endpoint ID, pass it with --endpoint on every subsequent call.

Domain skills have many endpoints — use search or description matching

After domain convergence, a single skill (e.g. linkedin.com ) may have 40+ endpoints. Don't scroll through all of them — filter by intent:

Search finds the best endpoint by embedding similarity

bun src/cli.ts search --intent "get my notifications" --domain "www.linkedin.com"

Or filter available_endpoints by URL/description pattern in the resolve response.

Mixed-type arrays and normalized APIs

Many APIs return heterogeneous arrays — posts, profiles, media, and metadata objects all mixed together (e.g. included[] , data[] , entries[] ). When you --extract fields, rows where all extracted fields are null are automatically dropped, so only objects that match your field selection survive. You don't need to filter by type.

Some APIs (LinkedIn Voyager, Facebook Graph) use normalized entity references — objects reference each other via *fieldName URN keys instead of nesting data inline. The CLI auto-resolves these chains when entityUrn -keyed arrays are detected:

Direct field: commentary.text.text → walks into nested object

URN chain: socialDetail.totalSocialActivityCounts.numLikes

→ socialDetail is inline, but totalSocialActivityCounts is a *URN reference

→ CLI resolves *totalSocialActivityCounts → looks up entity by URN → gets .numLikes

You don't need to know if a field is inline or URN-referenced — just use the dot path and the CLI resolves it automatically. If a field doesn't resolve, check --schema output for *fieldName patterns indicating URN references.

Large responses — trust extraction_hints

When a response is >2KB and no --path /--extract is given, the CLI returns extraction_hints instead of dumping raw JSON. Read extraction_hints.cli_args and paste it directly:

Response says: extraction_hints.cli_args = "--path "entries[]" --extract "name,start_at,url" --limit 10"

bun src/cli.ts execute --skill {id} --endpoint {id}
--path "entries[]" --extract "name,start_at,url" --limit 10 --pretty

Why the CLI over curl + jq

The CLI handles things that break with raw curl:

Shell escaping — zsh escapes != to != which breaks jq filters
URN resolution — chained entity references resolved automatically across normalized arrays
Null-row filtering — mixed-type arrays filtered to only objects matching your --extract fields
Auto-extraction — large responses wrapped with hints instead of dumping 500KB of JSON
Auth injection — cookies loaded from vault automatically
Server auto-start — boots the server if not running

Authentication

Automatic. Unbrowse extracts cookies from your Chrome/Firefox SQLite database — if you're logged into a site in Chrome, it just works.

If auth_required is returned:

cd ~/.agents/skills/unbrowse && bun src/cli.ts login --url "https://example.com/login"

User completes login in the browser window. Cookies are stored and reused automatically.

Other Commands

bun src/cli.ts skills # List all skills bun src/cli.ts skill {id} # Get skill details bun src/cli.ts search --intent "..." --domain "..." # Search marketplace bun src/cli.ts sessions --domain "linkedin.com" # Debug session logs bun src/cli.ts health # Server health check

Mutations

Always --dry-run first, ask user before --confirm-unsafe :

bun src/cli.ts execute --skill {id} --endpoint {id} --dry-run bun src/cli.ts execute --skill {id} --endpoint {id} --confirm-unsafe

REST API Reference

For cases where the CLI doesn't cover your needs, the raw REST API is at http://localhost:6969 :

Method Endpoint Description

POST /v1/intent/resolve

Resolve intent -> search/capture/execute

POST /v1/skills/:id/execute

Execute a specific skill

POST /v1/auth/login

Interactive browser login

POST /v1/feedback

Submit feedback with diagnostics

POST /v1/search

Search marketplace globally

POST /v1/search/domain

Search marketplace by domain

GET /v1/skills/:id

Get skill details

GET /v1/sessions/:domain

Debug session logs

GET /health

Health check

Rules

Always use the CLI — never pipe to node -e , python -c , or jq . Use --path /--extract /--limit instead.
Always try resolve first — it handles the full marketplace search -> capture pipeline
Don't blindly trust auto-extraction — for normalized APIs (LinkedIn, Facebook) auto-extraction often grabs wrong fields from mixed-type arrays. If you know the domain's extraction pattern (see Examples), use --extract directly. If auto-extraction fires, validate the result — mostly-null rows mean it picked the wrong fields.
NEVER guess paths by trial-and-error — use --schema to see the full response structure, or read _auto_extracted.all_fields / extraction_hints.schema_tree
Use --raw if you need the unprocessed full response
Check the result — if wrong endpoint, pick from available_endpoints and re-execute with --endpoint
If auth_required , use login then retry
Always --dry-run before mutations
Always submit feedback — but after presenting results to the user, not before
Report bugs and issues on GitHub — when something breaks, is slow, or behaves unexpectedly, file an issue:

gh issue create --repo unbrowse-ai/unbrowse
--title "bug: {short description}"
--body "## What happened\n{description}\n\n## Expected\n{what should have happened}\n\n## Context\n- Skill: {skill_id}\n- Endpoint: {endpoint_id}\n- Domain: {domain}\n- Error: {error message or status code}"

Categories: bug: (broken/wrong data), perf: (slow), auth: (login/cookie issues), feat: (missing capability)

unbrowse

Safety Notice

Copy this and send it to your AI assistant to learn

See full schema without data

Get raw unprocessed response

Step 1: resolve — auto-executes and returns hints for complex responses

Response includes extraction_hints.cli_args = "--path "data.events[]" --extract "name,url,start_at,city" --limit 10"

Step 2: use the hints directly

If you need to see the schema first

X timeline — extract tweets with user, text, likes

LinkedIn feed — extract posts from included[] (chained URN resolution)

Simple case — just limit results

Search finds the best endpoint by embedding similarity

Direct field: commentary.text.text → walks into nested object

URN chain: socialDetail.totalSocialActivityCounts.numLikes

→ socialDetail is inline, but totalSocialActivityCounts is a *URN reference

→ CLI resolves *totalSocialActivityCounts → looks up entity by URN → gets .numLikes

Response says: extraction_hints.cli_args = "--path "entries[]" --extract "name,start_at,url" --limit 10"

Source Transparency

Related Skills

unbrowse

foundry

unbrowse

Open-broker