magicbrowse

Use magicbrowse to reach a target page (search, form-filling, multi-step navigation) when your own browser tooling cannot do it reliably, then hand off to magicpay for any protected step. The planner runs two LLM loops per task and is slower than direct browser control, so prefer your own tools when they suffice.

Setup Check

Run magicbrowse doctor first on a fresh install. It verifies the shared MagicPay gateway config and reachability.
If it fails, run magicbrowse init <apiKey> (sign up at https://agents.mercuryo.io/signup), or set MAGICPAY_API_KEY in the environment. Persisted config lives at ~/.magicpay/config.json, shared with the magicpay skill.
Only proceed to launch and act once doctor passes.
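The gate above can be sketched as a small wrapper that refuses to continue until doctor passes. This is a minimal sketch, assuming that magicbrowse doctor signals failure through a non-zero exit code (the text above does not state this); ensure_ready and run_cli are hypothetical helper names.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run_cli(argv: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(argv)).returncode


def ensure_ready(run: Runner = run_cli) -> bool:
    """Return True only when `magicbrowse doctor` reports success.

    Assumption: doctor exits with 0 on success and non-zero on failure.
    """
    return run(["magicbrowse", "doctor"]) == 0
```

An orchestrator would call ensure_ready() once per fresh install and only then move on to launch. The injectable runner keeps the check testable without the CLI installed.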

Hard Rules

Consequential actions require approval. magicbrowse may navigate, inspect, draft, and prepare. It must stop and ask before submitting a form, posting or sending content, accepting terms, changing account data or settings, booking, buying, ordering, deleting or modifying remote data, or otherwise committing an irreversible or account-affecting action. After approval, re-run observe and execute only the approved final action.

MagicPay boundary. Do not use act, type, fill, or select for any of the following on any page:

- login or signup credentials (email, username, password, OTP),
- identity-document fields (passport, ID, KYC address, DOB tied to identity),
- payment-card or banking fields (PAN, CVV, expiry, IBAN, account),
- any value sourced from a vault or secret store.

Stop at the form boundary and switch to the magicpay skill.
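An orchestrator can back this rule with a coarse pre-check on field labels before calling any primitive. The sketch below is illustrative only: the keyword list is a hypothetical heuristic derived from the categories above, not part of magicbrowse, and it deliberately over-blocks rather than under-blocks. It cannot detect values sourced from a vault, which the caller has to track separately.

```python
import re

# Illustrative keywords drawn from the protected categories; not exhaustive.
PROTECTED_PATTERNS = [
    r"e-?mail", r"user\s*name", r"password", r"\botp\b",
    r"passport", r"\bid\b", r"\bkyc\b", r"date of birth", r"\bdob\b",
    r"card", r"\bpan\b", r"\bcvv\b", r"expiry", r"\biban\b", r"account",
]
_PROTECTED = re.compile("|".join(PROTECTED_PATTERNS), re.IGNORECASE)


def is_protected_field(label: str) -> bool:
    """Return True when a field label looks like a magicpay-only field."""
    return _PROTECTED.search(label) is not None
```

When is_protected_field returns True, the safe response is to stop and switch to the magicpay skill rather than to retry with a different primitive.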

Target-ids are snapshot-scoped. Valid only for the observe snapshot that produced them. Re-run observe after any click, type, navigation, popup, or lazy-load before the next primitive — reusing an old id silently addresses a different element.

✓ observe → click 12 → observe → type 7 "hello"

✗ observe → click 12 → type 7 "hello"
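One way to enforce the observe-before-primitive rule on the orchestrator side is a tiny state tracker. This is a sketch under the assumption that every primitive invalidates the current snapshot; SnapshotGuard is a hypothetical helper, not something magicbrowse provides.

```python
class SnapshotGuard:
    """Tracks whether target-ids from the last observe are still usable."""

    def __init__(self) -> None:
        self._fresh = False

    def observed(self) -> None:
        """Call after each successful `magicbrowse observe`."""
        self._fresh = True

    def use(self, primitive: str) -> None:
        """Call before each primitive; raises if the snapshot is stale."""
        if not self._fresh:
            raise RuntimeError(f"re-run observe before {primitive}")
        # Any primitive may change the page, so the snapshot is consumed.
        self._fresh = False
```

With this guard, the ✗ sequence above fails fast at the second primitive instead of silently addressing a different element.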

One workflow per MAGICBROWSE_HOME. The current-session pointer at $MAGICBROWSE_HOME/current-session.json (default ~/.magicbrowse/) is a singleton. Concurrent workflows on the same home overwrite each other. Set a distinct MAGICBROWSE_HOME per workflow for parallel use.
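For parallel workflows, each one can be given its own home directory through the environment. The sketch below is one possible arrangement: the directory layout under the system temp directory and the workflow_env name are assumptions chosen for illustration. Only the MAGICBROWSE_HOME variable itself comes from the rule above.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict, Optional


def workflow_env(workflow_id: str, base_dir: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of the environment with a private MAGICBROWSE_HOME."""
    if base_dir is not None:
        root = Path(base_dir)
    else:
        # Hypothetical default location; pick whatever suits the deployment.
        root = Path(tempfile.gettempdir()) / "magicbrowse-homes"
    home = root / workflow_id
    home.mkdir(parents=True, exist_ok=True)
    env = dict(os.environ)
    env["MAGICBROWSE_HOME"] = str(home)
    return env
```

The returned mapping can be passed as env= to subprocess.run for every magicbrowse call belonging to that workflow, so the session pointers never collide.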

Fresh browser by default. Prefer an owned, fresh browser session. Use attach, --profile, or --user-data-dir only when the user explicitly approves that browser/session for the current task. Keep CDP endpoints private. Close the session before unrelated work.

Page context can leave the browser. LLM-backed act sends page state to the MagicPay gateway; --use-vision can include screenshots. Avoid private pages unless the user approves that workflow, and stop at protected forms.

Primary Workflow

Contract: launch [url] → act … act → close. Sequential act calls in one session preserve page state and planner memory.

1. magicbrowse launch <url>: start an owned Chrome session pre-placed at the entry URL. --headful opts out of headless. To attach to an existing CDP browser instead, first get explicit user approval for that endpoint/session: magicbrowse attach <cdp-url-or-ws-endpoint> (positional, not a --cdp-url flag).
2. magicbrowse act "<goal>": natural-language browser step. Prompt is positional. act does not take --url; you cannot reset the page from inside act. To re-anchor, close and launch again.
3. Repeat act for the next strategic granule.
4. magicbrowse close: release the session when done.
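The contract can be sketched as a function that builds the command sequence for one session. This only assembles argument lists from the commands named above and does not execute anything; plan_commands is a hypothetical helper, and the URL and goals in the usage note are placeholders.

```python
from typing import List


def plan_commands(url: str, goals: List[str]) -> List[List[str]]:
    """Build argv lists for launch -> act ... act -> close."""
    if not goals:
        raise ValueError("at least one act goal is required")
    commands = [["magicbrowse", "launch", url]]
    # The goal is passed positionally; act takes no --url.
    commands += [["magicbrowse", "act", goal] for goal in goals]
    commands.append(["magicbrowse", "close"])
    return commands
```

For example, plan_commands("https://example.com", ["search for flights to Lisbon", "open the cheapest result"]) yields one launch, two act calls, and one close. In practice the orchestrator inspects each act result before deciding on the next goal rather than fixing all goals up front.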

magicbrowse run exists in the CLI for one-shot developer use. It is not part of this skill contract — its bundled close destroys continuity. Do not use it in an orchestrated workflow.

Fallback Ladder

Try in order. Do not start at layer 4 just because primitives exist.

1. Your own browser tooling (Computer Use, native browser tools).
2. magicbrowse act "<goal>": DOM-only navigator.
3. magicbrowse act "<goal>" --use-vision: same goal, navigator with screenshots. Use only when the user is comfortable sending screenshots/page context for this workflow. Vision is a retry mode for the same task; keep the granule.
4. magicbrowse observe + primitives: click <target-id>, type <target-id> <text>, fill <target-id> <value>, select <target-id> <option-text>, press <keys>. Use only when vision-mode act cannot make progress, or when single-element precision is required. press is global, so click first if focus matters.
5. Surface failure to the user.
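The ordering discipline of the ladder can be expressed as a loop over attempt functions. This is a sketch only: each layer is a caller-supplied callable that returns True on success, and what each one does internally (including checking that the user has agreed to vision mode) is left to the caller. run_ladder is a hypothetical name.

```python
from typing import Callable, List, Optional, Tuple

Layer = Tuple[str, Callable[[], bool]]


def run_ladder(layers: List[Layer]) -> Optional[str]:
    """Try each layer in order; return the name of the first that succeeds."""
    for name, attempt in layers:
        if attempt():
            return name
    # Nothing worked: the caller should surface the failure to the user.
    return None
```

Because the loop stops at the first success, later layers such as observe + primitives are reached only after the earlier ones have actually been tried.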

Goal Granularity

Granule = atomic strategic segment. End each act where the orchestrator needs the next strategic decision. Tactics (which form field first) live inside act; strategy (this partner is wrong, try another) lives between act calls.
Target horizon: 15-30 navigator steps per act; smaller is safer. maxSteps: 100 is a safety ceiling. The planner self-validates done=true, so longer tasks have more room for false-positive completion. Prefer smaller granules when the success criterion cannot be checked externally.
Auth walls and captcha are hard boundaries, not obstacles. A task that plans through auth does not end with status: failed; it ends with status: completed and a finalMessage asking for login. Plan tasks to end at auth, not through it.
Rely on session memory; do not re-narrate. Sequential act calls in one session preserve page state and planner memory. Do not write "as we already found, continue with…" into goals — if you feel the need to, the granularity is wrong.

Goal Formulation

No element indexes or selectors in goal text. Indexes renumber on every DOM scan. Describe elements semantically.
- ✗ act "click target 14"
- ✓ act "click the 'Continue' button under the price summary"
Describe the expected terminal state where it adds a checkable criterion.
- ✗ act "get to checkout"
- ✓ act "navigate to a checkout page that shows passenger fields and total fare"
Pass the starting URL to launch, not as a separate step. To switch sites mid-workflow, either close and re-launch, or describe the navigation inside the goal text.
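The first rule lends itself to a quick lint before a goal is sent to act. The sketch below flags index-style references such as [14] or target 7 and simple CSS-style selectors. The patterns are illustrative assumptions rather than a complete grammar, and lint_goal is a hypothetical helper.

```python
import re
from typing import List

# Matches "[14]", "target 7", "index 3", "element #12" and similar.
_INDEX_REF = re.compile(r"\[\d+\]|\b(?:target|index|element)\s*#?\d+\b", re.IGNORECASE)
# Matches simple CSS-style selectors such as "#submit-btn" or ".btn-primary".
_SELECTOR = re.compile(r"(?:^|\s)[#.][A-Za-z_][\w-]*")


def lint_goal(goal: str) -> List[str]:
    """Return a list of problems found in the goal text (empty if none)."""
    problems = []
    if _INDEX_REF.search(goal):
        problems.append("goal contains an element index")
    if _SELECTOR.search(goal):
        problems.append("goal contains a selector")
    return problems
```

A goal that passes the lint can still be vague, so this check complements the terminal-state guidance above and does not replace it.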

Common Mistakes

- Element indexes ([14], target 7) in goal text.
- magicbrowse run for orchestrated multi-step workflows.
- type / fill / select / act on protected fields instead of switching to magicpay.
- Letting act submit, post, book, buy, save, delete, or otherwise commit an account-affecting action without explicit approval.
- Attaching to a logged-in browser or named profile without explicit approval for the current task.
- Re-narrating prior act results into the next goal. Sequential act calls keep state.
- Starting at layer 4 (observe + primitives) without trying act.
- Reusing a target-id from before a click, navigation, or popup.

Status and Errors

act returns status: completed | failed | max_steps | cancelled. completed does not always mean task success — auth walls and captcha return completed with a finalMessage asking for human action. Parse finalMessage for the actual outcome. See references/statuses.md.
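The distinction between a completed status and actual task success can be sketched as a small classifier. Everything beyond the four status values is an assumption here: the sketch takes status and finalMessage as already-extracted strings (the output format of act is not described above), and the hint keywords are a hypothetical heuristic. references/statuses.md remains the authority on parsing.

```python
from dataclasses import dataclass

# Hypothetical phrases that suggest the planner stopped for a human.
HUMAN_ACTION_HINTS = ("log in", "login", "sign in", "captcha", "verification")


@dataclass
class Outcome:
    succeeded: bool
    needs_human: bool
    reason: str


def interpret(status: str, final_message: str) -> Outcome:
    """Classify an act result without trusting `completed` on its own."""
    if status != "completed":
        # failed, max_steps, and cancelled are never task success.
        return Outcome(False, False, status)
    text = final_message.lower()
    if any(hint in text for hint in HUMAN_ACTION_HINTS):
        return Outcome(False, True, "human action requested")
    return Outcome(True, False, "completed")
```

An orchestrator would treat needs_human as the signal to hand control back to the user or to the magicpay skill, and would still verify success externally where a checkable criterion exists.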

References

references/commands.md — every CLI command.
references/workflow.md — worked end-to-end example.
references/guardrails.md — long-form hard rules.
references/statuses.md — outcome codes and finalMessage parsing.
