Use magicbrowse to reach a target page (search, form-filling,
multi-step navigation) when your own browser tooling cannot do it
reliably, then hand off to magicpay for any protected step. The
planner runs two LLM loops per task and is slower than direct browser
control; prefer your own tools when they suffice.
Setup Check
- Run `magicbrowse doctor` first on a fresh install. It verifies the shared MagicPay gateway config and reachability.
- If it fails, run `magicbrowse init <apiKey>` (sign up at https://agents.mercuryo.io/signup), or set `MAGICPAY_API_KEY` in the environment. Persisted config lives at `~/.magicpay/config.json`, shared with the `magicpay` skill.
- Only proceed to `launch` and `act` once `doctor` passes.
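
The setup gate above can be sketched as a small shell function. This is a minimal sketch, assuming `magicbrowse doctor` and `magicbrowse init` exit non-zero on failure (this document does not state their exit codes); `ensure_ready` is a hypothetical helper name, not part of the CLI.

```shell
# Hypothetical helper: gate a workflow on `magicbrowse doctor`.
# Assumption: `doctor` and `init` signal failure with a non-zero exit code.
ensure_ready() {
  api_key="${1:-}"

  # Happy path: config and gateway reachability already check out.
  magicbrowse doctor && return 0

  if [ -z "$api_key" ]; then
    echo "doctor failed: run 'magicbrowse init <apiKey>' or set MAGICPAY_API_KEY" >&2
    return 1
  fi

  # Persist the key, then verify again before any launch/act call.
  magicbrowse init "$api_key" || return 1
  magicbrowse doctor
}
```

Call `ensure_ready` with no argument to only check, or with an API key to attempt one repair before giving up.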
Hard Rules
Consequential actions require approval. `magicbrowse` may navigate, inspect, draft, and prepare. It must stop and ask before submitting a form, posting or sending content, accepting terms, changing account data or settings, booking, buying, ordering, deleting or modifying remote data, or otherwise committing an irreversible or account-affecting action. After approval, re-run `observe` and execute only the approved final action.
MagicPay boundary. Do not use `act`, `type`, `fill`, or `select` for any of the following on any page:
- login or signup credentials (email, username, password, OTP),
- identity-document fields (passport, ID, KYC address, DOB tied to identity),
- payment-card or banking fields (PAN, CVV, expiry, IBAN, account),
- any value sourced from a vault or secret store.

Stop at the form boundary and switch to the `magicpay` skill.
Target-ids are snapshot-scoped. They are valid only for the `observe` snapshot that produced them. Re-run `observe` after any click, type, navigation, popup, or lazy-load before the next primitive; reusing an old id silently addresses a different element.
- ✓ `observe` → `click 12` → `observe` → `type 7 "hello"`
- ✗ `observe` → `click 12` → `type 7 "hello"`
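
One way to make the snapshot rule hard to violate is a guard that refuses a primitive unless `observe` ran since the last one. This is a sketch under the assumption that primitives are invoked as `magicbrowse click …`, `magicbrowse type …`, and so on; `mb_observe`, `mb_primitive`, and `snapshot_fresh` are hypothetical names.

```shell
# Hypothetical guard: every primitive invalidates the snapshot, so the
# next primitive is refused until observe has run again.
snapshot_fresh=0

mb_observe() {
  # Run this in the current shell, not inside $(...), or the flag
  # update is lost in the subshell.
  magicbrowse observe && snapshot_fresh=1
}

mb_primitive() {
  if [ "$snapshot_fresh" -ne 1 ]; then
    echo "stale snapshot: run mb_observe before '$1'" >&2
    return 2
  fi
  # Any click, type, fill, select, or press may change the page.
  snapshot_fresh=0
  magicbrowse "$@"
}
```

With this guard, `mb_observe; mb_primitive click 12; mb_primitive type 7 "hello"` fails at the third call instead of silently addressing a different element. The target-ids themselves must still be read from the newest snapshot.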
One workflow per `MAGICBROWSE_HOME`. The current-session pointer at `$MAGICBROWSE_HOME/current-session.json` (default `~/.magicbrowse/`) is a singleton. Concurrent workflows on the same home overwrite each other. Set a distinct `MAGICBROWSE_HOME` per workflow for parallel use.
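
For parallel use, each workflow can be given its own home directory. The following is a minimal sketch; `with_isolated_home` is a hypothetical helper, and the temporary-directory location is an arbitrary choice rather than anything the CLI prescribes.

```shell
# Hypothetical helper: run one command (typically a whole workflow
# script) with its own MAGICBROWSE_HOME so session pointers never collide.
with_isolated_home() {
  home_dir="$(mktemp -d "${TMPDIR:-/tmp}/magicbrowse-home.XXXXXX")" || return 1
  (
    # The subshell keeps the override from leaking into the caller.
    export MAGICBROWSE_HOME="$home_dir"
    "$@"
  )
}
```

Wrap the entire workflow, not individual commands, since each call creates a new home: for example `with_isolated_home sh ./workflow-a.sh &` followed by `with_isolated_home sh ./workflow-b.sh &`, where both script names are placeholders.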
Fresh browser by default. Prefer an owned, fresh browser session. Use `attach`, `--profile`, or `--user-data-dir` only when the user explicitly approves that browser/session for the current task. Keep CDP endpoints private. Close the session before unrelated work.
Page context can leave the browser. LLM-backed `act` sends page state to the MagicPay gateway; `--use-vision` can include screenshots. Avoid private pages unless the user approves that workflow, and stop at protected forms.
Primary Workflow
Contract: `launch [url]` → `act` … `act` → `close`. Sequential `act` calls in
one session preserve page state and planner memory.
1. `magicbrowse launch <url>`: start an owned Chrome session pre-placed at the entry URL. `--headful` opts out of headless. To attach to an existing CDP browser instead, first get explicit user approval for that endpoint/session: `magicbrowse attach <cdp-url-or-ws-endpoint>` (positional, not a `--cdp-url` flag).
2. `magicbrowse act "<goal>"`: natural-language browser step. The prompt is positional. `act` does not take `--url`; you cannot reset the page from inside `act`. To re-anchor, `close` and `launch` again.
3. Repeat `act` for the next strategic granule.
4. `magicbrowse close`: release the session when done.
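
The contract above can be sketched as a linear wrapper that always releases the session. This sketch assumes `act` exits non-zero when it does not complete, which this document does not confirm; `run_workflow` is a hypothetical name. A real orchestrator would make a strategic decision between `act` calls rather than run a fixed list.

```shell
# Hypothetical wrapper: launch once, run goals in order in the same
# session, and close even if a goal fails.
# Assumption: `magicbrowse act` exits non-zero when it does not complete.
run_workflow() {
  url="$1"
  shift

  magicbrowse launch "$url" || return 1

  rc=0
  for goal in "$@"; do
    # Sequential act calls share page state and planner memory.
    magicbrowse act "$goal" || { rc=$?; break; }
  done

  # Release the session whether or not every goal succeeded.
  magicbrowse close
  return "$rc"
}
```

Each goal is passed positionally, and the starting URL goes to `launch` rather than into a goal.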
`magicbrowse run` exists in the CLI for one-shot developer use. It
is not part of this skill contract: its bundled close destroys
continuity. Do not use it in an orchestrated workflow.
Fallback Ladder
Try in order. Do not start at layer 4 just because primitives exist.
1. Your own browser tooling (Computer Use, native browser tools).
2. `magicbrowse act "<goal>"`: DOM-only navigator.
3. `magicbrowse act "<goal>" --use-vision`: same goal, navigator with screenshots. Use only when the user is comfortable sending screenshots/page context for this workflow. Vision is a retry mode for the same task; keep the granule.
4. `magicbrowse observe` + primitives: `click <target-id>`, `type <target-id> <text>`, `fill <target-id> <value>`, `select <target-id> <option-text>`, `press <keys>`. Use only when vision-mode `act` cannot make progress, or when single-element precision is required. `press` is global; `click` first if focus matters.
5. Surface failure to the user.
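
Layers 2 and 3 of the ladder can be sketched as a retry that keeps the same goal and only adds vision when the user has approved it. As before, the exit-code behavior of `act` is an assumption, and `act_with_fallback` is a hypothetical name.

```shell
# Hypothetical helper covering ladder layers 2 and 3.
# Assumption: `magicbrowse act` exits non-zero when it cannot make progress.
act_with_fallback() {
  goal="$1"
  vision_approved="${2:-no}"

  # Layer 2: DOM-only navigator.
  magicbrowse act "$goal" && return 0

  # Layer 3: same goal with screenshots, only with user approval.
  if [ "$vision_approved" = "yes" ]; then
    magicbrowse act "$goal" --use-vision && return 0
  fi

  # The caller decides whether to drop to observe + primitives
  # or surface the failure to the user.
  echo "act could not complete: $goal" >&2
  return 1
}
```

A non-zero return is the signal to move down the ladder; it does not by itself justify starting at layer 4 next time.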
Goal Granularity
- Granule = atomic strategic segment. End each `act` where the orchestrator needs the next strategic decision. Tactics (which form field first) live inside `act`; strategy (this partner is wrong, try another) lives between `act` calls.
- Target horizon: 15-30 navigator steps per `act`; smaller is safer. `maxSteps: 100` is a safety ceiling. The planner self-validates `done=true`, so longer tasks have more room for false-positive completion. Prefer smaller granules when the success criterion cannot be checked externally.
- Auth walls and captcha are hard boundaries, not obstacles. A task that plans through auth ends with `status: completed` and a `finalMessage` asking for login, not `failed`. Plan tasks to end at auth, not through it.
- Rely on session memory; do not re-narrate. Sequential `act` calls in one session preserve page state and planner memory. Do not write "as we already found, continue with…" into goals; if you feel the need to, the granularity is wrong.
Goal Formulation
- No element indexes or selectors in goal text. Indexes renumber on every DOM scan. Describe elements semantically.
  - ✗ `act "click target 14"`
  - ✓ `act "click the 'Continue' button under the price summary"`
- Describe the expected terminal state where it adds a checkable criterion.
  - ✗ `act "get to checkout"`
  - ✓ `act "navigate to a checkout page that shows passenger fields and total fare"`
- Pass the starting URL to `launch`, not as a separate step. To switch sites mid-workflow, either `close` and re-launch, or describe the navigation inside the goal text.
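
The first rule above lends itself to a cheap pre-flight check on goal text. The sketch below is a heuristic only: it catches bracketed indexes and `target N` references, not CSS or XPath selectors, and `lint_goal` is a hypothetical helper rather than a CLI feature.

```shell
# Hypothetical pre-flight check: reject goals that mention element
# indexes such as "[14]" or "target 7". Selectors are not detected.
lint_goal() {
  goal="$1"
  if printf '%s\n' "$goal" | grep -Eiq '\[[0-9]+\]|target[ -]?[0-9]+'; then
    echo "goal references an element index; describe the element semantically" >&2
    return 1
  fi
  return 0
}
```

For example, `lint_goal "click target 14"` fails, while `lint_goal "click the 'Continue' button under the price summary"` passes.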
Common Mistakes
- Element indexes (`[14]`, `target 7`) in goal text.
- `magicbrowse run` for orchestrated multi-step workflows.
- `type`/`fill`/`select`/`act` on protected fields instead of switching to `magicpay`.
- Letting `act` submit, post, book, buy, save, delete, or otherwise commit an account-affecting action without explicit approval.
- Attaching to a logged-in browser or named profile without explicit approval for the current task.
- Re-narrating prior `act` results into the next goal; sequential `act` calls keep state.
- Starting at layer 4 (`observe` + primitives) without trying `act`.
- Reusing a target-id from before a click, navigation, or popup.
Status and Errors
`act` returns `status: completed | failed | max_steps | cancelled`.
`completed` does not always mean task success: auth walls and captcha
return `completed` with a `finalMessage` asking for human action.
Parse `finalMessage` for the actual outcome. See
references/statuses.md.
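
The distinction between `completed` and actual success can be sketched as a small classifier. How `status` and `finalMessage` are extracted from the `act` output is not specified here, so the function takes both as plain arguments. The keyword list is an illustrative guess, not the parsing rules from references/statuses.md, and `classify_outcome` and its output labels are hypothetical.

```shell
# Hypothetical classifier: map (status, finalMessage) to an outcome label.
# The keyword match is a rough heuristic for auth walls and captcha.
classify_outcome() {
  status="$1"
  message="$2"
  case "$status" in
    completed)
      if printf '%s\n' "$message" |
        grep -Eiq '(^|[^a-z])(log ?in|sign ?in|captcha)'; then
        echo needs_human
      else
        echo done
      fi
      ;;
    failed | max_steps | cancelled)
      echo "$status"
      ;;
    *)
      echo unknown
      ;;
  esac
}
```

An orchestrator would treat `needs_human` as a hand-off point rather than as success, and should still check `done` results against an external criterion where one exists.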
References
- references/commands.md — every CLI command.
- references/workflow.md — worked end-to-end example.
- references/guardrails.md — long-form hard rules.
- references/statuses.md — outcome codes and finalMessage parsing.