When to Use
The user needs the agent to control the real Safari browser on macOS, not a generic headless browser. Use this skill when the task depends on the user's actual Safari session, open tabs, cookies, login state, AppleScript automation, safaridriver, or Safari-only rendering.
Choose this skill when the next step is to read a page, open or switch tabs, run JavaScript in Safari, click or type through AppleScript, capture Safari screenshots, or launch an isolated Safari WebDriver session. If the real requirement is generic browser automation after that, hand off to playwright.
Architecture
Memory lives in ~/safari/. If ~/safari/ does not exist, run setup.md. See memory-template.md for structure and starter files.
~/safari/
|-- memory.md # Activation defaults, preferred control mode, and guardrails
|-- permissions.md # Automation, Develop menu, and screenshot preflight state
|-- sessions.md # Real-session vs WebDriver session notes and target tabs
|-- snippets.md # Known-good AppleScript and shell patterns worth reusing
|-- recipes.md # High-value task recipes such as read, click, fill, capture
`-- incidents.md # Permission failures, blocked JS, and repeat breakages
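The existence check described above can be sketched as follows. This is a minimal illustration, assuming the six file names shown in the tree; the helper name `missing_memory_files` is hypothetical and not part of the skill.

```python
from pathlib import Path
from typing import List

# File names taken from the tree above.
EXPECTED_FILES = (
    "memory.md",
    "permissions.md",
    "sessions.md",
    "snippets.md",
    "recipes.md",
    "incidents.md",
)


def missing_memory_files(root: Path) -> List[str]:
    """Return the expected memory files that are absent under root.

    A missing root directory reports every file as missing, which is the
    signal to go through setup.md.
    """
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_memory_files(Path.home() / "safari")` returns an empty list only when every starter file is in place.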
Quick Reference
Load only the smallest file needed for the current blocker.
| Topic | File |
|---|---|
| Setup guide | setup.md |
| Memory template | memory-template.md |
| Permissions and preflight checks | preflight-and-permissions.md |
| AppleScript control and real-session commands | applescript-control.md |
| safaridriver, WebDriver, and BiDi usage | webdriver-and-bidi.md |
| Screenshot feedback loop and visual verification | screenshot-and-visual-loop.md |
| Failure patterns and recovery order | troubleshooting.md |
Requirements
- macOS with Safari installed and `osascript`, `safaridriver`, and `screencapture` available.
- Explicit approval before enabling Safari remote automation, enabling JavaScript-from-automation prompts, granting Apple Events or Screen Recording access, or controlling the daily browsing profile.
- Treat the user's open tabs, cookies, login state, clipboard, downloads, and any visible page content as sensitive session data.
If direct Safari access is unavailable, stay in planning mode and prepare commands or scripts instead of pretending the browser is controllable.
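One way to decide between live control and planning mode is to check that the three command-line tools are on `PATH`. The sketch below assumes nothing beyond the tool names listed above; `find_missing_tools` is a hypothetical helper, and the `which` parameter exists only so the lookup can be replaced in tests.

```python
import shutil
from typing import Callable, List, Optional

# Tool names taken from the requirements above.
REQUIRED_TOOLS = ("osascript", "safaridriver", "screencapture")


def find_missing_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the required command-line tools that cannot be found on PATH."""
    return [tool for tool in REQUIRED_TOOLS if which(tool) is None]
```

A non-empty result is a reason to stay in planning mode. An empty result only shows that the binaries exist; it says nothing about permissions.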
Control Modes
This skill covers two distinct control surfaces:
- AppleScript mode for the user's real Safari session: current tabs, real cookies, logged-in pages, JavaScript execution, and screenshots of what the user actually has open.
- WebDriver mode for isolated automation: `safaridriver`, standard WebDriver clients, and Safari-specific validation without touching the user's daily tabs unless they approve it.
Do not blur the two. Real-session control and isolated automation have different risk, visibility, and verification rules.
Data Storage
Keep only durable Safari operating context in ~/safari/:
- approved control mode defaults and whether daily-session control is allowed
- permission state for Apple Events, Screen Recording, Develop menu, and remote automation
- known-good snippets, task recipes, and incidents worth reusing
- recurring no-go actions such as "never control banking tabs" or "never type blindly"
Core Rules
1. Choose the Control Surface Before Sending Any Command
- Decide first whether the task needs the real Safari session or an isolated WebDriver session.
- Use AppleScript mode when the user needs their actual logged-in browser state.
- Use `safaridriver` mode when the task should avoid their daily tabs or needs cleaner automation boundaries.
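The decision in this rule can be made explicit in code before any command is built. The following is a sketch only: `ControlMode` and `choose_mode` are hypothetical names, and the two boolean inputs are assumed to come from the task description and the stored approval state.

```python
from enum import Enum


class ControlMode(Enum):
    APPLESCRIPT = "applescript"  # the user's real Safari session
    WEBDRIVER = "webdriver"      # an isolated safaridriver session


def choose_mode(needs_real_session: bool, daily_session_approved: bool) -> ControlMode:
    """Pick a control surface before any command is sent.

    Raises PermissionError when the task needs the real session but the
    user has not approved control of it.
    """
    if not needs_real_session:
        return ControlMode.WEBDRIVER
    if not daily_session_approved:
        raise PermissionError("real Safari session requested without approval")
    return ControlMode.APPLESCRIPT
```

Raising instead of silently falling back to WebDriver keeps the two modes from blurring: a task that needs login state would not behave the same way in an isolated session.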
2. Run a Preflight Before Touching the Browser
- Verify Safari is installed, the expected permissions are present, and the target window or session exists.
- For AppleScript, confirm Safari responds to a simple read command before trying clicks or typing.
- For WebDriver, confirm `safaridriver --enable` has been run and the local driver starts cleanly.
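A simple read command for the AppleScript preflight could look like the sketch below. It assumes Safari's scripting dictionary exposes `URL of front document`, and that `osascript` prints `missing value` for a tab with no URL; `run_command` and `applescript_preflight` are hypothetical helper names. Note that addressing Safari this way is expected to launch it if it is not running, so the approval rules still apply.

```python
import subprocess
from typing import Callable, Sequence, Tuple

# Read-only probe: asks Safari for the URL of its front document.
PROBE = ["osascript", "-e", 'tell application "Safari" to get URL of front document']


def run_command(argv: Sequence[str]) -> Tuple[int, str]:
    """Run argv and return (exit code, stripped stdout)."""
    done = subprocess.run(argv, capture_output=True, text=True, timeout=10)
    return done.returncode, done.stdout.strip()


def applescript_preflight(
    run: Callable[[Sequence[str]], Tuple[int, str]] = run_command,
) -> bool:
    """True only if Safari answered the probe with a usable URL."""
    code, output = run(PROBE)
    # Assumption: an empty tab is reported as the text "missing value".
    return code == 0 and output not in ("", "missing value")
```

A `False` result covers several distinct failures (no window, denied Apple Events permission, timeout), so the next step is to inspect the error and not to retry with clicks or typing.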
3. Read and Verify Before You Click or Type
- Start with title, URL, DOM text, or screenshot checks so the target surface is explicit.
- After every navigation or action, re-read or re-screenshot the page before assuming success.
- A shell command returning zero is not proof that Safari is showing the expected state.
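The read, act, re-read pattern can be wrapped in one small function so that success is judged only by what Safari shows afterwards. This is an illustrative sketch; `act_and_verify` and its three callables are hypothetical, and `read_state` stands in for whatever title, URL, or DOM read the task uses.

```python
from typing import Callable, Tuple


def act_and_verify(
    read_state: Callable[[], str],
    action: Callable[[], None],
    expected: Callable[[str], bool],
) -> Tuple[bool, str, str]:
    """Read state, act, re-read, and judge success by the second read alone.

    Returns (ok, state_before, state_after) so the caller can log both reads.
    """
    before = read_state()
    action()
    after = read_state()
    return expected(after), before, after
```

The return value of `action` is deliberately ignored, which mirrors the rule that a zero exit status is not evidence of the expected page state.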
4. Treat the Real Safari Session as High-Trust State
- The real Safari window may contain logged-in accounts, active drafts, and sensitive tabs.
- Ask before activating Safari, switching windows, typing, clicking, copying page data, or capturing screenshots that may reveal personal content.
- Prefer a dedicated Safari profile or an isolated WebDriver session for risky or repetitive automation.
5. Use AppleScript for Real State, WebDriver for Clean State
- `osascript` can inspect and control the user's live Safari windows and tabs.
- `safaridriver` gives a cleaner automation boundary but does not automatically inherit the user's existing Safari tab state.
- Do not promise real-session continuity from WebDriver mode unless you verified that exact setup.
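For the clean-state side, a session is opened through the standard W3C WebDriver "New Session" call. The sketch below only builds the request and does not send it. It assumes a driver was started locally with something like `safaridriver -p 4444`; the port and the helper name `new_session_request` are illustrative.

```python
import json
import urllib.request


def new_session_request(port: int) -> urllib.request.Request:
    """Build (but do not send) a W3C WebDriver New Session request for Safari."""
    body = json.dumps(
        {"capabilities": {"alwaysMatch": {"browserName": "safari"}}}
    ).encode("utf-8")
    return urllib.request.Request(
        f"http://localhost:{port}/session",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would start the session. Nothing in the request refers to an existing tab, which is why continuity with the user's open tabs should not be assumed from this mode.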
6. Never Type Blindly Into Safari
- Before keystrokes, focus the correct tab and confirm the intended element or page state.
- Prefer DOM-based input with verification over raw keystrokes when possible.
- If Safari focus is uncertain, stop and recover state before sending more input.
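DOM-based input with verification might be sketched as below: the script sets a field's value through JavaScript and returns what the field holds afterwards, so the caller can compare it with the intended text. This assumes Safari's `do JavaScript` AppleScript command is permitted (the Develop menu setting for JavaScript from Apple Events); `applescript_quote` and `fill_field_script` are hypothetical helpers, and the `NOT_FOUND` marker is an arbitrary convention of this example.

```python
import json


def applescript_quote(text: str) -> str:
    """Quote text as an AppleScript string literal (escapes \\ and ")."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def fill_field_script(selector: str, value: str) -> str:
    """Build AppleScript that sets one field via the DOM and returns its value.

    The page-side code returns 'NOT_FOUND' when the selector matches nothing.
    """
    js = (
        "(function(){"
        # json.dumps yields a valid JavaScript string literal.
        f"var el=document.querySelector({json.dumps(selector)});"
        "if(!el){return 'NOT_FOUND';}"
        f"el.value={json.dumps(value)};"
        "el.dispatchEvent(new Event('input',{bubbles:true}));"
        "return el.value;"
        "})()"
    )
    return (
        'tell application "Safari" to do JavaScript '
        f"{applescript_quote(js)} in front document"
    )
```

The generated script would be passed to `osascript -e`. The write counts as successful only if the returned text equals the intended value; a `NOT_FOUND` result means the page should be re-read before trying another selector.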
7. Route Adjacent Problems to the Right Skill
- Use this skill for Safari control, Safari session inspection, and Safari-specific WebDriver setup.
- Hand off deep AppleScript design to `applescript`, macOS permission debugging to `macos`, generic browser scripting to `playwright`, and credential or WebAuthn logic to `passkey`.
Safari Traps
- Treating `safaridriver` as if it automatically controls the tabs the user already has open -> session assumptions go wrong fast.
- Typing through `System Events` without verifying Safari focus -> input lands in the wrong field or wrong app.
- Clicking by guessed selectors without reading the page first -> automation hits the wrong element or stale DOM.
- Capturing screenshots without warning when the real session contains sensitive tabs -> privacy failure.
- Enabling automation on the daily profile and leaving it there -> unnecessary exposure and flaky state.
- Assuming a command succeeded because Safari opened -> always re-read or re-screenshot the target page.
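The focus trap above can be guarded with a check that runs immediately before any keystroke is sent. The sketch assumes `System Events` reports the frontmost process under the name `Safari` and that the query itself has been granted Automation permission; `frontmost_app` and `guard_keystrokes` are hypothetical names.

```python
import subprocess
from typing import Callable, Sequence, Tuple

# Asks System Events which application process is currently frontmost.
FRONTMOST_QUERY = [
    "osascript",
    "-e",
    'tell application "System Events" to get name of '
    "first application process whose frontmost is true",
]


def run_command(argv: Sequence[str]) -> Tuple[int, str]:
    """Run argv and return (exit code, stripped stdout)."""
    done = subprocess.run(argv, capture_output=True, text=True, timeout=10)
    return done.returncode, done.stdout.strip()


def frontmost_app(run: Callable[[Sequence[str]], Tuple[int, str]] = run_command) -> str:
    """Return the name of the frontmost application process."""
    code, output = run(FRONTMOST_QUERY)
    if code != 0:
        raise RuntimeError("could not query the frontmost application")
    return output


def guard_keystrokes(
    run: Callable[[Sequence[str]], Tuple[int, str]] = run_command,
    expected_app: str = "Safari",
) -> None:
    """Raise RuntimeError unless expected_app is frontmost right now."""
    current = frontmost_app(run)
    if current != expected_app:
        raise RuntimeError(f"refusing to type: frontmost app is {current!r}")
```

This check only narrows the window for misdirected input. It does not confirm which tab or field has focus, so it complements a page read and does not replace one.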
External Endpoints
This skill makes no external requests on its own.
| Endpoint | Data Sent | Purpose |
|---|---|---|
| None | None | N/A |
No data is sent externally.
Security & Privacy
Data that leaves your machine:
- None from this skill itself
Data that stays local:
- optional Safari control notes in `~/safari/`
- permission state, approved recipes, and incident notes approved by the user
This skill does NOT:
- send undeclared network requests
- recommend bypassing paywalls, anti-fraud controls, or account protections
- claim browser control without verifying permissions and target state
- store passwords, raw history exports, or full browsing archives in its own memory files
Scope
This skill ONLY:
- helps control Safari safely through AppleScript, screenshots, and `safaridriver`
- structures real-session and isolated-session workflows into reversible steps
- keeps durable notes for approved modes, snippets, and recurring incidents
This skill NEVER:
- acts as a generic search-engine skill
- claims live browser state it cannot verify
- stores secrets, credentials, or full browsing history in its own memory files
- modifies its own skill files
Related Skills
Install with `clawhub install <slug>` if the user confirms:
- `applescript` - Write safer AppleScript when Safari control moves beyond known-good snippets.
- `macos` - Handle Apple Events, Screen Recording, app focus, and native system diagnostics on Mac.
- `playwright` - Automate browser flows once the task no longer depends on the user's real Safari session.
- `ios` - Bridge Safari-adjacent workflows when the target moves to iPhone or iPad.
- `passkey` - Diagnose Safari sign-in and WebAuthn behavior without guessing at credential rules.
Feedback
- If useful: `clawhub star safari`
- Stay updated: `clawhub sync`