Skill: weavgui Desktop Automation

Installation

uv tool install weavgui

Verify:

weavgui --version

macOS requirement: mouse and keystroke automation requires Accessibility permission.
Grant it at: System Settings > Privacy & Security > Accessibility

Auto-Screenshot Behavior

Every action command automatically captures a screenshot to screenshot.png in the current working directory after a short delay. You never need to pass output paths or screenshot flags.

| Command | Auto-screenshot delay |
| --- | --- |
| `screenshot` | immediate |
| `mouse move`, `mouse moveto` | 500 ms |
| `mouse click`, `doubleclick`, `rightclick` | 2 s |
| `keystroke` | 1 s |

After each command, read screenshot.png as an image to observe the current state of the screen.
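The delays in the table can be captured in a small lookup, for example to decide how long to wait before reading screenshot.png. This is a minimal sketch: `screenshot_delay` is a hypothetical helper, not part of weavgui, and it only restates the documented values in seconds.

```shell
# Hypothetical helper (not a weavgui command): print the documented
# auto-screenshot delay, in seconds, for a given weavgui subcommand.
screenshot_delay() {
  case "$1" in
    screenshot) echo 0 ;;
    "mouse move"|"mouse moveto") echo 0.5 ;;
    "mouse click"|"mouse doubleclick"|"mouse rightclick") echo 2 ;;
    keystroke) echo 1 ;;
    *) echo "unknown command: $1" >&2; return 1 ;;
  esac
}
```

For instance, `screenshot_delay "mouse click"` prints `2`.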

Coordinate System

All mouse and screenshot commands share the same coordinate space:

Origin (0, 0) is the top-left of the primary display
x increases to the right, y increases downward
Coordinates are logical pixels (on macOS Retina, screenshots are auto-downscaled to match)

Core Commands

Screenshot

weavgui screenshot

Always saves to screenshot.png in the current working directory. Always draws cursor markers:

Red crosshair lines
Red small box (100×100 px, radius 50)
Green medium box (200×200 px, radius 100)
Blue large box (600×600 px, radius 300)

The three concentric boxes are positioning references — use them to gauge how far to move the mouse next:

| Target location | Delta range |
| --- | --- |
| Inside red box | Fine: `±50 px` |
| Between red and green | Medium: `±50–100 px` |
| Between green and blue | Coarse: `±100–300 px` |
| Outside blue box | Large move, estimate from full screenshot |
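Because the boxes are squares centered on the cursor, the zone a target falls in is decided by the larger of |dx| and |dy|. The sketch below classifies an estimated offset accordingly. `delta_zone` is a hypothetical helper, and treating a target exactly on a box edge as inside that box is an assumption.

```shell
# Hypothetical helper: classify an offset (dx, dy) from the cursor to the
# target by the reference box it falls in. Integer pixel offsets expected.
delta_zone() {
  dx=$1 dy=$2
  ax=${dx#-} ay=${dy#-}                  # absolute values (strip a leading minus)
  if [ "$ax" -gt "$ay" ]; then m=$ax; else m=$ay; fi
  if   [ "$m" -le 50 ];  then echo fine    # inside the red box
  elif [ "$m" -le 100 ]; then echo medium  # between red and green
  elif [ "$m" -le 300 ]; then echo coarse  # between green and blue
  else echo large                          # outside the blue box
  fi
}
```

For example, `delta_zone 220 210` prints `coarse`, so a first move of that size should be followed by a verification pass.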

The command also prints the current mouse coordinates and display bounds to stdout.

Mouse Move

weavgui mouse move '(dx,dy)'

Moves the mouse by a relative delta. The argument uses (dx,dy) format — negative values work naturally. Fails if the target would leave the display bounds. Prints the start position, end position, and display bounds to stdout. Automatically saves a screenshot to screenshot.png after a 500 ms delay.
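Since a relative move fails when it would leave the display, it can help to check the delta first against the position and bounds printed by the previous command. The sketch below assumes a single display with origin (0, 0) and valid coordinates 0..w-1 and 0..h-1; whether weavgui treats the far edge as inclusive is not stated here, and `move_in_bounds` is a hypothetical helper.

```shell
# Hypothetical pre-check: succeed if moving by (dx, dy) from (x, y) stays
# inside a w x h display. Arguments: x y dx dy w h (all integers).
# Assumption: valid coordinates are 0..w-1 and 0..h-1.
move_in_bounds() {
  nx=$(( $1 + $3 )) ny=$(( $2 + $4 ))
  [ "$nx" -ge 0 ] && [ "$nx" -lt "$5" ] && [ "$ny" -ge 0 ] && [ "$ny" -lt "$6" ]
}

# Example use with an assumed 1440x900 display:
#   move_in_bounds 500 400 220 210 1440 900 && weavgui mouse move '(220,210)'
```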

Mouse Move To

weavgui mouse moveto '(x,y)'

Moves the mouse to an absolute position. Fails if the position is outside the display bounds. Automatically saves a screenshot to screenshot.png after a 500 ms delay.

Mouse Click

weavgui mouse click         # left click
weavgui mouse doubleclick   # double left click
weavgui mouse rightclick    # right click

All clicks happen at the current cursor position. A screenshot is automatically saved to screenshot.png after a 2 s delay to allow the UI to settle.

Keystroke

weavgui keystroke <keys>

Examples: c, ctrl+c, command+c, shift+a, command+z

A screenshot is automatically saved to screenshot.png after a 1 s delay.

Pasteboard

weavgui pasteboard write <text...>   # write to clipboard
weavgui pasteboard read              # read from clipboard

Critical Workflow: Precise Mouse Positioning

Never guess a target coordinate and click immediately.

mouse move accepts only relative deltas; mouse moveto accepts absolute coordinates, but you still need to know the target pixel position. In both cases the reliable approach is an iterative positioning loop:

screenshot → analyze image → move mouse → (auto-screenshot) → analyze image → move mouse → ... → click

Step-by-step

1. Take a screenshot and read the image into context:
```
weavgui screenshot
```
Then read screenshot.png as an image attachment.
2. Analyze the screenshot: Identify the target UI element. Read the cursor marker position from the stdout output (printed automatically). Use the three reference boxes to gauge your delta:
- Target inside the red box (radius 50) → fine delta, within ±50 px
- Target inside the green box (radius 100) → medium delta, within ±100 px
- Target inside the blue box (radius 300) → coarse delta, within ±300 px
- Target outside the blue box → large move, estimate from the full screenshot
3. Move the mouse:
```
weavgui mouse move '(dx,dy)'
```
A screenshot is saved to screenshot.png automatically after 500 ms. Read it immediately.
4. Verify position: Check that the crosshair (red lines) is now centered on the target. If not, repeat from step 2 with a corrected delta.
5. Click only when the cursor is confirmed on-target:
```
weavgui mouse click
```
A screenshot is saved automatically after 2 s. Read it to confirm the action took effect.

Why this loop matters

Even with mouse moveto, you need the exact target pixel coordinate — which requires a screenshot to determine.
Screen content, window positions, and scroll state can all shift between steps.
Even a single iteration of screenshot → analyze → move can land the cursor accurately.
For high-precision targets (small buttons, text fields), two or three iterations are typical.

Example: clicking a button labeled "Submit"

# Step 1: initial screenshot
weavgui screenshot
# → read screenshot.png, observe crosshair at (500, 400), Submit button at approx (720, 610)
# → estimate dx=220, dy=210

# Step 2: move toward target (auto-screenshots after 500 ms)
weavgui mouse move '(220,210)'
# → read screenshot.png, crosshair now at (720, 608) — close enough

# Step 3: click (auto-screenshots after 2 s)
weavgui mouse click
# → read screenshot.png to confirm the click took effect
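
The arithmetic in this example can be sketched as two small helpers: one builds the `'(dx,dy)'` argument from the cursor and target positions, the other decides whether the cursor is "close enough". Both names are hypothetical, and the tolerance is an assumption to tune per target size, not a weavgui setting.

```shell
# Hypothetical helper: print the (dx,dy) argument that moves the cursor
# from (cx, cy) to (tx, ty). Arguments: cx cy tx ty.
delta_arg() {
  printf '(%d,%d)\n' "$(( $3 - $1 ))" "$(( $4 - $2 ))"
}

# Hypothetical helper: succeed if the cursor is within tol pixels of the
# target on both axes. Arguments: cx cy tx ty tol.
on_target() {
  ex=$(( $3 - $1 )) ey=$(( $4 - $2 ))
  ex=${ex#-} ey=${ey#-}                  # absolute errors
  [ "$ex" -le "$5" ] && [ "$ey" -le "$5" ]
}
```

With the numbers above, `delta_arg 500 400 720 610` prints `(220,210)`, and `on_target 720 608 720 610 5` succeeds because the cursor is 2 px off vertically.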

Delegate to a Subagent

The iterative positioning loop loads multiple screenshots into context, which can consume significant tokens. When possible, launch a subagent (Task tool) to perform the entire positioning-and-click sequence, keeping the main conversation context clean.

How to delegate

Use the Task tool with a prompt that describes:

The target element (e.g. "the Submit button in the bottom-right of the dialog")
The action to perform once positioned (e.g. mouse click, mouse doubleclick)
Any follow-up actions (e.g. type text, press a key)

Example prompt for the Task tool:

Use the weavgui CLI to click the "Submit" button visible on screen.

Workflow:
1. Run `weavgui screenshot`, then read screenshot.png as an image.
2. Identify the "Submit" button. Read the crosshair position from stdout.
3. Estimate (dx, dy) from the crosshair to the button center, run `weavgui mouse move '(dx,dy)'`.
4. Read screenshot.png (auto-captured after 500 ms), verify the crosshair is on the button. Adjust if needed.
5. Run `weavgui mouse click`.
6. Read screenshot.png (auto-captured after 2 s) to confirm the click took effect.

Return a summary of what happened and the final mouse position.

Benefits

Saves main context: screenshots stay inside the subagent and are discarded when it finishes.
Isolation: if the loop takes many iterations, the main conversation is unaffected.
Composability: you can launch multiple subagents in sequence (e.g. one to click a field, another to type text) without accumulating images.

When NOT to delegate

If you only need a single screenshot for analysis (no mouse interaction), just run the command directly.
If the task is a single click and you are already confident about the position, run it directly as well.

Text Input Workflow

To type text into a focused field:

1. Click the target field (using the positioning loop above)
2. Optionally select all existing text: weavgui keystroke command+a
3. Write new text to the pasteboard: weavgui pasteboard write <your text>
4. Paste: weavgui keystroke command+v

weavgui mouse click
weavgui keystroke command+a
weavgui pasteboard write Hello World
weavgui keystroke command+v
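
The keyboard part of this sequence can be wrapped in one function so the text is passed through intact. This is a sketch under assumptions: `paste_text` is a hypothetical wrapper, the field is already focused by a prior click, and the `&&` chain only helps if weavgui exits non-zero on failure, which is not confirmed here.

```shell
# Hypothetical wrapper: replace the focused field's contents with the
# given text. Assumes the field was already clicked and weavgui is on PATH.
paste_text() {
  weavgui keystroke command+a &&       # select existing text
  weavgui pasteboard write "$@" &&     # put the new text on the pasteboard
  weavgui keystroke command+v          # paste over the selection
}

# Example: paste_text Hello World
```

Each keystroke still triggers its own auto-screenshot, so screenshot.png reflects the state after the final paste.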

Tips

Every action command auto-captures screenshot.png — always read it after each command to observe the result.
Always prefer the iterative screenshot loop over single-shot coordinate estimation.
After any keyboard shortcut that changes screen state (e.g. command+z, return), a screenshot is captured automatically after 1 s; read screenshot.png before proceeding.
The stdout output of every command includes the current mouse position — use this as a precise anchor for the next delta calculation.
