Windows Desktop Automation with winguictl
Scripts
The skill includes a standalone CLI script:
- `scripts\winguictl.py`: Python CLI entry point (Windows only)
Quick start
# List windows (with window state, foreground flag, and hierarchical indentation)
python scripts\winguictl.py window list
# Control window state
python scripts\winguictl.py window --window-id <id> focus
python scripts\winguictl.py window --window-id <id> minimize
python scripts\winguictl.py window --window-id <id> maximize
python scripts\winguictl.py window --window-id <id> restore
python scripts\winguictl.py window --window-id <id> move --x 100 --y 200
python scripts\winguictl.py window --window-id <id> resize --width 800 --height 600
# Take a screenshot and save to file
python scripts\winguictl.py screenshot --window-id <id> --output artifacts\shot.png
# Get window structure snapshots
python scripts\winguictl.py snapshot --window-id <id> hwnd
python scripts\winguictl.py snapshot --window-id <id> uia
python scripts\winguictl.py snapshot --window-id <id> ocr
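When driving the CLI from another Python program, the commands above can be assembled as argument lists instead of shell strings. The following is a minimal sketch, not part of the skill itself: `build_argv` and `run` are hypothetical helper names, and the only assumption about the CLI is the argument order shown in the Quick start (`--window-id` directly after the command group). The CLI's output format is not documented here, so the sketch returns the raw process result without parsing it.

```python
import subprocess
import sys

# Script path as shown in the Quick start above.
SCRIPT = r"scripts\winguictl.py"


def build_argv(group, *rest, window_id=None):
    """Build an argument list for one CLI call.

    `group` is the command group (window, screenshot, snapshot).
    `--window-id` is placed right after the group, mirroring the
    Quick start examples; `rest` holds the remaining arguments.
    """
    argv = [sys.executable, SCRIPT, group]
    if window_id is not None:
        argv += ["--window-id", str(window_id)]
    argv += [str(part) for part in rest]
    return argv


def run(argv):
    """Run the CLI and return the CompletedProcess (hypothetical helper).

    Output is captured as text; the caller decides how to interpret it.
    """
    return subprocess.run(argv, capture_output=True, text=True, check=False)
```

For example, `build_argv("window", "focus", window_id=42)` yields the same arguments as the `focus` line above, where `42` is a placeholder id rather than a real window id.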
Commands
For detailed command documentation, see:
- Window - List all windows, control window state and position
- Snapshot - Get window structure snapshots
- Find - Find elements in a window
- Action - Execute interaction operations
- Control - Directly control specific controls (Win32 and UIA)
- Screenshot - Capture window screenshots
- Driver Test - Driver test steps
Workflow
- List windows and identify the correct target. `window list` shows parent-child relationships through indentation.
- Prefer exact window ids over fuzzy titles when possible.
- Use `window focus` to bring the target window to the foreground before interacting.
- Use `window minimize/maximize/restore/close/move/resize` to control window state before interacting.
- Use `snapshot hwnd/uia/ocr` to inspect window structure when locators are not obvious.
- Prefer HWND and UIA locators over OCR and image matching: structured identifiers (`hwnd`, `automation_id`, `runtime_id`) are more reliable and deterministic than pixel-based approaches.
- For UIA controls, run `snapshot uia` first to get the element's `automation_id` or `runtime_id`, then use `uia-control` commands to interact.
- For Win32 controls, run `snapshot hwnd` to get the control's `hwnd`, then use `control` commands to interact.
- Use `find ocr` only for rendered text that is not exposed through UIA or window text.
- Use `find image`/`click-image` only for iconography, canvas content, or custom-painted controls where no structured locator exists.
- Use relative window coordinates only when neither structured locators nor image matching is available.
- Capture screenshots before or after important steps.
- Return structured results, artifact paths, and any follow-up risk.
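The locator preference in the workflow above can be summarized as three tiers: structured locators first, pixel-based matching second, raw coordinates last. The sketch below is only an illustration of that ordering; `TIER` and `best_locators` are hypothetical names, and the tier labels are shorthand rather than CLI arguments.

```python
# Lower tier number means more preferred, following the workflow above.
# hwnd and uia share a tier because the workflow does not rank one over the other.
TIER = {
    "hwnd": 0,         # structured: Win32 handle
    "uia": 0,          # structured: automation_id / runtime_id
    "ocr": 1,          # pixel-based: rendered text
    "image": 1,        # pixel-based: template matching
    "coordinates": 2,  # last resort: window-relative coordinates
}


def best_locators(available):
    """Return the locator kinds from the most preferred tier that is available.

    Unknown kinds are ignored; the result is sorted for stable output.
    """
    known = {kind for kind in available if kind in TIER}
    if not known:
        return []
    best = min(TIER[kind] for kind in known)
    return sorted(kind for kind in known if TIER[kind] == best)
```

If a control is reachable through UIA, OCR, and coordinates, `best_locators` returns only `["uia"]`; OCR and image matching surface together when no structured locator exists, since they target different kinds of content.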
Operating Rules
- Coordinates are relative to the window unless the tool explicitly says otherwise.
- Use `--dry-run` when you need to preview coordinates or confirm intent.
- Report the exact window title and `window_id` you acted on.
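To make the coordinate rule concrete, the sketch below converts a window-relative point to screen coordinates. It assumes that "relative to the window" means an offset from the top-left corner of the window rectangle, and that the rectangle is given as `(left, top, right, bottom)` in screen pixels, which is the shape returned by `win32gui.GetWindowRect`. Whether the tool measures from the window frame or the client area is not stated here, so treat this as an illustration only; `to_screen` is a hypothetical helper.

```python
def to_screen(window_rect, rel_x, rel_y):
    """Convert a window-relative point to screen coordinates.

    window_rect: (left, top, right, bottom) in screen pixels.
    Raises ValueError if the point falls outside the window.
    """
    left, top, right, bottom = window_rect
    width, height = right - left, bottom - top
    if not (0 <= rel_x < width and 0 <= rel_y < height):
        raise ValueError(
            f"point ({rel_x}, {rel_y}) is outside a {width}x{height} window"
        )
    return left + rel_x, top + rel_y
```

For a window whose rectangle is `(100, 200, 900, 800)`, the relative point `(10, 20)` maps to the screen point `(110, 220)`. Rejecting out-of-bounds points early serves the same purpose as a dry run: it catches a wrong target before any click is sent.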
Dependencies
| Package | Install | Required | Description |
|---|---|---|---|
| Python 3.14+ | — | Yes | Runtime |
| pywinauto | pip install pywinauto | Yes | Windows GUI automation (core dependency) |
| pywin32 | pip install pywin32 | Yes | Win32 API Pythonic wrapper (win32gui/win32api/win32con/win32ui) |
| Pillow | pip install Pillow | Yes | Image processing |
| wx-ocr | pip install wx-ocr | No | Self-contained WeChat OCR, no external dependencies |
| opencv-python | pip install opencv-python | No | Image template matching |
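A quick preflight check can report which packages from the table are missing before any automation runs. This is a sketch under stated assumptions: the import names (`win32gui` for pywin32, `PIL` for Pillow, `cv2` for opencv-python) are the usual ones for those packages, wx-ocr is left out because its import name is not given here, and `missing` and `python_ok` are hypothetical helpers.

```python
import importlib.util
import sys

# Package name (as installed with pip) -> top-level module used to detect it.
REQUIRED = {"pywinauto": "pywinauto", "pywin32": "win32gui", "Pillow": "PIL"}
OPTIONAL = {"opencv-python": "cv2"}


def missing(packages, find_spec=importlib.util.find_spec):
    """Return the sorted package names whose module cannot be found.

    `find_spec` is injectable so the check can be exercised without
    the packages installed.
    """
    return sorted(
        pkg for pkg, module in packages.items() if find_spec(module) is None
    )


def python_ok(version=None, minimum=(3, 14)):
    """Check the interpreter against the minimum version listed in the table."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum
```

A caller might stop when `missing(REQUIRED)` is non-empty, and only warn when `missing(OPTIONAL)` reports packages, since image matching and OCR are optional features.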
Safety Boundary
- Use this skill for automation of the user's own software, test environments, or explicitly authorized systems.
- Do not use this skill to bypass third-party anti-bot checks, CAPTCHAs, or unrelated security controls.