chrome-ai-action-skill

Browser automation via Chrome AI Action (CAA) bridge. Control Chrome programmatically — navigate, click, type, screenshot, extract content, and more. Uses Puppeteer (CDP) mode. First use auto-installs npm package and starts the bridge. Chrome is auto-launched if not running.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "chrome-ai-action-skill" with this command: npx skills add jami-lin/chromeskill

Chrome AI Action — Browser Automation Skill

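A browser automation skill for AI agents. Through the Chrome AI Action (CAA) bridge service, it controls Chrome programmatically in Puppeteer (CDP) mode and supports 60+ operations, including navigation, clicking, typing, screenshots, content extraction, network interception, cookie management, and PDF export.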


When to Use

| Scenario | Invoke this skill? |
| --- | --- |
| User asks to browse a web page, search, fill forms, or extract data | Yes |
| User needs screenshots of a web page | Yes |
| User wants to automate browser interactions | Yes |
| User asks about writing code or debugging (no browser involved) | No |

⚠️ CRITICAL: Chinese URL Encoding

IMPORTANT: When constructing URLs with Chinese characters for the navigate action, the agent MUST encode the query string values using encodeURIComponent. The bridge automatically encodes non-ASCII characters in the URL path, but query string values must be pre-encoded by the caller.

For example, the query parameter wd=妻子的浪漫旅行 must be sent as wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C

Correct

{"action": "navigate", "params": {"url": "https://www.baidu.com/s?wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C"}}

Wrong

{"action": "navigate", "params": {"url": "https://www.baidu.com/s?wd=妻子的浪漫旅行"}}

How to encode in Node.js

const encoded = encodeURIComponent('妻子的浪漫旅行');
// Result: %E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C
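
The same encoding can be applied when the URL is assembled programmatically. A minimal sketch, assuming a hypothetical helper named buildNavigateAction (not part of the skill) that percent-encodes every query key and value before producing the navigate action:

```javascript
// Hypothetical helper, not part of chrome-ai-action: builds a navigate action
// whose query string keys and values are percent-encoded with encodeURIComponent.
function buildNavigateAction(baseUrl, query) {
  const pairs = Object.entries(query).map(
    ([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`
  );
  const url = pairs.length > 0 ? `${baseUrl}?${pairs.join('&')}` : baseUrl;
  return { action: 'navigate', params: { url } };
}

const action = buildNavigateAction('https://www.baidu.com/s', { wd: '妻子的浪漫旅行' });
console.log(action.params.url);
// https://www.baidu.com/s?wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C
```

Encoding each value separately keeps characters such as & and = inside a value from being read as query delimiters.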

Prerequisites

| Requirement | Check | Auto-resolve |
| --- | --- | --- |
| Chrome / Chromium installed | Detected automatically (common install paths) | No (user must install) |
| Chrome running with CDP | Detected on startup | Yes (auto-launched) |
| Node.js 18+ | node --version | No |

Startup Protocol

When loaded for the first time, the agent MUST run the startup script below. The script starts the bridge as a background child process, so the agent does NOT need to manage the process separately.

node <skill_dir>/scripts/startup.js

What it does

  1. Check if bridge is already running: GET /health on port 9876 → skip if OK
  2. Ensure npm package installed: npm list -g chrome-ai-action → installs via npm install -g chrome-ai-action if missing
  3. Start the bridge: chrome-ai-action --port 9876, waits for health check
  4. Auto-launch Chrome: If Chrome not running with CDP, the bridge starts it automatically (cross-platform)
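
Steps 1 and 3 both come down to polling the health endpoint until it answers or a deadline passes. The sketch below illustrates that idea only; it is not the contents of scripts/startup.js. The names waitForBridge and probeHealth are hypothetical, the 500 ms polling interval is an assumption, and treating any 2xx reply from /health as "ready" is also an assumption.

```javascript
// Illustrative only: polls a probe until it reports success or the timeout elapses.
// `probe` is any async function returning true when the bridge is ready.
async function waitForBridge(probe, timeoutMs = 30000, intervalMs = 500) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await probe()) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Assumed readiness check: GET /health and accept any 2xx status.
// Uses the global fetch available in Node.js 18+.
async function probeHealth(port = 9876) {
  try {
    const res = await fetch(`http://127.0.0.1:${port}/health`);
    return res.ok;
  } catch {
    return false; // connection refused: bridge not listening yet
  }
}

// Usage (not executed here): const ready = await waitForBridge(() => probeHealth(9876));
```

Passing the probe in as a function keeps the polling loop independent of the network, which makes it easy to exercise with a stub.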

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| CAA_BRIDGE_PORT | 9876 | Bridge HTTP server port |
| CAA_STARTUP_TIMEOUT | 30000 | Max wait for bridge ready (ms) |
| CHROME_PATH | auto-detect | Custom Chrome executable path |
| CHROME_USER_DATA_DIR | platform-dependent | Chrome profile directory |
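
A caller that talks to the bridge may want to resolve the same variables so that it targets the right port. The following is a sketch under stated assumptions: resolveBridgeConfig and parsePositiveInt are hypothetical helpers, and representing "auto-detect" and "platform-dependent" as null is a choice made for this example, not documented behavior.

```javascript
// Hypothetical helper: parse a positive integer from an env value, or use the default.
function parsePositiveInt(raw, fallback, name) {
  if (raw === undefined) return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Hypothetical helper: resolve the documented variables with their documented defaults.
function resolveBridgeConfig(env = process.env) {
  const port = parsePositiveInt(env.CAA_BRIDGE_PORT, 9876, 'CAA_BRIDGE_PORT');
  const startupTimeoutMs = parsePositiveInt(env.CAA_STARTUP_TIMEOUT, 30000, 'CAA_STARTUP_TIMEOUT');
  return {
    port,
    startupTimeoutMs,
    chromePath: env.CHROME_PATH ?? null, // null here means "let the bridge auto-detect"
    chromeUserDataDir: env.CHROME_USER_DATA_DIR ?? null, // null means platform default
    endpoint: `http://127.0.0.1:${port}/`,
  };
}

console.log(resolveBridgeConfig({ CAA_BRIDGE_PORT: '9900' }).endpoint);
// http://127.0.0.1:9900/
```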

API Protocol

Endpoint: http://127.0.0.1:9876/

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Health check; returns bridge and CDP status |
| GET | /schema | Full action schema (64+ actions) |
| POST | / | Execute action(s) |

Request Format

{"type": "action", "action": "<ACTION>", "params": {...}, "requestId": "optional-id"}
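
As a rough illustration of how a Node.js caller might produce and send this envelope: buildActionRequest and sendToBridge are hypothetical names, the req-N identifier scheme is an assumption, and the Content-Type header is assumed rather than documented here. Consult GET /schema for the parameters each action actually accepts.

```javascript
let requestCounter = 0;

// Hypothetical helper: wraps an action name and optional params in the request envelope.
function buildActionRequest(action, params, requestId = `req-${++requestCounter}`) {
  if (typeof action !== 'string' || action.length === 0) {
    throw new TypeError('action must be a non-empty string');
  }
  const request = { type: 'action', action, requestId };
  if (params !== undefined) request.params = params; // omit params when none are given
  return request;
}

// Hypothetical helper: POSTs a payload to the bridge and returns the parsed JSON reply.
// Uses the global fetch available in Node.js 18+.
async function sendToBridge(payload, endpoint = 'http://127.0.0.1:9876/') {
  const res = await fetch(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' }, // assumed header
    body: JSON.stringify(payload),
  });
  return res.json();
}

// Usage (not executed here):
//   const reply = await sendToBridge(buildActionRequest('getTitle'));
```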

Batch Request

{"type": "batch", "actions": [
  {"action": "navigate", "params": {"url": "https://example.com"}},
  {"action": "getTitle"}
]}

Response Format

{"success": true, "data": {...}, "requestId": "req-1", "timestamp": 1712345678901}

Error Response

{"success": false, "error": {"code": "ACTION_ERROR", "message": "..."}, "requestId": "req-1", "timestamp": 1712345678901}
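
Because success and failure share one envelope, a caller can branch on the success flag. A minimal sketch, assuming a hypothetical BridgeError class and unwrapResponse helper; the UNKNOWN fallback code is invented for this example and is not a code the bridge is known to return.

```javascript
// Hypothetical error type carrying the bridge's error code and request ID.
class BridgeError extends Error {
  constructor(code, message, requestId) {
    super(message);
    this.name = 'BridgeError';
    this.code = code;
    this.requestId = requestId;
  }
}

// Hypothetical helper: returns `data` on success, throws BridgeError otherwise.
function unwrapResponse(response) {
  if (response && response.success === true) {
    return response.data;
  }
  // 'UNKNOWN' is a local placeholder for replies that carry no error object.
  const { code = 'UNKNOWN', message = 'No error message provided' } = response?.error ?? {};
  throw new BridgeError(code, message, response?.requestId);
}

const data = unwrapResponse({
  success: true,
  data: { title: 'Example Domain' },
  requestId: 'req-1',
  timestamp: 1712345678901,
});
console.log(data.title); // Example Domain
```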

Available Actions (64+)

Navigation

navigate, goBack, goForward, reload, getUrl, getTitle

Page Content

getText, getHtml, getLinks, getImages, getHeadings, getMetaTags, getFormFields, getFocusableElements

Element Interaction

click, type, pressKey, scroll, scrollIntoView, findElement, focus, hover, select

Data Extraction

getValue, getAttribute, getAttributeAll, getBoundingBox, getCookies, getPerformanceMetrics, getSelectedValue, getSelectOptions

JavaScript

evaluate, injectScript, injectCSS

Screenshot & Export

screenshot (PNG/JPEG), getPdf (A4/Letter)

Tab Management

listTabs, newTab, closeTab, switchTab, getCurrentTab

Waiting

waitForElement, waitForTimeout, waitForNavigation

Cookie Management

setCookie, deleteCookie

Network Interception

blockUrls, unblockUrls, mockResponse, getNetworkRequests, clearNetworkRequests

Storage

getLocalStorage, setLocalStorage, removeLocalStorage, clearLocalStorage

File Operations

uploadFile, setInputFiles, downloadFile

Viewport

getViewport, setViewport

Console

getConsoleLogs, clearConsoleLogs

Accessibility

getAccessibilityTree

Utility

ping, connect, disconnect, getBrowserInfo, highlight, dispatchEvent


Typical Workflow

  1. Navigate: navigate → go to target URL (encode Chinese in query params)
  2. Wait: waitForElement → wait for key content
  3. Read: getText / getHtml / getLinks → understand page
  4. Interact: click / type / pressKey → perform actions
  5. Extract: getText / screenshot / evaluate → get results
  6. Confirm: screenshot → visually verify
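
The read-only part of this workflow (steps 1, 2, 3, and 6) can be expressed as a single batch. The sketch below is an assumption-laden illustration: buildReadPageBatch is a hypothetical helper, calling screenshot without params is assumed to be valid, and the batch is assumed to run its actions in the listed order. The selector and timeout params for waitForElement follow the login example in this document.

```javascript
// Hypothetical helper: composes navigate, wait, read, and confirm into one batch payload.
function buildReadPageBatch(url, selector, timeoutMs = 10000) {
  return {
    type: 'batch',
    actions: [
      { action: 'navigate', params: { url } },                                // step 1
      { action: 'waitForElement', params: { selector, timeout: timeoutMs } }, // step 2
      { action: 'getText' },                                                  // step 3
      { action: 'screenshot' },                                               // step 6 (params assumed optional)
    ],
  };
}

const batch = buildReadPageBatch('https://example.com', 'h1');
console.log(batch.actions.map((step) => step.action).join(' > '));
// navigate > waitForElement > getText > screenshot
```

Interaction steps (click, type, pressKey) are left out because they depend on what the read step finds on the page.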

Example: Search Baidu with a Chinese Query

{"type": "batch", "actions": [
  {"action": "navigate", "params": {"url": "https://www.baidu.com/s?wd=%E5%A6%BB%E5%AD%90%E7%9A%84%E6%B5%AA%E6%BC%AB%E6%97%85%E8%A1%8C"}},
  {"action": "waitForTimeout", "params": {"ms": 2000}},
  {"action": "getText"}
]}

Example: Full Login Flow

{"type": "batch", "actions": [
  {"action": "navigate", "params": {"url": "https://example.com/login"}},
  {"action": "waitForElement", "params": {"selector": "input[name=username]", "timeout": 10000}},
  {"action": "type", "params": {"selector": "input[name=username]", "value": "myuser"}},
  {"action": "type", "params": {"selector": "input[name=password]", "value": "mypassword"}},
  {"action": "click", "params": {"selector": "button[type=submit]"}},
  {"action": "waitForTimeout", "params": {"ms": 3000}},
  {"action": "getCurrentTab"}
]}

Error Handling

| Error Code | Meaning | Resolution |
| --- | --- | --- |
| CDP_NOT_CONNECTED | Chrome not running with debug port | Bridge auto-launches Chrome and retries every 3 s |
| ACTION_ERROR | Action execution failed | Check params; use getFocusableElements to find elements first |
| INVALID_REQUEST | Malformed request | Check request format |
| PARSE_ERROR | JSON parse failure | Send valid JSON |
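
One way a caller might act on these codes is to map each to a recovery plan. This is a sketch, not documented client behavior: planRecovery is a hypothetical function, and having the client wait 3000 ms before retrying after CDP_NOT_CONNECTED is an assumption chosen to match the bridge's stated retry interval.

```javascript
// Hypothetical mapping from documented error codes to a client-side recovery plan.
function planRecovery(code) {
  switch (code) {
    case 'CDP_NOT_CONNECTED':
      // The bridge relaunches Chrome itself; the client only waits and resends.
      return { retry: true, delayMs: 3000, hint: 'Wait for the bridge to reconnect to Chrome' };
    case 'ACTION_ERROR':
      return { retry: false, delayMs: 0, hint: 'Check params; run getFocusableElements to locate elements' };
    case 'INVALID_REQUEST':
      return { retry: false, delayMs: 0, hint: 'Check the request format' };
    case 'PARSE_ERROR':
      return { retry: false, delayMs: 0, hint: 'Send valid JSON' };
    default:
      return { retry: false, delayMs: 0, hint: 'Unrecognized error code; inspect the message' };
  }
}

console.log(planRecovery('CDP_NOT_CONNECTED').retry); // true
```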

Discovery Tips

When you don't know what elements are on a page:

  1. getFocusableElements → all interactive elements (with positions)
  2. getFormFields → all form inputs with metadata
  3. getLinks → all links on page
  4. getHeadings → understand page structure
  5. getText → all visible text
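
These probes can be bundled into one batch so the page is surveyed in a single round trip. A minimal sketch, assuming a hypothetical buildDiscoveryBatch helper and assuming each of these actions can be called without params (this document shows that only for getText).

```javascript
// The five discovery actions, in the order recommended above.
const DISCOVERY_ORDER = ['getFocusableElements', 'getFormFields', 'getLinks', 'getHeadings', 'getText'];

// Hypothetical helper: builds a batch from a subset of the discovery actions.
function buildDiscoveryBatch(wanted = DISCOVERY_ORDER) {
  const unknown = wanted.filter((name) => !DISCOVERY_ORDER.includes(name));
  if (unknown.length > 0) {
    throw new RangeError(`Not a discovery action: ${unknown.join(', ')}`);
  }
  // Keep the recommended order regardless of the order the caller passed in.
  const actions = DISCOVERY_ORDER
    .filter((name) => wanted.includes(name))
    .map((action) => ({ action }));
  return { type: 'batch', actions };
}

console.log(JSON.stringify(buildDiscoveryBatch(['getText', 'getLinks'])));
// {"type":"batch","actions":[{"action":"getLinks"},{"action":"getText"}]}
```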

References

  • references/bridge-api.md — Complete API reference with all 64+ actions
  • references/setup-guide.md — Detailed setup and troubleshooting
  • scripts/startup.js — Startup automation script

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
