babata-browser

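Babata browser control skill v2.0: lightweight browser automation built on Playwright, controlled through natural language, Accessibility Tree first, with no additional AI dependencies.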

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "babata-browser" with this command: npx skills add meta-evo-creator/babata-browser

Babata Browser 🦞 v2.0

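A lightweight browser automation skill. It gives Babata a pair of "hands on the web": opening pages, filling in forms, clicking buttons, extracting data, and saving screenshots.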

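Architecture Overview

Information retrieval priority:
  Verification / fact lookup → API/CLI (fastest, bypasses the browser)
  Exploration / open-ended search → web_search (Tavily, multiple angles)
  JS rendering / interaction / screenshots → babata-browser (fallback)

Browser operation strategy (upgraded in v2.0):
  Get structure + interactive elements → Accessibility Snapshot (preferred, token-efficient)
  Extract page text → get_text (structured)
  Get the page's visual state → screenshot (fallback)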
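The retrieval priority above can be read as a small routing table. The following is a minimal sketch of that idea, not the skill's actual code; the names `TaskKind`, `CHANNELS`, and `pick_channel` are hypothetical and exist only for illustration.

```python
from enum import Enum


class TaskKind(Enum):
    """Hypothetical task categories mirroring the priority list."""
    FACT_CHECK = "fact_check"      # verification / fact lookup
    OPEN_SEARCH = "open_search"    # exploration / open-ended search
    INTERACTIVE = "interactive"    # JS rendering, interaction, screenshots


# Hypothetical routing table: cheapest channel first, browser last.
CHANNELS = {
    TaskKind.FACT_CHECK: "api_or_cli",
    TaskKind.OPEN_SEARCH: "web_search",
    TaskKind.INTERACTIVE: "babata-browser",
}


def pick_channel(kind: TaskKind) -> str:
    """Return the channel for a task kind; the browser is the fallback."""
    return CHANNELS.get(kind, "babata-browser")


print(pick_channel(TaskKind.FACT_CHECK))
```

Keeping the mapping in one table makes it clear that the browser is reached only when the cheaper channels do not apply.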

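Comparison with browser-use

| | browser-use | babata-browser v2.0 |
|---|---|---|
| Dependencies | 50+ packages | Playwright only |
| Installation | 300MB+ / 20min | 100MB / 2min |
| Browser control | | |
| AI decision-making | Built-in LLM | Decided directly by Babata's LLM |
| Page interaction strategy | Vision-model driven | Accessibility Tree first |
| Token efficiency | Low (screenshots + vision AI) | High (structured data) |
| Chinese-language tasks | Average | ✅ Native Chinese |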

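Core Design Principles

1. Accessibility Tree First

Derived from the Playwright MCP design pattern. Prefer Playwright's Accessibility Tree snapshot over vision models or screenshots when obtaining page structure and interactive elements. This is more token-efficient and requires no additional AI vision capability.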

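| Scenario | Preferred method | Fallback |
|---|---|---|
| Get page structure and interactive elements | Accessibility Snapshot | |
| Extract page text | get_text / get_html | |
| Get the page's visual state | screenshot | |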
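To illustrate why a structured snapshot is cheaper to pass to an LLM than a screenshot, here is a minimal sketch that flattens an accessibility-style tree into a few text lines. It assumes a nested dict with `role`, `name`, and `children` keys, which is an illustrative shape rather than the confirmed output of this skill's `accessibility_snapshot` action; `INTERACTIVE_ROLES`, `flatten_snapshot`, and `interactive_elements` are hypothetical names.

```python
# Assumed set of roles an agent can act on; adjust for a real snapshot.
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


def flatten_snapshot(node, depth=0):
    """Turn a nested {role, name, children} dict into indented text lines."""
    role = node.get("role", "")
    name = node.get("name", "")
    marker = " [interactive]" if role in INTERACTIVE_ROLES else ""
    lines = [f'{"  " * depth}{role} "{name}"{marker}']
    for child in node.get("children", []):
        lines.extend(flatten_snapshot(child, depth + 1))
    return lines


def interactive_elements(node):
    """Collect (role, name) pairs for elements an agent could act on."""
    found = []
    if node.get("role") in INTERACTIVE_ROLES:
        found.append((node["role"], node.get("name", "")))
    for child in node.get("children", []):
        found.extend(interactive_elements(child))
    return found


# Hand-written sample tree, not captured from a real page.
snapshot = {
    "role": "WebArea",
    "name": "Example",
    "children": [
        {"role": "textbox", "name": "Search"},
        {"role": "button", "name": "Submit"},
        {"role": "heading", "name": "Results"},
    ],
}
print("\n".join(flatten_snapshot(snapshot)))
```

The flattened form carries the roles and names an LLM needs to choose a click or fill target, in a handful of short lines.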

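2. Lightweight CLI > Deep MCP

Microsoft's Playwright team has verified that CLI mode is more token-efficient than MCP. Babata follows the same principle:

  • High-frequency operations (navigation / clicking / extraction) → call the Playwright CLI API directly (lightweight and fast)
  • Long-running, multi-step, or state-persisting work → MCP protocol (rich-state orchestration)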
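The two rules above could be expressed as a small selection function. This is only a sketch: `choose_mode` and the `max_cli_steps` threshold are hypothetical, and the source does not say how many steps count as "multi-step".

```python
def choose_mode(actions, needs_persistent_state=False, max_cli_steps=5):
    """Pick "cli" for short stateless work, "mcp" otherwise.

    max_cli_steps is an assumed cutoff, not a value taken from the skill.
    """
    if needs_persistent_state or len(actions) > max_cli_steps:
        return "mcp"
    return "cli"


print(choose_mode(["goto", "click", "get_text"]))
```

A single explicit cutoff keeps the decision easy to audit; in practice the host LLM may weigh other signals as well.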

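3. Decisions Made Directly by Babata's LLM

No LLM is embedded; every operation decision is made by Babata's DeepSeek model. Advantages:

  • Unified context (no switching between AIs)
  • Unified memory (operation history is traceable)
  • Unified safety (Guardrails cover every operation path)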
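One way to picture the "unified memory" and "unified safety" points is a single dispatcher that every action passes through. The sketch below is an assumption about how such a layer might look, not the skill's implementation; `Dispatcher`, `BLOCKED_SCHEMES`, and the action dict shape are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical guardrail rule: refuse local and browser-internal URLs.
BLOCKED_SCHEMES = ("file://", "chrome://")


@dataclass
class Dispatcher:
    handlers: dict[str, Callable[[dict], str]]
    history: list[dict] = field(default_factory=list)

    def guard(self, action: dict) -> None:
        """Raise if the action violates the (assumed) guardrail rule."""
        url = action.get("url", "")
        if url.startswith(BLOCKED_SCHEMES):
            raise PermissionError(f"blocked URL: {url}")

    def run(self, action: dict) -> str:
        self.guard(action)  # the same check runs for every action
        result = self.handlers[action["name"]](action)
        self.history.append({"action": action, "result": result})
        return result


# Stub handler standing in for a real browser call.
dispatcher = Dispatcher(handlers={"goto": lambda a: f"opened {a['url']}"})
print(dispatcher.run({"name": "goto", "url": "https://example.com"}))
```

Because the LLM sits outside the skill, a single entry point like this is where both the check and the history record can live.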

Installation

Prerequisites

pip install playwright
python -m playwright install chromium

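Install this package (importable globally)

# Run from the babata-browser directory
cd skills/babata-browser
pip install -e .

Once installed, the package can be imported from any directory, including isolated cron sessions.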

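Usage

from scripts.babata_browser import execute_task

# Control the browser with a single sentence
# Task: open the National Health Commission site, search for the latest policies, extract the first 5 titles
execute_task("打开卫健委官网,搜索最新政策,提取前5条标题")
# Task: open https://example.com, search for "医疗AI" (medical AI), extract the results
execute_task("打开 https://example.com,搜索 医疗AI,提取结果")
# Task: open the login page, fill in and submit the form, save a screenshot
execute_task("打开登录页,填表提交,截图保存")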

CLI Mode

# Task: open GitHub Trending and extract the trending projects
babata-browser '打开 GitHub Trending,提取热门项目' --json
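When calling the CLI from another script, the `--json` output can be captured and parsed. The sketch below assumes that `--json` prints a single JSON document to stdout, which the listing does not confirm; `run_json_command` is a hypothetical helper, and the output schema is left unspecified.

```python
import json
import subprocess


def run_json_command(argv, timeout=120):
    """Run a command and parse its stdout as one JSON document."""
    completed = subprocess.run(
        argv,
        capture_output=True,
        encoding="utf-8",   # task text and results may contain Chinese
        timeout=timeout,
        check=True,         # raise CalledProcessError on a non-zero exit
    )
    return json.loads(completed.stdout)


# Intended use (requires the skill to be installed, so not executed here):
# result = run_json_command(
#     ["babata-browser", "打开 GitHub Trending,提取热门项目", "--json"]
# )
```

Passing the arguments as a list avoids shell quoting problems with the natural-language task string.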

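Built-in Capabilities

| Action | Description | Strategy |
|---|---|---|
| goto | Navigate to a URL | CLI |
| get_text | Extract page text (Accessibility Tree first) | CLI |
| get_html | Get the HTML | CLI |
| click | Click an element (by text or CSS selector) | CLI |
| fill | Fill in a form | CLI |
| get_links | Extract all links | CLI |
| screenshot | Full-page screenshot (fallback when the Accessibility Tree is unavailable) | CLI |
| scroll | Scroll the page | CLI |
| execute_js | Execute JavaScript | CLI |
| extract_table | Smart table extraction | CLI |
| search_and_extract | Search + extract | CLI |
| login_if_needed | Automatic login | CLI/MCP dual mode |
| accessibility_snapshot | Get an Accessibility Tree snapshot (new in v2.0) | CLI |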
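Since the host LLM plans in terms of these action names, a plan can be checked against the list before anything touches the browser. The following is a minimal sketch; the step dict shape (`action`, `url`, `target`, `value`) and the helper `unknown_actions` are hypothetical, and only the action names come from the list above.

```python
# Action names copied from the capabilities list.
KNOWN_ACTIONS = {
    "goto", "get_text", "get_html", "click", "fill", "get_links",
    "screenshot", "scroll", "execute_js", "extract_table",
    "search_and_extract", "login_if_needed", "accessibility_snapshot",
}


def unknown_actions(plan):
    """Return the action names in a plan that are not built-in actions."""
    return [step["action"] for step in plan if step["action"] not in KNOWN_ACTIONS]


# Hypothetical plan for "open a page, search, extract the results".
plan = [
    {"action": "goto", "url": "https://example.com"},
    {"action": "accessibility_snapshot"},
    {"action": "fill", "target": "Search", "value": "医疗AI"},
    {"action": "click", "target": "Submit"},
    {"action": "get_text"},
]
print(unknown_actions(plan))
```

Rejecting unknown names early keeps a hallucinated action from reaching the browser layer.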

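Use Cases

  • Scraping policy updates from the official sites of the National Health Commission, the National Healthcare Security Administration, and the Central Commission for Discipline Inspection
  • Automated filing in government regulatory systems
  • Data collection from JS-rendered pages
  • Monitoring web content for changes
  • Automated form submission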
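For the content-monitoring use case, one common approach is to compare a hash of the extracted text between runs. This sketch assumes the page text has already been obtained (for example via `get_text`); `fingerprint` and `has_changed` are hypothetical helpers, not part of the skill.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash whitespace-normalized text so spacing-only changes are ignored."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, current_text: str) -> bool:
    """Report whether the current text differs from the stored fingerprint."""
    return fingerprint(current_text) != previous_fingerprint


baseline = fingerprint("Policy A\n  Policy B")
print(has_changed(baseline, "Policy A Policy C"))
```

Storing only the fingerprint between runs is enough to detect a change; the full text needs to be kept only if a diff is wanted.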

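Changelog

| Version | Date | Changes |
|---|---|---|
| v2.0 | 2026-05-07 | Added the Accessibility Tree first strategy, CLI/MCP dual-mode selection, and the strategy table. Source: Playwright MCP design pattern |
| v1.0 | | Initial release |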

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
