notcrawl

Mirror a Notion workspace into local SQLite + normalized Markdown for search, diff, and agent queries without depending on the Notion UI.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy the following and send it to your AI assistant to teach it this skill:

Install skill "notcrawl" with this command: npx skills add chasewebb/notcrawl

Notcrawl

Local-first Notion crawler. Pulls pages, databases, and blocks into a SQLite store and emits normalized Markdown alongside, so PaperBrain can absorb Notion content into the vault graph.

Requirements

  • Notion internal integration token. Create at https://www.notion.so/profile/integrations → New integration → copy the secret (secret_… or ntn_…).
  • Share each Notion page or database with the integration (Notion's menu → Connections → add your integration). Notcrawl can only see what the integration is invited to.
  • notcrawl binary on PATH (installed at ~/.local/bin/notcrawl).
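
A quick preflight check can catch the two most common setup problems before the first sync. The sketch below is illustrative only: the function names token_looks_valid and preflight are hypothetical helpers, not part of notcrawl, and the rule "prefix followed by at least one character" is an assumption based on the secret_… / ntn_… prefixes listed above. It does not verify that the token is actually accepted by Notion.

```shell
# Preflight sketch (hypothetical helpers, not shipped with notcrawl).

# token_looks_valid TOKEN
# Succeeds if TOKEN starts with secret_ or ntn_ and has something after it.
# Assumption: only the prefix is checked; the token is never sent anywhere.
token_looks_valid() {
  case "$1" in
    secret_?*|ntn_?*) return 0 ;;
    *) return 1 ;;
  esac
}

# preflight
# Reports missing requirements on stderr; returns non-zero if any are missing.
preflight() {
  ok=0
  if ! token_looks_valid "${NOTION_API_KEY:-}"; then
    echo "NOTION_API_KEY is unset or does not start with secret_ or ntn_" >&2
    ok=1
  fi
  if ! command -v notcrawl >/dev/null 2>&1; then
    echo "notcrawl not found on PATH (expected at ~/.local/bin/notcrawl)" >&2
    ok=1
  fi
  return "$ok"
}
```

Running preflight && notcrawl sync --full would then skip the sync when either requirement is missing.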

Setup

export NOTION_API_KEY="ntn_…"
notcrawl init                              # create ~/.notcrawl/config.toml + db
notcrawl sync --full                       # initial pull of all shared content
notcrawl export-md --out ~/.notcrawl/md    # dump normalized Markdown

State

  • Config: ~/.notcrawl/config.toml
  • Database: ~/.notcrawl/notcrawl.db
  • Markdown export: ~/.notcrawl/md/ (configurable)

Common Commands

notcrawl status --json
notcrawl sync --incremental
notcrawl pages list --json
notcrawl search "OKR" --json
notcrawl export-md --out <dir>             # regenerate Markdown
notcrawl sql 'SELECT count(*) FROM pages'

Integration Notes

  • Markdown export is the bridge to PaperVault: point --out at a vault folder (e.g. KNOWLEDGE/notion/) to fold Notion content into the graph.
  • Schedule notcrawl sync --incremental followed by notcrawl export-md via PaperFang for hands-free mirroring.
  • Diff-friendly: Markdown output is deterministic, so changes show up cleanly in git.
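
The notes above can be combined into one scheduled job. The following is a minimal sketch, assuming the export folder lives inside an existing git repository; mirror_notion is a hypothetical wrapper name, and how PaperFang invokes a job is not covered here. Only the two notcrawl commands already listed in this document are used.

```shell
# mirror_notion OUT_DIR
# Hypothetical wrapper: incremental sync, re-export, then commit the export
# folder only when its contents changed. Assumes OUT_DIR is inside a git repo.
mirror_notion() {
  out_dir="$1"

  notcrawl sync --incremental || return 1
  notcrawl export-md --out "$out_dir" || return 1

  # Stage everything under the export folder, including deletions.
  git -C "$out_dir" add -A . || return 1

  # --quiet makes git diff exit 0 when nothing is staged under this path.
  if git -C "$out_dir" diff --cached --quiet -- .; then
    echo "no changes"
  else
    git -C "$out_dir" commit -q -m "notion mirror $(date -u +%Y-%m-%dT%H:%M:%SZ)" -- . || return 1
    echo "committed"
  fi
}
```

Because the export is described as deterministic, a run with no upstream edits should stage nothing and leave the history untouched, so each commit corresponds to a real change in Notion.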

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Discrawl

Mirror Discord guild data into a local SQLite archive for search, inspection, and agent queries. Bot-token only, no user-token hacks. Data stays local.

Slacrawl

Pull Slack workspace metadata and message history into local SQLite for offline search and agent queries.

Wacrawl

Read-only local archive and full-text search of macOS WhatsApp Desktop chats. Snapshots WhatsApp's SQLite databases into ~/.wacrawl/wacrawl.db without modify...

SwarmVault

Use SwarmVault when the user needs a local-first knowledge vault that writes durable markdown, graph, search, dashboard, review, context-pack, task-ledger, r...
