# RagClaw Knowledge Base Skill

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "RagClaw Knowledge Base" with this command: npx skills add emdzej/ragclaw

Local-first knowledge base for OpenClaw.

Description

Index and search your documents, code, and web pages locally. Zero external APIs, offline embeddings, SQLite-based storage.

Commands

/kb add <source>

Index a file, directory, or URL.

Examples:

/kb add ./docs/
/kb add https://docs.example.com
/kb add ~/projects/my-app/src/
/kb add https://docs.example.com --crawl --crawl-max-depth 2

Options:

  • --db <name> — Knowledge base name (default: "default")
  • --recursive — Recurse into directories (default: true)
  • --embedder <preset> — Embedder preset: nomic|bge|mxbai|minilm (default: nomic)
  • --include <pattern> — Regex filter: include only matching filenames
  • --exclude <pattern> — Regex filter: exclude matching filenames
  • --max-depth <n> — Maximum directory recursion depth
  • --max-files <n> — Maximum number of files to index
  • --crawl — Follow links from a seed URL
  • --crawl-max-depth <n> — Link traversal depth (default: 3)
  • --crawl-max-pages <n> — Max pages to fetch (default: 100)
  • --crawl-same-origin — Stay on the same domain (default: true)
  • --crawl-include <patterns> — Comma-separated URL path prefixes to include
  • --crawl-exclude <patterns> — Comma-separated URL path prefixes to exclude
  • --crawl-concurrency <n> — Concurrent fetchers (default: 1)
  • --crawl-delay <ms> — Delay between requests in ms (default: 1000)
  • --enforce-guards — Enable path/URL security guards
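
The crawl options above combine into a per-link scope decision. The sketch below is one possible reading of them, not RagClaw's actual implementation: the function name `in_crawl_scope` is hypothetical, "same domain" is interpreted as same scheme and host, and letting an exclude prefix win over an include prefix is an assumption.

```python
from urllib.parse import urlparse


def in_crawl_scope(url, seed, same_origin=True, include=(), exclude=()):
    """Decide whether a discovered link should be fetched (illustrative only)."""
    u, s = urlparse(url), urlparse(seed)
    if u.scheme not in ("http", "https"):
        return False
    # --crawl-same-origin: compare scheme and host (with port) of link and seed.
    if same_origin and (u.scheme, u.netloc.lower()) != (s.scheme, s.netloc.lower()):
        return False
    path = u.path or "/"
    # Assumption: an exclude prefix always wins over an include prefix.
    if any(path.startswith(p) for p in exclude):
        return False
    # An empty include list means "no restriction".
    return not include or any(path.startswith(p) for p in include)
```

A crawler built this way would call the check on every extracted link before queueing it, and stop once the depth or page limits are reached.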

/kb search <query>

Search the knowledge base.

Examples:

/kb search how to configure authentication
/kb search async function error handling
/kb search "memory leak" --mode hybrid --limit 10

Options:

  • --db <name> — Knowledge base name (default: "default")
  • --limit <n> — Max results (default: 5)
  • --mode <mode> — Search mode: vector|keyword|hybrid (default: hybrid)
  • --json — Machine-readable JSON output

/kb reindex

Re-process changed sources and keep vectors up to date.

Options:

  • --db <name> — Knowledge base name (default: "default")
  • -f, --force — Force full rebuild (ignore hashes)
  • -p, --prune — Remove sources that no longer exist on disk
  • --embedder <preset> — Switch embedder and rebuild all vectors
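
The reference to hashes in --force suggests that changed sources are detected by comparing content hashes. A minimal sketch of that idea follows, assuming SHA-256 over the file bytes and a stored `{path: hash}` mapping. Both are assumptions, and `plan_reindex` is a hypothetical helper rather than part of RagClaw.

```python
import hashlib
from pathlib import Path


def content_hash(path):
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def plan_reindex(stored, force=False, prune=False):
    """Return (to_reindex, to_prune) for a {path: stored_hash} mapping."""
    to_reindex, to_prune = [], []
    for path, old_hash in stored.items():
        if not Path(path).exists():
            # Missing sources are only dropped when --prune is given.
            if prune:
                to_prune.append(path)
            continue
        # --force ignores hashes; otherwise only changed content is re-processed.
        if force or content_hash(path) != old_hash:
            to_reindex.append(path)
    return to_reindex, to_prune
```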

/kb merge <source.sqlite>

Merge another knowledge base into the local one.

Options:

  • --db <name> — Destination knowledge base (default: "default")
  • --strategy <strict|reindex> — Merge strategy: strict copies vectors verbatim (same embedder required); reindex re-embeds locally (default: strict)
  • --on-conflict <skip|prefer-local|prefer-remote> — Conflict resolution (default: skip)
  • --dry-run — Preview changes without writing
  • --include <paths> — Comma-separated path prefixes to import
  • --exclude <paths> — Comma-separated path prefixes to skip
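
The two merge strategies differ in what happens to the vectors. The sketch below captures that decision under stated assumptions: `plan_merge` is a hypothetical helper, and raising an error on an embedder mismatch in strict mode is a guess at reasonable behavior, not documented RagClaw output.

```python
def plan_merge(strategy, local_embedder, remote_embedder):
    """Return 'copy-vectors' or 're-embed' for a merge (illustrative only)."""
    if strategy == "strict":
        # strict copies vectors verbatim, so both sides must use the same embedder.
        if local_embedder != remote_embedder:
            raise ValueError(
                f"strict merge needs matching embedders, got "
                f"{local_embedder!r} and {remote_embedder!r}; try --strategy reindex"
            )
        return "copy-vectors"
    if strategy == "reindex":
        # reindex discards the remote vectors and embeds the text again locally.
        return "re-embed"
    raise ValueError(f"unknown strategy: {strategy!r}")
```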

/kb status

Show knowledge base statistics (chunks, sources, vector backend, embedder).

Options:

  • --db <name> — Knowledge base name (default: "default")

/kb list

List indexed sources.

Options:

  • --db <name> — Knowledge base name (default: "default")
  • -t <file|url> — Filter by source type

/kb remove <source>

Remove a source from the index.

Options:

  • --db <name> — Knowledge base name (default: "default")
  • -y — Skip confirmation prompt

/kb embedder list

List all available embedder presets with RAM requirements and status.

/kb embedder download [preset]

Pre-download a model for offline use.

Options:

  • --all — Download all built-in presets

/kb doctor

Check system health: Node.js version, RAM, sqlite-vec status, embedder compatibility, loaded plugins.

/kb plugin list

List discovered plugins with enabled/disabled status.

/kb plugin enable <name>

Enable a plugin (use --all for all discovered plugins).

/kb plugin disable <name>

Disable a plugin.

/kb config list

Show all configuration values and their source (env / config file / default).

/kb config get <key>

Show a single config value.

/kb config set <key> <value>

Persist a config value to ~/.config/kbclaw/config.yaml.
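
Since /kb config list reports whether a value came from the environment, the config file, or a default, a layered lookup is implied. The sketch below assumes the precedence env > config file > default, and uses a hypothetical `RAGCLAW_` variable prefix and a hypothetical `embedder` key. None of these names are confirmed by the skill's documentation.

```python
import os


def resolve(key, file_values, defaults, env=os.environ, env_prefix="RAGCLAW_"):
    """Return (value, source) using env > config file > default (assumed order)."""
    # Hypothetical naming scheme: "embedder" -> "RAGCLAW_EMBEDDER".
    env_name = env_prefix + key.upper().replace(".", "_")
    if env_name in env:
        return env[env_name], "env"
    if key in file_values:
        return file_values[key], "config file"
    return defaults[key], "default"
```

For example, `resolve("embedder", {"embedder": "bge"}, {"embedder": "nomic"}, env={})` returns `("bge", "config file")`.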

Supported Formats

| Type | Extensions |
|---|---|
| Markdown | .md, .mdx |
| Text | .txt |
| PDF | .pdf (OCR for scanned pages) |
| Word | .docx |
| Code | .ts, .js, .py, .go, .java |
| Images | .png, .jpg, .gif, .webp, .bmp, .tiff (OCR) |
| Web | http://, https:// |

Embedder Presets

| Alias | Model | Language | Context | Dims | ~RAM | Strengths |
|---|---|---|---|---|---|---|
| nomic | nomic-ai/nomic-embed-text-v1.5 | English | 8,192 tok | 768 | ~600 MB | Long docs, balanced, default |
| bge | BAAI/bge-m3 | 100+ languages | 8,192 tok | 1024 | ~2.3 GB | Multilingual |
| mxbai | mixedbread-ai/mxbai-embed-large-v1 | English | 512 tok | 1024 | ~1.4 GB | Best English MTEB |
| minilm | sentence-transformers/all-MiniLM-L6-v2 | English | 256 tok | 384 | ~90 MB | Minimal RAM |

Run /kb doctor to check which presets fit your available RAM.
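
As a rough illustration of that check, the sketch below filters the presets by the approximate RAM figures from the table above. The 50% headroom factor and the helper name `presets_that_fit` are assumptions, and /kb doctor may apply different rules.

```python
# Approximate RAM per preset in MB, copied from the table above.
PRESET_RAM_MB = {"nomic": 600, "bge": 2300, "mxbai": 1400, "minilm": 90}


def presets_that_fit(available_mb, headroom=0.5):
    """List presets whose model fits in a fraction of the available RAM."""
    budget = available_mb * headroom
    return sorted(
        (name for name, ram in PRESET_RAM_MB.items() if ram <= budget),
        key=PRESET_RAM_MB.get,  # smallest model first
    )
```

With 2 GB of free RAM and the assumed headroom, `presets_that_fit(2048)` yields `["minilm", "nomic"]`.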

Storage

Knowledge bases are stored as SQLite files following XDG conventions:

  • Default data dir: ~/.local/share/kbclaw/
  • Config file: ~/.config/kbclaw/config.yaml
  • Backward compatibility: if ~/.openclaw/kbclaw/ exists, it is used automatically.
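
The lookup order described above can be sketched as follows. Honoring `XDG_DATA_HOME` and the one-file-per-knowledge-base naming `<name>.sqlite` are assumptions based on general XDG practice and the /kb merge argument. The helper names are hypothetical.

```python
import os
from pathlib import Path


def data_dir(home=None, env=os.environ):
    """Resolve the directory that holds the knowledge base SQLite files."""
    home = Path(home) if home else Path.home()
    legacy = home / ".openclaw" / "kbclaw"
    if legacy.is_dir():
        return legacy  # the backwards-compatible location wins if present
    # Assumption: XDG_DATA_HOME is honored, with ~/.local/share as the fallback.
    base = env.get("XDG_DATA_HOME") or str(home / ".local" / "share")
    return Path(base) / "kbclaw"


def db_path(name="default", **kwargs):
    # Assumption: one "<name>.sqlite" file per knowledge base.
    return data_dir(**kwargs) / f"{name}.sqlite"
```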

How It Works

  1. Extract — Pull text from documents (PDF, DOCX, HTML, code, images via OCR)
  2. Chunk — Split into semantic units (paragraphs, functions, classes)
  3. Embed — Generate vectors using a configurable local model (default: nomic-embed-text-v1.5, 768 dims)
  4. Store — SQLite with FTS5 for keyword search; embedder info written to DB metadata
  5. Search — Hybrid: 70% vector similarity + 30% BM25 keyword; embedder auto-detected from DB
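
The 70/30 weighting in the search step can be illustrated with a small scoring function. This is a sketch under assumptions, not RagClaw's code: it min-max normalizes each score set before mixing, and it expects higher values to mean better matches. SQLite's FTS5 `bm25()` returns smaller values for better matches, so such scores would need to be negated first.

```python
def hybrid_rank(vector_scores, bm25_scores, w_vec=0.7, w_kw=0.3, limit=5):
    """Rank chunk ids by a weighted sum of normalized vector and BM25 scores."""

    def normalize(scores):
        # Min-max scale to [0, 1] so the two score types are comparable.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        if hi == lo:
            return {k: 1.0 for k in scores}
        return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

    vec, kw = normalize(vector_scores), normalize(bm25_scores)
    # A chunk found by only one method contributes 0 for the other.
    combined = {
        cid: w_vec * vec.get(cid, 0.0) + w_kw * kw.get(cid, 0.0)
        for cid in set(vec) | set(kw)
    }
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)[:limit]
```

The default `limit=5` mirrors the documented default of --limit.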

All processing happens locally. No API keys required.
