raglite

Local-first RAG cache: distill docs into structured Markdown, then index/query with Chroma + hybrid search (vector + keyword).

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "raglite" with this command: npx skills add raglite-library

RAGLite — a local RAG cache (not a memory replacement)

RAGLite is a local-first RAG cache.

It does not replace model memory or chat context. It gives your agent a durable place to store and retrieve information the model wasn’t trained on — especially useful for local/private knowledge (school work, personal notes, medical records, internal runbooks).

Why it’s better than “paid RAG” / knowledge bases (for many use cases)

  • Local-first privacy: keep sensitive data on your machine/network.
  • Open-source building blocks: Chroma 🧠 + ripgrep ⚡ — no managed vector DB required.
  • Compression-before-embeddings: distill first → less fluff/duplication → cheaper prompts + more reliable retrieval.
  • Auditable artifacts: the distilled Markdown is human-readable and version-controllable.

If you later outgrow local, you can swap in a hosted DB — but you often don’t need to.

What it does

1) Condense ✍️

Turns docs into structured Markdown outputs (low fluff, more “what matters”).

2) Index 🧠

Embeds the distilled outputs into a Chroma collection (one DB, many collections).

3) Query 🔎

Hybrid retrieval:

  • vector similarity via Chroma
  • keyword matches via ripgrep (rg)

Default engine

This skill defaults to OpenClaw 🦞 for condensation unless you pass --engine explicitly.

Prereqs

  • Python 3.11+
  • For indexing/query:
    • Chroma server reachable (default http://127.0.0.1:8100)
  • For hybrid keyword search:
    • rg installed (brew install ripgrep)
  • For OpenClaw engine:
    • OpenClaw Gateway /v1/responses reachable
    • OPENCLAW_GATEWAY_TOKEN set if your gateway requires auth

Install (skill runtime)

This skill installs RAGLite into a skill-local venv:

./scripts/install.sh

It installs from GitHub:

  • git+https://github.com/VirajSanghvi1/raglite.git@main

Usage

One-command pipeline (recommended)

./scripts/raglite.sh run /path/to/docs \
  --out ./raglite_out \
  --collection my-docs \
  --chroma-url http://127.0.0.1:8100 \
  --skip-existing \
  --skip-indexed \
  --nodes

Query

./scripts/raglite.sh query ./raglite_out \
  --collection my-docs \
  --top-k 5 \
  --keyword-top-k 5 \
  "rollback procedure"

Outputs (what gets written)

In --out you’ll see:

  • *.tool-summary.md
  • *.execution-notes.md
  • optional: *.outline.md
  • optional: */nodes/*.md plus per-doc *.index.md and a root index.md
  • metadata in .raglite/ (cache, run stats, errors)

Troubleshooting

  • Chroma not reachable → check --chroma-url, and that Chroma is running.
  • No keyword results → install ripgrep (rg --version).
  • OpenClaw engine errors → ensure gateway is up and token env var is set.

Pitch (for ClawHub listing)

RAGLite is a local RAG cache for repeated lookups.

When you (or your agent) keep re-searching for the same non-training data — local notes, school work, medical records, internal docs — RAGLite gives you a private, auditable library:

  1. Distill to structured Markdown (compression-before-embeddings)
  2. Index locally into Chroma
  3. Query with hybrid retrieval (vector + keyword)

It doesn’t replace memory/context — it’s the place to store what you need again.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

qwencloud-model-selector

[QwenCloud] Recommend the best Qwen model and parameters. TRIGGER when: choosing between Qwen models, comparing Qwen model pricing, understanding Qwen model...

Registry SourceRecently Updated
General

deployment-manager

You are a deployment manager with expertise in release orchestration, deployment strategies, and production reliability. Use when: release orchestration and...

Registry SourceRecently Updated
General

Hk Stock Morning Report

Generate HK stock market morning report (股市晨報) for bank trading desks. Triggers: "生成晨报", "股市晨报", "今日股市", "港股晨報" 報告結構(5部分): 1. 市場回顧(恒指/科指/國指 + 強弱勢股) 2. 南下資金(總...

Registry SourceRecently Updated
General

Story Long Scan

长篇网文扫榜。分析起点、番茄、晋江等平台排行榜数据,提炼市场趋势与热门题材。 触发方式:/story-long-scan、/长篇扫榜、「长篇什么火」「起点排行」

Registry SourceRecently Updated