semfind

Semantic search over local text files using embeddings. Use when grep/ripgrep fails to find relevant results because the exact wording is unknown, or when searching by meaning rather than pattern — e.g., searching logs for "deployment issue" when the actual text says "container build failed". Install with `pip install semfind`. Ideal for searching memory files, project docs, logs, and notes by meaning.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "semfind" with this command: npx skills add PaperBoardOfficial/semfind

semfind

Semantic grep for the terminal. Searches files by meaning using local embeddings (BAAI/bge-small-en-v1.5 + FAISS). No API keys needed.

When to reach for semfind

  1. grep or ripgrep returned no results or irrelevant results
  2. You don't know the exact wording of what you're looking for
  3. You want to search by concept/meaning rather than exact text

Do NOT use semfind when grep works — grep is instant and has zero overhead.

Install

pip install semfind

First run downloads a ~65MB model (~10-30s). Subsequent runs use the cached model.

Usage

# Basic search
semfind "deployment issue" logs.md

# Search multiple files, top 3 results
semfind "permission error" memory/*.md -k 3

# With context lines
semfind "database migration" notes.md -n 2

# Force re-index after file changes
semfind "query" file.md --reindex

# Minimum similarity threshold
semfind "auth bug" *.md -m 0.5

Options

Flag                 Description                  Default
-k, --top-k          Number of results            5
-n, --context        Context lines before/after   0
-m, --max-distance   Minimum similarity score     none
--reindex            Force re-embed               false
--no-cache           Skip embedding cache         false

Output format

Grep-like with similarity scores:

file.md:9: [2026-01-15] Fixed docker build with missing env vars  (0.796)
file.md:3: [2026-01-17] Agent couldn't write to /var/log          (0.689)

Higher scores (closer to 1.0) indicate a stronger semantic match.
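Because the output is grep-like, a script can split each line into path, line number, text, and score. The following is a minimal Python sketch, assuming every result line follows the `path:line: text  (score)` shape of the sample above. The names `Hit`, `HIT_RE`, and `parse_hits` are hypothetical helpers and not part of semfind. Lines that do not match (for example, context lines printed with -n, whose format is not shown here) are skipped.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    path: str
    line: int
    text: str
    score: float


# Assumed shape, taken from the sample output: "path:line: text  (score)".
HIT_RE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):\s*(?P<text>.*?)\s+\((?P<score>\d+(?:\.\d+)?)\)\s*$"
)


def parse_hits(output: str) -> List[Hit]:
    """Parse result lines into Hit records, ignoring lines that do not match."""
    hits = []
    for raw in output.splitlines():
        m = HIT_RE.match(raw)
        if m is None:
            continue  # not a result line in the assumed format
        hits.append(Hit(m["path"], int(m["line"]), m["text"], float(m["score"])))
    return hits


sample = """\
file.md:9: [2026-01-15] Fixed docker build with missing env vars  (0.796)
file.md:3: [2026-01-17] Agent couldn't write to /var/log          (0.689)
"""

hits = parse_hits(sample)
# Keep only the stronger matches; 0.7 is an arbitrary cut-off for illustration.
strong = [h for h in hits if h.score >= 0.7]
```

Filtering on the parsed score is a client-side alternative to passing a threshold on the command line, which can be handy when the same results are reused with several cut-offs.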

Resource usage

  • ~250MB RAM while running, freed immediately on exit
  • ~65MB model cached in /tmp/fastembed_cache/
  • ~2s first query (model load), ~14ms cached queries
  • Embedding cache in ~/.cache/semfind/, auto-invalidates on file changes
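How semfind decides that a cached embedding is stale is not described here. One common way to get a cache that auto-invalidates on file changes is to store a content fingerprint next to the vectors and recompute when it no longer matches. The sketch below illustrates that general pattern only. `file_fingerprint`, `load_or_embed`, the JSON layout, and the `fake_embed` stand-in are all assumptions for illustration, not semfind's implementation.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_fingerprint(path: Path) -> str:
    """Hash the file contents, so any edit changes the fingerprint."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_or_embed(path: Path, cache_dir: Path, embed):
    """Return cached vectors for `path`, re-embedding if the file changed."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # One cache entry per absolute file path (hypothetical layout).
    key = hashlib.sha256(str(path.resolve()).encode()).hexdigest()
    entry = cache_dir / f"{key}.json"
    fingerprint = file_fingerprint(path)
    if entry.exists():
        cached = json.loads(entry.read_text())
        if cached["fingerprint"] == fingerprint:
            return cached["vectors"]  # cache hit: file is unchanged
    # Cache miss or stale entry: embed every line and store the result.
    vectors = embed(path.read_text().splitlines())
    entry.write_text(json.dumps({"fingerprint": fingerprint, "vectors": vectors}))
    return vectors


# Demo with a stand-in embedder that records how often it is called.
calls = []


def fake_embed(lines):
    calls.append(len(lines))
    return [[float(len(line))] for line in lines]


with tempfile.TemporaryDirectory() as tmp:
    notes = Path(tmp) / "notes.md"
    cache = Path(tmp) / "cache"
    notes.write_text("a\nbb\n")
    first = load_or_embed(notes, cache, fake_embed)   # embeds
    second = load_or_embed(notes, cache, fake_embed)  # served from cache
    notes.write_text("a\nbb\nccc\n")
    third = load_or_embed(notes, cache, fake_embed)   # file changed: re-embeds
```

Hashing the contents is slower than comparing modification times but does not depend on filesystem timestamps. Either choice would fit the behaviour described above.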

Workflow pattern

# Step 1: Try grep first
grep "deployment" memory/*.md

# Step 2: If grep fails, use semfind
semfind "something went wrong with the deployment" memory/*.md -k 5
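The two steps above can be wrapped in a small helper that only pays the embedding cost when grep finds nothing. This is a rough Python sketch. `run`, `search`, and the injectable `run_cmd` parameter are hypothetical names chosen for illustration. The command lines mirror the ones shown in this document: grep with a pattern and files, and semfind with a query, files, and -k.

```python
import subprocess
from typing import Callable, Sequence, Tuple

Runner = Callable[[Sequence[str]], Tuple[int, str]]


def run(cmd: Sequence[str]) -> Tuple[int, str]:
    """Run a command and return (exit code, stdout)."""
    proc = subprocess.run(list(cmd), capture_output=True, text=True)
    return proc.returncode, proc.stdout


def search(keyword: str, description: str, paths: Sequence[str],
           run_cmd: Runner = run, top_k: int = 5) -> Tuple[str, str]:
    """Try grep first; fall back to semfind when grep finds nothing."""
    # Step 1: exact match. grep exits with 0 only if something matched.
    code, out = run_cmd(["grep", "-n", "-e", keyword, *paths])
    if code == 0 and out.strip():
        return "grep", out
    # Step 2: semantic search with a natural-language description.
    code, out = run_cmd(["semfind", description, *paths, "-k", str(top_k)])
    return "semfind", out


# Stub runner so the sketch can be exercised without either tool installed.
# Its canned output is made up for the demo, not real semfind output.
seen = []


def stub_runner(cmd: Sequence[str]) -> Tuple[int, str]:
    seen.append(cmd[0])
    if cmd[0] == "grep":
        return 1, ""  # simulate "no match"
    return 0, "memory/a.md:3: container build failed  (0.701)\n"


tool, output = search(
    "deployment",
    "something went wrong with the deployment",
    ["memory/a.md"],
    run_cmd=stub_runner,
)
```

With the default runner, file paths are passed to the commands as-is, so shell globs such as memory/*.md need to be expanded first (for example with Python's `glob.glob`). If semfind is not installed, `subprocess.run` raises `FileNotFoundError`.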

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
