VDB Cleanup Agent
Role
You remove stale and orphaned chunks from the ChromaDB vector store. A chunk is stale when its source file no longer exists on disk. Running cleanup after deletions or renames keeps the vector index accurate and prevents search results that point to missing files.
This is a write (delete) operation. Always dry-run first.
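The staleness rule above can be sketched in a few lines of Python. This is only an illustration of the rule, not the logic inside cleanup.py: the `find_orphans` helper, the `chunks` list, and its `id` and `source` keys are hypothetical, and the real store's metadata layout may differ.

```python
from pathlib import Path


def find_orphans(chunks, root="."):
    """Group chunk ids by source path for every source missing on disk.

    `chunks` is a hypothetical list of dicts with "id" and "source" keys,
    where "source" is a file path relative to `root`.
    """
    orphans = {}
    for chunk in chunks:
        source = chunk["source"]
        # A chunk is stale when its source file no longer exists.
        if not (Path(root) / source).is_file():
            orphans.setdefault(source, []).append(chunk["id"])
    return orphans
```

Grouping by source path keeps the result close to what a dry-run report needs: a count of chunks and a list of the deleted files they came from.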
When to Run
- After deleting or renaming files that were previously ingested
- After a major refactor that moved directories
- When query.py returns results pointing to non-existent files
- Periodically as housekeeping
Prerequisites
Verify server is running
If the server is not already up, see ../../SKILL.md. For first-time setup (dependencies + profile config), see ../../SKILL.md.
curl -sf http://127.0.0.1:8110/api/v1/heartbeat
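If the check needs to run from a Python wrapper rather than the shell, the curl call above can be approximated with the standard library. This is a sketch under assumptions: `server_is_up` is a hypothetical helper, the two-second timeout is arbitrary, and unlike `curl -sf` it follows redirects and ignores proxy settings.

```python
import http.client
import urllib.error
import urllib.request

HEARTBEAT_URL = "http://127.0.0.1:8110/api/v1/heartbeat"

# Ignore proxy environment variables: the target is a loopback address.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def server_is_up(url=HEARTBEAT_URL, timeout=2.0):
    """Return True if the heartbeat endpoint answers with an HTTP 2xx status."""
    try:
        with _OPENER.open(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP 4xx/5xx, or a malformed reply.
        return False
```

A caller would typically stop and point the user to ../../SKILL.md when this returns False, rather than attempting the cleanup.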
Execution Protocol
- Dry run -- show what will be removed
python3 ./scripts/cleanup.py --profile knowledge --dry-run
Report: "Found N orphaned chunks from X deleted files: [list of paths]"
- Apply -- only after confirming with user
python3 ./scripts/cleanup.py --profile knowledge --apply
- Verify store integrity (optional)
python3 ./scripts/vector_consistency_check.py --profile knowledge
- Smoke test search still works
python3 ./scripts/query.py "test query" --profile knowledge --limit 3
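The protocol above can be sketched as a small Python driver that refuses to apply until the user has confirmed the dry run. This is a minimal sketch, not part of the skill: `build_steps`, `format_report`, and `run_cleanup` are hypothetical helpers, the `[y/N]` prompt is an assumption, and `format_report` expects a hypothetical mapping of deleted source path to chunk ids, since the actual output format of cleanup.py is not specified here.

```python
import subprocess

SCRIPTS = "./scripts"


def build_steps(profile="knowledge"):
    """Return the protocol commands as argv lists, keyed by step name."""
    return {
        "dry_run": ["python3", f"{SCRIPTS}/cleanup.py", "--profile", profile, "--dry-run"],
        "apply": ["python3", f"{SCRIPTS}/cleanup.py", "--profile", profile, "--apply"],
        "verify": ["python3", f"{SCRIPTS}/vector_consistency_check.py", "--profile", profile],
        "smoke_test": ["python3", f"{SCRIPTS}/query.py", "test query",
                       "--profile", profile, "--limit", "3"],
    }


def format_report(orphans):
    """Render the dry-run report line from {source_path: [chunk_id, ...]}."""
    n_chunks = sum(len(ids) for ids in orphans.values())
    paths = sorted(orphans)
    return (f"Found {n_chunks} orphaned chunks from {len(paths)} "
            f"deleted files: [{', '.join(paths)}]")


def run_cleanup(profile="knowledge", verify=True, confirm=input, run=subprocess.run):
    """Dry-run, ask for confirmation, then apply, verify, and smoke test.

    Returns False without deleting anything if the user declines.
    """
    steps = build_steps(profile)
    run(steps["dry_run"], check=True)  # shows what would be removed
    answer = confirm("Apply the deletions listed above? [y/N] ")
    if answer.strip().lower() != "y":
        return False
    run(steps["apply"], check=True)
    if verify:  # the integrity check is optional in the protocol
        run(steps["verify"], check=True)
    run(steps["smoke_test"], check=True)
    return True
```

Passing `confirm` and `run` as parameters keeps the confirmation gate testable without touching the real store; with the defaults, `run_cleanup()` prompts on the terminal and stops at the first failing command.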
Rules
- Always dry-run first. Never apply without showing the user what will be deleted.
- Never delete from .vector_data/ directly -- always use cleanup.py.
- Never read .sqlite3 files with raw shell tools -- doing so will corrupt context.
- Source Transparency Declaration: state which profile was cleaned and how many chunks were removed.
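One way to keep the final declaration consistent across runs is to generate it from the two values the last rule asks for. The helper name and the exact wording below are assumptions; the rule only requires that the profile and the number of removed chunks are stated.

```python
def transparency_declaration(profile, chunks_removed):
    """Return a one-line declaration of what was cleaned (wording is assumed)."""
    if chunks_removed < 0:
        raise ValueError("chunks_removed must be non-negative")
    noun = "chunk" if chunks_removed == 1 else "chunks"
    return (f"Source Transparency Declaration: profile '{profile}' cleaned, "
            f"{chunks_removed} {noun} removed.")
```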