VDB Ingest Agent
Role
You ingest (index) repository files into the ChromaDB vector store so they can be semantically searched. You build or update the parent-child chunk structure that query.py searches against.
This is a write operation. The vector store is the backing index for Phase 2 search.
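As a rough illustration of what a parent-child chunk structure can look like, the sketch below splits a document into paragraph-level parents and fixed-size child chunks that point back to their parent. The `Chunk` class, the `parent_child_chunks` function, the ID scheme, and the `child_size` default are all hypothetical; how ingest.py actually chunks files is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    chunk_id: str
    parent_id: Optional[str]  # None marks a parent chunk
    text: str


def parent_child_chunks(doc_id: str, text: str, child_size: int = 200) -> list[Chunk]:
    """Split text into paragraph parents and fixed-size children (illustrative only)."""
    chunks: list[Chunk] = []
    # Assumption: blank lines separate paragraphs, and each paragraph is one parent.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    for p_idx, paragraph in enumerate(paragraphs):
        parent_id = f"{doc_id}:p{p_idx}"
        chunks.append(Chunk(parent_id, None, paragraph))
        # Children are consecutive character windows of the parent text.
        for c_idx, start in enumerate(range(0, len(paragraph), child_size)):
            child_text = paragraph[start:start + child_size]
            chunks.append(Chunk(f"{parent_id}:c{c_idx}", parent_id, child_text))
    return chunks
```

In a layout like this, search would match against the small child chunks and then return the larger parent for context.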
Prerequisites
- First-time setup
If chromadb is not installed or vector_profiles.json is missing, run the init skill first:
python3 ./scripts/init.py
- Verify server is running
Use the vector-db-launch skill if the server is not already up:
Check the heartbeat:
curl -sf http://127.0.0.1:8110/api/v1/heartbeat
If not running, start it:
chroma run --host 127.0.0.1 --port 8110 --path .vector_data &
See ../../SKILL.md for full launch instructions.
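The heartbeat check above can also be done from Python before an ingest run. This is a minimal sketch using only the standard library; the `heartbeat_ok` helper is hypothetical and is not part of the skill's scripts. The URL is the one used in the curl command.

```python
import urllib.request


def heartbeat_ok(
    url: str = "http://127.0.0.1:8110/api/v1/heartbeat",
    timeout: float = 2.0,
) -> bool:
    """Return True if the server answers the heartbeat URL with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, timeouts, and refused connections all derive from OSError.
        return False


if __name__ == "__main__":
    print("server up" if heartbeat_ok() else "server down: start it before ingesting")
```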
Execution Protocol
Full ingest (first time or full rebuild)
python3 ./scripts/ingest.py --profile knowledge --full
Incremental ingest (only files new or changed in the last N hours)
python3 ./scripts/ingest.py --profile knowledge --since 24
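One plausible way to select files for an incremental run is to compare each file's modification time against a cutoff. The sketch below assumes that approach; the `files_changed_since` helper is hypothetical, and ingest.py may decide what counts as "changed" differently.

```python
import time
from pathlib import Path
from typing import Optional


def files_changed_since(root: str, hours: float, now: Optional[float] = None) -> list[Path]:
    """Return files under root whose mtime falls within the last `hours` hours."""
    now = time.time() if now is None else now
    cutoff = now - hours * 3600  # hours converted to seconds
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_mtime >= cutoff
    )
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to check against files with known timestamps.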
Code files (uses AST parsing shim)
python3 ./scripts/ingest.py --profile knowledge --full --code
ingest_code_shim.py is invoked automatically for .py and .js files to extract functions and classes as discrete chunks rather than raw text blocks.
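To illustrate the idea of chunking by function and class rather than by raw text, here is a minimal sketch built on Python's standard `ast` module. It is not the contents of ingest_code_shim.py; the `extract_code_chunks` function and its dictionary fields are hypothetical, and it handles Python source only (.js files would need a separate parser).

```python
import ast


def extract_code_chunks(source: str) -> list[dict]:
    """Return one chunk per top-level function or class in Python source."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                # Exact source text of the definition, without surrounding code.
                "text": ast.get_source_segment(source, node),
            })
    return chunks
```

Each chunk keeps its name and line range, so a search hit can be traced back to a specific definition in the file.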
After Ingesting
Run a quick smoke test to confirm the new content is retrievable:
python3 ./scripts/query.py "describe what was just ingested" --profile knowledge --limit 3
Rules
- Never write to .vector_data/ directly; always use ingest.py.
- Never read .sqlite3 files with cat or sqlite3; doing so will corrupt context.
- Source Transparency Declaration: state which profile was ingested, how many files, and any errors.
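The declaration in the last rule could be assembled along these lines. This is only a sketch of one possible layout; the `source_transparency_declaration` function and the exact wording of its output are assumptions, since the rule names the required fields but not a format.

```python
def source_transparency_declaration(profile: str, file_count: int, errors: list[str]) -> str:
    """Build a plain-text declaration covering profile, file count, and errors."""
    lines = [
        "Source Transparency Declaration",
        f"Profile ingested: {profile}",
        f"Files ingested: {file_count}",
    ]
    if errors:
        lines.append(f"Errors ({len(errors)}):")
        lines.extend(f"  - {error}" for error in errors)
    else:
        lines.append("Errors: none")
    return "\n".join(lines)
```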