Semantic Code Search
Find code by meaning, not just text. Search across codebases using natural language intent.
Quick Start
from code_search import CodeIndex, SemanticSearch
index = CodeIndex("/path/to/codebase")
index.build() # Parse AST, generate embeddings
search = SemanticSearch(index)
results = search.query("how is authentication handled?")
results = search.similar_to("src/auth/login.py:validate_token")
How It Works
- Parse — Walk codebase, extract functions/classes with AST
- Embed — Generate vector embeddings from code + docstrings
- Index — Store in vector index with metadata (file, line, type)
- Search — Query by intent or find similar code
Search Types
- Intent search: "find error handling patterns" → returns matching code
- Similarity search: Given a function, find others doing the same thing
- Structural search: Find all functions matching a call pattern
- Duplicate detection: Find code doing the same thing differently
CLI
python3 scripts/search.py index /path/to/codebase
python3 scripts/search.py query "database connection setup"
python3 scripts/search.py similar src/db/connect.py:10
python3 scripts/search.py duplicates