GitHub Code Search
Fetch real-world code examples from GitHub through intelligent multi-search orchestration.
Prerequisites
Required:
- gh CLI installed and authenticated (`gh auth login --web`)
- Node.js + pnpm
Validation:
gh auth status # Should show authenticated
Usage
cd plugins/knowledge-work/skills/gh-code-search
pnpm install # First time only
pnpm search "your query here"
When to Use
Invoke this skill when the user requests:
- Real-world examples ("how do people implement X?")
- Library usage patterns ("show me examples of workflow.dev usage")
- Implementation approaches ("claude code skills doing github search")
- Architectural patterns ("repository pattern in TypeScript")
- Best practices for specific frameworks/libraries
Examples requiring nuanced multi-search:
- "Find React hooks" → Needs queries for `useState`/`useEffect` (imports), `function use` (definitions), `const use =` (arrow functions)
- "Express error handling middleware" → Queries for `(req, res, next) =>` (signatures), `app.use` (registration), `function(err,` (error handlers)
- "How do people use XState?" → Queries for `useMachine`, `createMachine`, `interpret` (actual function names, not "state machine")
- "GitHub Actions for TypeScript projects" → Queries for `.yml` workflow files + `actions/setup-node` + `tsc` or `pnpm build` (actual commands)
How It Works
Division of responsibilities:
Script Responsibilities (Tool Implementation)
- Authenticates with GitHub (gh CLI token)
- Executes a single search query (Octokit API, pagination, text-match)
- Ranks by quality (stars, recency, code structure)
- Fetches top 10 files in parallel
- Extracts factual data (imports, syntax patterns, metrics)
- Returns clean markdown with code + metadata + GitHub URLs for all files
The script is a single-query tool. Claude orchestrates multiple invocations.
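The ranking step listed above (stars, recency, code structure) might combine its signals roughly as in the sketch below. The field names, the weights, and the helpers `scoreResult` and `topHits` are assumptions made for illustration; the actual script may weigh these signals quite differently.

```typescript
// Hypothetical shape of one search hit; field names are illustrative only.
interface SearchHit {
  repo: string; // "owner/name"
  path: string;
  stars: number;
  pushedAt: string; // ISO 8601 date of the last push
  structureScore: number; // 0..1, e.g. from a heuristic on the file contents
}

const MS_PER_DAY = 86400000;

// Combine the three signals into one score between 0 and 1.
// The weights (0.5 / 0.3 / 0.2) are assumed, not taken from the script.
function scoreResult(hit: SearchHit, now: Date): number {
  // log10 scaling so very popular repos do not drown out everything else
  const starScore = Math.min(Math.log10(hit.stars + 1) / 5, 1);
  const ageDays = (now.getTime() - new Date(hit.pushedAt).getTime()) / MS_PER_DAY;
  // linear decay to 0 over two years
  const recencyScore = Math.max(0, 1 - ageDays / 730);
  return 0.5 * starScore + 0.3 * recencyScore + 0.2 * hit.structureScore;
}

// Return the `limit` best hits, highest score first (input is not mutated).
function topHits(hits: SearchHit[], now: Date, limit = 10): SearchHit[] {
  return [...hits]
    .sort((a, b) => scoreResult(b, now) - scoreResult(a, now))
    .slice(0, limit);
}
```

Passing `now` in explicitly keeps the scoring deterministic, which makes it easier to compare rankings across the several invocations Claude runs.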
Claude's Orchestration Responsibilities
Claude executes a multi-phase workflow:
1. Query Strategy Generation: Craft 3-5 targeted search queries considering:
   - Language context (TypeScript vs JavaScript)
   - File type nuances (tsx vs ts for React, config files vs implementations)
   - Specificity variants (broad → narrow)
   - Related terminology expansion
2. Sequential Search Execution: Run the tool multiple times, adapting queries based on intermediate results
3. Result Aggregation: Combine all code files, deduplicate by repo+path, preserve GitHub URLs
4. Pattern Analysis: Extract common imports, architectural styles, implementation patterns
5. Comprehensive Summary: Synthesize findings with trade-offs, recommendations, key code highlights, and GitHub URLs for ALL meaningful files
Claude's Orchestration Workflow
Step 1: Analyze Request & Generate Queries (BLOCKING)
Analyze user request to determine:
- Primary language/framework (TypeScript, Python, Go, etc.)
- Specific patterns sought (hooks, middleware, configs, actions)
- File type variations needed (tsx vs ts, yaml vs json)
Generate 3-5 targeted queries considering nuances:
Example 1: "Find React hooks"
- Query 1: `useState useEffect language:typescript extension:tsx` (matches import statements, hook usage in components)
- Query 2: `function use language:typescript extension:ts` (matches hook definitions like `function useFetch()`)
- Query 3: `const use = language:typescript` (matches arrow function hooks like `const useAuth = () =>`)
Example 2: "Express error handling"
- Query 1: `(req, res, next) => language:javascript` (matches middleware function signatures)
- Query 2: `app.use express` (matches Express middleware registration)
- Query 3: `function(err, req, res, next) language:javascript` (matches error handler signatures with 4 params)
Example 3: "XState state machines in React"
- Query 1: `useMachine language:typescript extension:tsx` (matches React hook usage like `const [state, send] = useMachine(machine)`)
- Query 2: `createMachine language:typescript` (matches machine definitions and imports)
- Query 3: `interpret xstate language:typescript` (matches service creation like `const service = interpret(machine)`)
Rationale for each query:
- Match actual code patterns (function names, import statements, signatures)
- NOT human-friendly search terms or documentation phrases
- Different file extensions capture different use cases (tsx for components, ts for utilities)
- Specific function/variable names find concrete implementations
- Signature patterns match common code structures (arrow functions, function declarations)
- Language/extension filters prevent noise from unrelated languages
CHECKPOINT: Verify 3-5 queries generated before proceeding
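One way to keep the query plan and its checkpoint explicit is to treat each query as data before it is turned into a search string. The sketch below encodes Example 1; the `PlannedQuery` type and the `buildQuery` and `checkpoint` helpers are hypothetical names introduced here, not part of the skill's script.

```typescript
// Hypothetical representation of one planned search.
interface PlannedQuery {
  terms: string; // literal code tokens to match
  language?: string; // value for the language: qualifier
  extension?: string; // value for the extension: qualifier
  rationale: string; // what kind of code this query is expected to match
}

// Render a planned query into the string passed to `pnpm search`.
function buildQuery(q: PlannedQuery): string {
  const parts = [q.terms];
  if (q.language) parts.push(`language:${q.language}`);
  if (q.extension) parts.push(`extension:${q.extension}`);
  return parts.join(" ");
}

// The checkpoint: refuse to continue unless 3-5 queries were planned.
function checkpoint(plan: PlannedQuery[]): string[] {
  if (plan.length < 3 || plan.length > 5) {
    throw new Error(`expected 3-5 queries, got ${plan.length}`);
  }
  return plan.map(buildQuery);
}

// Plan for Example 1 ("Find React hooks"), copied from the queries above.
const reactHooksPlan: PlannedQuery[] = [
  { terms: "useState useEffect", language: "typescript", extension: "tsx", rationale: "hook usage in components" },
  { terms: "function use", language: "typescript", extension: "ts", rationale: "hook definitions" },
  { terms: "const use =", language: "typescript", rationale: "arrow function hooks" },
];
```

Keeping the rationale next to each query also gives Step 5 the "[brief rationale]" text it needs without reconstructing it afterwards.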
Step 2: Execute Searches Sequentially
For each query (in order):
cd plugins/knowledge-work/skills/gh-code-search
pnpm search "query text here"
After each search:
- Note result count - Too many? Too few?
- Check quality indicators - Stars, recency, code structure scores
- Identify patterns - Are results converging on specific approaches?
- Adapt next query if needed:
  - Too many results → Narrow scope (add more filters)
  - Too few results → Broaden (remove restrictive filters)
  - Found strong pattern → Search for similar variations
- Track which queries succeed - Note for summary which strategies were effective
Early stopping: If the first 2-3 queries yield 30+ high-quality results, the remaining queries are optional
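The narrow/broaden rule and the early-stopping check could be expressed as small pure functions, as in the sketch below. The thresholds `TOO_MANY` and `TOO_FEW` are assumptions, since this document does not define what counts as "too many" or "too few", and `adaptQuery` and `canStopEarly` are hypothetical helpers.

```typescript
// Assumed thresholds; tune them to the result counts actually observed.
const TOO_MANY = 500;
const TOO_FEW = 5;

// A qualifier is a token of the form key:value, e.g. language:typescript.
const isQualifier = (token: string): boolean => /^[a-z]+:\S+$/.test(token);

// Narrow by appending a filter, or broaden by dropping the last qualifier.
function adaptQuery(query: string, resultCount: number, narrowingFilter: string): string {
  if (resultCount > TOO_MANY) {
    return query.includes(narrowingFilter) ? query : `${query} ${narrowingFilter}`;
  }
  if (resultCount < TOO_FEW) {
    const tokens = query.split(" ");
    const lastQualifier = tokens.map(isQualifier).lastIndexOf(true);
    if (lastQualifier >= 0) tokens.splice(lastQualifier, 1);
    return tokens.join(" ");
  }
  return query; // result count looks reasonable, keep the query as planned
}

// Early stopping as described above: 2+ queries run and 30+ high-quality results.
function canStopEarly(queriesRun: number, highQualityResults: number): boolean {
  return queriesRun >= 2 && highQualityResults >= 30;
}
```

For instance, a `useMachine language:typescript` query that returns thousands of hits would be narrowed with `extension:tsx`, while the same query with that filter and a single hit would drop the filter again.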
Step 3: Aggregate Results
Combine all code files from all searches:
- Collect files from each search output with their GitHub URLs
- Deduplicate by repository.full_name + file_path
- Keep highest-scored version if duplicates exist
- Note which queries found the same file (indicates strong relevance)
- Maintain diversity - Ensure multiple repositories represented (avoid 10 files from one repo)
- Track provenance - Remember which query found each file (useful for analysis)
- Preserve GitHub URLs - Maintain full GitHub URLs for every file to include in summary
Result: Unified list of 15-30 unique, high-quality code files with GitHub URLs
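The aggregation rules above (deduplicate by repository and path, keep the best score, track provenance, cap files per repository) can be sketched as follows. The `FoundFile` shape and the `maxPerRepo` default of 3 are assumptions chosen for illustration; the script's actual output fields may differ.

```typescript
// Hypothetical shape of one file returned by a search run.
interface FoundFile {
  repo: string; // repository.full_name, e.g. "owner/name"
  path: string;
  score: number;
  url: string; // full GitHub blob URL
  query: string; // the query that produced this hit
}

interface AggregatedFile {
  repo: string;
  path: string;
  score: number;
  url: string;
  queries: string[]; // provenance: every query that found this file
}

function aggregate(files: FoundFile[], maxPerRepo = 3): AggregatedFile[] {
  const byKey = new Map<string, AggregatedFile>();
  for (const f of files) {
    const key = `${f.repo}:${f.path}`; // dedup key: repo full name + file path
    const seen = byKey.get(key);
    if (!seen) {
      byKey.set(key, { repo: f.repo, path: f.path, score: f.score, url: f.url, queries: [f.query] });
      continue;
    }
    if (!seen.queries.includes(f.query)) seen.queries.push(f.query);
    if (f.score > seen.score) {
      // keep the highest-scored version of a duplicate
      seen.score = f.score;
      seen.url = f.url;
    }
  }
  // Highest score first, then cap files per repository to keep the list diverse.
  const perRepo = new Map<string, number>();
  return Array.from(byKey.values())
    .sort((a, b) => b.score - a.score)
    .filter((f) => {
      const count = perRepo.get(f.repo) ?? 0;
      perRepo.set(f.repo, count + 1);
      return count < maxPerRepo;
    });
}
```

A file whose `queries` array has more than one entry was found by several searches, which is the "strong relevance" signal mentioned in the list.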
Step 4: Analyze Patterns
Extract factual patterns across all code files:
Import Analysis:
- List most common imports (group by library)
- Identify standard library vs third-party dependencies
- Note version-specific patterns (e.g., React 18 hooks)
Architectural Patterns:
- Functional vs class-based implementations
- Custom hooks vs inline logic
- Middleware chains vs single-handler
- Configuration patterns (env vars, config files, inline)
Code Structure:
- Function signatures (parameters, return types)
- Error handling approaches (try/catch, error middleware, Result types)
- Testing patterns (unit vs integration, mocking strategies)
- Documentation styles (JSDoc, inline comments, README)
Language Distribution:
- TypeScript vs JavaScript split
- Strict mode usage
- Type definition patterns
DO NOT editorialize - Extract facts, not opinions. Analysis comes in Step 5.
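As an illustration of the import analysis, the sketch below counts how many files import each module, which yields the "Used in [N/total] files" figures for the summary. It is a regex-based approximation rather than a real parser, and `countImports` is a hypothetical helper, not something the script is known to provide.

```typescript
// Count how many files import each module. Recognizes ES `import ... from "x"`,
// side-effect `import "x"`, and CommonJS `require("x")`.
// Dynamic `import("x")` and re-exports are deliberately not handled.
function countImports(sources: string[]): Array<[string, number]> {
  const pattern = /(?:import\s+(?:[^'"]*?\s+from\s+)?|require\(\s*)['"]([^'"]+)['"]/g;
  const counts = new Map<string, number>();
  for (const source of sources) {
    // A Set so that a module imported twice in one file counts once.
    const modules = new Set<string>();
    for (const match of source.matchAll(pattern)) {
      const name = match[1];
      if (name) modules.add(name);
    }
    modules.forEach((name) => counts.set(name, (counts.get(name) ?? 0) + 1));
  }
  // Most widely used modules first.
  return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}
```

Because the result is a plain count per module, it stays on the factual side of the line drawn above; interpreting what the counts mean is left to Step 5.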
Step 5: Generate Comprehensive Summary
Output format (exact structure):
GitHub Code Search: [User Query Topic]
Search Strategy Executed
Ran [N] targeted queries:
- `query text` - [brief rationale] → [X results]
- `query text` - [brief rationale] → [Y results]
- ...
Total unique files analyzed: [N]
Pattern Analysis
Common Imports
- `library-name` - Used in [N/total] files
- `another-lib` - Used in [M/total] files
Architectural Styles
- Functional Programming - [N] files use pure functions, immutable patterns
- Object-Oriented - [M] files use classes, inheritance
- Hybrid - [K] files mix both approaches
Implementation Patterns
- Pattern 1 Name: [Description with prevalence]
- Pattern 2 Name: [Description with prevalence]
Approaches Found
Approach 1: [Name]
Repos: [repo1], [repo2]
Characteristics:
- [Key trait 1]
- [Key trait 2]
Example: repo/file_path:line_number
[relevant code snippet - NOT just first 40 lines]
Approach 2: [Name]
[Similar structure]
Trade-offs
| Approach | Pros | Cons | Best For |
|----------|------|------|----------|
| Approach 1 | [pros] | [cons] | [use case] |
| Approach 2 | [pros] | [cons] | [use case] |
Recommendations
For your use case ([infer from user's context]):
- Primary recommendation: [Approach name]
  - Why: [Rationale based on analysis]
  - Implementation: [High-level steps]
  - References: repo/file_path:line_number
- Alternative: [Another approach]
  - When to use: [Scenarios where this is better]
  - References: repo/file_path:line_number
Key Code Sections
[Concept 1]
Source: repo/file_path:line_range
[Specific relevant code - imports, key function, not arbitrary truncation]
Why this matters: [Brief explanation]
[Concept 2]
Source: repo/file_path:line_range
[Specific relevant code]
Why this matters: [Brief explanation]
All GitHub Files Analyzed
IMPORTANT: Include GitHub URLs for ALL meaningful files found across all searches.
List every unique file analyzed (15-30 files), grouped by repository:
[repo-owner/repo-name] ([stars]⭐)
- file_path - [language] - [brief description of what this file demonstrates]
- file_path - [language] - [brief description]
[repo-owner/repo-name2] ([stars]⭐)
- file_path - [language] - [brief description]
Format: Direct GitHub blob URLs with line numbers where relevant (e.g., https://github.com/owner/repo/blob/main/path/file.ts#L10-L50)
**Summary characteristics:**
- **Factual** - Based on extracted data, not assumptions
- **Actionable** - Clear recommendations with reasoning
- **Contextualized** - References specific code locations with line numbers
- **Balanced** - Shows trade-offs, not just "best practice"
- **Comprehensive** - Covers patterns, approaches, trade-offs, recommendations
- **Accessible** - GitHub URLs for ALL meaningful files so users can explore full source code
---
### Step 6: Save Results to File
**After generating comprehensive summary, persist for future reference:**
1. **Generate timestamp:**
- Invoke `timestamp` skill to get deterministic YYYYMMDDHHMMSS format
- Example: `20250110143052`
2. **Sanitize query for filename:**
- Convert user's original query to kebab-case slug
- Rules: lowercase, spaces → hyphens, remove special chars, max 50 chars
- Example: "React hooks useState" → "react-hooks-usestate"
3. **Construct file path:**
- Directory: `docs/research/github/`
- Format: `<timestamp>-<sanitized-query>.md`
- Full path: `docs/research/github/20250110143052-react-hooks-usestate.md`
4. **Save using Write tool:**
- Content: Full comprehensive summary from Step 5
- Ensures persistence across sessions
- User can reference past research
5. **Log saved location:**
- Inform user where file was saved
- Example: "Research saved to docs/research/github/20250110143052-react-hooks-usestate.md"
**Why save:**
- Comprehensive summaries represent significant analysis work (30-150s of API calls)
- Users may want to reference patterns/trade-offs later
- Builds searchable knowledge base of GitHub research
- Avoids re-running expensive queries for same topics
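The filename rules from steps 2 and 3 can be sketched as below. `slugify` and `researchPath` are hypothetical helpers, and `formatTimestamp` is only a stand-in for the `timestamp` skill; it assumes UTC, which the skill itself may or may not use.

```typescript
// Kebab-case slug: lowercase, spaces → hyphens, special chars removed, max 50 chars.
function slugify(query: string, maxLength = 50): string {
  return query
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // remove special characters
    .trim()
    .replace(/[\s-]+/g, "-") // collapse whitespace/hyphen runs into one hyphen
    .slice(0, maxLength)
    .replace(/^-+|-+$/g, ""); // no leading or trailing hyphen after truncation
}

// Stand-in for the `timestamp` skill: YYYYMMDDHHMMSS, assumed to be UTC.
function formatTimestamp(date: Date): string {
  return date.toISOString().replace(/\D/g, "").slice(0, 14);
}

// Full path under docs/research/github/ as described in step 3.
function researchPath(query: string, date: Date): string {
  return `docs/research/github/${formatTimestamp(date)}-${slugify(query)}.md`;
}
```

With the query "React hooks useState" and the example timestamp from step 1, `researchPath` reproduces the full path shown in step 3.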
---
## Error Handling
**Common issues:**
- **Auth errors:** Prompt user to run `gh auth login --web`
- **Rate limits:** Show remaining quota, reset time. If hit during multi-search, stop gracefully with partial results
- **Network failures:** Continue with partial results, note which queries failed
- **No results for query:** Note in summary, adjust subsequent queries to be broader
- **All queries return same files:** Note low diversity, recommend broader initial queries
**Graceful degradation:** Partial results are acceptable. Complete summary based on available data.
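A minimal sketch of this degradation policy is shown below: a rate limit stops the run and keeps what was collected, while any other failure is recorded and the next query still runs. `RateLimitError`, `RunReport`, and `runAll` are hypothetical names, and the search function is injected so the example stays self-contained; the real script may signal these conditions differently.

```typescript
// Hypothetical error type for "quota exhausted".
class RateLimitError extends Error {}

interface RunReport<T> {
  results: T[];
  failedQueries: string[]; // queries that failed for other reasons (e.g. network)
  stoppedEarly: boolean; // true if a rate limit ended the run
}

async function runAll<T>(
  queries: string[],
  runSearch: (query: string) => Promise<T[]>, // injected stand-in for one tool invocation
): Promise<RunReport<T>> {
  const report: RunReport<T> = { results: [], failedQueries: [], stoppedEarly: false };
  for (const query of queries) {
    try {
      report.results.push(...(await runSearch(query)));
    } catch (err) {
      if (err instanceof RateLimitError) {
        report.stoppedEarly = true; // stop gracefully with partial results
        break;
      }
      report.failedQueries.push(query); // note the failure and continue
    }
  }
  return report;
}
```

The returned report carries enough information for the summary to state which queries failed and whether the run was cut short.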
---
## Limitations
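- GitHub API rate limit: 5,000 req/hr for authenticated core REST requests; the code search endpoint has a separate, lower per-minute limit (multi-search uses more quota)
- Each tool invocation fetches only the top 10 files, even when the search returns more results
- Skips files >100KB
- Sequential execution takes longer than a single query (10-30s per query)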
- Provides factual data, not conclusions (Claude interprets patterns)
- Deduplication assumes exact path matches (renamed files treated as unique)
---
## Typical Execution Time
**Per query:** 10-30 seconds depending on:
- Number of results (100 max)
- File sizes
- Network latency
- API rate limits
**Full workflow (3-5 queries):** 30-150 seconds
**Optimization:** If first queries yield sufficient results, skip remaining queries
---
## Integration Notes
**Example: User asks "find Claude Code skills doing github search"**
```bash
# 1. Claude analyzes: Skills use SKILL.md with frontmatter, likely in .claude/skills/
# 2. Claude generates queries:
#    - "filename:SKILL.md github search" (matches SKILL.md files with "github search" text)
#    - "octokit.rest.search language:typescript" (matches actual Octokit API usage)
#    - "gh api language:typescript path:skills" (matches gh CLI usage in skills)
# 3. Claude executes sequentially:
cd plugins/knowledge-work/skills/gh-code-search
pnpm search "filename:SKILL.md github search"
# [analyze results]
pnpm search "octokit.rest.search language:typescript"
# [analyze results]
pnpm search "gh api language:typescript path:skills"
# 4. Claude aggregates, deduplicates, analyzes patterns
# 5. Claude generates comprehensive summary with trade-offs and recommendations
```
---
## Testing
Validate workflow with diverse queries:
- Simple: "React hooks" (should generate 3+ queries, combine results)
- Complex: "GitHub Actions TypeScript project setup" (requires config + implementation queries)
- Ambiguous: "state management" (should generate queries for Redux, Zustand, XState, Context)
Check quality:
- Deduplication works (no repeated files in summary)
- Diverse repositories (not all from one repo)
- Summary includes trade-offs and recommendations
- Code snippets are relevant (not arbitrary truncations)