Semgrep Static Analysis
When to Use Semgrep
Ideal scenarios:
- Quick security scans (minutes, not hours)
- Pattern-based bug detection
- Enforcing coding standards and best practices
- Finding known vulnerability patterns
- Single-file analysis without complex data flow
- First-pass analysis before deeper tools
Consider CodeQL instead when:
- Need interprocedural taint tracking across files
- Complex data flow analysis required
- Analyzing custom proprietary frameworks
Installation
pip
pip install semgrep
Homebrew
brew install semgrep
Docker
docker run --rm -v "${PWD}:/src" returntocorp/semgrep semgrep --config auto /src
Core Workflow
- Quick Scan
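semgrep --config auto .                      # Auto-detect rules (needs metrics enabled)
semgrep --config p/default --metrics=off .   # Disable telemetry; use an explicit ruleset, since auto config does not run with metrics off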
- Use Rulesets
semgrep --config p/<RULESET> .                                # Single ruleset
semgrep --config p/security-audit --config p/trailofbits .    # Multiple
| Ruleset | Description |
|---|---|
| p/default | General security and code quality |
| p/security-audit | Comprehensive security rules |
| p/owasp-top-ten | OWASP Top 10 vulnerabilities |
| p/cwe-top-25 | CWE Top 25 vulnerabilities |
| p/trailofbits | Trail of Bits security rules |
| p/python | Python-specific |
| p/javascript | JavaScript-specific |
| p/rust | Rust-specific |
- Output Formats
semgrep --config p/security-audit --sarif -o results.sarif .   # SARIF
semgrep --config p/security-audit --json -o results.json .     # JSON
semgrep --config p/security-audit --dataflow-traces .          # Show data flow
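The JSON report is convenient for post-processing. The sketch below summarizes findings by severity. It assumes the usual report shape (a top-level `results` list whose entries carry `check_id`, `path`, `start.line`, and `extra.severity`); the embedded sample data is hypothetical and stands in for a real `results.json`.

```python
import json
from collections import Counter

# Hypothetical excerpt shaped like `semgrep --json` output; real reports carry more fields.
SAMPLE_REPORT = """
{
  "results": [
    {"check_id": "hardcoded-password", "path": "app/settings.py",
     "start": {"line": 12, "col": 1}, "end": {"line": 12, "col": 24},
     "extra": {"message": "Hardcoded password detected", "severity": "ERROR"}},
    {"check_id": "sql-injection", "path": "app/db.py",
     "start": {"line": 40, "col": 5}, "end": {"line": 40, "col": 60},
     "extra": {"message": "Potential SQL injection", "severity": "ERROR"}},
    {"check_id": "debug-enabled", "path": "app/main.py",
     "start": {"line": 3, "col": 1}, "end": {"line": 3, "col": 15},
     "extra": {"message": "Debug mode enabled", "severity": "WARNING"}}
  ],
  "errors": []
}
"""


def summarize(report: dict) -> Counter:
    """Count findings per severity level."""
    return Counter(r["extra"]["severity"] for r in report.get("results", []))


def format_finding(result: dict) -> str:
    """Render one finding as path:line [SEVERITY] rule-id."""
    severity = result["extra"]["severity"]
    line = result["start"]["line"]
    return f'{result["path"]}:{line} [{severity}] {result["check_id"]}'


report = json.loads(SAMPLE_REPORT)
counts = summarize(report)
lines = [format_finding(r) for r in report["results"]]

for text in lines:
    print(text)
print(dict(counts))
```

To use it on a real scan, replace `json.loads(SAMPLE_REPORT)` with `json.load(open("results.json"))`.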
Writing Custom Rules
Basic Structure
rules:
  - id: hardcoded-password
    languages: [python]
    message: "Hardcoded password detected: $PASSWORD"
    severity: ERROR
    pattern: password = "$PASSWORD"
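A rule like this can be checked with Semgrep's test annotations: a `# ruleid: <id>` comment marks the next line as an expected match, and `# ok: <id>` marks a line that must not match. The sketch below is a possible test file for the rule above; `get_from_vault` and the `APP_PASSWORD` variable are hypothetical stand-ins.

```python
import os


def get_from_vault() -> str:
    # Hypothetical helper standing in for a real secrets-manager lookup.
    return os.environ.get("APP_PASSWORD", "")


# ruleid: hardcoded-password
password = "hunter2"

# ok: hardcoded-password
password = get_from_vault()

# ok: hardcoded-password
username = "admin"
```

Assuming the rule and this file share a base name in one directory (for example `hardcoded-password.yaml` and `hardcoded-password.py`), `semgrep --test <directory>` reports whether the annotations and the actual matches agree.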
Pattern Syntax
| Syntax | Description | Example |
|---|---|---|
| ... | Match anything | func(...) |
| $VAR | Capture metavariable | $FUNC($INPUT) |
| <... ...> | Deep expression match | <... user_input ...> |
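To make the three constructs concrete, here is a small Python target annotated with which patterns would be expected to match each line. `func` and `user_input` are placeholder names chosen to mirror the examples in the table, not part of any real API.

```python
def func(*args, **kwargs):
    """Placeholder target so the calls below execute."""
    return len(args) + len(kwargs)


user_input = " 42 "

a = func()                 # `func(...)` matches: zero arguments
b = func(1, 2, key="v")    # `func(...)` matches: any number of arguments
c = len(user_input)        # `$FUNC($INPUT)` matches: $FUNC=len, $INPUT=user_input
d = func(1, 2)             # `$FUNC($INPUT)` does not match: two arguments

# `if <... user_input ...>: ...` matches: user_input is nested inside the condition
if int(user_input.strip()) > 0:
    e = "positive"

print(a, b, c, d, e)
```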
Pattern Operators
| Operator | Description |
|---|---|
| pattern | Match exact pattern |
| patterns | All must match (AND) |
| pattern-either | Any matches (OR) |
| pattern-not | Exclude matches |
| pattern-inside | Match only inside context |
| pattern-not-inside | Match only outside context |
| metavariable-regex | Regex on captured value |
Combining Patterns
rules:
  - id: sql-injection
    languages: [python]
    message: "Potential SQL injection"
    severity: ERROR
    patterns:
      - pattern-either:
          - pattern: cursor.execute($QUERY)
          - pattern: db.execute($QUERY)
      - pattern-not: cursor.execute("...", (...))
      - metavariable-regex:
          metavariable: $QUERY
          regex: .*\+.*|.*\.format\(.*|.*%.*
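The following sketch shows the kind of code the sql-injection rule above is meant to separate: a query built by string concatenation versus a parameterized one. It uses an in-memory `sqlite3` database; the table, function names, and payload are illustrative assumptions.

```python
import sqlite3


def find_user_unsafe(cursor, name):
    # Single-argument execute() built with `+`: the shape the rule targets.
    cursor.execute("SELECT id FROM users WHERE name = '" + name + "'")
    return cursor.fetchall()


def find_user_safe(cursor, name):
    # Parameterized query: the driver binds `name` as data, not as SQL.
    cursor.execute("SELECT id FROM users WHERE name = ?", (name,))
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
cursor.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(cursor, payload)  # the injected OR clause matches every row
safe_rows = find_user_safe(cursor, payload)      # no user is literally named like the payload

print(len(unsafe_rows), len(safe_rows))  # prints "2 0"
```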
Taint Mode (Data Flow)
Taint mode tracks data through assignments and transformations:
rules:
  - id: command-injection
    languages: [python]
    message: "User input flows to command execution"
    severity: ERROR
    mode: taint
    pattern-sources:
      - pattern: request.args.get(...)
      - pattern: request.form[...]
      - pattern: request.json
    pattern-sinks:
      - pattern: os.system($SINK)
      - pattern: subprocess.call($SINK, shell=True)
      - pattern: subprocess.run($SINK, shell=True, ...)
    pattern-sanitizers:
      - pattern: shlex.quote(...)
      - pattern: int(...)
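As an illustration of what the command-injection rule above is intended to distinguish, the sketch below contains one function where a source reaches a sink through an assignment, and one where the value passes through a sanitizer first. The `request` object is a hypothetical stand-in for a web framework's request so the file runs without extra dependencies; neither function is actually called.

```python
import os
import shlex
from types import SimpleNamespace

# Hypothetical stand-in for a framework request object (e.g. Flask's `request`).
request = SimpleNamespace(args={"host": "example.com; id"})


def ping_unsafe() -> int:
    host = request.args.get("host")    # source: user-controlled value
    command = "ping -c 1 " + host      # taint propagates through concatenation
    return os.system(command)          # sink: a finding is expected here


def ping_sanitized() -> int:
    host = shlex.quote(request.args.get("host"))  # sanitizer clears the taint
    command = "ping -c 1 " + host
    return os.system(command)          # no finding expected


# Show what the sanitizer does to the payload without executing any command.
quoted = shlex.quote(request.args.get("host"))
print(quoted)
```

Keeping source and sink in the same function matters for this sketch: cross-file, interprocedural flows are the case where the document recommends considering CodeQL instead.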
CI/CD Integration (GitHub Actions)
name: Semgrep
on:
  push:
    branches: [main]
  pull_request:
  schedule:
    - cron: '0 0 1 * *'  # Monthly

jobs:
  semgrep:
    runs-on: ubuntu-latest
    container:
      image: returntocorp/semgrep
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Run Semgrep
        run: |
          if [ "${{ github.event_name }}" = "pull_request" ]; then
            semgrep ci --baseline-commit ${{ github.event.pull_request.base.sha }}
          else
            semgrep ci
          fi
        env:
          SEMGREP_RULES: >-
            p/security-audit
            p/owasp-top-ten
            p/trailofbits
Configuration
.semgrepignore
tests/
fixtures/
**/testdata/
generated/
vendor/
node_modules/
Suppress False Positives
password = get_from_vault()  # nosemgrep: hardcoded-password
dangerous_but_safe()  # nosemgrep
Rationalizations to Reject
| Shortcut | Why It's Wrong |
|---|---|
| "Semgrep found nothing, code is clean" | Semgrep is pattern-based; can't track complex data flow |
| "I wrote a rule, so we're covered" | Rules need testing; false negatives are silent |
| "Too many findings = noisy tool" | High finding count often means real problems |
Resources
- Registry: https://semgrep.dev/explore
- Playground: https://semgrep.dev/playground
- Trail of Bits Rules: https://github.com/trailofbits/semgrep-rules
Attribution
Based on trailofbits/skills semgrep skill.