repair-broken-references
When to Use
Use this skill when project references have become stale:
- Documentation contains broken internal links
- External URLs return 404 errors
- Import statements reference moved or deleted modules
- Cross-references between files are out of sync
- Files exist but are never referenced anywhere
Do NOT use for refactoring module dependencies or redesigning information architecture. This skill repairs existing references, not restructures them.
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
project_path | string | Yes | Absolute path to project root |
check_external | boolean | No | Verify external URLs (default: true, slow) |
fix_mode | enum | No | auto (fix obvious), report (document only), interactive (prompt) |
orphan_threshold | integer | No | Days since last modified to flag as orphan (default: 180) |
Procedure
Step 1: Scan for Broken Internal Links
Find all markdown links pointing to non-existent files.
# Find all markdown files
find . -name "*.md" -type f > markdown_files.txt
# Extract all markdown links: [text](path)
grep -oP '\[.*?\]\(\K[^)]+' *.md | sort | uniq > all_links.txt
# For each link:
while read link; do
# Skip external URLs (http/https)
if [[ "$link" =~ ^https?:// ]]; then
continue
fi
# Resolve relative path
target=$(realpath -m "$link")
# Check if target exists
if [ ! -e "$target" ]; then
echo "BROKEN: $link (referenced in $file)" >> broken_internal.txt
fi
done < all_links.txt
Expected: broken_internal.txt lists all broken internal references
On failure: If realpath unavailable, manually check each link
Step 2: Check External URLs
Verify that external links are still accessible (HTTP 200 response).
# Extract external URLs
grep -ohP 'https?://[^\s\)]+' *.md | sort | uniq > external_urls.txt
# Check each URL (rate-limit to avoid bans)
while read url; do
status=$(curl -o /dev/null -s -w "%{http_code}" "$url")
if [ "$status" -ge 400 ]; then
echo "DEAD ($status): $url" >> dead_urls.txt
fi
sleep 0.5 # Rate limit
done < external_urls.txt
Expected: dead_urls.txt lists URLs returning 4xx/5xx errors
On failure: If curl unavailable or blocked, use online link checker or skip
Note: Some URLs may return 403 due to bot detection but work in browsers. Manual review required.
Step 3: Find Broken Imports
Check that all import/require statements reference existing modules.
JavaScript/TypeScript:
# Find all import statements
grep -rh "^import.*from ['\"]" . | sed -E "s/.*from ['\"]([^'\"]+)['\"].*/\1/" > imports.txt
# For each import:
while read import; do
# Skip node_modules and external packages
if [[ "$import" =~ ^[./] ]]; then
# Resolve to file path
target="${import}.js" # Try .js, .ts, .jsx, .tsx
if [ ! -e "$target" ]; then
echo "BROKEN IMPORT: $import" >> broken_imports.txt
fi
fi
done < imports.txt
Python:
# Find all import statements
grep -rh "^from .* import\|^import " . --include="*.py" | \
sed -E "s/from ([^ ]+) import.*/\1/" | \
sed -E "s/import ([^ ]+)/\1/" > imports.txt
# For each local import (starts with .)
# Check if module file exists
R:
# Find library() and source() calls
grep -rh "library(\\|source(" . --include="*.R" | \
sed -E 's/.*library\("([^"]+)"\).*/\1/' > packages.txt
# For source() calls, check if file exists
# For library() calls, check if package installed
Rscript -e "installed.packages()[,'Package']" > installed_packages.txt
Expected: broken_imports.txt lists all references to deleted/moved modules
On failure: If language-specific tool unavailable, manually review recent refactoring commits
Step 4: Find Orphaned Files
Identify files that exist but are never referenced anywhere.
# Find all code files
find . -type f \( -name "*.js" -o -name "*.py" -o -name "*.R" \) > all_files.txt
# For each file:
while read file; do
basename=$(basename "$file")
# Search for references (import, require, source, href, link)
refs=$(grep -r "$basename" . --exclude-dir=node_modules --exclude-dir=.git | wc -l)
# If only 1 reference (itself):
if [ "$refs" -le 1 ]; then
# Check last modified date
last_mod=$(git log -1 --format="%ci" "$file")
# If modified more than orphan_threshold days ago
# Flag as potential orphan
echo "ORPHAN: $file (last modified: $last_mod)" >> orphans.txt
fi
done < all_files.txt
Expected: orphans.txt lists files not referenced elsewhere
On failure: If git log fails, use filesystem mtime instead
Note: Some files (e.g., CLI entry points, top-level scripts) are legitimately unreferenced but not orphans. Requires manual review.
Step 5: Fix Internal Links
Repair broken internal references using one of three strategies:
Strategy 1: Find Moved Files
# For each broken link, search for file by name
while read broken_link; do
filename=$(basename "$broken_link")
# Search for file in project
found=$(find . -name "$filename" | head -1)
if [ -n "$found" ]; then
# Update link to new path
old_path="$broken_link"
new_path="$found"
# Use Edit tool to replace in all markdown files
echo "FIX: $old_path -> $new_path"
fi
done < broken_internal.txt
Strategy 2: Create Redirect Stub
# If file was deleted intentionally, create redirect stub
echo "# Moved" > "$broken_link"
echo "This content moved to [new location](new_path.md)" >> "$broken_link"
Strategy 3: Remove Dead Link
# If content no longer exists, remove link (keep text)
# Replace [text](broken_link) with text (plain)
Expected: All broken internal links either fixed, redirected, or removed
On failure: If automated fix breaks context, escalate for manual review
Step 6: Fix Broken Imports
Update import statements to reference correct paths after moves.
JavaScript Example:
// Before (broken)
import { helper } from './utils/helper';
// After (fixed — file moved to lib/)
import { helper } from './lib/helper';
For each broken import:
- Locate the moved module (similar to Step 5)
- Update import path in all files referencing it
- Run linter/type checker to verify fix
Expected: All imports resolve correctly; no module-not-found errors
On failure: If module was truly deleted, escalate to determine if functionality still needed
Step 7: Document Orphaned Files
For files flagged as orphans, determine disposition:
- Keep: Legitimately unreferenced (entry points, scripts, templates)
- Archive: Old code no longer needed but preserve history
- Delete: Dead code with no value
# Orphaned Files Review
| File | Last Modified | Recommendation | Reason |
|------|---------------|----------------|--------|
| scripts/old_deploy.sh | 2024-01-05 | Archive | Replaced by CI/CD |
| src/legacy_api.js | 2023-06-12 | Delete | API v1 fully deprecated |
| bin/cli.py | 2025-12-01 | Keep | CLI entry point (unreferenced by design) |
Expected: Orphan review document created; automated decisions flagged for human approval
On failure: (N/A — document even if no clear disposition)
Step 8: Generate Repair Report
Summarize all broken references and fixes applied.
# Reference Repair Report
**Date**: YYYY-MM-DD
**Project**: <project_name>
**Fix Mode**: auto | report | interactive
## Broken Internal Links
- Total: X
- Fixed: Y
- Redirected: Z
- Escalated: W
Details:
- [file.md](file.md) line 45: Fixed broken link to moved doc
- [another.md](another.md) line 12: Created redirect stub
## Dead External URLs
- Total: X
- Fixed (wayback machine): Y
- Removed: Z
Details:
- https://example.com/old-page (404) → Removed
- https://api.old.com/docs (gone) → Replaced with new docs
## Broken Imports
- Total: X
- Fixed: Y
- Escalated: Z
Details:
- src/main.js line 3: Updated import path after refactor
## Orphaned Files
- Total: X
- Kept: Y
- Archived: Z
- Escalated for review: W
See ORPHAN_REVIEW.md for full analysis.
## Validation
- [x] All tests pass after fixes
- [x] Linter reports no module-not-found errors
- [x] Dead links documented in report
Expected: Report saved to REFERENCE_REPAIR_REPORT.md
On failure: (N/A — generate report regardless)
Validation Checklist
After repairs:
- No broken internal links in documentation
- Dead external URLs documented (not all fixable)
- All imports resolve correctly
- Orphaned files reviewed and dispositioned
- Tests pass after import fixes
- Linter reports no unresolved references
- Git history preserved (used
git mvfor any moves)
Common Pitfalls
-
Automatic URL Fixes Break Context: Replacing dead links with web.archive.org URLs may not be what the author intended. Some links are better removed.
-
Over-Aggressive Orphan Deletion: Entry points, CLI scripts, and templates are often unreferenced by design. Don't delete without review.
-
Import Path Assumptions: Assuming all relative imports use the same base path. Different module systems (CommonJS, ES6, TypeScript) handle paths differently.
-
External URL False Positives: Some sites block curl/bots but work fine in browsers. Always manually verify dead URLs.
-
Circular Reference Traps: File A imports B, B imports A. Updating one breaks the other. Requires simultaneous fix.
-
Ignoring Fragment Identifiers: Fixing
[link](#section)requires checking if#sectionanchor exists, not just if file exists.
Related Skills
- clean-codebase — Remove dead code after confirming orphans
- tidy-project-structure — Reorganize files (may create broken references)
- escalate-issues — Route complex reference issues to specialists
- compliance/documentation-audit — Comprehensive documentation review
- web-dev/link-checker — Advanced external URL validation