karpathy-wiki

Karpathy LLM Wiki pattern implementation — full ingest/query/relink/lint/DeepResearch pipeline, automatic knowledge graph maintenance, URL-level source traceability.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "karpathy-wiki" with this command: npx skills add zhangmengyang/karpathy-wiki-improve

karpathy-wiki — OpenClaw Implementation v3.0

Based on Andrej Karpathy's LLM Wiki pattern.


Wiki Root

wiki_root: /path/to/your/wiki  # configure to your local path
<wiki_root>/
├── raw/
│   ├── sources/          # raw bookmarks/docs (immutable)
│   └── assets/           # images and resources
├── wiki/
│   ├── entities/        # entity pages (people, products, companies, sites, books)
│   ├── concepts/        # concept pages (tech, theory, methodology)
│   ├── comparisons/     # comparison pages
│   ├── synthesis/       # synthesis/overview pages
│   ├── index.md         # wiki index (entry point)
│   ├── log.md           # operation log (append-only)
│   └── overview.md      # global overview
├── purpose.md           # wiki goal definition (wiki constitution)
└── schema.md            # structure conventions
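
The layout above can be created with a short script. This is a minimal sketch, not part of the skill itself: the function name `scaffold_wiki` and the one-line seed text written into each new file are assumptions for illustration. Existing files are never overwritten, which keeps log.md append-only.

```python
from pathlib import Path

# Directory and file names come from the tree above; the seed text
# written into each new file is an assumption for illustration.
DIRS = [
    "raw/sources",
    "raw/assets",
    "wiki/entities",
    "wiki/concepts",
    "wiki/comparisons",
    "wiki/synthesis",
]
FILES = [
    "wiki/index.md",
    "wiki/log.md",
    "wiki/overview.md",
    "purpose.md",
    "schema.md",
]


def scaffold_wiki(wiki_root: str) -> list[Path]:
    """Create any missing directories and files; return the paths created."""
    root = Path(wiki_root)
    created = []
    for rel in DIRS:
        path = root / rel
        if not path.exists():
            path.mkdir(parents=True)
            created.append(path)
    for rel in FILES:
        path = root / rel
        if not path.exists():
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(f"# {path.stem}\n", encoding="utf-8")
            created.append(path)
    return created
```

Running it a second time against the same root creates nothing, so it is safe to call before every ingest.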

Core Principles (v3.0)

  1. sources/ is read-only — LLM only writes wiki/, never modifies raw sources
  2. wikilink cross-references: [[page-slug]] syntax for page connections
  3. YAML frontmatter — every page has type/tags/related/sources
  4. Bidirectional links enforced — every write to related must sync back-link
  5. Two-phase Ingest — Analysis → Generation
  6. URL-level traceability — sources contain specific URLs, not just filenames
  7. Lint-driven — periodic health checks, graph stays clean
  8. Deep Research — knowledge gaps auto-discovered and filled

Page Type Taxonomy (entity vs concept boundary)

| Type | Definition | Examples |
| --- | --- | --- |
| entity | Named, discrete things | people/products/companies/sites/books/tools |
| concept | Abstract ideas/theories/methodologies | indexing principles, microservices, DI |
| comparison | Multi-option comparisons | Vue vs React, MySQL vs PostgreSQL |
| synthesis | Comprehensive overview | tech stack panorama, annual summary |

Boundary Rules:

  • If it has a specific name → entity ("pdai.tech", "Effective Java")
  • If it's abstract/generic → concept ("MySQL indexing", "dependency injection")
  • Avoid having both entity and concept for the same topic

Naming Convention

entity:
  blogs/sites: use domain or person name
    → mysql-zhu-shuangyin
    → pdai-tech
    → jon-index-blog
  books: use simplified book title
    → effective-java

concept:
  use core terms in kebab-case
    → mysql-innodb
    → jwt-json-web-token
    → dependency-injection

comparison:
  → mysql-postgresql-comparison
  → vue-vs-react

synthesis:
  → go-web-dev-overview
  → 2026-learning-roadmap-summary

Rules:

  • All lowercase, hyphen-separated
  • No mixed Chinese/English
  • Unique slugs, no duplicates
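
The three rules can be checked mechanically. The sketch below is one possible reading, assuming "no mixed Chinese/English" means ASCII-only slugs; the names `is_valid_slug` and `find_duplicate_slugs` are hypothetical helpers, not part of the skill.

```python
import re

# Lowercase ASCII letters/digits in hyphen-separated groups, e.g. "vue-vs-react".
SLUG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_slug(slug: str) -> bool:
    """True when the slug is all lowercase, hyphen-separated ASCII."""
    return SLUG_RE.fullmatch(slug) is not None


def find_duplicate_slugs(paths: list[str]) -> dict[str, list[str]]:
    """Group page paths by slug (file name without .md); keep slugs used more than once."""
    by_slug: dict[str, list[str]] = {}
    for path in paths:
        slug = path.rsplit("/", 1)[-1].removesuffix(".md")
        by_slug.setdefault(slug, []).append(path)
    return {slug: ps for slug, ps in by_slug.items() if len(ps) > 1}
```

The duplicate check compares slugs across directories, so an entity page and a concept page with the same file name are reported together. That also surfaces violations of the "avoid having both entity and concept for the same topic" boundary rule.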

YAML Frontmatter (Required Fields)

---
type: entity | concept | comparison | synthesis
title: Page Title
created: YYYY-MM-DD
updated: YYYY-MM-DD
tags: [tag1, tag2]
related: [page-slug-1, page-slug-2]  # forward reference (back-link auto-added)
sources:
  - file: bookmarks_xxx.md
    urls:
      - https://example.com/article1
      - https://example.com/article2
---

sources.urls is mandatory — URL-level traceability is a core principle.
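
A validator for these fields might look like the following sketch. It assumes the frontmatter has already been parsed into a dict by whatever YAML loader is in use; the function name `validate_frontmatter` and the wording of the problem messages are illustrative, not defined by the skill.

```python
import re
from datetime import date

PAGE_TYPES = {"entity", "concept", "comparison", "synthesis"}
REQUIRED_FIELDS = ("type", "title", "created", "updated", "tags", "related", "sources")
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_iso_date(value) -> bool:
    """Accept date objects (some YAML loaders produce them) or YYYY-MM-DD strings."""
    if isinstance(value, date):
        return True
    if not DATE_RE.fullmatch(str(value)):
        return False
    try:
        date.fromisoformat(str(value))
    except ValueError:
        return False
    return True


def validate_frontmatter(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the frontmatter passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in meta]
    if "type" in meta and meta["type"] not in PAGE_TYPES:
        problems.append(f"unknown type: {meta['type']}")
    for name in ("created", "updated"):
        if name in meta and not is_iso_date(meta[name]):
            problems.append(f"{name} is not YYYY-MM-DD: {meta[name]}")
    if "sources" in meta and not meta["sources"]:
        problems.append("sources is empty")
    for i, source in enumerate(meta.get("sources") or []):
        if not isinstance(source, dict):
            problems.append(f"sources[{i}] is not a mapping")
            continue
        if not source.get("file"):
            problems.append(f"sources[{i}] has no file")
        if not source.get("urls"):
            problems.append(f"sources[{i}] has no urls")
    return problems
```

A source entry that names a file but lists no URLs is reported as a problem, which is the case the traceability principle is meant to rule out.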


Quality Thresholds

Every concept/comparison page must have:

| Requirement | Description |
| --- | --- |
| One-line definition | frontmatter title or the "> ..." line under the page header |
| Core principles ≥ 3 | body contains at least 3 substantial points |
| Related pages ≥ 1 | related field is non-empty |
| Source URLs | sources.urls is non-empty |
| Back-links added | every page in related back-links to this page |

Every entity page must have:

| Requirement | Description |
| --- | --- |
| One-line description | frontmatter title |
| Key features ≥ 2 | body has substantive descriptions |
| Related pages ≥ 1 | related field is non-empty |
| Source URLs | sources.urls is non-empty |
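
The per-page thresholds can be approximated with a simple counter. This sketch makes several assumptions: the frontmatter is already parsed into a dict, list items under "## Core Principles" and "## Key Features" (the section names used by the page templates) stand in for "substantial points", and comparison pages reuse the "Core Principles" heading. It cannot judge whether a point is substantive, and the back-link requirement needs the whole graph, so it is not covered here.

```python
import re


def count_bullets(body: str, heading: str) -> int:
    """Count top-level list items in the section that starts at '## <heading>'."""
    count, in_section = 0, False
    for line in body.splitlines():
        if line.startswith("## "):
            in_section = line[3:].strip() == heading
        elif in_section and re.match(r"[-*] \S", line):
            count += 1
    return count


def has_one_liner(body: str) -> bool:
    """True when the body contains a non-empty '> ...' blockquote line."""
    return any(re.match(r">\s*\S", line) for line in body.splitlines())


def quality_failures(meta: dict, body: str) -> list[str]:
    """Return the thresholds a page fails; an empty list means it passes."""
    failures = []
    if not meta.get("title") and not has_one_liner(body):
        failures.append("no one-line definition")
    if not meta.get("related"):
        failures.append("empty related")
    sources = meta.get("sources") or []
    if not sources or not all(s.get("urls") for s in sources):
        failures.append("missing source urls")
    page_type = meta.get("type")
    if page_type in ("concept", "comparison") and count_bullets(body, "Core Principles") < 3:
        failures.append("core principles < 3")
    if page_type == "entity" and count_bullets(body, "Key Features") < 2:
        failures.append("key features < 2")
    return failures
```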

Page Templates

Entity Page

---
type: entity
title: Entity Name
created: YYYY-MM-DD
updated: YYYY-MM-DD
tags: [tags]
related: [page-slug-1, page-slug-2]
sources:
  - file: bookmarks_xxx.md
    urls:
      - https://example.com
---

# Entity Name

> One-line description (used in index.md summary).

## Overview
Main content and background.

## Key Features
- Feature 1
- Feature 2

## Related
- [[page-slug]] — reason (back-link auto-added)

## Sources
- [Article Title](https://example.com) — source description

Concept Page

---
type: concept
title: Concept Name
created: YYYY-MM-DD
updated: YYYY-MM-DD
tags: [tags]
related: [page-slug-1, page-slug-2]
sources:
  - file: bookmarks_xxx.md
    urls:
      - https://example.com/article1
---

# Concept Name

> One-line definition.

## Core Principles
- Principle 1
- Principle 2
- Principle 3

## Use Cases
- Use case 1

## Related
- [[page-slug]] — reason

## Counter-arguments / Data Gaps
- Known limitations
- Uncovered aspects

## Sources
- [Article Title](https://example.com) — source description

Operations

Ingest (Collection & Digestion)

Phase 1 — Analysis

## Key Entities
Identified entities

## Key Concepts
Identified core concepts

## Main Arguments & Findings
Key arguments and findings

## Connections to Existing Wiki
Relations to existing wiki pages

## Contradictions & Tensions
Conflicts with existing knowledge

## Coverage Gaps
What was mentioned but not covered deeply?
What related topics are missing?

## Recommendations
Which pages to create or update

Phase 2 — Generation

  1. Create/update target pages (with urls in sources)
  2. Sync related + back-link (bidirectional link enforcement)
  3. Verify pages meet quality thresholds
  4. Update index.md
  5. Append to log.md

Output format:

---FILE: wiki/concepts/page.md---
[page content with sources.urls]
---END FILE---

---FILE: wiki/entities/backlink-target.md---
[update target page, append back-link]
---END FILE---

---FILE: wiki/index.md---
[append new page entry]
---END FILE---

---FILE: wiki/log.md---
[append ingest log entry]
---END FILE---
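
Output in this format can be split back into files with a regular expression. The sketch below is an assumption about how a consumer might do it: `parse_file_blocks` and `is_writable_target` are hypothetical names, and the path check is one way to apply the "LLM only writes wiki/" principle before anything touches disk.

```python
import re
from pathlib import PurePosixPath

# Matches one "---FILE: <path>---" ... "---END FILE---" block.
FILE_BLOCK_RE = re.compile(
    r"^---FILE: (?P<path>[^\n]+?)---\n(?P<content>.*?)^---END FILE---$",
    re.DOTALL | re.MULTILINE,
)


def parse_file_blocks(output: str) -> list[tuple[str, str]]:
    """Return (path, content) pairs in the order they appear in the output."""
    return [(m["path"].strip(), m["content"]) for m in FILE_BLOCK_RE.finditer(output)]


def is_writable_target(path: str) -> bool:
    """Allow only relative paths under wiki/; raw/ stays read-only."""
    parts = PurePosixPath(path).parts
    return len(parts) > 1 and parts[0] == "wiki" and ".." not in parts
```

Because the end marker is matched at the start of a line, the `---` delimiters of a page's own YAML frontmatter inside a block do not terminate it early.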

Query

  1. Read wiki/index.md to locate relevant pages
  2. Read related pages + extract sources.urls
  3. Use web_fetch to trace and verify original URLs
  4. Synthesize answer, annotate source confidence

Relink (Automatic Relationship Discovery)

Trigger: batch ingest complete / periodic heartbeat

Process:

1. Scan tags and body text of all pages under wiki/ (wiki/**/*.md)
2. Extract core topics from each page
3. Find page pairs sharing tags/topics
4. Analyze relationship strength pairwise
5. Generate recommended link list (candidate)
6. User confirms before writing (back-link sync)

Execution steps:

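# 1. Collect tags per page; pages that share a tag are candidate pairs
grep -rH "^tags:" wiki/concepts/ wiki/entities/

# 2. List orphan pages (bash; globstar lets ** recurse into subdirectories)
shopt -s globstar
for f in wiki/**/*.md; do
  slug=$(basename "$f" .md)
  # Empty when the field is missing or written as "related: []"
  related=$(grep "^related:" "$f" | sed -E 's/^related://; s/#.*//; s/[][[:space:]]//g')
  # Inbound = any other page that contains [[slug]]
  inbound=$(grep -rlF --include='*.md' "[[${slug}]]" wiki/ | grep -vxF "$f")
  [ -z "$related" ] && [ -z "$inbound" ] && echo "$f is orphan"
done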

# 3. LLM generates relink suggestion report
#    Format:
#    [[page-A]] <--> [[page-B]]  reason: shared tag MySQL B+tree
#    [[page-C]] --> [[page-D]]   reason: C mentions D but not linked
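
The shared-tag part of the candidate list can be computed before the LLM is asked for reasons. This is a sketch under stated assumptions: tags and related lists have already been read from frontmatter into dicts, relationship strength is approximated by the number of shared tags, and `candidate_links` and `format_suggestion` are hypothetical names.

```python
from itertools import combinations


def candidate_links(
    tags_by_slug: dict[str, set[str]],
    related: dict[str, set[str]],
    min_shared: int = 1,
) -> list[tuple[str, str, list[str]]]:
    """Return page pairs that share tags but are not yet linked both ways,
    strongest overlap first."""
    candidates = []
    for a, b in combinations(sorted(tags_by_slug), 2):
        shared = tags_by_slug[a] & tags_by_slug[b]
        already = b in related.get(a, set()) and a in related.get(b, set())
        if len(shared) >= min_shared and not already:
            candidates.append((a, b, sorted(shared)))
    candidates.sort(key=lambda c: (-len(c[2]), c[0], c[1]))
    return candidates


def format_suggestion(a: str, b: str, shared: list[str]) -> str:
    """Render one line in the suggestion report format shown above."""
    return f"[[{a}]] <--> [[{b}]]  reason: shared tag {' '.join(shared)}"
```

The result is only a candidate list. Nothing is written until the user confirms, as the process requires.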

Write rules:

  • Update A's related to add B
  • Update B's related to add A
  • Append to log.md

Lint (Health Check) — Enhanced

Trigger: user request / periodic heartbeat

Scan dimensions (6):

| # | Dimension | Description |
| --- | --- | --- |
| 1 | Orphan pages | No related pages, no inbound links |
| 2 | Dangling references | related references non-existent slugs |
| 3 | One-way links | A→B but B→A missing |
| 4 | Contradiction detection | Same claim described differently across pages |
| 5 | Quality threshold | Page fails minimum quality (no urls/no related/principles<3) |
| 6 | Naming drift | Slug style inconsistent (mixed case/mixed Chinese-English) |
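
Dimensions 1 to 3 are pure graph checks and need no LLM. The sketch below assumes the related field of every page has been loaded into a dict keyed by slug, and it counts inbound links from related fields only; wikilinks in body text are ignored. The function name `lint_links` is illustrative.

```python
def lint_links(related: dict[str, list[str]]) -> dict[str, list]:
    """Check orphan pages, dangling references and one-way links
    on a slug -> related-slugs mapping."""
    pages = set(related)
    inbound = {slug: set() for slug in pages}
    dangling, one_way = [], []
    for a in sorted(pages):
        for b in related[a]:
            if b not in pages:
                dangling.append((a, b))  # a references a slug with no page
                continue
            inbound[b].add(a)
            if a not in related[b]:
                one_way.append((a, b))  # a -> b exists, b -> a is missing
    orphans = sorted(s for s in pages if not related[s] and not inbound[s])
    return {"orphans": orphans, "dangling": dangling, "one_way": one_way}
```

Contradiction detection (dimension 4) still needs the LLM to compare claims across pages, so it is left out of this sketch.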

Lint report format:

## Lint Report — YYYY-MM-DD

### Orphan Pages (N)
- [[page]] — no related, no inbound

### One-way Links (N)
- [[A]] → [[B]] (B not back-linking A)

### Dangling References (N)
- [[page]] references non-existent [[nonexistent]]

### Quality Failures (N)
- [[page]] — missing urls source
- [[page]] — empty related

### Contradictions (N)
- [[page-A]] says: X is Y
- [[page-B]] says: X is Z

### Naming Issues (N)
- [[page]] — slug has uppercase/mixed Chinese-English

### Recommended Actions
1. [Priority 1]
2. [Priority 2]

Deep Research

Trigger: Coverage Gaps reported by ingest analysis or lint / user says "research X"

Process:

1. Discover knowledge gap
   lint report "missing coverage" items
   user says "help me research XXX"

2. Generate search queries
   LLM generates 3-5 search queries from gap

3. Multi-source search
   Execute web_search for each query

4. Ingest results
   Write search results to raw/sources/ as new files (existing sources stay unmodified)
   Execute ingest to generate new pages

5. relink + lint
   complete relationships + health check

purpose.md (Wiki Constitution)

Every wiki should have purpose.md defining:

# purpose.md

## Goal
Who is this wiki for? What problem does it solve?

## Core Questions
What core questions must this wiki answer?

## Scope
What domains are covered?
What is explicitly excluded?

## Evolution Direction
Near-term (3 months): fill gaps in which domains?
Mid-term (6 months): what state to achieve?
Long-term (1 year): what is the ideal wiki form?

## Quality Standards
What is the minimum quality threshold?

Source Traceability Chain

User bookmarks (Chrome export)
  ↓
raw/sources/bookmarks_xxx.md  (immutable)
  ↓  Ingest writes
wiki/xxx.md
  sources:
    - file: bookmarks_xxx.md
      urls:
        - https://example.com  ← specific URL
  ↓  Query time
OpenClaw reads wiki → reads sources.urls → web_fetch original URL → verify

Use Cases

  • User asks technical question (check wiki first, then search)
  • User says "help me digest this link"
  • User requests "organize my collected content on XXX"
  • User requests "run lint"
  • User requests "relink"
  • User requests "research X" (Deep Research)
  • Periodic heartbeat triggers lint + relink + quality check

Confidence Annotations

| Annotation | Meaning |
| --- | --- |
| ✅ Verified | wiki content matches original URL source |
| ⚠️ Inferred | wiki content is LLM inference based on source, not direct quote |
| ❌ Disputed | wiki content contradicts source, needs verification |

Bidirectional Link Write Rules (Enforced)

Every time you modify the related field:

When writing A's related to add B:
  1. Add B to A's related: [...]
  2. Check if B's related already has A
  3. If not, add A as back-link
  4. If yes, skip
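
The four steps above can be sketched as a single function. It assumes the related lists are held in memory as a dict keyed by slug; the name `add_related` and the rejection of self-links are assumptions, not rules stated by the skill.

```python
def add_related(related: dict[str, list[str]], a: str, b: str) -> list[str]:
    """Link a -> b and make sure b -> a exists.

    Returns the slugs whose related list changed, i.e. the pages that
    need to be rewritten on disk.
    """
    if a == b:
        # Assumption: a page linking to itself is treated as an error.
        raise ValueError("a page cannot be related to itself")
    for slug in (a, b):
        if slug not in related:
            raise KeyError(f"unknown page: {slug}")
    changed = []
    if b not in related[a]:
        related[a].append(b)  # step 1: forward link
        changed.append(a)
    if a not in related[b]:
        related[b].append(a)  # steps 2-3: add the missing back-link
        changed.append(b)
    return changed  # step 4: nothing is added when the link already exists
```

Calling it twice with the same pair returns an empty list the second time, so repeated relink runs do not duplicate entries.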

Prohibited:

  • ❌ Write A→B only, skip B→A
  • ❌ Leave related empty with no links added
  • ❌ sources has only file, no urls
