concepts/knowledge-compilation.md

Knowledge Compilation

Knowledge compilation is the process of using large language models (LLMs) to transform messy, unstructured information into clean, structured, interlinked reference material. The concept draws an analogy to software compilers: raw sources go in, and an organized wiki comes out. ^[knowledge-compilation.md]

The Problem It Solves

Most knowledge lives scattered across documents, articles, notes, and conversations. Finding what you need requires searching through all of it. A knowledge compiler processes these sources and produces a wiki where every concept has its own page, linked to related concepts. ^[knowledge-compilation.md]

How It Works

The compilation pipeline runs through six stages:

  1. Ingestion — Raw sources (URLs, files, documents) are collected into a sources directory.
  2. Change Detection — SHA-256 hashes identify which sources have changed since the last compile.
  3. Concept Extraction — An LLM reads each changed source and extracts key concepts.
  4. Page Generation — For each concept, the LLM generates a wiki page with proper structure.
  5. Interlink Resolution — Concept mentions across pages are wrapped in wikilinks.
  6. Index Generation — A table of contents is built from all concept pages.

^[knowledge-compilation.md]
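
The last two stages are deterministic text processing, so they can be illustrated without an LLM. The following is a minimal sketch, not the compiler's actual implementation: the function names `add_wikilinks` and `build_index` are hypothetical, and it assumes concepts are matched by their literal titles, case-insensitively.

```python
import re


def add_wikilinks(text: str, concepts: list[str]) -> str:
    """Wrap each mention of a known concept title in [[wikilink]] syntax."""
    if not concepts:
        return text
    # Try longer titles first so "knowledge compilation" wins over "compilation".
    ordered = sorted(concepts, key=len, reverse=True)
    alternatives = "|".join(re.escape(c) for c in ordered)
    # The first branch matches links that already exist so they are left alone.
    pattern = re.compile(
        r"\[\[[^\]]*\]\]|\b(" + alternatives + r")\b", re.IGNORECASE
    )

    def replace(match: re.Match) -> str:
        if match.group(1) is None:  # an existing [[...]] link
            return match.group(0)
        return "[[" + match.group(1) + "]]"

    return pattern.sub(replace, text)


def build_index(titles: list[str]) -> str:
    """Build a flat table of contents, one wikilink per concept page."""
    return "\n".join(f"- [[{t}]]" for t in sorted(titles, key=str.lower))
```

Matching on literal titles misses synonyms and inflected forms; a real pipeline might ask the LLM to resolve those, which this sketch does not attempt.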

Incremental Compilation

Like a code compiler, only changed sources need reprocessing. The system tracks source hashes in a state file and skips unchanged sources entirely, saving both time and API costs. ^[knowledge-compilation.md]
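
The hash check described above can be sketched in a few lines. This is an illustrative example under assumptions: the state file is taken to be a JSON map from relative source path to SHA-256 digest, and the names `changed_sources` and `save_state` are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_sources(
    sources_dir: Path, state_file: Path
) -> tuple[list[Path], dict[str, str]]:
    """Compare current hashes with the saved state; return changed files and new hashes."""
    previous = json.loads(state_file.read_text()) if state_file.exists() else {}
    current: dict[str, str] = {}
    changed: list[Path] = []
    for path in sorted(sources_dir.rglob("*")):
        if not path.is_file():
            continue
        key = path.relative_to(sources_dir).as_posix()
        current[key] = sha256_of(path)
        # New files have no previous hash, so they also count as changed.
        if previous.get(key) != current[key]:
            changed.append(path)
    return changed, current


def save_state(state_file: Path, hashes: dict[str, str]) -> None:
    """Persist the hashes so the next run can skip unchanged sources."""
    state_file.write_text(json.dumps(hashes, indent=2, sort_keys=True))
```

Saving the state only after a successful compile means a failed run is retried next time rather than silently skipped. The state file should live outside the sources directory so it is not hashed as a source itself.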

Cross-Source Concepts

When multiple sources discuss the same concept, the compiler detects overlap through semantic dependency tracking. Changes to one source trigger recompilation of shared concepts using content from all contributing sources. ^[knowledge-compilation.md]
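
One way to picture the bookkeeping is a map from each concept to the sources that contribute to it. The sketch below is an assumption-laden simplification: it treats two mentions as the same concept when their names match after lowercasing and trimming, which stands in for whatever semantic matching the compiler actually performs. The function names are hypothetical.

```python
from collections import defaultdict


def build_concept_map(extracted: dict[str, list[str]]) -> dict[str, set[str]]:
    """Invert {source: [concepts]} into {concept: {sources}}."""
    concept_sources: dict[str, set[str]] = defaultdict(set)
    for source, concepts in extracted.items():
        for concept in concepts:
            # Naive normalization; a stand-in for real semantic matching.
            concept_sources[concept.strip().lower()].add(source)
    return dict(concept_sources)


def concepts_to_recompile(
    concept_sources: dict[str, set[str]], changed: list[str]
) -> dict[str, list[str]]:
    """For each concept touched by a changed source, list ALL contributing sources."""
    changed_set = set(changed)
    return {
        concept: sorted(sources)
        for concept, sources in concept_sources.items()
        if sources & changed_set
    }
```

With this structure, a change to one source returns every shared concept along with the unchanged sources that must be re-read to regenerate its page.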

Output Format

The output uses YAML frontmatter and wikilinks, making it directly compatible with Obsidian and similar tools. Each concept page includes metadata such as title, summary, source attribution, and timestamps. ^[knowledge-compilation.md]
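
As an illustration of what such a page might look like, the sketch below renders one concept page as a string. The field names (`title`, `summary`, `sources`, `updated`) and the function `render_page` are assumptions chosen for the example, not the compiler's documented schema. Values are quoted with `json.dumps`, relying on JSON strings being valid YAML double-quoted scalars, so no third-party YAML library is needed.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def render_page(
    title: str,
    summary: str,
    sources: list[str],
    body: str,
    now: Optional[datetime] = None,
) -> str:
    """Render a concept page: YAML frontmatter followed by the page body."""
    stamp = (now or datetime.now(timezone.utc)).isoformat(timespec="seconds")

    def quote(value: str) -> str:
        # JSON string syntax doubles as a YAML double-quoted scalar.
        return json.dumps(value, ensure_ascii=False)

    lines = [
        "---",
        f"title: {quote(title)}",
        f"summary: {quote(summary)}",
        "sources:",
    ]
    lines += [f"  - {quote(s)}" for s in sources]
    lines += [f"updated: {quote(stamp)}", "---", "", body]
    return "\n".join(lines) + "\n"
```

A body such as `"Built on [[incremental compilation]]."` passes through unchanged, so wikilinks added in an earlier stage survive into the final file.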

Sources