Knowledge Compilation
Knowledge compilation is the process of using Large Language Models to transform messy, unstructured information into clean, structured, interlinked reference material. The concept draws an analogy to software compilers: raw sources go in, and an organized wiki comes out. ^[knowledge-compilation.md]
The Problem It Solves
Most knowledge lives scattered across documents, articles, notes, and conversations. Finding what you need requires searching through all of it. A knowledge compiler processes these sources and produces a wiki where every concept has its own page, linked to related concepts. ^[knowledge-compilation.md]
How It Works
The Compilation Pipeline runs through several distinct stages:
- Ingestion — Raw sources (URLs, files, documents) are collected into a sources directory.
- Change Detection — SHA-256 hashes identify which sources have changed since the last compile.
- Concept Extraction — An LLM reads each changed source and extracts key concepts.
- Page Generation — For each concept, the LLM generates a wiki page with proper structure.
- Interlink Resolution — Concept mentions across pages are wrapped in wikilinks.
- Index Generation — A table of contents is built from all concept pages.
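The source does not include an implementation, but the six stages above can be sketched as a single driver function. Everything here is illustrative: the function names, the JSON state file, and the stand-in "LLM" helpers (which a real compiler would replace with model calls) are all assumptions.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def extract_concepts(text: str) -> list[str]:
    # Stand-in for an LLM call; here, every "# " heading counts as a concept.
    return [line[2:].strip() for line in text.splitlines() if line.startswith("# ")]


def generate_body(concept: str, text: str) -> str:
    # Stand-in for an LLM call that drafts the wiki page body for one concept.
    return text


def resolve_links(body: str, concepts: list[str], self_name: str) -> str:
    # Interlink resolution: wrap mentions of *other* concepts in [[wikilinks]].
    for c in concepts:
        if c != self_name:
            body = body.replace(c, f"[[{c}]]")
    return body


def compile_wiki(source_dir: Path, out_dir: Path, state_file: Path) -> list[str]:
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    sources = sorted(source_dir.iterdir())                      # 1. ingestion
    changed = [s for s in sources
               if state.get(s.name) != sha256_of(s)]            # 2. change detection
    concept_texts: dict[str, list[str]] = {}
    for src in changed:
        text = src.read_text()
        for concept in extract_concepts(text):                  # 3. concept extraction
            concept_texts.setdefault(concept, []).append(text)
    all_concepts = list(concept_texts)
    for concept, texts in concept_texts.items():
        body = generate_body(concept, "\n\n".join(texts))       # 4. page generation
        body = resolve_links(body, all_concepts, concept)       # 5. interlink resolution
        page = f"---\ntitle: {concept}\n---\n\n{body}"
        (out_dir / f"{concept}.md").write_text(page)
    pages = sorted(p.stem for p in out_dir.glob("*.md") if p.stem != "index")
    index = "\n".join(f"- [[{p}]]" for p in pages)
    (out_dir / "index.md").write_text(index)                    # 6. index generation
    for src in changed:
        state[src.name] = sha256_of(src)
    state_file.write_text(json.dumps(state))
    return all_concepts
```

A second run against unchanged sources falls through the change-detection step and returns an empty list, which is the incremental behavior described below.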
Incremental Compilation
As with a code compiler, only changed sources need reprocessing. The system tracks source hashes in a state file and skips unchanged sources entirely, saving both time and API costs. ^[knowledge-compilation.md]
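A minimal sketch of the hash check, assuming the state file is JSON mapping source filenames to SHA-256 digests (the function names and file layout are illustrative, not from the source):

```python
import hashlib
import json
from pathlib import Path


def needs_recompile(source: Path, state_file: Path) -> bool:
    """True if `source` changed since the digest recorded at the last compile."""
    digest = hashlib.sha256(source.read_bytes()).hexdigest()
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    return state.get(source.name) != digest


def record_compiled(source: Path, state_file: Path) -> None:
    """Store the source's current digest so the next run can skip it."""
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    state[source.name] = hashlib.sha256(source.read_bytes()).hexdigest()
    state_file.write_text(json.dumps(state, indent=2))
```

Because the digest covers the file's bytes rather than its modification time, moving or touching a file without editing it still counts as unchanged.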
Cross-Source Concepts
When multiple sources discuss the same concept, the compiler detects the overlap through dependency tracking: it records which sources contribute to each concept. A change to any one of those sources triggers recompilation of the shared concept, regenerating it from the content of all contributing sources. ^[knowledge-compilation.md]
Output Format
The output uses YAML frontmatter and wikilinks, making it directly compatible with Obsidian and similar tools. Each concept page includes metadata such as title, summary, source attribution, and timestamps. ^[knowledge-compilation.md]
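As an illustration only (the field names and values are assumptions based on the metadata listed above, not a confirmed schema), a generated concept page might look like:

```markdown
---
title: Incremental Compilation
summary: Only changed sources are reprocessed, tracked via SHA-256 hashes.
sources:
  - knowledge-compilation.md
created: 2025-01-15T09:30:00Z
updated: 2025-02-02T14:05:00Z
---

Like a code compiler, the system skips unchanged sources, linking related
ideas such as [[Change Detection]] and [[Cross-Source Concepts]].
```

Because the frontmatter is plain YAML and the links use `[[...]]` syntax, Obsidian and similar tools can render the page and its graph connections without any conversion step.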
Sources
knowledge-compilation.md