Change Detection

Change Detection is the process of identifying which source documents have been modified since a previous operation, enabling systems to avoid redundant reprocessing of unchanged content. ^[knowledge-compilation.md]

How It Works

Change detection relies on cryptographic hashing to determine whether a source has changed. Each source document is fingerprinted using a SHA-256 hash, and these hashes are stored in a state file between runs. On each subsequent compilation, the system computes fresh hashes for all sources and compares them against the stored values; any mismatch signals a change. ^[knowledge-compilation.md]
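
The hash-and-compare step can be sketched as follows. This is a minimal illustration, not the system's actual code: the function names `fingerprint` and `detect_changes`, and the choice to key stored hashes by file path, are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large sources are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def detect_changes(sources: list[Path], stored: dict[str, str]) -> list[Path]:
    """Return the sources whose fresh hash differs from the stored hash.

    `stored` maps a source path (as a string) to its previously recorded
    digest; this keying scheme is an assumption of the sketch. A source
    with no stored entry is treated as a mismatch.
    """
    changed = []
    for source in sources:
        if stored.get(str(source)) != fingerprint(source):
            changed.append(source)
    return changed
```

In this sketch a source that has never been seen is reported as changed, since `stored.get` returns `None` for it and `None` never equals a digest.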

Role in the Compilation Pipeline

Change detection sits early in the Knowledge Compilation pipeline, immediately after ingestion. Once changed sources are identified, only those sources are forwarded for Concept Extraction and Page Generation. Sources whose hashes match the stored state are skipped entirely. ^[knowledge-compilation.md]
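
As a rough sketch of this gating step, the function below forwards only changed sources to a downstream stage and skips the rest. The name `run_incremental` and the single `process` callback, which stands in for Concept Extraction followed by Page Generation, are hypothetical; the source does not describe the pipeline's real interfaces.

```python
from typing import Callable


def run_incremental(
    current: dict[str, str],
    stored: dict[str, str],
    process: Callable[[str], None],
) -> tuple[list[str], list[str]]:
    """Forward changed sources to `process`; skip sources whose hash matches.

    `current` and `stored` map a source name to its SHA-256 hex digest.
    Returns (processed, skipped) in the iteration order of `current`.
    """
    processed: list[str] = []
    skipped: list[str] = []
    for source, digest in current.items():
        if stored.get(source) == digest:
            skipped.append(source)  # hash matches stored state: no downstream work
        else:
            process(source)  # placeholder for extraction and page generation
            processed.append(source)
    return processed, skipped
```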

Incremental Compilation

The primary benefit of change detection is enabling Incremental Compilation. Because only modified sources trigger reprocessing, the system avoids redundant LLM calls, reducing both execution time and API costs. This mirrors the behavior of incremental builds for code, which recompile only the translation units affected by a change. ^[knowledge-compilation.md]

Cross-Source Dependencies

Change detection also interacts with Cross-Source Concepts. When a source that contributes to a shared concept is modified, the system uses semantic dependency tracking to identify which compiled concept pages depend on that source. Those pages are then queued for recompilation using content from all contributing sources, not just the one that changed. ^[knowledge-compilation.md]
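
The source does not say how the dependency information is represented. One simple possibility, assumed here purely for illustration, is a mapping from each compiled concept page to the set of sources that contribute to it. The name `pages_to_recompile` is hypothetical.

```python
def pages_to_recompile(
    changed: set[str],
    contributors: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Map each affected concept page to ALL of its contributing sources.

    `contributors` maps a concept page name to the sources it was compiled
    from (an assumed representation). A page is affected when at least one
    of its contributors is in `changed`; it is then rebuilt from every
    contributor, not only from the changed one.
    """
    return {
        page: set(sources)
        for page, sources in contributors.items()
        if sources & changed
    }
```

For example, if a page is built from `a.md` and `b.md` and only `a.md` changes, the result pairs that page with both sources, so the unchanged `b.md` is still read during recompilation.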

State File

The state file holds the hash recorded for each source and persists between runs. It acts as the system's memory of what was last processed and is updated at the end of each successful compilation run. ^[knowledge-compilation.md]
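
A state file of this kind could be read and written as sketched below. The JSON format, the function names `load_state` and `save_state`, and the write-then-rename step are all assumptions of this example; the source does not specify the file's format or location.

```python
import json
import os
from pathlib import Path


def load_state(state_path: Path) -> dict[str, str]:
    """Load the stored source-to-hash mapping, or an empty one on a first run."""
    if not state_path.exists():
        return {}
    return json.loads(state_path.read_text(encoding="utf-8"))


def save_state(state_path: Path, hashes: dict[str, str]) -> None:
    """Persist the hash mapping; intended to be called only after a successful run.

    The data is written to a temporary file and then renamed over the real
    one, so an interrupted write does not leave a truncated state file.
    """
    tmp_path = state_path.with_name(state_path.name + ".tmp")
    tmp_path.write_text(json.dumps(hashes, indent=2, sort_keys=True), encoding="utf-8")
    os.replace(tmp_path, state_path)
```

Saving only at the end of a successful run means a failed run leaves the old hashes in place, so the same sources are detected as changed on the next attempt.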

Sources

knowledge-compilation.md