Building contextlint: A Static Analysis Linter for Markdown Document Consistency
Introduction
In AI-powered software development, an approach is taking hold where structured documents (requirements, specifications, and designs) are managed as an SSOT (Single Source of Truth) and AI generates code based on them. This is commonly called Specification-Driven Development (SDD).
In this approach, the quality of documents directly impacts the quality of generated code. In other words, if documents are broken, the code produced from them will be affected too.
I practice SDD-based document management in my work, and through that process, I’ve faced the challenge of “keeping Markdown documents consistent with each other.”
This article introduces contextlint, a tool I’m building to solve that challenge.
Document Structure in SDD
Layered Document Hierarchy
I organize documents in layers based on their rate of change. Specifically, I use a 3-layer structure:

    docs/
    ├── foundation/          ← Layer 1: Near-immutable (glossary, etc.)
    │   └── glossary.md
    ├── standards/           ← Layer 2: Low frequency (management rules)
    │   └── doc_standards.md
    └── zones/               ← Layer 3: High frequency (requirements, specs, designs)
        ├── auth/
        │   ├── overview.md
        │   ├── requirements.md
        │   ├── spec_session.md
        │   ├── table_users.md
        │   └── screen_login.md
        └── todo/
            ├── overview.md
            ├── requirements.md
            ├── spec_task.md
            ├── table_tasks.md
            └── screen_task_list.md

Layer 1 holds project-wide ubiquitous language, Layer 2 holds document governance rules, and Layer 3 holds the actual requirements, specifications, and designs.
Zones as Boundaries of Responsibility
A Zone is a unit of responsibility boundary. For example, auth (authentication) and todo (task management) each have their own document sets for independent concerns.
Dependencies between zones are explicitly managed.
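To make "explicitly managed" concrete, here is a minimal sketch of a cross-zone check. It assumes the `docs/zones/<zone>/` layout shown above and a hand-written map of declared dependencies; the helper names (`zoneOf`, `undeclaredCrossZoneLinks`) are hypothetical and are not part of contextlint's API.

```typescript
// Hypothetical helper: extract the zone name from a path such as
// "docs/zones/auth/spec_session.md". Returns null for paths outside zonesDir.
function zoneOf(path: string, zonesDir = "docs/zones"): string | null {
  const prefix = zonesDir + "/";
  if (!path.startsWith(prefix)) return null;
  const rest = path.slice(prefix.length);
  const slash = rest.indexOf("/");
  return slash === -1 ? null : rest.slice(0, slash);
}

interface ZoneLink {
  from: string; // file that contains the link
  to: string; // resolved path of the link target
}

// Report links that cross a zone boundary without a matching declaration.
function undeclaredCrossZoneLinks(
  links: ZoneLink[],
  declared: Record<string, string[]>, // zone -> zones it is allowed to depend on
): ZoneLink[] {
  return links.filter(({ from, to }) => {
    const src = zoneOf(from);
    const dst = zoneOf(to);
    if (src === null || dst === null || src === dst) return false;
    return !(declared[src] ?? []).includes(dst);
  });
}

// Illustrative data: only "todo depends on auth" has been declared.
const zoneLinks: ZoneLink[] = [
  { from: "docs/zones/todo/spec_task.md", to: "docs/zones/auth/table_users.md" },
  { from: "docs/zones/todo/spec_task.md", to: "docs/zones/todo/table_tasks.md" },
  { from: "docs/zones/auth/overview.md", to: "docs/zones/todo/overview.md" },
];
const declaredDeps: Record<string, string[]> = { todo: ["auth"] };
const zoneViolations = undeclaredCrossZoneLinks(zoneLinks, declaredDeps);
```

With this data, only the link from `auth/overview.md` into the todo zone is reported, because the auth zone never declared a dependency on todo.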
Traceability from Requirements to Code
The benefit of SDD is the traceability chain: requirements → specifications → design → code.

    requirements.md (REQ-TODO-01)
    ├── spec_task.md (Task CRUD specification)
    │   ├── screen_task_list.md (List screen field definitions)
    │   └── table_tasks.md (tasks table column definitions)
    │       └── prisma/schema.prisma
    └── src/todo/domain/task.ts (// REQ-TODO-01)

The requirement ID REQ-TODO-01 is referenced in specification files, specifications are reflected in table and screen designs, and everything can be traced end-to-end down to the code.
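The code end of that chain can be sketched as follows: collect the requirement IDs tagged in `//` comments and compare them with the IDs the documents define. The ID shape (`REQ-` plus uppercase segments and a 2 to 3 digit number) and the function name are assumptions for illustration, and `REQ-TODO-09` is a made-up ID.

```typescript
// Collect requirement IDs that appear inside "//" line comments.
function requirementTags(source: string): string[] {
  const idPattern = /\bREQ-[A-Z]+(?:-[A-Z]+)*-\d{2,3}\b/g;
  const found = new Set<string>();
  for (const line of source.split("\n")) {
    const commentStart = line.indexOf("//");
    if (commentStart === -1) continue; // no comment on this line
    for (const id of line.slice(commentStart).match(idPattern) ?? []) {
      found.add(id);
    }
  }
  return Array.from(found);
}

// Illustrative data: IDs defined in requirements.md, and a source file.
const knownRequirements = new Set(["REQ-TODO-01", "REQ-TODO-02"]);
const taskSource = [
  "// REQ-TODO-01",
  "class Task {}",
  "// REQ-TODO-09: archived tasks",
].join("\n");

// Tags in code that no requirement document defines.
const unknownTags = requirementTags(taskSource).filter(
  (id) => !knownRequirements.has(id),
);
```

Here `unknownTags` contains only `REQ-TODO-09`, the tag that points at a requirement nobody wrote down.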
When Document Consistency Breaks
As documents grow, maintaining consistency becomes difficult.
Complexity of Stability Management
In client-facing projects, you often handle requirements definition yourself. As requirements solidify through hypotheses, demos, and feedback cycles, you need to track the status of each item.

    | ID          | Requirement                         | Stability | Basis              |
    | ----------- | ----------------------------------- | --------- | ------------------ |
    | REQ-TODO-01 | Create, edit, and delete tasks      | stable    | Agreed at kickoff  |
    | REQ-TODO-02 | Set deadlines on tasks              | review    | Pending demo check |
    | REQ-TODO-03 | Categorize tasks                    | draft     | Hypothesis stage   |

You manage a lifecycle of draft (hypothesis) → review (awaiting confirmation) → stable (agreed), but it’s easy to miss when a stability column is blank or contains an unexpected value.
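The two failure modes (a blank cell and an unexpected value) are mechanical to detect. The sketch below is a simplified illustration, not contextlint's parser: it assumes one pipe-delimited table per input and does not handle escaped pipes inside cells.

```typescript
type Row = Record<string, string>;

// Parse a single pipe-delimited Markdown table into rows keyed by header name.
function parseTable(markdown: string): Row[] {
  const lines = markdown
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("|"));
  const cells = (line: string): string[] =>
    line.replace(/^\|/, "").replace(/\|$/, "").split("|").map((c) => c.trim());
  const header = lines[0];
  if (header === undefined) return [];
  const columns = cells(header);
  // lines[1] is the |---|---| separator row, so data starts at index 2.
  return lines.slice(2).map((line) => {
    const values = cells(line);
    const row: Row = {};
    columns.forEach((name, i) => {
      row[name] = values[i] ?? "";
    });
    return row;
  });
}

const allowedStability = ["draft", "review", "stable"];

// Report rows whose Stability cell is empty or outside the allowed lifecycle.
function stabilityProblems(rows: Row[]): string[] {
  const problems: string[] = [];
  for (const row of rows) {
    const value = row["Stability"] ?? "";
    if (value === "") {
      problems.push(`${row["ID"]}: Stability is empty`);
    } else if (!allowedStability.includes(value)) {
      problems.push(`${row["ID"]}: unexpected Stability "${value}"`);
    }
  }
  return problems;
}

const requirementsTable = `
| ID          | Requirement                    | Stability |
| ----------- | ------------------------------ | --------- |
| REQ-TODO-01 | Create, edit, and delete tasks | stable    |
| REQ-TODO-02 | Set deadlines on tasks         |           |
| REQ-TODO-03 | Categorize tasks               | wip       |
`;
const stabilityReport = stabilityProblems(parseTable(requirementsTable));
```

For this table the report has two entries: REQ-TODO-02 has an empty cell and REQ-TODO-03 uses the value `wip`, which is not part of the lifecycle.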
Fragility of Cross-References
It’s common to reference requirement IDs from other files.

    ## Requirements (What)

    | ID              | Requirement                         | Stability |
    | --------------- | ----------------------------------- | --------- |
    | REQ-TODO-TBL-01 | Manage task metadata                | review    |

When this ID is referenced from another file, deleting a requirement or changing an ID breaks the reference. And this breakage is silent. No errors are thrown, and no tests fail.
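A minimal sketch of how such a silent break can be surfaced, using an in-memory map in place of the file system. It assumes that IDs are defined in the first column of a table and start with `REQ-`; the function names are hypothetical and this is not how contextlint itself is implemented.

```typescript
// In-memory stand-in for the docs tree: path -> file content.
const docs: Record<string, string> = {
  "docs/zones/todo/requirements.md": [
    "| ID | Requirement | Stability |",
    "| --- | --- | --- |",
    "| REQ-TODO-01 | Create, edit, and delete tasks | stable |",
  ].join("\n"),
  "docs/zones/todo/spec_task.md": "Implements REQ-TODO-01 and REQ-TODO-02.",
};

const reqId = /\bREQ-[A-Z]+(?:-[A-Z]+)*-\d{2,3}\b/g;

// An ID counts as "defined" when it is the first cell of a table row.
function definedIds(content: string): Set<string> {
  const ids = new Set<string>();
  for (const line of content.split("\n")) {
    const id = line.trim().match(/^\|\s*([^|]+?)\s*\|/)?.[1];
    if (id !== undefined && id.startsWith("REQ-")) ids.add(id);
  }
  return ids;
}

// Every ID mentioned outside the definition file must be defined in it.
function danglingReferences(
  files: Record<string, string>,
  definitionFile: string,
): { file: string; id: string }[] {
  const defined = definedIds(files[definitionFile] ?? "");
  const dangling: { file: string; id: string }[] = [];
  for (const [file, content] of Object.entries(files)) {
    if (file === definitionFile) continue;
    for (const id of content.match(reqId) ?? []) {
      if (!defined.has(id)) dangling.push({ file, id });
    }
  }
  return dangling;
}

const dangling = danglingReferences(docs, "docs/zones/todo/requirements.md");
```

In this example `spec_task.md` still mentions `REQ-TODO-02`, which the requirements table no longer defines, so that reference is the one reported.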
Inheriting Ubiquitous Language
I inherit Layer 1’s ubiquitous language at the top of every SDD file.

    ## Inheritance

    - [Layer 1: Glossary](../../foundation/glossary.md)
    - [Layer 2: Document Standards](../../standards/doc_standards.md)

Whether these links are valid (whether the files exist, whether the paths are correct) becomes harder to verify as the number of files grows.
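Checking such links comes down to resolving each relative path against the file that contains it and looking the result up in the set of known files. The sketch below works on an in-memory set of paths rather than the real file system, and the helper names are hypothetical.

```typescript
// Resolve a relative link against the directory of the file that contains it.
function resolveLink(fromFile: string, href: string): string {
  const segments = fromFile.split("/").slice(0, -1); // directory of the source file
  const pathPart = href.split("#", 1)[0] ?? ""; // drop any #anchor fragment
  for (const part of pathPart.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") segments.pop();
    else segments.push(part);
  }
  return segments.join("/");
}

// Find inline links of the form [text](target) and report missing targets.
function brokenLinks(fromFile: string, content: string, existing: Set<string>): string[] {
  const linkPattern = /\[[^\]]*\]\(([^)\s]+)\)/g;
  const broken: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(content)) !== null) {
    const href = match[1] ?? "";
    // Skip absolute URLs (scheme:) and same-page anchors.
    if (/^[a-z][a-z0-9+.-]*:/i.test(href) || href.startsWith("#")) continue;
    if (!existing.has(resolveLink(fromFile, href))) broken.push(href);
  }
  return broken;
}

// Illustrative data: the standards file is missing from the known files.
const existingFiles = new Set([
  "docs/foundation/glossary.md",
  "docs/zones/auth/requirements.md",
]);
const inheritance = [
  "- [Layer 1: Glossary](../../foundation/glossary.md)",
  "- [Layer 2: Document Standards](../../standards/doc_standards.md)",
].join("\n");
const broken = brokenLinks("docs/zones/auth/requirements.md", inheritance, existingFiles);
```

The glossary link resolves to `docs/foundation/glossary.md` and passes; the standards link resolves to a path that is not in the set, so it is the only entry in `broken`.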
Document Proliferation
As systems grow, zones and files multiply. When requirement IDs number in the dozens or hundreds, tracking cross-references by hand is no longer feasible.
Why Static Analysis, Not LLMs
Delegating consistency checks to an LLM is certainly an option. However, several concerns led me to settle on static analysis.
Time Cost and Financial Cost
To have an LLM verify documents, you need to include all target files in the context. As documents grow, token counts increase, driving up both time cost and financial cost.
When organizations subscribe to Cursor or Claude Code, request limits are often shared across the team. Spending LLM requests on repetitive tasks like document consistency checks is something you’d rather avoid.
LLMs Are Not Suited for Reproducible Verification
LLM outputs are inherently non-deterministic. For yes/no problems like “does a required column exist?”, “are IDs unique?”, or “does the linked file exist?”, receiving a probabilistic “it’s probably fine” still leaves you needing to verify it yourself.
Being subject to this instability in AI outputs on every check leads to decision fatigue. For things that can be mechanically determined as pass or fail given defined conditions, let static analysis handle it and keep human judgment to a minimum. For verification that demands reproducibility, having an error message that clearly states “consistency is broken” is far more trustworthy.
A Tool for AI, But Not Dependent on AI
contextlint is a tool for verifying AI-oriented structured documents like those in SDD, but contextlint itself is a pure static analysis tool written in TypeScript that has zero dependency on AI / LLMs.
AI “writes,” contextlint “checks consistency.” By separating the responsibilities of generation and checking, randomness is kept out of the checking results.
Strengths of Static Analysis
| Aspect | LLM | Static Analysis |
|---|---|---|
| Stability | Results may vary between runs | Same input always yields same output |
| Speed | Slows as token count grows | Completes in seconds |
| Cost | Consumes API / request quota | None. Run freely in CI |
This is the motivation behind building contextlint.
What Is contextlint
contextlint is a rule-based linter that verifies consistency across Markdown documents.
It detects broken cross-references, duplicate IDs, missing sections, and structural issues deterministically and in seconds, which makes it a good fit for CI.

    npm install -D @contextlint/cli

It currently provides 6 categories and 21 rules.
| Category | Overview | Rules |
|---|---|---|
| Table Validation (TBL) | Required columns, empty cells, value constraints | 6 |
| Structure Validation (SEC / STR) | Required sections, section order, required files | 3 |
| Reference Validation (REF) | Links, IDs, anchor consistency across documents | 6 |
| Checklist Validation (CHK) | Checklist item completion status | 1 |
| Context Validation (CTX) | Placeholder detection, term consistency | 2 |
| Graph Validation (GRP) | Traceability chains, circular refs, orphan documents | 3 |
Here’s how the challenges map to contextlint rules:
| Challenge | contextlint Rules |
|---|---|
| Stability management | TBL-002 (empty cell prevention), TBL-003 (allowed value constraints) |
| Cross-reference fragility | REF-002 (ID traceability), TBL-006 (ID uniqueness) |
| Ubiquitous language links | REF-001 (link target existence), REF-005 (anchor existence), CTX-002 (term consistency) |
| Document proliferation | STR-001 (required files), SEC-001 (required sections), REF-004 (cross-zone deps), GRP-001–003 (graph health) |
Just declare rules and target files in a configuration file (contextlint.config.json).

    {
      "$schema": "https://raw.githubusercontent.com/nozomi-koborinai/contextlint/main/schema.json",
      // Default validation targets (all Markdown under docs)
      "include": ["docs/**/*.md"],
      "rules": [
        // Prevent empty cells in ID and Stability columns
        { "rule": "tbl002", "options": { "columns": ["ID", "Stability"] } },
        // Restrict Stability values to draft / review / stable
        {
          "rule": "tbl003",
          "options": { "column": "Stability", "values": ["draft", "review", "stable"] }
        },
        // Verify relative links between documents exist
        { "rule": "ref001" }
      ]
    }

Rule System
Let’s look at each category’s rules in detail.
Table Validation (TBL)
Rules for validating Markdown table contents.
| Rule | Overview |
|---|---|
| TBL-001 | Required columns exist in a table |
| TBL-002 | No empty cells in specified columns |
| TBL-003 | Column values are within an allowed list |
| TBL-004 | Column values match a regex pattern |
| TBL-005 | Conditional constraints (constraints on one column based on another) |
| TBL-006 | Column values are unique across the entire project |
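To illustrate what pattern and uniqueness checks like TBL-004 and TBL-006 amount to, here is a small sketch over ID cells that have already been extracted. The `IdCell` shape and function names are hypothetical; the regular expression is the naming convention used in this article's sample configuration.

```typescript
interface IdCell {
  file: string;
  line: number;
  id: string;
}

// Naming convention from the sample configuration: REQ-<ZONE>-<2 or 3 digits>.
const idPattern = /^REQ-[A-Z]+-\d{2,3}$/;

// Cells whose ID does not follow the naming convention.
function patternViolations(cells: IdCell[]): IdCell[] {
  return cells.filter((cell) => !idPattern.test(cell.id));
}

// IDs that occur more than once anywhere in the project.
function duplicateIds(cells: IdCell[]): Map<string, IdCell[]> {
  const byId = new Map<string, IdCell[]>();
  for (const cell of cells) {
    const seen = byId.get(cell.id) ?? [];
    seen.push(cell);
    byId.set(cell.id, seen);
  }
  const duplicates = new Map<string, IdCell[]>();
  byId.forEach((occurrences, id) => {
    if (occurrences.length > 1) duplicates.set(id, occurrences);
  });
  return duplicates;
}

// Illustrative data spanning two requirements files.
const idCells: IdCell[] = [
  { file: "docs/zones/auth/requirements.md", line: 5, id: "REQ-AUTH-01" },
  { file: "docs/zones/todo/requirements.md", line: 5, id: "REQ-TODO-01" },
  { file: "docs/zones/todo/requirements.md", line: 6, id: "REQ_TODO_03" },
  { file: "docs/zones/todo/requirements.md", line: 7, id: "REQ-TODO-01" },
];
const badIds = patternViolations(idCells).map((cell) => cell.id);
const repeatedIds = Array.from(duplicateIds(idCells).keys());
```

Here `badIds` contains `REQ_TODO_03` and `repeatedIds` contains `REQ-TODO-01`. Note that this particular pattern allows only one name segment, so an ID such as `REQ-TODO-TBL-01` would also be rejected unless the pattern is widened.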
For the stability management example above, the following configuration achieves “prevent empty stability cells” and “accept only allowed values”:

    {
      "rules": [
        { "rule": "tbl002", "options": { "columns": ["ID", "Stability"] } },
        {
          "rule": "tbl003",
          "options": { "column": "Stability", "values": ["draft", "review", "stable"] }
        }
      ]
    }

Structure Validation (SEC / STR)
Rules for validating document structure.
| Rule | Overview |
|---|---|
| SEC-001 | Required sections (headings) exist in specified files |
| SEC-002 | Section headings appear in a specified order |
| STR-001 | Required files exist in the project |
For example, you can enforce that every requirements.md has Business Value (Why) and Requirements (What) sections.

    {
      "rule": "sec001",
      "options": {
        "sections": ["Business Value (Why)", "Requirements (What)"],
        "files": "**/requirements.md"
      }
    }

SEC-002 goes further by enforcing not just the “existence” but also the “order” of sections. This is useful for template-driven documents (ADRs, RFCs, etc.) where you want to standardize an order like Overview → Requirements → Design.
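The difference between an existence check and an order check can be sketched as below. This is a simplified illustration: it reads only `#`-style headings, ignores fenced code blocks, and the function names and the `Constraints` section are made up for the example.

```typescript
// Extract heading titles ("# Title" through "###### Title") in document order.
function headings(markdown: string): string[] {
  const titles: string[] = [];
  for (const line of markdown.split("\n")) {
    const title = line.match(/^#{1,6}\s+(.*?)\s*$/)?.[1];
    if (title) titles.push(title);
  }
  return titles;
}

// Existence: which required sections are absent?
function missingSections(markdown: string, required: string[]): string[] {
  const present = new Set(headings(markdown));
  return required.filter((title) => !present.has(title));
}

// Order: do the expected sections that are present appear in the expected sequence?
function inExpectedOrder(markdown: string, expected: string[]): boolean {
  let last = -1;
  for (const title of headings(markdown)) {
    const index = expected.indexOf(title);
    if (index === -1) continue; // headings outside the template are ignored
    if (index < last) return false;
    last = index;
  }
  return true;
}

const requirementsDoc = [
  "# Todo Requirements",
  "## Requirements (What)",
  "## Business Value (Why)",
].join("\n");
const missing = missingSections(requirementsDoc, [
  "Business Value (Why)",
  "Requirements (What)",
  "Constraints",
]);
const ordered = inExpectedOrder(requirementsDoc, [
  "Business Value (Why)",
  "Requirements (What)",
]);
```

The sample document passes the existence check for both standard sections (only the invented `Constraints` section is missing) but fails the order check, because "Requirements (What)" comes before "Business Value (Why)".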
Checklist Validation (CHK)
Rules for validating Markdown checklists.
| Rule | Overview |
|---|---|
| CHK-001 | All checklist items in a specified section are completed |
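A rough sketch of a checklist completion check, assuming GitHub-style task list items (`- [ ]` and `- [x]`). For simplicity it treats any following heading, including a sub-heading, as the end of the section; how contextlint scopes a section is not described here, so that part is an assumption.

```typescript
// Return the text of unchecked "- [ ]" items under the given section heading.
function uncheckedItems(markdown: string, section: string): string[] {
  const open: string[] = [];
  let inSection = false;
  for (const line of markdown.split("\n")) {
    const heading = line.match(/^#{1,6}\s+(.*?)\s*$/);
    if (heading) {
      inSection = heading[1] === section; // any heading starts a new scope
      continue;
    }
    const item = line.match(/^\s*[-*+]\s+\[( |x|X)\]\s+(.*)$/);
    if (inSection && item && item[1] === " ") open.push(item[2] ?? "");
  }
  return open;
}

const review = [
  "## Review Checklist",
  "- [x] Requirement IDs are unique",
  "- [ ] Stability values reviewed",
  "## Notes",
  "- [ ] Not part of the checklist section",
].join("\n");
const openItems = uncheckedItems(review, "Review Checklist");
```

Only "Stability values reviewed" is reported: the first item is checked, and the last one sits under a different heading.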
Useful for enforcing checklist completion in CI — for review checklists, release gates, and similar workflows.

    {
      "rule": "chk001",
      "options": {
        "section": "Review Checklist",
        "files": "docs/reviews/*.md"
      }
    }

Reference Validation (REF)
Rules for validating cross-document reference integrity.
| Rule | Overview |
|---|---|
| REF-001 | Markdown link targets exist |
| REF-002 | Cross-file traceability of requirement IDs |
| REF-003 | Stability ordering constraints (stable items don’t depend on draft) |
| REF-004 | Cross-zone dependencies are declared |
| REF-005 | Anchor fragments (#heading) exist in the target document’s headings |
| REF-006 | Image reference targets (`![alt](path)`) exist |
REF-001 is the rule that verifies that the ubiquitous language inheritance links are not broken.
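The stability ordering constraint (REF-003) is the least obvious rule in this table, so here is a minimal sketch of the idea. It assumes each referencing document and each requirement already has a known stability level; how contextlint derives those levels is not covered here, and the `Dependency` shape and the stability values for the sample IDs are illustrative only.

```typescript
// Lifecycle order, from least to most settled.
const stabilityOrder = ["draft", "review", "stable"];

interface Dependency {
  spec: string; // document that references a requirement
  specStability: string;
  requirement: string; // referenced requirement ID
  requirementStability: string;
}

// A dependency is a violation when the referencing document claims a higher
// stability level than the requirement it builds on.
function stabilityViolations(deps: Dependency[]): Dependency[] {
  const rank = (level: string): number => stabilityOrder.indexOf(level);
  return deps.filter((d) => rank(d.specStability) > rank(d.requirementStability));
}

const deps: Dependency[] = [
  { spec: "spec_task.md", specStability: "stable", requirement: "REQ-TODO-01", requirementStability: "stable" },
  { spec: "spec_task.md", specStability: "stable", requirement: "REQ-TODO-03", requirementStability: "draft" },
  { spec: "spec_session.md", specStability: "draft", requirement: "REQ-AUTH-01", requirementStability: "review" },
];
const unstableFoundations = stabilityViolations(deps);
```

Only the second entry is flagged: a stable specification resting on a draft requirement. A draft specification that references a requirement under review is fine, because it claims less certainty than what it depends on.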
Context Validation (CTX)
Rules for validating document content quality.
| Rule | Overview |
|---|---|
| CTX-001 | Placeholder content (TODO, TBD, FIXME, etc.) has not been left behind |
| CTX-002 | Terms are consistent with the project glossary |
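Placeholder detection looks trivial, but in a project with a zone named "todo" a naive search for `TODO` would also hit every `REQ-TODO-01` style ID. The sketch below sidesteps that by treating hyphens as part of a word. That choice is my assumption for illustration, not a statement about how CTX-001 is implemented.

```typescript
interface PlaceholderHit {
  line: number; // 1-based line number
  pattern: string;
}

// Escape characters that have a special meaning in regular expressions.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Report lines containing a placeholder as a standalone token.
function findPlaceholders(markdown: string, patterns: string[]): PlaceholderHit[] {
  const hits: PlaceholderHit[] = [];
  markdown.split("\n").forEach((text, index) => {
    for (const pattern of patterns) {
      // Not preceded or followed by a word character or a hyphen.
      const regex = new RegExp(`(?<![\\w-])${escapeRegExp(pattern)}(?![\\w-])`);
      if (regex.test(text)) hits.push({ line: index + 1, pattern });
    }
  });
  return hits;
}

const specDraft = [
  "# Task specification",
  "Covers REQ-TODO-01.",
  "Deadline format: TBD",
  "TODO: describe recurring tasks",
].join("\n");
const placeholderHits = findPlaceholders(specDraft, ["TODO", "TBD", "FIXME", "PLACEHOLDER"]);
```

Lines 3 and 4 are reported (`TBD` and `TODO`), while the requirement ID on line 2 is left alone.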
CTX-001 detects placeholder content like TODO or TBD left in documents. While useful during drafting, leftover placeholders risk AI generating code based on incomplete specifications.

    {
      "rule": "ctx001",
      "options": {
        "patterns": ["TODO", "TBD", "FIXME", "PLACEHOLDER"]
      }
    }

CTX-002 verifies that terms defined in the Layer 1 glossary are used consistently across documents. This goes beyond checking that inheritance links exist (REF-001): it validates term-level consistency.

    {
      "rule": "ctx002",
      "options": {
        "glossaryFile": "docs/foundation/glossary.md",
        "termColumn": "Term",
        "files": "docs/zones/**/*.md"
      }
    }

Graph Validation (GRP)
Rules that analyze inter-document dependencies as a graph and detect structural issues. contextlint’s Context Graph Engine builds the dependency graph and validates it with the following rules.
| Rule | Overview |
|---|---|
| GRP-001 | Traceability chain from requirements to code is not broken |
| GRP-002 | No circular references in document dependencies |
| GRP-003 | No orphan documents that are not referenced from anywhere |
The “requirements → specifications → design → code” traceability described earlier — GRP-001 verifies that this chain is not broken using graph analysis. GRP-002 detects circular dependencies like A → B → C → A, and GRP-003 finds “orphan” documents that are not referenced from anywhere.
In projects with dozens of files or more, finding such structural issues by eye is virtually impossible. Graph validation fills that gap.
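For readers unfamiliar with graph checks, the two simpler ones (circular references and orphans) can be sketched with a plain adjacency map. This is a textbook depth-first search, not the Context Graph Engine itself, and excluding declared entry points from the orphan list is my assumption.

```typescript
type Graph = Record<string, string[]>; // document -> documents it references

// Depth-first search that returns one cycle as a path, or null if there is none.
function findCycle(graph: Graph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = [];

  const visit = (node: string): string[] | null => {
    state.set(node, "visiting");
    stack.push(node);
    for (const next of graph[node] ?? []) {
      if (state.get(next) === "visiting") {
        // Back edge: the cycle is the stack from `next` onward, closed by `next`.
        return [...stack.slice(stack.indexOf(next)), next];
      }
      if (!state.has(next)) {
        const cycle = visit(next);
        if (cycle) return cycle;
      }
    }
    stack.pop();
    state.set(node, "done");
    return null;
  };

  for (const node of Object.keys(graph)) {
    if (!state.has(node)) {
      const cycle = visit(node);
      if (cycle) return cycle;
    }
  }
  return null;
}

// Documents that nothing references, excluding declared entry points.
function orphans(graph: Graph, entryPoints: string[]): string[] {
  const referenced = new Set<string>();
  for (const targets of Object.values(graph)) {
    for (const target of targets) referenced.add(target);
  }
  return Object.keys(graph).filter(
    (doc) => !referenced.has(doc) && !entryPoints.includes(doc),
  );
}

const docGraph: Graph = {
  "overview.md": ["requirements.md"],
  "requirements.md": ["spec_task.md"],
  "spec_task.md": ["table_tasks.md", "requirements.md"],
  "table_tasks.md": [],
  "old_notes.md": [],
};
const cycle = findCycle(docGraph);
const orphanDocs = orphans(docGraph, ["overview.md"]);
```

For this graph, `cycle` is the path requirements.md → spec_task.md → requirements.md, and `orphanDocs` contains only `old_notes.md`.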
Difference from markdownlint
Unlike markdownlint, which validates Markdown syntax and formatting, contextlint specializes in semantic content consistency and cross-file integrity.
| Aspect | markdownlint | contextlint |
|---|---|---|
| Target | Syntax and formatting | Content structure and semantics |
| Examples | Heading level order, list indentation | Required table columns, ID uniqueness, link targets |
| Role | How Markdown is written | What Markdown contains |
The two are complementary — using both together is recommended.
Package Structure
contextlint is a monorepo composed of 3 packages.
| Package | Role |
|---|---|
| `@contextlint/core` | Rule engine and Markdown parser |
| `@contextlint/cli` | CLI entry point (the `contextlint` command) |
| `@contextlint/mcp-server` | MCP server for AI tool integration |
Quick Start

    # Install
    npm install -D @contextlint/cli

    # Generate config interactively (English, Japanese, Chinese, Korean)
    npx contextlint init

    # Run (auto-detects contextlint.config.json from current directory)
    npx contextlint

Configuration File Example
Here’s a configuration file from a real project:

    {
      "$schema": "https://raw.githubusercontent.com/nozomi-koborinai/contextlint/main/schema.json",
      // Validation target file patterns
      "include": ["docs/**/*.md"],
      "rules": [
        // --- Table Validation ---
        // Prevent empty cells in ID and Stability columns
        { "rule": "tbl002", "options": { "columns": ["ID", "Stability"] } },
        // Restrict Stability values to draft / review / stable
        {
          "rule": "tbl003",
          "options": { "column": "Stability", "values": ["draft", "review", "stable"] }
        },
        // Require IDs to follow a naming convention like REQ-AUTH-01
        { "rule": "tbl004", "options": { "column": "ID", "pattern": "^REQ-[A-Z]+-\\d{2,3}$" } },
        // Ensure requirement IDs are unique across the project
        { "rule": "tbl006", "options": { "files": "**/requirements.md", "column": "ID" } },

        // --- Structure Validation ---
        // Ensure required sections exist in requirements.md
        {
          "rule": "sec001",
          "options": {
            "sections": ["Business Value (Why)", "Requirements (What)"],
            "files": "**/requirements.md"
          }
        },
        // Ensure required files exist in the project
        {
          "rule": "str001",
          "options": {
            "files": ["docs/zones/auth/requirements.md", "docs/zones/todo/requirements.md"]
          }
        },

        // --- Reference Validation ---
        // Verify relative links between documents exist
        { "rule": "ref001" },
        // Verify requirement IDs are correctly referenced from spec files
        {
          "rule": "ref002",
          "options": {
            "definitions": "**/requirements.md",
            "references": ["**/spec_*.md", "**/overview.md"],
            "idColumn": "ID",
            "idPattern": "^REQ-"
          }
        },
        // Ensure stable specs don't depend on draft requirements
        {
          "rule": "ref003",
          "options": {
            "stabilityColumn": "Stability",
            "stabilityOrder": ["draft", "review", "stable"],
            "definitions": "**/requirements.md",
            "references": ["**/spec_*.md"]
          }
        },
        // Ensure cross-zone links are declared in overview files
        {
          "rule": "ref004",
          "options": { "zonesDir": "docs/zones", "dependencySection": "External Zone Dependencies" }
        }
      ]
    }

With this configuration, the following are automatically verified:
- `ID` and `Stability` columns in requirement tables are not empty (TBL-002)
- Stability values are one of `draft` / `review` / `stable` (TBL-003)
- Requirement IDs follow a naming convention like `REQ-AUTH-01` (TBL-004)
- Requirement IDs are unique across the project (TBL-006)
- Required sections exist in `requirements.md` (SEC-001)
- Required files exist (STR-001)
- Relative links between documents point to existing targets (REF-001)
- Requirement IDs are referenced from specification files (REF-002)
- Stable specifications don’t depend on draft requirements (REF-003)
- Cross-zone dependencies are declared (REF-004)
Execution Output
Running contextlint reports violations per file:

    $ npx contextlint

    docs/zones/todo/requirements.md
      line 5   error    Column "Stability" has invalid value "wip" (allowed: draft, review, stable)   TBL-003
      line 5   error    Column "ID" value "REQ_TODO_03" does not match pattern ^REQ-[A-Z]+-\d{2,3}$   TBL-004

    docs/zones/auth/table_users.md
      line 8   warning  Empty cell in column "Stability"   TBL-002
      line 12  error    Link target "../../foundation/glossary.md" does not exist   REF-001

    3 errors, 1 warning in 2 files

CLI Subcommands
Starting from v0.6.0, contextlint provides subcommands for analyzing and leveraging document dependencies beyond linting.
| Command | Overview |
|---|---|
| `contextlint` | Run lint based on the config file (default) |
| `contextlint init` | Generate a config file interactively |
| `contextlint impact <file>` | Show the blast radius of changes to a given file |
| `contextlint slice <query>` | Extract a subset of documents related to a given query |
| `contextlint graph` | Visualize the dependency graph between documents |
| `contextlint compile` | Generate SKILL.md from documents and config (Context Compiler) |
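The "blast radius" idea behind `impact` can be pictured as a walk over reversed dependency edges: starting from the changed file, collect everything that depends on it directly or transitively. The sketch below is a generic breadth-first search under that assumption; it does not reproduce the command's actual algorithm or output format.

```typescript
type DependencyGraph = Record<string, string[]>; // document -> documents it depends on

// Everything that directly or transitively depends on the changed file.
function blastRadius(graph: DependencyGraph, changed: string): string[] {
  // Reverse the edges: target -> documents that depend on it.
  const dependents = new Map<string, string[]>();
  for (const [doc, targets] of Object.entries(graph)) {
    for (const target of targets) {
      dependents.set(target, [...(dependents.get(target) ?? []), doc]);
    }
  }
  const affected: string[] = [];
  const seen = new Set<string>([changed]);
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const doc of dependents.get(current) ?? []) {
      if (seen.has(doc)) continue;
      seen.add(doc);
      affected.push(doc);
      queue.push(doc);
    }
  }
  return affected;
}

// Illustrative graph for the todo zone.
const zoneGraph: DependencyGraph = {
  "requirements.md": [],
  "spec_task.md": ["requirements.md"],
  "table_tasks.md": ["spec_task.md"],
  "screen_task_list.md": ["spec_task.md"],
  "overview.md": [],
};
const affectedDocs = blastRadius(zoneGraph, "requirements.md");
```

Changing `requirements.md` affects `spec_task.md` first, then the table and screen designs derived from it; `overview.md` is untouched because nothing connects it to the changed file in this graph.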
The --watch flag enables real-time linting on file changes, and --format json outputs results in JSON.

    # Watch for file changes and lint in real time
    npx contextlint --watch

    # Output in JSON format (useful for CI and scripting)
    npx contextlint --format json

    # Analyze the impact of changing a specific file
    npx contextlint impact docs/zones/auth/requirements.md

    # Visualize the dependency graph
    npx contextlint graph

CI Integration
Here’s an example of running contextlint in GitHub Actions:

    name: contextlint

    on:
      push:
        branches: [main]
      pull_request:
        branches: [main]

    jobs:
      lint:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v6
          - run: npx @contextlint/cli

Validation runs automatically on PRs that modify documents, catching broken structures before merge.
With --format json, results are output in JSON, making it easy to integrate with CI reporters.
AI Tool Integration
contextlint also works as an MCP (Model Context Protocol) server.

    {
      "mcpServers": {
        "contextlint": {
          "command": "npx",
          "args": ["@contextlint/mcp-server"]
        }
      }
    }

The MCP server provides six tools:
| Tool | Overview |
|---|---|
| `lint` | Pass Markdown content directly with rules for inline linting |
| `lint-files` | Specify a config file and glob patterns for file-based linting |
| `context-graph` | Retrieve the dependency graph between documents |
| `context-slice` | Extract a subset of documents related to a given query |
| `impact-analysis` | Analyze the blast radius of changes to a given file |
| `compile-context` | Generate SKILL.md from documents and config (Context Compiler) |
With this setup, AI tools like Claude and Cursor can run document linting and dependency analysis within conversations. For example:
- Have AI edit a document
- After editing, run `lint-files` via MCP to check consistency
- If violations are found, AI automatically fixes them and re-runs lint
- Use `impact-analysis` to check the blast radius and update related documents
Figure: AI checking document consistency via MCP, identifying violations and suggesting fixes.
You can also use Claude Code Hooks or Cursor Hooks to run the CLI directly without MCP.
Set up a hook to auto-run npx contextlint after document edits, and you get the same workflow without the MCP server.
Either way, CI acts as a guardrail to “not let broken things through,” while AI tool integration is a mechanism to “not create broken things in the first place.”
Context Compiler
The Context Compiler, added in v0.7.0, deterministically generates custom skills (SKILL.md) for Claude Code from your project’s documents and contextlint configuration. No LLM is used, so the same input always yields the same output.

    npx contextlint compile

Configuration
Add a compile section to your contextlint.config.json:

    {
      "include": ["docs/**/*.md"],
      "compile": {
        "skill": {
          "name": "my-project-docs",
          "description": "Validate and maintain project documentation"
        },
        "outdir": ".claude/skills/my-project"
      },
      "rules": [...]
    }

Generated SKILL.md Structure
The compile command analyzes the dependency graph and outputs a SKILL.md under .claude/skills/ with four sections:
| Section | Content |
|---|---|
| Document Architecture | File tree with graph roles (entry / hub / leaf / bridge / isolated) |
| Document Rules | Applied rules described in natural language, grouped by category |
| Document Dependencies | Dynamic context injection via impact / slice commands |
| Workflow | Guidance for creating and editing documents |
Here’s what a generated SKILL.md looks like:

    ---
    name: my-project-docs
    description: "Validate and maintain project documentation"
    ---

    ## Document Architecture

    ### File Tree

    | Path | Role |
    |------|------|
    | `docs/zones/auth/overview.md` | entry point |
    | `docs/zones/auth/requirements.md` | hub |
    | `docs/zones/auth/table_users.md` | leaf |

    ### Document Types

    - **[ID, Requirement, Stability]** - 2 table(s) (ID format: `REQ-NNN`)

    ## Document Rules

    ### Table Structure
    - **TBL-002**: Columns "ID", "Stability" must not be empty
    - **TBL-003**: Column "Stability" values must be one of: draft, review, stable

    ### References
    - **REF-001**: All relative links must point to existing files

    ## Document Dependencies

    ### Impact Analysis (dynamic)
    !`npx contextlint impact $ARGUMENTS`

    ### Related Documents (dynamic)
    !`npx contextlint slice $ARGUMENTS`

The key part is the !`npx contextlint impact $ARGUMENTS` syntax in the Document Dependencies section. This leverages Claude Code’s Dynamic Context Injection: when the skill is loaded, the commands execute automatically and their output is embedded into the prompt.
This means when Claude Code starts working on document-related tasks, the following happens automatically:
- Document structure and rules are understood from SKILL.md
- `contextlint impact` / `slice` results reveal the blast radius of changes
- Code is generated and modified based on that context
If MCP is a real-time “conversational” integration, the Context Compiler is a “pre-load context” approach.
Direct Execution from Claude Code
In Claude Code, you can run shell commands directly within a conversation using the ! prefix:

    ! npx contextlint
    ! npx contextlint impact docs/zones/auth/requirements.md
    ! npx contextlint graph

The output is added to the conversation context, so Claude can propose next actions based on the results. Even without setting up an MCP server, you can integrate contextlint into your AI workflow using just ! commands.
Use Cases Beyond SDD
This article used SDD as an example, but contextlint’s rules are general-purpose: they can be used in any project that manages documents in Markdown.
- ADR (Architecture Decision Records) — Use SEC-001 to enforce required sections (Status, Context, Decision) and TBL-003 to constrain status values
- Docs as Code in general — Auto-detect broken references, duplicate IDs, and missing files in CI
Docs as Code means managing documents in Git like code and setting up CI guardrails around them. Adding contextlint to such practices, alongside markdownlint, broadens your coverage.
Conclusion
Code has ESLint and type checking, but there’s no equivalent for verifying document contents.
contextlint is a tool I started building to fill that gap.
- Validates required table columns and value constraints
- Validates cross-document reference integrity
- Validates requirement ID uniqueness and traceability
- Detects term inconsistencies and leftover placeholders
- Analyzes dependency graphs to catch circular references and orphan documents
- Visualizes and leverages structure via `impact` / `slice` / `graph` / `compile`
- Runs automatically in CI, integrates with AI tools via MCP
- Validates AI-oriented documents, but does not depend on AI
If you manage documents in Markdown — whether for SDD, ADR, or anything else — give it a try.