Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

The Proofdown Language

Proofdown is a deterministic markup to present CI/testing evidence as structured, linkable, and reviewable content. It supports full Markdown (CommonMark + selected GFM) plus a small, HTML-like component system for structured viewers.

  • Determinism: Golden AST JSON guards the grammar and enforces byte-for-byte stability.
  • Safety: No scripting or raw HTML; viewer bounds and artifact integrity enforced downstream.
  • Authorability: Simple, LLM-friendly patterns with grid/card composition and artifact.* viewers.

This book documents the Proofdown language (Markdown + Components) and additive v2 semantics (viewers/attributes), the validator rules, limits, and integration contract with provenance_ssg.

Quick links:

  • Language Inventory — comprehensive list of constructs and attributes
  • Grammar (v1) — Markdown + Components (blocks, inlines, components)
  • Additive Semantics (v2) — pointers, captions, text viewer
  • Components and Artifact Viewers — details and constraints
  • Limits, Safety, Testing, Schemas
  • SSG Contract — stable API and AST JSON for downstream consumers

Language Inventory

This is a normative inventory of Proofdown constructs, their attributes, and constraints. It aggregates v1 grammar and v2 additive semantics. Validation of bounds happens downstream (validator/SSG). The parser remains syntax-only.

Blocks

  • Markdown (CommonMark + selected GFM)
    • Headings (ATX) #..#### levels 1..4
    • Paragraphs
    • Block quotes (arbitrary nested blocks)
    • Thematic breaks (---, ***, ___)
    • Code blocks (fenced/indented) with optional info string
    • Lists (unordered/ordered) with start (ordered), tight, and GFM task items ([ ]/[x])
    • Tables (GFM) with optional header row and per-column alignment
  • Components
    • HTML-like tags: <name key="value" key2=value2>; </name> closes; <name ... /> self-closing.
    • Attributes: quoted or bare; preserved exactly as strings in AST.
    • Children: zero or more blocks; components MAY contain full Markdown.

Markdown inlines

  • Text
  • Emphasis, Strong, Strikethrough (GFM)
  • Inline code
  • Soft/Hard line breaks
  • Links (url + optional title), Images (url + optional title; alt from inline text)

Structural Components

  • grid(cols, gap?)
    • Purpose: layout container for cards/sections.
    • Attributes (validator bounds): cols in 1..6 required; gap in 0..64 optional.
  • section(title)
    • Title required.
  • card(title)
    • Title required; contains child blocks.

Artifact Viewers (syntax recognized; validation enforces bounds)

  • artifact.summary(id)
  • artifact.table(id, caption?, columns?, kind?)
    • columns: comma-separated keys or RFC 6901 JSON Pointers; validated for syntax.
    • kind: optional hint for schema; advisory.
  • artifact.json(id, collapsed?, depth=0..8, json_pointer?)
    • json_pointer: RFC 6901 string; static projection.
  • artifact.markdown(id)
  • artifact.image(id, alt, max_height=128..2048, caption?)
  • artifact.link(id, download?, title?)
  • artifact.text(id, max_lines=1..500, caption?) (v2)

Limits (enforced)

  • Default: max_depth=16, max_nodes=50k, max_input_bytes=1MiB.
  • Determinism: same input ⇒ same AST byte-for-byte under serde JSON.
  • Safety: no I/O in parser; raw HTML blocks/inline are dropped; validator enforces whitelist/attribute bounds; renderer escapes and enforces viewer limits.

Grammar (v1)

The v1 grammar is Markdown + Components. The parser recognizes full CommonMark block/inline structure plus a small, HTML-like component system. Semantics (component whitelist, attribute bounds, artifact policies) are enforced downstream by the validator/SSG.

Supported Markdown blocks

  • Headings #..#### (ATX), level: 1..4
  • Paragraphs
  • Block quotes with nested blocks
  • Thematic breaks (---, ***, ___)
  • Code blocks (fenced/indented) with optional info string
  • Lists (unordered/ordered), with start, tight, and GFM task items ([ ]/[x])
  • Tables (GFM): header row optional; alignments per column

Supported Markdown inlines

  • Text, emphasis, strong, strikethrough (GFM)
  • Inline code
  • Soft/hard line breaks
  • Links (url + optional title), Images (url + optional title + alt from content)

Components

  • Start tag: <name key="value" key2=value2>
  • End tag: </name> (unless self-closing as <name ... />)
  • Attributes: quoted or bare; preserved exactly as strings in the AST
  • Children: zero or more blocks; components may contain full Markdown

Newlines & safety

  • UTF-8 input; CRLF/CR are normalized to \n during scanning
  • Raw HTML blocks/inlines are dropped by the parser for safety (use components instead)

See also: .specs/01_proofdown_language_v1.md for extended notes and examples.

Additive Semantics (v2)

v2 is additive and backward compatible with v1: it introduces semantics and viewer attributes on top of the Markdown + Components grammar. No grammar changes in v2. New attributes/viewers are validated downstream. Key additions:

  • artifact.json: json_pointer (RFC 6901) for static projection; depth bound; collapsed flag.
  • artifact.table: columns (keys or JSON Pointers); optional caption; optional kind hint for schema.
  • artifact.image: optional caption; alt remains required; max_height bounds.
  • artifact.text: new viewer with max_lines=1..500 and optional caption.

Validation and rendering layers enforce bounds, unknown attrs, and advisory schema hints.

Components

Structural containers compose content into scan-friendly layouts:

  • grid(cols, gap?): Rows/columns layout; validator enforces cols 1..6 and gap 0..64.
  • section(title): Logical grouping with heading-like intent.
  • card(title): A bordered content box; contains arbitrary blocks.

Components are recognized at parse time as Component { name, attrs, children, self_closing }. Children may contain full Markdown blocks/inlines. Raw HTML blocks/inline are dropped by the parser; use components instead.

Artifact Viewers

Artifact viewers present CI/testing evidence from content-addressed artifacts. The parser is syntax-only; bounds and integrity checks are downstream.

  • artifact.summary(id) — Show rollups or key counts.
  • artifact.table(id, caption?, columns?, kind?)
    • columns: comma-separated; tokens are simple keys or RFC 6901 JSON Pointers (e.g., /a/b, foo.bar).
    • kind: advisory mapping to schema filename; use validator helper.
  • artifact.json(id, collapsed?, depth=0..8, json_pointer?)
    • json_pointer: RFC 6901 (syntax validated); static projection.
  • artifact.markdown(id) — Rendered markdown snapshot (safe subset downstream).
  • artifact.image(id, alt, max_height=128..2048, caption?) — Always include alt.
  • artifact.link(id, download?, title?) — Links to heavy/interactive assets.
  • artifact.text(id, max_lines=1..500, caption?) — Short plaintext snippets.

Accessibility and performance:

  • Large JSON collapsed by default (use small depth).
  • Always provide alt text and captions when useful.

Limits and Safety

  • Limits
    • Depth ≤ 16, Nodes ≤ 50k, Input ≤ 1 MiB (configurable via API in CLI; defaults enforced in parser/validator).
    • Exceeding a limit yields a LimitExceeded parse/validate error with clear messaging.
  • Determinism
    • Same input ⇒ same AST byte-for-byte under serde JSON; golden tests verify this.
  • Safety model
    • Raw HTML blocks/inline are dropped by the parser for safety; use components instead. Rendering layers must escape content.
    • Artifacts are content-addressed; callers verify digest before render.
    • Fail closed on digest mismatches or missing artifacts.
    • Markdown tables, strikethrough, autolinks, and task lists are supported (GFM subset) without enabling raw HTML.

Validation and Semantics

Validation enforces a structural whitelist and attribute bounds over the parsed AST. Unknown components/attributes are errors at validation time. Examples:

  • Structural: grid(cols=1..6, gap=0..64), section(title), card(title)
  • Artifacts:
    • artifact.json(id, collapsed?, depth=0..8, json_pointer?)
    • artifact.table(id, caption?, columns?, kind?)
    • artifact.image(id, alt, max_height=128..2048, caption?)
    • artifact.text(id, max_lines=1..500, caption?)
  • Advisory schema hints for artifact.table.kind map to filenames (see schemas chapter).
  • Errors are structured with codes: UnknownComponent, UnknownAttribute, MissingAttribute, AttrType, AttrBound, DepthExceeded, NodeLimit.

Authoring Guide

This chapter summarizes the companion authoring guide and gives copy/paste templates.

  • Minimal evidence panel with grid + card and JSON+table viewers.
  • E2E run with screenshots and trace links.
  • Security/SBOM rollups with SARIF and provenance links.

See .docs/proofdown-authoring-guide.md for detailed patterns and anti-patterns.

SSG Contract (Parser ↔ SSG)

This chapter summarizes the versioned contract at .specs/10_contract_with_provenance_ssg.md. The parser performs syntax recognition only; the SSG (or validator) handles semantics and rendering.

Stable API and types:

#![allow(unused)]
fn main() {
pub struct Document { pub blocks: Vec<Block> }
#[serde(tag = "type")]
pub enum Block {
    Heading { level: u8, inlines: Vec<Inline> },
    Paragraph { inlines: Vec<Inline> },
    BlockQuote { children: Vec<Block> },
    ThematicBreak,
    CodeBlock { info: String, text: String },
    List { kind: ListKind, start: Option<u64>, tight: bool, items: Vec<ListItem> },
    Table { align: Vec<TableAlign>, header: Option<TableRow>, rows: Vec<TableRow> },
    Component(Component)
}

pub enum ListKind { Bullet, Ordered }
pub struct ListItem { pub children: Vec<Block>, pub task: Option<bool> }
pub enum TableAlign { None, Left, Center, Right }
pub struct TableRow { pub cells: Vec<Vec<Inline>> }

#[serde(tag = "type")]
pub enum Inline {
    Text { text: String }, Emph { children: Vec<Inline> }, Strong { children: Vec<Inline> },
    Strikethrough { children: Vec<Inline> }, Code { text: String }, SoftBreak, HardBreak,
    Link { url: String, title: Option<String>, children: Vec<Inline> },
    Image { url: String, title: Option<String>, alt: String }
}

pub struct Component { pub name: String, pub attrs: Vec<Attr>, pub children: Vec<Block>, pub self_closing: bool }
pub struct Attr { pub key: String, pub value: String }

pub fn parse(input: &str) -> Result<Document, ParseError>;
}

Error model and limits:

#![allow(unused)]
fn main() {
pub enum ErrorKind { Syntax, LimitExceeded }
pub struct ParseError { pub line: usize, pub col: usize, pub kind: ErrorKind, pub msg: String }
// Limits: depth ≤ 16, nodes ≤ 50k, input ≤ 1 MiB
}

Determinism and golden tests ensure byte-for-byte stability. Raw HTML blocks/inline are dropped for safety; use components or Markdown constructs instead. See the spec file for the canonical text.

Schemas

Minimal JSON Schemas provide advisory shapes for common artifact table row kinds (e.g., api, a11y, e2e, contracts). They live under .specs/schemas/ and are validated during CI with a local, deterministic checker.

  • Use proofdown_validate::schema_hint_for_kind(kind) to suggest a schema filename.
  • Schemas are informative guidance for authors and validators; rendering layers remain strict about unknown or unsafe content.

Testing and Golden Artifacts

  • Golden tests compare serialized AST JSON for fixtures under tests/fixtures/*.pml and goldens in tests/golden/*.json.
  • Regenerate goldens only via environment switch:
UPDATE_GOLDEN=1 cargo test -p proofdown_parser
  • Parser limits and error-path tests ensure safe failures with structured errors.
  • Validator unit tests enforce attribute bounds and whitelist behavior.
  • Fuzzing and property tests ensure robustness on adversarial inputs.

FAQ

  • Is Proofdown a general markup like Markdown?

    • No. It’s a minimal, purpose-built language for CI/testing evidence. It avoids inline syntax and focuses on deterministic blocks and components.
  • Where are semantics enforced?

    • The validator (or SSG) enforces whitelist and attribute bounds. The parser is syntax-only.
  • How are large JSON artifacts handled?

    • Use artifact.json with collapsed=true and a small depth. For focused views, use json_pointer to a subtree.
  • Does Proofdown support scripting or embedded HTML?

    • No. Content must be safe and deterministic.

Publishing Checklist

This page tracks tasks to publish and maintain the mdBook.

  • Create book/book.toml and book/src with chapter structure.
  • Write chapters for grammar, components, artifacts, limits, validation, authoring, contract, schemas, testing.
  • CI job: build mdBook on PRs and main.
  • Optional: Deploy to GitHub Pages from main.

Local build (requires mdbook):

cargo install mdbook
mdbook build book
open book/book/index.html

CI recommendations:

  • Add a job that runs cargo install mdbook and mdbook build book and uploads the book/book/ as an artifact.
  • Protect against external network fetches; fonts should be local.