lex Specifications
This is the formal reference for the lex document format.
Quick Start
lex documents are plain text files where structure is defined by indentation and minimal markers.
Document Title
1. First Section
Paragraphs are just text. Multiple lines
form a single paragraph until a blank line.
Nested Definition:
Definitions use a colon after the term,
with content indented immediately below.
- List items start with dashes
- Or numbers: 1. 2. 3.
- Or letters: a. b. c.
Code Example:
function hello() {
return "world";
}
:: javascript
2. Second Section
:: note :: Annotations attach metadata to content.
References link to things: [https://example.com],
citations [@author2024], and footnotes [1].
Core Concepts
Indentation
Indentation defines hierarchy. Each level is 4 spaces (or 1 tab, converted to 4 spaces).
- Content indented under a section title belongs to that section
- Deeper indentation creates nested structure
- Returning to a shallower level closes the current block
The :: Marker
The double colon (::) is the only explicit syntax in lex. It’s used for:
- Annotations:
:: label :: content - Verbatim closers:
:: language - Block annotations:
:: label ::\n content\n::
Elements
Sessions
Hierarchical sections with optional numbered titles.
1. Main Section
Content here.
1.1. Subsection
Nested content.
1.2. Another Subsection
More content.
2. Second Main Section
And so on.
Sessions require a blank line between the title and content.
Definitions
Term-explanation pairs.
Term:
The definition follows immediately after the colon.
No blank line between term and content.
Another Term:
Definitions can contain paragraphs, lists,
and even nested definitions.
Definitions do NOT have a blank line between the subject and content (unlike sessions).
Lists
Sequences of items with various markers.
- Unordered with dashes
- Another item
1. Numbered
2. Items
a. Alphabetical
b. Items
i. Roman
ii. Numerals
Lists require a preceding blank line and at least 2 items. A single item becomes a paragraph.
Paragraphs
Consecutive non-blank lines form a paragraph.
This is a paragraph. It can span
multiple lines until a blank line
separates it from the next block.
This is another paragraph.
Verbatim Blocks
Content that should not be parsed as lex.
Code Block:
console.log("Hello, world!");
const x = 42;
:: javascript
Image Reference:
:: image src="diagram.png" alt="Architecture diagram"
The closing :: can include a label and parameters.
Annotations
Metadata attached to content.
:: author :: Jane Doe
:: date :: 2024-01-15
:: note ::
This is a block annotation
with multiple lines of content.
::
Content here has the annotation attached.
Forms:
- Marker:
:: label :: - Single-line:
:: label :: inline content - Block:
:: label ::\n content\n::
Inline Formatting
| Syntax | Meaning |
|---|---|
*text* |
Strong/bold |
_text_ |
Emphasis/italic |
`text` |
Code |
#text# |
Math |
[ref] |
Reference |
Reference Types
| Pattern | Type | Example |
|---|---|---|
| URL | Link | [https://example.com] |
@key |
Citation | [@author2024, pp. 42-45] |
| Number | Footnote | [1] |
^label |
Named footnote | [^important] |
#num |
Section ref | [#2.1] |
| Path | File ref | [./data.csv] |
TK-* |
Placeholder | [TK-figure] |
Parsing Order
Elements are matched in this order:
- Verbatim blocks (require closing annotation)
- Annotations (start with
::) - Lists (blank line + 2+ items)
- Definitions (subject + immediate indent)
- Sessions (subject + blank line + indent)
- Paragraphs (fallback)
Grammar Files
The complete formal grammar is maintained in the comms repository:
specs/grammar-core.lex- Core tokens and element grammarspecs/grammar-line.lex- Line-level token classificationspecs/grammar-inline.lex- Inline formatting and references