Nested Table Optimized Notation (NTON) is a schema-driven serialization format. It achieves high data density by defining data structures (DEF) and recurring values (REF) in a header, allowing the data body to use positional encoding with optional field names for clarity.
NTON is designed for LLM contexts where token efficiency matters, while maintaining robustness and evolvability.
v0.03 Focus: Eliminates off-by-one errors and enables truncation detection through mandatory naming rules and optional metadata.
An NTON document consists of three optional sections, which MUST appear in this order:
- Definitions (DEF)
- References (REF)
- Data Stream (STREAM)
NTON files MUST be encoded in UTF-8.
Types function as structs. They define the fields and their order in the data stream.
Syntax: DEF <TypeName>: { <field>, <field>:<Type>, ... }
- Fields without a type default to Primitive (string, number, boolean, null).
- Arrays are denoted by [].
- Optional fields are denoted by ?.
Examples:
# Simple type
DEF User: { id, name, email }
# Optional fields
DEF User: { id, name, email?, phone? }
# Typed fields
DEF Group: { name, owner:User, members:User[] }
# Optional arrays
DEF Project: {
id,
name,
status,
active,
manager_id?,
budget,
milestones:Milestone[]?
}
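The DEF syntax above can be read mechanically. The following is a minimal Python sketch, not a reference implementation: the names `Field` and `parse_def` are hypothetical, and the sketch assumes comments have already been removed and that fields only take the forms shown in the examples (`name`, `name?`, `name:Type`, `name:Type[]`, `name:Type[]?`).

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Field:
    name: str
    type_name: Optional[str]  # None means the field is a Primitive
    is_array: bool
    optional: bool


# Hypothetical patterns covering only the field forms shown in the examples.
DEF_RE = re.compile(r"DEF\s+(\w+)\s*:\s*\{(.*)\}\s*$", re.DOTALL)
FIELD_RE = re.compile(r"^(\w+)(?::(\w+)(\[\])?)?(\?)?$")


def parse_def(text: str) -> tuple[str, list[Field]]:
    """Parse one DEF statement into (type name, ordered field list)."""
    match = DEF_RE.match(text.strip())
    if match is None:
        raise ValueError("not a DEF statement")
    type_name, body = match.groups()
    fields = []
    for raw in body.split(","):
        raw = raw.strip()
        if not raw:
            continue  # tolerate a trailing comma or blank entry
        m = FIELD_RE.match(raw)
        if m is None:
            raise ValueError(f"unrecognized field: {raw!r}")
        name, ftype, array, opt = m.groups()
        fields.append(Field(name, ftype, array is not None, opt is not None))
    return type_name, fields
```

Field order is preserved in the returned list because positional encoding in the data stream depends on it.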
To prevent off-by-one errors and ambiguity, NTON enforces the following rules:
Rule 1: Optional fields MUST use named syntax when present.
DEF User: { id, name, email?, phone? }
# VALID:
{U1, "Alice", email="alice@example.com"} # Named optional field
{U2, "Bob", email="bob@example.com", phone="+1-555-1234"}
{U3, "Carol"} # Optional fields omitted
# INVALID:
{U4, "Dave", "dave@example.com"} # Ambiguous - is this email or phone?
Rule 2: Positional encoding can only be used for required fields.
DEF Task: { id, title, priority, hours? }
# VALID:
{T1, "Fix bug", 8} # Required fields positional, optional omitted
{T2, "Add feature", 5, hours=40} # Required positional, optional named
# INVALID:
{T3, "Refactor", 3, 20} # Last field MUST be named (hours=20)
Rationale: This eliminates ambiguity when optional fields are sparse or when data is truncated.
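Rules 1 and 2 can be checked with a short routine. This Python sketch is illustrative only: `check_record` is a hypothetical name, the record is assumed to be already tokenized into `(name_or_None, value)` pairs, and positional values are assumed to fill the required fields in declaration order, skipping optional ones.

```python
def check_record(schema, record):
    """Return a list of rule violations (empty if the record is valid).

    schema: list of (field_name, is_optional) in DEF order.
    record: list of (name_or_None, value); None marks a positional value.
    """
    required = [name for name, optional in schema if not optional]
    known = {name for name, _ in schema}
    seen = set()
    errors = []
    next_required = 0  # index of the next required field a positional value fills

    for name, _value in record:
        if name is None:
            if next_required >= len(required):
                # Rule 2: nothing required is left, so this value would have
                # to be an optional field, and those must be named (Rule 1).
                errors.append("positional value where only named optional fields are allowed")
                continue
            name = required[next_required]
            next_required += 1
        elif name not in known:
            errors.append(f"unknown field {name!r}")
            continue
        if name in seen:
            errors.append(f"duplicate field {name!r}")
        seen.add(name)

    for name in required:
        if name not in seen:
            errors.append(f"missing required field {name!r}")
    return errors


USER = [("id", False), ("name", False), ("email", True), ("phone", True)]
valid = check_record(USER, [(None, "U1"), (None, "Alice"), ("email", "alice@example.com")])
invalid = check_record(USER, [(None, "U4"), (None, "Dave"), (None, "dave@example.com")])
```

Here `valid` is an empty list, while `invalid` contains one error for the unnamed third value.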
Reference tables allow string deduplication via variable substitution.
Syntax: REF <RefName>: { $<Var>: "<Value>", ... }
- Variables MUST start with $.
- Variables can be used anywhere a string value is expected.
Example:
REF Status: {
$IP: "In Progress",
$C: "Completed",
$P: "Planning"
}
REF Departments: {
$T: "Technology",
$M: "Marketing",
$F: "Finance"
}
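Variable substitution can be sketched as a recursive walk over parsed data. This Python example is an assumption-laden illustration: the `Var` wrapper and `resolve_vars` are hypothetical names, and all REF tables are assumed to share one namespace, since the examples reference variables such as `$IP` without a table qualifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A variable token as left by a hypothetical parser, e.g. Var("$IP")."""
    name: str


def resolve_vars(value, refs):
    """Replace every Var in a nested structure with its REF string."""
    if isinstance(value, Var):
        if value.name not in refs:
            raise KeyError(f"undefined variable {value.name}")
        return refs[value.name]
    if isinstance(value, list):
        return [resolve_vars(item, refs) for item in value]
    if isinstance(value, dict):
        return {key: resolve_vars(item, refs) for key, item in value.items()}
    return value  # plain primitives pass through unchanged


# Assumption: the Status and Departments tables are merged into one lookup.
refs = {
    "$IP": "In Progress", "$C": "Completed", "$P": "Planning",
    "$T": "Technology", "$M": "Marketing", "$F": "Finance",
}
project = {"id": "P001", "status": Var("$IP"), "departments": [Var("$T"), Var("$F")]}
resolved = resolve_vars(project, refs)
```

Using a wrapper type rather than checking for a leading `$` keeps a quoted string such as "$5" from being mistaken for a variable.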
- Booleans: T or F (also true or false)
- Null: null, ~, or _
- Strings: Unquoted if alphanumeric with no spaces. Double-quoted otherwise.
- Numbers: Standard integer or floating-point: 42, 3.14, -17, 1.23e-4
- Dates: ISO 8601 format: 2025-12-15 or "2025-12-15T10:30:00Z"
- Variables: References to REF tables (e.g., $IP, $Alice)
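The primitive rules above can be sketched as a single token classifier. This is a hypothetical helper (`parse_primitive`), not part of the format: it assumes the input is one already-isolated token, leaves quoted timestamps as plain strings, does not process escape sequences (the rules above do not define any), and returns variables such as `$IP` unchanged for later substitution.

```python
import re
from datetime import date

NUMBER_RE = re.compile(r"-?\d+(\.\d+)?([eE][+-]?\d+)?$")
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}$")


def parse_primitive(token: str):
    """Convert one NTON primitive token into a Python value."""
    if token in ("T", "true"):
        return True
    if token in ("F", "false"):
        return False
    if token in ("null", "~", "_"):
        return None
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        return token[1:-1]  # quoted string; escapes are not interpreted
    if NUMBER_RE.match(token):
        # A decimal point or exponent marks a float; otherwise an integer.
        return float(token) if any(c in token for c in ".eE") else int(token)
    if DATE_RE.match(token):
        return date.fromisoformat(token)
    return token  # unquoted string or $variable
```

The checks run from most to least specific, so a literal string "T" or "null" has to be double-quoted to survive as a string.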
Objects are enclosed in { } and contain comma-separated values.
Pure Positional (Most Compact):
{U1, "Alice", "alice@example.com"}
Named Fields (Most Readable):
{id=U1, name="Alice", email="alice@example.com"}
Hybrid (Best Balance):
{U1, "Alice", email="alice@example.com"} # Mix positional and named
Arrays are enclosed in [ ] and contain comma-separated items.
[
{U1, "Alice", "alice@example.com"},
{U2, "Bob", "bob@example.com"}
]
Objects and arrays can be nested to any depth using explicit delimiters.
{
id=P001,
name="Alpha Initiative",
milestones=[
{
name="Design Phase",
workers=[
{W1, 65.50, "Alice"},
{W2, 85.00, "Bob"}
]
}
]
}
Whitespace is FLEXIBLE:
- Indentation is cosmetic only (for human readability)
- Parser ignores leading/trailing whitespace
- Line breaks are optional
- Any consistent indentation style is acceptable (2 spaces, 4 spaces, tabs)
All of these are equivalent:
{U1, "Alice", "alice@example.com"}
{ U1, "Alice", "alice@example.com" }
{
U1,
"Alice",
"alice@example.com"
}
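One way to see why these layouts are equivalent is to tokenize them: whitespace never becomes a token. The sketch below uses a hypothetical `tokenize` function built on Python's `re.findall`; it assumes comments were stripped beforehand and that backslash escapes may appear inside quoted strings, which the format description does not confirm.

```python
import re

# Quoted strings, single-character punctuation, or runs of other characters.
TOKEN_RE = re.compile(r'"(?:[^"\\]|\\.)*"|[{}\[\],=]|[^\s{}\[\],="]+')


def tokenize(text: str) -> list[str]:
    """Split NTON data into tokens, discarding all whitespace between them."""
    return TOKEN_RE.findall(text)


compact = tokenize('{U1, "Alice", "alice@example.com"}')
spaced = tokenize('{ U1, "Alice", "alice@example.com" }')
multiline = tokenize('{\n  U1,\n  "Alice",\n  "alice@example.com"\n}')
```

All three calls yield the same token list, so a parser working on tokens never sees the layout differences. Note that `findall` silently skips a stray unmatched quote, so a real parser would need stricter error handling.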
A stream begins with STREAM <TypeName>: followed by one or more records.
Syntax: STREAM <TypeName> [(count=N)]: where count is optional.
Simple Stream:
STREAM User (count=3):
{U1, "Alice", email="alice@example.com"}
{U2, "Bob", email="bob@example.com"}
{U3, "Carol"} # Optional field omitted
Without count (streaming contexts):
STREAM User:
{U1, "Alice", email="alice@example.com"}
{U2, "Bob", email="bob@example.com"}
Benefit: When count is declared, parsers can detect truncation by comparing the actual record count against the declared one.
Nested Stream:
STREAM Project:
{
P001,
"Alpha Initiative",
status=$IP,
active=T,
budget=500000,
milestones=[
{name="Design", date=2025-12-15, completion=1.0},
{name="Development", date=2026-01-20, completion=0.5}
]
}
# Line comment
{U1, "Alice"} # Inline comment
/*
Block comment
spanning multiple lines
*/
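Comment removal has to respect quoted strings, since a `#` inside quotes is data. The following Python sketch shows one way to do this; `strip_comments` is a hypothetical name, and treating a backslash as an escape character inside strings is an assumption not stated in the format description.

```python
def strip_comments(text: str) -> str:
    """Remove # line comments and /* */ block comments outside of strings."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])  # keep the escaped character verbatim
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif ch == "#":
            while i < n and text[i] != "\n":
                i += 1  # skip to end of line, keeping the newline itself
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            i = end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Running this before tokenizing keeps the later parsing stages free of comment handling.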
Use ... to explicitly indicate intentional truncation or partial data display.
# Showing first 2 of 1000 records
STREAM User (count=1000):
{U1, "Alice", email="alice@example.com"}
{U2, "Bob", email="bob@example.com"}
...
Arrays with truncation:
workers=[
{W1, 65.50, manager="Alice"},
{W2, 85.00, manager="Bob"},
...
]
Benefit: Distinguishes intentional truncation from parsing errors.
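The declared count and the `...` marker together let a consumer classify a stream. The Python sketch below is one possible policy, not something the format prescribes: `classify_stream` and its status strings are hypothetical, and treating more records than declared as a mismatch is an assumption.

```python
from typing import Optional


def classify_stream(declared: Optional[int], actual: int, has_ellipsis: bool) -> str:
    """Classify a parsed stream from its declared count, record count, and "..." marker."""
    if declared is not None and actual > declared:
        return "count-mismatch"          # more records than the header promised
    if has_ellipsis:
        return "intentional-truncation"  # author marked the cut explicitly
    if declared is None:
        return "unknown"                 # no count to compare against
    if actual == declared:
        return "complete"
    return "possible-truncation"         # fewer records and no "..." marker


# The 1000-record example above: two records shown, then "...".
status = classify_stream(declared=1000, actual=2, has_ellipsis=True)
```

For the example stream this yields "intentional-truncation"; the same stream without the `...` line would be reported as "possible-truncation".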
document = *definition *reference *stream
definition = "DEF" SP type-name ":" SP "{" field-list "}" LF
field = field-name [":" type-name] ["?"]
reference = "REF" SP ref-name ":" SP "{" var-list "}" LF
stream = "STREAM" SP type-name [SP "(" "count" "=" number ")"] ":" LF *record ["..." LF]
record = object LF
object = "{" [field-value *("," field-value)] "}"
field-value = [field-name "="] value
; field-name "=" REQUIRED for optional fields when present
array = "[" [value *("," value) ["," "..."]] "]"
; "..." indicates explicit truncation
value = primitive / object / array / variable
primitive = string / number / boolean / null / date
Fields marked with ? in the DEF can be omitted or set to null.
IMPORTANT: When present, optional fields MUST use named syntax (see section 3.2).
DEF User: { id, name, email?, phone? }
# All valid:
{U1, "Alice", email="alice@example.com", phone="(555) 1234"} # Named
{U2, "Bob", email="bob@example.com"} # Named
{U3, "Carol", email=null, phone=null} # Explicit null
{U4, "Dave"} # Omitted
# INVALID (v0.03):
{U5, "Eve", "eve@example.com", "(555) 1234"} # Positional optional fields
Adding optional fields is backward compatible:
# Version 1
DEF User: { id, name, email }
# Version 2 (backward compatible)
DEF User: { id, name, email, phone?, created_at? }
Old data remains valid under the new schema.
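A compatibility check between two versions of a type can be sketched as follows. This is a hypothetical helper under stated assumptions: schemas are lists of `(field_name, is_optional)`, the required fields must stay identical and in the same order (because positional encoding depends on them), and existing optional fields must not be removed. Type changes are not considered.

```python
def is_backward_compatible(old, new):
    """Return True if records valid under `old` remain valid under `new`."""
    old_required = [name for name, optional in old if not optional]
    new_required = [name for name, optional in new if not optional]
    if old_required != new_required:
        return False  # positional values would map to different fields
    old_optional = {name for name, optional in old if optional}
    new_optional = {name for name, optional in new if optional}
    return old_optional <= new_optional  # only additions are allowed


v1 = [("id", False), ("name", False), ("email", False)]
v2 = v1 + [("phone", True), ("created_at", True)]
v3 = v1 + [("phone", False)]  # adds a required field instead

ok = is_backward_compatible(v1, v2)
broken = is_backward_compatible(v1, v3)
```

Here `ok` is True and `broken` is False, because a new required field would make every old record incomplete.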
Parsers MUST validate:
- Required fields: Missing required fields are parse errors
- Optional field syntax: Optional fields present without names are errors (v0.03+)
- Unclosed delimiters: { [ without matching } ] are errors
- Type mismatches: Attempt coercion, error if impossible
- Record counts: Mismatch between declared count and actual count triggers a warning
Parsers SHOULD be forgiving:
- Trailing commas: Allowed and ignored
- Missing optional fields: Treated as null
- Extra unknown fields: Ignored with a warning
- Whitespace variations: Flexible indentation and line breaks
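The unclosed-delimiter check from the validation list can be sketched with a stack. `check_delimiters` is a hypothetical name; the sketch assumes comments have already been stripped and does not handle escaped quotes inside strings.

```python
def check_delimiters(text: str):
    """Return an error message for unbalanced { } [ ], or None if balanced."""
    closers = {"}": "{", "]": "["}
    stack = []  # (opening character, offset) for each still-open delimiter
    in_string = False
    for pos, ch in enumerate(text):
        if ch == '"':
            in_string = not in_string
        elif in_string:
            continue  # delimiters inside quoted strings are data
        elif ch in "{[":
            stack.append((ch, pos))
        elif ch in closers:
            if not stack or stack[-1][0] != closers[ch]:
                return f"unexpected {ch!r} at offset {pos}"
            stack.pop()
    if in_string:
        return "unterminated string"
    if stack:
        ch, pos = stack[-1]
        return f"unclosed {ch!r} opened at offset {pos}"
    return None
```

Reporting the offset of the innermost open delimiter tends to point close to where truncated input was cut off.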
- Extension: .nton
- MIME type: application/nton
# GlobalTech Project Data - NTON v0.03
DEF Worker: { id, rate, manager_name? }
DEF Milestone: {
name,
date,
completion,
workers:Worker[]?
}
DEF Project: {
id,
name,
status,
active,
manager_id?,
budget,
milestones:Milestone[]?
}
REF Status: {
$IP: "In Progress",
$C: "Completed",
$P: "Planning"
}
REF Managers: {
$Alice: "Alice",
$Bob: "Bob",
$Carol: "Carol"
}
STREAM Project (count=2):
{
P001,
"Alpha Initiative",
status=$IP,
active=T,
manager_id=M1,
budget=500000,
milestones=[
{
"Design Sprints",
2025-12-15,
1.0,
workers=[
{W1, 65.50, manager_name=$Alice},
{W2, 85.00, manager_name=$Bob}
]
},
{
"Prototype Approval",
2026-01-20,
0.9,
workers=[
{W1, 65.50, manager_name=$Alice},
{W3, 72.25, manager_name=$Carol}
]
}
]
}
{
P002,
"HR Portal V2",
status=$C,
active=F,
budget=120000,
milestones=[
{"Requirements", 2025-10-01, 1.0},
{"Launch", 2025-11-20, 1.0}
]
}
| Feature | Benefit |
|---|---|
| Explicit delimiters { } [ ] | Robust parsing, no indentation errors |
| Optional field names | Clarity when needed, brevity when obvious |
| Optional fields ? | Schema evolution without breaking changes |
| Flexible whitespace | LLM-friendly, human-friendly |
| No count markers | Simpler, less error-prone |
| Hybrid syntax | Balance between compression and readability |
| Feature | Benefit |
|---|---|
| Mandatory names for optional fields | Eliminates off-by-one errors and ambiguity |
| Optional stream record counts | Enables truncation detection |
| Explicit truncation markers (...) | Distinguishes intentional vs accidental truncation |
| Strict validation rules | Catches errors early in parsing |
| Required field enforcement | Prevents incomplete data |
Breaking change from v0.02: Optional fields MUST use named syntax when present. This trade-off sacrifices ~10-20% token efficiency on sparse data to gain robustness and eliminate parsing ambiguity.