A language-independent specification for compiling RFC 8927 JSON Type Definition schemas into target-language source code that validates JSON documents. The generated code contains exactly the checks the schema requires -- no interpreter, no AST, no runtime stack, no dead code.
| Term | Meaning |
|---|---|
| schema | A JSON object conforming to RFC 8927. |
| instance | The JSON value being validated at runtime. |
| form | One of the 8 mutually-exclusive schema shapes defined in RFC 8927 plus the nullable modifier. |
| AST node | An immutable, tagged value representing one compiled schema form. Used during generation, discarded after. |
| error | A pair of JSON Pointers: (instancePath, schemaPath). |
| definitions | A flat string-keyed map of named AST nodes, resolved at compile time. Each becomes a generated function. |
A JTD code generator operates in two phases:
- Parse: Read the JTD schema JSON and compile it into an intermediate AST of immutable nodes (Section 3).
- Emit: Walk the AST and emit target-language source code. Each AST node maps to a specific code pattern. The AST is discarded after emission (Section 5).
The generated code is a standalone validation function. When executed against
a JSON instance, it produces the same (instancePath, schemaPath) error
pairs that RFC 8927 Section 3.3 specifies.
The AST is used only during generation. It is not present in the output.
Node =
| Empty -- {}
| Ref { name: String } -- {"ref": "..."}
| Type { type: TypeKeyword } -- {"type": "..."}
| Enum { values: List<String> } -- {"enum": [...]}
| Elements { schema: Node } -- {"elements": ...}
| Properties { required: Map<String, Node>, -- {"properties": ...}
optional: Map<String, Node>, -- {"optionalProperties": ...}
additional: Boolean } -- {"additionalProperties": ...}
| Values { schema: Node } -- {"values": ...}
| Discrim { tag: String, mapping: Map<String,Node>} -- {"discriminator":...,"mapping":...}
| Nullable { inner: Node } -- any form + "nullable": true
TypeKeyword is one of the 12 strings defined in RFC 8927 Section 2.2.3:
TypeKeyword = boolean | string | timestamp
| int8 | uint8 | int16 | uint16 | int32 | uint32
| float32 | float64
compile(json, isRoot=true, definitions) -> Node:
REQUIRE json is a JSON object
IF isRoot:
IF json has key "definitions":
REQUIRE json["definitions"] is a JSON object
-- Pass 1: register all keys as placeholders for forward refs
FOR EACH key in json["definitions"]:
definitions[key] = PLACEHOLDER
-- Pass 2: compile each definition
FOR EACH key in json["definitions"]:
definitions[key] = compile(json["definitions"][key], isRoot=false, definitions)
ELSE:
REQUIRE json does NOT have key "definitions"
-- Detect form
forms = []
IF json has "ref": forms += "ref"
IF json has "type": forms += "type"
IF json has "enum": forms += "enum"
IF json has "elements": forms += "elements"
IF json has "values": forms += "values"
IF json has "discriminator": forms += "discriminator"
IF json has "properties" OR json has "optionalProperties":
forms += "properties"
REQUIRE |forms| <= 1
-- Compile form
node = MATCH forms:
[] -> Empty
["ref"] -> compileRef(json, definitions)
["type"] -> compileType(json)
["enum"] -> compileEnum(json)
["elements"] -> compileElements(json, definitions)
["properties"] -> compileProperties(json, definitions)
["values"] -> compileValues(json, definitions)
["discriminator"]-> compileDiscriminator(json, definitions)
-- Nullable modifier wraps any form
IF json has "nullable" AND json["nullable"] == true:
node = Nullable { inner: node }
RETURN node
Ref:
compileRef(json, definitions):
name = json["ref"] -- must be a string
REQUIRE name IN definitions -- forward refs are valid (placeholder exists)
RETURN Ref { name }
Type:
compileType(json):
t = json["type"] -- must be a string
REQUIRE t IN TypeKeyword
RETURN Type { type: t }
Enum:
compileEnum(json):
values = json["enum"] -- must be a non-empty array of strings
REQUIRE no duplicates in values
RETURN Enum { values }
Elements:
compileElements(json, definitions):
inner = compile(json["elements"], isRoot=false, definitions)
RETURN Elements { schema: inner }
Properties:
compileProperties(json, definitions):
req = {}
opt = {}
IF json has "properties":
FOR EACH (key, schema) in json["properties"]:
req[key] = compile(schema, isRoot=false, definitions)
IF json has "optionalProperties":
FOR EACH (key, schema) in json["optionalProperties"]:
opt[key] = compile(schema, isRoot=false, definitions)
REQUIRE keys(req) INTERSECT keys(opt) == {}
additional = json.get("additionalProperties", false)
RETURN Properties { required: req, optional: opt, additional }
Values:
compileValues(json, definitions):
inner = compile(json["values"], isRoot=false, definitions)
RETURN Values { schema: inner }
Discriminator:
compileDiscriminator(json, definitions):
tag = json["discriminator"] -- must be a string
REQUIRE json has "mapping"
mapping = {}
FOR EACH (key, schema) in json["mapping"]:
node = compile(schema, isRoot=false, definitions)
REQUIRE node is Properties -- not Nullable, not any other form
REQUIRE tag NOT IN node.required
REQUIRE tag NOT IN node.optional
mapping[key] = node
RETURN Discrim { tag, mapping }
After compilation, the following are guaranteed:
- Every
Ref.nameresolves to an entry indefinitions. - Every
Discrim.mappingvalue is aPropertiesnode (not nullable). - No
Propertiesnode has overlapping required/optional keys. - The AST is immutable. No node is modified after construction.
Exact semantics for each TypeKeyword. The code generator emits exactly
this check, inlined, for each type keyword it encounters.
value is a JSON boolean (true or false)
Target-language expression examples:
- JavaScript:
typeof v === "boolean" - Java:
v instanceof JsonBoolean - Python:
isinstance(v, bool)
value is a JSON string
Target-language expression examples:
- JavaScript:
typeof v === "string" - Java:
v instanceof JsonString - Python:
isinstance(v, str)
value is a JSON string
AND value matches the RFC 3339 date-time production
(regex: ^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:(\d{2}|60)(\.\d+)?(Z|[+-]\d{2}:\d{2})$)
AND the date-time is parseable (accounting for leap seconds by
normalizing :60 to :59 before parsing)
Target-language expression examples:
- JavaScript:
typeof v === "string" && !Number.isNaN(Date.parse(v))(simplified; a full implementation needs the regex for leap-second support) - Java: regex match +
OffsetDateTime.parse(normalized)
value is a JSON number (any finite number; no range check)
RFC 8927 does not distinguish float32 from float64 at the validation level. Both accept any JSON number.
Target-language expression examples:
- JavaScript:
typeof v === "number" && Number.isFinite(v) - Java:
v instanceof JsonNumber
All integer types share the same two-step check:
value is a JSON number
AND value has zero fractional part (floor(value) == value)
AND value is within the type's range (inclusive)
| Type | Min | Max |
|---|---|---|
| int8 | -128 | 127 |
| uint8 | 0 | 255 |
| int16 | -32768 | 32767 |
| uint16 | 0 | 65535 |
| int32 | -2147483648 | 2147483647 |
| uint32 | 0 | 4294967295 |
Note: 3.0 is a valid int8. 3.5 is not. This is value-based, not
syntax-based.
Target-language expression examples:
- JavaScript (uint8):
typeof v === "number" && Number.isInteger(v) && v >= 0 && v <= 255 - Java (uint8):
v instanceof JsonNumber n && n.toDouble() == Math.floor(n.toDouble()) && n.toLong() >= 0 && n.toLong() <= 255
The code generator walks the AST and emits target-language source code. Each AST node maps to a specific code pattern. The central rule:
Emit only what the schema requires. If the schema does not mention a form, the generated code does not contain any logic for that form.
The generator emits:
-
One function per definition -- named
validate_<defName>, taking(instance, errors, instancePath)as parameters. Only emitted if the schema has definitions. -
One exported
validate(instance)function -- the entry point. Creates the error list, calls the root validation logic, returns the error list. -
No helpers, no libraries, no imports. Every check is inlined. If the schema uses only
"type": "string", the generated code contains onetypeofcheck and nothing else.
Emit nothing. No check. No code.
If an Empty node is a required property value, the generated code checks that the key exists but does not validate the value:
// Schema: {"properties": {"data": {}}}
if (!("data" in obj)) e.push({instancePath: p, schemaPath: sp + "/properties/data"});
// No else branch -- empty schema accepts any valueEmit a null guard before the inner check:
// Schema: {"type": "string", "nullable": true}
if (v !== null) {
if (typeof v !== "string") e.push({instancePath: p, schemaPath: sp + "/type"});
}If the inner node is Empty, the nullable wraps nothing -- emit only the null guard (which passes everything, so emit nothing at all).
Emit the type-specific check inlined. No helper function.
// "type": "string"
if (typeof v !== "string") e.push({instancePath: p, schemaPath: sp + "/type"});
// "type": "uint8"
if (typeof v !== "number" || !Number.isInteger(v) || v < 0 || v > 255)
e.push({instancePath: p, schemaPath: sp + "/type"});
// "type": "boolean"
if (typeof v !== "boolean") e.push({instancePath: p, schemaPath: sp + "/type"});
// "type": "float64"
if (typeof v !== "number" || !Number.isFinite(v))
e.push({instancePath: p, schemaPath: sp + "/type"});Emit a set-membership check. For small enums, inline the array. For large enums, a code generator MAY hoist the array to module scope as a constant.
// "enum": ["a", "b", "c"]
if (typeof v !== "string" || !["a","b","c"].includes(v))
e.push({instancePath: p, schemaPath: sp + "/enum"});Note: the string type guard is required because RFC 8927 specifies that non-string values fail enum validation.
Emit an array type guard, then a loop. The loop body is the generated check for the element schema.
// "elements": {"type": "string"}
if (!Array.isArray(v)) {
e.push({instancePath: p, schemaPath: sp + "/elements"});
} else {
for (let i = 0; i < v.length; i++) {
if (typeof v[i] !== "string")
e.push({instancePath: p + "/" + i, schemaPath: sp + "/elements/type"});
}
}If the element schema is a complex type (Properties, Discrim), emit a function call in the loop body instead of inlining.
For nested arrays (arrays of arrays), a code generator MAY inline nested loops up to a configurable depth (e.g. 3 levels) for performance, falling back to function calls beyond that depth.
Emit an object type guard, then:
- One presence check per required key.
- Inlined value checks for each required and optional property.
- A key-rejection loop if
additional == false.
// Schema: {"properties":{"name":{"type":"string"}}, "optionalProperties":{"age":{"type":"uint8"}}}
if (v === null || typeof v !== "object" || Array.isArray(v)) {
e.push({instancePath: p, schemaPath: sp + "/properties"});
} else {
// Required properties
if (!("name" in v)) e.push({instancePath: p, schemaPath: sp + "/properties/name"});
else if (typeof v["name"] !== "string")
e.push({instancePath: p + "/name", schemaPath: sp + "/properties/name/type"});
// Optional properties
if ("age" in v) {
const a = v["age"];
if (typeof a !== "number" || !Number.isInteger(a) || a < 0 || a > 255)
e.push({instancePath: p + "/age", schemaPath: sp + "/optionalProperties/age/type"});
}
// Additional properties (only emitted when additional == false)
for (const k in v) {
if (k !== "name" && k !== "age")
e.push({instancePath: p + "/" + k, schemaPath: sp});
}
}If additional is true, the for-in loop is not emitted at all.
If a property value's schema is a complex type (Properties, Elements, etc.), emit a function call instead of inlining. If it is a leaf (Type, Enum, Empty), inline it.
Emit an object type guard, then a for-in loop. The loop body is the generated check for the value schema.
// "values": {"type": "string"}
if (v === null || typeof v !== "object" || Array.isArray(v)) {
e.push({instancePath: p, schemaPath: sp + "/values"});
} else {
for (const k in v) {
if (typeof v[k] !== "string")
e.push({instancePath: p + "/" + k, schemaPath: sp + "/values/type"});
}
}Emit a 5-step sequential check, then a switch/if-else dispatching to the variant validator.
// "discriminator": "type", "mapping": {"a": {...}, "b": {...}}
if (v === null || typeof v !== "object" || Array.isArray(v)) {
e.push({instancePath: p, schemaPath: sp + "/discriminator"});
} else if (!("type" in v)) {
e.push({instancePath: p, schemaPath: sp + "/discriminator"});
} else if (typeof v["type"] !== "string") {
e.push({instancePath: p + "/type", schemaPath: sp + "/discriminator"});
} else if (v["type"] === "a") {
validate_variant_a(v, e, p, sp + "/mapping/a");
} else if (v["type"] === "b") {
validate_variant_b(v, e, p, sp + "/mapping/b");
} else {
e.push({instancePath: p + "/type", schemaPath: sp + "/mapping"});
}Each variant validator is a generated Properties check. The discriminator tag field is excluded from additional-properties checking and from property validation in the variant (it was already validated by the discriminator check).
Emit a function call to the generated definition validator:
// "ref": "address"
validate_address(v, e, p, sp);Each definition becomes a generated function. The function body is the emitted code for the definition's AST node.
A code generator SHOULD inline checks for leaf nodes (Type, Enum, Empty) directly into their parent's generated code.
A code generator SHOULD emit separate functions for:
- Each definition (called via Ref).
- Each Properties or Discrim node that appears as the child of Elements, Values, or other container nodes.
- Each discriminator variant.
A code generator MUST NOT emit helper functions, type-checking utilities, or library imports that are not required by the specific schema being compiled.
Recursive refs (a definition that ultimately references itself) are legal in RFC 8927. In generated code, this becomes recursive function calls:
// Schema: {"definitions":{"node":{"properties":{"next":{"ref":"node","nullable":true}}}},
// "ref":"node"}
function validate_node(v, e, p, sp) {
if (v === null || typeof v !== "object" || Array.isArray(v)) {
e.push({instancePath: p, schemaPath: sp});
return;
}
if (!("next" in v)) {
e.push({instancePath: p, schemaPath: sp + "/properties/next"});
} else if (v["next"] !== null) {
validate_node(v["next"], e, p + "/next", sp + "/properties/next");
}
}
export function validate(instance) {
const e = [];
validate_node(instance, e, "", "");
return e;
}The target-language call stack provides the implicit work stack. For most real-world schemas, recursion depth is bounded by the document's structure.
When emitting a variant Properties check inside a discriminator, the code generator MUST:
- Exclude the tag field from additional-properties rejection.
- Not emit a value check for the tag field (it was already validated as a string by the discriminator check).
This means the generated known-key set in the for-in loop includes the tag field name, and no property check is emitted for it.
Errors follow RFC 8927 Section 3.3, which defines error indicators as pairs of JSON Pointers:
Error = {
instancePath: String, -- JSON Pointer (RFC 6901) into the instance
schemaPath: String -- JSON Pointer (RFC 6901) into the schema
}
The instancePath points to the value that failed. The schemaPath points
to the schema keyword that caused the failure.
The schema path is built at generation time and baked into the generated code as string literals. Each emission rule appends to the schema path:
| Form | Appended path component(s) |
|---|---|
| Type | /type |
| Enum | /enum |
| Elements (type guard) | /elements |
| Elements (child) | /elements |
| Properties (type guard) | /properties (or /optionalProperties if schema has no properties member) |
| Properties (missing key) | /properties/<key> |
| Properties (additional) | (nothing -- error at current path) |
| Properties (child req) | /properties/<key> |
| Properties (child opt) | /optionalProperties/<key> |
| Values (type guard) | /values |
| Values (child) | /values |
| Discrim (not object) | /discriminator |
| Discrim (tag missing) | /discriminator |
| Discrim (tag not string) | /discriminator |
| Discrim (tag not in map) | /mapping |
| Discrim (variant) | /mapping/<tagValue> |
Schema paths are string literals in the generated code. They do not change at runtime.
Instance paths are built at runtime via string concatenation:
| Descent into | Appended to instancePath |
|---|---|
Array element at index i |
"/" + i |
Object property with key k |
"/" + k |
| Discriminator tag value | "/" + tagFieldName |
| Discriminator variant | (nothing -- same object) |
| Ref target | (nothing -- transparent) |
Generated code conforms to this spec if:
-
For any valid RFC 8927 schema and any JSON instance, the generated
validate(instance)function returns the same set of(instancePath, schemaPath)error pairs that RFC 8927 Section 3.3 specifies. -
The generated code passes the official JTD validation test suite (
validation.jsonfromjson-typedef-spec) when used as the validation engine. -
The code generator rejects invalid schemas at generation time per the constraints in Section 3.4.
-
The generated code contains no dead code: no helper functions, loops, branches, or checks that the schema does not require.
-
Validation does not short-circuit. All errors are collected in a single pass.
Schema:
{
"properties": {
"name": { "type": "string" },
"age": { "type": "uint8" },
"tags": { "elements": { "type": "string" } }
},
"optionalProperties": {
"email": { "type": "string" }
}
}Properties {
required: {
"name" -> Type { type: "string" },
"age" -> Type { type: "uint8" },
"tags" -> Elements { schema: Type { type: "string" } }
},
optional: {
"email" -> Type { type: "string" }
},
additional: false
}
export function validate(instance) {
const e = [];
if (instance === null || typeof instance !== "object" || Array.isArray(instance)) {
e.push({instancePath: "", schemaPath: "/properties"});
return e;
}
if (!("name" in instance)) e.push({instancePath: "", schemaPath: "/properties/name"});
else if (typeof instance["name"] !== "string")
e.push({instancePath: "/name", schemaPath: "/properties/name/type"});
if (!("age" in instance)) e.push({instancePath: "", schemaPath: "/properties/age"});
else {
const v = instance["age"];
if (typeof v !== "number" || !Number.isInteger(v) || v < 0 || v > 255)
e.push({instancePath: "/age", schemaPath: "/properties/age/type"});
}
if (!("tags" in instance)) e.push({instancePath: "", schemaPath: "/properties/tags"});
else if (!Array.isArray(instance["tags"]))
e.push({instancePath: "/tags", schemaPath: "/properties/tags/elements"});
else {
const arr = instance["tags"];
for (let i = 0; i < arr.length; i++) {
if (typeof arr[i] !== "string")
e.push({instancePath: "/tags/" + i, schemaPath: "/properties/tags/elements/type"});
}
}
if ("email" in instance && typeof instance["email"] !== "string")
e.push({instancePath: "/email", schemaPath: "/optionalProperties/email/type"});
for (const k in instance) {
if (k !== "name" && k !== "age" && k !== "tags" && k !== "email")
e.push({instancePath: "/" + k, schemaPath: ""});
}
return e;
}No helper functions. No dead code. Every line corresponds to a specific constraint in the schema.
Instance:
{ "name": "Alice", "age": 300, "tags": ["a", 42], "extra": true }Errors produced:
[
{ "instancePath": "/age", "schemaPath": "/properties/age/type" },
{ "instancePath": "/tags/1", "schemaPath": "/properties/tags/elements/type" },
{ "instancePath": "/extra", "schemaPath": "" }
]age: 300 is a number with zero fractional part, but 300 > 255 (uint8 max).tags/1: 42 is not a string.extra: not in required or optional properties, andadditionalPropertiesdefaults tofalse.