feat: add validateEvent() to all 12 language emitters #34
Extends codegen from tag-only validation to full event validation. Each emitter now generates a validateEvent() function that checks:
- Type guards (event shape, kind integer, content string, tags array)
- Base fields (id=hex64, pubkey=hex64, sig=hex128, created_at>=0)
- Content constraints (minLength, maxLength, pattern, enum per kind)
- Tag dispatch (delegates to existing validateKindTags)

Languages: TypeScript, C, Rust (3 API modes), Swift, Python, Go, Java, Kotlin, Dart, C#, C++, PHP, Ruby.

67 new tests across 13 test files (975 total).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 Walkthrough
Adds per-kind content-constraint planning and generation across all language emitters, and emits event-level validators that validate base fields, dispatch kind-specific content checks, and delegate tag-structure validation. Tests added to assert generated event validators and content/tag dispatch in each language.
Sequence Diagram
sequenceDiagram
participant Client
participant EventValidator as Event Validator (validate_event)
participant FieldChecker as Field Validator (hex / created_at)
participant ContentDispatcher as Content Dispatcher (per-kind)
participant TagValidator as Tag Validator (validateKindTags)
Client->>EventValidator: validate_event(event)
EventValidator->>FieldChecker: check id/pubkey/sig/created_at
FieldChecker-->>EventValidator: field errors (if any)
EventValidator->>ContentDispatcher: dispatch by kind -> apply content checks
ContentDispatcher-->>EventValidator: content errors (if any)
EventValidator->>TagValidator: validate tags (coerce/shape)
TagValidator-->>EventValidator: tag errors (if any)
EventValidator-->>Client: aggregated ValidationErrors
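The flow in the diagram can be sketched in TypeScript as follows. This is an illustration under stated assumptions, not the generated code: the `ValidationError` shape, the lowercase-hex patterns, the stub `validateKindTags`, and the single kind 13 rule (borrowed from the `kindWithContent` test fixture) are all stand-ins.

```typescript
// Illustrative sketch only; the generated TypeScript validator may differ.
interface ValidationError {
  path: string;
  message: string;
}

// Assumption: hex fields are lowercase, as is conventional for Nostr events.
const HEX64 = /^[0-9a-f]{64}$/;
const HEX128 = /^[0-9a-f]{128}$/;

// Stand-in for the generated tag validator (signature assumed).
function validateKindTags(_kind: number, _tags: string[][]): ValidationError[] {
  return [];
}

function isStringMatrix(value: unknown): value is string[][] {
  return (
    Array.isArray(value) &&
    value.every((tag) => Array.isArray(tag) && tag.every((item) => typeof item === 'string'))
  );
}

function validateEvent(event: Record<string, unknown>): ValidationError[] {
  const errors: ValidationError[] = [];

  // 1. Base fields: id, pubkey, sig, created_at.
  const checkHex = (path: string, pattern: RegExp, length: number): void => {
    const value = event[path];
    if (typeof value !== 'string' || !pattern.test(value)) {
      errors.push({ path, message: `${path} must be ${length} lowercase hex characters` });
    }
  };
  checkHex('id', HEX64, 64);
  checkHex('pubkey', HEX64, 64);
  checkHex('sig', HEX128, 128);

  const createdAt = event['created_at'];
  if (typeof createdAt !== 'number' || !Number.isInteger(createdAt) || createdAt < 0) {
    errors.push({ path: 'created_at', message: 'created_at must be a non-negative integer' });
  }

  const kind = event['kind'];
  if (typeof kind !== 'number' || !Number.isInteger(kind)) {
    errors.push({ path: 'kind', message: 'kind must be an integer' });
    return errors; // Without a kind there is nothing to dispatch on.
  }

  // 2. Content: shape check first, then per-kind constraints.
  const content = event['content'];
  if (typeof content !== 'string') {
    errors.push({ path: 'content', message: 'content must be a string' });
  } else {
    switch (kind) {
      case 13: // minLength: 1, mirroring the kindWithContent fixture
        if (Array.from(content).length < 1) {
          errors.push({ path: 'content', message: 'content must be at least 1 character(s)' });
        }
        break;
    }
  }

  // 3. Tags: shape check first, then delegate to the tag validator.
  const tags = event['tags'];
  if (!isStringMatrix(tags)) {
    errors.push({ path: 'tags', message: 'tags must be an array of string arrays' });
  } else {
    errors.push(...validateKindTags(kind, tags));
  }

  return errors;
}
```

Errors are collected rather than thrown, so a caller sees every failing field in one pass, which matches the "aggregated ValidationErrors" arrow at the end of the diagram.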
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
Actionable comments posted: 19
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/emit-validators.ts (1)
336-352: ⚠️ Potential issue | 🟠 Major
Always enforce the base `content`/`tags` shape checks.
Because these blocks are wrapped in `if (contentKinds.length > 0)` and `if (sorted.length > 0)`, a schema set with only tag constraints never validates `content`, and one with only content constraints never validates `tags`. That leaves `validateEvent()` short of the new event-shape contract.
Based on learnings, NEVER update only some emitters: ALL 12 language emitters MUST stay in sync.
Also applies to: 355-373
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/emit-validators.ts` around lines 336 - 352, The generated validator currently skips the baseline "content" and "tags" shape checks when contentKinds or sorted are empty (because the code that emits those checks is wrapped in if (contentKinds.length > 0)/if (sorted.length > 0)), causing validateEvent to miss required base-shape validation; update the emitter so the generic checks for content existence/type and tags existence/type are always emitted outside those conditionals (keep the specific per-kind logic from renderContentActions and the per-tag constraints inside their respective contentKinds/ sorted blocks), i.e., move or duplicate the base checks for the "content" and "tags" paths so they run unconditionally while still applying the specialized cases when contentKinds or sorted contain entries.
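A minimal sketch of the shape this comment asks for, assuming a much simpler emitter than the real one. `emitContentSection` and `ContentPlan` are hypothetical names, not the project's API. The base `content` type check is pushed unconditionally, and only the per-kind switch stays behind `contentKinds.length > 0`.

```typescript
// Hypothetical, simplified plan type; the real emitters carry richer actions.
interface ContentPlan {
  kindNumber: number;
  minLength: number;
}

// Returns the lines of generated validator code for the `content` field.
function emitContentSection(contentKinds: ContentPlan[]): string[] {
  const lines: string[] = [];

  // Base shape check: always emitted, even when no kind constrains content.
  lines.push("  if (typeof event.content !== 'string') {");
  lines.push("    errors.push({ path: 'content', message: 'content must be a string' });");

  // Per-kind dispatch: emitted only when at least one kind has constraints.
  if (contentKinds.length > 0) {
    lines.push('  } else {');
    lines.push('    switch (event.kind) {');
    for (const plan of contentKinds) {
      lines.push(`      case ${plan.kindNumber}:`);
      lines.push(`        if (event.content.length < ${plan.minLength}) {`);
      lines.push("          errors.push({ path: 'content', message: 'content is too short' });");
      lines.push('        }');
      lines.push('        break;');
    }
    lines.push('    }');
  }

  lines.push('  }');
  return lines;
}
```

Calling `emitContentSection([])` still yields the type guard, so a tag-only schema set would keep validating `content`. The same split would apply to the `tags` block and to every other language emitter.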
🧹 Nitpick comments (1)
tests/emit-validators.test.ts (1)
305-308: Tighten dispatch assertion to target the call site, not just symbol presence.
`includes('validateKindTags(')` can pass on declaration alone. Consider asserting the emitted call expression from `validateEvent` for stronger regression protection.
💡 Suggested assertion update
  it('dispatches to validateKindTags', () => {
    const output = emitValidatorsFile([], [kindWithContent, kind9735]);
-   assert.ok(output.includes('validateKindTags('));
+   assert.ok(output.includes('errors.push(...validateKindTags(kind, tags));'));
  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/emit-validators.test.ts` around lines 305 - 308, Test currently checks only that the symbol validateKindTags appears in the emitted output; tighten it to assert the actual call from within validateEvent so declarations don't satisfy the test. Update the test that calls emitValidatorsFile(...) to search for the specific call expression emitted from validateEvent (e.g., the code pattern where validateEvent invokes validateKindTags) and assert that exact snippet is present rather than just includes('validateKindTags('); locate validateEvent and validateKindTags in the generated output to form the precise string to assert.
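One way to scope such an assertion to the `validateEvent` body is sketched below. `extractFunctionSource` is a hypothetical test helper, and its brace matching is deliberately naive: it assumes the generated code has no unbalanced braces inside strings or comments and no braces in the parameter list.

```typescript
// Hypothetical helper: returns the source text of a named function, from its
// `function name(` header through the matching closing brace, or null.
function extractFunctionSource(source: string, name: string): string | null {
  const start = source.indexOf(`function ${name}(`);
  if (start === -1) return null;
  const open = source.indexOf('{', start);
  if (open === -1) return null;

  let depth = 0;
  for (let i = open; i < source.length; i++) {
    if (source[i] === '{') {
      depth++;
    } else if (source[i] === '}') {
      depth--;
      if (depth === 0) return source.slice(start, i + 1);
    }
  }
  return null; // Unbalanced braces: treat as "not found".
}

// Made-up generated output in which validateEvent never calls validateKindTags.
const emitted = [
  'function validateKindTags(kind, tags) { return []; }',
  'function validateEvent(event) {',
  '  const errors = [];',
  '  if (event.kind === 13) { errors.push("x"); }',
  '  return errors;',
  '}',
].join('\n');

// The whole-file check passes on the declaration alone...
const wholeFileMatch = emitted.includes('validateKindTags(');
// ...while the scoped check sees that the call is missing from the body.
const body = extractFunctionSource(emitted, 'validateEvent') ?? '';
const bodyMatch = body.includes('validateKindTags(');
```

Here `wholeFileMatch` is true and `bodyMatch` is false, which is exactly the gap between symbol presence and call-site presence that the comment describes.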
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/emit-c.ts`:
- Around line 481-495: The current content null-check is incorrectly guarded by
contentKinds.length > 0 so notes with missing content pass validation when only
tag constraints exist; move the required-field check out of that conditional:
always call ndb_note_content(note) and if it is NULL invoke
SCHEMATA_EMIT_ERR(errs, n, max_errs, "content", "content is required"), and only
then, if contentKinds.length > 0, perform the per-kind switch using
adapter.kindExpr and renderContentActionsC(actions, helpers, '_content') to run
length/pattern/enum checks; ensure references to errs, n, max_errs,
adapter.kindExpr, contentKinds, ndb_note_content, SCHEMATA_EMIT_ERR, and
renderContentActionsC are used as described.
- Around line 791-803: The declaration of schemata_validate_event is currently
gated on constrainedKinds.length > 0 || contentPlans.size > 0 which suppresses
the shared base-field validator when all schemas are bare; change the condition
so schemata_validate_event is emitted whenever validators are being generated
(i.e., when your overall validator-generation flag/indicator is true or when any
validator plans exist), not based on constrainedKinds or contentPlans; update
the same logic that controls emission around the other place noted (the block at
the other occurrence) so both use the new "validators being generated" predicate
and reference schemata_validate_event accordingly.
- Around line 537-543: The tag-dispatch block currently forwards tags and
tag_lens directly into schemata_validate (and the generated
schemata_validate_event), which can dereference null pointers when num_tags > 0;
before calling schemata_validate (in the block that builds the "Tag dispatch"
code around variables kind, tags, tag_lens, num_tags, errs, n), add an explicit
check that if num_tags > 0 then both tags and tag_lens are non-NULL and return
an error (increment n / write to errs) if not; update schemata_validate_event
(or the generated validation entrypath) to also reject incoherent tag-array
inputs so the runtime never dereferences null pointers when num_tags > 0.
In `@src/emit-csharp.ts`:
- Around line 372-375: The generated C# validation currently narrows numeric
fields (e.g., the emitted checks that produce 'kindRaw is not int kind' and
similar for 'caRaw is not int') to 32-bit ints; update the emitter code in
emit-csharp.ts so the emitted C# accepts 64-bit integers by checking for both
int and long (e.g., test kindRaw is int || kindRaw is long) and then normalize
using Convert.ToInt64(kindRaw) (assign to a long variable) before using it;
apply the same change for created_at/caRaw emission so the generated C# uses
long for these fields and logs the same ValidationError only when the value is
not an integral numeric type.
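The 32-bit concern behind the `created_at` item can be made concrete with a little arithmetic, sketched here in TypeScript. This only illustrates the range limit; `fitsInInt32` is a hypothetical helper and not part of any emitter.

```typescript
const INT32_MAX = 2147483647; // 2^31 - 1, the value of C#'s Int32.MaxValue
const INT32_MIN = -2147483648;

// True when the value is an integer representable as a 32-bit signed int.
function fitsInInt32(value: number): boolean {
  return Number.isInteger(value) && value >= INT32_MIN && value <= INT32_MAX;
}

// The last Unix timestamp (in seconds) that a 32-bit signed integer can hold.
const lastInt32Instant = new Date(INT32_MAX * 1000).toISOString();
// → "2038-01-19T03:14:07.000Z"

// One second later no longer fits, so an `is int` check would reject it.
const oneSecondLater = INT32_MAX + 1;
const stillFits = fitsInInt32(oneSecondLater); // false
```

A validator that only accepts 32-bit integers would therefore start rejecting valid `created_at` values in January 2038, which is why widening to a 64-bit type is suggested.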
In `@src/emit-go.ts`:
- Around line 433-525: ValidateEvent currently only accepts JSON-decoder shapes
(float64, []interface{}) and skips content/tags when contentKinds or sorted are
empty; update ValidateEvent to (1) unconditionally perform core shape checks for
"content" and "tags" regardless of contentKinds/sorted, and (2) use type
switches on kindRaw/caRaw/content/tagsRaw to accept native Go numeric types
(int, int64, float64 where value is integer) and native tag shapes ([][]string,
[]string elements, or []interface{} of []interface{}), converting them into the
internal forms before validating; adjust the existing checks around variables
kindRaw, caRaw, contentRaw, tagsRaw and reuse helpers checkHex64/checkHex128 and
ValidateKindTags, ensuring numeric integer checks accept int/int64 and
float64-without-fraction and that tags validation canonicalizes [][]string from
both [][]string and []interface{} inputs.
- Around line 727-731: The current conditional prevents emitting the top-level
ValidateEvent when all kinds are bare by only calling emitEventDispatchGo if
constrainedKinds.length > 0 || contentPlans.size > 0; change it to always
generate eventDispatchCode from the original input schema set (e.g., the primary
kinds/schemaSet variable used to drive emission) by invoking emitEventDispatchGo
unconditionally (or guard on the original schema list being non-empty), and
update emitEventDispatchGo so its per-kind branches simply no-op when a kind has
no extra constraints (leaving shared base-field checks in the emitted
ValidateEvent). Ensure references include eventDispatchCode,
emitEventDispatchGo, constrainedKinds and contentPlans so reviewers can verify
the change.
In `@src/emit-kotlin.ts`:
- Around line 395-436: The generated Kotlin validator currently emits the base
presence/type checks for "content" and "tags" only when contentKinds or sorted
are non-empty; change emit-kotlin.ts so the "content" presence check, the
contentRaw is String type check and the error branches are always emitted in
validateEvent (unconditionally), and only emit the inner when (kind) { ... }
dispatch (using renderContentActionsKotlin and contentKinds) when contentKinds
is non-empty; similarly always emit the "tags" presence/type/list-of-string
checks and error branches unconditionally and only call/emit
errors.addAll(validateKindTags(kind, tags)) when sorted (constrainedKinds) is
non-empty; reference symbols to modify: contentKinds,
renderContentActionsKotlin, kind, validateKindTags, and sorted to ensure the
base guards are unconditional while keeping per-kind dispatch conditional.
In `@src/emit-php.ts`:
- Around line 448-482: The event validator schemata_validate_event() currently
only emits the 'content' presence/type checks when contentKinds is non-empty and
the 'tags' checks when sorted (constrainedKinds) is non-empty, leaving
incomplete validators for tag-only or content-only schemas; always emit the
outer guards for 'content' and 'tags' (the presence and type checks)
unconditionally, but keep the kind-specific dispatch inside the content string
branch (i.e. keep the switch populated by renderContentActionsPhp(...) and case
${kindNumber} as-is), and always emit the tags handling block that fills $tags
and calls schemata_validate_kind_tags($kind, $tags). Ensure you remove the
surrounding conditionals based on contentKinds and sorted so the base checks for
'content' and 'tags' are always present while preserving the inner kind-specific
code paths.
- Around line 396-399: The emitted PHP for the 'check_content_enum' case is not
escaping enum values or the error message; update the array element emission to
use phpString(v) for each value (instead of `'${v}'`) and pass the error message
through phpString() when constructing SchemataValidationError so values like
"O'Reilly" don't break the generated PHP—refer to the 'check_content_enum' case,
use the same phpString(v) pattern as in renderPatternCheckPhp for action.values,
and wrap the message argument to SchemataValidationError with phpString().
- Around line 378-385: The mb_strlen() calls in the check_content_min_length and
check_content_max_length branches should be pinned to UTF-8 to avoid depending
on mb_internal_encoding(); update the two occurrences (inside the case
'check_content_min_length' and case 'check_content_max_length' that build the
validation PHP) to pass 'UTF-8' as the second argument to mb_strlen($content,
'UTF-8') so length checks are consistent across environments.
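For the enum-escaping item above, a minimal sketch of what an escaper for PHP single-quoted literals might look like. `phpSingleQuoted` and `phpEnumArray` are hypothetical names; the project's own `phpString()` may be implemented differently. The sketch relies on the fact that a PHP single-quoted string only treats `\\` and `\'` as escape sequences.

```typescript
// Hypothetical helper: renders a value as a PHP single-quoted string literal.
// Backslashes are escaped first so the backslash added for quotes is not doubled.
function phpSingleQuoted(value: string): string {
  const escaped = value.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
  return `'${escaped}'`;
}

// Hypothetical helper: renders enum values as a PHP array literal.
function phpEnumArray(values: string[]): string {
  return `[${values.map((v) => phpSingleQuoted(v)).join(', ')}]`;
}

// Example: the value from the review comment no longer terminates the literal.
const literal = phpEnumArray(['open', "O'Reilly"]);
// → ['open', 'O\'Reilly']
```

The same helper would need to wrap the error message text as well, since the message interpolates the same enum values.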
In `@src/emit-python.ts`:
- Around line 562-564: The current guard around calling emitEventDispatchPython
(using constrainedKinds.length and contentPlans.size) prevents emitting the
shared validate_event when only bare kinds exist; remove that gate so
emitEventDispatchPython(constrainedKinds, contentPlans, helpers) is invoked
whenever validators are being generated (i.e., whenever other validators are
emitted), not only when per-kind constraints exist, and keep the per-kind
branches inside emitEventDispatchPython/validate_event (and any code referencing
constrainedKinds/contentPlans) as no-ops when there are no extra rules so the
shared id/pubkey/sig/created_at checks remain emitted.
- Around line 392-423: The content and tags top-level shape/type validations
should run unconditionally inside the generated validate_event logic: always
check event.get("content") for None and string type (and append ValidationError
with path="content" where appropriate), then—only if contentKinds.length >
0—perform the nested kind dispatch using the existing
renderContentActionsPython(actions, helpers) blocks for each kindNumber;
likewise, always check event.get("tags") for None, list type, and per-item
list-of-strings validation (appending errors and defaulting tags.append([]) on
bad items), then call validate_kind_tags(kind, tags) only if sorted.length > 0
for kind-specific checks. Ensure you update the code that currently gates these
checks behind the contentKinds and sorted conditionals so the general shape
validations run regardless while preserving the existing conditional
kind-dispatch behavior (refer to symbols contentKinds, sorted,
renderContentActionsPython, validate_kind_tags).
In `@src/emit-ruby.ts`:
- Around line 394-433: The generated Ruby validator currently emits the
'content' and 'tags' presence/type guards only when contentKinds or sorted are
non-empty; change emit-ruby.ts so the base checks for content and tags are
always emitted regardless of contentKinds/sorted, while keeping the per-kind
dispatch conditional. Concretely, remove or lift the surrounding if
(contentKinds.length > 0) and if (sorted.length > 0) guards so you always push
the "unless event.key?('content')" / "content_raw is_a?(String)" block (and
still only include the case kind ... when ... blocks by iterating contentKinds
and calling renderContentActionsRuby) and always push the tags block that builds
tags, validates element shapes, and calls validate_kind_tags(kind, tags). Ensure
references to contentKinds, renderContentActionsRuby, sorted, and
validate_kind_tags remain and only the outer conditionals are removed so all
emitters validate base field shapes unconditionally.
- Around line 351-353: The branch handling case 'check_content_enum' builds Ruby
string literals by hand, which breaks if a value contains a single quote; change
it to call rubyString() for each enum value when constructing vals (replace
action.values.map(v => `'${v}'`) with action.values.map(v => rubyString(v))) and
also use rubyString(...) when embedding the allowed-values message text (replace
the raw interpolation of action.values.join(', ') with a rubyString-produced
string) so both the array literal and the error message are safely escaped; keep
the push to lines (the ValidationError.new call) but use these rubyString-based
pieces.
In `@src/emit-rust.ts`:
- Around line 681-686: The current guard around emitEventDispatchRust prevents
generation of the validate_event API when both constrainedKinds and contentPlans
are empty; remove the conditional so emitEventDispatchRust(...) is called
unconditionally (always assign eventDispatchCode =
emitEventDispatchRust(constrainedKinds, contentPlans, helpers, adapter, api)) so
validate_event is always emitted; ensure any ordering expectations
(pre-registering helpers like check_hex_128) still hold after making this call
unconditional and keep the eventDispatchCode variable usage unchanged.
- Around line 398-409: The enum error message injects raw action.values into a
Rust string literal (in the 'check_content_enum' case), which can break the
generated code for values containing " or \; extract a small escaping helper
(e.g., escapeRustString) in src/emit-rust.ts that mirrors the escapes used in
the 'check_content_pattern' path (replace backslashes and double quotes), use it
when building the enum message (replace action.values.join(', ') with escaped
values joined) and also refactor the 'check_content_pattern' case to call the
same helper so both message constructions and any other places that embed
strings into Rust literals use the shared safe escaping function (refer to
renderPatternCheckRust, the 'check_content_pattern' and 'check_content_enum'
cases, and where lines.push is used to append the ValidationError message).
- Around line 388-395: In the Rust emitter cases 'check_content_min_length' and
'check_content_max_length' in src/emit-rust.ts, replace uses of content.len()
(byte count) with content.chars().count() to accurately count Unicode characters
for min/max length validation; update the generated condition expressions in the
case handlers so the emitted Rust uses content.chars().count() in place of
content.len() to match the "character(s)" message (also review the corresponding
cases in other emitters to apply the same chars/code-unit-aware counting where
needed).
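The gap between byte length and character count in the last item can be shown with a short TypeScript sketch. Both helpers are hypothetical and only illustrate the two measures: `utf8ByteLength` corresponds to what Rust's `str::len()` reports, and `codePointCount` corresponds to `chars().count()` for well-formed strings.

```typescript
// Number of bytes the string occupies when encoded as UTF-8.
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (const ch of Array.from(s)) {
    const cp = ch.codePointAt(0) as number;
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return bytes;
}

// Number of Unicode code points (Array.from splits on code points, not UTF-16 units).
function codePointCount(s: string): number {
  return Array.from(s).length;
}

const accented = 'héllo';
const accentedBytes = utf8ByteLength(accented); // 6: 'é' takes two bytes
const accentedChars = codePointCount(accented); // 5

const emoji = '😀';
const emojiBytes = utf8ByteLength(emoji); // 4
const emojiChars = codePointCount(emoji); // 1
const emojiUtf16Units = emoji.length; // 2: JavaScript's .length counts UTF-16 code units
```

With `maxLength: 5`, a byte-based check would reject `'héllo'` even though it has five characters. The emoji line also shows a third measure, UTF-16 code units, which is what a plain `.length` gives in JavaScript, so each emitter has to pick its counting unit deliberately.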
In `@src/emit-swift.ts`:
- Around line 415-449: The event-level validator currently only emits the
content presence/type checks when contentKinds.length > 0 and only emits tags
presence/type checks when sorted.length > 0, causing missing guards for
content-only or tag-only schemas; update the emitter so the "if let content =
event[\"content\"] as? String { ... } else if event[\"content\"] != nil { ... }
else { ... }" block and the "if let rawTags = event[\"tags\"] as? [[Any]] { ...
} else if event[\"tags\"] != nil { ... } else { ... }" block are always emitted
regardless of contentKinds or sorted, while keeping the inner dispatch logic
(the switch over kind generated from contentKinds and the call to
validateKindTags(kind: tags:) generated from sorted) conditional; locate the
code that builds these blocks (symbols: contentKinds, sorted,
renderContentActionsSwift, validateKindTags, and the validateEvent emission
scope) and move/duplicate the presence/type guard emission outside the
contentKinds.length and sorted.length conditionals so both guards are always
present but the per-kind dispatch remains guarded.
In `@tests/emit-python.test.ts`:
- Around line 80-89: The test uses kindWithContent (no tag constraints) but
asserts for the presence of validate_kind_tags(...) by searching the whole
output, causing a false positive because the function definition appears
elsewhere; update the test to either (a) assert that the validate_event function
body contains a call to validate_kind_tags by locating the validate_event
definition and checking its body for "validate_kind_tags(" or (b) use a fixture
KindShape with requiredTags non-empty (e.g., modify kindWithContent to include
requiredTags) so validate_event must call validate_kind_tags; apply the same
change to the sibling assertions referenced around lines 231-234 for
Ruby/Swift/Dart/Java tests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1d1c2e04-f103-4062-b27c-f54df701efc2
📒 Files selected for processing (26)
src/emit-c.ts
src/emit-cpp.ts
src/emit-csharp.ts
src/emit-dart.ts
src/emit-go.ts
src/emit-java.ts
src/emit-kotlin.ts
src/emit-php.ts
src/emit-python.ts
src/emit-ruby.ts
src/emit-rust.ts
src/emit-swift.ts
src/emit-validators.ts
tests/emit-c.test.ts
tests/emit-cpp.test.ts
tests/emit-csharp.test.ts
tests/emit-dart.test.ts
tests/emit-go.test.ts
tests/emit-java.test.ts
tests/emit-kotlin.test.ts
tests/emit-php.test.ts
tests/emit-python.test.ts
tests/emit-ruby.test.ts
tests/emit-rust.test.ts
tests/emit-swift.test.ts
tests/emit-validators.test.ts
const kindWithContent: KindShape = {
  kindNumber: 13,
  nip: 'nip-59',
  requiredTags: [],
  perItemConditionals: [],
  arrayLevelConditionals: [],
  anyOfTagGroups: [],
  contentConstraints: { minLength: 1 },
  category: 'bare',
};
This assertion is a false positive with kindWithContent.
kindWithContent has no tag constraints, so validate_event() can omit the validate_kind_tags(...) call entirely. output.includes('validate_kind_tags(') still passes because it matches the generated def validate_kind_tags(...) definition later in the file. Assert against the validate_event body specifically, or use a tag-constrained fixture. The same blind spot exists in the sibling Ruby/Swift/Dart/Java additions.
Also applies to: 231-234
- Make content/tags presence checks unconditional (not gated by constraints)
- Always emit validateEvent even when all kinds are bare
- Go: accept int/int64/float64 for kind/created_at, [][]string for tags
- C#: accept int and long for kind/created_at (Y2038-safe)
- Kotlin: accept Int and Long for kind/created_at
- Rust: content.chars().count() instead of content.len() (Unicode)
- PHP: mb_strlen with explicit UTF-8 encoding, escape enum values
- Ruby: escape enum values via rubyString()
- C: null-check tags/tag_lens before dispatch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 10
♻️ Duplicate comments (4)
src/emit-csharp.ts (1)
376-382: ⚠️ Potential issue | 🟠 Major
Keep `kind` as 64-bit through dispatch.
The new `long` branch still narrows back to `int`, so values above `Int32.MaxValue` wrap/truncate (or throw in checked builds) before the content/tag switch runs. Store `kind` as `long`; Line 630's `ValidateKindTags(...)` signature is the downstream constraint.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/emit-csharp.ts` around lines 376 - 382, The code narrows kind to int and thus loses values > Int32.MaxValue; change the local declaration and uses so kind is a 64-bit long: replace the int kind declaration with long kind, assign kind = ki when kindRaw is int (cast to long) and kind = kl when kindRaw is long, and update any downstream call sites (notably ValidateKindTags(...)) to accept a long parameter instead of int so the dispatch/switch operates on a 64-bit value without truncation.
src/emit-go.ts (1)
519-548: ⚠️ Potential issue | 🟠 Major
Accept `[]string` entries inside `[]interface{}` tags.
Native Go callers can legitimately pass `map[string]interface{}{"tags": []interface{}{[]string{"e", id}}}`. The inner assertion only accepts `[]interface{}`, so those valid tags are still reported as `tags[i] must be an array of strings`; add a `[]string` branch before the element-by-element conversion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/emit-go.ts` around lines 519 - 548, The tag parsing currently treats inner elements only as []interface{} so valid []string entries passed by Go callers are rejected; inside the loop that iterates tt (the []interface{} case for tagsRaw) update the element handling in the block for each raw (where you check if arr, ok := raw.([]interface{})) to first check for a []string type (e.g., if sarr, ok := raw.([]string) { tags = append(tags, sarr); continue }) before the existing []interface{} conversion; ensure you still append a ValidationError to errors and tags = append(tags, nil) when neither []string nor []interface{} branches match so tagsValid, errors and ValidationError usage remain consistent.
src/emit-c.ts (1)
537-548: ⚠️ Potential issue | 🟠 Major
Still missing an error for incoherent tag-array inputs.
This avoids the old null dereference, but `num_tags > 0 && (!tags || !tag_lens)` and `num_tags < 0` now just skip tag validation and return success. Emit a `"tags must be an array"` error before dispatch.
🛠️ Suggested fix
- lines.push(' if (num_tags > 0 && tags && tag_lens) {');
+ lines.push(' if (num_tags < 0 || (num_tags > 0 && (!tags || !tag_lens))) {');
+ lines.push(' SCHEMATA_EMIT_ERR(errs, n, max_errs, "tags", "tags must be an array");');
+ lines.push(' } else {');
  lines.push(' int _remaining = max_errs - n;');
  lines.push(' if (_remaining > 0) {');
  lines.push(' n += schemata_validate(kind, tags, tag_lens, num_tags, errs + n, _remaining);');
  lines.push(' }');
- lines.push(' } else if (num_tags == 0) {');
- lines.push(' int _remaining = max_errs - n;');
- lines.push(' if (_remaining > 0) {');
- lines.push(' n += schemata_validate(kind, tags, tag_lens, 0, errs + n, _remaining);');
- lines.push(' }');
  lines.push(' }');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/emit-c.ts` around lines 537 - 548, Detect incoherent tag-array inputs before the existing tag dispatch: if num_tags > 0 and (tags == NULL || tag_lens == NULL) or if num_tags < 0, produce a validation error "tags must be an array" into the errs buffer (respecting max_errs and incrementing n) instead of skipping validation; implement this check just before the current if (num_tags > 0 && tags && tag_lens) block and use the same pattern as other error writes (compute int _remaining = max_errs - n; if (_remaining > 0) write the error into errs + n and increment n) so schemata_validate is only called with coherent inputs.
src/emit-rust.ts (1)
402-410: ⚠️ Potential issue | 🟠 Major
Use one full Rust-string escaper for content error messages.
The new `msgVals` handling only escapes `\\` and `"`. Raw `\n`, `\r`, or `\t` in schema regexes or enum values will still break the emitted Rust source.
🛠️ Suggested fix
  function renderContentActionsRust(
    actions: ContentAction[],
    helpers: Set<string>,
  ): string[] {
+   const escapeRustString = (s: string) =>
+     s
+       .replace(/\\/g, '\\\\')
+       .replace(/"/g, '\\"')
+       .replace(/\n/g, '\\n')
+       .replace(/\r/g, '\\r')
+       .replace(/\t/g, '\\t');
    const lines: string[] = [];
    for (const action of actions) {
      switch (action.type) {
@@
      case 'check_content_pattern': {
        const r = renderPatternCheckRust(action.native, 'content');
        for (const h of r.helpers) helpers.add(h);
+       const msg = escapeRustString(`content must match pattern ${action.regex}`);
        lines.push(` if !(${r.expr}) {`);
-       lines.push(` errors.push(ValidationError { path: "content", message: "content must match pattern ${action.regex.replace(/\\/g, '\\\\').replace(/"/g, '\\"')}" });`);
+       lines.push(` errors.push(ValidationError { path: "content", message: "${msg}" });`);
        lines.push(' }');
        break;
      }
      case 'check_content_enum': {
        const checks = action.values.map(v => `content == ${JSON.stringify(v)}`).join(' || ');
-       const msgVals = action.values.join(', ').replace(/\\/g, '\\\\').replace(/"/g, '\\"');
+       const msg = escapeRustString(`content must be one of: ${action.values.join(', ')}`);
        lines.push(` if !(${checks}) {`);
-       lines.push(` errors.push(ValidationError { path: "content", message: "content must be one of: ${msgVals}" });`);
+       lines.push(` errors.push(ValidationError { path: "content", message: "${msg}" });`);
        lines.push(' }');
        break;
      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/emit-rust.ts` around lines 402 - 410, The emitted Rust string for error messages currently only escapes backslashes and quotes for msgVals (and regex earlier), which leaves raw newlines, carriage returns, tabs, etc., that will break generated Rust; add and use a single escaping helper (e.g., escapeRustString) and apply it wherever you build Rust string literals from schema content — specifically replace the current msgVals processing and the regex escape usage in the lines that push ValidationError messages (the lines that build the "content must be one of..." and "content must match pattern ..." messages) with calls to that helper so it escapes \, ", \n, \r, \t, and any other Rust-string-sensitive characters consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/emit-c.ts`:
- Around line 472-479: schemata_validate_event currently dereferences note via
ndb_note_id/ndb_note_pubkey/ndb_note_created_at without checking for NULL; add
an early NULL check at the top of schemata_validate_event (before any ndb_*
calls) that uses SCHEMATA_EMIT_ERR to record a "note" / "note is required" error
(so it increments n appropriately) and then returns n to avoid further
dereference; reference the schemata_validate_event function and the ndb_note_id,
ndb_note_pubkey, ndb_note_created_at calls when making the change.
- Around line 436-440: The generated checks under the 'check_content_min_length'
and 'check_content_max_length' cases use byte-length checks (strlen / .size /
len) but must count UTF-8 characters; change the emitted expressions to call
UTF-8-aware character-count helpers instead of raw byte-length functions: for C
replace strlen(${contentVar}) with something like utf8_char_count(${contentVar})
and add/ensure a helper size_t utf8_char_count(const char *s) that counts
Unicode codepoints; for C++ replace .size() with a
utf8_char_count_cpp(${contentVar}) or equivalent helper that counts UTF-8
characters (or convert to u32/u16 string and measure), and for Go replace
len(${contentVar}) with utf8.RuneCountInString(${contentVar}) and import
"unicode/utf8"; keep the rest of the emitted error call (SCHEMATA_EMIT_ERR(errs,
n, max_errs, "content", ...)) unchanged so only the length expression is swapped
to the UTF-8-aware helper.
- Around line 442-450: The error messages emitted by check_content_pattern and
check_content_enum do not fully escape special characters (e.g. \n, \r, \t,
backslash, and quotes), causing invalid/incorrect C string literals; update the
emit logic so that the pattern and enum strings produced for SCHEMATA_EMIT_ERR
are passed through a full C-string escaping routine (covering backslash,
double-quote, newline, carriage return, tab, etc.) before being inlined in
lines; modify the code paths that build those messages in check_content_pattern
(where action.regex is used and renderPatternCheckC is referenced) and
check_content_enum (where action.values.join(...) and the strcmp clauses are
built) to call a shared escape helper (or extend renderPatternCheckC helpers) so
emitted messages are syntactically correct and consistent with other emitters.
In `@src/emit-dart.ts`:
- Around line 345-357: The emitted Dart literals in the check_content_pattern
and check_content_enum cases are hand-escaped and can produce invalid Dart when
values contain quotes, backslashes, dollar signs, or newlines; replace manual
escaping with the helper dartString() to produce safe Dart string literals. In
the 'check_content_pattern' case use dartString(action.regex) for the pattern in
the ValidationError message (keep r.expr and helpers from
renderPatternCheckDart), and in the 'check_content_enum' case map action.values
through dartString(v) when building the array literal and use
dartString(action.values.join(', ')) (or individually for the message) for the
message text so all emitted strings are properly escaped. Ensure you
import/retain the dartString utility and update both places that currently build
strings with replace(...) calls.
- Around line 411-418: The generated Dart switch cases for contentKinds are
missing a terminating statement and can fall through; update the loop that emits
each case (the block that iterates over contentKinds and calls
renderContentActionsDart) to append a terminating statement such as "break;"
after the lines produced by renderContentActionsDart for each case. Concretely,
in the code that builds lines for each case (referencing contentKinds and
renderContentActionsDart), push an additional line with 'break;' (properly
indented) after renderContentActionsDart(...) so every case ends with a break
and the Dart analyzer error is resolved.
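For the escaping item above, the following is a rough sketch of what a Dart string-literal helper has to cover. It is named `dartStringSketch` to make clear it is hypothetical; the repository's real `dartString()` may quote or escape differently. The Dart-specific point is the dollar sign, which would otherwise start string interpolation in the generated code.

```typescript
// Hypothetical sketch: produce a single-quoted Dart string literal.
// Assumption: single-quoted output; the real dartString() may differ.
function dartStringSketch(s: string): string {
  const body = s
    .replace(/\\/g, "\\\\") // backslash first
    .replace(/'/g, "\\'")
    .replace(/\$/g, () => "\\$") // function form avoids "$" replacement patterns
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r")
    .replace(/\t/g, "\\t");
  return `'${body}'`;
}
```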
In `@src/emit-go.ts`:
- Around line 388-396: The generated Go validators for the cases
'check_content_min_length' and 'check_content_max_length' use len(content) which
counts bytes; update the code generation in emit-go.ts to emit
utf8.RuneCountInString(content) instead of len(content) for both cases and
ensure the generated Go file imports "unicode/utf8" (add that import into the
emitter's import list if missing) so the validator counts runes (Unicode code
points) per JSON Schema semantics.
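A minimal sketch of the emitter side of this change, assuming a hypothetical `renderGoMinLengthSketch` function and an assumed `ValidationError{Path, Message}` shape and message wording in the generated Go. Only `utf8.RuneCountInString` and the `unicode/utf8` import path are standard Go.

```typescript
// Hypothetical sketch: render a Go min-length check that counts runes
// instead of bytes, and report which import the generated file needs.
function renderGoMinLengthSketch(min: number): { lines: string[]; imports: string[] } {
  const lines = [
    `if utf8.RuneCountInString(content) < ${min} {`,
    // Struct field names and message text below are assumptions.
    `\terrors = append(errors, ValidationError{Path: "content", Message: "content must be at least ${min} characters"})`,
    `}`,
  ];
  return { lines, imports: ["unicode/utf8"] };
}
```

Returning the import alongside the lines lets the caller merge it into the file's import list only when a length check is actually emitted.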
In `@src/emit-java.ts`:
- Around line 379-387: The Java emitter currently uses content.length(), which
counts UTF-16 code units rather than Unicode code points; update the two cases
'check_content_min_length' and 'check_content_max_length' in emit-java.ts so the
generated Java uses content.codePointCount(0, content.length()) for the
comparisons (replace both occurrences of content.length() in the if conditions)
while keeping the existing ValidationError messages and surrounding generated
lines intact.
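JavaScript strings are UTF-16 sequences just like Java's, so the difference between the two counts can be illustrated in TypeScript. The sketch below is an illustration only (the function name is hypothetical); it mirrors the documented behaviour of Java's `codePointCount`, where a valid surrogate pair counts once and an unpaired surrogate counts as one code point.

```typescript
// Count Unicode code points in a UTF-16 string. A high surrogate followed
// by a low surrogate is one code point; everything else counts singly.
function codePointCountSketch(s: string): number {
  let count = 0;
  for (let i = 0; i < s.length; i++) {
    const hi = s.charCodeAt(i);
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < s.length) {
      const lo = s.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) i++; // consume the paired low surrogate
    }
    count++;
  }
  return count;
}

// "a" + U+1F600 + "b": four UTF-16 code units, three code points.
const sample = "a\uD83D\uDE00b";
const codeUnits = sample.length;
const codePoints = codePointCountSketch(sample);
```

With a `maxLength` of 3, the code-unit count would wrongly reject this string while the code-point count accepts it.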
In `@src/emit-kotlin.ts`:
- Around line 369-377: The code currently converts a Long to Int without bounds
checking in the when branch for kindRaw, which can silently wrap large Longs;
update the is Long branch (handling kindRaw -> kind) to check that kindRaw is
within Int.MIN_VALUE..Int.MAX_VALUE, and if out of range add
ValidationError("kind", "kind must be an integer") to errors and return errors
before calling kindRaw.toInt(), so only safe conversions reach the kind
variable.
- Around line 325-337: The length checks generated in the
'check_content_min_length' and 'check_content_max_length' cases use Kotlin's
content.length (UTF-16 code units) which miscounts Unicode code points; update
those emitted lines to use content.codePointCount(0, content.length) instead of
content.length so min/max validations operate on Unicode code points, leaving
the rest of the generated messages and logic unchanged; locate the string
templates in the switch cases handling 'check_content_min_length' and
'check_content_max_length' in emit-kotlin.ts and replace occurrences of
"content.length" with "content.codePointCount(0, content.length)" (no other
changes to renderPatternCheckKotlin usage).
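The bounds check in the first Kotlin item can be illustrated in TypeScript, where `| 0` truncates to the low 32 bits much as an unchecked `Long.toInt()` does. This is a sketch of the idea only; `toInt32OrNull` is a hypothetical name, and the generated Kotlin would report a `ValidationError` rather than return null.

```typescript
const INT32_MIN = -2147483648;
const INT32_MAX = 2147483647;

// Return the value if it is a whole number that fits in a signed 32-bit
// integer, otherwise null so the caller can report "kind must be an integer".
function toInt32OrNull(value: number): number | null {
  if (!isFinite(value) || Math.floor(value) !== value) return null;
  if (value < INT32_MIN || value > INT32_MAX) return null;
  return value;
}

// Unchecked narrowing keeps only the low 32 bits, so an out-of-range kind
// silently turns into a small, valid-looking one.
const wrappedKind = 4294967297 | 0; // 2^32 + 1 truncates to 1
```

Rejecting before narrowing means the kind dispatch never sees a wrapped value.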
In `@src/emit-rust.ts`:
- Around line 487-503: The generated validate_event function is using method
calls on the Event value (event.kind(), event.content(), event.tags()) but
nostr::Event exposes those as public fields; update validate_event to use field
access instead: use event.kind (and then call .as_u16() on that Kind), use
event.content (and call .as_str() if needed), and use event.tags (or &event.tags
if the callee expects a reference) when passing to validate_kind_tags; adjust
any corresponding expressions in renderContentActionsRust output and the
validate_kind_tags call to match field access on the Event struct.
---
Duplicate comments:
In `@src/emit-c.ts`:
- Around line 537-548: Detect incoherent tag-array inputs before the existing
tag dispatch: if num_tags > 0 and (tags == NULL || tag_lens == NULL) or if
num_tags < 0, produce a validation error "tags must be an array" into the errs
buffer (respecting max_errs and incrementing n) instead of skipping validation;
implement this check just before the current if (num_tags > 0 && tags &&
tag_lens) block and use the same pattern as other error writes (compute int
_remaining = max_errs - n; if (_remaining > 0) write the error into errs + n and
increment n) so schemata_validate is only called with coherent inputs.
In `@src/emit-csharp.ts`:
- Around line 376-382: The code narrows kind to int and thus loses values >
Int32.MaxValue; change the local declaration and uses so kind is a 64-bit long:
replace the int kind declaration with long kind, assign kind = ki when kindRaw
is int (cast to long) and kind = kl when kindRaw is long, and update any
downstream call sites (notably ValidateKindTags(...)) to accept a long parameter
instead of int so the dispatch/switch operates on a 64-bit value without
truncation.
In `@src/emit-go.ts`:
- Around line 519-548: The tag parsing currently treats inner elements only as
[]interface{} so valid []string entries passed by Go callers are rejected;
inside the loop that iterates tt (the []interface{} case for tagsRaw) update the
element handling in the block for each raw (where you check if arr, ok :=
raw.([]interface{})) to first check for a []string type (e.g., if sarr, ok :=
raw.([]string) { tags = append(tags, sarr); continue }) before the existing
[]interface{} conversion; ensure you still append a ValidationError to errors
and tags = append(tags, nil) when neither []string nor []interface{} branches
match so tagsValid, errors and ValidationError usage remain consistent.
In `@src/emit-rust.ts`:
- Around line 402-410: The emitted Rust string for error messages currently only
escapes backslashes and quotes for msgVals (and regex earlier), which leaves raw
newlines, carriage returns, tabs, etc., that will break generated Rust; add and
use a single escaping helper (e.g., escapeRustString) and apply it wherever you
build Rust string literals from schema content — specifically replace the
current msgVals processing and the regex escape usage in the lines that push
ValidationError messages (the lines that build the "content must be one of..."
and "content must match pattern ..." messages) with calls to that helper so it
escapes \, ", \n, \r, \t, and any other Rust-string-sensitive characters
consistently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 526133e0-11ac-4be9-88b5-e20ccfafce91
📒 Files selected for processing (15)
- src/emit-c.ts
- src/emit-cpp.ts
- src/emit-csharp.ts
- src/emit-dart.ts
- src/emit-go.ts
- src/emit-java.ts
- src/emit-kotlin.ts
- src/emit-php.ts
- src/emit-python.ts
- src/emit-ruby.ts
- src/emit-rust.ts
- src/emit-swift.ts
- src/emit-validators.ts
- tests/emit-php.test.ts
- tests/emit-rust.test.ts
✅ Files skipped from review due to trivial changes (2)
- src/emit-cpp.ts
- src/emit-python.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/emit-rust.test.ts
- tests/emit-php.test.ts
- src/emit-php.ts
      case 'check_content_pattern': {
        const r = renderPatternCheckC(action.native, contentVar);
        for (const h of r.helpers) helpers.add(h);
        lines.push(` if (!(${r.expr})) SCHEMATA_EMIT_ERR(errs, n, max_errs, "content", "content must match pattern ${action.regex.replace(/\\/g, '\\\\').replace(/"/g, '\\"')}");`);
        break;
      }
      case 'check_content_enum': {
        const checks = action.values.map(v => `strcmp(${contentVar}, ${JSON.stringify(v)}) == 0`).join(' || ');
        lines.push(` if (!(${checks})) SCHEMATA_EMIT_ERR(errs, n, max_errs, "content", "content must be one of: ${action.values.join(', ')}");`);
🧩 Analysis chain
🏁 Script executed:
# First, let's check the current state of src/emit-c.ts around lines 442-450
cat -n src/emit-c.ts | sed -n '430,460p'
Repository: nostrability/schemata-codegen
Length of output: 1679
🏁 Script executed:
# Also check how many emit files exist and if they have similar patterns
fd 'emit-.*\.ts$' src/ | head -20
Repository: nostrability/schemata-codegen
Length of output: 395
🏁 Script executed:
# Search for renderContentActionsC function to understand context better
rg -A 30 'function renderContentActionsC' src/emit-c.ts
Repository: nostrability/schemata-codegen
Length of output: 1492
🏁 Script executed:
# Search for similar enum/pattern escaping patterns in other emitters
rg -n 'check_content_enum|check_content_pattern' src/emit-*.ts
Repository: nostrability/schemata-codegen
Length of output: 1577
🏁 Script executed:
# Check how Rust emitter handles check_content_enum escaping
rg -A 10 "case 'check_content_enum'" src/emit-rust.ts
Repository: nostrability/schemata-codegen
Length of output: 559
🏁 Script executed:
# Check how Python emitter handles check_content_enum escaping
rg -A 10 "case 'check_content_enum'" src/emit-python.ts
Repository: nostrability/schemata-codegen
Length of output: 446
🏁 Script executed:
# Check how Java emitter handles check_content_enum escaping
rg -A 10 "case 'check_content_enum'" src/emit-java.ts
Repository: nostrability/schemata-codegen
Length of output: 519
🏁 Script executed:
# Check how C# emitter handles check_content_enum escaping
rg -A 10 "case 'check_content_enum'" src/emit-csharp.ts
Repository: nostrability/schemata-codegen
Length of output: 526
🏁 Script executed:
# Check how Rust emitter handles check_content_pattern escaping
rg -A 5 "case 'check_content_pattern'" src/emit-rust.ts | head -20
Repository: nostrability/schemata-codegen
Length of output: 495
🏁 Script executed:
# Check how Python emitter handles check_content_pattern escaping
rg -A 5 "case 'check_content_pattern'" src/emit-python.ts | head -20
Repository: nostrability/schemata-codegen
Length of output: 460
🏁 Script executed:
# Check all 12 emitters to see which ones use JSON.stringify for escaping in enum messages
for f in src/emit-{c,cpp,csharp,dart,go,java,kotlin,php,python,ruby,rust,swift}.ts; do
  echo "=== $f ===";
  rg -A 3 "case 'check_content_enum'" "$f" | head -6;
done
Repository: nostrability/schemata-codegen
Length of output: 4140
🏁 Script executed:
# Check if there are language-specific string escaper helpers
rg -n 'function phpString|function rubyString' src/emit-*.ts
Repository: nostrability/schemata-codegen
Length of output: 194
🏁 Script executed:
# Check what phpString does
rg -A 10 'function phpString' src/emit-php.ts
Repository: nostrability/schemata-codegen
Length of output: 382
🏁 Script executed:
# Check what rubyString does
rg -A 10 'function rubyString' src/emit-ruby.ts
Repository: nostrability/schemata-codegen
Length of output: 346
🏁 Script executed:
# Check pattern escaping in all emitters to compare with enum escaping
for f in src/emit-{c,cpp,csharp,dart,go,java,kotlin,php,python,ruby,rust,swift}.ts; do
  echo "=== $f ===";
  rg -B 1 -A 3 "check_content_pattern.*action.regex" "$f" | head -8;
done
Repository: nostrability/schemata-codegen
Length of output: 372
🏁 Script executed:
# Check pattern message escaping more directly - look at the exact lines where pattern regex is embedded
rg -n "content must match pattern" src/emit-*.ts
Repository: nostrability/schemata-codegen
Length of output: 2393
🏁 Script executed:
# For C, check the exact escaping applied to action.regex
rg -B 2 -A 2 'action.regex.replace' src/emit-c.ts
Repository: nostrability/schemata-codegen
Length of output: 400
🏁 Script executed:
# For Rust, check the exact escaping applied to action.regex
rg -B 2 -A 2 'action.regex.replace' src/emit-rust.ts
Repository: nostrability/schemata-codegen
Length of output: 404
🏁 Script executed:
# Check all emitters for how they handle action.regex in pattern messages
rg -n 'action.regex' src/emit-*.ts | grep -E '(enum|pattern)'
Repository: nostrability/schemata-codegen
Length of output: 2241
🏁 Script executed:
# More focused: extract the exact line where pattern message is constructed in each emitter
for f in src/emit-{c,cpp,csharp,dart,go,java,kotlin,php,python,ruby,rust,swift}.ts; do
  echo "=== $(basename $f) ===";
  rg -A 1 "case 'check_content_pattern'" "$f" | grep -A 4 "action.regex" | head -3;
done
Repository: nostrability/schemata-codegen
Length of output: 324
🏁 Script executed:
# Check Dart enum and pattern handling more carefully
rg -B 2 -A 5 "case 'check_content_enum'" src/emit-dart.ts
Repository: nostrability/schemata-codegen
Length of output: 460
🏁 Script executed:
# Check what 'escaped' is in Dart pattern
rg -B 10 "check_content_pattern" src/emit-dart.ts | head -20
Repository: nostrability/schemata-codegen
Length of output: 719
🏁 Script executed:
# Summary: which emitters have incomplete escaping for enum messages?
echo "=== C enum ==="
grep -A 1 "case 'check_content_enum'" src/emit-c.ts | tail -1
echo -e "\n=== C++ enum ==="
grep -A 2 "case 'check_content_enum'" src/emit-cpp.ts | tail -1
echo -e "\n=== Dart enum ==="
grep -A 2 "case 'check_content_enum'" src/emit-dart.ts | tail -1
echo -e "\n=== Rust enum ==="
grep -A 3 "case 'check_content_enum'" src/emit-rust.ts | tail -1
Repository: nostrability/schemata-codegen
Length of output: 424
🏁 Script executed:
# Check if C and C++ have the same function signature to understand if they need to stay in sync
rg -n "function renderContentActionsC|function renderContentActionsCpp" src/emit-*.ts
Repository: nostrability/schemata-codegen
Length of output: 236
🏁 Script executed:
# Verify what happens in Rust enum - does it actually escape the enum values?
rg -B 1 -A 5 "const msgVals = action.values" src/emit-rust.ts
Repository: nostrability/schemata-codegen
Length of output: 498
🏁 Script executed:
# Verify the full picture: which emitters DON'T use proper escaping for enum messages
for f in src/emit-{c,cpp,csharp,dart,go,java,kotlin,php,python,ruby,rust,swift}.ts; do
  echo "=== $(basename $f) ===";
  rg -A 3 "case 'check_content_enum'" "$f" | grep -E "(action.values.join|JSON.stringify|phpString|rubyString)" | head -1;
done
Repository: nostrability/schemata-codegen
Length of output: 1422
🏁 Script executed:
# Check if there are any tests that would catch this issue
rg -l "check_content_enum|check_content_pattern" tests/
Repository: nostrability/schemata-codegen
Length of output: 102
🏁 Script executed:
# Look at test cases to understand what schema values might contain these chars
rg -n "\\\\n|\\\\r|\\\\t" tests/*.ts 2>/dev/null | head -20
Repository: nostrability/schemata-codegen
Length of output: 1688
This is a cross-emitter C-string escaping issue affecting multiple language targets.
The review identifies a real problem in C, but the issue extends beyond C alone. Multiple emitters have incomplete escaping in content validation error messages:
- C, C++: Both `check_content_enum` and `check_content_pattern` messages lack escaping for `\n`, `\r`, `\t`. The enum message has no escaping at all; the pattern message only escapes `\\` and `"`.
- Dart: The `check_content_enum` message embeds raw `action.values.join(', ')` with no escaping.
- Rust: `check_content_enum` applies partial escaping (only `\\` and `"`), missing newlines, tabs, and carriage returns.
- PHP, Ruby: `check_content_pattern` messages use language-specific single-quote escaping, but this is incomplete for message contexts where double-quote or C-style escaping is needed.
Per coding guidelines, all 12 language emitters must stay in sync. A comprehensive fix is required across all affected targets to ensure schema values containing special characters generate valid, correctly-displayed error messages.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/emit-c.ts` around lines 442 - 450, The error messages emitted by
check_content_pattern and check_content_enum do not fully escape special
characters (e.g. \n, \r, \t, backslash, and quotes), causing invalid/incorrect C
string literals; update the emit logic so that the pattern and enum strings
produced for SCHEMATA_EMIT_ERR are passed through a full C-string escaping
routine (covering backslash, double-quote, newline, carriage return, tab, etc.)
before being inlined in lines; modify the code paths that build those messages
in check_content_pattern (where action.regex is used and renderPatternCheckC is
referenced) and check_content_enum (where action.values.join(...) and the strcmp
clauses are built) to call a shared escape helper (or extend renderPatternCheckC
helpers) so emitted messages are syntactically correct and consistent with other
emitters.
Go: inner tags may be []string (not just []interface{}) when the outer
slice is []interface{} — use type switch to accept both shapes.
C: when num_tags > 0 but tags/tag_lens pointers are NULL, emit an error
instead of silently skipping tag validation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…vent
- C: strlen → schemata_utf8_char_count helper, cStringEscape for \n\r\t, nostrdb NULL note check, num_tags < 0 guard
- C++: .size() → utf8_char_count helper
- Go: len() → utf8.RuneCountInString()
- Java: .length() → .codePointCount(0, content.length())
- Kotlin: .length → .codePointCount(), Long→Int bounds check
- C#: Long→int bounds check (MinValue/MaxValue guard)
- Dart: use dartString() for escaping, add break; in switch cases
- Rust: nostr crate field access (event.kind not event.kind()), rustStringEscape for \n\r\t in content messages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
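The commit above replaces byte-length checks with UTF-8 character counting in C and C++. The source of `schemata_utf8_char_count` is not shown here, so the following is only a sketch of one common technique for such a helper, written in TypeScript over a byte array: count every byte that is not a UTF-8 continuation byte (bit pattern `10xxxxxx`). It assumes well-formed UTF-8 input, and the function name is hypothetical.

```typescript
// Hypothetical sketch: count code points in well-formed UTF-8 by skipping
// continuation bytes, which all have the top two bits set to 10.
function utf8CharCountSketch(bytes: number[]): number {
  let count = 0;
  for (let i = 0; i < bytes.length; i++) {
    if ((bytes[i] & 0xc0) !== 0x80) count++; // lead byte or ASCII byte
  }
  return count;
}

// "a" followed by U+1F600: five bytes, two characters.
const utf8Sample = [0x61, 0xf0, 0x9f, 0x98, 0x80];
const utf8Bytes = utf8Sample.length;
const utf8Chars = utf8CharCountSketch(utf8Sample);
```

A byte-length check such as `strlen` would see 5 here, which is why a `maxLength` of 2 would be wrongly reported as exceeded.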
Summary
- Adds `validateEvent()`/`validate_event()` to all 12 language emitters, extending codegen from tag-only to full event validation
- Tag validation delegates to the existing `validateKindTags()`
Languages
| Input type | Function |
| --- | --- |
| `Record<string, unknown>` | `validateEvent()` |
| `[String: Any]` | `validateEvent(_:)` |
| `dict` | `validate_event()` |
| `map[string]interface{}` | `ValidateEvent()` |
| `Map<String, Object>` | `validateEvent()` |
| `Map<String, Any?>` | `validateEvent()` |
| `Map<String, dynamic>` | `validateEvent()` |
| `IDictionary<string, object?>` | `ValidateEvent()` |
| `SchemataEvent` struct | `validate_event()` |
| `array` | `schemata_validate_event()` |
| `Hash` | `validate_event()` |
| `ndb_note*` | `schemata_validate_event()` |
| `SchemataEvent`/`&Event`/`&Note` | `validate_event()` |
Test plan
- `npm run build` passes
- `npm test`: 975 tests, 972 pass (8 pre-existing failures from upstream schema changes, same as main)
- `--all` generates validateEvent in all 13 output files
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests