Bench/cross system webbery#35

Draft
Felipe705x wants to merge 5 commits into bench/cross-system-kuzu from bench/cross-system-webbery

Conversation

@Felipe705x
Collaborator

No description provided.

Felipe705x added 3 commits May 4, 2026 23:37
The cross-system bench was hardcoded to IC2 throughout: shell vars,
file paths, output labels, comparison-table headers. Adding any
other IC (when its toml flips to status="implemented") would have
meant editing every per-system runner plus the orchestrator.

This refactor pushes the IC number through as an explicit --ic <n>
parameter (default 2) so adding new ICs is a matter of dropping
in `bench/cross-system/<system>/ic<n>.{cypher,gql,...}` per
system and bumping the toml status. No runner code changes needed.

Plumbing:
  - run_all.sh: --ic flag, passes through to every per-system runner,
    records ic in run_info.txt.
  - gqlite/run.sh: --ic plumbed to ldbc_bench's --ic.
  - graphqlite/setup.py: SUPPORTED_ICS check; per-IC ic<n>.db path.
  - graphqlite/run.py: derives toml/cypher/db paths from --ic; reads
    params_file and expected_shape from the toml.
  - compare_results.py: extracts query label from CSV, includes
    [IC<n>] in section headers.
  - README: --ic <n> usage documented; CSV schema fixed (was
    referencing a result_shape column that doesn't exist).
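
A minimal sketch of the `--ic` plumbing on the Python side, assuming the per-system layout described above. The function names, the contents of `SUPPORTED_ICS`, and the exact file names are illustrative assumptions, not the code in `graphqlite/run.py`; the toml path is left out because its location is not stated here.

```python
import argparse
from pathlib import Path

# Assumption: only IC2 is implemented today; grow this set as tomls
# flip to status="implemented".
SUPPORTED_ICS = {2}


def parse_ic(argv):
    """Parse --ic <n> (default 2) and reject ICs that are not implemented."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--ic", type=int, default=2)
    args = parser.parse_args(argv)
    if args.ic not in SUPPORTED_ICS:
        parser.error(f"IC{args.ic} is not implemented (supported: {sorted(SUPPORTED_ICS)})")
    return args.ic


def derive_paths(system_dir, ic):
    """Derive the per-IC query and database paths for one system directory."""
    base = Path(system_dir)
    return {
        "query": base / f"ic{ic}.cypher",  # hypothetical file name pattern
        "db": base / f"ic{ic}.db",
    }


if __name__ == "__main__":
    ic = parse_ic([])  # no flag given, so the default applies
    print(derive_paths("bench/cross-system/graphqlite", ic))
```

With this shape, supporting another IC means adding its query file and extending the supported set; nothing in the path logic changes.
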
…stem

Kuzu (github.com/kuzudb/kuzu, MIT, CIDR 2023 paper from the Waterloo
DSG, vectorized columnar engine) integrated as a third system in the
cross-system bench, joining gqlite + graphqlite. End-to-end IC2 run
with all three at LDBC SF0.1, 5 iters + 2 warmup, all 15 params:

  gqlite (lazy)    median 0.20-0.41 ms
  kuzu-cypher      median 5.77-7.24 ms   (~25× slower than gqlite)
  graphqlite       median 28.45-32.70 ms (~5× slower than kuzu)

All three systems agree on row count (20) and result shape
(i,s,s,i,n/s,i) for every param row.

## On the archival caveat

Kùzu Inc. archived the GitHub repo on 2025-10-10 with v0.11.3 as the
final release. We pin to that version in `requirements.txt` — the
PyPI wheel is frozen and reproducible indefinitely. For a research-
paper-tier comparison "active maintenance" isn't a strict criterion;
"engineered, working, reproducible" is, and Kuzu satisfies all
three. The CIDR 2023 paper is cite-able. Full framing in
kuzu/DIVERGENCES.md.

## Integration choices

- Schema-first DDL (CREATE NODE/REL TABLE) with PRIMARY KEY on each
  node table — gives auto-indexed point lookups for the IC2-start
  `MATCH (p:Person {id: $personId})` without any extra DDL or the
  index-DDL acrobatics that auksys/gqlite forced us into.
- Native bulk loader: `COPY <Table> FROM '<csv>' (DELIM='|', HEADER=true)`.
  Loaded all 288K nodes + 315K edges in 0.9s.
- Multi-typed REL TABLE for `hasCreator(FROM Comment TO Person, FROM
  Post TO Person)`. COPY FROM into multi-typed REL TABLEs requires
  (FROM='X', TO='Y') hints to disambiguate the sub-table.
- IC2 query keeps the canonical "one MATCH + relationship-type-
  implicit label disjunction" shape: `MATCH (...)<-[:hasCreator]-(c)`
  with `c` unlabeled. Kuzu's multi-typed REL TABLE constrains `c` to
  Comment-or-Post automatically. Initial UNION ALL approach was
  rejected because LIMIT-after-UNION ALL only applies to the second
  branch in Kuzu's parser; details in kuzu/DIVERGENCES.md.
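
The DDL and bulk-load statements described above can be sketched as plain string builders. This is an illustration under assumptions: the column list, the CSV file names, and the helper names are hypothetical, and combining the `FROM`/`TO` hints with `DELIM`/`HEADER` in one option list is assumed rather than confirmed by this message. Executing the statements (for example through Kuzu's Python connection) is deliberately left out.

```python
def node_table_ddl(name, columns, primary_key):
    """Build a CREATE NODE TABLE statement with an explicit primary key."""
    cols = ", ".join(f"{col} {typ}" for col, typ in columns)
    return f"CREATE NODE TABLE {name}({cols}, PRIMARY KEY({primary_key}))"


def rel_table_ddl(name, endpoints):
    """Build a (possibly multi-typed) CREATE REL TABLE statement."""
    pairs = ", ".join(f"FROM {src} TO {dst}" for src, dst in endpoints)
    return f"CREATE REL TABLE {name}({pairs})"


def copy_statement(table, csv_path, endpoints=None):
    """Build a COPY ... FROM statement for a pipe-delimited CSV with a header.

    For a multi-typed REL TABLE, pass endpoints=(src, dst) so the
    FROM/TO hints select which sub-table the file loads into.
    """
    options = []
    if endpoints is not None:
        src, dst = endpoints
        options += [f"FROM='{src}'", f"TO='{dst}'"]
    options += ["DELIM='|'", "HEADER=true"]
    return f"COPY {table} FROM '{csv_path}' ({', '.join(options)})"


if __name__ == "__main__":
    print(node_table_ddl("Person", [("id", "INT64")], "id"))
    print(rel_table_ddl("hasCreator", [("Comment", "Person"), ("Post", "Person")]))
    print(copy_statement("Person", "person.csv"))
    print(copy_statement("hasCreator", "comment_hasCreator_person.csv", ("Comment", "Person")))
```
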

## Reproducibility hardening (the related cleanup)

- Added `bench/cross-system/install_python_deps.sh`: one-liner that
  runs `pip install -r requirements.txt` in every per-system subdir.
  Replaces the previous "discover each subdir's requirements.txt
  yourself" experience.
- Added `bench/cross-system/kuzu/README.md` per-system README with
  prereqs and standalone-run instructions. READMEs for the other
  systems will follow as their loaders gain doc updates.
- Top-level `bench/cross-system/README.md`:
  - new "Per-system prerequisites" subsection in the Setup section
    listing each system's install command in one table
  - Kuzu row added to the systems table
  - new explicit "Per-system data load" subsection documenting the
    one-time setup-py invocations per system

The bench now installs deps and loads in one command each:
  bash bench/cross-system/install_python_deps.sh
  bash bench/cross-system/run_all.sh --ic 2
After thorough review of the upstream README, the bison grammar
(`src/gql.y`), the lexer (`src/gql.l`), the C API
(`include/gqlite.h`), and the example tests, we rejected
webbery/gqlite as unfit for the cross-system IC2 bench.

This is not a vibes-based rejection. The original bench plan flagged
this project as "dead since April 2023" and time-boxed any attempt
at three days. After the deep dive, the rejection is grounded in
concrete blockers:

1. **DSL cannot express `LIMIT`.** `src/gql.y:115` declares a `limit`
   token; `src/gql.l:147` lexes it. But NO grammar production rule
   references it. The complete query grammar (gql.y:362-379) is
   `{query_kind [, graph_expr [, where_expr]]}` — no LIMIT, no
   ORDER BY, no projection beyond `query_kind`. The keyword is
   half-implemented and the bench's `LIMIT 20` semantics can't be
   expressed.

2. **No label-disjunction syntax.** WHERE clauses use MongoDB-style
   operators (`$lt`, `$gt`, `$and`, `$or`) on property values;
   there's no syntax for "entity belongs to label A or label B".
   IC2's `(c:Comment|Post)` would require two separate queries
   combined externally, breaking the cross-system "same logical
   shape" rule.

3. **Per-row load idiom only.** `test/movielens.cpp` (their
   reference loader for ~10K movies + ~5K tags) does one
   `gqlite_exec` per CSV row with a `char[512]` buffer. At LDBC
   SF0.1 scale (~600K entities including MVAs), this is the same
   architectural shape that made the auksys/gqlite load run for an
   indefinite number of hours.

4. **C-only API + no bench-scale tests.** No Python/Ruby/etc.
   bindings; would require a custom C++ harness. The README states
   the project's purpose is "for testing abilities in ending
   device" (IoT/edge form factors), not LDBC-scale workloads.

5. **Last commit 2023-04-08**, over 3 years dead. README labels
   the DSL as "Unstable"; CHANGELOG.md is a single line.
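
Blocker 1 rests on a checkable property: a token that is declared but never referenced by any production. A minimal sketch of that check, assuming a bison file laid out as declarations, `%%`, rules, `%%`. The toy grammar and the `unused_tokens` helper are illustrative and are not taken from `src/gql.y`.

```python
import re

IDENT = r"[A-Za-z_][A-Za-z0-9_]*"


def unused_tokens(grammar_text):
    """Return declared %token names that never appear in the rules section."""
    parts = grammar_text.split("\n%%")
    declarations = parts[0]
    rules = parts[1] if len(parts) > 1 else ""

    declared = []
    for line in declarations.splitlines():
        if line.startswith("%token"):
            # Drop an optional <type> tag, then collect the token names.
            body = re.sub(r"<[^>]*>", " ", line[len("%token"):])
            declared += re.findall(IDENT, body)

    # Every identifier in the rules section counts as a use.
    used = set(re.findall(IDENT, rules))
    return [name for name in declared if name not in used]


# Toy grammar (hypothetical): `limit` is declared but no production uses it.
TOY_GRAMMAR = """%token KW_QUERY KW_WHERE limit
%token <str> LITERAL_STRING
%%
query: KW_QUERY LITERAL_STRING
     | KW_QUERY LITERAL_STRING KW_WHERE LITERAL_STRING
     ;
%%
"""

if __name__ == "__main__":
    print(unused_tokens(TOY_GRAMMAR))
```

The check is deliberately coarse: an identifier that appears only inside action code or a comment in the rules section would still count as a use, so a hit from this script is a prompt to read the grammar, not a proof.
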

The full diagnosis with grammar-line citations is in
`bench/cross-system/webbery_gqlite/SKIPPED.md`. The top-level
README's "What gets compared" table is updated to mark this row
as discarded with a link to the breakdown.

This is documentation-only — no setup.* or run.* is provided. The
orchestrator's `run_all.sh` already detects missing per-system
runners and logs `[SKIP]` to `skipped.log`.
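
The skip behaviour amounts to a per-directory check for a runner file. A Python sketch of that logic, for illustration only: `run_all.sh` itself is shell, and the helper names and the exact `[SKIP]` message text here are assumptions.

```python
from pathlib import Path


def find_runner(system_dir):
    """Return the first run.* file in system_dir, or None if there is none."""
    matches = sorted(Path(system_dir).glob("run.*"))
    return matches[0] if matches else None


def plan_run(root, systems, skipped_log):
    """Split systems into runnable runners and logged skips."""
    runnable = []
    with open(skipped_log, "a", encoding="utf-8") as log:
        for name in systems:
            runner = find_runner(Path(root) / name)
            if runner is None:
                # A documentation-only directory lands here.
                log.write(f"[SKIP] {name}: no run.* found\n")
            else:
                runnable.append(runner)
    return runnable
```

A directory that holds only `SKIPPED.md` therefore needs no special casing: it yields no runner and one line in the skip log.
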

## On the survey-of-the-landscape this gives us

Out of five external systems evaluated in this branch family
(graphqlite, GraphLite-AI, auksys/gqlite, Kuzu, webbery/gqlite),
two are bench-runnable end-to-end:
  - graphqlite (boring SQLite extension, works)
  - Kuzu (vectorized columnar, archived but pinned)

The remaining three (GraphLite-AI, auksys/gqlite, webbery/gqlite)
all fail at the data-load stage with structurally similar issues:
custom DSL or aspirational ISO-GQL/Cypher, no real bulk-load API,
missing core SQL/Cypher features at the parser level, designed for
small-data niche workloads. This is itself a finding worth
mentioning in the paper's "Systems considered but rejected" or
"Threats to validity" subsection — the candidate landscape for
"small embedded graph DB" is sparse, and most projects in it don't
survive contact with LDBC SF0.1 (the smallest LDBC scale factor).
Felipe705x changed the base branch from main to bench/cross-system-kuzu May 5, 2026 15:06
Felipe705x added 2 commits May 5, 2026 11:14
Verified by re-reading:
- src/gql.y (the bison grammar) — every production rule
- src/gql.l (the flex lexer) — including all start-state blocks
- include/gqlite.h (the public C API)
- the .gql test files under test/vertex/ and test/query/
  (the canonical query-shape examples)
- example/main.c (the README's reference example, recopied)

Two findings worth tightening in the writeup:

1. The `limit` token IS used in test/vertex/grammar.gql:35 inside
   a `$near` operator clause: `{feature_name: {limit: 3, $near: ...}}`.
   I'd missed this. But tracing it through the grammar
   (condition_property → right_value → condition_object → OP_NEAR
   ':' '{' geometry_condition '}' → OP_GEOMETRY ':' a_vector ','
   range_comparable) shows no production accepts the `limit` token
   even there — it's either an aspirational test or relies on a
   path I couldn't find. SKIPPED.md now documents this nuance and
   notes that even if it did parse, vector-search top-k semantics
   are not the result-set LIMIT IC2 requires.

2. The `query_kind_expr` rule (gql.y:436) is precisely
   `KW_EDGE | LITERAL_STRING | a_graph_properties`. The
   single-string label restriction is now cited verbatim. No
   production accepts an array of labels.

The verdict doesn't change: webbery/gqlite cannot express IC2
faithfully, the project has been dead for 3+ years, and the other
blockers (no bulk load, C-only API, IoT-targeted scale) all stand. But the
documentation is now precise enough to survive a careful upstream
reader pointing at the test/vertex/grammar.gql edge case.
Per request to actually verify rather than just analyze: tried to
build webbery/gqlite on Windows + MSVC 2022 + bundled CMake
(3.31.6-msvc6), May 2026.

Hit FOUR sequential build failures, each requiring manual
intervention:

1. **git-describe submodule version**: shallow clone breaks
   libmdbx's CMake tag-matcher. Worked around by --unshallow +
   synthetic tag.

2. **flex/bison not on PATH for MSBuild**: README claims the
   bundled `tool/{flex,bison}.exe` "needs no dependency" but
   MSBuild doesn't honor `WORKING_DIRECTORY` for cwd-based binary
   lookup. Worked around by pre-generating parser/lexer manually
   AND prefixing `tool/` to PATH for the build invocation.

3. **libmdbx version-mismatch** in CMake-templated `version.c`:
   `MDBX_VERSION_MAJOR/MINOR` mismatch between mdbx.h (0,11) and
   the generated file (0,0) fires a preprocessor `#error`. Worked
   around by patching the generated version.c.

4. **MASM assembly file not built/linked**: project contains x86_64
   coroutine assembly stubs but CMake doesn't `enable_language(ASM_MASM)`,
   so MSBuild compiles the .cpp consumers but never assembles the
   .asm files into .objs. Linker fails with `LNK1181: cannot open
   input file 'jump_x86_64_ms_pe_masm.obj'`.

We stopped at #4. The point isn't "Windows is hard" — every other
system in this bench (kuzu, graphqlite, gqlite) builds clean from
a fresh `git clone` on the same Windows + MSVC 2022 setup. webbery's
build issues are specific to its pre-2024 CMake config and
unmaintained submodules. **Combined with the grammar-level blockers
(no LIMIT, no label disjunction), the build issues are the second-
order signal — even if all four were patched, the DSL still can't
express IC2 faithfully.**

The verdict is unchanged: discard. But the SKIPPED.md is now
empirically grounded rather than just grammar-reading. A reviewer
who clones the repo will hit the same wall.
Felipe705x force-pushed the bench/cross-system-kuzu branch from 700f47f to c25b453 May 5, 2026 21:50