Skip to content

feat(cubeorchestrator): Use Arrow format for CubeStore response format#10949

Open
ovr wants to merge 13 commits into
masterfrom
perf/query-orchestrator-arrow-results
Open

feat(cubeorchestrator): Use Arrow format for CubeStore response format#10949
ovr wants to merge 13 commits into
masterfrom
perf/query-orchestrator-arrow-results

Conversation

@ovr
Copy link
Copy Markdown
Member

@ovr ovr commented May 27, 2026

Add QueryResult::from_arrow to decode Arrow IPC stream payloads into the columnar QueryResult, and route the HttpQueryResult (Arrow) command in from_cubestore_fb to it. Arrow temporal columns map to a new dedicated DBResponsePrimitive::Timestamp variant that serializes to the existing ISO-8601 millisecond format.

@github-actions github-actions Bot added rust Pull requests that update Rust code javascript Pull requests that update Javascript code data source driver labels May 27, 2026
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 27, 2026

Claude finished @ovr's task in 4m 41s —— View job


Code Review: Arrow Format for CubeStore Response

  • Gather context and understand changes
  • Review Rust core changes (query_message_parser.rs, query_result_transform.rs)
  • Review Rust benchmarks and Cargo config
  • Review TypeScript codegen changes
  • Review CubeStore driver changes
  • Review backend-native changes
  • Post comprehensive review

Summary

Well-structured PR that adds Arrow IPC stream decoding into the existing columnar QueryResult, with a new DBResponsePrimitive::Timestamp variant for zero-copy temporal data. The implementation is clean and the test coverage is solid. I've posted inline comments on the specific issues — here's the high-level breakdown:

Issues Found

1 High-severity bug, 2 medium items, rest are low/informational.

Severity File Issue
High CubeStoreDriver.ts:91-98 this.config.responseFormat has no default value — toResponseFormat(undefined) will throw for every driver instance that doesn't explicitly set it. ConnectionConfig type also missing the field.
Medium query_message_parser.rs:296-306 All macros (push_numeric!, push_str!, etc.) use .unwrap() on downcast_ref — will panic instead of returning an error on unexpected array types.
Medium CubeStoreDriver.ts:118-125 WebSocketConnection.query() signature changed from positional args to options object — verify no external callers use the old signature.
Low query_message_parser.rs:373-378 Int64/UInt64f64 cast loses precision for values > 2^53. Consistent with existing behavior but worth noting for Arrow path where integers arrive as native types, not strings.
Low orchestrator.rs:140 TimestampFieldValue goes through Display instead of directly using TIMESTAMP_FORMAT — fragile coupling if Display impl ever diverges.

What's done well

  • DBResponsePrimitive::Timestamp fits within the existing 32-byte enum size (verified by the new test_db_response_primitive_size guard test) — no memory regression
  • Hand-written Serialize correctly replaces #[serde(untagged)] derive and handles the Timestamp → ISO-8601 string rendering with zero overhead for common variants
  • Arrow type coverage is comprehensive (all numeric types, strings, booleans, timestamps, decimals, dates) with a clear error for unsupported types
  • Backward-compatible responseFormat default in WebSocketConnection — old CubeStore ignores the field
  • Benchmarks properly mirror the existing from_js_raw_data combos for apples-to-apples comparison
  • Test coverage includes basic types, unsupported types, and full round-trip through from_cubestore_fb with Arrow payload

Testing gaps

  • No tests for Timestamp pass-through in transform_value (Arrow timestamps bypass the string-parsing path for type_ == "time")
  • No test for Decimal128/Decimal256 Arrow columns (the push_decimal! macro includes a parse::<f64> fallback to string — would be good to verify both paths)
  • No integration test exercising the full TypeScript → Rust → transform pipeline with Arrow format

Comment thread packages/cubejs-cubestore-driver/src/CubeStoreDriver.ts Outdated
Comment thread rust/cube/cubeorchestrator/src/query_message_parser.rs Outdated
Comment thread rust/cube/cubeorchestrator/src/query_message_parser.rs Outdated
Comment thread rust/cube/cubeorchestrator/src/query_message_parser.rs Outdated
Comment thread rust/cube/cubeorchestrator/src/query_result_transform.rs Outdated
Comment thread packages/cubejs-cubestore-driver/src/WebSocketConnection.ts Outdated
Comment thread packages/cubejs-backend-native/src/orchestrator.rs Outdated
Comment thread rust/cube/cubeorchestrator/src/query_result_transform.rs
Comment thread rust/cube/cubeorchestrator/src/query_message_parser.rs
Comment thread packages/cubejs-cubestore-driver/src/CubeStoreDriver.ts
@codecov
Copy link
Copy Markdown

codecov Bot commented May 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.32%. Comparing base (0d6393b) to head (a8f7995).

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #10949   +/-   ##
=======================================
  Coverage   83.32%   83.32%           
=======================================
  Files         255      255           
  Lines       77261    77261           
=======================================
  Hits        64379    64379           
  Misses      12882    12882           
Flag Coverage Δ
cubesql 83.32% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ovr ovr force-pushed the perf/query-orchestrator-arrow-results branch from 17ec119 to 38ab4f2 Compare May 28, 2026 10:19
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 28, 2026

Claude finished @ovr's task in 4m 13s —— View job


Code Review: Arrow Format for CubeStore Response (Re-review after 6 commits)

  • Gather context and review updated diff
  • Check previous review threads and resolve stale ones
  • Review Rust core changes (query_message_parser.rs, query_result_transform.rs)
  • Review TypeScript changes (CubeStoreDriver, WebSocketConnection, codegen)
  • Review backend-native orchestrator changes
  • Review benchmarks and Cargo config
  • Post comprehensive review

Summary

Good iteration on the previous review. The follow-up commits addressed the key robustness concerns: downcast macros now use ok_or_else instead of .unwrap() (commit b14653e), DBResponsePrimitive numeric variants are properly split into Int64/UInt64/Float64 (commit 8826169), and timestamp formatting is optimized with pre-computed Item slices (commit 38ab4f2). I've resolved the 5 stale review threads that were addressed.

Two new issues found in this pass:

Issues Found

Severity File Issue
Medium CubeStoreDriver.ts:121 || operator silently overrides explicit QueryResultFormat.Legacy (value 0, falsy) to Arrow. Should use ??.
Medium query_message_parser.rs:262 from_arrow is private but benches/parser.rs calls it from external benchmark code — won't compile.

What improved since last review

  • Downcast safety: All macros now use ok_or_elseParseError::ArrowError instead of .unwrap() panics
  • Numeric fidelity: Int64/UInt64/Float64 split preserves integer precision through the pipeline until the JS boundary
  • Timestamp formatting: TIMESTAMP_ITEMS const slice avoids re-parsing the format string on every cell — measurable win for large timestamp columns
  • Display consistency: Both Display and Serialize impls for Timestamp now use the same TIMESTAMP_ITEMS, eliminating the fragile coupling concern

Testing observation

transform_value only reformats DBResponsePrimitive::String values when type_ == "time". Arrow timestamps arrive as Timestamp(NaiveDateTime) and pass through unchanged (via the other => other fallback). This is correct because the Timestamp variant already serializes to ISO-8601 — but there are no tests exercising this path. Consider adding a test that passes a Timestamp value through transform_value with type_ = "time" to document this intentional bypass.


Comment thread packages/cubejs-cubestore-driver/src/CubeStoreDriver.ts Outdated
Comment thread rust/cube/cubeorchestrator/src/query_message_parser.rs Outdated
Comment thread rust/cube/cubeorchestrator/src/query_result_transform.rs
Comment thread packages/cubejs-backend-native/src/orchestrator.rs
Comment thread rust/cube/cubeorchestrator/src/query_message_parser.rs
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 28, 2026

Claude finished @ovr's task in 2m 53s —— View job


Code Review: Arrow Format for CubeStore Response (Re-review after 8 commits)

  • Gather context and review updated diff
  • Check previous review threads and resolve stale ones
  • Review Rust core changes (query_message_parser.rs, query_result_transform.rs)
  • Review TypeScript changes (CubeStoreDriver, WebSocketConnection, codegen)
  • Review backend-native orchestrator changes
  • Review benchmarks and Cargo config
  • Post comprehensive review

Summary

Good iteration. The latest commit (7eab593) fixes the ||?? bug flagged in the previous review. I've resolved 6 stale threads (the || fix, plus positive/informational comments that don't need author action). One actionable issue remains from the prior pass.

Remaining Issue

Severity File Issue
Medium query_message_parser.rs:262 from_arrow is still private but called from benches/parser.rs:187 — benchmark won't compile. Needs pub fn.

This was flagged in the previous review and is still present. I've posted an inline comment with a suggested fix.

What was resolved since last review

  • ||?? fix (commit 7eab593): responseFormat ?? QueryResultFormat.Arrow now correctly preserves an explicit QueryResultFormat.Legacy (value 0, falsy). Resolved the stale thread.
  • Resolved 5 positive/informational comment threads that didn't require author action.

Overall Assessment

The PR is in good shape. The Arrow IPC decoding path is well-implemented with comprehensive type coverage, safe downcast error handling, and proper backward compatibility. The single remaining issue (pub visibility on from_arrow) is a straightforward one-line fix.


Comment thread rust/cube/cubeorchestrator/src/query_message_parser.rs Outdated
@ovr ovr force-pushed the perf/query-orchestrator-arrow-results branch from 7eab593 to f46307d Compare May 28, 2026 10:59
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 28, 2026

Claude finished @ovr's task in 2m 31s —— View job


Code Review: Arrow Format for CubeStore Response (Re-review after sync)

  • Gather context and review updated diff
  • Check previous review threads and resolve stale ones
  • Review Rust core changes (query_message_parser.rs, query_result_transform.rs)
  • Review TypeScript changes (CubeStoreDriver, WebSocketConnection, codegen)
  • Review backend-native orchestrator changes
  • Review benchmarks and Cargo config
  • Post comprehensive review

Summary

The branch was rebased (6 commits, down from 8 in the prior review). The good improvements from earlier rounds are preserved — safe downcast handling (ok_or_else), split numeric variants (Int64/UInt64/Float64), and optimized timestamp formatting (TIMESTAMP_ITEMS). However, the ||?? fix from commit 7eab593 was lost in the force-push. One prior issue also remains unfixed.

Re-affirmed 2 prior unresolved threads that still apply (both on from_arrow visibility). All 11 previously resolved threads remain correctly resolved — changes still addressed.

Issues Found

Severity File Issue
Medium CubeStoreDriver.ts:121 || operator silently overrides explicit QueryResultFormat.Legacy (value 0, falsy) to Arrow. Should use ??. Fix was previously committed but lost in force-push.
Medium query_message_parser.rs:262 from_arrow is still private (fn) but benches/parser.rs:187 calls it from external benchmark code. Benchmarks compile as separate crates — this will fail cargo bench.

Both have inline comments with suggested fixes.

What's in good shape

  • Downcast safety: All macros use ok_or_elseParseError::ArrowError instead of .unwrap() panics
  • Numeric fidelity: Int64/UInt64/Float64 split preserves integer precision through the pipeline
  • Timestamp formatting: TIMESTAMP_ITEMS const slice avoids re-parsing the format string on every cell
  • Hand-written Serialize: Zero-overhead for common variants, correct ISO-8601 rendering for Timestamp
  • Size guard test: test_db_response_primitive_size catches accidental enum layout growth
  • Arrow type coverage: Comprehensive (all numeric, string, boolean, temporal, decimal types) with clear UnsupportedArrowType error for unsupported types
  • Test coverage: Basic types, unsupported types, and full round-trip through from_cubestore_fb with Arrow payload
  • Codegen: Auto-generated FlatBuffer bindings look correct (HttpQueryResult, HttpQueryResultArrow, QueryResultFormat)
  • Backward compatibility: WebSocketConnection defaults to Arrow; old CubeStore ignores the field gracefully
  • No breaking external callers: CubeStoreCacheDriver calls CubeStoreDriver.query() (unchanged signature), not WebSocketConnection.query() directly

Testing observation

transform_value only reformats DBResponsePrimitive::String values when type_ == "time". Arrow timestamps arrive as Timestamp(NaiveDateTime) and pass through via the other => other fallback. This is correct — the Timestamp variant already serializes to ISO-8601 — but there are no tests exercising this path. A test passing a Timestamp through transform_value with type_ = "time" would document this intentional bypass.


Comment thread packages/cubejs-cubestore-driver/src/CubeStoreDriver.ts Outdated
Comment thread rust/cube/cubeorchestrator/src/query_message_parser.rs Outdated
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 28, 2026

Claude finished @ovr's task in 3m 54s —— View job


No new issues found. All 19 prior review threads resolved. PR is ready to merge.

Full review details (9 commits, 17 files, +1263/−122)

Code Review: Arrow Format for CubeStore Response (Re-review after sync)

  • Gather context and review updated diff
  • Check previous review threads and resolve stale ones
  • Review Rust core changes (query_message_parser.rs, query_result_transform.rs)
  • Review TypeScript changes (CubeStoreDriver, WebSocketConnection, codegen)
  • Review backend-native orchestrator changes
  • Review benchmarks and Cargo config
  • Post comprehensive review

Summary

9 commits, all previously flagged issues addressed across multiple review rounds. The Arrow IPC decoding path is well-implemented with comprehensive type coverage, safe error handling, and proper backward compatibility. No regressions or new concerns in this pass.

What's verified

Area Status Details
?? fix CubeStoreDriver.ts:121 Nullish coalescing correctly preserves explicit QueryResultFormat.Legacy (value 0, falsy)
Downcast safety query_message_parser.rs:293-314 All macros use ok_or_elseParseError::ArrowError instead of .unwrap() panics
Numeric fidelity query_result_transform.rs:1157-1163 Int64/UInt64/Float64 split preserves integer precision through the pipeline until the JS boundary
Timestamp formatting query_result_transform.rs:1169-1183 TIMESTAMP_ITEMS const slice avoids re-parsing the format string on every cell — both Display and Serialize impls share it consistently
Hand-written Serialize query_result_transform.rs:1187-1201 Zero-overhead for common variants, correct ISO-8601 rendering for Timestamp via collect_str
Size guard test query_result_transform.rs:1387-1389 test_db_response_primitive_size at 32 bytes catches accidental enum layout growth
Arrow type coverage query_message_parser.rs:389-436 Comprehensive (all numeric, string, boolean, temporal, decimal types) with clear UnsupportedArrowType error for unsupported types
Test coverage query_message_parser.rs:612-789 Basic types, unsupported types (Binary), and full round-trip through from_cubestore_fb with Arrow payload
Benchmark refactor benches/parser.rs:188-229 Arrow benchmarks go through from_cubestore_fb (wrapping Arrow IPC in FlatBuffer HttpQueryResult), measuring the real end-to-end path — from_arrow being pub(crate) is correct
Codegen codegen/*.ts Auto-generated FlatBuffer bindings are correct — HttpCommand.HttpQueryResult, HttpQueryResultArrow (with required data field), HttpQueryResultData, QueryResultFormat all wire up correctly
Backward compatibility WebSocketConnection.ts:363 Defaults to Arrow; old CubeStore ignores the responseFormat field gracefully
No breaking callers CubeStoreDriver.ts:116 CubeStoreCacheDriver and CubeStoreQueueDriver call CubeStoreDriver.query() (unchanged public signature), not WebSocketConnection.query() directly

Previous threads

All 19 review threads from prior rounds are resolved and fixes confirmed present:

  • High: responseFormat default → addressed (constructor delegates to WebSocketConnection which defaults to Arrow)
  • Medium: .unwrap()ok_or_else on downcasts → addressed
  • Medium: ||?? for QueryResultFormat.Legacy → addressed
  • Medium: from_arrow visibility for benchmarks → addressed (benchmarks refactored to use from_cubestore_fb)
  • Medium: WebSocketConnection.query() signature change → verified no external callers

Testing observation

transform_value (query_result_transform.rs:178-222) only reformats DBResponsePrimitive::String values when type_ == "time". Arrow timestamps arrive as Timestamp(NaiveDateTime) and pass through via the other => other fallback at line 221. This is correct — the Timestamp variant already serializes to ISO-8601 — but there are no tests exercising this path. A test passing a Timestamp through transform_value with type_ = "time" would document this intentional bypass.

View job

@cube-js cube-js deleted a comment from claude Bot May 28, 2026
@cube-js cube-js deleted a comment from claude Bot May 28, 2026
@cube-js cube-js deleted a comment from claude Bot May 28, 2026
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 28, 2026

Claude finished @ovr's task in 3m 43s —— View job


No new issues found. All 19 prior review threads remain resolved. PR is ready to merge.

Deep review details (9 commits, 17 files, +1263/−122)

Code Review: Arrow Format for CubeStore Response (deep re-review)

  • Gather context and review updated diff
  • Check existing review threads (all 19 prior threads remain correctly resolved)
  • Deep-dive Rust core changes (query_message_parser.rs, query_result_transform.rs)
  • Deep-dive TypeScript changes (CubeStoreDriver, WebSocketConnection, codegen)
  • Deep-dive backend-native orchestrator changes
  • Audit benchmarks & Cargo dependency changes
  • Post comprehensive review

Summary

This is the deep pass. The Arrow IPC decoding path is well-implemented with comprehensive type coverage, safe error handling, sensible numeric fidelity, and proper backward compatibility. The "High"-severity concern (responseFormat default → toResponseFormat(undefined) throwing) from the very first review round was fully obsoleted by the later refactor that moved the default to a ?? at the call site, and the prior visibility/||/downcast/Display items are all addressed.

What's verified in this pass

Area Location Verdict
?? default for responseFormat CubeStoreDriver.ts:121 Correctly preserves explicit QueryResultFormat.Legacy (value 0, falsy)
toResponseFormat removal CubeStoreDriver.ts No call to a string-discriminated toResponseFormat remains; the High regression scenario from review 1 is structurally impossible now
Downcast safety query_message_parser.rs:293-301 All push_*! macros use ok_or_elseParseError::ArrowError instead of .unwrap() panics
Numeric fidelity query_result_transform.rs:1157-1163 Int64/UInt64/Float64 split preserves integer precision up to the JS boundary; as f64 cast remains the documented trade-off at JS conversion only
Timestamp formatting query_result_transform.rs:1169-1183 TIMESTAMP_ITEMS const slice shared by both Display (line 1313) and Serialize (line 1197) — single source of truth
Hand-written Serialize query_result_transform.rs:1187-1201 Zero-overhead for common variants; Timestamp rendered via collect_str (no intermediate String for the JSON path)
Hand-written Deserialize query_result_transform.rs:1205-1301 Avoids #[serde(untagged)] buffering; handles visit_i128/visit_u128 overflow with explicit error
Size guard query_result_transform.rs:1387-1389 32-byte target enforced; NaiveDateTime (12 bytes) fits within the existing payload, no regression
Arrow type coverage query_message_parser.rs:389-436 All numeric, string, boolean, temporal, decimal types; UnsupportedArrowType for anything else (Duration/Interval/Time32/Time64/List/Struct deferrable per prior thread)
Empty-stream / no-batch edge cases query_message_parser.rs:262-281 try_new accepts 0 fields/0 rows; row_count defaults to 0; no panics
from_arrow visibility query_message_parser.rs:262 pub(crate) is correct — tests live in the same crate; benchmarks route through public from_cubestore_fb
Bench refactor benches/parser.rs:188-229 Wraps Arrow IPC in HttpQueryResult FlatBuffer and measures the public end-to-end path — apples-to-apples with from_js_raw_data::parse_plus_build
Codegen wiring codegen/*.ts HttpCommand.HttpQueryResult (=4), HttpQueryResultArrow (with required data field), HttpQueryResultData union, QueryResultFormat enum all match the Rust side
WebSocketConnection.query signature WebSocketConnection.ts:294 responseFormat is required on WebSocketQueryOptions — strict contract enforced by TS; CubeStoreDriver supplies default
No external callers of WebSocketConnection.query grep Only callers are within CubeStoreDriver; CubeStoreCacheDriver/CubeStoreQueueDriver go through CubeStoreDriver.query() (unchanged public signature)
Backend-native Timestamp arm orchestrator.rs:139 value.to_string() routes through the Display impl which uses TIMESTAMP_ITEMS — format-consistent with Serialize
get_cubestore_result compatibility orchestrator.rs:336-375 Coerces all non-null primitives to strings via to_string(), so the new Timestamp variant renders correctly without special casing

Notes / non-blocking observations

These are observations only — none warrant a new inline comment, but flagging here for completeness:

  • Timestamp(_, Some(tz)) timezone dropping: value_as_datetime() returns the underlying UTC instant as NaiveDateTime, discarding the timezone hint. The absolute moment is preserved (Arrow stores instants in UTC and the timezone is presentation metadata), so wire output is correct. If CubeStore ever needs to round-trip a tz hint for display purposes, this would silently lose it — not a current concern.
  • is_last field on HttpQueryResultArrow: codegen exposes it (http-query-result-arrow.ts:40) but the Rust parser doesn't read it. Today the WebSocket handler resolves one promise per message, so multi-chunk streaming would need parallel work on both sides — out of scope for this PR.
  • transform_value Timestamp pass-through: only DBResponsePrimitive::String is reformatted when type_ == "time"; Timestamp values fall through the other => other arm and serialize via the dedicated path. Correct, but no test exercises this intentional bypass — a single assertion would document it.

Resolved threads

All 19 prior review threads remain correctly resolved against the current diff. Spot-check on a few key ones:

  • High (responseFormat default → throw): obsoleted by the ?? refactor; no toResponseFormat call remains.
  • Medium (.unwrap()ok_or_else on downcasts): present at query_message_parser.rs:293-301.
  • Medium (||?? for QueryResultFormat.Legacy): present at CubeStoreDriver.ts:121.
  • Medium (from_arrow visibility for benchmarks): benchmarks refactored to use public from_cubestore_fb; pub(crate) is correct.
  • Medium (WebSocketConnection.query() signature): no external callers; type contract enforced.

@ovr ovr force-pushed the perf/query-orchestrator-arrow-results branch from 93f3ce6 to bf39044 Compare June 2, 2026 13:15
@github-actions github-actions Bot added the cube store Issues relating to Cube Store label Jun 2, 2026
ovr and others added 9 commits June 5, 2026 15:37
Add QueryResult::from_arrow to decode Arrow IPC stream payloads into the
columnar QueryResult, and route the HttpQueryResult (Arrow) command in
from_cubestore_fb to it. Arrow temporal columns map to a new dedicated
DBResponsePrimitive::Timestamp variant that serializes to the existing
ISO-8601 millisecond format.
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
@ovr ovr force-pushed the perf/query-orchestrator-arrow-results branch from 2a5e625 to 8f0a966 Compare June 5, 2026 13:37
@ovr ovr marked this pull request as ready for review June 5, 2026 13:45
@ovr ovr requested review from a team as code owners June 5, 2026 13:46
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jun 5, 2026

Claude finished @ovr's task in 52s —— View job


No new issues found. PR is ready to merge.

Re-review after sync (10 commits, 17 files)
  • Reviewed updated diff against origin/master
  • Verified all 19 prior review threads remain resolved
  • Reviewed new commit fcc3dab (capability-gated Arrow format)
  • Reviewed Rust core, TypeScript, codegen, benchmarks, backend-native

New commit fcc3dab — "use arrow format only with new cube store"

The default responseFormat now depends on CubeStore version: Arrow only when the connected CubeStore advertises ≥ 1.6.55 (which is when Arrow + the Completed response type both landed), otherwise Legacy.

responseFormat: responseFormat ?? (
  await this.hasCapability('arrowFormat') ? QueryResultFormat.Arrow : QueryResultFormat.Legacy
),

This is a stricter backward-compat strategy than the previous "default Arrow, old CubeStore ignores it" approach. Verified:

  • ?? is preserved — explicit QueryResultFormat.Legacy (value 0) from a caller is still honored.
  • Precedence is correct: await binds tighter than ?:, and the whole expression is properly parenthesized inside the ?? right-hand side.
  • arrowFormat: '1.6.55' added to CubeStoreCapabilityMinVersion; reuses the existing hasCapability plumbing.

No new issues. Everything else from previous reviews remains in good shape (downcast safety, numeric fidelity split, TIMESTAMP_ITEMS, hand-written Serialize, size guard, Arrow type coverage, benchmark refactor through from_cubestore_fb).

• [Branch](https://github.com/cube-js/cube/tree/perf/query-orchestrator-arrow-results)

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jun 5, 2026

Claude finished @ovr's task in 40s —— View job


No new issues found. All 19 prior review threads remain resolved. PR is ready to merge.

Re-review after sync (1 new commit: `bcda3ca`)
  • Checked existing review threads (all 19 resolved)
  • Reviewed new commit bcda3ca chore: some love
  • Verified no production code changes

New commit bcda3ca — test cleanup

Pure test-quality refactor in rust/cube/cubeorchestrator/src/query_message_parser.rs (+62/−80). No production code touched.

  • Consolidated test-only use statements at the top of the tests module instead of repeating them per-test.
  • Converted assert!(result.is_ok()) / .expect(...) patterns to -> Result<(), ParseError> returning tests with the ? operator. Better failure messages on test breakage.
  • One small semantic improvement in test_compare_with_unchecked_parse: now asserts checked_result.row_count == 33_000 instead of just is_ok(), strengthening the check.
  • test_parse_with_custom_verifier_options now propagates the verifier error through ? and removes the wrapping match … Err(e) => panic!(…) arm.

No regressions or new concerns. Everything from prior reviews is still in good shape (downcast safety, numeric variant split, TIMESTAMP_ITEMS, hand-written Serialize, size guard, capability-gated Arrow default, ?? fix, etc.).

• [Branch](https://github.com/cube-js/cube/tree/perf/query-orchestrator-arrow-results)

The legacy CubeStore result set carried every value as a string, so
numbers always serialized to JSON strings. The Arrow path produces real
Int64/UInt64/Float64 primitives that serialized to JSON numbers, which
changed the wire shape and broke pre-aggregation snapshots (e.g.
"5" -> 5, "0.6666666666666666" -> 0.6666666666666666).

Render the numeric DBResponsePrimitive variants via collect_str so the
Arrow path matches the legacy string output. f64 Display is the shortest
round-tripping form, so values are stable.
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jun 5, 2026

Claude finished @ovr's task in 46s —— View job


No new issues found. PR is ready to merge.

Re-review after sync (1 new commit: `a8f7995`)
  • Reviewed updated diff against origin/master
  • Verified all 19 prior review threads remain resolved
  • Reviewed new commit a8f7995 (numeric variants → JSON strings)
  • Verified Serialize/Deserialize asymmetry is intentional and matches legacy

New commit a8f7995 — "serialize numeric query results as JSON strings"

The Serialize impl for DBResponsePrimitive::{Int64, UInt64, Float64} now goes through collect_str instead of serialize_i64/serialize_u64/serialize_f64. This restores the legacy wire shape: the FlatBuffer/string path historically delivered every value as a JSON string, and pre-aggregation snapshots + downstream consumers depend on that shape. The Arrow path was producing real JSON numbers, which would have been a breaking change in output.

Verified:

  • f64's Display impl produces the shortest round-tripping decimal (Rust uses the Grisu/Ryu algorithm), so 0.6666666666666666 stays stable across re-serialization. Good call in the commit message.
  • The new test at query_message_parser.rs:699-703 pins the behavior: serde_json::to_value of Int64(2) produces the string "2", Float64(1.5) produces "1.5", and Null stays null.
  • Round-trip asymmetry is intentional: Serialize writes "2" but Deserialize reads "2" back into DBResponsePrimitive::String("2".into()), not Int64(2). This is fine — these values are produced by the Arrow parser and consumed by JS, not round-tripped through Rust serde. It also matches what the legacy Deserialize path would have done before this PR (strings stay as strings).
  • Numeric precision concern from the very first review round is now fully obsolete: large i64/u64 values render losslessly via Display, not via the as f64 cast. The split into Int64/UInt64/Float64 variants pays off here.
  • Boolean/Null/String/Timestamp paths are unchanged.

No new issues. Everything from prior reviews remains in good shape (capability-gated Arrow default, ?? fix, downcast safety via ok_or_else, TIMESTAMP_ITEMS shared between Display/Serialize, size guard at 32 bytes, Arrow type coverage, benchmark refactor through from_cubestore_fb).

• [Branch](https://github.com/cube-js/cube/tree/perf/query-orchestrator-arrow-results)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cube store Issues relating to Cube Store data source driver javascript Pull requests that update Javascript code rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants