Axon is a Rust workspace for building a hybrid query platform with native and browser runtimes, shared query contracts, and supporting infrastructure for control-plane and UDF execution.
- `crates/query-contract` contains shared request and response types, capability flags, and fallback reasons.
- `crates/native-query-runtime` is the native execution reference runtime.
- `crates/wasm-query-runtime` is the browser-oriented runtime envelope for constrained object access and the narrow in-repo browser execution-plan path.
- `crates/delta-control-plane` contains the in-repo control-plane slice for snapshot resolution and table policy enforcement.
- `crates/wasm-http-object-store` contains the thin EPIC-04 HTTP byte-range slice for browser-safe object reads.
- `crates/query-router`, `crates/browser-sdk`, `crates/udf-abi`, and `crates/udf-host-wasi` remain scaffolds around routing, browser access, and hosted UDF execution.

cargo check --workspace
cargo test -p query-contract
cargo test -p native-query-runtime
cargo test -p wasm-query-runtime
cargo test -p wasm-http-object-store
cargo check -p wasm-query-runtime -p wasm-http-object-store -p browser-sdk --target wasm32-unknown-unknown

`crates/wasm-query-runtime` now contains the current in-repo EPIC-04 browser preflight slice:
- `BrowserRuntimeConfig` validates the constrained browser envelope before any object access is attempted, including nonzero request, execution, and snapshot-preflight timeouts for runtime-owned reads and end-to-end browser execution.
- `BrowserRuntimeConfig` also carries bounded metadata-fetch concurrency so browser snapshot bootstrap is not forced into unbounded serial I/O.
- `BrowserRuntimeSession::new(config)` constructs a runtime handle with runtime-owned HTTP client timeout policy, while `BrowserRuntimeSession::with_reader(config, reader)` remains available for injected host-side readers and tests.
- `BrowserObjectSource::from_url(url)` is the typed browser object source boundary for URL-backed access and only accepts HTTPS object URLs in production browser mode, with loopback-only plain HTTP reserved for native host-side tests.
- `MaterializedBrowserFile::new(...)` and `MaterializedBrowserSnapshot::{new,new_with_partition_column_types}(...)` provide runtime-owned constructors for already-validated object sources while preserving explicit partition typing when the shared descriptor includes it.
- `BrowserRuntimeSession::materialize_snapshot(&descriptor)` converts a shared HTTPS-only `BrowserHttpSnapshotDescriptor` into runtime-owned validated object sources while preserving file order, partition metadata, and partition column types without performing any network I/O.
- `BrowserRuntimeSession::probe(&source, range)` delegates exact range reads to `crates/wasm-http-object-store` without reimplementing HTTP logic.
- `BrowserRuntimeSession::read_parquet_footer_for_file(&file)` validates descriptor size against observed object metadata while bootstrapping raw footer bytes.
- `BrowserRuntimeSession::read_parquet_metadata_for_file(&file)` decodes strongly typed Parquet file metadata from those footer bytes, including per-file integer `BrowserParquetFieldStats` min/max/null-count summaries when footer statistics are present.
- `BootstrappedBrowserFile::new(...)` and `BootstrappedBrowserSnapshot::{new,new_with_partition_column_types}(...)` now validate size/path invariants up front, preserve bootstrapped object identity metadata when available, and expose bootstrapped state through read-only accessors instead of public mutable fields.
- `BrowserRuntimeSession::bootstrap_snapshot_metadata(&snapshot)` now buffers metadata fetches up to the configured concurrency limit and enforces a snapshot-level deadline, while `BootstrappedBrowserSnapshot::{validate_uniform_schema,summarize}` produces deterministic Parquet payload-field summaries plus sorted partition-column names, explicit partition column types, and row/byte totals without attempting browser SQL execution.
- `BrowserRuntimeSession::analyze_query_shape(&self, sql)` validates the current read-only browser SQL envelope over `axon_table` and returns a deterministic `BrowserQueryShape` without attempting execution.
- `BrowserRuntimeSession::plan_query(&self, snapshot, request)` binds a supported `QueryRequest` to a bootstrapped snapshot and returns a `BrowserPlannedQuery` candidate-file set plus `BrowserPruningSummary`, including lossless partition pruning for `=`, `IN`, `IS NULL`, and `IS NOT NULL` filters and integer footer-stat pruning for `=`, `>`, `>=`, `<`, and `<=` predicates when complete file stats are available.
- `BrowserRuntimeSession::build_execution_plan(&self, snapshot, request)` now lowers the currently accepted browser SQL subset into a typed `BrowserExecutionPlan` over the already-planned candidate-file set, including required scan columns, typed `WHERE` filters, passthrough output columns, grouped columns, and aliased aggregate measures for `AVG`, `ARRAY_AGG`, `BOOL_AND`, `BOOL_OR`, `COUNT`, `SUM`, `MIN`, and `MAX`, plus output-aligned `ORDER BY` / `LIMIT` metadata. `DISTINCT`, `HAVING`, wildcard projections, non-lossless projections, and non-output-aligned `ORDER BY` expressions remain rejected without attempting execution.
- `BrowserRuntimeSession::execute_plan(&self, snapshot, plan)` now executes the current supported execution-plan subset against materialized browser HTTP objects by reading the planned candidate Parquet files, synthesizing partition columns from per-file `partition_values`, applying typed row filters, projecting passthrough outputs, executing narrow grouped and ungrouped aggregates, and preserving output-aligned `ORDER BY` / `LIMIT`.
- `BrowserExecutionResult` now carries `QueryMetricsSummary` execution metrics including fetched bytes, wall-clock duration, touched files, and skipped files.
- Browser execution plans now carry bootstrapped object ETags for candidate files, and execution rejects bootstrap-to-read identity drift instead of reading through object replacement.
- Browser execution now fails closed to native fallback when a required partition column has no explicit type metadata or an unsupported partition type, so numeric-looking string partitions do not get reinterpreted during browser execution.
- Deferred aggregate functions such as `ARRAY_AGG`, `BOOL_AND`, and `BOOL_OR` return structured native fallback instead of overclaiming browser support.
- The runtime rejects multi-partition execution as a structured native fallback, rejects unsupported object URL schemes during source construction, rejects cloud credentials as a security policy violation, and allows plain HTTP only for loopback host-side tests.

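
The URL policy applied during source construction (HTTPS in production browser mode, plain HTTP only for loopback host-side tests, everything else rejected) can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `check_object_url`, `UrlPolicyError`, and the `allow_loopback_http` flag are hypothetical names, and the hand-rolled URL splitting is an assumption made to keep the example dependency-free.

```rust
use std::net::IpAddr;

/// Hypothetical error type for this sketch (not the crate's error taxonomy).
#[derive(Debug, PartialEq, Eq)]
pub enum UrlPolicyError {
    Malformed,
    UnsupportedScheme(String),
    PlainHttpNotLoopback,
}

/// Extracts the host from the part of a URL that follows "://".
fn host_of(rest: &str) -> Option<&str> {
    let authority = rest.split(|c: char| matches!(c, '/' | '?' | '#')).next()?;
    // Embedded userinfo is treated as malformed in this sketch.
    if authority.is_empty() || authority.contains('@') {
        return None;
    }
    if let Some(stripped) = authority.strip_prefix('[') {
        // Bracketed IPv6 literal such as "[::1]:8080".
        return stripped.split(']').next();
    }
    authority.split(':').next()
}

fn is_loopback_host(host: &str) -> bool {
    host.eq_ignore_ascii_case("localhost")
        || host.parse::<IpAddr>().map(|ip| ip.is_loopback()).unwrap_or(false)
}

/// Accepts HTTPS always, plain HTTP only for loopback hosts when explicitly
/// allowed, and rejects every other scheme.
pub fn check_object_url(url: &str, allow_loopback_http: bool) -> Result<(), UrlPolicyError> {
    let (scheme, rest) = url.split_once("://").ok_or(UrlPolicyError::Malformed)?;
    let host = host_of(rest).ok_or(UrlPolicyError::Malformed)?;
    match scheme.to_ascii_lowercase().as_str() {
        "https" => Ok(()),
        "http" if allow_loopback_http && is_loopback_host(host) => Ok(()),
        "http" => Err(UrlPolicyError::PlainHttpNotLoopback),
        other => Err(UrlPolicyError::UnsupportedScheme(other.to_string())),
    }
}
```

In this sketch a production caller would pass `allow_loopback_http = false`, so only host-side tests can ever reach the plain-HTTP branch.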
Local validation:
cargo install wasm-bindgen-cli --version 0.2.114 --locked
cargo test -p wasm-query-runtime --locked
cargo test -p wasm-query-runtime --target wasm32-unknown-unknown --locked --test wasm_smoke

This slice is intentionally small: it does not register tables with DataFusion, expose `browser-sdk`, orchestrate `query-router`, or implement any `services/query-api` behavior. The current browser output is a deterministic planning/pruning plus narrow execution-plan interpreter over the curated supported SQL corpus, not a broad browser SQL or DataFusion engine.
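
The integer footer-stat pruning that this planning path performs for `=`, `>`, `>=`, `<`, and `<=` predicates can be illustrated with a small sketch. The types `IntStats` and `Predicate` and the functions `may_match` and `prune` are hypothetical names, not the crate's API. The sketch assumes a file is skipped only when its min/max range proves that no row can match, and that a file without stats is always kept.

```rust
/// Hypothetical per-file integer column statistics (illustrative only).
#[derive(Debug, Clone, Copy)]
pub struct IntStats {
    pub min: i64,
    pub max: i64,
}

/// Hypothetical comparison predicate against an integer literal.
#[derive(Debug, Clone, Copy)]
pub enum Predicate {
    Eq(i64),
    Gt(i64),
    GtEq(i64),
    Lt(i64),
    LtEq(i64),
}

/// Returns false only when the stats prove that no row can satisfy the
/// predicate. Missing stats (None) keep the file, so pruning stays conservative.
pub fn may_match(stats: Option<IntStats>, predicate: Predicate) -> bool {
    let Some(s) = stats else { return true };
    match predicate {
        Predicate::Eq(v) => s.min <= v && v <= s.max,
        Predicate::Gt(v) => s.max > v,
        Predicate::GtEq(v) => s.max >= v,
        Predicate::Lt(v) => s.min < v,
        Predicate::LtEq(v) => s.min <= v,
    }
}

/// Returns the indices of candidate files and the number of skipped files.
pub fn prune(files: &[Option<IntStats>], predicate: Predicate) -> (Vec<usize>, usize) {
    let candidates: Vec<usize> = files
        .iter()
        .enumerate()
        .filter(|(_, stats)| may_match(**stats, predicate))
        .map(|(index, _)| index)
        .collect();
    let skipped = files.len() - candidates.len();
    (candidates, skipped)
}
```

Keeping files with missing stats is the design choice that makes this kind of pruning lossless: a file can only be dropped on positive evidence from its footer.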
crates/native-query-runtime now contains the first callable EPIC-02 slice:
- `bootstrap_table(table_uri)` opens a Delta table and validates the Sprint 1 compatibility envelope.
- `execute_query(request)` registers the table as `axon_table`, executes read-only SQL, and returns Arrow batches, execution-derived scan metrics, wall-clock duration, and optional explain output.
- `QueryRequest.snapshot_version` optionally pins execution to a specific Delta snapshot version; omitting it keeps the current latest-snapshot behavior.

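
The optional snapshot pin can be pictured as a small resolution step. The sketch below is an assumption-laden illustration, not the runtime's code: `resolve_snapshot_version` is a hypothetical helper, the `i64` version type is assumed, and trimmed or otherwise unreadable history is not modeled.

```rust
/// Hypothetical helper: picks the snapshot version to execute against.
/// `requested` mirrors an optional pin; `latest` is the newest known version.
pub fn resolve_snapshot_version(requested: Option<i64>, latest: i64) -> Result<i64, String> {
    match requested {
        // No pin: keep latest-snapshot behavior.
        None => Ok(latest),
        Some(v) if v < 0 => Err(format!("snapshot version {v} is negative")),
        Some(v) if v > latest => {
            Err(format!("snapshot version {v} is newer than latest version {latest}"))
        }
        Some(v) => Ok(v),
    }
}
```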
Local/offline validation:
cargo test -p native-query-runtime --locked

Sprint 2 tightens native metrics around the executed plan:
- `bytes_fetched` is sourced from scan-level `bytes_scanned` metrics when available.
- `files_touched` reports scanned files rather than total active snapshot files.
- `files_skipped` reports partition/file pruning outcomes when the scan path exposes them, and otherwise falls back to `active_files - files_touched`.

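
The metric derivation above, including the `active_files - files_touched` fallback, can be sketched as follows. `ScanObservations`, `MetricsSummary`, and `summarize` are hypothetical names for illustration. Modeling `bytes_fetched` as an `Option` when scan metrics are unavailable is an assumption, because the text does not say what is reported in that case.

```rust
/// Hypothetical scan-level observations collected from an executed plan.
#[derive(Debug, Default, Clone, Copy)]
pub struct ScanObservations {
    /// Scan-level bytes_scanned, when the scan exposes it.
    pub bytes_scanned: Option<u64>,
    /// Files the scan actually opened.
    pub files_scanned: u64,
    /// Pruned-file count, when the scan path exposes it.
    pub files_pruned: Option<u64>,
}

/// Hypothetical summary shape (illustrative only).
#[derive(Debug, PartialEq, Eq)]
pub struct MetricsSummary {
    pub bytes_fetched: Option<u64>,
    pub files_touched: u64,
    pub files_skipped: u64,
}

/// Derives summary metrics from the executed scan rather than from the
/// total active snapshot file count.
pub fn summarize(active_files: u64, scan: ScanObservations) -> MetricsSummary {
    let files_touched = scan.files_scanned;
    // Prefer the pruning outcome reported by the scan; otherwise fall back
    // to active_files - files_touched (saturating to avoid underflow).
    let files_skipped = scan
        .files_pruned
        .unwrap_or_else(|| active_files.saturating_sub(files_touched));
    MetricsSummary {
        bytes_fetched: scan.bytes_scanned,
        files_touched,
        files_skipped,
    }
}
```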
Offline native coverage now includes both the original unpartitioned SQL corpus and a partitioned latest-snapshot corpus that asserts pruning-visible metrics. Sprint 4 expands that local oracle coverage to:
- a 12-case latest-snapshot unpartitioned SQL corpus,
- a 10-case latest-snapshot partitioned SQL corpus with explicit scan-metric assertion flags so pruning expectations are only enforced where they are stable,
- a 4-case snapshot-version SQL corpus over the local multi-version fixture.
The local cargo test -p native-query-runtime --locked suite also carries deterministic negative-path coverage for:
- invalid table locations,
- unavailable or negative snapshot versions,
- missing local data files,
- Unix permission-denied local data files.
These local failures are the baseline oracle checks; the GCS smokes below remain optional environment-backed coverage for cloud-specific paths.
Env-gated GCS smoke validation:
AXON_GCS_TEST_TABLE_URI=gs://your-bucket/your-table \
cargo test -p native-query-runtime --locked bootstrap_table_supports_env_gated_gcs_smoke -- --exact --nocapture

Env-gated GCS query execution smoke:
AXON_GCS_TEST_TABLE_URI=gs://your-bucket/your-table \
cargo test -p native-query-runtime --locked execute_query_supports_env_gated_gcs_smoke -- --exact --nocapture

The GCS smoke path assumes standard Google ADC is already available in the shell or runner environment, and the configured table should be non-empty so the query smoke can assert `LIMIT 1` execution.
GitHub Actions uses the same command behind an explicit google-github-actions/auth step and requires the AXON_GCP_CREDENTIALS_JSON secret when any env-gated GCS fixture is configured.
Env-gated partitioned GCS pruning smoke:
AXON_GCS_TEST_PARTITIONED_TABLE_URI=gs://your-bucket/your-partitioned-table \
cargo test -p native-query-runtime --locked execute_query_supports_env_gated_partitioned_gcs_pruning_smoke -- --exact --nocapture

Env-gated partitioned GCS snapshot-version smoke:
AXON_GCS_TEST_PARTITIONED_TABLE_URI=gs://your-bucket/your-partitioned-table \
AXON_GCS_TEST_PARTITIONED_TABLE_SNAPSHOT_VERSION=1 \
cargo test -p native-query-runtime --locked execute_query_supports_env_gated_partitioned_gcs_snapshot_version_smoke -- --exact --nocapture

The partitioned GCS fixture contract is intentionally narrow:
- the table must be partitioned by a `category` column,
- the latest snapshot must include at least one row in the `category = 'C'` partition,
- the pinned historical snapshot version must be readable and return a different `COUNT(*)` result than latest,
- the latest pruning query should visibly skip at least one file so the smoke can assert `files_skipped > 0`.

Env-gated negative GCS smokes:
AXON_GCS_TEST_FORBIDDEN_TABLE_URI=gs://your-bucket/forbidden-table \
cargo test -p native-query-runtime --locked bootstrap_table_rejects_env_gated_forbidden_gcs_smoke -- --exact --nocapture
AXON_GCS_TEST_NOT_FOUND_TABLE_URI=gs://your-bucket/missing-table \
cargo test -p native-query-runtime --locked bootstrap_table_rejects_env_gated_not_found_gcs_smoke -- --exact --nocapture
AXON_GCS_TEST_STALE_HISTORY_TABLE_URI=gs://your-bucket/history-trimmed-table \
AXON_GCS_TEST_STALE_HISTORY_SNAPSHOT_VERSION=1 \
cargo test -p native-query-runtime --locked execute_query_rejects_env_gated_stale_history_gcs_smoke -- --exact --nocapture
AXON_GCS_TEST_MISSING_OBJECT_TABLE_URI=gs://your-bucket/missing-object-table \
cargo test -p native-query-runtime --locked execute_query_rejects_env_gated_missing_object_gcs_smoke -- --exact --nocapture

Negative fixture contract:
- `AXON_GCS_TEST_FORBIDDEN_TABLE_URI` must point at a table path that exists but returns `403` or equivalent access denial for the runner identity.
- `AXON_GCS_TEST_NOT_FOUND_TABLE_URI` must point at a table path that returns `404` or equivalent not-found behavior during bootstrap.
- `AXON_GCS_TEST_STALE_HISTORY_TABLE_URI` and `AXON_GCS_TEST_STALE_HISTORY_SNAPSHOT_VERSION` must be configured together, and the table must have a readable latest snapshot whose configured historical version is no longer available.
- `AXON_GCS_TEST_MISSING_OBJECT_TABLE_URI` must point at a table whose log is readable but whose current snapshot references at least one missing data object; the smoke issues a full-table aggregate to force every current file to be opened.

Fixture provisioning, IAM policy, and CI variable population for these negative smokes remain external dependencies outside this repository. Among the negative GCS fixtures, only the paired stale-history env vars are hard-validated in CI; the single-variable negative fixture URIs remain independently optional and simply skip when unset.
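
The gating rules described above (single-variable fixtures skip when unset, while the stale-history pair must be configured together) can be sketched as below. This is a hypothetical illustration rather than the repository's test harness: `Fixture`, `single_fixture`, and `paired_fixture` are invented names, treating an empty value as unset is an assumption, and the environment lookup is injected as a closure so the logic can be exercised without mutating process state.

```rust
/// Outcome of resolving an env-gated fixture (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Fixture<T> {
    /// Nothing configured: the smoke should skip quietly.
    Skip,
    /// Fully configured: the smoke should run with this value.
    Run(T),
}

/// Reads a variable through the injected lookup, treating blank as unset.
fn non_empty(lookup: &dyn Fn(&str) -> Option<String>, key: &str) -> Option<String> {
    lookup(key).map(|v| v.trim().to_string()).filter(|v| !v.is_empty())
}

/// Independently optional fixture: unset simply means skip.
pub fn single_fixture(lookup: &dyn Fn(&str) -> Option<String>, key: &str) -> Fixture<String> {
    match non_empty(lookup, key) {
        Some(uri) => Fixture::Run(uri),
        None => Fixture::Skip,
    }
}

/// Paired fixture: both unset skips, both set runs, one without the other
/// is a configuration error.
pub fn paired_fixture(
    lookup: &dyn Fn(&str) -> Option<String>,
    uri_key: &str,
    version_key: &str,
) -> Result<Fixture<(String, i64)>, String> {
    match (non_empty(lookup, uri_key), non_empty(lookup, version_key)) {
        (None, None) => Ok(Fixture::Skip),
        (Some(uri), Some(raw)) => raw
            .parse::<i64>()
            .map(|version| Fixture::Run((uri, version)))
            .map_err(|e| format!("{version_key} is not an integer: {e}")),
        (Some(_), None) => Err(format!("{uri_key} is set but {version_key} is missing")),
        (None, Some(_)) => Err(format!("{version_key} is set but {uri_key} is missing")),
    }
}
```

In a real test the lookup would likely be `|key: &str| std::env::var(key).ok()`.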
crates/delta-control-plane now contains the first in-repo EPIC-03 slice:
- `resolve_snapshot(request)` validates the table locator, resolves the latest or explicit historical Delta snapshot, and returns a metadata-only descriptor.
- `resolve_snapshot_with_policy(request, policy)` applies exact-match per-table allow/deny rules after URI normalization and before snapshot I/O.
- `attach_browser_http_urls(resolved_snapshot, object_urls_by_path)` converts a resolved metadata-only snapshot into a browser HTTP descriptor once a trusted caller supplies exact per-file URLs.
- `SnapshotAccessPolicy` canonicalizes equivalent locators so raw local paths, `file://` URLs, remote bucket/root variants, redundant-slash variants, whitespace variants, and trailing-slash variants cannot bypass table policy.
- `SnapshotResolutionRequest` carries `table_uri` plus an optional `snapshot_version`.
- `ResolvedSnapshotDescriptor` returns the normalized `table_uri`, the concrete resolved `snapshot_version`, explicit `partition_column_types`, and deterministically ordered active file metadata as `ResolvedFileDescriptor` entries.
- `BrowserHttpSnapshotDescriptor` and `BrowserHttpFileDescriptor` provide the shared in-repo browser-facing contract for explicit per-file HTTPS access while preserving snapshot-level partition column typing for browser execution.

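
To make the canonicalization idea concrete, here is a minimal sketch of normalizing equivalent locators before an exact-match policy check. It is not the crate's `SnapshotAccessPolicy`: `canonicalize_locator` and `DenyList` are hypothetical, only a deny rule is modeled, relative local paths are left unresolved, and `file://` URLs with a host component are not handled.

```rust
use std::collections::HashSet;

/// Hypothetical canonical form: trimmed, lowercase scheme, no redundant or
/// trailing slashes, and absolute local paths expressed as file:/// URLs.
pub fn canonicalize_locator(raw: &str) -> String {
    let trimmed = raw.trim();
    let (scheme, rest) = match trimmed.split_once("://") {
        Some((s, r)) => (Some(s.to_ascii_lowercase()), r),
        None => (None, trimmed),
    };
    let absolute = rest.starts_with('/');
    // Dropping empty segments removes redundant and trailing slashes.
    let joined = rest
        .split('/')
        .filter(|segment| !segment.is_empty())
        .collect::<Vec<_>>()
        .join("/");
    match scheme.as_deref() {
        // Relative local path: left unresolved in this sketch.
        None if !absolute => joined,
        // Raw absolute paths and file:// URLs share one canonical form.
        None | Some("file") => format!("file:///{joined}"),
        Some(s) => format!("{s}://{joined}"),
    }
}

/// Hypothetical exact-match deny list keyed by canonical locator.
pub struct DenyList {
    denied: HashSet<String>,
}

impl DenyList {
    pub fn new<'a>(raw: impl IntoIterator<Item = &'a str>) -> Self {
        Self {
            denied: raw.into_iter().map(canonicalize_locator).collect(),
        }
    }

    /// Canonicalizes the candidate before matching, so spelling variants of
    /// a denied table cannot slip past the rule.
    pub fn is_denied(&self, raw: &str) -> bool {
        self.denied.contains(&canonicalize_locator(raw))
    }
}
```

Canonicalizing both the rule and the candidate is what lets the comparison remain an exact string match.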
This slice still does not mint signed URLs or proxy endpoints. Instead, it now defines and validates the descriptor seam that future trusted service code will use after it generates exact per-file browser-safe URLs. Tokens, credentials, audit fields, TTL, request correlation, and CORS/origin behavior remain out of repo.
Local validation:
cargo test -p delta-runtime-support --locked
cargo test -p delta-control-plane --locked

Cross-crate handoff coverage in `crates/delta-control-plane/tests` checks the resolved `table_uri` / `snapshot_version` pair against `crates/native-query-runtime`, validates the descriptor's active-file metadata and `partition_column_types` against the local fixture without changing `QueryRequest`, proves browser HTTP URL attachment preserves file order and metadata, proves invalid or duplicate browser URL inputs fail deterministically without leaking query strings, and confirms the resulting HTTPS descriptors materialize cleanly into `crates/wasm-query-runtime` runtime-owned object sources.
Additional cross-crate browser-preflight coverage resolves real local Delta snapshots, serves their Parquet files over loopback HTTP in host-side tests, and bootstraps runtime-owned Parquet metadata and snapshot summaries through `crates/wasm-query-runtime`. It proves that the resulting `file_count`, `snapshot_version`, `total_bytes`, `total_rows`, integer footer stats, and curated browser-planning candidate-file counts remain aligned with the resolved snapshot descriptor and the native `COUNT(*)` / `files_touched` / `files_skipped` oracle. It also asserts typed browser execution-plan shape over the curated supported-browser SQL corpus, and it executes that curated non-aggregate and aggregate corpus with normalized native-result parity across mixed-case integer and numeric-looking string partitions. Explicit native-only divergence checks and fail-closed identity-drift coverage stay in place so browser-envelope drift is caught in CI.
Authenticated HTTP service work remains out of repo: there is still no services/query-api directory here, so signed URL issuance, proxy reads, audit logging, request correlation, and CORS/origin validation remain external blockers rather than shipped repository scope.
crates/wasm-http-object-store now contains the thin in-repo EPIC-04 opening slice:
- `HttpByteRange` models full, bounded, from-offset, and suffix reads without introducing signing or proxy assumptions.
- `HttpRangeReader::with_client(client)` allows callers to inject a preconfigured `reqwest::Client` for timeout or redirect policy control.
- `HttpRangeReader::read_range(url, range)` performs exact HTTP byte-range requests and returns `bytes::Bytes` plus `HttpObjectMetadata` without an extra payload copy.
- Returned metadata and error messages redact URL query strings and fragments so signed URL secrets do not leak past the transport boundary.
- Deterministic local HTTP tests cover footer-style reads plus `401`, `403`, `404`, `416`, and malformed partial-response handling.
- The crate maps transport failures to `ExecutionFailed`, auth failures to `AccessDenied`, and range/protocol failures to `ObjectStoreProtocol` using the existing shared query error taxonomy.

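
The four read shapes listed above map naturally onto HTTP `Range` header values. The sketch below shows one way to express that mapping. `ByteRange` and `range_header` are hypothetical stand-ins, not the crate's `HttpByteRange`, and rejecting zero-length ranges is an assumption of this example.

```rust
/// Hypothetical model of the four read shapes (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ByteRange {
    /// Whole object: no Range header is sent.
    Full,
    /// `length` bytes starting at `offset`.
    Bounded { offset: u64, length: u64 },
    /// Everything from `offset` to the end of the object.
    FromOffset { offset: u64 },
    /// The last `length` bytes of the object.
    Suffix { length: u64 },
}

/// Formats the Range header value, or None for a full read.
/// HTTP byte ranges are inclusive, so a bounded read ends at offset + length - 1.
pub fn range_header(range: ByteRange) -> Result<Option<String>, String> {
    match range {
        ByteRange::Full => Ok(None),
        ByteRange::Bounded { length: 0, .. } | ByteRange::Suffix { length: 0 } => {
            Err("zero-length range".into())
        }
        ByteRange::Bounded { offset, length } => {
            let last = offset
                .checked_add(length - 1)
                .ok_or("range end overflows u64")?;
            Ok(Some(format!("bytes={offset}-{last}")))
        }
        ByteRange::FromOffset { offset } => Ok(Some(format!("bytes={offset}-"))),
        ByteRange::Suffix { length } => Ok(Some(format!("bytes=-{length}"))),
    }
}
```

A footer-style read would typically use the suffix form: a Parquet file ends with a 4-byte footer length followed by the `PAR1` magic, so `Suffix { length: 8 }` is enough to learn how many more trailing bytes to request.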
Local validation:
cargo test -p wasm-http-object-store --locked
cargo check -p wasm-query-runtime -p wasm-http-object-store -p browser-sdk --target wasm32-unknown-unknown --locked

This slice is intentionally small: it does not register tables with DataFusion, execute browser SQL, expose a browser SDK surface, or implement any `services/query-api` behavior. Signed URL issuance, read-proxy mode, audit logging, request correlation, and production-shape CORS/origin validation remain external blockers outside this repository.

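
The redaction behavior this crate applies to returned metadata and error messages (dropping URL query strings and fragments) can be approximated in a few lines. `redact_url` is a hypothetical helper written for illustration, and the crate may well implement redaction differently.

```rust
/// Hypothetical helper: returns the URL up to, but excluding, the first
/// '?' or '#', so signed-URL parameters never reach logs or error text.
pub fn redact_url(url: &str) -> &str {
    let end = url
        .find(|c: char| c == '?' || c == '#')
        .unwrap_or(url.len());
    &url[..end]
}
```

Because the helper returns a borrowed slice, redaction in this sketch adds no allocation on the error path.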
- `crates/` contains the Rust workspace packages.
- `tests/conformance/` contains scaffold checks plus native SQL corpora whose partition-pruning expectations now serve as the local oracle for narrow browser-planning parity coverage.
- `tests/perf/` contains performance test scaffolding.
- `tests/security/` contains security notes and will grow into service-level secret/CORS coverage once `services/query-api` exists.
- `.github/workflows/ci.yml` contains the CI configuration.