Explore SlateDB as a storage backend #163

Closed
em3s wants to merge 19 commits into next/java25 from next/slatedb

Conversation

@em3s em3s commented Feb 3, 2026

Summary

Experimental integration to explore the feasibility of SlateDB as an alternative storage layer.

Closes #155

Stages

Stage 0

Stage 1

Stage 2

Stage 3

  • Use upstream slatedb/slatedb#1338 ("Add merge operator support to slatedb-java").
  • Open SlateDB via SlateDb.builder().withMergeOperator() with an increment merge operator.
  • Replace non-atomic read-put in BatchOperation.Increment with batch.merge().
  • Replace read-compute-put in SlateDbHashLabel.incrby() with db.merge().
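As a rough illustration of what an increment merge operator computes, here is a minimal sketch that does not use the slatedb-java API (the exact shape of withMergeOperator() is not shown in this PR). The class name IncrementMergeSketch and its methods are hypothetical, and the 8-byte big-endian counter encoding is an assumption.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not the slatedb-java API: models an "increment" merge
// function over counters stored as 8-byte big-endian longs (assumed encoding).
final class IncrementMergeSketch {
    private IncrementMergeSketch() {}

    // Encode a long as 8 bytes (ByteBuffer defaults to big-endian).
    static byte[] toBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Decode the first 8 bytes of the array as a long.
    static long toLong(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    // Combine the stored value (null when the key is absent) with a delta.
    // The store applies this itself, so callers never read before writing.
    static byte[] merge(byte[] existing, byte[] operand) {
        long base = (existing == null) ? 0L : toLong(existing);
        return toBytes(base + toLong(operand));
    }
}
```

Because the caller only submits a delta, two writers that each merge +1 cannot overwrite each other's result the way two independent read-then-put sequences can.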

Stage 4 (current)

How to Test

./gradlew :engine:build

To run SlateDB datastore compatibility tests (opt-in):

SLATEDB_TEST=true ./gradlew :engine:test --tests "*SlateDBDatastoreCompatibilityTest*"

Notes

  • HBase mini cluster tests are automatically skipped on Java 18+ because Hadoop calls Subject.getSubject(AccessControlContext), which is deprecated for removal and throws UnsupportedOperationException on recent JDKs.
  • SlateDB operations are serialized through a single-threaded scheduler (SlateDbScheduler). Concurrent reads could theoretically be parallelized, but the C library's Tokio runtime model makes this risky without further investigation.
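The Java 18+ skip from the first note could be expressed as a simple version gate. This is only a sketch: the class and method names are hypothetical, and how the PR actually wires the skip into the test framework is not shown here.

```java
// Hypothetical gate, not the PR's code: decides whether the HBase mini
// cluster tests should run on a given Java feature release.
final class JavaVersionGate {
    private JavaVersionGate() {}

    // Pure function so the rule can be checked for any feature version.
    static boolean hbaseMiniClusterSupported(int featureVersion) {
        return featureVersion < 18;
    }

    // Convenience overload for the JVM that is currently running.
    static boolean hbaseMiniClusterSupported() {
        return hbaseMiniClusterSupported(Runtime.version().feature());
    }
}
```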

em3s and others added 6 commits February 3, 2026 23:38
Implements StorageOperations adapter for SlateDB backend.

- 19 tests pass (get, scan, put, delete, increment, batch)
- 7 tests skipped (checkAndMutate not supported by SlateDB)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@em3s em3s self-assigned this Feb 6, 2026
@em3s em3s changed the title feat: integrate SlateDB storage Integrate SlateDB storage Feb 12, 2026
@em3s em3s changed the base branch from main to next/java25 February 25, 2026 01:18
Replace manual native library path management with JAR-bundled native
library from slatedb/slatedb#1329. NativeLibraryLoader automatically
extracts and loads the platform-specific library from classpath resources.

- Remove SlateDb.loadLibrary(path) calls (API removed in upstream PR)
- Use SlateDb.initLogging(SlateDbConfig.LogLevel.INFO) for initialization
- Remove libraryPath from SlateDbOptions, SlateDbConnections, tests, and config
- Remove working directory hack in server bootRun task

Tested: all 4 SlateDB test suites pass (SlateDbTableTest, SlateDbStorageTest,
SlateDbHashLabelTest, SlateDbIndexedLabelTest)
em3s added 4 commits February 25, 2026 10:28
Out-of-scope change (import reordering, expression body) that was
unrelated to SlateDB integration.
- Restore MathOps.kt and math_ops.c (FFI examples belong to java25 branch)
- Remove build-slatedb.sh (obsolete with JAR-bundled native library)
- Restore .gitignore (slatedb build artifacts no longer relevant)
- Restore engine/build.gradle.kts (remove import reorder and broken
  native/lib/slatedb.jar file dependency)
- Restore server/build.gradle.kts (Java 25 javaLauncher belongs in
  next/java25 branch)
Add native/build-slatedb-java.sh to clone, build, and publish
slatedb-java to Maven local since it is not on Maven Central.
Declare the dependency in engine and add mavenLocal() to repo
resolution.
@em3s em3s changed the title Integrate SlateDB storage Explore SlateDB as a storage backend Feb 25, 2026
@em3s em3s force-pushed the next/slatedb branch 2 times, most recently from 2b306fc to e6773a5 Compare February 26, 2026 05:35
- Add merge() to SlateDbTable interface and implementation
- Add toSlateBytes()/toLong() helpers to eliminate duplicate ByteBuffer code
- Define incrementMergeOperator once, reuse in connections and tests
- Replace non-atomic read-modify-write with db.merge() in SlateDbHashLabel.incrby()
- Use batch.merge() for BatchOperation.Increment in batch writes
- Switch SlateDbConnections from SlateDb.open() to builder pattern with merge operator
- Update build script to use java-merge-op branch (slatedb PR #1338)
- Add merge operator tests for degree counting use case
em3s added 2 commits March 3, 2026 17:17
The slatedb-java library is now published to Maven Central as
io.slatedb:slatedb:0.11.0 with bundled native libraries. No local
build is required anymore.

Changes:
- Switch dependency from 0.1.0-SNAPSHOT to 0.11.0 (Maven Central)
- Remove mavenLocal() repository added for SNAPSHOT resolution
- Configure org.gradle.jvm.version=25 on resolvable configurations so
  Gradle accepts the artifact's JVM 24+ requirement
- Introduce SlateDbScheduler singleton to serialize native FFI calls;
  the C library's global Tokio runtime does not support concurrent
  block_on from multiple threads
- Delete native/build-slatedb-java.sh (local build no longer needed)

Ref: slatedb/slatedb#368 (comment)
- Add ByteBuddy experimental flag so BlockHound works on Java 25
- Skip HBase mini cluster tests on Java 18+: Subject.getSubject() was
  removed in Java 18 (Hadoop UserGroupInformation incompatibility)
em3s added 4 commits March 3, 2026 17:22
MathOps.kt formatting, HBase Java 18+ skip, and BlockHound/Hadoop JVM
args belong in next/java25, not next/slatedb. Revert those to the
next/java25 baseline so this branch only contains SlateDB-specific changes.
The JVM version attribute fix belongs in next/java25 as a general
Java 25 environment fix, not in next/slatedb. Also remove stale
.gitignore entries for local native build artifacts (no longer needed
since SlateDB is pulled from Maven Central).
Resolve conflict in GraphFixtures.kt: take java25 version which adds
configBuilder parameter and BlockHound MockHTable allowlist.
Two bugs fixed in SlateDbConnections/SlateDbScheduler:

1. Use-after-close race: closeConnections() was running db.close() on
   boundedElastic while pending FFI ops (delete, merge) were still
   queued on slatedb-worker. Moving close to SlateDbScheduler.INSTANCE
   ensures it is enqueued after all pending ops, preventing the
   ClosedException that corrupted the Tokio runtime state and caused
   SIGABRT on the next test.

2. BlockHound conflict: Schedulers.newSingle() marks its thread as
   Reactor NonBlocking, which BlockHound monitors for blocking calls.
   The intentional FFI block_on calls triggered instrumentation
   conflicts on Java 25. Switching to Schedulers.fromExecutorService
   with a plain single-thread executor avoids the NonBlocking marker
   while preserving serialization guarantees.
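The ordering argument behind the first fix can be sketched with plain java.util.concurrent, without Reactor or SlateDB. Everything named here (SerializedWorkerSketch, submitOp, the in-memory log) is hypothetical. The point is only that a close submitted to the same single-thread executor runs after every operation queued before it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a native handle guarded by one worker thread.
final class SerializedWorkerSketch {
    // Plain single-thread executor: tasks run one at a time, in FIFO order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "slatedb-worker");
        t.setDaemon(true);
        return t;
    });
    // Only touched from the worker thread, so no extra locking is needed.
    private final List<String> log = new ArrayList<>();
    private boolean closed = false;

    // Enqueue an operation; it fails if it ever runs after close.
    Future<?> submitOp(String name) {
        return worker.submit(() -> {
            if (closed) {
                throw new IllegalStateException("use after close: " + name);
            }
            log.add(name);
        });
    }

    // Enqueue close on the SAME worker so it runs after all pending ops.
    Future<List<String>> close() {
        return worker.submit(() -> {
            closed = true;
            log.add("close");
            return List.copyOf(log);
        });
    }

    // Queue two ops, then close, and return the order in which they ran.
    static List<String> demo() {
        SerializedWorkerSketch db = new SerializedWorkerSketch();
        db.submitOp("delete");
        db.submitOp("merge");
        try {
            return db.close().get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            db.worker.shutdown();
        }
    }
}
```

Calling SerializedWorkerSketch.demo() returns [delete, merge, close]. If close ran on a different thread pool instead, it could overtake the queued operations, which is the race described in item 1.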
em3s commented Mar 4, 2026

We confirmed SlateDB's potential in this PR. We'll revisit after defining our Java 25 upgrade policy and V2→V3 engine migration.

  • Verified compatibility via SlateDBDatastoreCompatibilityTest
  • Implemented and validated SlateDbHashLabel, SlateDbIndexedLabel

Note: hit a concurrency issue in 0.11.0, worked around with newSingleThreadExecutor. Suspect block_on is the cause, but need to explore further.

cc. @criccomini

@em3s em3s closed this Mar 4, 2026
@criccomini

> Note: hit a concurrency issue in 0.11.0, worked around with newSingleThreadExecutor. Suspect block_on is the cause, but need to explore further.

Curious about this. If it's an issue with SlateDB, please do open an issue :)

em3s commented Mar 4, 2026

> Curious about this. If it's an issue with SlateDB, please do open an issue :)

Opened slatedb/slatedb#1366 cc. @criccomini

After investigation, the issue is blocking — not concurrency. The multi-thread Tokio runtime handles concurrent block_on calls fine, but each FFI call blocks the calling thread, which starves non-blocking event loops.
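If each FFI call blocks its calling thread, one common mitigation is to keep those calls off event-loop threads by handing them to a dedicated executor. The sketch below uses only java.util.concurrent. blockingGet is a hypothetical stand-in for a blocking native call, and no SlateDB or Reactor API is used or implied.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: offload a blocking call so the caller's thread stays free.
final class BlockingOffloadSketch {
    private BlockingOffloadSketch() {}

    // Dedicated thread for blocking work, separate from any event loop.
    static final ExecutorService BLOCKING_POOL = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "blocking-ffi");
        t.setDaemon(true);
        return t;
    });

    // Stand-in for a native call that blocks; reports which thread ran it.
    static String blockingGet(String key) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return key + "@" + Thread.currentThread().getName();
    }

    // Returns immediately; the blocking work runs on BLOCKING_POOL.
    static CompletableFuture<String> getAsync(String key) {
        return CompletableFuture.supplyAsync(() -> blockingGet(key), BLOCKING_POOL);
    }
}
```

Here getAsync("k") completes with "k@blocking-ffi", showing that the blocking call ran on the dedicated thread rather than on the thread that requested it.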

Development

Successfully merging this pull request may close these issues.

Experiment with Java 24+ for SlateDB integration