[Java] Add experimental Arrow Flight support #141

Merged: danilonajkov-db merged 3 commits into main from java-arrow on Mar 17, 2026

danilonajkov-db (Contributor) commented Mar 12, 2026

What changes are proposed in this pull request?

Adds Arrow Flight ingestion to the Java SDK. The Java side accepts VectorSchemaRoot directly, serializes to IPC bytes internally, and passes them across the JNI boundary where the Rust SDK handles encoding, framing, and transmission over the Arrow Flight gRPC protocol.

  • New ZerobusArrowStream with ingestBatch(), waitForOffset(), flush(), close(), getUnackedBatches()
  • New ArrowStreamConfigurationOptions with builder pattern
  • New SDK methods: createArrowStream(), recreateArrowStream()
  • Opt-in: add arrow-vector and arrow-memory-netty as dependencies (provided scope, >= 15.0.0)
  • Integration tests and example in examples/arrow/

How is this tested?

Added Arrow Flight integration tests.

danilonajkov-db changed the title from "[WIP][Java] Apache Arrow support" to "[WIP][Java] Add experimental Arrow Flight support" Mar 12, 2026
danilonajkov-db changed the title from "[WIP][Java] Add experimental Arrow Flight support" to "[Java] Add experimental Arrow Flight support" Mar 13, 2026
danilonajkov-db marked this pull request as ready for review March 13, 2026 11:04

 */
public long ingestBatch(VectorSchemaRoot batch) throws ZerobusException {
    if (batch == null) {
        throw new ZerobusException("Batch must not be null");

Contributor Author:

How should we handle this case? I think throwing an exception makes sense, but we could make it a no-op too.

Contributor:

For the regular gRPC stream the return type is Optional<Long> and we return an empty Optional, so I think matching that would be okay.

Contributor:

I think we should also handle the case where the batch has 0 rows and treat it as a no-op.
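
The policy discussed in this thread (return an empty Optional for a null or zero-row batch instead of throwing) could look roughly like the sketch below. It is not the SDK's code: `RowCounted` and `NoOpAwareStream` are hypothetical stand-ins so the example runs without Arrow on the classpath, and the offset counter only imitates what the native layer would return.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the one piece of VectorSchemaRoot this sketch needs.
interface RowCounted {
    int getRowCount();
}

// Hypothetical stream that illustrates the no-op policy only; nothing is sent anywhere.
class NoOpAwareStream {
    private final AtomicLong nextOffset = new AtomicLong(0);

    /** Returns the assigned offset, or Optional.empty() when there is nothing to ingest. */
    Optional<Long> ingestBatch(RowCounted batch) {
        if (batch == null || batch.getRowCount() == 0) {
            return Optional.empty(); // no-op: no offset is consumed
        }
        // Assumption: offsets are sequential; the real value would come from the native SDK.
        return Optional.of(nextOffset.getAndIncrement());
    }
}

class IngestBatchSketch {
    public static void main(String[] args) {
        NoOpAwareStream stream = new NoOpAwareStream();
        System.out.println(stream.ingestBatch(null));    // Optional.empty
        System.out.println(stream.ingestBatch(() -> 0)); // Optional.empty
        System.out.println(stream.ingestBatch(() -> 3)); // Optional[0]
    }
}
```

One trade-off of this choice: callers that pass null by mistake no longer get an exception, so they would need to check the returned Optional to notice that nothing was ingested.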

teodordelibasic-db (Contributor) left a comment:

Perhaps we can include some unit tests for scenarios that don't need a mock server or env vars, like null batch rejection and serialization/deserialization.

 * @throws ZerobusException if an error occurs during close
 */
@Override
public void close() throws ZerobusException {

Contributor:

We can investigate this further and fix it in a follow-up PR since it exists for the regular stream as well, but an LLM found two issues in close():

  1. Two threads calling close() concurrently can both read a non-zero handle, leading to a double nativeDestroy (use-after-free). The same risk applies if ingestBatch races with close(): ensureOpen() passes, then the handle gets freed underneath. I don't think this is high priority since we say streams are not supposed to be used by multiple threads.
  2. If nativeDestroy throws, the native resource leaks forever. nativeHandle is set to 0, but the underlying native memory allocated by the Rust SDK is never freed.

Something like this might fix it:

private final AtomicLong nativeHandle = new AtomicLong(0);

public void close() throws ZerobusException {
    long handle = nativeHandle.getAndSet(0);
    if (handle == 0) return;
    try {
        nativeClose(handle);
        try {
            cachedUnackedBatches = nativeGetUnackedBatches(handle);
        } catch (Exception e) {
            logger.warn("Failed to cache unacked batches: {}", e.getMessage());
            cachedUnackedBatches = new ArrayList<>();
        }
    } finally {
        nativeDestroy(handle);
    }
}
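
To check the claim that `getAndSet(0)` lets only one caller reach the destroy step, here is a small runnable sketch with the native calls replaced by plain Java counters. `FakeNativeStream` and `CloseOnceSketch` are hypothetical names, and the "native" methods are stubs rather than the SDK's JNI bindings.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a JNI-backed stream; the "native" calls only count invocations.
class FakeNativeStream implements AutoCloseable {
    final AtomicInteger destroyCalls = new AtomicInteger();
    private final AtomicLong nativeHandle;
    private final boolean failOnClose;

    FakeNativeStream(long handle, boolean failOnClose) {
        this.nativeHandle = new AtomicLong(handle);
        this.failOnClose = failOnClose;
    }

    private void nativeClose(long handle) {
        if (failOnClose) throw new IllegalStateException("simulated flush failure");
    }

    private void nativeDestroy(long handle) {
        destroyCalls.incrementAndGet();
    }

    @Override
    public void close() {
        long handle = nativeHandle.getAndSet(0); // only one caller observes a non-zero handle
        if (handle == 0) return;
        try {
            nativeClose(handle);
        } finally {
            nativeDestroy(handle); // runs even if nativeClose throws
        }
    }
}

class CloseOnceSketch {
    /** Closes one stream from several threads and returns how many times destroy ran. */
    static int destroyCallsAfterConcurrentClose(int threads) {
        FakeNativeStream stream = new FakeNativeStream(42L, false);
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all threads at once to provoke the race
                    stream.close();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return stream.destroyCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(destroyCallsAfterConcurrentClose(8)); // 1
    }
}
```

This only covers close() racing with close(). An ingestBatch call that has already read the handle can still run while another thread destroys it, so that case would need additional coordination such as a lock or reference count.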

Contributor Author:

Yeah, I saw it's the same for regular streams; we currently mention that neither regular nor Arrow streams are thread-safe. I can do this in a follow-up PR.


// Cache unacked batches before destroying the handle (for recreateArrowStream)
try {
    cachedUnackedBatches = nativeGetUnackedBatches(handle);

Contributor:

One more similar, potentially pre-existing issue that is not high priority. If nativeGetUnackedBatches fails after nativeClose succeeds, the batches are gone: close() has already flushed and shut down the stream, and the cache is set to an empty list. When the user later calls recreateArrowStream, it thinks there's nothing to re-ingest, so data is silently lost. A potential fix is to store the exception and surface it when getUnackedBatches() is called on the closed stream:

private volatile Exception cachedUnackedBatchesError;

// In close():
try {
    cachedUnackedBatches = nativeGetUnackedBatches(handle);
} catch (Exception e) {
    logger.warn("Failed to cache unacked batches: {}", e.getMessage());
    cachedUnackedBatchesError = e;
    cachedUnackedBatches = null;
}

// In getUnackedBatches():
public List<byte[]> getUnackedBatches() throws ZerobusException {
    if (nativeHandle.get() == 0) {
        if (cachedUnackedBatchesError != null) {
            throw new ZerobusException(
                "Failed to retrieve unacked batches during close: " + cachedUnackedBatchesError.getMessage(),
                cachedUnackedBatchesError);
        }
        return cachedUnackedBatches != null ? cachedUnackedBatches : new ArrayList<>();
    }
    return nativeGetUnackedBatches(nativeHandle.get());
}

 *
 * <p>Package-private for use by {@link ZerobusSdk#createArrowStream}.
 */
static byte[] serializeSchemaToIpc(Schema schema) throws ZerobusException {

Contributor:

An improvement could be to reuse the RootAllocator, since it's heavyweight. Because it's created only during stream creation, the performance impact is probably negligible. We can either pass a BufferAllocator to this function from the caller or have a static one.

Contributor Author (danilonajkov-db, Mar 16, 2026):

Made it a static allocator used for every stream creation, thanks. I'll address the other things in a follow-up, as they apply to the current non-Arrow implementation as well.

 * @param maxInflightBatches the maximum number of in-flight batches
 * @return this builder for method chaining
 */
public ArrowStreamConfigurationOptionsBuilder setMaxInflightBatches(int maxInflightBatches) {

Contributor:

For both Arrow and regular options, we could add validation that rejects invalid values (for example, values less than 0) by throwing. This can also be handled in a future PR.
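
A minimal sketch of what such setter validation might look like, assuming a fail-fast IllegalArgumentException at the setter. `ValidatedOptions`, its `flushTimeoutMs` field, and both default values are hypothetical; only `setMaxInflightBatches` mirrors a name from this PR.

```java
// Hypothetical options class; names other than setMaxInflightBatches are illustrative only.
final class ValidatedOptions {
    final int maxInflightBatches;
    final long flushTimeoutMs;

    private ValidatedOptions(Builder b) {
        this.maxInflightBatches = b.maxInflightBatches;
        this.flushTimeoutMs = b.flushTimeoutMs;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private int maxInflightBatches = 10;   // assumed default, not taken from the SDK
        private long flushTimeoutMs = 30_000L; // assumed default, not taken from the SDK

        Builder setMaxInflightBatches(int value) {
            this.maxInflightBatches = (int) requireNonNegative("maxInflightBatches", value);
            return this;
        }

        Builder setFlushTimeoutMs(long value) {
            this.flushTimeoutMs = requireNonNegative("flushTimeoutMs", value);
            return this;
        }

        ValidatedOptions build() {
            return new ValidatedOptions(this);
        }

        // Rejects negative values at the setter, so the error points at the offending call.
        private static long requireNonNegative(String name, long value) {
            if (value < 0) {
                throw new IllegalArgumentException(name + " must not be negative, got " + value);
            }
            return value;
        }
    }
}

class ValidatedOptionsSketch {
    public static void main(String[] args) {
        ValidatedOptions ok = ValidatedOptions.builder().setMaxInflightBatches(4).build();
        System.out.println(ok.maxInflightBatches); // 4
        try {
            ValidatedOptions.builder().setMaxInflightBatches(-1);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // maxInflightBatches must not be negative, got -1
        }
    }
}
```

Whether zero should also be rejected for a limit like maxInflightBatches is left open here, since the comment only mentions values less than 0.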

teodordelibasic-db (Contributor):

Looks good. An LLM found a couple of potential pre-existing edge cases that can be followed up on. We can handle the null/empty batch case.

danilonajkov-db linked an issue Mar 16, 2026 that may be closed by this pull request

elenagaljak-db (Contributor) left a comment:

Please add some details about Arrow support to the Java README and the example-level README.

danilonajkov-db added this pull request to the merge queue Mar 17, 2026
Merged via the queue into main with commit e0402c7 Mar 17, 2026
13 checks passed
danilonajkov-db deleted the java-arrow branch March 17, 2026 14:00

Development

Successfully merging this pull request may close these issues.

[Java] Add experimental Arrow Flight support

3 participants