feat(bigquery-jdbc): OpenTelemetry integration in BQ JDBC#12902

Draft: keshavdandeva wants to merge 30 commits into `main` from `jdbc/feature-branch-otel`

Conversation

@keshavdandeva (Contributor)

This is a feature branch. Please do not merge to main.

… Statement (#12124)

b/491239772
b/491239773

### Changes

- New connection properties: `enableGcpTraceExporter` (Boolean, default:
false) and `enableGcpLogExporter` (Boolean, default: false)

- `customOpenTelemetry` (Instance): Programmatic injection of a custom
SDK (User Application-Managed setup) via
`BigQueryDataSource.setCustomOpenTelemetry()`

- Added the core initialization logic for `OpenTelemetry`. During
connection setup, the driver checks whether tracing is enabled,
constructs an OpenTelemetry `Tracer` instance, and passes it down to the
core `BigQueryOptions.Builder` via `.setOpenTelemetryTracer()`.

- Intercepted the execution functions (`execute`, `executeQuery`,
`executeLargeUpdate`, `executeBatch`) to spawn child spans wrapping each
database call.

@keshavdandeva added the `do not merge` label (indicates a pull request not ready for merge, due to either quality or timing) on Apr 23, 2026

@gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request integrates OpenTelemetry into the BigQuery JDBC driver, enabling tracing for core operations such as query execution, batch updates, and background data processing tasks like pagination and Arrow stream processing. It adds support for custom OpenTelemetry instances and introduces configuration flags for GCP trace and log exporters. The review feedback recommends performance optimizations by using static tracer constants, better adherence to OpenTelemetry semantic conventions for database spans (e.g., setting `db.system` and `SpanKind`), and improving user feedback when certain exporter features are still under development.

keshavdandeva and others added 19 commits April 28, 2026 15:31
…agination (#12918)

b/491245568

### Key Changes

#### Core Instrumentation Logic
* **Database Metadata Tracing**: Added OTel spans to key methods in
`BigQueryDatabaseMetaData.java` (`getCatalogs`, `getSchemas`,
`getTables`, `getColumns`) to capture underlying API calls.
* **Pagination Span Links**: Captured the parent span context at the
start of `fetchNextPages` in `BigQueryStatement.java` and linked
background pagination spans back to it, avoiding timeline anomalies.
* **Cross-Thread Context Propagation**: Stored the `SpanContext` in
`BigQueryBaseResultSet.java` at creation time and made it current during
`next()` in `BigQueryJsonResultSet.java` and
`BigQueryArrowResultSet.java` to survive thread hops.
* **Tracer Reuse**: Extracted `getSafeTracer` to
`BigQueryJdbcOpenTelemetry.java` as a static utility to ensure
consistent fallback behavior across the driver.
* **Lambda Extraction**: Extracted the large lambda function in
`populateArrowBufferedQueue` in `BigQueryStatement.java` to its own
private method `processArrowStream` to improve readability and
maintainability.

b/491245568

### Changes

#### Enhanced Unit Testing Infrastructure
- `OpenTelemetryTestUtility`: Introduced a shared utility to simplify
complex OTel assertions.
- `BigQueryStatementTest`: Added comprehensive parameterised tracing
tests for all execution paths, including attribute validation for
queries and batch operations.
- `BigQueryDatabaseMetaDataTest`: Instrumented all metadata discovery
methods and added corresponding parameterised unit tests to verify span
generation.

b/496678357

This PR implements the **Correlated GCP Logging Bridge** for the
OpenTelemetry integration in the BigQuery JDBC driver. It enables
bridging standard Java logs (`java.util.logging`) to the OpenTelemetry
Logs API, allowing users to correlate logs with distributed traces and
isolate them by connection session.

### Changes

- `BigQueryDriver.java`: Implemented Cloud-Only Mode matrix logic to
suppress local file creation when `enableGcpLogExporter=true` and
`LogPath` is omitted.
- `BigQueryJdbcRootLogger.java`: Updated `setLevel` to handle
`Level.OFF` properly and skip file handler creation if path is null.
- `BigQueryConnection.java`: Attached `OpenTelemetryJulHandler` to the
`"com.google.cloud.bigquery"` namespace during initialization.
- `OpenTelemetryJulHandler.java`: Created a new handler that bridges JUL
logs to OTel Logs API with context harvesting and connection ID
filtering.
- `pom.xml`: Added `google-cloud-logging` dependency with version
`3.33.0-SNAPSHOT` and auto-update marker.
- `OpenTelemetryJulHandlerTest.java`: Created unit tests using
`OpenTelemetryExtension` to verify log emission and filtering.
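
The bridging and connection-ID filtering described above can be sketched with the JDK alone. This is a minimal illustration, not the driver's `OpenTelemetryJulHandler`: `ConnectionScope`, `FilteringBridgeHandler`, and `FilteringBridgeDemo` are hypothetical names, the `ThreadLocal` stands in for whatever ambient context carries the connection ID, and the `Consumer<String>` sink stands in for an OTel Logs API emitter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical stand-in for the ambient, per-thread connection context. */
final class ConnectionScope {
  static final ThreadLocal<String> CURRENT_ID = new ThreadLocal<>();

  private ConnectionScope() {}
}

/** Forwards JUL records to a sink only when they belong to one connection. */
final class FilteringBridgeHandler extends Handler {
  private final String connectionId;
  private final Consumer<String> sink; // stand-in for an OTel log emitter

  FilteringBridgeHandler(String connectionId, Consumer<String> sink) {
    this.connectionId = connectionId;
    this.sink = sink;
  }

  @Override
  public void publish(LogRecord record) {
    if (!isLoggable(record)) {
      return; // respect the handler's own level and filter
    }
    if (!connectionId.equals(ConnectionScope.CURRENT_ID.get())) {
      return; // record was logged on behalf of a different connection
    }
    sink.accept(record.getLevel() + " [" + connectionId + "] " + record.getMessage());
  }

  @Override
  public void flush() {}

  @Override
  public void close() {}
}

final class FilteringBridgeDemo {
  static List<String> run() {
    List<String> emitted = new ArrayList<>();
    Logger logger = Logger.getLogger("com.google.cloud.bigquery");
    logger.setUseParentHandlers(false);
    logger.setLevel(Level.ALL);
    Handler handler = new FilteringBridgeHandler("conn-1", emitted::add);
    logger.addHandler(handler);
    try {
      ConnectionScope.CURRENT_ID.set("conn-1");
      logger.info("query started"); // forwarded
      ConnectionScope.CURRENT_ID.set("conn-2");
      logger.info("other connection"); // filtered out
    } finally {
      ConnectionScope.CURRENT_ID.remove();
      logger.removeHandler(handler);
    }
    return emitted;
  }

  public static void main(String[] args) {
    run().forEach(System.out::println);
  }
}
```

Only the record logged while `conn-1` is current reaches the sink, which is the isolation property the unit tests in this commit are described as verifying.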
…#13039)

b/491238299
b/511147053

This PR completes the implementation of the OpenTelemetry SDK lifecycle
and cross-project authentication for the BigQuery JDBC driver. It
introduces thread-safe caching of heavy OTel SDK instances to support
multi-project tracing without global side effects.

### Changes

#### `BigQueryJdbcOpenTelemetry.java`
- A `ConcurrentHashMap` caches `OpenTelemetrySdk` instances, keyed by a
concatenated string of `ProjectId` and `Credentials`.
- The `getOpenTelemetry()` method lazily loads and initializes the SDK
only when requested and not present in the cache.
- A JVM shutdown hook closes each cached SDK to ensure pending traces
are flushed on application exit.
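
A minimal sketch of the caching pattern described above, using only the JDK. `SdkCache` and `FakeSdk` are hypothetical names; the real cache holds `OpenTelemetrySdk` instances, which this sketch replaces with any `AutoCloseable`. The `"|"` separator in the key is also an assumption, since the PR only says the key is a concatenated string.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Caches one closeable, SDK-like resource per (projectId, credentials) pair. */
final class SdkCache<T extends AutoCloseable> {
  private final Map<String, T> cache = new ConcurrentHashMap<>();
  private final Function<String, T> factory;

  SdkCache(Function<String, T> factory) {
    this.factory = factory;
  }

  /** Lazily creates the instance; computeIfAbsent runs the factory at most once per key. */
  T get(String projectId, String credentials) {
    String key = projectId + "|" + credentials; // separator is an assumption
    return cache.computeIfAbsent(key, factory);
  }

  /** Closes every cached instance so that pending data can be flushed. */
  void closeAll() {
    for (T sdk : cache.values()) {
      try {
        sdk.close();
      } catch (Exception e) {
        // Best effort: keep closing the remaining instances.
      }
    }
    cache.clear();
  }

  /** Registers a JVM shutdown hook that closes all cached instances on exit. */
  void registerShutdownHook() {
    Runtime.getRuntime().addShutdownHook(new Thread(this::closeAll, "sdk-cache-shutdown"));
  }
}

/** Hypothetical stand-in for a heavy SDK instance. */
final class FakeSdk implements AutoCloseable {
  boolean closed;

  @Override
  public void close() {
    closed = true;
  }
}

final class SdkCacheDemo {
  public static void main(String[] args) {
    SdkCache<FakeSdk> cache = new SdkCache<>(key -> new FakeSdk());
    FakeSdk first = cache.get("project-a", "creds-1");
    FakeSdk again = cache.get("project-a", "creds-1");
    FakeSdk other = cache.get("project-b", "creds-1");
    System.out.println("same key reused: " + (first == again));
    System.out.println("different key isolated: " + (first != other));
    cache.closeAll();
  }
}
```

Because the cache is an instance field rather than a global, several independent caches can coexist in one JVM. One thing to weigh in a real implementation is that a key built from raw credentials keeps that material alive as a map key; hashing it first would be an alternative.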

#### `BigQueryJdbcUrlUtility.java`
- `gcpTelemetryCredentials` and `gcpTelemetryProjectId` are added to
connection properties.

#### `BigQueryJdbcOAuthUtility.java`
- The `isJson()` helper method is changed from private to
package-private to allow reuse in `BigQueryJdbcOpenTelemetry`.

#### `BigQueryConnection.java`
- Reads the new telemetry connection properties.
- Uses `Boolean.TRUE.equals()` when checking `enableGcpLogExporter` and
`enableGcpTraceExporter` to safely handle cases where these properties
are not specified (`null`), avoiding a `NullPointerException` during
auto-unboxing.
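
For reviewers less familiar with the auto-unboxing pitfall, here is a small self-contained illustration. `NullSafeFlags` and both method names are hypothetical; only `Boolean.TRUE.equals()` comes from the change itself.

```java
final class NullSafeFlags {
  private NullSafeFlags() {}

  /** Null-safe: an unset (null) flag is treated as disabled. */
  static boolean isEnabled(Boolean flag) {
    return Boolean.TRUE.equals(flag);
  }

  /** Unsafe: returning a Boolean as boolean auto-unboxes and throws on null. */
  static boolean unsafeIsEnabled(Boolean flag) {
    return flag;
  }

  public static void main(String[] args) {
    Boolean unset = null; // property was not specified
    System.out.println("null-safe check: " + isEnabled(unset));
    try {
      unsafeIsEnabled(unset);
    } catch (NullPointerException e) {
      System.out.println("unboxing null threw NullPointerException");
    }
  }
}
```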

#### `BigQueryJdbcOpenTelemetryTest.java`
- New unit tests verify that SDK instances are correctly cached for
identical keys and isolated for different keys.

#### `BigQueryArrowStructTest.java`
- A fix is applied to avoid varargs ambiguity in array creation,
resolving a specific coercion failure in the `structOfArrays` test.

…agation (#13187)

b/496720140

## Changes

### Context Propagation & Session Tracking
* **Baggage Injection**: Injected the generated Connection UUID into
OpenTelemetry Baggage upon `BigQueryConnection` initialization to enable
reliable log correlation.
* **Log Handler Update**: Updated `OpenTelemetryJulHandler` to rely on
Baggage for retrieving the connection ID, removing the legacy MDC
fallback.
* **Thread Pool Audit**: Wrapped tasks submitted to background executors
in `BigQueryDatabaseMetaData` with `Context.current().wrap()`, ensuring
trace context is not lost during parallel metadata fetching.
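
Why wrapping matters for the thread pool audit above: thread-local context is not visible on executor threads unless it is captured at submission time. The sketch below models that with a plain `ThreadLocal`. `TraceScope` and `ContextWrapDemo` are hypothetical stand-ins written for this illustration; in the driver the capture is done by `Context.current().wrap()`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a thread-bound trace context. */
final class TraceScope {
  private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

  private TraceScope() {}

  static String current() {
    return CURRENT.get();
  }

  static void set(String traceId) {
    CURRENT.set(traceId);
  }

  static void clear() {
    CURRENT.remove();
  }

  /** Captures the caller's context now and restores it around the task later. */
  static <T> Callable<T> wrap(Callable<T> task) {
    String captured = CURRENT.get(); // read on the submitting thread
    return () -> {
      String previous = CURRENT.get();
      CURRENT.set(captured);
      try {
        return task.call();
      } finally {
        if (previous == null) {
          CURRENT.remove();
        } else {
          CURRENT.set(previous);
        }
      }
    };
  }
}

final class ContextWrapDemo {
  /** Returns what an unwrapped task and a wrapped task each observe on a pool thread. */
  static String[] run() {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      TraceScope.set("trace-abc");
      Callable<String> readScope = TraceScope::current;
      Future<String> bare = pool.submit(readScope);
      Future<String> wrapped = pool.submit(TraceScope.wrap(readScope));
      return new String[] {bare.get(), wrapped.get()};
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      TraceScope.clear();
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    String[] seen = run();
    System.out.println("unwrapped task saw: " + seen[0]);
    System.out.println("wrapped task saw: " + seen[1]);
  }
}
```

The unwrapped task observes no context on the pool thread, while the wrapped task observes the submitter's value, which is the loss this commit guards against during parallel metadata fetching.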

### Span Enrichment & Semantic Conventions
* **Attributes**: Enriched JDBC spans with standard attributes:
`db.system = "bigquery"`, `db.connection_id`, and `db.application`
(derived from `partnerToken` or falling back to
`"Google-BigQuery-JDBC-Driver"`).
* **Scope Separation**: Implemented separate tracers for the JDBC driver
(`com.google.cloud.bigquery.jdbc`) and the SDK
(`com.google.cloud.bigquery`) to allow clean filtering in tracing UIs
while maintaining correlation.

### Instrumentation
* **PreparedStatement**: Added missing instrumentation for
`BigQueryPreparedStatement` execution methods (`execute`,
`executeQuery`, `executeLargeUpdate`) to generate spans.

### Refactoring & Cleanups
* **Centralized Tracing**: Created a centralized `withTracing` helper in
`BigQueryJdbcOpenTelemetry.java` to eliminate duplicated tracing logic
in `BigQueryStatement` and `BigQueryDatabaseMetaData`.
* **Constants**: Defined all semantic convention keys as constants in
`BigQueryJdbcOpenTelemetry.java` to eliminate magic strings from method
bodies.
* **Simplifications**: Simplified redundant boolean checks in
`BigQueryConnection.java`.
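
A rough sketch of what a centralized `withTracing`-style helper can look like, to show how it removes the repeated start/record-error/end boilerplate from each call site. The signature, the `SqlCallable` interface, and the span names are assumptions made for this illustration; the helper in `BigQueryJdbcOpenTelemetry.java` may differ. A list of strings stands in for real spans so the example runs on the JDK alone.

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical functional interface for a JDBC operation that may fail. */
@FunctionalInterface
interface SqlCallable<T> {
  T call() throws SQLException;
}

final class TracingHelper {
  /** Stand-in for span export: records lifecycle events in order. */
  final List<String> events = new ArrayList<>();

  /** Runs the body inside a "span": start, then ok or error, then always end. */
  <T> T withTracing(String spanName, SqlCallable<T> body) throws SQLException {
    events.add("start " + spanName);
    try {
      T result = body.call();
      events.add("ok " + spanName);
      return result;
    } catch (SQLException | RuntimeException e) {
      events.add("error " + spanName + ": " + e.getMessage());
      throw e; // the caller still sees the original exception
    } finally {
      events.add("end " + spanName);
    }
  }
}

final class TracingDemo {
  static List<String> run() {
    TracingHelper helper = new TracingHelper();
    try {
      helper.withTracing("BigQueryStatement.executeQuery", () -> 42);
      helper.withTracing(
          "BigQueryStatement.execute",
          () -> {
            throw new SQLException("boom");
          });
    } catch (SQLException expected) {
      // The failure was already recorded on the span stand-in.
    }
    return helper.events;
  }

  public static void main(String[] args) {
    run().forEach(System.out::println);
  }
}
```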
# Conflicts:
#	java-bigquery/google-cloud-bigquery-jdbc/pom.xml
#	java-bigquery/google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryConnection.java
#	java-bigquery/google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryDatabaseMetaData.java
#	java-bigquery/google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryDriver.java
#	java-bigquery/google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryPreparedStatement.java
#	java-bigquery/google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryStatement.java
#	java-bigquery/google-cloud-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/BigQueryDatabaseMetaDataTest.java
# Conflicts:
#	java-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryJdbcOpenTelemetry.java
#	java-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/OpenTelemetryJulHandler.java
#	java-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/BigQueryJdbcOpenTelemetryTest.java
#	java-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/OpenTelemetryJulHandlerTest.java
#	java-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/OpenTelemetryTestUtility.java
@logachev logachev force-pushed the jdbc/feature-branch-otel branch from 7bebb0f to 6176a4a Compare May 22, 2026 05:02
keshavdandeva and others added 6 commits May 22, 2026 13:24
b/499079838

### Changes

#### 1. BigQueryJdbcOpenTelemetry.java
* **Feature**: Added safe, generous default attribute value length
limits of **`60KB (61,440 characters)`** to the autoconfigured
OpenTelemetry instance properties.
* **Why**: Prevents GCP Cloud Trace from silently rejecting and dropping
span batches when we log massive `BigQueryException` stack traces or
Arrow schema payloads exceeding the hard 64KB Cloud Trace backend
limits.
* **Design**: If the user explicitly configures their own limits, the
driver automatically skips the defaults and respects their overrides.
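
The "defaults unless the user set something" rule can be sketched as below. The two property names are the standard OpenTelemetry SDK autoconfigure keys for attribute value length limits; whether the driver checks exactly these keys, and which one it sets, is an assumption. `AttributeLimitDefaults` is a hypothetical class name.

```java
import java.util.HashMap;
import java.util.Map;

final class AttributeLimitDefaults {
  /** 60 * 1024 = 61,440 characters, below the 64KB backend limit cited above. */
  static final int DEFAULT_LIMIT = 60 * 1024;

  static final String GENERAL_KEY = "otel.attribute.value.length.limit";
  static final String SPAN_KEY = "otel.span.attribute.value.length.limit";

  private AttributeLimitDefaults() {}

  /** Returns the defaults to apply; empty when the user already configured a limit. */
  static Map<String, String> defaultsFor(Map<String, String> userConfig) {
    Map<String, String> defaults = new HashMap<>();
    boolean userConfigured =
        userConfig.containsKey(GENERAL_KEY) || userConfig.containsKey(SPAN_KEY);
    if (!userConfigured) {
      defaults.put(GENERAL_KEY, Integer.toString(DEFAULT_LIMIT));
    }
    return defaults;
  }

  public static void main(String[] args) {
    Map<String, String> untouched = new HashMap<>();
    System.out.println("no user config: " + defaultsFor(untouched));

    Map<String, String> overridden = new HashMap<>();
    overridden.put(SPAN_KEY, "1000");
    System.out.println("user override present: " + defaultsFor(overridden));
  }
}
```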

#### 2. OpenTelemetryJulHandler.java
* **Fix**: Configured the handler level to `Level.ALL` in the
constructor.
* **Why**: Bypasses a standard Java Logging (JUL) constraint where
handlers default to `Level.INFO` and silently drop `FINE` (debug-level)
records. Log filtering is delegated exclusively to the Connection
loggers.
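
The effect of the handler level can be reproduced in isolation. In this sketch `CapturingHandler` and `HandlerLevelDemo` are hypothetical names; the logger is set to `FINE` in both runs, so the only variable is the handler's own level.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Collects messages that pass the handler's own level check. */
final class CapturingHandler extends Handler {
  final List<String> messages = new ArrayList<>();

  CapturingHandler(Level level) {
    setLevel(level);
  }

  @Override
  public void publish(LogRecord record) {
    if (isLoggable(record)) {
      messages.add(record.getMessage());
    }
  }

  @Override
  public void flush() {}

  @Override
  public void close() {}
}

final class HandlerLevelDemo {
  /** Logs one FINE record and reports how many the handler accepted. */
  static int countFineRecords(Level handlerLevel) {
    Logger logger = Logger.getLogger("example.handler.level." + handlerLevel.getName());
    logger.setUseParentHandlers(false);
    logger.setLevel(Level.FINE); // the logger itself lets FINE through
    CapturingHandler handler = new CapturingHandler(handlerLevel);
    logger.addHandler(handler);
    try {
      logger.fine("SELECT 1");
    } finally {
      logger.removeHandler(handler);
    }
    return handler.messages.size();
  }

  public static void main(String[] args) {
    System.out.println("handler at INFO accepted: " + countFineRecords(Level.INFO));
    System.out.println("handler at ALL accepted: " + countFineRecords(Level.ALL));
  }
}
```

With the handler at `Level.ALL`, the logger's level becomes the single place where filtering happens, which matches the stated intent of delegating filtering to the Connection loggers.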

#### 3. BigQueryConnection.java
* **Visibility**: Exposed the connection session identifier by changing
`getConnectionId()` from package-private to `public`.
* **Why**: Allows automated E2E tests to retrieve the UUID and harvest
specific logs/traces accurately.

#### 4. ITOpenTelemetryTest.java
* **Feature**: Implemented a new standalone E2E integration test suite
verifying the live GCP OTel egress.
* **Test 1 (`testExecute_withOpenTelemetryGcpExporter`)**: Resolves the
target project via `ServiceOptions.getDefaultProjectId()`, runs an
in-memory array query, and iterates the results to trigger small-page
JSON pagination. It then queries Cloud Trace to assert that the async
pagination child spans are parented under the root JDBC span.
* **Test 2 (`testExecute_withErrorCorrelation`)**: Triggers database
failures, captures `SQLException`, harvests Trace IDs from standard
logs, and verifies failed span ingestion in Cloud Trace.

#### 5. BigQueryConnectionTest.java
* **Feature**: Added a new unit test
(`testConnect_withCustomOpenTelemetry_usesCustomInstance`) verifying the
custom OTel injection pipeline.
* **What it does**: Leverages `OpenTelemetryExtension` to mock an OTel
provider locally. With the custom SDK injected via properties, it
validates that `BigQueryConnection` resolves the instance and routes
spans exclusively to the custom provider.

#### 6. pom.xml
* **Dependencies**: Added `google-cloud-trace` test-scoped dependency to
query Cloud Trace v1 API programmatically during E2E validation.

---------

Co-authored-by: cloud-java-bot <cloud-java-bot@google.com>
Co-authored-by: Kirill Logachev <kirl@google.com>
…ing SDK (#13293)

b/517588332

This PR completes the linkage between the JDBC driver's OpenTelemetry
instrumentation and the underlying BigQuery SDK to ensure full
end-to-end traces.

## Key Changes
- **Dependency**: Added `io.grpc:grpc-opentelemetry` to intercept
low-level gRPC network spans for the Storage API.
- **REST API**: Enabled `setEnableOpenTelemetryTracing(true)` in
`BigQueryOptions` to unlock SDK-level tracing for standard queries.
- **HTAPI (Storage API)**: Enabled `setEnableOpenTelemetryTracing(true)`
in `BigQueryReadSettings` and wired the `GrpcOpenTelemetry` interceptor
to the channel builder.
- **Global OTel**: Ensured the `useGlobalOpenTelemetry` flag is
respected when configuring the Storage API client.

b/517498094

This PR fixes dependency analysis failures and flaky test issues
identified in the OpenTelemetry integration feature branch.

@keshavdandeva changed the title from `feat(jdbc): OpenTelemetry integration in BQ JDBC` to `feat(bigquery-jdbc): OpenTelemetry integration in BQ JDBC` on May 29, 2026
…n refresh (#13302)

b/516416076

This PR enables the BigQuery JDBC driver to use custom Service Account
credentials (JSON string or file path) for OpenTelemetry tracing,
bypassing the ADC-only limitation of the default GCP extension.

### **Key Changes**
* **`BigQueryJdbcOpenTelemetry.java`**: Added a customizer to inject
dynamic OAuth2 headers into OTLP exporters (supporting both HTTP and
gRPC) while preserving auto-configured properties.
* **`ITOpenTelemetryTest.java`**: Added 4 integration tests to verify
custom credentials and transport protocols.
* **`ITBase.java`**: Moved the shared `getAuthJson()` helper here to
remove duplication.
* **`pom.xml`**: Moved `opentelemetry-sdk-trace` to compile scope to
support the implementation.
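
The "dynamic OAuth2 headers" idea boils down to supplying headers through a callback that refreshes the token when it expires, instead of fixing a header value at build time. Below is a JDK-only sketch of such a supplier. `DemoToken` and `RefreshingAuthHeaders` are hypothetical names, the token source stands in for the service-account credential refresh, and the sketch does not call any OTLP exporter API.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical access token with an absolute expiry time in milliseconds. */
final class DemoToken {
  final String value;
  final long expiresAtMillis;

  DemoToken(String value, long expiresAtMillis) {
    this.value = value;
    this.expiresAtMillis = expiresAtMillis;
  }
}

/** Supplies an Authorization header, fetching a new token only after expiry. */
final class RefreshingAuthHeaders implements Supplier<Map<String, String>> {
  private final Supplier<DemoToken> tokenSource; // stand-in for a credential refresh
  private final LongSupplier clockMillis;
  private DemoToken cached;

  RefreshingAuthHeaders(Supplier<DemoToken> tokenSource, LongSupplier clockMillis) {
    this.tokenSource = tokenSource;
    this.clockMillis = clockMillis;
  }

  @Override
  public synchronized Map<String, String> get() {
    long now = clockMillis.getAsLong();
    if (cached == null || now >= cached.expiresAtMillis) {
      cached = tokenSource.get(); // refresh on first use or after expiry
    }
    return Collections.singletonMap("Authorization", "Bearer " + cached.value);
  }
}

final class RefreshingAuthHeadersDemo {
  public static void main(String[] args) {
    int[] fetches = {0};
    RefreshingAuthHeaders headers =
        new RefreshingAuthHeaders(
            () -> new DemoToken("token-" + (++fetches[0]), System.currentTimeMillis() + 60_000L),
            System::currentTimeMillis);
    System.out.println(headers.get());
    System.out.println(headers.get()); // served from the cached token
    System.out.println("token fetches: " + fetches[0]);
  }
}
```

Injecting the clock keeps the expiry logic testable without sleeping. A production version would likely refresh slightly before expiry to avoid sending a token that lapses in flight.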
Labels

do not merge Indicates a pull request not ready for merge, due to either quality or timing.

2 participants