feat(bigquery-jdbc): OpenTelemetry integration in BQ JDBC #12902
Draft
keshavdandeva wants to merge 30 commits into
… Statement (#12124)

b/491239772
b/491239773

### Changes
- New connection properties: `enableGcpTraceExporter` (Boolean, default: false) and `enableGcpLogExporter` (Boolean, default: false)
- `customOpenTelemetry` (Instance): Programmatic injection of a custom SDK (User Application-Managed setup) via `BigQueryDataSource.setCustomOpenTelemetry()`
- Added the core initialization logic for `OpenTelemetry`. During connection setup, it evaluates whether tracing is enabled and constructs an OpenTelemetry Tracer instance. It then passes this tracer down into the core `BigQueryOptions.Builder` via `.setOpenTelemetryTracer()`
- Intercepted the execution functions (`execute`, `executeQuery`, `executeLargeUpdate`, `executeBatch`) to create child spans wrapping each database call.
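To make the default behaviour of the two exporter flags above concrete, here is a minimal stdlib-only sketch that reads them from a `java.util.Properties` object and treats a missing key as `false`. Only the two property names come from this PR; the class `ExporterFlags` and its method are hypothetical and are not taken from the driver.

```java
import java.util.Properties;

// Hypothetical helper, not part of the driver: resolves the two exporter flags
// named in this PR, treating an absent property as false (the stated default).
class ExporterFlags {
  static final String TRACE_KEY = "enableGcpTraceExporter";
  static final String LOG_KEY = "enableGcpLogExporter";

  // True only when the property is present and equals "true" (case-insensitive).
  // Boolean.parseBoolean(null) returns false, so a missing key is safe.
  static boolean isEnabled(Properties props, String key) {
    return Boolean.parseBoolean(props.getProperty(key));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(TRACE_KEY, "true"); // log exporter left unset on purpose
    System.out.println(TRACE_KEY + " = " + isEnabled(props, TRACE_KEY));
    System.out.println(LOG_KEY + " = " + isEnabled(props, LOG_KEY));
  }
}
```

How the driver actually parses these properties may differ; the point is only that both exporters stay off unless explicitly enabled.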
Code Review
This pull request integrates OpenTelemetry into the BigQuery JDBC driver, enabling tracing for core operations such as query execution, batch updates, and background data processing tasks like pagination and Arrow stream processing. It adds support for custom OpenTelemetry instances and introduces configuration flags for GCP trace and log exporters. The review feedback recommends performance optimizations by using static tracer constants, better adherence to OpenTelemetry semantic conventions for database spans (e.g., setting db.system and SpanKind), and improving user feedback when certain exporter features are still under development.
…agination (#12918)

b/491245568

### Key Changes

#### Core Instrumentation Logic
* **Database Metadata Tracing**: Added OTel spans to key methods in `BigQueryDatabaseMetaData.java` (`getCatalogs`, `getSchemas`, `getTables`, `getColumns`) to capture underlying API calls.
* **Pagination Span Links**: Captured the parent span context at the start of `fetchNextPages` in `BigQueryStatement.java` and linked background pagination spans back to it, avoiding timeline anomalies.
* **Cross-Thread Context Propagation**: Stored the `SpanContext` in `BigQueryBaseResultSet.java` at creation time and made it current during `next()` in `BigQueryJsonResultSet.java` and `BigQueryArrowResultSet.java` to survive thread hops.
* **Tracer Reuse**: Extracted `getSafeTracer` to `BigQueryJdbcOpenTelemetry.java` as a static utility to ensure consistent fallback behavior across the driver.
* **Lambda Extraction**: Extracted the large lambda function in `populateArrowBufferedQueue` in `BigQueryStatement.java` to its own private method `processArrowStream` to improve readability and maintainability.
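The capture-at-creation, restore-in-`next()` pattern described under Cross-Thread Context Propagation can be sketched without any tracing library. In the sketch below a `ThreadLocal<String>` stands in for the ambient tracing context and a string stands in for the `SpanContext`; the class `ContextCarryingCursor` is hypothetical and only illustrates the idea, not the driver's result set classes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Stand-in for a tracing context: a thread-local "current span id".
// Real code would use the tracing library's context type instead.
class ContextCarryingCursor {
  static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

  private final String capturedSpanId; // captured on the creating thread
  private int remaining;

  ContextCarryingCursor(int rows) {
    this.capturedSpanId = CURRENT.get();
    this.remaining = rows;
  }

  // Makes the captured id current for the duration of the call, then restores
  // whatever the calling thread had before. Returns the id visible inside the
  // call, or null once the cursor is exhausted.
  String next() {
    String previous = CURRENT.get();
    CURRENT.set(capturedSpanId);
    try {
      if (remaining <= 0) {
        return null;
      }
      remaining--;
      return CURRENT.get();
    } finally {
      if (previous == null) {
        CURRENT.remove();
      } else {
        CURRENT.set(previous);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    CURRENT.set("span-1");
    ContextCarryingCursor cursor = new ContextCarryingCursor(1);
    CURRENT.remove();

    // The worker thread has no context of its own; next() supplies the captured one.
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Callable<String> task = cursor::next;
      Future<String> seen = pool.submit(task);
      System.out.println("worker thread saw: " + seen.get());
    } finally {
      pool.shutdown();
    }
  }
}
```

Restoring the previous value in `finally` keeps the caller's own context intact, which matters when `next()` is invoked from a thread that is already inside a different span.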
b/491245568

### Changes

#### Enhanced Unit Testing Infrastructure
- `OpenTelemetryTestUtility`: Introduced a shared utility to simplify complex OTel assertions.
- `BigQueryStatementTest`: Added comprehensive parameterised tracing tests for all execution paths, including attribute validation for queries and batch operations.
- `BigQueryDatabaseMetaDataTest`: Instrumented all metadata discovery methods and added corresponding unit tests to verify span generation using parameterised tests.
b/496678357

This PR implements the **Correlated GCP Logging Bridge** for the OpenTelemetry integration in the BigQuery JDBC driver. It enables bridging standard Java logs (`java.util.logging`) to the OpenTelemetry Logs API, allowing users to correlate logs with distributed traces and isolate them by connection session.

### Changes
- `BigQueryDriver.java`: Implemented Cloud-Only Mode matrix logic to suppress local file creation when `enableGcpLogExporter=true` and `LogPath` is omitted.
- `BigQueryJdbcRootLogger.java`: Updated `setLevel` to handle `Level.OFF` properly and skip file handler creation if path is null.
- `BigQueryConnection.java`: Attached `OpenTelemetryJulHandler` to the `"com.google.cloud.bigquery"` namespace during initialization.
- `OpenTelemetryJulHandler.java`: Created a new handler that bridges JUL logs to OTel Logs API with context harvesting and connection ID filtering.
- `pom.xml`: Added `google-cloud-logging` dependency with version `3.33.0-SNAPSHOT` and auto-update marker.
- `OpenTelemetryJulHandlerTest.java`: Created unit tests using `OpenTelemetryExtension` to verify log emission and filtering.
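The connection ID filtering in the handler above can be illustrated with a plain `java.util.logging.Handler`. This is a sketch under stated assumptions, not the PR's `OpenTelemetryJulHandler`: a `ThreadLocal` stands in for the ambient context that carries the connection ID, an in-memory list stands in for the OTel Logs API, and the class name `ConnectionScopedHandler` and logger name `demo.bridge` are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical stand-in for a JUL-to-telemetry bridge. A thread-local plays the
// role of the ambient context carrying the connection ID; a list plays the sink.
class ConnectionScopedHandler extends Handler {
  static final ThreadLocal<String> CONNECTION_ID = new ThreadLocal<>();

  private final String connectionId;
  final List<String> sink = new CopyOnWriteArrayList<>();

  ConnectionScopedHandler(String connectionId) {
    this.connectionId = connectionId;
  }

  @Override
  public void publish(LogRecord record) {
    // Drop records rejected by the handler's level/filter.
    if (!isLoggable(record)) {
      return;
    }
    // Drop records emitted while a different connection (or none) is current.
    if (!connectionId.equals(CONNECTION_ID.get())) {
      return;
    }
    sink.add(record.getLevel() + " " + record.getMessage());
  }

  @Override
  public void flush() {}

  @Override
  public void close() {}

  public static void main(String[] args) {
    ConnectionScopedHandler handler = new ConnectionScopedHandler("conn-1");
    Logger logger = Logger.getLogger("demo.bridge");
    logger.setUseParentHandlers(false);
    logger.setLevel(Level.ALL);
    logger.addHandler(handler);

    CONNECTION_ID.set("conn-1");
    logger.info("emitted under conn-1");
    CONNECTION_ID.set("conn-2");
    logger.info("emitted under conn-2");
    CONNECTION_ID.remove();

    System.out.println(handler.sink);
  }
}
```

With one handler per connection attached to a shared logger namespace, each handler forwards only the records that belong to its own session.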
…#13039)

b/491238299
b/511147053

This PR completes the implementation of the OpenTelemetry SDK lifecycle and cross-project authentication for the BigQuery JDBC driver. It introduces thread-safe caching of heavy OTel SDK instances to support multi-project tracing without global side effects.

### Changes

#### `BigQueryJdbcOpenTelemetry.java`
- A `ConcurrentHashMap` caches `OpenTelemetrySdk` instances, keyed by a concatenated string of `ProjectId` and `Credentials`.
- The `getOpenTelemetry()` method lazily initializes the SDK only when it is requested and not already present in the cache.
- A JVM shutdown hook closes each cached SDK to ensure pending traces are flushed on application exit.

#### `BigQueryJdbcUrlUtility.java`
- `gcpTelemetryCredentials` and `gcpTelemetryProjectId` are added to connection properties.

#### `BigQueryJdbcOAuthUtility.java`
- The `isJson()` helper method is changed from private to package-private to allow reuse in `BigQueryJdbcOpenTelemetry`.

#### `BigQueryConnection.java`
- Added handling for the new connection properties.
- Uses `Boolean.TRUE.equals()` when checking `enableGcpLogExporter` and `enableGcpTraceExporter` to safely handle cases where these properties are not specified (`null`) and avoid a `NullPointerException` during auto-unboxing.

#### `BigQueryJdbcOpenTelemetryTest.java`
- New unit tests verify that SDK instances are correctly cached for identical keys and isolated for different keys.

#### `BigQueryArrowStructTest.java`
- A fix is applied to avoid varargs ambiguity in array creation, resolving a specific coercion failure in the structOfArrays test.
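The caching scheme described above (a `ConcurrentHashMap` keyed by project ID and credentials, lazy creation, close on shutdown) can be sketched roughly as follows. `FakeSdk` stands in for the real SDK object, and the `|` separator in the key is an assumption; the PR only says the two values are concatenated. None of the names below are the driver's own.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a keyed, lazily populated cache of expensive,
// closeable resources. FakeSdk stands in for the real SDK object.
class SdkCacheSketch {
  static final class FakeSdk implements AutoCloseable {
    final String key;
    volatile boolean closed;

    FakeSdk(String key) {
      this.key = key;
    }

    @Override
    public void close() {
      closed = true; // a real SDK would flush pending telemetry here
    }
  }

  private final Map<String, FakeSdk> cache = new ConcurrentHashMap<>();
  final AtomicInteger creations = new AtomicInteger();

  // Key format is an assumption made for this sketch.
  static String cacheKey(String projectId, String credentials) {
    return projectId + "|" + credentials;
  }

  // computeIfAbsent runs the factory at most once per key, even under contention.
  FakeSdk get(String projectId, String credentials) {
    return cache.computeIfAbsent(
        cacheKey(projectId, credentials),
        k -> {
          creations.incrementAndGet();
          return new FakeSdk(k);
        });
  }

  // Closes every cached instance; intended to be called from a shutdown hook.
  void closeAll() {
    cache.values().forEach(FakeSdk::close);
    cache.clear();
  }

  public static void main(String[] args) {
    SdkCacheSketch sdks = new SdkCacheSketch();
    Runtime.getRuntime().addShutdownHook(new Thread(sdks::closeAll));

    FakeSdk first = sdks.get("project-a", "creds-1");
    FakeSdk again = sdks.get("project-a", "creds-1");
    FakeSdk other = sdks.get("project-b", "creds-1");
    System.out.println("same instance for same key: " + (first == again));
    System.out.println("distinct instance for other key: " + (first != other));
  }
}
```

A real key would likely need more care than plain concatenation, for example hashing the credentials rather than holding them as map keys, but that is outside what this PR describes.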
…agation (#13187)

b/496720140

## Changes

### Context Propagation & Session Tracking
* **Baggage Injection**: Injected the generated Connection UUID into OpenTelemetry Baggage upon `BigQueryConnection` initialization to enable reliable log correlation.
* **Log Handler Update**: Updated `OpenTelemetryJulHandler` to rely on Baggage for retrieving the connection ID, removing the legacy MDC fallback.
* **Thread Pool Audit**: Wrapped tasks submitted to background executors in `BigQueryDatabaseMetaData` with `Context.current().wrap()`, ensuring trace context is not lost during parallel metadata fetching.

### Span Enrichment & Semantic Conventions
* **Attributes**: Enriched JDBC spans with standard attributes: `db.system = "bigquery"`, `db.connection_id`, and `db.application` (derived from `partnerToken` or falling back to `"Google-BigQuery-JDBC-Driver"`).
* **Scope Separation**: Implemented separate tracers for the JDBC driver (`com.google.cloud.bigquery.jdbc`) and the SDK (`com.google.cloud.bigquery`) to allow clean filtering in tracing UIs while maintaining correlation.

### Instrumentation
* **PreparedStatement**: Added missing instrumentation for `BigQueryPreparedStatement` execution methods (`execute`, `executeQuery`, `executeLargeUpdate`) to generate spans.

### Refactoring & Cleanups
* **Centralized Tracing**: Created a centralized `withTracing` helper in `BigQueryJdbcOpenTelemetry.java` to eliminate duplicated tracing logic in `BigQueryStatement` and `BigQueryDatabaseMetaData`.
* **Constants**: Defined all semantic convention keys as constants in `BigQueryJdbcOpenTelemetry.java` to eliminate magic strings from method bodies.
* **Simplifications**: Simplified redundant boolean checks in `BigQueryConnection.java`.
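For readers unfamiliar with the centralized-helper refactoring mentioned above, the general shape is a single method that opens a span, runs the body, records any failure, and always ends the span. The sketch below shows that shape with an in-memory event list in place of a real tracer. The signature is an assumption: the PR names the helper `withTracing` but does not show its parameters, so everything else here is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of a "run this body inside a span" helper.
// EVENTS stands in for a real tracer; it is not thread-safe and is only for illustration.
class TracingHelperSketch {
  static final List<String> EVENTS = new ArrayList<>();

  // Records start, optional error, and end around the supplied body.
  static <T> T withTracing(String spanName, Supplier<T> body) {
    EVENTS.add("start " + spanName);
    try {
      return body.get();
    } catch (RuntimeException e) {
      // A real helper would record the exception and mark the span as failed.
      EVENTS.add("error " + spanName + ": " + e.getMessage());
      throw e;
    } finally {
      // The span is ended on both the success and the failure path.
      EVENTS.add("end " + spanName);
    }
  }

  public static void main(String[] args) {
    Integer rows = TracingHelperSketch.<Integer>withTracing("executeQuery", () -> 3);
    System.out.println("rows = " + rows);
    System.out.println(EVENTS);
  }
}
```

Call sites then shrink to one line per traced operation, which is the duplication the refactoring is meant to remove.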
# Conflicts:
# java-bigquery/google-cloud-bigquery-jdbc/pom.xml
# java-bigquery/google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryConnection.java
# java-bigquery/google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryDatabaseMetaData.java
# java-bigquery/google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryDriver.java
# java-bigquery/google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryPreparedStatement.java
# java-bigquery/google-cloud-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryStatement.java
# java-bigquery/google-cloud-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/BigQueryDatabaseMetaDataTest.java
# Conflicts:
# java-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/BigQueryJdbcOpenTelemetry.java
# java-bigquery-jdbc/src/main/java/com/google/cloud/bigquery/jdbc/OpenTelemetryJulHandler.java
# java-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/BigQueryJdbcOpenTelemetryTest.java
# java-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/OpenTelemetryJulHandlerTest.java
# java-bigquery-jdbc/src/test/java/com/google/cloud/bigquery/jdbc/OpenTelemetryTestUtility.java
7bebb0f to 6176a4a
b/499079838

### Changes

#### 1. BigQueryJdbcOpenTelemetry.java
* **Feature**: Added safe, generous default attribute value length limits of **`60KB (61,440 characters)`** to the autoconfigured OpenTelemetry instance properties.
* **Why**: Prevents GCP Cloud Trace from silently rejecting and dropping span batches when we log massive `BigQueryException` stack traces or Arrow schema payloads exceeding the hard 64KB Cloud Trace backend limits.
* **Design**: If the user explicitly configures their own limits, the driver skips the defaults and respects their overrides.

#### 2. OpenTelemetryJulHandler.java
* **Fix**: Configured the handler level to `Level.ALL` in the constructor.
* **Why**: Guards against handler-level filtering in Java Logging (JUL), where handlers may default to `Level.INFO` (as `ConsoleHandler` and `StreamHandler` do) and silently drop `FINE` (debug-level) records. Delegates log filtering exclusively to the Connection loggers.

#### 3. BigQueryConnection.java
* **Visibility**: Exposed the connection session identifier by changing `getConnectionId()` from package-private to `public`.
* **Why**: Allows automated E2E tests to retrieve the UUID and harvest specific logs/traces accurately.

#### 4. ITOpenTelemetryTest.java
* **Feature**: Implemented a new standalone E2E integration test suite verifying the live GCP OTel egress.
* **Test 1 (`testExecute_withOpenTelemetryGcpExporter`)**: Resolves the target project via `ServiceOptions.getDefaultProjectId()`. Runs an optimized in-memory array query and iterates results to trigger small-page JSON pagination. Queries Cloud Trace E2E to strictly assert that async pagination child spans are parented under the root JDBC span.
* **Test 2 (`testExecute_withErrorCorrelation`)**: Triggers database failures, captures `SQLException`, harvests Trace IDs from standard logs, and verifies failed span ingestion in Cloud Trace.

#### 5. BigQueryConnectionTest.java
* **Feature**: Added a new unit test (`testConnect_withCustomOpenTelemetry_usesCustomInstance`) verifying the custom OTel injection pipeline.
* **What it does**: Leverages `OpenTelemetryExtension` to mock an OTel provider locally. After injecting the custom SDK via properties, it validates that `BigQueryConnection` resolves the instance and routes spans exclusively to the custom provider.

#### 6. pom.xml
* **Dependencies**: Added `google-cloud-trace` test-scoped dependency to query Cloud Trace v1 API programmatically during E2E validation.

---------

Co-authored-by: cloud-java-bot <cloud-java-bot@google.com>
Co-authored-by: Kirill Logachev <kirl@google.com>
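The "defaults unless the user overrides" design for the attribute value length limit described in section 1 above can be sketched as a simple merge over a property map. The value 60 × 1024 = 61,440 comes from the PR; the property key `otel.attribute.value.length.limit` mirrors OpenTelemetry SDK autoconfigure naming and is an assumption about which key the driver sets, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: apply a default attribute length limit only when the
// user has not configured one, and show what truncation to that limit means.
class AttributeLimitDefaults {
  static final int DEFAULT_LIMIT = 60 * 1024; // 61,440 characters, as stated in the PR

  // Assumed key, chosen for illustration only.
  static final String LIMIT_KEY = "otel.attribute.value.length.limit";

  // Returns a copy of the user's properties with the default filled in if absent.
  static Map<String, String> withDefaults(Map<String, String> userProps) {
    Map<String, String> merged = new HashMap<>(userProps);
    merged.putIfAbsent(LIMIT_KEY, Integer.toString(DEFAULT_LIMIT));
    return merged;
  }

  // Cuts a value down to at most `limit` characters.
  static String truncate(String value, int limit) {
    return value.length() <= limit ? value : value.substring(0, limit);
  }

  public static void main(String[] args) {
    Map<String, String> user = new HashMap<>();
    System.out.println("no override: " + withDefaults(user).get(LIMIT_KEY));

    user.put(LIMIT_KEY, "1000");
    System.out.println("user override: " + withDefaults(user).get(LIMIT_KEY));
  }
}
```

Keeping the default below the backend's 64KB ceiling leaves headroom, so an oversized stack trace is truncated on the client instead of causing the whole batch to be rejected.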
…ing SDK (#13293)

b/517588332

This PR completes the linkage between the JDBC driver's OpenTelemetry instrumentation and the underlying BigQuery SDK to ensure full end-to-end traces.

## Key Changes
- **Dependency**: Added `io.grpc:grpc-opentelemetry` to intercept low-level gRPC network spans for the Storage API.
- **REST API**: Enabled `setEnableOpenTelemetryTracing(true)` in `BigQueryOptions` to unlock SDK-level tracing for standard queries.
- **HTAPI (Storage API)**: Enabled `setEnableOpenTelemetryTracing(true)` in `BigQueryReadSettings` and wired the `GrpcOpenTelemetry` interceptor to the channel builder.
- **Global OTel**: Ensured the `useGlobalOpenTelemetry` flag is respected when configuring the Storage API client.
b/517498094

This PR fixes dependency analysis failures and flaky test issues identified in the OpenTelemetry integration feature branch.
…n refresh (#13302)

b/516416076

This PR enables the BigQuery JDBC driver to use custom Service Account credentials (JSON string or file path) for OpenTelemetry tracing, bypassing the ADC-only limitation of the default GCP extension.

### **Key Changes**
* **`BigQueryJdbcOpenTelemetry.java`**: Added a customizer to inject dynamic OAuth2 headers into OTLP exporters (supporting both HTTP and gRPC) while preserving auto-configured properties.
* **`ITOpenTelemetryTest.java`**: Added 4 integration tests to verify custom credentials and transport protocols.
* **`ITBase.java`**: Moved the shared `getAuthJson()` helper here to remove duplication.
* **`pom.xml`**: Moved `opentelemetry-sdk-trace` to compile scope to support the implementation.
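One way to picture "dynamic OAuth2 headers" is a header supplier that is asked for headers on every export and only fetches a new token once the cached one has expired. The sketch below is a stdlib-only illustration under that assumption; the class `RefreshingAuthHeaders`, the fixed time-to-live, and the token fetcher are all hypothetical, and the PR does not show how the driver's customizer is actually implemented.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch: supplies an Authorization header and refreshes the
// underlying token when it has expired. Not the driver's implementation.
class RefreshingAuthHeaders implements Supplier<Map<String, String>> {
  private final Supplier<String> tokenFetcher; // stands in for a credentials refresh call
  private final LongSupplier clockMillis; // injectable clock, so expiry can be tested
  private final long ttlMillis; // assumed fixed token lifetime

  private String token;
  private long expiresAtMillis;

  RefreshingAuthHeaders(Supplier<String> tokenFetcher, LongSupplier clockMillis, long ttlMillis) {
    this.tokenFetcher = tokenFetcher;
    this.clockMillis = clockMillis;
    this.ttlMillis = ttlMillis;
  }

  // Called once per export; fetches a new token only when none is cached or it expired.
  @Override
  public synchronized Map<String, String> get() {
    long now = clockMillis.getAsLong();
    if (token == null || now >= expiresAtMillis) {
      token = tokenFetcher.get();
      expiresAtMillis = now + ttlMillis;
    }
    return Collections.singletonMap("Authorization", "Bearer " + token);
  }

  public static void main(String[] args) {
    RefreshingAuthHeaders headers =
        new RefreshingAuthHeaders(() -> "example-token", System::currentTimeMillis, 60_000L);
    System.out.println(headers.get());
  }
}
```

Because the supplier is evaluated per request rather than once at exporter construction, a long-lived exporter keeps working after the initial token expires.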
This is a feature branch. Please do not merge to main.