Conversation
…execution time, pool stats, slot metrics Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com>
|
@copilot why ojp_sql_execution_time_ms uses the sql hash as label? this make readability difficult. |
…cution_time_ms metric Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com>
The hash was used because
Example metric in Prometheus after the change: |
|
@copilot does this mean that if the slow query segregation is disabled we cannot collect the metrics? if so the collection of execution time needs to be removed somewere else and be active if any of slow query segration or advanced telemetry is enabled. |
…move to StatementServiceImpl Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com>
Correct observation — it was an architectural coupling rather than an actual gap (the
|
|
Must evaluate this one first #318 |
|
@copilot review the latest commit to main branch which introduced, especially metrics for XA pools as well. evaluate how that affects this PR. Ideally we want to apply these advanced metrics in the XA implementation as well. |
Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com>
Done in a740c02. The XA path now has full metric coverage parity with regular connections: New XA pool gauges (read from Existing metrics extended to XA:
Also fixed a merge artifact from main where |
|
@copilot test failure |
…GrpcTelemetry; wire ojpMetrics into StatementServiceImpl Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com>
Fixed in db79b1a. Root cause was a pre-existing bug in Two fixes:
Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
|
@copilot got this error in the prometheus URL http://localhost:9159/metrics while trying to test metrics XA and non XA at the same time An Exception occurred while scraping metrics: io.prometheus.metrics.model.snapshots.DuplicateLabelsException: Duplicate labels for metric "ojp.xa.pool.connections": {otel_scope_name="ojp.xa.pool",pool.name="ojp-xa-pool"} |
…MetricsFactory key Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com>
Fixed in a09d958. Two bugs combined to cause this:
Fixes:
After the fix, concurrent XA connections each get distinct Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
|
@copilot error still happening Also we should not be using the connection hash for this, can we not use the datasource name + 4 first chars of connection hash? Would that work? |
…nup on pool recreation Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com>
Fixed in d7aac6c. There were two independent root causes: 1. Pool recreation wasn't cleaning up gauge callbacks — 2. Pool labels now use datasource name +
This satisfies the requirement of distinguishing a non-XA and XA pool on the same datasource while staying readable. Both Example output after fix: Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
|
@copilot I never told you to use the database name, that is a security risk. Use the datasource name in the OJP perspective, if none is set "default" is assumed. |
… extractDbName Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com>
Fixed in 693afb9.
Label format: Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
|
|
done in a different PR. |



OJP previously exposed only gRPC-level metrics via OpenTelemetry. This adds eleven custom Prometheus metrics covering the three highest-value gaps: connection queue depth, connection wait time, and per-statement SQL execution time — plus pool stats (for both HikariCP and XA pools) and slot activity.
New metrics (
/metrics)ojp_connection_queue_depthconn_hashojp_connection_wait_time_msconn_hash,outcome(success/failure)ojp_sql_execution_time_mssql(truncated SQL text)ojp_pool_active_connectionspoolojp_pool_idle_connectionspoolojp_pool_pending_threadspoolojp_xa_pool_active_sessionspoolojp_xa_pool_idle_sessionspoolojp_xa_pool_pending_threadspoolojp_slot_active_slowojp_slot_active_fastPool gauge
poollabels use the format<dsName>_<4-char-hash>for HikariCP pools and<dsName>_xa_<4-char-hash>for XA pools, where<dsName>is the OJP datasource name configured in OJP (defaulting to"default"if none is set). This keeps labels human-readable while guaranteeing uniqueness even when a non-XA and XA pool share the same datasource name in the same connection. No JDBC URL or database server details are ever exposed in metric labels.The
sqllabel onojp_sql_execution_time_mscontains the first 100 characters of the normalised (lower-cased, whitespace-collapsed) SQL text, giving human-readable metric labels while keeping Prometheus cardinality bounded.ojp_connection_queue_depthandojp_connection_wait_time_mscover both regular connection acquisition and XA session borrowing.Key changes
OjpMetrics(new) — owns all custom OTel instruments; injected as an optional dependency (null-safe throughout, no behaviour change when telemetry is disabled); includestoSqlSnippet()helper that normalises and truncates SQL text; uses a pre-compiled regex for whitespace collapsing on the hot path;buildPoolLabel(dsName, connHash, xa)builds consistent, human-readable pool labels using the OJP datasource name (falls back to"default") and the first 4 chars of the connection hash;registerDatasource(connHash, poolLabel, ds)andregisterXaPool(connHash, poolLabel, ds)accept a separate label so the internal registry key (fullconnHash) is decoupled from the display labelOjpServerTelemetry— exposesgetOpenTelemetry()andcreateOjpMetrics()so custom metrics share the same Prometheus endpoint as gRPC metrics; fixed a pre-existing bug in the 9-paramcreateGrpcTelemetryoverload where the result was assigned to a local variable instead ofthis.openTelemetryConnectionAcquisitionManager— new overload acceptsOjpMetrics; increments/decrementsqueue_deptharoundgetConnection()and recordswait_time_msSlowQuerySegregationManager— new constructor acceptsOjpMetrics; emits slot counters (slow/fast slot acquired/released) on acquire/release. SQL execution time recording is intentionally kept outside this class so it fires regardless of whether slow query segregation is enabled or disabledStatementServiceImpl—try-finallytiming blocks inexecuteUpdate()andexecuteQuery()callojpMetrics.sqlExecuted()unconditionally; timing usesSystem.nanoTime()for precision.ojp_sql_execution_time_msis always recorded as long as OpenTelemetry is enabled, independent of slow query segregation configurationHandleXAConnectionWithPoolingAction— setsxa.poolNameto the friendly pool label soOpenTelemetryPoolMetricsuses it as thepool.nameattribute; registers the XA pool withojpMetrics.registerXaPool(); wrapsborrowSession()withconnectionWaitStarted()/connectionAcquired()intry-finallyto record XA session borrow queue depth and wait timeOpenTelemetryPoolMetrics— stores all 8ObservableDoubleGaugereferences as fields;close()now calls.close()on each gauge, properly unregistering their callbacks from the OTel SDK and preventingDuplicateLabelsExceptionif a pool with the same name is recreatedActionContext/GrpcServer— wireOjpMetricsthrough the call stack;GrpcServer.main()creates the instance only whenojp.opentelemetry.enabled=trueand passes it toStatementServiceImplConnectAction— callsojpMetrics.registerDatasource(connHash, poolLabel, ds)with a friendly label derived from the OJP datasource name when a new HikariCP datasource is createdUsage
Backward-compatible: all constructors/methods that don't pass
OjpMetricscontinue to work unchanged.🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.