Description
Problem
Database connections leak from the telemetry metrics collection system. Each invocation of MetricStatsCollector.getStats() orphans one connection per database. Connections accumulate indefinitely because HikariCP cannot reclaim them: close() is never called on the HikariConnection wrapper. With maxPoolSize=200, the pool drains over days until getConnection() calls block or time out.
Surfaced at v26.01.02-01 but accumulating since v25.12.22-01 (commit e8ac669, Dec 15 2025). Production evidence: pg_stat_activity shows 30+ connections idle 11+ days, all with last_query from CountVariantsInAllArchivedExperimentsMetricType.
Root Cause
Commit e8ac669 moved metric queries onto a newSingleThreadExecutor() for timeout enforcement. DbConnectionFactory stores connections in a ThreadLocal, so the executor thread opens its own connection when the first query runs. shutdownNow() then terminates the thread without close() ever being called on that connection. The closeSilently() in the finally block runs on the calling thread and cannot reach the executor thread's ThreadLocal. The connection is orphaned permanently.
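The mechanism can be reproduced in isolation. The following is a minimal sketch, not the dotCMS code: FakeConnection, CURRENT, getConnection() and closeSilently() are hypothetical stand-ins for the pooled connection and the ThreadLocal-backed factory, and are assumed to behave roughly like the real ones.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalLeakDemo {

    /** Counts connections that were opened and never closed. */
    static final AtomicInteger OPEN = new AtomicInteger();

    /** Hypothetical stand-in for a pooled connection. */
    static final class FakeConnection implements AutoCloseable {
        FakeConnection() { OPEN.incrementAndGet(); }
        @Override public void close() { OPEN.decrementAndGet(); }
    }

    /** Hypothetical stand-in for a ThreadLocal-backed connection factory. */
    static final ThreadLocal<FakeConnection> CURRENT = new ThreadLocal<>();

    static FakeConnection getConnection() {
        FakeConnection c = CURRENT.get();
        if (c == null) {
            c = new FakeConnection();
            CURRENT.set(c);
        }
        return c;
    }

    /** Closes the connection bound to the *current* thread, if any. */
    static void closeSilently() {
        FakeConnection c = CURRENT.get();
        if (c != null) {
            c.close();
            CURRENT.remove();
        }
    }

    /** Returns how many connections this call left open. */
    static int runLeaky() throws Exception {
        int before = OPEN.get();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // The "query" runs on the executor thread and binds a connection
            // to that thread's ThreadLocal slot.
            Future<String> result = executor.submit(() -> {
                getConnection();
                return "stats";
            });
            result.get(5, TimeUnit.SECONDS);
        } finally {
            executor.shutdownNow();
            // Runs on the calling thread, whose ThreadLocal slot is empty,
            // so this is a no-op and the executor's connection stays open.
            closeSilently();
        }
        return OPEN.get() - before;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("connections left open: " + runLeaky());
    }
}
```

In this sketch each call to runLeaky() leaves exactly one connection open, which mirrors the one-connection-per-invocation pattern described above.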
MetricsAPIImpl.getValue() has @CloseDBIfOpened, but this annotation is implemented via ByteBuddy class instrumentation, which cannot see through Weld CDI proxy subclasses. The annotation silently does nothing when invoked via the CDI proxy returned by APILocator.
HikariCP (SystemEnvDataSourceStrategy) is configured with maxLifetime=15min, idleTimeout=5min, and leakDetectionThreshold=5min. None of these reclaim checked-out connections: maxLifetime and idleTimeout only apply to connections that have been returned to the pool, and leakDetectionThreshold only logs a warning without closing or reclaiming anything.
Fix (this issue)
- Wrap DBMetricType.getValue() default method in LocalTransaction.wrapReturn(). Each metric gets its own properly-scoped connection lifecycle, independent of @CloseDBIfOpened firing on CDI proxies. ContentTypeFieldsMetricType already uses this exact pattern in its overridden getStat(). Fixes all 130+ DBMetricType implementations at once.
- Remove dead openDBConnection() call (line 114) and private method (lines 322-324) from MetricStatsCollector. These open and close a connection on the calling thread that no query ever uses; this is dead code left over from before the executor was introduced.
- Add ArchUnit test to CodingStandardsArchTest: no CDI normal-scope bean should have methods annotated with @CloseDBIfOpened or @WrapInTransaction. ByteBuddy cannot see through Weld proxies so these annotations silently do nothing on CDI beans. 7 pre-existing violations will be surfaced.
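The idea behind the first item can be sketched as follows. This is an illustration under stated assumptions rather than the actual patch: the wrapReturn() helper here is a hypothetical stand-in that only mimics the scoping behaviour attributed to LocalTransaction.wrapReturn() above (open and close on the same thread that runs the query), and Metric, FakeConnection and getConnection() are likewise invented for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ScopedConnectionDemo {

    /** Counts connections that were opened and never closed. */
    static final AtomicInteger OPEN = new AtomicInteger();

    /** Hypothetical stand-in for a pooled connection. */
    static final class FakeConnection implements AutoCloseable {
        FakeConnection() { OPEN.incrementAndGet(); }
        @Override public void close() { OPEN.decrementAndGet(); }
    }

    /** Hypothetical stand-in for a ThreadLocal-backed connection factory. */
    static final ThreadLocal<FakeConnection> CURRENT = new ThreadLocal<>();

    static FakeConnection getConnection() {
        FakeConnection c = CURRENT.get();
        if (c == null) {
            c = new FakeConnection();
            CURRENT.set(c);
        }
        return c;
    }

    /**
     * Hypothetical wrapper: runs the work, then closes whatever connection
     * the *executing* thread opened, regardless of which thread that is.
     */
    static <T> T wrapReturn(Callable<T> work) throws Exception {
        try {
            return work.call();
        } finally {
            FakeConnection c = CURRENT.get();
            if (c != null) {
                c.close();
                CURRENT.remove();
            }
        }
    }

    /** Hypothetical metric type whose default method scopes the connection. */
    interface Metric {
        String query() throws Exception;

        default String getValue() throws Exception {
            return wrapReturn(this::query);
        }
    }

    /** Returns how many connections this call left open. */
    static int runScoped() throws Exception {
        int before = OPEN.get();
        Metric metric = () -> {
            getConnection(); // binds a connection to the executor thread
            return "42";
        };
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = metric::getValue;
            Future<String> result = executor.submit(task);
            result.get(5, TimeUnit.SECONDS);
        } finally {
            executor.shutdownNow();
        }
        return OPEN.get() - before;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("connections left open: " + runScoped());
    }
}
```

Because cleanup happens inside the task itself, it no longer matters that the calling thread cannot reach the executor thread's ThreadLocal, and every implementation inheriting the default method gets the same behaviour.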
Follow-up (separate issue)
Create CDI @Interceptor versions of @CloseDBIfOpened and @WrapInTransaction. Adding @InterceptorBinding to the existing annotations and creating interceptor beans makes them work correctly on CDI-proxied beans, complementing ByteBuddy, which continues to cover non-CDI classes. This is backward-compatible and clears all 7 ArchUnit violations system-wide.
Files
- dotCMS/src/main/java/com/dotcms/telemetry/collectors/DBMetricType.java (lines 19-21)
- dotCMS/src/main/java/com/dotcms/telemetry/collectors/MetricStatsCollector.java (lines 114, 322-324)
- dotCMS/src/test/java/com/dotcms/architecture/CodingStandardsArchTest.java (new test method)