Skip to content

[ibm-mq-metrics] move to produce metrics via MetricProducer#2836

Open
atoulme wants to merge 6 commits into
open-telemetry:mainfrom
atoulme:producer
Open

[ibm-mq-metrics] move to produce metrics via MetricProducer#2836
atoulme wants to merge 6 commits into
open-telemetry:mainfrom
atoulme:producer

Conversation

@atoulme
Copy link
Copy Markdown
Contributor

@atoulme atoulme commented May 15, 2026

Description:

Move away from the meter, and use the SDK API to create a MetricProducer to produce data points to be directly sent to the backend.

Copilot AI review requested due to automatic review settings May 15, 2026 07:02
@atoulme atoulme requested a review from a team as a code owner May 15, 2026 07:02
@github-actions github-actions Bot requested a review from breedx-splk May 15, 2026 07:03
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR replaces the OpenTelemetry Meter-based instrumentation in the ibm-mq-metrics module with a custom MetricProducer implementation that builds MetricData directly and registers itself with the SDK's MeterProvider via registerMetricProducer. The Metrics.java weaver template (which generated Meter-based factory methods) is replaced by a MetricProducer.java template that generates one record* method per metric, each appending a fully-formed MetricData to an internal list that the SDK drains on each collection cycle.

Changes:

  • New MetricProducer and (auto-value-backed) MetricData classes, plus updated weaver template/config; Metrics.java.j2 removed.
  • All collectors (ChannelMetricsCollector, QueueManagerMetricsCollector, QueueCollectionBuddy, etc.), WmqMonitor, and Main are reworked to take a MetricProducer instead of a Meter and call producer.record* directly.
  • All unit/integration tests are migrated off OpenTelemetryExtension/Meter to construct a MetricProducer and assert on producer.produce(Resource.empty()).

Reviewed changes

Copilot reviewed 28 out of 29 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
ibm-mq-metrics/templates/registry/java/weaver.yaml Switches generation to the new MetricProducer.java.j2 template.
ibm-mq-metrics/templates/registry/java/Metrics.java.j2 Deleted (replaced by MetricProducer.java.j2).
ibm-mq-metrics/templates/registry/java/MetricProducer.java.j2 New template generating one record* method per metric.
ibm-mq-metrics/src/main/java/io/opentelemetry/ibm/mq/metrics/MetricProducer.java Generated producer class implementing the SDK MetricProducer SPI.
ibm-mq-metrics/src/main/java/io/opentelemetry/ibm/mq/metrics/MetricData.java AutoValue-backed MetricData with a createMetricData factory.
ibm-mq-metrics/src/main/java/io/opentelemetry/ibm/mq/WmqMonitor.java Drops Meter/Long* fields, uses producer directly for heartbeat & connection-error metrics.
ibm-mq-metrics/src/main/java/io/opentelemetry/ibm/mq/opentelemetry/Main.java Wires the producer into the auto-configured SDK via registerMetricProducer.
ibm-mq-metrics/src/main/java/.../metricscollector/ChannelMetricsCollector.java Replaces gauge fields with producer recorder calls.
ibm-mq-metrics/src/main/java/.../metricscollector/InquireChannelCmdCollector.java Replaces gauge fields with producer recorder calls.
ibm-mq-metrics/src/main/java/.../metricscollector/InquireQueueManagerCmdCollector.java Replaces gauge field with producer recorder call.
ibm-mq-metrics/src/main/java/.../metricscollector/InquireTStatusCmdCollector.java Replaces gauge fields with producer recorder calls.
ibm-mq-metrics/src/main/java/.../metricscollector/ListenerMetricsCollector.java Replaces gauge field with producer recorder call.
ibm-mq-metrics/src/main/java/.../metricscollector/PerformanceEventQueueCollector.java Replaces counter fields with producer recorder calls.
ibm-mq-metrics/src/main/java/.../metricscollector/QueueCollectionBuddy.java Switches AllowedGauge registry from LongGauge to BiConsumer<Long,Attributes> recorder.
ibm-mq-metrics/src/main/java/.../metricscollector/QueueManagerEventCollector.java Replaces counter field with producer recorder call.
ibm-mq-metrics/src/main/java/.../metricscollector/QueueManagerMetricsCollector.java Replaces gauge fields with producer recorder calls.
ibm-mq-metrics/src/main/java/.../metricscollector/QueueMetricsCollector.java Constructor now takes a MetricProducer.
ibm-mq-metrics/src/main/java/.../metricscollector/ReadConfigurationEventQueueCollector.java Replaces gauge field with producer recorder call.
ibm-mq-metrics/src/main/java/.../metricscollector/TopicMetricsCollector.java Constructor now takes a MetricProducer.
ibm-mq-metrics/src/test/java/.../*CollectorTest.java (5 files) Migrated tests to construct a MetricProducer directly and assert on its produce() output.
ibm-mq-metrics/src/integrationTest/java/.../TestWMQMonitor.java Holds a MetricProducer instead of a Meter.
ibm-mq-metrics/src/integrationTest/java/.../WMQMonitorIntegrationTest.java Replaces OpenTelemetryExtension with a directly-instantiated MetricProducer.
ibm-mq-metrics/build.gradle.kts Adds AutoValue annotations + processor for MetricData.
ibm-mq-metrics/config.yml Trailing newline added.
Comments suppressed due to low confidence (4)

ibm-mq-metrics/src/main/java/io/opentelemetry/ibm/mq/metrics/MetricProducer.java:902

  • The produce() method copies metricData into a new list and then clears it as two separate operations. Although metricData is a synchronizedList, the copy-then-clear sequence is not atomic: any record* invocation that occurs between the new ArrayList<>(this.metricData) (which internally synchronizes on the list) and this.metricData.clear() will be silently dropped. Additionally, iterating the synchronized list inside the ArrayList copy constructor without an explicit synchronized(this.metricData) block can throw ConcurrentModificationException if a recorder thread mutates the list during iteration. Wrap both operations in a single synchronized(this.metricData) block (or use a different concurrent structure such as draining via ConcurrentLinkedQueue).
  public List<MetricData> produce(Resource resource) {
    List<MetricData> collectedPoints = new ArrayList<>(this.metricData);
    this.metricData.clear();
    this.currentEpochNanos = Clock.getDefault().nanoTime();
    return collectedPoints;
  }

ibm-mq-metrics/src/main/java/io/opentelemetry/ibm/mq/metrics/MetricProducer.java:902

  • The Resource parameter passed by the SDK to produce(Resource resource) is ignored — the producer always exports metrics tagged with the Resource provided in the constructor. Per the MetricProducer SPI contract this argument is the resource the SDK wants the producer to associate metrics with (typically the SDK's own resource), so ignoring it can cause exported metrics to carry a different/empty resource than other metrics from the same SDK pipeline. Either use the supplied resource when building the MetricData, or document why it's deliberately ignored.
  @Override
  public List<MetricData> produce(Resource resource) {
    List<MetricData> collectedPoints = new ArrayList<>(this.metricData);
    this.metricData.clear();
    this.currentEpochNanos = Clock.getDefault().nanoTime();
    return collectedPoints;
  }

ibm-mq-metrics/src/main/java/io/opentelemetry/ibm/mq/metrics/MetricProducer.java:56

  • Each record* call allocates a singleton List, a LongPointData, a GaugeData/SumData, and a MetricData object, plus a synchronized-list add (and a Clock.getDefault() lookup). Previously, LongGauge.set / LongCounter.add cost essentially nothing on the hot path, with aggregation done at collection time. For collectors that fire frequently or with many attribute combinations this is a meaningful regression in CPU and GC pressure. Consider aggregating points per (name, attributes) so that only one point per metric stream is emitted per collection cycle, similar to what the SDK aggregator did before.
  public void recordIbmMqMessageRetryCount(long value, Attributes attributes) {
    this.metricData.add(
        createMetricData(
            this.resource,
            this.instrumentationScopeInfo,
            "ibm.mq.message.retry.count",
            "Number of message retries",
            "{message}",
            MetricDataType.LONG_GAUGE,
            GaugeData.createLongGaugeData(
                Collections.singletonList(
                    LongPointData.create(
                        this.currentEpochNanos,
                        Clock.getDefault().nanoTime(),
                        attributes,
                        value)))));
  }

ibm-mq-metrics/src/main/java/io/opentelemetry/ibm/mq/metrics/MetricProducer.java:292

  • Counter-style metrics (recordIbmMq*Event, recordIbmMqUnauthorizedEvent, recordIbmMqConnectionErrors) emit one LongPointData per call carrying only the per-call delta value (e.g. 1), but they are declared as AggregationTemporality.CUMULATIVE and isMonotonic=true. A cumulative sum point is supposed to carry the total since startEpochNanos, not a delta; with this implementation each export interval will emit several independent "cumulative" points all of value 1 with the same start/end (or with monotonic-time values), which downstream cumulative-temporality consumers will interpret incorrectly (e.g., as resets to 1 each interval). Either pre-aggregate per (name, attributes) before emission, or declare these as AggregationTemporality.DELTA.
  public void recordIbmMqQueueDepthFullEvent(long value, Attributes attributes) {
    this.metricData.add(
        createMetricData(
            this.resource,
            this.instrumentationScopeInfo,
            "ibm.mq.queue.depth.full.event",
            "The number of full queue events",
            "{event}",
            MetricDataType.LONG_SUM,
            SumData.createLongSumData(
                /* isMonotonic= */ true,
                AggregationTemporality.CUMULATIVE,
                Collections.singletonList(
                    LongPointData.create(
                        this.currentEpochNanos,
                        Clock.getDefault().nanoTime(),
                        attributes,
                        value)))));
  }

Comment thread ibm-mq-metrics/src/main/java/io/opentelemetry/ibm/mq/metrics/MetricProducer.java Outdated
Comment thread ibm-mq-metrics/templates/registry/java/MetricProducer.java.j2 Outdated
Comment thread ibm-mq-metrics/templates/registry/java/MetricProducer.java.j2 Outdated
@atoulme atoulme changed the title move to produce metrics via MetricProducer [ibm-mq-metrics] move to produce metrics via MetricProducer May 15, 2026
Comment thread ibm-mq-metrics/src/main/java/io/opentelemetry/ibm/mq/metrics/MetricProducer.java Outdated
…etricProducer.java

Co-authored-by: jason plumb <75337021+breedx-splk@users.noreply.github.com>
public List<MetricData> produce(Resource resource) {
List<MetricData> collectedPoints = new ArrayList<>(this.metricData);
this.metricData.clear();
this.currentEpochNanos = Clock.getDefault().nanoTime();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
this.currentEpochNanos = Clock.getDefault().nanoTime();
this.currentEpochNanos = Clock.getDefault().now();

Comment on lines +757 to +758
List<MetricData> collectedPoints = new ArrayList<>(this.metricData);
this.metricData.clear();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this is not thread safe.

It's possible for other methods to add data points to the metricData as the new array is being copied but before the clear(). There is real risk of those data points being dropped.

Two possible ways that I can of solving this:

  1. ditch the Collections.synchronizedList for a plain ArrayList, create a mutex object, and explicitly protect all accesses thru the mutex. It's a little cumbersome.
  2. change metricData from List to AtomicReference<List<...>> and instead of clearing it here use metricDataReference.getAndSet(new ArrayList<>()) to atomically swap out the reference and get the underlying (old) list for returning.

Of those, I think 2 is a bit cleaner.

@breedx-splk
Copy link
Copy Markdown
Contributor

I don't fully understand this, but I'll share in case this makes sense to you @atoulme:

The new producer changes counters from “accumulate per series” to “append
one MetricData object per call”, which breaks cumulative sum semantics. For
example, recordIbmMqQueueDepthFullEvent, recordIbmMqUnauthorizedEvent, and
recordIbmMqConnectionErrors each emit a standalone LONG_SUM point with value
1 (MetricProducer.java:235-249, 706-720, 738-752), and the collectors call
them once per event/error (PerformanceEventQueueCollector.java:85-99,
QueueManagerEventCollector.java:54-67, WmqMonitor.java:109-118). If two
matching events happen before the next export, this no longer produces one
cumulative point with value 2; it produces multiple cumulative points for
the same series, each still at 1. The updated tests are already asserting
this append-only shape for gauges too (TopicMetricsCollectorTest.java:59-75,
ListenerMetricsCollectorTest.java:57-65), which suggests the implementation
drift is intentional rather than caught accidentally.

it feels like a type of aggregation is being lost there, but I'm not sure.

Other than that, this looks good. I'm honestly surprised at how much cleaner it makes things! 🥳

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants