Feature/open telemetry prometheus #39

Merged
IsaacDSC merged 9 commits into main from feature/open-telemetry-prometheus on Mar 8, 2026

IsaacDSC (Owner) commented on Mar 6, 2026

Add metrics using OpenTelemetry (OTel), collected by Prometheus.

Backoffice Metrics

  • CPU
  • Memory
  • Number of goroutines
  • Number of threads
  • GC activity
  • Total PostgreSQL connections
  • Total disk usage (GB)
  • Database cache ratio
  • Database disk reads
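
As a rough illustration of where the goroutine, thread, memory, and GC figures above can come from, here is a minimal sketch that reads them from the Go standard library. The `RuntimeSnapshot` type and `ReadRuntimeSnapshot` function are hypothetical names, not taken from this PR, and the PR itself may rely on OpenTelemetry runtime instrumentation instead of reading these values directly.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
)

// RuntimeSnapshot is a point-in-time view of Go runtime statistics.
// The type and its field names are illustrative assumptions, not PR code.
type RuntimeSnapshot struct {
	Goroutines   int    // goroutines that currently exist
	Threads      int    // OS threads created so far (threadcreate profile)
	HeapAlloc    uint64 // bytes of allocated heap objects
	NumGC        uint32 // completed GC cycles
	PauseTotalNs uint64 // cumulative GC stop-the-world pause time, in ns
}

// ReadRuntimeSnapshot collects the values using only the standard library.
func ReadRuntimeSnapshot() RuntimeSnapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return RuntimeSnapshot{
		Goroutines:   runtime.NumGoroutine(),
		Threads:      pprof.Lookup("threadcreate").Count(),
		HeapAlloc:    m.HeapAlloc,
		NumGC:        m.NumGC,
		PauseTotalNs: m.PauseTotalNs,
	}
}

func main() {
	s := ReadRuntimeSnapshot()
	fmt.Printf("goroutines=%d threads=%d heap_bytes=%d gc_cycles=%d gc_pause_ns=%d\n",
		s.Goroutines, s.Threads, s.HeapAlloc, s.NumGC, s.PauseTotalNs)
}
```

Note that `runtime.ReadMemStats` briefly stops the world, so a sketch like this would normally be sampled on a scrape interval rather than per request.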

PubSub Metrics

  • P95 for publisher messages
  • RPM for publisher messages
  • RPM for consumer messages
  • RPM for messages sent by the HTTP client
  • Consumer message lag
  • Memory store sync activity
  • Memory store execution time (ms)
  • CPU
  • Memory
  • Number of goroutines
  • Number of threads
  • GC activity

Task Metrics

  • Total tasks processing
  • Total successful consumer tasks
  • Total failed consumer tasks
  • P95 for publisher messages
  • RPM for the task publisher
  • RPM for the task consumer
  • RPM for messages sent by the HTTP client
  • CPU
  • Memory
  • Number of goroutines
  • Number of threads
  • GC activity
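
The three task totals above could be tracked along the following lines. This is a dependency-free sketch using `sync/atomic`; the `TaskCounters` type and its `Track` method are hypothetical, and it assumes "total processing" means the cumulative number of tasks that started processing. The PR records these through its telemetry package rather than plain atomics.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// errSimulated is a stand-in failure used only by this example.
var errSimulated = errors.New("simulated task failure")

// TaskCounters holds cumulative totals for consumed tasks.
// Names are illustrative assumptions, not identifiers from the PR.
type TaskCounters struct {
	Processing atomic.Int64 // tasks that started processing
	Success    atomic.Int64 // tasks whose handler returned nil
	Failure    atomic.Int64 // tasks whose handler returned an error
}

// Track runs handle and updates the counters based on its result.
// It is safe to call from multiple goroutines.
func (c *TaskCounters) Track(handle func() error) error {
	c.Processing.Add(1)
	err := handle()
	if err != nil {
		c.Failure.Add(1)
	} else {
		c.Success.Add(1)
	}
	return err
}

func main() {
	var c TaskCounters
	_ = c.Track(func() error { return nil })
	_ = c.Track(func() error { return errSimulated })
	fmt.Printf("processing=%d success=%d failure=%d\n",
		c.Processing.Load(), c.Success.Load(), c.Failure.Load())
}
```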

More information

Overview

[two images]

IsaacDSC added 8 commits March 3, 2026 17:14
- Added OpenTelemetry support for metrics collection and monitoring across services.
- Implemented Prometheus metrics exporter and configured HTTP handlers for metrics exposure.
- Updated Docker Compose to include Prometheus and OpenTelemetry Collector services for observability.
- Enhanced middleware to record HTTP metrics, including request counts and durations.
- Introduced telemetry context management for better integration with existing services.
- Updated service configurations to enable metrics collection and reporting.
- Removed legacy metric implementations in favor of a centralized telemetry package for better consistency.
- Updated middleware and handlers to utilize new telemetry metrics for HTTP requests and consumer durations.
- Improved error handling and logging for telemetry events, ensuring better observability.
- Adjusted service configurations to reflect new metric definitions and enhance monitoring capabilities.
- Introduced a new `PublishedAt` field in the `RequestPayload` struct to track message publication time.
- Implemented consumer lag telemetry to measure the time difference between message publication and processing start.
- Enhanced tests to validate the handling of the new `PublishedAt` field and its impact on consumer lag metrics.
- Updated telemetry metrics to include `PubSubConsumerLagSeconds` for improved observability of processing delays.
…synchronization

- Added telemetry metrics for task consumer processing, including total processing, success, and failure counts.
- Integrated memory store activity duration tracking to monitor refresh performance.
- Updated middleware to utilize new metrics for improved observability of task handling.
- Refactored logging to include more detailed context during memory store synchronization.
- Added new Grafana dashboards for backoffice, pubsub, and task services to enhance monitoring capabilities.
- Removed legacy dashboard files and replaced them with updated JSON configurations for better performance and usability.
- Updated deployment scripts to support the new dashboard structure and ensure proper import into Grafana.
- Integrated PostgreSQL exporter for improved metrics collection related to database performance.
- Replaced the publisher data file with specific event payloads for pubsub and task services.
- Added new JSON files for pubsub and task event payloads to standardize event data structure.
- Created a new task consumer configuration file to define the consumer behavior for the 'payment.charged' event.
- Updated simulation scripts to utilize the new event payload files for testing purposes.
- Updated Go toolchain version from 1.25.7 to 1.26.1 in go.mod and CI workflows.
- Added a new queries.txt file containing useful PostgreSQL queries for monitoring and performance analysis.
- Introduced a comprehensive metrics manual for gqueue, detailing metrics exposed via OpenTelemetry in Prometheus format.
- Documented service-specific metrics endpoints, usage instructions for starting the observability stack, and load simulation commands.
- Included a reference section for metrics categorized by service (Backoffice, PubSub, Task) to enhance monitoring and observability.
Comment thread cmd/api/main.go Outdated
Comment thread internal/domain/event.go Outdated
Comment thread internal/fetcher/notification.go Outdated
Comment thread internal/fetcher/notification.go Outdated
Comment thread pkg/logs/slog.go Outdated
Comment thread pkg/telemetry/telemetry.go Outdated
Comment thread queries.txt Outdated
- Deleted the backup.dashboard.grafana.json file as it was no longer needed.
- Removed the queries.txt file containing PostgreSQL queries to streamline the project and eliminate outdated documentation.
- Updated related configurations to reflect these deletions and maintain project cleanliness.
IsaacDSC merged commit 4f19f37 into main on Mar 8, 2026
6 checks passed