@FeBe95 FeBe95 commented Dec 4, 2025

Important

This is a proof-of-concept PR, seeking opinions and evaluations from maintainers and contributors.

What is this?

This extends the existing "Slow XYZ" recorders, starting with the SlowRequests recorder. With the newly collected data, the Slow Requests card now looks like this:

[Screenshot: Slow Requests card]

Disclaimer

Note

The idea for this isn't new. It was first raised in #281. There was also a draft implementation in #284, from which I took some inspiration. For my own implementation, I came up with a solution for the main issues that were discussed in that first draft:

  1. Which average value should be recorded – the average of all requests or the average of requests above the threshold?
  2. How should the average value be recorded?
  3. Bonus: How should the value be displayed in the UI so that its meaning is easy for users to understand?

What has changed?

For now, this proof-of-concept implementation only extends the functionality of the SlowRequests recorder.
Here's what's changed:

  • The recorder now calculates the average duration of all requests.
  • The card now shows the total number of requests next to the average duration, making it more obvious what the “average” value represents.
  • The card now offers average and total as sorting options.
  • The card now has an info action with a short explanation of the metrics shown.

How has this been implemented?

The recorder now tracks all requests by default and calculates their max and avg durations. Additionally, if a request duration exceeds the configured threshold, it also increments the count metric.
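As a rough illustration of the behaviour described above, here is a minimal, self-contained sketch. It does not use Pulse's API: the `RequestStats` class, its fields, and the sample durations are hypothetical and exist only to show which values are updated for every request and which only for slow ones. Whether a duration exactly equal to the threshold counts as slow is an assumption here.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical in-memory stand-in for the recorded metrics (not part of Pulse).
 */
final class RequestStats
{
    public int $total = 0;   // every request seen
    public int $slow = 0;    // requests above the threshold (the "count" metric)
    public int $max = 0;     // slowest duration in milliseconds
    public float $avg = 0.0; // running mean over all requests

    public function __construct(private int $thresholdMs)
    {
    }

    public function record(int $durationMs): void
    {
        // Always tracked, regardless of the threshold.
        $this->total++;
        $this->max = max($this->max, $durationMs);
        $this->avg += ($durationMs - $this->avg) / $this->total;

        // Only slow requests increment the count metric.
        // Assumption: "exceeds" is read as strictly greater than.
        if ($durationMs > $this->thresholdMs) {
            $this->slow++;
        }
    }
}

$stats = new RequestStats(thresholdMs: 1000);

foreach ([200, 1500, 300, 2000] as $durationMs) {
    $stats->record($durationMs);
}
```

With these four sample durations, two exceed the 1000 ms threshold, while the maximum and the average are computed over all four requests.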

For tracking the total number of requests, several approaches exist:

  1. Record a new separate entry with a different type name (e.g. slow-requests-total). Then use the ->count() aggregate.
  2. Record a new separate entry with the same type name (slow-requests). Then (mis-)use the ->sum() aggregate with a static value of 1.
  3. Don't record a separate entry. Use the existing count column of the pulse_aggregates table. This column contains the total count and is used for accurate average value calculation.
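To show why a count stored alongside an average is enough for the third approach, here is a small standalone sketch of combining per-bucket averages into one overall average by weighting each with its count. The `mergeAverages` function and the array shape are hypothetical and only loosely modelled on an average-plus-count row; this is not Pulse's actual query logic.

```php
<?php

declare(strict_types=1);

/**
 * Combine per-bucket averages into one overall average (hypothetical helper).
 *
 * @param  list<array{avg: float, count: int}>  $buckets
 * @return array{avg: float, count: int}
 */
function mergeAverages(array $buckets): array
{
    $weightedSum = 0.0;
    $count = 0;

    foreach ($buckets as $bucket) {
        // Weight each bucket's average by the number of values behind it.
        $weightedSum += $bucket['avg'] * $bucket['count'];
        $count += $bucket['count'];
    }

    return [
        'avg' => $count > 0 ? $weightedSum / $count : 0.0,
        'count' => $count, // doubles as the total number of requests
    ];
}

$merged = mergeAverages([
    ['avg' => 200.0, 'count' => 3],
    ['avg' => 1000.0, 'count' => 1],
]);
```

The merged count is a by-product of the averaging itself, which is why no separate entry would be needed to obtain the total number of requests.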

Evaluation

I implemented, tested, and evaluated all three approaches.

Option 1: Entry('slow-requests-total')->count()
  Advantages:
    • Separate entry type
    • Easy to turn on/off
  Disadvantages:
    • Difficult to "merge" with slow entries
    • Creates extra entries in database tables

Option 2: Entry('slow-requests')->sum()
  Advantages:
    • Easy to implement
  Disadvantages:
    • Feels "hacky"
    • Creates duplicate entries with value: 1 instead of value: $duration

Option 3: pulse_aggregates.count
  Advantages:
    • Easy to implement
    • Uses existing data and logic
    • Doesn't create duplicate entries
  Disadvantages:
    • A bit more difficult to turn on/off

Decision

I chose option 3, as its advantages most clearly outweigh its disadvantages. The implementation is straightforward and uses already-existing data. It also shouldn't break either the ingest or the bucket logic. What do you think?

To-Dos

  • Apply the changes to all other "Slow XYZ" recorders as well:
    • SlowJobs
    • SlowOutgoingRequests
    • SlowQueries
  • Add configuration options to enable/disable recording of average values
  • Adjust existing feature tests
  • Add new feature tests
  • Anything else...?

More Screenshots

[Screenshot] [Screenshot]

@FeBe95 FeBe95 marked this pull request as draft December 4, 2025 18:54