
Take into account CPU spikes #456

@dudicoco


Is your feature request related to a problem? Please describe.

Prometheus is sample-based, which means that even with a 15-second scrape interval we can miss many short CPU spikes of 100% or more.
So our Prometheus query may show 95th percentile utilization at 50% of the current CPU requests value, but in practice lowering the requests could cause CPU throttling and/or increased latency.
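
To make the concern concrete, here is a rough sketch of how a short spike can disappear from a 15-second view. Everything in it is an assumption for illustration: the 100 ms resolution, the 40% baseline, the 300 ms spike at 150% of the request, and the idea that each scrape window effectively reports the average usage over that window (as a `rate()` over a CPU counter would). It is not taken from any real workload.

```python
# Hypothetical workload, expressed as a fraction of the CPU request.
TICKS_PER_SECOND = 10                      # assumed 100 ms resolution
WINDOW_SECONDS = 15                        # scrape interval from the issue
TICKS_PER_WINDOW = WINDOW_SECONDS * TICKS_PER_SECOND


def simulate(windows=20, baseline=0.4, spike=1.5, spike_ticks=3):
    """Build a usage series with one short spike at the start of each window."""
    usage = []
    for _ in range(windows):
        window = [baseline] * TICKS_PER_WINDOW
        for i in range(spike_ticks):
            window[i] = spike
        usage.extend(window)
    return usage


def window_averages(usage):
    """Average usage per scrape window, i.e. what a 15 s view would see."""
    return [
        sum(usage[i:i + TICKS_PER_WINDOW]) / TICKS_PER_WINDOW
        for i in range(0, len(usage), TICKS_PER_WINDOW)
    ]


def percentile(values, q):
    """Nearest-rank style percentile, good enough for this sketch."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(round(q * (len(ordered) - 1))))
    return ordered[index]


usage = simulate()
averages = window_averages(usage)
p95_of_averages = percentile(averages, 0.95)
peak = max(usage)

print(f"p95 of 15 s averages: {p95_of_averages:.3f}")
print(f"true peak usage:      {peak:.3f}")
```

Under these assumed numbers, the 95th percentile of the windowed averages stays well below the request even though the workload briefly exceeds it in every window, which is the gap this request is about.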

Describe the solution you'd like
I believe a profiling tool is needed here, such as an eBPF exporter that could expose a metric for CPU spikes.
Perhaps something like https://github.com/cloudflare/ebpf_exporter
