FR: Ignore failed run measurements when computing statistics or put failures into a separate bucket  #827

@ilyagr

Description

Currently, hyperfine seems to either abort as soon as a command fails even once, or treat failures exactly the same as successes.

I'd like an option for either:

  • ignoring the failed runs (reporting their number, but otherwise computing statistics as though they never happened). Some possible names for such an option: --omit-failed-runs, --forget-failed-runs, or --skip-failed-runs. Any of these would be slightly confusing alongside the existing --ignore-failure flag, which I think should be renamed; see Suggestion: rename --ignore-failure to --ignore-exit-code for Hyperfine 2.0 #828. (A hypothetical invocation is sketched just after this list.)
  • putting the failures in a different "bucket", reporting the statistics for the successful runs and failed runs separately.
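
For illustration, the first option might be invoked like this (the --omit-failed-runs flag is hypothetical, one of the names suggested above; it does not exist in hyperfine today):

    hyperfine --omit-failed-runs --warmup 1 --min-runs 10 -- "timeout 5m cargo nextest run"

hyperfine would then report the number of failed runs, but compute the mean and other statistics only over the successful ones.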

Out of scope (as far as I'm concerned), but maybe worth discussing:

There is also a potential feature of bucketing results based on other data, say the exact exit code or whether a run takes more than X seconds, but that's less important to me now.

Another possibility that I'd consider out of scope is automatically finding the buckets, e.g. trying to fit a sum of Gaussian distributions, rather than a single Gaussian, to the measurements.

My use-case

I am trying to benchmark a test run that includes a flaky test that sometimes deadlocks (around 10% of the time). Successful runs take about 3 minutes; unsuccessful ones take forever. So, I do:

hyperfine --warmup 1 --min-runs 10 -- "timeout 5m cargo nextest run"

However, when the test does deadlock in any of the 10 runs, the whole benchmark is wasted.
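
In the meantime, the measurement can be approximated by hand. Below is a minimal workaround sketch, assuming bash, GNU date, bc, and awk: it times each run itself and averages only the successful ones, with none of hyperfine's warmup handling, outlier detection, or richer statistics.

    #!/usr/bin/env bash
    # Workaround sketch: time each run ourselves and average only the
    # successful ones; failed (timed-out) runs are counted but discarded.
    runs=10
    fails=0
    durations=()
    for _ in $(seq "$runs"); do
      start=$(date +%s.%N)                  # GNU date: seconds.nanoseconds
      if timeout 5m cargo nextest run >/dev/null 2>&1; then
        end=$(date +%s.%N)
        durations+=("$(echo "$end - $start" | bc)")
      else
        fails=$((fails + 1))
      fi
    done
    echo "failed runs: $fails / $runs"
    if ((${#durations[@]} > 0)); then
      printf '%s\n' "${durations[@]}" |
        awk '{ s += $1; n++ } END { printf "mean of %d successful runs: %.3f s\n", n, s / n }'
    fi

This loses most of what makes hyperfine worth using (warmup runs, outlier detection, min/max and standard deviation, formatted output), which is why a built-in option would be much nicer.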
