Work out pass/fail from test output and check it's the same failure
across bisection steps.
This matters because you can hit a different bug mid-bisect and end
up chasing the wrong thing. Logspec already generates SHA1 error
signatures — we should use those to confirm we're still seeing the
original regression at each step.
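
A minimal sketch of that per-step check, assuming each step yields a
set of error signatures. The error_signature helper is a hypothetical
stand-in and not logspec's real hashing scheme or API; the Verdict
names and the classify_step function are likewise illustrative.

```python
import hashlib
from enum import Enum
from typing import Iterable


class Verdict(Enum):
    GOOD = 0    # no errors found in the test output
    BAD = 1     # failed with the original regression's signature
    SKIP = 125  # failed, but with a different signature


def error_signature(error_type: str, location: str) -> str:
    """Hypothetical stand-in for a logspec signature.

    Hashes fields that are assumed to stay stable across runs.
    """
    payload = f"{error_type}|{location}".encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def classify_step(step_signatures: Iterable[str], original_signature: str) -> Verdict:
    """Classify one bisection step against the original regression."""
    signatures = set(step_signatures)
    if original_signature in signatures:
        return Verdict.BAD
    if signatures:
        # Something failed, but not the bug being bisected.
        return Verdict.SKIP
    return Verdict.GOOD
```

The numeric values mirror the exit codes that git bisect run
understands (0 is good, 125 is skip, 1 is bad), so a wrapper script
could exit with verdict.value. Treating an unrelated failure as a
skip rather than as good is a design choice: the unrelated failure
may be hiding the original one.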
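We also need to handle non-monotonic pass/fail along the history (for
example pass, fail, pass, fail on successive commits). This breaks
bisection's assumption of a single good-to-bad transition, and it is
more common than you'd hope.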
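
One way to detect a non-monotonic history is to count the transitions
in an oldest-first list of results. This is only a sketch: the
"good"/"bad"/"skip" strings and both function names are assumptions
made for illustration.

```python
from typing import List, Sequence, Tuple


def find_transitions(verdicts: Sequence[str]) -> Tuple[List[int], List[int]]:
    """Return the indices where the result flips, oldest commit first.

    The first list holds good-to-bad flips, the second bad-to-good.
    Skipped commits are ignored when comparing neighbours.
    """
    good_to_bad: List[int] = []
    bad_to_good: List[int] = []
    previous = None
    for index, verdict in enumerate(verdicts):
        if verdict == "skip":
            continue
        if previous == "good" and verdict == "bad":
            good_to_bad.append(index)
        elif previous == "bad" and verdict == "good":
            bad_to_good.append(index)
        previous = verdict
    return good_to_bad, bad_to_good


def is_monotonic(verdicts: Sequence[str]) -> bool:
    """True when there is at most one good-to-bad flip and no bad-to-good flip."""
    good_to_bad, bad_to_good = find_transitions(verdicts)
    return len(good_to_bad) <= 1 and not bad_to_good
```

For example, is_monotonic(["good", "bad", "good", "bad"]) returns
False, and find_transitions on the same list reports where the flips
happen so each one can be investigated separately.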