feat: add fp-stability command for Verrou-based FP instability testing #1403

Draft
sbryngelson wants to merge 4 commits into MFlowCode:master from sbryngelson:fp-stability

sbryngelson (Member) commented May 6, 2026

Closes #650

Summary

  • Adds ./mfc.sh fp-stability — a persistent floating-point stability test suite using Verrou's random IEEE-754 rounding mode
  • Adds two 1-D test cases targeting known ill-conditioned operations in MFC
  • Adds a GitHub Actions workflow (.github/workflows/fp-stability.yml) that builds Verrou and runs the suite on every push

What it tests

For each case, the runner executes one nearest-rounding reference run and N random-rounding runs under Verrou, then reports the maximum L∞ deviation of the random runs from the reference and compares it against a threshold:

| Case | Grid | Physics | Variables compared | Ill-conditioning probed |
| --- | --- | --- | --- | --- |
| sod_strong | 25 cells, 5 steps | Ideal gas, p_L/p_R=100,000, WENO5+HLLC | density, energy | HLLC xi factor (s_L−vel_L)/(s_L−s_S) cancels near sonic contact |
| water_stiffened | 25 cells, 5 steps | Stiffened EOS (pi_inf=4046), WENO5+HLLC | density, pressure | p=(E−pi_inf)/gamma loses ~4 decimal digits (pi_inf/p_right ≈ 40,000) |
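
As a rough illustration of the water_stiffened row, the digits lost in p=(E−pi_inf)/gamma can be estimated from the condition number of the subtraction. The sketch below is not taken from MFC: it assumes gamma=1 and neglects kinetic energy so that E = gamma*p + pi_inf, and the function names are hypothetical.

```python
import math


def pressure_condition_number(p: float, pi_inf: float, gamma: float = 1.0) -> float:
    """Amplification of a relative error in E when recovering p = (E - pi_inf) / gamma.

    Assumes E = gamma * p + pi_inf (no kinetic energy), so the factor
    E / (E - pi_inf) simplifies to (gamma * p + pi_inf) / (gamma * p).
    """
    return (gamma * p + pi_inf) / (gamma * p)


def digits_lost(p: float, pi_inf: float, gamma: float = 1.0) -> float:
    """Approximate number of decimal digits lost to cancellation."""
    return math.log10(pressure_condition_number(p, pi_inf, gamma))


if __name__ == "__main__":
    pi_inf = 4046.0
    p_right = pi_inf / 40000.0  # ratio quoted in the table above
    print(f"condition number: {pressure_condition_number(p_right, pi_inf):.0f}")
    print(f"digits lost:      {digits_lost(p_right, pi_inf):.1f}")
```

With pi_inf/p_right = 40,000 the estimate is log10(40,001), about 4.6, which is consistent with the "~4 decimal digits" quoted in the table.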

CI

Verrou (Valgrind 3.26.0 + edf-hpc/verrou@a58d434) is built from source and cached by commit hash. First run ~25 min uncached; subsequent runs ~10–12 min with cache hit.

Test plan

  • Verify CI passes on this branch
  • Adjust thresholds if baseline deviations shift on the GitHub runner architecture
  • Add additional cases as new ill-conditioning sources are identified

Adds ./mfc.sh fp-stability — a persistent floating-point stability test
suite using Verrou's random IEEE-754 rounding mode.

For each registered test case the runner:
  1. Generates initial conditions via pre_process
  2. Runs simulation once with --rounding-mode=nearest (reference)
  3. Runs simulation N times with --rounding-mode=random
  4. Reports max L-inf deviation vs threshold (PASS/FAIL)
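
The comparison in step 4 can be sketched as follows. This is a minimal illustration, not the code in toolchain/mfc/fp_stability.py: it assumes an absolute (not relative) deviation, a pass condition of "less than or equal to the threshold", and hypothetical function names.

```python
from typing import Sequence


def max_linf_deviation(reference: Sequence[float],
                       samples: Sequence[Sequence[float]]) -> float:
    """Largest absolute pointwise difference between any sample and the reference."""
    worst = 0.0
    for sample in samples:
        if len(sample) != len(reference):
            raise ValueError("sample and reference must have the same length")
        # L-inf norm of (sample - reference) for this random-rounding run
        deviation = max((abs(s - r) for s, r in zip(sample, reference)), default=0.0)
        worst = max(worst, deviation)
    return worst


def verdict(reference: Sequence[float],
            samples: Sequence[Sequence[float]],
            threshold: float) -> str:
    """Return "PASS" if the worst deviation stays within the threshold, else "FAIL"."""
    return "PASS" if max_linf_deviation(reference, samples) <= threshold else "FAIL"


if __name__ == "__main__":
    nearest = [1.0, 2.0]                       # reference run
    random_runs = [[1.0, 2.5], [0.75, 2.0]]    # N = 2 random-rounding runs
    print(max_linf_deviation(nearest, random_runs), verdict(nearest, random_runs, 0.5))
```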

Two cases probe known ill-conditioning in MFC:
  - sod_strong: 1-D Sod p_L/p_R=100,000 — HLLC xi-factor cancellation
    (s_L - vel_L)/(s_L - s_S) near sonic contact
  - water_stiffened: 1-D water shock pi_inf=4046 — pressure recovery
    p=(E-pi_inf)/gamma loses ~4 decimal digits on low-pressure side

Requires a Verrou-enabled Valgrind at $VERROU_HOME/bin/valgrind
(default: $HOME/.local/verrou). Silently skips if not found.
Binaries are auto-discovered from build/install/ or passed explicitly.
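
The Valgrind lookup described above could look roughly like this. It is a sketch under stated assumptions: the function name and the executable check are hypothetical, and only the $VERROU_HOME/bin/valgrind path and the $HOME/.local/verrou default come from this description.

```python
import os
from typing import Mapping, Optional


def find_verrou_valgrind(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the path to a Verrou-enabled valgrind, or None if it is not installed."""
    env = os.environ if env is None else env
    # Fall back to $HOME/.local/verrou when VERROU_HOME is unset or empty
    default_home = os.path.join(env.get("HOME", os.path.expanduser("~")), ".local", "verrou")
    verrou_home = env.get("VERROU_HOME") or default_home
    candidate = os.path.join(verrou_home, "bin", "valgrind")
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None  # caller can skip the suite instead of failing


if __name__ == "__main__":
    path = find_verrou_valgrind()
    print(path if path else "Verrou not found; skipping fp-stability")
```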

github-actions Bot commented May 6, 2026

Claude Code Review

Head SHA: 5c8eaab

Files changed: 5

  • .github/workflows/fp-stability.yml
  • .gitignore
  • toolchain/main.py
  • toolchain/mfc/cli/commands.py
  • toolchain/mfc/fp_stability.py

Findings

1. cons.unindent() skipped on exception — console permanently indented after any case error

File: toolchain/mfc/fp_stability.py

In _run_case, cons.indent() is called before the try block, but cons.unindent() is placed after the try/finally statement, so it is skipped whenever an exception propagates out:

cons.indent()
...
work_dir = tempfile.mkdtemp(...)
try:
    _run_preprocess(...)            # can raise MFCException
    _run_simulation_verrou(...)     # can raise MFCException
    ...
finally:
    shutil.rmtree(work_dir, ignore_errors=True)

cons.unindent()   # ← never reached when MFCException propagates out
cons.print()
return passed

When _run_preprocess or _run_simulation_verrou raises MFCException, the exception bypasses cons.unindent(). The caller (fp_stability()) catches the exception and continues to the next case, but every subsequent cons.print() call will be at the wrong indentation level for all remaining cases.

cons.unindent() and cons.print() should be moved inside the finally block alongside shutil.rmtree.
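
A minimal sketch of the suggested shape, using a stand-in console class so that it runs on its own. DemoConsole and run_case are hypothetical names; the real cons object and _run_case in the toolchain may differ in detail.

```python
import shutil
import tempfile
from typing import Callable


class DemoConsole:
    """Stand-in for the toolchain console: tracks an indentation level."""

    def __init__(self) -> None:
        self.level = 0
        self.lines = []

    def indent(self) -> None:
        self.level += 1

    def unindent(self) -> None:
        self.level = max(0, self.level - 1)

    def print(self, msg: str = "") -> None:
        self.lines.append("  " * self.level + msg)


def run_case(cons: DemoConsole, body: Callable[[str], bool]) -> bool:
    """Run one case; indentation is restored even if body raises."""
    cons.indent()
    work_dir = tempfile.mkdtemp(prefix="fp_demo_")
    try:
        return body(work_dir)
    finally:
        # Cleanup and console state are restored on both success and failure
        shutil.rmtree(work_dir, ignore_errors=True)
        cons.unindent()
        cons.print()


if __name__ == "__main__":
    cons = DemoConsole()

    def failing(_work_dir: str) -> bool:
        raise RuntimeError("simulated case failure")

    try:
        run_case(cons, failing)
    except RuntimeError:
        pass
    print("indent level after failure:", cons.level)
```

Because the return sits inside the try block, the finally clause still runs before the value is handed back, so the success path and the error path leave the console in the same state.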


2. _exclude_args is defined but never called — dead code

File: toolchain/mfc/fp_stability.py

def _exclude_args(exclude_file: str) -> list:
    """Return --exclude flag if a verrou exclusion file is provided."""
    if exclude_file and os.path.isfile(exclude_file):
        return [f"--exclude={exclude_file}"]
    return []

This function is never called anywhere in the module. All invocations of _run_simulation_verrou hard-code [] for exclude_args, and there is no corresponding CLI argument for an exclusion file in FP_STABILITY_COMMAND. The function should either be wired up (add a --exclude CLI argument and thread it through _run_case) or removed.
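
If the function is kept, the wiring could look roughly like this. The sketch uses argparse as a stand-in for MFC's own command definitions, and make_parser and build_verrou_command are hypothetical helpers; only exclude_args mirrors the function quoted above.

```python
import argparse
import os
from typing import List, Optional


def exclude_args(exclude_file: Optional[str]) -> List[str]:
    """Return the --exclude flag if a Verrou exclusion file is provided and exists."""
    if exclude_file and os.path.isfile(exclude_file):
        return [f"--exclude={exclude_file}"]
    return []


def make_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI: exposes the exclusion file as an optional argument."""
    parser = argparse.ArgumentParser(prog="fp-stability")
    parser.add_argument("--exclude", default=None, metavar="FILE",
                        help="Verrou exclusion file")
    return parser


def build_verrou_command(valgrind: str, binary: str, rounding_mode: str,
                         exclude_file: Optional[str] = None) -> List[str]:
    """Assemble the valgrind command line, threading the exclusion file through."""
    return [
        valgrind,
        "--tool=verrou",
        f"--rounding-mode={rounding_mode}",
        *exclude_args(exclude_file),  # empty when no file was given
        binary,
    ]


if __name__ == "__main__":
    args = make_parser().parse_args()
    print(build_verrou_command("valgrind", "./simulation", "random", args.exclude))
```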

Development

Successfully merging this pull request may close these issues.

Valgrind verrou to check for sketchy float operations
