Conversation
Greptile Summary

This PR adds a sanitize_metrics() utility to convert numpy types to plain Python types. Confidence Score: 5/5 — Safe to merge. All prior P1 concerns are resolved and only a minor annotation gap remains. Both previously flagged issues (np.bool_ coverage and np.integer → int coercion) are now correctly handled. The only remaining finding is a P2 return-type annotation omission (bool not listed). No behavioral or serialization bugs remain. No files require special attention.

Important Files Changed
Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[metric value] --> B{isinstance?}
B -- np.bool_ --> C[bool v]
B -- np.floating --> D[float v]
B -- np.integer --> E[int v]
B -- np.ndarray --> F[v.tolist]
B -- else --> G[pass through]
C & D & E & F & G --> H[sanitized dict]
H --> I1[MetricsLogger.append_job_metrics]
H --> I2[policy_runner print]
I1 --> J1[print_metrics]
I1 --> J2[save_metrics_to_file JSON]
Reviews (5): Last reviewed commit: "Merge branch 'main' into xyao/fix/saniti..."
if isinstance(v, (np.floating, np.integer)):
    sanitized[k] = float(v)
np.integer coerced to float, not int
np.integer values (e.g. np.int64(5)) are cast to float, so a count metric like num_episodes would become 5.0 rather than 5. The return-type annotation already promises int | float | list, but int is never actually produced. Coercing to int for the integer branch would honour the annotation and keep counts as integers in the JSON output.
Suggested change:

if isinstance(v, (np.floating, np.integer)):
    sanitized[k] = float(v)

becomes:

if isinstance(v, np.floating):
    sanitized[k] = float(v)
elif isinstance(v, np.integer):
    sanitized[k] = int(v)
🤖 Isaac Lab Review Bot
Summary
This PR adds a sanitize_metrics() utility to convert numpy scalars and arrays to native Python types before logging and JSON serialization. The fix is well-motivated — np.mean() returns np.float64 (not Python float), which prints as np.float64(0.85) and breaks json.dump(). The approach is sound and correctly placed at both ingestion (append_job_metrics) and ad-hoc printing (policy_runner.py).
Design Assessment
Design is sound. Sanitizing at the MetricsLogger.append_job_metrics() boundary is the right call — it ensures all downstream consumers (print_metrics(), save_metrics_to_file()) get clean types without each needing their own conversion. The additional sanitization in policy_runner.py for the standalone print path is also correct since that path doesn't go through MetricsLogger.
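The boundary pattern described above can be sketched in a few lines. This is a minimal illustration, not the PR's actual code: the class internals and file-writing details are assumptions, and it already folds in the review's np.bool_ and np.integer suggestions.

```python
import json
from typing import Any

import numpy as np


def sanitize_metrics(metrics: dict[str, Any]) -> dict[str, Any]:
    """Convert numpy scalars/arrays to plain Python types (sketch)."""
    sanitized: dict[str, Any] = {}
    for k, v in metrics.items():
        if isinstance(v, np.bool_):
            sanitized[k] = bool(v)
        elif isinstance(v, np.floating):
            sanitized[k] = float(v)
        elif isinstance(v, np.integer):
            sanitized[k] = int(v)
        elif isinstance(v, np.ndarray):
            sanitized[k] = v.tolist()
        else:
            sanitized[k] = v
    return sanitized


class MetricsLogger:
    """Sanitizes once at ingestion so every downstream consumer sees clean types."""

    def __init__(self) -> None:
        self._job_metrics: list[dict[str, Any]] = []

    def append_job_metrics(self, metrics: dict[str, Any]) -> None:
        # Sanitize at the boundary; print/save never see numpy types.
        self._job_metrics.append(sanitize_metrics(metrics))

    def print_metrics(self) -> None:
        for m in self._job_metrics:
            print(m)  # plain floats/ints, no np.float64(...) reprs

    def save_metrics_to_file(self, path: str) -> None:
        with open(path, "w") as f:
            json.dump(self._job_metrics, f, indent=2)
```

Because conversion happens in `append_job_metrics()`, neither `print_metrics()` nor `save_metrics_to_file()` needs its own type handling.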
Findings
🟡 Warning: np.integer converted to float instead of int — isaaclab_arena/metrics/metrics_logger.py:16
The function converts both np.floating and np.integer to float. While this works for JSON serialization, it's semantically lossy — integer metrics (like num_episodes from compute_metrics()) would become floats in the output. Currently num_episodes comes from len() which returns native int, so it passes through the else branch unaffected. But if any metric ever produces an np.int64 value, it would be stored as float in the JSON output and displayed with .4f formatting in print_metrics().
Consider preserving the integer type:
if isinstance(v, np.floating):
    sanitized[k] = float(v)
elif isinstance(v, np.integer):
    sanitized[k] = int(v)
elif isinstance(v, np.ndarray):
    sanitized[k] = v.tolist()
else:
    sanitized[k] = v

🟡 Warning: np.bool_ not handled — isaaclab_arena/metrics/metrics_logger.py:14
np.bool_ is not a subclass of np.floating or np.integer, so it falls through to the else branch and passes through unsanitized. np.bool_ is not JSON-serializable (json.dump raises TypeError). While current metric implementations use np.mean() which returns np.float64, if a metric ever returns a np.bool_ value directly, it would silently break JSON export.
Adding np.bool_ handling would make the function more robust:
elif isinstance(v, np.bool_):
    sanitized[k] = bool(v)

🔵 Suggestion: Inline import could be a top-level import — isaaclab_arena/evaluation/policy_runner.py:204
The import of sanitize_metrics is done inline inside the if metrics is not None: block. This file already imports from isaaclab_arena modules at the top level (e.g., isaaclab_arena.cli, isaaclab_arena.evaluation, isaaclab_arena.utils). Moving this to the top-level imports would be more consistent with the file's existing style, unless there's a specific circular-import reason for the inline placement.
🔵 Suggestion: Return type annotation could be more precise — isaaclab_arena/metrics/metrics_logger.py:13
The return type dict[str, int | float | list] doesn't account for values that pass through the else branch (e.g., native Python str, bool, or other types). dict[str, Any] would be more accurate, or the union could be expanded to include all expected pass-through types.
Test Coverage
- Bug fix PR: No regression test included. This is understandable for a serialization fix — the bug manifests at runtime with numpy-producing metric implementations, and the existing test infrastructure may not exercise the JSON export path. A unit test for sanitize_metrics() would be lightweight and valuable:
def test_sanitize_metrics():
    import numpy as np
    from isaaclab_arena.metrics.metrics_logger import sanitize_metrics

    metrics = {
        "success_rate": np.float64(0.85),
        "count": np.int64(42),
        "values": np.array([1.0, 2.0]),
        "name": "test",
    }
    result = sanitize_metrics(metrics)
    assert isinstance(result["success_rate"], float)
    assert isinstance(result["count"], float)  # or int, if the fix above is applied
    assert isinstance(result["values"], list)
    assert result["name"] == "test"

This is a should-have rather than a blocker — the fix is straightforward enough that the risk of regression is low.
CI Status
⏳ Pre-commit check is pending.
Verdict
Minor fixes needed
The core approach is correct and well-placed. The np.integer → float conversion is the most notable issue — splitting integer and floating-point handling would preserve type fidelity. The np.bool_ gap is a minor robustness concern. Neither is blocking, but both would make the sanitization utility more correct and future-proof.
def sanitize_metrics(metrics: dict[str, Any]) -> dict[str, int | float | list]:
    """Convert numpy scalars/arrays in a metrics dict to plain Python types."""
    sanitized = {}
🟡 Warning: np.bool_ not covered
np.bool_ is not a subclass of np.floating or np.integer, so it falls through to the else branch unsanitized. np.bool_ is not JSON-serializable — json.dump() will raise TypeError.
While current metrics use np.mean() (which returns np.float64), adding np.bool_ handling would make this utility more robust:
elif isinstance(v, np.bool_):
    sanitized[k] = bool(v)
alexmillane
left a comment
Couple of nits. Thank you for doing this!
if metrics is not None:
    # Each rank prints its own metrics as it can be different due to random seed
    print(f"[Rank {local_rank}/{world_size}] Metrics: {metrics}")
    from isaaclab_arena.metrics.metrics_logger import sanitize_metrics
Suggestion to move this import to the top of the file. Sometimes we need these locally-situated imports because of isaac-sim, but let's try to avoid them unless required.
from typing import Any


def sanitize_metrics(metrics: dict[str, Any]) -> dict[str, int | float | list]:
What do you think about renaming this to metrics_to_plain_python_types? It makes explicit what the function is doing. "Sanitize" could mean almost anything.
Force-pushed b01adfc to f2c1431 (Compare)
Force-pushed ea88db6 to 2139155 (Compare)
Summary
Sanitize numpy types in metrics for readable logging and JSON export. Fix to https://nvbugspro.nvidia.com/bug/6077892
Detailed description
- Metric values can be numpy types such as np.float32, np.int64, or np.ndarray. These print as np.float32(0.85) instead of 0.85 and are not JSON-serializable by default.
- Added a sanitize_metrics() utility in metrics_logger.py that converts numpy scalars to float and numpy arrays to Python lists. Used it in policy_runner.py (metrics print after rollout) and MetricsLogger.append_job_metrics() (sanitizes on ingestion so both print_metrics() and save_metrics_to_file() get clean types).
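The failure mode being fixed can be reproduced in a few lines. This is a minimal illustration, not code from the PR; the metric names are made up.

```python
import json

import numpy as np

# Metric values produced by numpy ops are numpy scalars, not Python types.
metrics = {"success_rate": np.float32(0.5), "num_episodes": np.int64(5)}

# Under NumPy 2.x this prints the numpy reprs, e.g. np.float32(0.5).
print(metrics)

# np.float32 / np.int64 are not JSON-serializable by default.
try:
    json.dumps(metrics)
except TypeError as e:
    print("JSON export fails:", e)

# Converting to plain Python types fixes both printing and serialization.
clean = {
    "success_rate": float(metrics["success_rate"]),
    "num_episodes": int(metrics["num_episodes"]),
}
print(json.dumps(clean))  # {"success_rate": 0.5, "num_episodes": 5}
```

Note that np.float64 is a subclass of Python float and would serialize fine; np.float32 and np.int64 are the types that trigger the TypeError.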