Commit 0031c0f

Override reward field with a strict float (gt=0, lt=1) schema; the base Observation allows bool/int values, which could be exactly 0 or 1
1 parent 849ed7e commit 0031c0f

1 file changed

Lines changed: 8 additions & 0 deletions

models.py

@@ -62,6 +62,14 @@ class SignalSnapshot(BaseModel):
 class OpsGauntletObservation(Observation):
     """Observation returned after reset and each step."""
 
+    # Override base-class reward with strict (0, 1) float constraint so the
+    # JSON schema explicitly tells the validator that values are bounded.
+    reward: float = Field(
+        default=0.5,
+        gt=0.0,
+        lt=1.0,
+        description="Reward signal from the last action, strictly between 0 and 1.",
+    )
     task_id: str = Field(..., description="Current task identifier.")
     title: str = Field(..., description="Short task title.")
     difficulty: str = Field(..., description="Task difficulty bucket.")
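To illustrate what the override changes, here is a minimal sketch, assuming pydantic is installed. `BaseObservation` is a hypothetical stand-in for the real `Observation` base class, whose exact field types are not shown in this diff; `StrictObservation`, `is_accepted`, and `reward_schema` are likewise illustrative names, not part of the commit.

```python
from typing import Union

from pydantic import BaseModel, Field, ValidationError


class BaseObservation(BaseModel):
    """Hypothetical stand-in for the real base class; its field types are assumed."""

    reward: Union[bool, int, float, None] = None


class StrictObservation(BaseObservation):
    """Narrows reward to a float strictly inside (0, 1), mirroring the diff."""

    reward: float = Field(default=0.5, gt=0.0, lt=1.0)


def is_accepted(value) -> bool:
    """Return True if StrictObservation accepts value as a reward."""
    try:
        StrictObservation(reward=value)
    except ValidationError:
        return False
    return True


def reward_schema() -> dict:
    """Return the JSON-schema fragment for reward (works on pydantic v2 or v1)."""
    make_schema = (
        getattr(StrictObservation, "model_json_schema", None)
        or StrictObservation.schema
    )
    return make_schema()["properties"]["reward"]


if __name__ == "__main__":
    # Boundary values and bool/int inputs are rejected; interior floats pass.
    for candidate in (0.25, 0, 1, True, 0.0, 1.0):
        print(repr(candidate), is_accepted(candidate))
    print(reward_schema())
```

One consequence worth noting: because both bounds are exclusive, any code that produces a reward of exactly 0 or 1 would now fail validation and would need to map such values into the open interval before constructing the observation.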
