From 525f92f6bba8193e4b2f145f3d11347a93318f48 Mon Sep 17 00:00:00 2001
From: benediktstroebl <50178209+benediktstroebl@users.noreply.github.com>
Date: Mon, 11 May 2026 08:51:55 -0700
Subject: [PATCH] sync: fix reward priority, add Daytona JWT auth docs

- verifier now checks reward.json before reward.txt (harbor#1620)
- daytona preflight now accepts JWT auth (DAYTONA_JWT_TOKEN + DAYTONA_ORGANIZATION_ID) as alternative to DAYTONA_API_KEY (harbor#1620)
---
 skills/harbor-adapter-creator/SKILL.md | 2 +-
 skills/harbor-cli/SKILL.md             | 2 ++
 skills/harbor-task-creator/SKILL.md    | 4 ++--
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/skills/harbor-adapter-creator/SKILL.md b/skills/harbor-adapter-creator/SKILL.md
index 6f7379f..60e9e00 100644
--- a/skills/harbor-adapter-creator/SKILL.md
+++ b/skills/harbor-adapter-creator/SKILL.md
@@ -282,7 +282,7 @@ This is the most common source of adapter validation failures. The verifier uplo
 
 **test.sh MUST write a reward value to `/logs/verifier/reward.txt`** -- a single float (e.g., `0` or `1` or `0.75`). If this file is missing, the verifier raises `RewardFileNotFoundError` and the trial fails.
 
-Optionally, test.sh can also write `/logs/verifier/reward.json` for structured reward data (e.g., `{"reward": 1.0, "quality": 0.8}`). The verifier checks `reward.txt` first, then falls back to `reward.json`.
+Optionally, test.sh can also write `/logs/verifier/reward.json` for structured reward data (e.g., `{"reward": 1.0, "quality": 0.8}`). The verifier checks `reward.json` first, then falls back to `reward.txt`.
 
 ### Verification Patterns
 
diff --git a/skills/harbor-cli/SKILL.md b/skills/harbor-cli/SKILL.md
index 5a2ba5b..b6d1fce 100644
--- a/skills/harbor-cli/SKILL.md
+++ b/skills/harbor-cli/SKILL.md
@@ -492,3 +492,5 @@ harbor jobs summarize ./jobs/my-job
 **`--no-cleanup` for debugging.** Pass this to `harbor trials start` to keep the container running after a failed trial so you can inspect it.
 
 **Retry defaults skip common errors.** `--retry-exclude` defaults to `AgentTimeoutError`, `VerifierTimeoutError`, `RewardFileNotFoundError`, `RewardFileEmptyError`, `VerifierOutputParseError`. These usually indicate task bugs, not transient failures.
+
+**Daytona requires credentials.** Use `DAYTONA_API_KEY`, or both `DAYTONA_JWT_TOKEN` and `DAYTONA_ORGANIZATION_ID` (JWT auth). If neither is set, `harbor run -e daytona` fails at preflight.
diff --git a/skills/harbor-task-creator/SKILL.md b/skills/harbor-task-creator/SKILL.md
index 26d1a69..80bd843 100644
--- a/skills/harbor-task-creator/SKILL.md
+++ b/skills/harbor-task-creator/SKILL.md
@@ -63,7 +63,7 @@ Understanding the runtime flow prevents most common authoring mistakes:
 3. **Agent runs** — The agent connects to the container, reads instruction.md (provided separately, not in the container), and works in the environment.
 4. **Upload tests** — After the agent finishes, Harbor uploads `tests/` to `/tests/` inside the container using `docker compose cp`.
 5. **Execute test.sh** — Harbor runs `chmod +x /tests/test.sh` then executes it. Working directory is `/tests/`. Environment variables from `[verifier].env` in task.toml are injected.
-6. **Parse reward** — Harbor reads `/logs/verifier/reward.txt` (priority) or `/logs/verifier/reward.json`. If neither exists, the trial errors with `RewardFileNotFoundError`.
+6. **Parse reward** — Harbor reads `/logs/verifier/reward.json` (priority) or `/logs/verifier/reward.txt`. If neither exists, the trial errors with `RewardFileNotFoundError`.
 
 This means: the agent never sees `tests/` or `solution/`. Tests run after the agent is done. Reward files must always be written.
 
@@ -227,7 +227,7 @@ The test script runs inside the container AFTER the agent finishes. It determine
 **Reward contract:**
 - Write a float to `/logs/verifier/reward.txt` (e.g., `1` for pass, `0` for fail, `0.5` for partial)
 - Alternatively, write JSON to `/logs/verifier/reward.json` (e.g., `{"reward": 1.0}`)
-- `reward.txt` takes priority — if both exist, Harbor reads `reward.txt` and ignores `reward.json`
+- `reward.json` takes priority — if both exist, Harbor reads `reward.json` and ignores `reward.txt`
 - `reward.json` supports multiple named rewards (e.g., `{"reward": 1.0, "quality": 0.8, "style": 0.6}`)
 - MUST write the reward file in ALL code paths — if missing, the trial fails with `RewardFileNotFoundError`; if empty, `RewardFileEmptyError`