2 changes: 1 addition & 1 deletion skills/harbor-adapter-creator/SKILL.md
@@ -282,7 +282,7 @@ This is the most common source of adapter validation failures. The verifier uplo

**test.sh MUST write a reward value to `/logs/verifier/reward.txt`** -- a single float (e.g., `0` or `1` or `0.75`). If this file is missing, the verifier raises `RewardFileNotFoundError` and the trial fails.

- Optionally, test.sh can also write `/logs/verifier/reward.json` for structured reward data (e.g., `{"reward": 1.0, "quality": 0.8}`). The verifier checks `reward.txt` first, then falls back to `reward.json`.
+ Optionally, test.sh can also write `/logs/verifier/reward.json` for structured reward data (e.g., `{"reward": 1.0, "quality": 0.8}`). The verifier checks `reward.json` first, then falls back to `reward.txt`.
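
The corrected lookup order (`reward.json` first, then `reward.txt`) can be sketched as follows. This is a minimal illustration, not Harbor's actual implementation: the function name `read_rewards` and the local `RewardFileNotFoundError` class are hypothetical stand-ins, and empty-file handling (`RewardFileEmptyError`) is omitted.

```python
import json
from pathlib import Path


class RewardFileNotFoundError(Exception):
    """Hypothetical stand-in for the error named in the docs, not Harbor's class."""


def read_rewards(verifier_dir: Path) -> dict[str, float]:
    """Return named rewards, preferring reward.json over reward.txt."""
    json_path = verifier_dir / "reward.json"
    txt_path = verifier_dir / "reward.txt"

    # reward.json has priority and may carry several named rewards.
    if json_path.exists():
        data = json.loads(json_path.read_text())
        return {name: float(value) for name, value in data.items()}

    # Fall back to the single float in reward.txt.
    if txt_path.exists():
        return {"reward": float(txt_path.read_text().strip())}

    raise RewardFileNotFoundError(f"no reward file found in {verifier_dir}")
```

Under this sketch, a directory containing both files yields the values from `reward.json`, and `reward.txt` is never read.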

### Verification Patterns

2 changes: 2 additions & 0 deletions skills/harbor-cli/SKILL.md
@@ -492,3 +492,5 @@ harbor jobs summarize ./jobs/my-job
**`--no-cleanup` for debugging.** Pass this to `harbor trials start` to keep the container running after a failed trial so you can inspect it.

**Retry defaults skip common errors.** `--retry-exclude` defaults to `AgentTimeoutError`, `VerifierTimeoutError`, `RewardFileNotFoundError`, `RewardFileEmptyError`, `VerifierOutputParseError`. These usually indicate task bugs, not transient failures.
+
+ **Daytona requires credentials.** Use `DAYTONA_API_KEY`, or both `DAYTONA_JWT_TOKEN` and `DAYTONA_ORGANIZATION_ID` (JWT auth). If neither is set, `harbor run -e daytona` fails at preflight.
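
The credential rule in the added paragraph amounts to "API key, or JWT token plus organization ID". A minimal sketch of that check, assuming only the three environment variable names given above; the function `daytona_credentials_present` is hypothetical and is not part of the Harbor CLI:

```python
from collections.abc import Mapping


def daytona_credentials_present(env: Mapping[str, str]) -> bool:
    """Return True if env holds one complete set of Daytona credentials."""
    # Option 1: a single API key.
    has_api_key = bool(env.get("DAYTONA_API_KEY"))

    # Option 2: JWT auth, which needs both the token and the organization ID.
    has_jwt = bool(env.get("DAYTONA_JWT_TOKEN")) and bool(
        env.get("DAYTONA_ORGANIZATION_ID")
    )
    return has_api_key or has_jwt
```

Calling it with `os.environ` before launching a job gives an early local signal. Note that a JWT token without an organization ID counts as missing credentials in this sketch.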
4 changes: 2 additions & 2 deletions skills/harbor-task-creator/SKILL.md
@@ -63,7 +63,7 @@ Understanding the runtime flow prevents most common authoring mistakes:
3. **Agent runs** — The agent connects to the container, reads instruction.md (provided separately, not in the container), and works in the environment.
4. **Upload tests** — After the agent finishes, Harbor uploads `tests/` to `/tests/` inside the container using `docker compose cp`.
5. **Execute test.sh** — Harbor runs `chmod +x /tests/test.sh` then executes it. Working directory is `/tests/`. Environment variables from `[verifier].env` in task.toml are injected.
- 6. **Parse reward** — Harbor reads `/logs/verifier/reward.txt` (priority) or `/logs/verifier/reward.json`. If neither exists, the trial errors with `RewardFileNotFoundError`.
+ 6. **Parse reward** — Harbor reads `/logs/verifier/reward.json` (priority) or `/logs/verifier/reward.txt`. If neither exists, the trial errors with `RewardFileNotFoundError`.

This means: the agent never sees `tests/` or `solution/`. Tests run after the agent is done. Reward files must always be written.

@@ -227,7 +227,7 @@ The test script runs inside the container AFTER the agent finishes. It determine
**Reward contract:**
- Write a float to `/logs/verifier/reward.txt` (e.g., `1` for pass, `0` for fail, `0.5` for partial)
- Alternatively, write JSON to `/logs/verifier/reward.json` (e.g., `{"reward": 1.0}`)
- - `reward.txt` takes priority — if both exist, Harbor reads `reward.txt` and ignores `reward.json`
+ - `reward.json` takes priority — if both exist, Harbor reads `reward.json` and ignores `reward.txt`
- `reward.json` supports multiple named rewards (e.g., `{"reward": 1.0, "quality": 0.8, "style": 0.6}`)
- MUST write the reward file in ALL code paths — if missing, the trial fails with `RewardFileNotFoundError`; if empty, `RewardFileEmptyError`
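
One way to satisfy the "ALL code paths" requirement is to write the reward in a `finally` block, so a crashing check still leaves a non-empty file behind. The sketch below assumes test.sh delegates to a Python helper. `run_and_record` and the `check` callable are hypothetical names for illustration, not part of Harbor.

```python
from pathlib import Path
from typing import Callable


def run_and_record(check: Callable[[], float], reward_path: Path) -> float:
    """Run check() and always write its result (or 0.0 on error) to reward_path."""
    reward = 0.0
    try:
        reward = float(check())
    except Exception:
        # Any failure in the check counts as a failed trial, not a missing file.
        reward = 0.0
    finally:
        # Runs on success and on failure, so the file is never missing or empty.
        reward_path.parent.mkdir(parents=True, exist_ok=True)
        reward_path.write_text(f"{reward}\n")
    return reward
```

In a real task the second argument would be `Path("/logs/verifier/reward.txt")`. The same pattern works in plain shell by writing a default `0` first and overwriting it on success.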
