NoesisVision · szjanikowski · May 9, 2026
@@ -14,6 +14,52 @@ description: |
 
 Create and configure coding agent benchmarks for evaluation with `nasde`. A benchmark is a set of coding tasks that AI agents solve inside isolated Docker containers, scored both by functional tests (pass/fail) and by an LLM-as-a-Judge architecture assessment.
 
+## Critical: line endings on Windows (read this first)
+
+Benchmark scripts execute inside **Linux** sandboxes (Docker, Daytona). If `tests/test.sh`, `solution/solve.sh`, or `environment/Dockerfile` are checked out with **CRLF** line endings (the Windows git default when `core.autocrlf=true` and there is no `.gitattributes`), every trial fails immediately with:
+
+```
+bash: line 1: /tests/test.sh: cannot execute: required file not found
+```
+
+…because the kernel reads the shebang as `#!/bin/bash\r` and tries to execute a non-existent `/bin/bash\r`. The agent finishes its work, but the verifier never runs and Harbor reports `RewardFileNotFoundError`.
+
+**Mitigation (always do this for a new benchmark — `nasde init` does it for you, but verify):**
+
+1. The benchmark repo MUST have a `.gitattributes` file enforcing LF for shell scripts and Dockerfiles. The minimum content:
+   ```gitattributes
+   * text=auto eol=lf
+   *.sh        text eol=lf
+   *.bash      text eol=lf
+   Dockerfile  text eol=lf
+   *.dockerfile text eol=lf
+   docker-compose.yaml text eol=lf
+   docker-compose.yml  text eol=lf
+
+   *.ps1       text eol=crlf
+   *.bat       text eol=crlf
+   *.cmd       text eol=crlf
+   ```
+   `nasde init` writes this automatically. If you are adding a benchmark to an existing repo without `.gitattributes`, create one before adding any task.
+
+2. When **writing** `.sh` or `Dockerfile` content programmatically on Windows, write with explicit LF. Do not use a bare `path.write_text(content)`, which translates `\n`→`\r\n` on Windows; use `path.write_text(content, encoding="utf-8", newline="")` (the `newline` parameter requires Python 3.10+) or open the file in binary mode.
+
+3. After committing on Windows for the first time, run:
+   ```bash
+   git add --renormalize .
+   git commit -m "normalize line endings"
+   ```
+   to fix any files that landed before `.gitattributes` was in place.
+
+4. Sanity check before pushing a new task:
+   ```bash
+   file tasks/<task>/tests/test.sh
+   # Must NOT mention CRLF. For LF-only files, `file` typically omits line-terminator info entirely.
+   # If it says "with CRLF line terminators", fix it (`sed -i 's/\r$//' file`).
+   ```
+
+This applies equally when you're **adding tasks to a benchmark someone else created**: if their repo has no `.gitattributes` and you're on Windows, the scripts you commit can carry CRLF and silently break on their Linux CI.
+
 ## Step 1: Understand what to evaluate
 
 Before creating files, clarify with the user:
@@ -116,6 +162,8 @@ What the agent must NOT do (e.g., don't modify existing tests).
 
 ### environment/Dockerfile (required)
 
+> **Reminder for Windows authors:** the Dockerfile and any helper scripts it `COPY`s in must have LF line endings — Docker tolerates CRLF in some commands but not in `RUN` shell snippets, and any shell script copied with CRLF will hit the same shebang failure as `test.sh`.
+
 ```dockerfile
 FROM <base-image>
 
@@ -137,6 +185,8 @@ The Dockerfile MUST be self-contained — the agent starts working immediately.
 
 ### tests/test.sh (required — Harbor verifier)
 
+> **Reminder for Windows authors:** this file MUST be saved with LF line endings. See "Critical: line endings on Windows" at the top of this skill. CRLF here causes `bash: required file not found` and a wasted trial.
+
 ```bash
 #!/bin/bash
 cd /app
@@ -319,3 +369,11 @@ Before running with a real agent:
    ```bash
    nasde run --variant vanilla --tasks <task-name> --without-eval -C .
    ```
+
+4. **Final pre-flight for Windows authors:** verify no CRLF leaked in:
+   ```bash
+   find tasks -name '*.sh' -exec sh -c 'file "$1" | grep -q CRLF && echo "BAD: $1"' _ {} \;
+   find tasks -name 'Dockerfile' -exec sh -c 'file "$1" | grep -q CRLF && echo "BAD: $1"' _ {} \;
+   # Both should print nothing.
+   ```
+   If anything prints, fix with `sed -i 's/\r$//' <file>` and re-commit.
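
The `find`/`file`/`sed` checks in the hunk above assume a POSIX shell, which a Windows author may not have at hand. A minimal cross-platform sketch of the same scan-and-fix idea in Python follows. The helper names (`is_sandbox_file`, `find_crlf_files`, `normalize_to_lf`) are hypothetical and not part of `nasde`; the sketch only looks at `*.sh`, `*.bash`, and files named `Dockerfile`.

```python
from pathlib import Path

# Hypothetical helpers for illustration; none of these exist in nasde.
SCRIPT_SUFFIXES = {".sh", ".bash"}


def is_sandbox_file(path: Path) -> bool:
    """True for the file kinds that get executed inside the Linux sandbox."""
    return path.suffix in SCRIPT_SUFFIXES or path.name == "Dockerfile"


def find_crlf_files(root: Path) -> list[Path]:
    """Return sandbox-executed files under root that contain a CRLF pair."""
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file() and is_sandbox_file(p) and b"\r\n" in p.read_bytes()
    )


def normalize_to_lf(path: Path) -> None:
    """Rewrite path in place, replacing CRLF with LF using binary I/O."""
    path.write_bytes(path.read_bytes().replace(b"\r\n", b"\n"))
```

Calling `find_crlf_files(Path("tasks"))` should return an empty list for a clean benchmark. Binary reads and writes are used deliberately so that Python's own newline translation cannot mask or reintroduce the problem.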
@@ -19,6 +19,15 @@ Generate NASDE benchmark tasks by mining git history. You analyze commits, diffs
 - An existing NASDE benchmark project (run `nasde init` first, or use the `nasde-benchmark-creator` skill)
 - If the benchmark project doesn't exist yet, create it first — this skill generates tasks, not the project scaffold
 
+## Critical: line endings on Windows (read this first)
+
+When generating `tests/test.sh`, `solution/solve.sh`, or `environment/Dockerfile` on a Windows host, write them with **LF** line endings or every trial fails with `bash: required file not found` (the kernel reads `#!/bin/bash\r` as the shebang). See the full explanation and `.gitattributes` template in the `nasde-benchmark-creator` skill.
+
+Quick rules:
+- The benchmark project MUST have a `.gitattributes` enforcing `*.sh text eol=lf` and `Dockerfile text eol=lf`. `nasde init` creates this. If the existing project lacks it, **create `.gitattributes` before generating any task files**.
+- When writing files programmatically, use `path.write_text(content, encoding="utf-8", newline="")` (Python 3.10+), never the bare default, which translates `\n`→`\r\n` on Windows.
+- Sanity-check after generation: `find tasks/<new-task> -name '*.sh' -o -name 'Dockerfile' | xargs file | grep CRLF` should print nothing.
+
 ## Step 1: Identify the source repository and commit range
 
 Ask the user:
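
The quick rule about `.gitattributes` in the hunk above can be checked mechanically before any task files are generated. The sketch below is a simplified illustration, not something `nasde` provides: `eol_rules`, `missing_lf_rules`, and `REQUIRED_LF_PATTERNS` are hypothetical names, and the parser ignores Git's real pattern matching and precedence, so it only recognizes explicit `*.sh` and `Dockerfile` entries.

```python
# Hypothetical helpers for illustration; simplified, not a full
# .gitattributes parser (no pattern matching, quoting, or precedence).
REQUIRED_LF_PATTERNS = ("*.sh", "Dockerfile")


def eol_rules(gitattributes_text: str) -> dict[str, str]:
    """Map each pattern to its eol value, e.g. {"*.sh": "lf"}."""
    rules: dict[str, str] = {}
    for line in gitattributes_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attrs = line.split()
        for attr in attrs:
            if attr.startswith("eol="):
                rules[pattern] = attr[len("eol="):]
    return rules


def missing_lf_rules(gitattributes_text: str) -> list[str]:
    """Return the required patterns that lack an explicit eol=lf rule."""
    rules = eol_rules(gitattributes_text)
    return [p for p in REQUIRED_LF_PATTERNS if rules.get(p) != "lf"]
```

An empty list from `missing_lf_rules(Path(".gitattributes").read_text())` would suggest that the two explicit rules are present. A catch-all `* text=auto eol=lf` line is not counted by this sketch, even though Git itself would apply it.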

@@ -19,6 +19,15 @@ Build a diverse NASDE benchmark by curating tasks from multiple public GitHub re
 - A clear description of the skill being evaluated (what it does, what kinds of tasks it helps with)
 - Internet access (to browse and clone public repositories)
 
+## Critical: line endings on Windows (read this first)
+
+When generating `tests/test.sh`, `solution/solve.sh`, or `environment/Dockerfile` on a Windows host, write them with **LF** line endings or every trial fails with `bash: required file not found` (the kernel reads `#!/bin/bash\r` as the shebang). See the full explanation and `.gitattributes` template in the `nasde-benchmark-creator` skill.
+
+Quick rules:
+- The benchmark project MUST have a `.gitattributes` enforcing `*.sh text eol=lf` and `Dockerfile text eol=lf`. `nasde init` creates this. If the existing project lacks it, **create `.gitattributes` before generating any task files**.
+- When writing files programmatically, use `path.write_text(content, encoding="utf-8", newline="")` (Python 3.10+), never the bare default, which translates `\n`→`\r\n` on Windows.
+- Sanity-check after generation: `find tasks/<new-task> -name '*.sh' -o -name 'Dockerfile' | xargs file | grep CRLF` should print nothing.
+
 ## Step 1: Understand the skill under test
 
 Ask the user:

@@ -0,0 +1,46 @@
+# Default: let Git detect text vs binary, force LF in working tree.
+# Critical: shell scripts and Dockerfiles MUST be LF — they are executed by
+# Linux interpreters in benchmark sandboxes (Daytona, Docker). CRLF causes
+# `bash: required file not found` because the shebang becomes `#!/bin/bash\r`.
+* text=auto eol=lf
+
+# Source code that runs in Linux containers / cross-platform tooling: force LF.
+*.sh        text eol=lf
+*.bash      text eol=lf
+*.py        text eol=lf
+Dockerfile  text eol=lf
+*.dockerfile text eol=lf
+docker-compose.yaml text eol=lf
+docker-compose.yml  text eol=lf
+*.toml      text eol=lf
+*.yaml      text eol=lf
+*.yml       text eol=lf
+*.json      text eol=lf
+*.md        text eol=lf
+Makefile    text eol=lf
+*.mk        text eol=lf
+
+# PowerShell expects CRLF on Windows. Keep as-is so PS5.1 parses cleanly.
+*.ps1       text eol=crlf
+*.psd1      text eol=crlf
+*.psm1      text eol=crlf
+
+# Windows batch files require CRLF.
+*.bat       text eol=crlf
+*.cmd       text eol=crlf
+
+# Binary assets: never touch line endings.
+*.png       binary
+*.jpg       binary
+*.jpeg      binary
+*.gif       binary
+*.ico       binary
+*.pdf       binary
+*.zip       binary
+*.gz        binary
+*.tgz       binary
+*.tar       binary
+*.whl       binary
+*.so        binary
+*.dll       binary
+*.exe       binary
@@ -101,6 +101,49 @@
 jobs/
 """
 
+GITATTRIBUTES_TEMPLATE = """\
+# Critical: files executed inside benchmark sandboxes (Linux containers via
+# Docker / Daytona / Modal / etc.) MUST be LF. CRLF on a shebang line causes
+# `bash: required file not found` because the kernel reads `#!/bin/bash\\r`.
+* text=auto eol=lf
+
+*.sh        text eol=lf
+*.bash      text eol=lf
+Dockerfile  text eol=lf
+*.dockerfile text eol=lf
+docker-compose.yaml text eol=lf
+docker-compose.yml  text eol=lf
+*.toml      text eol=lf
+*.yaml      text eol=lf
+*.yml       text eol=lf
+*.json      text eol=lf
+*.md        text eol=lf
+*.py        text eol=lf
+
+# PowerShell / Windows batch keep CRLF.
+*.ps1       text eol=crlf
+*.psd1      text eol=crlf
+*.psm1      text eol=crlf
+*.bat       text eol=crlf
+*.cmd       text eol=crlf
+
+# Binary assets — never touch line endings.
+*.png       binary
+*.jpg       binary
+*.jpeg      binary
+*.gif       binary
+*.ico       binary
+*.pdf       binary
+*.zip       binary
+*.gz        binary
+*.tar       binary
+*.tgz       binary
+*.whl       binary
+*.so        binary
+*.dll       binary
+*.exe       binary
+"""
+
 
 def create_project(project_dir: Path, name: str) -> None:
     """Scaffold a new evaluation project structure."""
@@ -111,6 +154,7 @@ def create_project(project_dir: Path, name: str) -> None:
     _write_if_missing(project_dir / "nasde.toml", NASDE_TOML_TEMPLATE.format(name=name))
     _write_if_missing(project_dir / "assessment_dimensions.json", ASSESSMENT_DIMENSIONS_TEMPLATE)
     _write_if_missing(project_dir / ".gitignore", GITIGNORE_TEMPLATE)
+    _write_if_missing(project_dir / ".gitattributes", GITATTRIBUTES_TEMPLATE)
 
     tasks_dir.mkdir(parents=True, exist_ok=True)
     variants_dir.mkdir(parents=True, exist_ok=True)
@@ -158,4 +202,4 @@ def _write_if_missing(path: Path, content: str) -> None:
         console.print(f"  [yellow]Skipping[/yellow] {path.name} (already exists)")
         return
     path.parent.mkdir(parents=True, exist_ok=True)
-    path.write_text(content)
+    path.write_text(content, encoding="utf-8", newline="")