52 changes: 21 additions & 31 deletions .github/pull_request_template.md
@@ -46,21 +46,24 @@ Checks most contributors can run:

 - [ ] `python3 scripts/validate_skill.py skills/java-optionals`
 - [ ] `python3 scripts/validate_eval_criteria.py evals evals-reference evals-regression`
-- [ ] `python3 -m py_compile scripts/validate_skill.py scripts/validate_eval_criteria.py`
-- [ ] `bash -n scripts/check_publish_dry_run.sh`
+- [ ] `python3 -m py_compile scripts/*.py`
+- [ ] `bash -n scripts/*.sh`
 - [ ] `tessl plugin lint .`
 - [ ] `markdownlint`, if Markdown changed
 - [ ] Manual rendered-doc or example review, if docs or examples changed

 Tessl-authenticated checks:

 - [ ] `bash scripts/check_publish_dry_run.sh .`
-- [ ] `tessl plugin publish --dry-run --bump patch .`
 - [ ] `tessl skill review --threshold 100 skills/java-optionals/SKILL.md`, if skill text or references changed
-- [ ] Targeted `tessl eval run --variant with-context --variant without-context <scenario-dir>`, if
-skill behavior or evals changed
-- [ ] Full/main `tessl eval run --variant with-context --variant without-context .`, if benchmark
-claims changed or targeted with-context results are clean
+- [ ] Targeted main/reference `scripts/run_eval_suite.sh <main|reference> <scenario-name>`, if skill behavior or those evals changed
+- [ ] Targeted regression `scripts/run_eval_suite.sh regression <scenario-name>`, if regression evals changed
+- [ ] Every substantively changed eval scenario was rerun targeted and reached 100% with context, or the PR explains the Tessl blocker and remaining work
+- [ ] Runtime skill/reference changes only: full `scripts/run_eval_suite.sh reference` was run after the final runtime-context change, or the PR links the blocker issue
+- [ ] Runtime skill/reference changes only: full `scripts/run_eval_suite.sh regression` was run after the final runtime-context change, or the PR links the blocker issue
+- [ ] Pure eval suite moves did not change task wording, scoring criteria, or capability text beyond suite-placement metadata/numbering notes
+- [ ] `scripts/classify_eval_result.py <run-json> --scenario-dir <scenario-dir>`, if a scenario was added or moved between suites
+- [ ] Full/main `scripts/run_eval_suite.sh main`, if benchmark claims changed or targeted with-context results are clean

 `bash scripts/check_publish_dry_run.sh .`, `tessl skill review`, and hosted Tessl evals require
 Tessl authentication. Hosted evals also require a linked Tessl project. If you can't run one of
@@ -83,30 +86,17 @@ explain why.

 ## Review Checklist

-- [ ] Docs updated, or N/A
-- [ ] Evals updated, or N/A
-- [ ] Scenario directories include `task.md`, `criteria.json`, and `capability.txt`, or N/A
-- [ ] Scenario invocation style is classified as natural or explicit, or N/A
-- [ ] Natural activation prompts don't explicitly invoke the skill, or N/A
-- [ ] Explicit invocation prompts are labeled as explicit, or N/A
-- [ ] Main eval criteria include compile/artifact checks, or N/A
-- [ ] Main eval criteria include behavior correctness checks, or N/A
-- [ ] Runtime references contain no eval answer keys, scenario inventory, hosted run IDs, or fixed
-score claims
-- [ ] If any with-context result was below 100%, targeted failing scenarios were fixed and rerun
-before broader eval suites
-- [ ] Java baseline compatibility has been considered, or N/A
-- [ ] `OptionalInt`, `OptionalLong`, and `OptionalDouble` guidance has been considered, or N/A
-- [ ] Optional-producing stream terminals and collectors are covered, or N/A
-- [ ] Java 26 Javadocs were checked for Optional-family coverage, or N/A
-- [ ] Valid README package-runner instructions were preserved, or N/A
-- [ ] Tessl package commands match the verified plugin package format
-- [ ] Full/reference eval reporting is not hidden or cherry-picked
-- [ ] Tessl checks were run, or unavailability is documented
-- [ ] PR title or squash title uses Conventional Commits
-- [ ] Redaction checked: no Tessl tokens, GitHub tokens, package manager tokens, private repository
-links, private eval artifacts, private registry/workspace links, local host paths, or
-proprietary Java source
+- [ ] The change is scoped to the sections, skill files, evals, or workflows described above.
+- [ ] Validation that applies to this change is checked above, or any unavailable check is explained.
+- [ ] If Java Optional guidance changed, Java baseline compatibility, fallback timing, null interop, primitive Optionals, and checked boundaries were considered.
+- [ ] If evals or benchmark claims changed, the eval scenarios remain fair and do not leak answer keys, run IDs, or fixed score claims into runtime references.
+- [ ] If runtime skill text or references changed, hosted checks were widened from targeted affected scenarios to main/reference/regression as described in `docs/agents/workflow.md`, or any Tessl blocker is documented.
+- [ ] If a runtime skill/reference change was released, the final report includes the published main eval run plus post-change reference and regression run IDs, or a blocker issue for missing broad suites.
+- [ ] Main and reference evals were run with both variants when hosted evals were needed; regression evals were run with context only unless reclassification back to reference was being checked.
+- [ ] New or moved eval scenarios follow the classifier recommendation, or the PR explains the maintainer-approved override.
+- [ ] Every retained eval scenario has a 100% with-context result, or any below-100 result is documented as blocking follow-up rather than classified/reportable coverage.
+- [ ] PR title or squash title uses Conventional Commits.
+- [ ] Redaction checked: no tokens, private links, private eval artifacts, local host paths, or proprietary Java source.

 ## AI Assistance (if used)

46 changes: 11 additions & 35 deletions .github/workflows/ci.yml
@@ -21,12 +21,12 @@ jobs:
 name: Validate skill and plugin
 runs-on: ubuntu-latest
 steps:
-- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
+- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6

 - name: Setup Tessl CLI
 uses: tesslio/setup-tessl@25ec223fc0da33b41b8044ff5ab2b85235f4f91e # v2
 with:
-version: "0.81.2"
+version: "0.82.0"

 - name: Validate skill metadata
 run: python3 scripts/validate_skill.py skills/java-optionals
@@ -35,34 +35,16 @@ jobs:
 run: python3 scripts/validate_eval_criteria.py evals evals-reference evals-regression

 - name: Compile validation scripts
-run: python3 -m py_compile scripts/validate_skill.py scripts/validate_eval_criteria.py
+run: python3 -m py_compile scripts/*.py

 - name: Check shell scripts
-run: bash -n scripts/check_publish_dry_run.sh
+run: bash -n scripts/*.sh

 - name: Parse JSON files
-run: |
-python3 - <<'PY'
-import json
-import pathlib
-for path in pathlib.Path('.').rglob('*.json'):
-json.load(open(path, encoding='utf-8'))
-print('JSON ok')
-PY
-
-- name: Parse YAML files
-run: |
-python3 - <<'PY'
-import pathlib
-try:
-import yaml
-except ImportError:
-print('PyYAML unavailable; skipping YAML parse')
-raise SystemExit(0)
-for path in list(pathlib.Path('.').rglob('*.yml')) + list(pathlib.Path('.').rglob('*.yaml')):
-yaml.safe_load(open(path, encoding='utf-8'))
-print('YAML ok')
-PY
+run: python3 scripts/validate_json_files.py
+
+- name: Validate YAML metadata
+run: python3 scripts/validate_openai_agent_yaml.py

 - name: Lint Tessl plugin
 run: tessl plugin lint .
@@ -72,25 +54,19 @@ jobs:
 if: ${{ github.event_name == 'push' && github.ref == 'refs/heads/main' }}
 runs-on: ubuntu-latest
 steps:
-- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
+- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6

 - name: Check Tessl token
 id: tessl-token
 env:
 TESSL_TOKEN: ${{ secrets.TESSL_TOKEN }}
-run: |
-if [ -n "${TESSL_TOKEN:-}" ]; then
-echo "available=true" >> "$GITHUB_OUTPUT"
-else
-echo "available=false" >> "$GITHUB_OUTPUT"
-echo "TESSL_TOKEN isn't configured; skipping Tessl publish dry-runs."
-fi
+run: scripts/check_tessl_token_available.sh

 - name: Setup Tessl CLI
 if: ${{ steps.tessl-token.outputs.available == 'true' }}
 uses: tesslio/setup-tessl@25ec223fc0da33b41b8044ff5ab2b85235f4f91e # v2
 with:
-version: "0.81.2"
+version: "0.82.0"
 token: ${{ secrets.TESSL_TOKEN }}

 - name: Check fast publish dry-run
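
Note on the `Parse JSON files` step: this diff replaces the inline heredoc with `scripts/validate_json_files.py`, but the script body is not shown here. As a rough sketch only, assuming the script keeps the behavior of the removed inline code (walk the repository and parse every `*.json` file), it might look something like the following. The function name `find_invalid_json` is hypothetical, and collecting failures instead of stopping at the first one is a choice made for this illustration, not something the diff confirms.

```python
import json
import pathlib


def find_invalid_json(root):
    """Return (path, message) pairs for every *.json file under root that fails to parse.

    Hypothetical helper: the real scripts/validate_json_files.py is not shown in this diff.
    """
    failures = []
    for path in sorted(pathlib.Path(root).rglob("*.json")):
        try:
            # Unlike the removed inline step, close each file after parsing it.
            with path.open(encoding="utf-8") as handle:
                json.load(handle)
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            failures.append((str(path), str(exc)))
    return failures
```

A CI entry point could call `find_invalid_json(".")`, print each failing path, and exit non-zero when the list is non-empty, so that one run reports all broken files.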
24 changes: 3 additions & 21 deletions .github/workflows/commitlint.yml
@@ -30,7 +30,7 @@ jobs:
 name: Commitlint
 runs-on: ubuntu-latest
 steps:
-- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
+- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6
 with:
 fetch-depth: 0
 ref: ${{ github.event.pull_request.head.sha || github.sha }}
@@ -40,17 +40,7 @@ jobs:
 node-version: "24"

 - name: Prepare commitlint
-run: |
-set -euo pipefail
-commitlint_home="$RUNNER_TEMP/commitlint"
-mkdir -p "$commitlint_home"
-printf '{"private":true}\n' > "$commitlint_home/package.json"
-cp commitlint.config.cjs "$commitlint_home/commitlint.config.cjs"
-npm --prefix "$commitlint_home" install --silent \
-@commitlint/cli@21.0.1 \
-@commitlint/config-conventional@21.0.1
-echo "COMMITLINT_BIN=$commitlint_home/node_modules/.bin/commitlint" >> "$GITHUB_ENV"
-echo "COMMITLINT_CONFIG=$commitlint_home/commitlint.config.cjs" >> "$GITHUB_ENV"
+run: scripts/install_commitlint.sh

 - name: Lint pull request title
 env:
@@ -63,12 +53,4 @@ jobs:
 PR_HEAD_SHA: ${{ github.event.pull_request.head.sha }}
 BASE_REF: ${{ inputs.base_ref || 'main' }}
 EVENT_NAME: ${{ github.event_name }}
-run: |
-if [ "$EVENT_NAME" = "pull_request" ]; then
-"$COMMITLINT_BIN" --config "$COMMITLINT_CONFIG" \
---from "$PR_BASE_SHA" --to "$PR_HEAD_SHA" --verbose
-else
-git fetch origin "$BASE_REF"
-"$COMMITLINT_BIN" --config "$COMMITLINT_CONFIG" \
---from "origin/$BASE_REF" --to HEAD --verbose
-fi
+run: scripts/lint_pr_commits.sh
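
Note on the commit-range selection: the inline step removed above chose the commit range from the event type, and `scripts/lint_pr_commits.sh` presumably keeps that logic, though its contents are not part of this diff. The sketch below restates the removed branching in Python purely to make the two cases explicit. The function name `commitlint_command` and its parameters are hypothetical; the `--config`, `--from`, `--to`, and `--verbose` flags are the ones used by the removed shell.

```python
def commitlint_command(event_name, commitlint_bin, config,
                       base_sha="", head_sha="", base_ref="main"):
    """Build the commitlint argv for a run (hypothetical helper mirroring the removed shell)."""
    if event_name == "pull_request":
        # Pull requests lint exactly the commits between the PR base and head SHAs.
        commit_from, commit_to = base_sha, head_sha
    else:
        # Other events lint everything on HEAD that is not on the base branch.
        # The removed shell ran `git fetch origin "$BASE_REF"` before this case.
        commit_from, commit_to = f"origin/{base_ref}", "HEAD"
    return [commitlint_bin, "--config", config,
            "--from", commit_from, "--to", commit_to, "--verbose"]
```

Returning an argument list rather than a command string keeps SHAs and ref names out of shell interpolation if a caller passes the result to `subprocess.run`.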
50 changes: 30 additions & 20 deletions .github/workflows/publish-tessl.yml
@@ -4,17 +4,22 @@ on:
 workflow_dispatch:
 inputs:
 ref:
-description: "Git ref to publish. Defaults to the selected workflow ref."
-required: false
+description: "Git ref to publish. Use refs/tags/v<plugin-version> for releases."
+required: true
 type: string
+allow_non_tag_ref:
+description: "Allow publishing a non-v<plugin-version> ref. Use only for maintainer-approved recovery."
+required: false
+type: boolean
+default: false
 release:
 types: [published]

 permissions:
 contents: read

 concurrency:
-group: publish-tessl-${{ github.event.release.tag_name || github.run_id }}
+group: publish-tessl-${{ github.event.release.tag_name || inputs.ref || github.ref }}
 cancel-in-progress: false

 jobs:
@@ -25,40 +30,45 @@ jobs:
 steps:
 - name: Checkout manual ref
 if: ${{ github.event_name == 'workflow_dispatch' }}
-uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
+uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6
 with:
-ref: ${{ inputs.ref || github.ref }}
+ref: ${{ inputs.ref }}

 - name: Checkout release tag
 if: ${{ github.event_name == 'release' }}
-uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
+uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6
 with:
 ref: ${{ github.event.release.tag_name }}

+- name: Show publish ref
+run: |
+echo "event_name=${{ github.event_name }}"
+echo "requested_ref=${{ github.event.release.tag_name || inputs.ref }}"
+echo "checked_out_ref=$(git rev-parse --abbrev-ref HEAD)"
+echo "checked_out_sha=$(git rev-parse HEAD)"
+git describe --tags --always --dirty
+
+- name: Validate publish ref
+env:
+EVENT_NAME: ${{ github.event_name }}
+RELEASE_TAG: ${{ github.event.release.tag_name }}
+MANUAL_REF: ${{ inputs.ref }}
+ALLOW_NON_TAG_REF: ${{ inputs.allow_non_tag_ref }}
+run: scripts/validate_publish_ref.sh
+
 - name: Require Tessl token
 env:
 TESSL_TOKEN: ${{ secrets.TESSL_TOKEN }}
-run: |
-if [ -z "${TESSL_TOKEN:-}" ]; then
-echo "TESSL_TOKEN is required to publish the Tessl plugin." >&2
-exit 1
-fi
+run: scripts/require_tessl_token.sh

 - name: Setup Tessl CLI
 uses: tesslio/setup-tessl@25ec223fc0da33b41b8044ff5ab2b85235f4f91e # v2
 with:
-version: "0.81.2"
+version: "0.82.0"
 token: ${{ secrets.TESSL_TOKEN }}

 - name: Validate plugin before publish
-run: |
-python3 scripts/validate_skill.py skills/java-optionals
-python3 scripts/validate_eval_criteria.py evals evals-reference evals-regression
-python3 -m py_compile scripts/validate_skill.py scripts/validate_eval_criteria.py
-bash -n scripts/check_publish_dry_run.sh
-tessl plugin lint .
-tessl skill review --threshold 100 skills/java-optionals/SKILL.md
-tessl plugin publish --dry-run .
+run: scripts/validate_publish_ready.sh

 - name: Publish plugin
 run: tessl plugin publish .
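
Note on the `Validate publish ref` step: `scripts/validate_publish_ref.sh` is not included in this diff, so its exact rules are unknown. Going only by the input descriptions (manual runs should use `refs/tags/v<plugin-version>`, and `allow_non_tag_ref` is a recovery override), one possible reading of the policy is sketched below in Python. The function name, the `plugin_version` parameter, and the assumption that release tags are also compared against `v<plugin-version>` are all hypothetical. One detail that is certain: a boolean workflow input passed through `env:` reaches the script as the string `true` or `false`, so the comparison has to be done on strings.

```python
def publish_ref_problem(event_name, release_tag, manual_ref,
                        allow_non_tag_ref, plugin_version):
    """Return None when the ref may be published, else a short reason string.

    Hypothetical policy sketch; the real checks live in scripts/validate_publish_ref.sh.
    """
    expected_tag = f"v{plugin_version}"
    if event_name == "release":
        # Assumption: the release tag must match the plugin version.
        if release_tag == expected_tag:
            return None
        return f"release tag {release_tag!r} does not match {expected_tag!r}"
    if event_name == "workflow_dispatch":
        if manual_ref == f"refs/tags/{expected_tag}":
            return None
        # Boolean inputs arrive in the environment as the strings "true"/"false".
        if allow_non_tag_ref == "true":
            return None
        return f"ref {manual_ref!r} is not refs/tags/{expected_tag} and no override was given"
    return f"unsupported event {event_name!r}"
```

Returning a reason instead of a bare boolean lets a wrapper print why a publish was refused before exiting non-zero.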