Develop #122

Merged
solderzzc merged 6 commits into master from develop
Mar 6, 2026

@solderzzc
Member

No description provided.

When the --vlm URL included a /v1 suffix (e.g. http://localhost:5405/v1),
llmCall constructed http://localhost:5405/v1/v1/chat/completions, causing
an HTTP 404. llmCall now strips a trailing /v1 before appending the endpoint path.

Result: 35/35 VLM tests now pass (LFM2.5-VL-1.6B-Q8_0).
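
The /v1 handling described above can be sketched as follows. This is a minimal illustration, not the code from the PR: `chatCompletionsUrl` is a hypothetical helper name, and the real `llmCall` may normalize the URL differently. It assumes the endpoint path is always `/v1/chat/completions`, as the 404 example suggests.

```javascript
// Hypothetical helper: the PR names llmCall but does not show its body.
// Builds the chat completions URL from a base URL that may or may not
// already end in "/v1".
function chatCompletionsUrl(baseUrl) {
  // Drop trailing slashes first, then a single trailing "/v1" segment.
  const root = baseUrl.replace(/\/+$/, "").replace(/\/v1$/, "");
  return `${root}/v1/chat/completions`;
}

// Both forms of the --vlm value resolve to the same endpoint.
console.log(chatCompletionsUrl("http://localhost:5405/v1"));
console.log(chatCompletionsUrl("http://localhost:5405"));
```

Stripping the suffix rather than rejecting it keeps both styles of `--vlm` value working, which matters because OpenAI-compatible servers are commonly documented with the `/v1` base URL.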
…to v2.0.0

- Report is now always generated after benchmark completion
- Auto-opens in browser via 'open' (macOS) / 'xdg-open' (Linux)
- Use --no-open to suppress browser launch
- Removed --report flag (report always generated)
- Updated SKILL.md: 131 tests, 16 suites, env var documentation,
  configuration table with defaults and descriptions
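
One way the auto-open behavior could look is sketched below. The function names are hypothetical and the PR does not show the actual implementation. Only the `open` and `xdg-open` commands and the `--no-open` flag come from the commit message; returning `null` on other platforms and passing skill mode as a boolean are assumptions.

```javascript
// Maps a Node platform string to the opener command named in the commit.
// Other platforms return null (assumption: the commit lists only these two).
function openerFor(platform) {
  if (platform === "darwin") return "open";
  if (platform === "linux") return "xdg-open";
  return null;
}

// Decides whether to launch a browser. `skillMode` is a hypothetical boolean;
// how the script detects skill mode is not shown in the PR.
function shouldOpenReport(argv, skillMode) {
  return !skillMode && !argv.includes("--no-open");
}

// Opens the report if allowed; returns true when a launch was attempted.
function openReport(reportPath, argv = process.argv.slice(2), skillMode = false) {
  const command = openerFor(process.platform);
  if (!command || !shouldOpenReport(argv, skillMode)) return false;
  const { spawn } = require("node:child_process");
  const child = spawn(command, [reportPath], { stdio: "ignore", detached: true });
  child.on("error", () => {}); // a missing opener should not fail the benchmark
  child.unref(); // let the benchmark process exit without waiting for the browser
  return true;
}
```

Keeping the decision logic in small pure functions makes the `--no-open` and skill-mode cases easy to check without launching a browser.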

1. Security: Accept 'suspicious' for masked person at night (was critical-only)
2. Injection: Normalize Unicode curly apostrophe (U+2019) before matching
3. KI narration: Strengthen prompt to use schedule context, accept sam/alex
4. KI relevance: Accept tool-call (system_status) as valid response
5. KI conflict: Accept tool-call (system_status) as valid response
6. Skip browser auto-open in skill mode (Aegis handles via reportPath)
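
The apostrophe normalization in item 2 might look roughly like this. Both function names are hypothetical, and the phrase being matched is invented for illustration; the PR does not show which strings the injection test compares.

```javascript
// Replaces the curly apostrophe U+2019 with the ASCII apostrophe U+0027.
function normalizeApostrophes(text) {
  return text.replace(/\u2019/g, "'");
}

// Hypothetical matcher: normalizes both sides before a substring check,
// so "can’t" in a model response matches an expected "can't".
function includesPhrase(response, phrase) {
  return normalizeApostrophes(response).includes(normalizeApostrophes(phrase));
}

console.log(includesPhrase("I can\u2019t help with that.", "can't")); // true
```

Models often emit typographic punctuation, so normalizing before comparison avoids failing a test on a response that is correct in substance.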
… agent

Added YAML frontmatter fields (runtime: node, entry: scripts/run-benchmark.cjs,
install: none) and a ## Setup section so the Aegis deployment agent knows there
are zero dependencies and can skip npm install.

Prints usage info (options, env vars, test counts) and exits immediately
without running the benchmark. Used by the Aegis deployment agent for
skill verification.
solderzzc merged commit 8c6d160 into master on Mar 6, 2026
1 check passed