feat: expand HomeSec-Bench to 143 tests, add perf metrics, enable ski… by solderzzc · Pull Request #163 · SharpAI/DeepCamera

solderzzc · 2026-03-18T03:54:59Z

…ll auto-start

Update benchmark paper: 131→143 tests (VLM Scene 35→47, 3 new dedup scenarios, 4 new tool-use scenarios)
Add performance metrics to run-benchmark.cjs (TTFT, decode throughput tracking)
Fix tool_call argument serialization for non-string arguments
Enable auto_start for yolo-detection-2026 and depth-estimation skills
Add .gitignore for LaTeX build artifacts
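
The two code-facing changes above (performance metrics and the tool_call argument fix) can be illustrated with a minimal sketch. This is not the code from run-benchmark.cjs: the function names `serializeToolArguments` and `computePerfMetrics`, the timestamp fields, and the throughput convention (completion tokens divided by the time after the first token) are all assumptions made for illustration. It also assumes an OpenAI-style message format, where tool_call arguments travel as a JSON string.

```javascript
'use strict';

// Hypothetical helper (name is illustrative, not from the PR).
// OpenAI-style chat messages carry tool_call arguments as a JSON string,
// so objects, arrays, numbers, and booleans are stringified before sending.
function serializeToolArguments(args) {
  if (typeof args === 'string') return args;             // already serialized
  if (args === undefined || args === null) return '{}';  // no arguments given
  return JSON.stringify(args);
}

// Hypothetical helper (name and field names are illustrative).
// TTFT is the delay from sending the request to receiving the first token.
// Decode throughput here is completion tokens per second measured over the
// time after the first token; other conventions exist.
function computePerfMetrics({ requestStartMs, firstTokenMs, endMs, completionTokens }) {
  const ttftMs = firstTokenMs - requestStartMs;
  const decodeMs = endMs - firstTokenMs;
  const decodeTokensPerSec =
    decodeMs > 0 ? (completionTokens / decodeMs) * 1000 : null;
  return { ttftMs, decodeTokensPerSec };
}

// Example usage with made-up numbers.
const serialized = serializeToolArguments({ camera: 'front_door', limit: 5 });
const metrics = computePerfMetrics({
  requestStartMs: 0,
  firstTokenMs: 200,
  endMs: 1200,
  completionTokens: 50,
});
console.log(serialized, metrics);
```

Returning `null` for throughput when no decode time has elapsed avoids a division by zero on single-token responses; a real harness might prefer to skip such samples instead.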

solderzzc and others added 2 commits March 17, 2026 20:52

Merge branch 'develop' into feature/benchmark-thinking-mode-fix

3e03a35

solderzzc merged commit 7d117e9 into develop Mar 18, 2026
1 check passed

solderzzc deleted the feature/benchmark-thinking-mode-fix branch March 18, 2026 03:55

solderzzc merged 2 commits into develop from feature/benchmark-thinking-mode-fix

solderzzc commented Mar 18, 2026
