🚀 Static Site SDK: CI reliability and performance improvements #153

Robdel12 · 2025-12-22T16:37:51Z

Summary

Screenshot timeout: Added 45s default timeout (vs Puppeteer's 180s) to prevent hanging on slow pages
Tab recycling: Tabs are now recycled after 10 uses to prevent memory leaks in long-running jobs
Retry logic: Failed screenshots automatically retry once with a fresh tab (catches timeouts, protocol errors, crashed tabs)
CI-optimized browser args: Disabled GPU, reduced memory usage, limited V8 heap to 512MB, uses /tmp instead of /dev/shm
Protocol timeout: Reduced from 180s to 60s for faster failure detection
Output formatting: Fixed CLI output to use proper output module methods (print(), blank(), hint(), warn())

These changes make the static site SDK much more robust in resource-constrained CI environments where screenshot timeouts were causing ETA calculations to spike and jobs to hang.

Test plan

All 157 unit tests pass
Lint passes
Test in CI with resource-constrained runners
Verify timeout behavior with slow-loading pages

Major changes to make screenshot capture more robust: **Browser Optimizations:** - Add CI-optimized browser args (disable GPU, reduce memory, etc.) - Reduce protocolTimeout from 180s to 60s for faster failure detection - Limit V8 heap size to 512MB to prevent memory bloat **Tab Recycling:** - Recycle tabs after 10 uses to prevent memory leaks - Track tab usage and automatically close/replace stale tabs - Prevents accumulating memory from page JavaScript **Retry Logic:** - On timeout/crash, close bad tab and retry with fresh one - Only report failure after retry also fails - Detect timeout, Target closed, and Protocol errors **Screenshot Timeout:** - Add 45s timeout to screenshot capture (was using 180s default) - Add --timeout CLI flag to override - Configurable via screenshot.timeout in config These changes address issues where CI runners would hang on certain pages due to resource exhaustion or stuck browser tabs.

Use output.print(), blank(), hint(), warn() instead of logger.info() for the 'No TDD server' message to match the CLI's output style.

blacksmith-sh · 2025-12-22T16:39:32Z

Found 4 test failures on Blacksmith runners:

Failures

Test	View Logs
`[firefox-desktop] › tests/reporter/specs/` `settings-workflow.spec.js:70:3 › Settings Workflow › update threshold and save`	View Logs
`[firefox-mobile] › tests/reporter/specs/` `review-workflow.spec.js:127:3 › Review Workflow › user can cancel bulk accept`	View Logs
`Review Workflow/user can cancel bulk accept`	View Logs
`Settings Workflow/update threshold and save`	View Logs

claude · 2025-12-22T16:42:36Z

PR Review: Static Site SDK CI Reliability and Performance Improvements

This is a well-crafted PR that addresses real-world CI/CD pain points with thoughtful solutions. The changes significantly improve reliability in resource-constrained environments.

🎯 Strengths

1. Excellent Problem-Solution Alignment

Each change directly addresses a specific CI failure mode (timeouts, memory leaks, resource exhaustion)
The retry logic intelligently handles transient failures without wasting resources
Tab recycling is a smart solution to memory accumulation in long-running jobs

2. Strong Code Quality

Comprehensive JSDoc comments explain the "why" behind design decisions
Good use of constants (DEFAULT_RECYCLE_AFTER, DEFAULT_SCREENSHOT_TIMEOUT)
Error handling is defensive without being overly broad

3. Well-Structured Changes

Tab pool maintains backward compatibility
Retry logic is encapsulated in attemptTask() helper
CLI flag --timeout properly flows through config schema

🔍 Critical Issues

Test Coverage Gap ⚠️

The pool recycling and retry logic have no test coverage. Missing tests:

In pool.test.js:

Tab recycling after N uses
recycled stat tracking
_poolEntry metadata preservation
New tab creation failure during recycling (line 129-131)
Use count increment behavior

In tasks.test.js:

attemptTask() retry logic for timeout errors
Retry with fresh tab after timeout
No retry on second failure
Retry note appended to errors
Tab closure on timeout before retry

Recommendation: Add comprehensive tests for these code paths before merging.

Bug: Tab Metadata Loss in pool.js:145

available.push(entry || { tab, useCount: entry?.useCount || 0 });

Problem: If entry is falsy, creates new object with useCount: 0, losing actual count.

Fix:

if (entry) {
  available.push(entry);
} else {
  available.push({ tab, useCount: 0 });
}

🤔 Other Concerns

Protocol Timeout Too Long (browser.js:61)

60s is still long; combined with 45s screenshot timeout = 105s total
Consider 30s (2x screenshot timeout for buffer)

CI Args Always Applied (browser.js:59)

CI_OPTIMIZED_ARGS applied everywhere, not just CI
May hurt local dev performance (disabled GPU, limited heap)
Consider: let args = process.env.CI ? CI_OPTIMIZED_ARGS : MINIMAL_ARGS;

Config Schema Defaults Duplicated

timeout default in 3 places (config-schema.js lines 47, 99, 114)
Extract to constant: const DEFAULT_SCREENSHOT_TIMEOUT = 45_000;

🔒 Security: ✅ No Issues

Browser args are security-hardening
No user input in new code paths
Tab isolation intact

📊 Performance

Positive:

Tab recycling prevents memory growth
Reduced protocol timeout speeds failure detection
Retry logic avoids full job failures

Concerns:

Recycling overhead: ~10% for fast pages (<1s render)
Retry doubles latency: up to 90s per failed task
Memory spike during recycling (should be negligible)

✅ Recommendations

Before merging (REQUIRED):

Add test coverage for tab recycling
Add test coverage for retry logic
Fix bug in pool.js:145

Consider:
4. Make CI optimizations conditional on process.env.CI
5. Extract attemptTask for testability
6. Reduce protocol timeout to 30s
7. Consolidate timeout constants

🎬 Conclusion

Strong PR with well-reasoned solutions. Code quality is high. However, lack of test coverage for critical features is a blocker.

Recommendation: Request changes - add tests before merging. These are failure-path features that need verification.

Great work on the retry logic and tab recycling! 💯

- Add pool.js tests: tab recycling after N uses, recycled stat tracking, _poolEntry metadata preservation, new tab creation failure handling, and use count increment behavior - Add tasks.js tests: retry logic for timeout errors, retry with fresh tab, no retry on second failure, retry note appended to errors, tab closure before retry, and pool returning null during retry - Fix bug in pool.js where tab metadata could be lost in edge case

Robdel12 added 2 commits December 22, 2025 10:22

💄 Use CLI output module for cleaner formatting

c279c5a

Use output.print(), blank(), hint(), warn() instead of logger.info() for the 'No TDD server' message to match the CLI's output style.

Robdel12 merged commit 282c738 into main Dec 22, 2025
16 of 18 checks passed

Robdel12 deleted the rd/static-site-speed branch December 22, 2025 17:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

🚀 Static Site SDK: CI reliability and performance improvements #153

🚀 Static Site SDK: CI reliability and performance improvements #153

Uh oh!

Robdel12 commented Dec 22, 2025

Uh oh!

blacksmith-sh bot commented Dec 22, 2025 •

edited

Loading

Uh oh!

claude bot commented Dec 22, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

🚀 Static Site SDK: CI reliability and performance improvements #153

🚀 Static Site SDK: CI reliability and performance improvements #153

Uh oh!

Conversation

Robdel12 commented Dec 22, 2025

Summary

Test plan

Uh oh!

blacksmith-sh bot commented Dec 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Failures

Uh oh!

claude bot commented Dec 22, 2025

PR Review: Static Site SDK CI Reliability and Performance Improvements

🎯 Strengths

🔍 Critical Issues

Test Coverage Gap ⚠️

Bug: Tab Metadata Loss in pool.js:145

🤔 Other Concerns

🔒 Security: ✅ No Issues

📊 Performance

✅ Recommendations

🎬 Conclusion

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

blacksmith-sh bot commented Dec 22, 2025 •

edited

Loading