WIP feat (browsers): create throughput benchmark for browser providers #115
Open
kisernl wants to merge 2 commits into
Browser Benchmark Results
View full run · SVG available as build artifact
Browser Throughput Benchmark Results
View full run · SVG available as build artifact
Sandbox Benchmark Results: Sequential, Staggered, Burst
View full run · SVGs available as build artifacts
Storage Benchmark Results: 1MB, 4MB, 10MB, and 16MB files
View full run · SVGs available as build artifacts
Pull request overview
Adds a new “browser step throughput” benchmark mode to measure per-action performance within a single long-lived browser session, complementing the existing browser lifecycle benchmark.
Changes:
- Introduces a new browser-throughput benchmark runner (50-action Wikipedia loop), result schema, and composite scoring.
- Adds provider configs, SVG generation, CLI wiring, and npm scripts for running and reporting throughput benchmarks.
- Adds a dedicated GitHub Actions workflow to run/merge throughput results and post PR comments.
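To make the per-action measurement idea concrete, here is a minimal sketch of timing a sequence of actions inside one long-lived session and summarizing the latencies. The `Action` and `StepTiming` shapes, the function names, and the nearest-rank percentile are assumptions chosen for illustration; they are not taken from `src/browser/throughput-benchmark.ts`, whose actual loop and result schema may differ.

```typescript
// Illustrative only: Action, StepTiming, and these function names are
// hypothetical and are not taken from src/browser/throughput-benchmark.ts.
type Action = { name: string; run: () => Promise<void> };
type StepTiming = { name: string; ms: number; ok: boolean };

// Run every action in order against one already-open session, timing each.
async function timeActions(actions: Action[]): Promise<StepTiming[]> {
  const timings: StepTiming[] = [];
  for (const action of actions) {
    const start = performance.now();
    let ok = true;
    try {
      await action.run();
    } catch {
      ok = false; // record the failure but keep the session going
    }
    timings.push({ name: action.name, ms: performance.now() - start, ok });
  }
  return timings;
}

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) return NaN;
  const rank = Math.ceil((p / 100) * sortedMs.length);
  return sortedMs[Math.max(0, rank - 1)];
}

// Collapse per-action timings into the kind of summary a report might show.
function summarize(timings: StepTiming[]) {
  const sorted = timings.map((t) => t.ms).sort((a, b) => a - b);
  const totalMs = sorted.reduce((sum, ms) => sum + ms, 0);
  return {
    actions: timings.length,
    failures: timings.filter((t) => !t.ok).length,
    medianMs: percentile(sorted, 50),
    p95Ms: percentile(sorted, 95),
    actionsPerSecond: totalMs > 0 ? (timings.length * 1000) / totalMs : 0,
  };
}
```

A caller would build the action list once (for example, 50 closures that navigate, click, or read text on an open page), pass it to `timeActions`, and feed the result to `summarize`. Failures are counted rather than thrown so that a single flaky step does not end the session being measured.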
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| THROUGHPUT.md | Documents the new throughput benchmark methodology, scoring, running, and scheduling. |
| src/run.ts | Adds a new browser-throughput mode to run the benchmark and write results. |
| src/merge-results.ts | Adds merge + table-printing logic for browser-throughput artifacts. |
| src/browser/throughput-types.ts | Defines result and provider config types for throughput benchmarking. |
| src/browser/throughput-scoring.ts | Implements composite scoring + sorting for throughput results. |
| src/browser/throughput-providers.ts | Adds provider definitions and session options (stealth/headless/viewport). |
| src/browser/throughput-benchmark.ts | Implements the 50-action throughput benchmark runner and JSON writer. |
| src/browser/generate-throughput-svg.ts | Generates an SVG leaderboard for throughput results. |
| results/browser-throughput/.gitkeep | Ensures the results directory exists in-repo. |
| package.json | Adds bench and SVG generation scripts for browser-throughput. |
| .github/workflows/browser-throughput-benchmarks.yml | Adds CI workflow to run, merge, render, and publish throughput benchmark results. |
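The table mentions composite scoring and sorting without showing the formula. As a rough illustration of what such a step can look like, the sketch below maps latencies onto a 0 to 100 scale, blends median and p95, and scales by success rate. The `ThroughputResult` shape, the 60/40 weights, and the 5000 ms ceiling are all assumptions made up for this example; the real logic lives in `src/browser/throughput-scoring.ts` and may be entirely different.

```typescript
// Hypothetical result shape, weights, and ceiling; none of these are taken
// from src/browser/throughput-scoring.ts.
interface ThroughputResult {
  provider: string;
  medianMs: number;
  p95Ms: number;
  successRate: number; // fraction of actions that succeeded, 0..1
}

// Linear latency score: 0 ms maps to 100, ceilingMs or slower maps to 0.
function scoreLatency(ms: number, ceilingMs: number): number {
  const clamped = Math.min(Math.max(ms, 0), ceilingMs);
  return 100 * (1 - clamped / ceilingMs);
}

// Weighted blend of median and tail latency, scaled by success rate so that
// a fast but unreliable provider cannot rank at the top.
function compositeScore(r: ThroughputResult, ceilingMs = 5000): number {
  const latency =
    0.6 * scoreLatency(r.medianMs, ceilingMs) +
    0.4 * scoreLatency(r.p95Ms, ceilingMs);
  return latency * r.successRate;
}

// Return a new array ordered best-first; the input is left untouched.
function rankResults(results: ThroughputResult[]): ThroughputResult[] {
  return [...results].sort((a, b) => compositeScore(b) - compositeScore(a));
}
```

Multiplying by the success rate, rather than adding it as another weighted term, is one possible design choice: it forces the score to zero when nothing succeeds, at the cost of penalizing occasional failures more heavily.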
This pull request introduces a new browser step throughput benchmark to measure and compare how quickly different browser providers can execute a sequence of agent-style actions within a single session. It adds a comprehensive workflow for automated benchmarking, updates documentation, and enhances configuration and reporting for these new benchmarks.
Key changes:
New Benchmarking Capability
- .github/workflows/browser-throughput-benchmarks.yml: automates browser throughput benchmarking across multiple providers, including scheduled daily runs, PR-triggered runs, and result collection/posting.
- package.json: scripts for running browser throughput benchmarks per provider and for generating SVG summary tables.
Documentation
- THROUGHPUT.md: documents the new browser step throughput benchmark, including its motivation, methodology, scoring, action sequence, and limitations.
Benchmark Implementation Improvements
- src/browser/benchmark.ts: allows a configurable timeout and correctly derives the iteration count for reporting, improving result accuracy and flexibility.
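The two ideas in the last item, a configurable timeout and an iteration count derived from what actually ran, can be sketched as follows. This is a guess at the general pattern, not the code in `src/browser/benchmark.ts`: `resolveTimeoutMs`, `buildReport`, and the `RunRecord` shape are hypothetical names introduced for illustration.

```typescript
// Hypothetical helpers; names and shapes are assumptions, not the PR's code.
type RunRecord = { ok: boolean };

// Parse an optional timeout setting (for example, from a CLI flag or an
// environment variable) and fall back when it is missing or not a positive
// finite number.
function resolveTimeoutMs(raw: string | undefined, fallbackMs: number): number {
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallbackMs;
}

// Derive the reported iteration count from the recorded runs rather than
// from the requested count, so an aborted run is not over-reported.
function buildReport(runs: RunRecord[]) {
  return {
    iterations: runs.length,
    successful: runs.filter((r) => r.ok).length,
  };
}
```

Counting the recorded runs keeps the report honest when a benchmark stops early, for instance after a timeout, because the denominator then reflects the iterations that were actually attempted.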