[Test] Add dedicated perf test suite with entrypoint matrix #15

Open
cennn wants to merge 5 commits into SandAI-org:main from cennn:chore/perf-tests-main-clean

cennn commented Apr 1, 2026

🗂️ PR Category

  • ✨ New Feature
  • 🚀 Optimization (performance, memory, etc.)
  • 💥 Breaking Change
  • 🐛 Bug Fix
  • 🛠️ Development / Refactoring
  • 📚 Documentation
  • 🧹 Chore (Dependencies, CI/CD, Configuration, etc.)
  • 🧪 Testing

📝 Description

Add a dedicated tests/perf_tests suite and move performance benchmarking helpers there.

This PR adds end-to-end perf coverage for MLP, norm+residual fusion, and pointwise fusion across class, instance, function, and method entrypoints (including instance with TORCH_COMPILE mode). It compares results against eager and raw torch.compile baselines, and marks the known fusion gap for follow-up with TODO(perf-fusion-gap).
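For readers unfamiliar with this kind of comparison, here is a minimal sketch of how a speedup against a baseline can be measured. It is not the helper from this PR: `median_runtime` and `speedup` are hypothetical names, and the warmup and iteration counts are arbitrary assumptions.

```python
import statistics
import time
from typing import Callable


def median_runtime(fn: Callable[[], object], warmup: int = 3, iters: int = 10) -> float:
    """Return the median wall-clock time of fn() in seconds (hypothetical helper)."""
    for _ in range(warmup):
        fn()  # warmup calls absorb one-time costs such as compilation
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_s: float, candidate_s: float) -> float:
    """Speedup of the candidate over the baseline; above 1.0 means faster."""
    return baseline_s / candidate_s


if __name__ == "__main__":
    # Stand-in workloads; a real perf test would time eager and compiled modules.
    slow = median_runtime(lambda: sum(i * i for i in range(20_000)))
    fast = median_runtime(lambda: sum(i * i for i in range(2_000)))
    print(f"speedup: {speedup(slow, fast):.2f}x")
```

When timing GPU work, the timed callable would also need to synchronize the device (for example with `torch.cuda.synchronize()`) before the clock is read, otherwise only the kernel launch is measured.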

📊 Perf Snapshot (current run)

Measured with pytest -q tests/perf_tests -s on the current branch.

| Scenario | torch.compile vs eager | magi_compile vs eager (entrypoint range) | magi_compile vs torch.compile |
| --- | --- | --- | --- |
| MLP | 1.80x | 1.80x ~ 1.89x | 1.00x ~ 1.05x |
| Norm + Residual + SiLU | 9.91x | 4.51x ~ 4.63x | 0.46x ~ 0.47x |
| Pointwise fusion chain | 5.92x | 3.46x ~ 3.58x | 0.58x ~ 0.60x |
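The last column is consistent with dividing each magi_compile speedup by the torch.compile speedup for the same scenario. The sketch below checks that arithmetic using only the numbers in the snapshot. It is an assumption that the column was derived this way rather than measured directly, and `relative_to_torch_compile` is a hypothetical name.

```python
# Speedups versus eager, copied from the perf snapshot.
snapshot = {
    "MLP": {"torch_compile": 1.80, "magi_compile": (1.80, 1.89)},
    "Norm + Residual + SiLU": {"torch_compile": 9.91, "magi_compile": (4.51, 4.63)},
    "Pointwise fusion chain": {"torch_compile": 5.92, "magi_compile": (3.46, 3.58)},
}


def relative_to_torch_compile(torch_speedup, magi_range):
    """Convert a (low, high) speedup range versus eager into a range versus torch.compile."""
    low, high = magi_range
    return (low / torch_speedup, high / torch_speedup)


for name, row in snapshot.items():
    low, high = relative_to_torch_compile(row["torch_compile"], row["magi_compile"])
    print(f"{name}: {low:.2f}x ~ {high:.2f}x")

# Prints:
# MLP: 1.00x ~ 1.05x
# Norm + Residual + SiLU: 0.46x ~ 0.47x
# Pointwise fusion chain: 0.58x ~ 0.60x
```

Values below 1.00x mean magi_compile is slower than raw torch.compile for that scenario, which is the gap tracked by TODO(perf-fusion-gap).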

Notes

  • All perf tests pass with current thresholds.
  • Coverage includes class / instance / function / method entrypoints, plus instance with TORCH_COMPILE mode.
  • TODO(perf-fusion-gap) is added for fusion-heavy workloads where magi_compile still trails raw torch.compile.
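As an illustration of what a threshold gate of this kind can look like, here is a minimal sketch. It is not the helper in tests/perf_tests/utils.py, whose signature is not shown in this PR description: the name `assert_speedup`, its parameters, and the threshold values used below are all hypothetical.

```python
def assert_speedup(baseline_s: float, candidate_s: float,
                   min_speedup: float, label: str = "") -> float:
    """Fail if the candidate is not at least min_speedup times faster than the baseline.

    Hypothetical helper: times are in seconds, and the observed speedup is returned.
    """
    if candidate_s <= 0:
        raise ValueError("candidate_s must be positive")
    observed = baseline_s / candidate_s
    assert observed >= min_speedup, (
        f"{label}: expected at least {min_speedup:.2f}x over baseline, "
        f"got {observed:.2f}x"
    )
    return observed


if __name__ == "__main__":
    # Illustrative timings and threshold only; not values from this PR.
    observed = assert_speedup(baseline_s=1.0, candidate_s=0.5,
                              min_speedup=1.5, label="MLP/instance")
    print(f"observed speedup: {observed:.2f}x")
```

Keeping the check in one shared function means every scenario and entrypoint reports failures in the same format, and a threshold change only has to be made at the call site.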

cennn added 3 commits April 1, 2026 21:30
Split perf benchmarks into tests/perf_tests with shared benchmarking helpers and add class/instance/function/method coverage plus torch-compile mode checks across MLP, norm-residual fusion, and pointwise chains.

Made-with: Cursor
Document the known magi vs torch.compile gap in fusion-heavy perf suites so follow-up optimization work has explicit tracking context.

Made-with: Cursor
Apply black-driven formatting updates for perf benchmark utilities and perf test files so repository hooks pass consistently in local and CI workflows.

Made-with: Cursor
@cennn cennn changed the title [test] add dedicated perf test suite with entrypoint matrix [Test] Add dedicated perf test suite with entrypoint matrix Apr 1, 2026
Lower MLP, norm-residual, and pointwise speedup gates to reflect observed CI variance while preserving meaningful eager-baseline improvements across entrypoints.

Made-with: Cursor
Move the repeated perf speedup assertion helper into tests/perf_tests/utils.py and reuse it across MLP, norm-residual fusion, and pointwise perf tests to reduce duplication and keep threshold checks consistent.

Made-with: Cursor