-
Notifications
You must be signed in to change notification settings - Fork 244
Add RTFx tracking to qwen3-asr-benchmark and validate RTFx in all benchmarks #454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
05ac224
c3aba6a
8b79517
31066de
8ea07ff
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -104,6 +104,13 @@ jobs: | |
| echo "MAX_FILES=$MAX_FILES" >> $GITHUB_OUTPUT | ||
| echo "BENCHMARK_STATUS=$BENCHMARK_STATUS" >> $GITHUB_OUTPUT | ||
|
|
||
| # Validate RTFx - 0 or N/A indicates benchmark failure | ||
| if [ "$RTFx" = "0.00" ] || [ "$RTFx" = "N/A" ]; then | ||
| echo "❌ CRITICAL: RTFx is 0 or N/A - benchmark failed" | ||
| echo "RTFx value: $RTFx" | ||
| exit 1 | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🔴 Same issue as in Prompt for agentsWas this helpful? React with 👍 or 👎 to provide feedback. |
||
| fi | ||
|
|
||
| - name: Comment PR | ||
| if: github.event_name == 'pull_request' | ||
| continue-on-error: true | ||
|
|
||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟡 CI cache path references old folder name The CI workflow caches (Refers to line 30) Was this helpful? React with 👍 or 👎 to provide feedback. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -69,6 +69,31 @@ jobs: | |
| echo "SMOKE_STATUS=FAILED" >> $GITHUB_OUTPUT | ||
| fi | ||
|
|
||
| # Extract RTFx metrics if results file exists | ||
| if [ -f qwen3_results_int8.json ]; then | ||
| MEDIAN_RTFx=$(jq -r '.summary.medianRTFx // "N/A"' qwen3_results_int8.json 2>/dev/null) | ||
| OVERALL_RTFx=$(jq -r '.summary.overallRTFx // "N/A"' qwen3_results_int8.json 2>/dev/null) | ||
|
|
||
| [ "$MEDIAN_RTFx" != "null" ] && [ "$MEDIAN_RTFx" != "N/A" ] && MEDIAN_RTFx=$(printf "%.2f" "$MEDIAN_RTFx") || MEDIAN_RTFx="N/A" | ||
| [ "$OVERALL_RTFx" != "null" ] && [ "$OVERALL_RTFx" != "N/A" ] && OVERALL_RTFx=$(printf "%.2f" "$OVERALL_RTFx") || OVERALL_RTFx="N/A" | ||
|
|
||
| echo "MEDIAN_RTFx=$MEDIAN_RTFx" >> $GITHUB_OUTPUT | ||
| echo "OVERALL_RTFx=$OVERALL_RTFx" >> $GITHUB_OUTPUT | ||
|
|
||
| # Fail if RTFx is 0 or N/A - indicates benchmark failure | ||
| if [ "$MEDIAN_RTFx" = "N/A" ] || [ "$MEDIAN_RTFx" = "0.00" ] || [ "$OVERALL_RTFx" = "N/A" ] || [ "$OVERALL_RTFx" = "0.00" ]; then | ||
| echo "❌ CRITICAL: RTFx is 0 or N/A - benchmark failed to produce valid results" | ||
| echo "Median RTFx: $MEDIAN_RTFx" | ||
| echo "Overall RTFx: $OVERALL_RTFx" | ||
| exit 1 | ||
| fi | ||
| else | ||
| echo "❌ CRITICAL: Results file not found - benchmark failed" | ||
| echo "MEDIAN_RTFx=N/A" >> $GITHUB_OUTPUT | ||
| echo "OVERALL_RTFx=N/A" >> $GITHUB_OUTPUT | ||
| exit 1 | ||
|
Comment on lines
+88
to
+94
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🔴 Same issue: the new Prompt for agentsWas this helpful? React with 👍 or 👎 to provide feedback. |
||
| fi | ||
|
|
||
| EXECUTION_TIME=$(( ($(date +%s) - BENCHMARK_START) / 60 ))m$(( ($(date +%s) - BENCHMARK_START) % 60 ))s | ||
| echo "EXECUTION_TIME=$EXECUTION_TIME" >> $GITHUB_OUTPUT | ||
|
|
||
|
|
@@ -81,6 +106,9 @@ jobs: | |
| const status = '${{ steps.smoketest.outputs.SMOKE_STATUS }}'; | ||
| const emoji = status === 'PASSED' ? '✅' : '❌'; | ||
|
|
||
| const medianRTFx = '${{ steps.smoketest.outputs.MEDIAN_RTFx }}'; | ||
| const overallRTFx = '${{ steps.smoketest.outputs.OVERALL_RTFx }}'; | ||
|
|
||
| const body = `## Qwen3-ASR int8 Smoke Test ${emoji} | ||
|
|
||
| | Check | Result | | ||
|
|
@@ -91,6 +119,12 @@ jobs: | |
| | Transcription pipeline | ${emoji} | | ||
| | Decoder size | 571 MB (vs 1.1 GB f32) | | ||
|
|
||
| ### Performance Metrics | ||
| | Metric | CI Value | Expected on Apple Silicon | | ||
| |--------|----------|--------------------------| | ||
| | Median RTFx | ${medianRTFx}x | ~2.5x | | ||
| | Overall RTFx | ${overallRTFx}x | ~2.5x | | ||
|
|
||
| <sub>Runtime: ${{ steps.smoketest.outputs.EXECUTION_TIME }}</sub> | ||
|
|
||
| <sub>**Note:** CI VM lacks physical GPU — CoreML MLState (macOS 15) KV cache produces degraded results on virtualized runners. On Apple Silicon: ~1.3% WER / 2.5x RTFx.</sub> | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -129,16 +129,8 @@ public enum Repo: String, CaseIterable { | |
| return "nemotron-streaming/560ms" | ||
| case .sortformer: | ||
| return "sortformer" | ||
| case .lseend: | ||
| return "ls-eend" | ||
| case .pocketTts: | ||
| return "pocket-tts" | ||
| case .multilingualG2p: | ||
| return "charsiu-g2p-byt5" | ||
| case .parakeetTdtCtc110m: | ||
| return "parakeet-tdt-ctc-110m" | ||
| default: | ||
| return name | ||
| return name.replacingOccurrences(of: "-coreml", with: "") | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🔴 Changing The refactoring intended to simplify the 4 explicitly removed cases (
Was this helpful? React with 👍 or 👎 to provide feedback. |
||
| } | ||
| } | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔴
exit 1prevents PR comment from being posted because Comment PR step lacksalways()The old code at this location had a deliberate comment:
# Don't exit with error to allow PR comment to be posted. This PR replaces that withexit 1. The subsequent "Comment PR" step atasr-benchmark.yml:254usesif: github.event_name == 'pull_request'which, per GitHub Actions docs, implicitly becomesif: success() && github.event_name == 'pull_request'. Whenexit 1fires,success()evaluates to false and the PR comment step is skipped entirely. Benchmark results will not be visible on the PR. The other workflows that already hadexit 1and work correctly (diarizer, sortformer) useif: always()on their comment steps.Prompt for agents
Was this helpful? React with 👍 or 👎 to provide feedback.