PTX: add PTXRSqrtFastPass to fold afn 1/sqrt(x) to nvvm.rsqrt.approx #807

Merged
maleadt merged 2 commits into main from tb/rsqrt
May 21, 2026

maleadt commented May 21, 2026

Pattern-match `fdiv afn 1.0, sqrt afn(x)` and emit `nvvm.rsqrt.approx.{f,d}` directly. The pass runs before `PTXFDivFastPass`/`PTXFSqrtFastPass`, which would otherwise rewrite the `fdiv` and the `sqrt` individually and destroy the pattern before instruction selection sees it.

x-ref JuliaGPU/CUDA.jl#3149 (comment), cc @vchuravy
Continuing the effort from #800, #804, #805
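
For illustration, the fold described above can be sketched as a before/after pair of LLVM IR functions for the `float` case. This is a hand-written sketch, not output captured from the pass: the function names `@rsqrt_before` and `@rsqrt_after` are hypothetical, and the fully qualified intrinsic name `llvm.nvvm.rsqrt.approx.f` is assumed to be what the `nvvm.rsqrt.approx.{f,d}` shorthand refers to (with `.d` as the `double` counterpart).

```llvm
; Input pattern: both the sqrt call and the fdiv carry the `afn` flag,
; and the numerator is the constant 1.0.
declare float @llvm.sqrt.f32(float)

define float @rsqrt_before(float %x) {          ; hypothetical name
  %s = call afn float @llvm.sqrt.f32(float %x)
  %r = fdiv afn float 1.0, %s
  ret float %r
}

; Assumed result of the fold: a single approximate reciprocal square root.
declare float @llvm.nvvm.rsqrt.approx.f(float)

define float @rsqrt_after(float %x) {           ; hypothetical name
  %r = call float @llvm.nvvm.rsqrt.approx.f(float %x)
  ret float %r
}
```

If the divide and square-root passes ran first, `%s` and `%r` in the first function would presumably each be replaced by their own approximate intrinsic, and the combined `1/sqrt(x)` shape would no longer be available to match.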

Pattern-match `fdiv afn 1.0, sqrt afn(x)` and emit `nvvm.rsqrt.approx.{f,d}`
directly. Runs before PTXFDivFastPass/PTXFSqrtFastPass so the pattern isn't
destroyed before instruction selection sees it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
codecov Bot commented May 21, 2026

Codecov Report

❌ Patch coverage is 98.03922% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 76.23%. Comparing base (f7d7418) to head (65b5b91).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/ptx.jl | 98.03% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #807      +/-   ##
==========================================
+ Coverage   75.95%   76.23%   +0.27%     
==========================================
  Files          25       25              
  Lines        4026     4077      +51     
==========================================
+ Hits         3058     3108      +50     
- Misses        968      969       +1     

maleadt merged commit b57a0e1 into main May 21, 2026
71 of 73 checks passed
maleadt deleted the tb/rsqrt branch May 21, 2026 12:51