Pull requests: pytorch/helion
[cutedsl] Fix packed-RHS lowering and add general stack/reshape views
CLA Signed
This label is managed by the Meta Open Source bot.
#1784
opened Mar 22, 2026 by
jansel
[cutedsl] Fix broadcast and reshape-backed matmul lowering
CLA Signed
#1783
opened Mar 22, 2026 by
jansel
[cutedsl] tcgen05 MMA support
CLA Signed
#1777
opened Mar 21, 2026 by
jansel
Fix device sync in generic benchmarking functions for TPU/Pallas
CLA Signed
#1773
opened Mar 21, 2026 by
norx1991
restrict to persistent pid for dist kernels
CLA Signed
#1772
opened Mar 21, 2026 by
shunting314
move distributed runtime utils out of examples/ folder
CLA Signed
#1771
opened Mar 20, 2026 by
shunting314
avoid matching against class name in default_autotuner_fn
CLA Signed
#1770
opened Mar 20, 2026 by
shunting314
Support block_ptr/TensorDescriptor with extra_mask for loads
CLA Signed
#1768
opened Mar 20, 2026 by
hinriksnaer
Removing skips and in some cases adding skipIfNotCUDA for cuda only features.
CLA Signed
#1766
opened Mar 20, 2026 by
umechand-amd
•
Draft
[metal] Support elementwise kernels with >1D tensors
CLA Signed
[metal] MslAstWalker + MetalBackend MSL codegen + device function hook + tests
CLA Signed
[metal] Metal codegen for load/store/mask_to
CLA Signed
increase signal pad size for dist matmul kernels
CLA Signed
#1753
opened Mar 20, 2026 by
shunting314
avoid nvshmem symm-mem backend
CLA Signed
#1750
opened Mar 19, 2026 by
shunting314
[WIP] Add reduction support to helion autodiff
CLA Signed
#1747
opened Mar 19, 2026 by
karthickai
•
Draft
[Autotuner] Add generation_ctx() context manager for each autotuner generation
CLA Signed
#1745
opened Mar 19, 2026 by
yf225
1 task done
add kernel-filter to select kernel for allreduce-rmsnorm
CLA Signed
#1744
opened Mar 19, 2026 by
shunting314
more APIs to debug distributed kernel
CLA Signed
#1743
opened Mar 19, 2026 by
shunting314
[Helion + torch.compile] Enable torch.compile fusion tests
CLA Signed
#1727
opened Mar 16, 2026 by
yf225
[Helion + torch.compile] Add store/load transform hooks and prologue/epilogue fusion codegen
CLA Signed
#1724
opened Mar 16, 2026 by
yf225
Add scheduled workflow to rerun GPU health check failures
CLA Signed
#1683
opened Mar 14, 2026 by
v0i0
Fix #934: Add setting to disable 0/1 specialization
CLA Signed
#1678
opened Mar 13, 2026 by
tianrengao
•
Draft
Increasing block size dimensions to avoid configs which are slow and poor candidates.
CLA Signed
#1677
opened Mar 13, 2026 by
umechand-amd