Eliminate redundant NCHW↔NHWC permute_copy and NHWC-safe view_copy transposes in ToTosaMemoryFormatPass (#18314)
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18314
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 4 Unrelated Failures
As of commit 7ebe074 with merge base fb90480:
NEW FAILURES - The following jobs have failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from e505b3e to 7ebe074
Hi, thanks for the PR! This is a complex topic to get right in all cases, and FYI we are also planning to improve this internally, so it is very nice to get some help with it. I see there are some errors in our unit tests, so it looks like there are a few edge cases to iron out before a proper review. Let us know if you have any questions about the current logic to help with this. In the meantime, I have two comments:
Summary:
Two optimizations in ToTosaMemoryFormatPass to reduce TOSA TRANSPOSE nodes:
1. **NHWC-safe reshape detection:** When a 4D→4D view_copy has monotonic
   shape_indices on the raw shapes and preserves both the batch dim (index 0)
   and the last dimension (NHWC channel) alone in their output groups, skip
   inserting input/output transposes. The view_copy can operate directly on
   NHWC data.
2. **Redundant permute_copy elimination:** Model-level permute_copy ops whose
   permutation matches channels_last_order (NCHW→NHWC) or its inverse
   (NHWC→NCHW) AND whose input already has NHWC tosa_dim_order are redundant
   with the tosa_dim_order annotation. Replace them with view_copy (identity
   reshape) to avoid generating TOSA TRANSPOSE nodes. Standalone permute
   models (NCHW input from placeholder) are not affected.
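For readers following the NHWC-safe reshape condition in the summary, here is a minimal sketch of one way such a check could look. It is not the code from this PR: `group_dims` and `is_nhwc_safe_reshape` are hypothetical names, the grouping rule is one interpretation of "shape_indices", and the shapes are assumed to be given in the NHWC layout with positive sizes. The sketch is deliberately conservative and may reject some reshapes involving size-1 dimensions that would in fact be safe.

```python
from math import prod


def group_dims(src, dst):
    """Pair consecutive runs of src dims and dst dims with equal element counts.

    Hypothetical helper: returns a list of (src_indices, dst_indices) pairs,
    or None when the two shapes do not describe the same number of elements.
    """
    if prod(src) != prod(dst):
        return None
    groups, i, j = [], 0, 0
    while i < len(src) and j < len(dst):
        src_idx, dst_idx = [i], [j]
        src_size, dst_size = src[i], dst[j]
        i, j = i + 1, j + 1
        # Grow whichever side is smaller until both runs hold the same count.
        while src_size != dst_size:
            if src_size < dst_size:
                if i == len(src):
                    return None
                src_size *= src[i]
                src_idx.append(i)
                i += 1
            else:
                if j == len(dst):
                    return None
                dst_size *= dst[j]
                dst_idx.append(j)
                j += 1
        groups.append((src_idx, dst_idx))
    if groups:
        # Leftover dims can only have size 1; fold them into the last group.
        groups[-1][0].extend(range(i, len(src)))
        groups[-1][1].extend(range(j, len(dst)))
    return groups


def is_nhwc_safe_reshape(src, dst):
    """True when a 4D->4D reshape keeps dim 0 and dim 3 alone in their groups."""
    if len(src) != 4 or len(dst) != 4:
        return False
    groups = group_dims(src, dst)
    if not groups:
        return False
    return groups[0] == ([0], [0]) and groups[-1] == ([3], [3])


# Only the two middle dims are regrouped, so batch and channel are untouched.
print(is_nhwc_safe_reshape((1, 4, 4, 8), (1, 2, 8, 8)))
# The last dim is merged with a neighbour, so the check rejects the reshape.
print(is_nhwc_safe_reshape((1, 4, 4, 8), (1, 4, 8, 4)))
```

Because the groups are built by walking both shapes left to right, they are monotonic by construction; the predicate then only needs to confirm that the first and last groups each contain a single dimension on both sides.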
Differential Revision: D97266678
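The redundant permute_copy condition from the summary can be illustrated with a small standalone sketch. This is an assumption-laden illustration rather than the pass itself: `NHWC_ORDER`, `inverse`, and `is_redundant_permute` are hypothetical names, the dim order is passed in as a plain tuple instead of being read from a node's `tosa_dim_order` metadata, and only the 4D case is covered.

```python
# Channels-last dim order for a 4D tensor: NCHW -> NHWC.
NHWC_ORDER = (0, 2, 3, 1)


def inverse(perm):
    """Return the permutation that undoes `perm`."""
    inv = [0] * len(perm)
    for dst, src in enumerate(perm):
        inv[src] = dst
    return tuple(inv)


def is_redundant_permute(perm, input_dim_order):
    """True when the permute only restates a layout the annotation already encodes.

    Both conditions from the summary must hold: the permutation is
    NCHW->NHWC or its inverse, and the input is already annotated as NHWC.
    """
    perm = tuple(perm)
    matches_layout_swap = perm in (NHWC_ORDER, inverse(NHWC_ORDER))
    return matches_layout_swap and tuple(input_dim_order) == NHWC_ORDER


# Input already annotated as NHWC: the layout permute is a candidate for removal.
print(is_redundant_permute((0, 2, 3, 1), (0, 2, 3, 1)))
# Standalone permute on an NCHW placeholder input: left untouched.
print(is_redundant_permute((0, 2, 3, 1), (0, 1, 2, 3)))
```

Requiring both conditions is what keeps standalone permute models unaffected: a permute fed directly by an NCHW placeholder fails the dim-order test even though its permutation matches.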