[XLA:CPU] Optimize Gather with non-contiguous slice sizes #39420

Open
copybara-service[bot] wants to merge 1 commit into main from test_885360168
@copybara-service
[XLA:CPU] Optimize Gather with non-contiguous slice sizes

Optimizes gather with non-contiguous slice sizes using two complementary approaches:

  1. Gather-Transpose prepacking in GatherSimplifier: Allows constant folding to hoist the transpose completely out of loops when weights are static.
  2. Contiguous priority heuristics in LayoutAssignment: Reduces cache thrashing for dynamic buffers by guiding the layout assignment heuristic to assign a column-major memory stride when the corresponding gather slices are column-major.
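
To illustrate why a transposed (prepacked) operand helps, here is a minimal pure-Python sketch. It is not the XLA implementation: the helper names (`gather_columns_strided`, `prepack_transpose`, `gather_rows_contiguous`) and the flat row-major buffer model are assumptions made for illustration only. It shows that gathering columns from a row-major buffer touches elements one full row apart, while gathering rows from a one-time transposed copy reads contiguous runs and yields the same values.

```python
def gather_columns_strided(buf, rows, cols, col_ids):
    """Gather whole columns from a row-major (rows x cols) buffer.

    Consecutive elements of each gathered slice are `cols` apart in memory,
    which is the non-contiguous access pattern.
    """
    return [[buf[r * cols + c] for r in range(rows)] for c in col_ids]


def prepack_transpose(buf, rows, cols):
    """One-time transpose into a row-major (cols x rows) buffer.

    With static weights this step could be done once, outside any loop.
    """
    return [buf[r * cols + c] for c in range(cols) for r in range(rows)]


def gather_rows_contiguous(buf_t, row_len, row_ids):
    """Gather whole rows from the prepacked buffer; each slice is contiguous."""
    return [buf_t[i * row_len:(i + 1) * row_len] for i in row_ids]


# Hypothetical 3 x 4 weight matrix stored row-major as 0..11.
rows, cols = 3, 4
weights = list(range(rows * cols))
col_ids = [2, 0, 2]

strided = gather_columns_strided(weights, rows, cols, col_ids)
packed = prepack_transpose(weights, rows, cols)
contiguous = gather_rows_contiguous(packed, rows, col_ids)

# Both paths produce the same gathered slices.
assert strided == contiguous
```

In this model, assigning a column-major layout to the operand (the second approach) has the same effect on access stride as the explicit transpose, without requiring the operand to be constant.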

PiperOrigin-RevId: 885360168