Conversation
change a comment to be more clear
put back some of the original comments concerning FieldArray
fix minor grammar error in comment
There was a problem hiding this comment.
Pull request overview
This PR adds 64-byte (512-bit) buffer alignment to MPI send/receive staging buffers in the mpisppy cylinder communication framework. The motivation is that some solvers (e.g., those using AVX-512 SIMD) internally align memory to 512-bit boundaries, which could cause memory collisions if the mpisppy communication buffers are not similarly aligned.
Changes:
spwindow.py: TheSPWindowlayout tuple is extended from(offset, length)to(offset, logical_len, padded_len), wherepadded_lenrounds uplogical_lento the next multiple of 8 doubles (64 bytes). A_padded_len_8doubleshelper is added.spcommunicator.py:communicator_arraynow allocates MPI-aligned memory viaMPI.Alloc_memwith a padded length, andFieldArrayexposes both a padded "window view" (window_array()) and a logical view (array()). Validation functions are updated to check both logical and padded lengths.reduced_costs_spoke.py: Minor comment and a (tautological) assertion added.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
mpisppy/cylinders/spwindow.py |
Extended layout tuple to 3-element (offset, logical_len, padded_len); added _padded_len_8doubles helper; updated get/put to transfer padded_len doubles; updated WHOLE field construction |
mpisppy/cylinders/spcommunicator.py |
communicator_array now uses MPI.Alloc_mem for alignment; FieldArray gains dual-view (window_array / array); validation in _validate_recv_field and register_recv_field updated for new tuple layout |
mpisppy/cylinders/reduced_costs_spoke.py |
Added a comment about using logical_len and a tautological assert |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def _padded_len_8doubles(logical_len: int) -> int: | ||
| """Round up length (in doubles) to a multiple of 8 doubles (64 bytes).""" | ||
| return ((logical_len + 7) // 8) * 8 | ||
|
|
There was a problem hiding this comment.
We hand-calculate this in a bunch of places -- ideally this function would be the singular source of truth on computing the pad.
There was a problem hiding this comment.
And ideally, we'd rename it to something a bit more generic so we can just change the padding here, if needed in the future. (Drop the 8.)
| # Fallback: caller provided padded only (not recommended) | ||
| padded_len = int(entry) | ||
| logical_len = padded_len | ||
|
|
There was a problem hiding this comment.
Is this really needed? If not, let's get rid of it.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
It seems that Send/Receive buffers (windows) need to align on 512 bit boundaries because solvers use special packages that do that and someone might clobber someone else if we don't all line up.