Updated ABI generation code and new libraries #13280
Draft: hppritcha wants to merge 149 commits into open-mpi:main from hppritcha:abi-generate-ver2
+81,788
−1,666
Commits
80f6260 Add initial ABI generation code and new libraries (jtronge)
d166410 Move ABI support into big count binding code (hppritcha)
ef656f2 bindings: fix up makefile for c interfaces (hppritcha)
7f2cd9a makefile fixes (hppritcha)
d6e5c6c checkpoint (hppritcha)
381f4e5 some fixes and start of a wrapper method (hppritcha)
eee038b attributes: add wrapper infrastructure (hppritcha)
71791dc abi: move converter functions to a persistent file (hppritcha)
49403a9 minor compiler complaint fix (hppritcha)
e748851 temp commit with comments notes (hppritcha)
c633170 Switch to using MPI Standard ABI values (Joe-Downs)
b2229a2 distcheck fix and more (hppritcha)
be305d9 git ignore additions (hppritcha)
479bd68 ABI library: don't include functions in 19.3.4/5 (hppritcha)
7cf7127 fixes to handle count/offset/aint in abi.h (hppritcha)
af2271f fix req converter abi to ompi (hppritcha)
6e1edf5 fix problem with request_get_status_some (hppritcha)
2d9002d bindings: fix up REQUEST_CONST for abi (hppritcha)
4b0eb93 tools: first steps to add support for ABI (hppritcha)
dd5c2fb fix problem with undefined symbols in libmpi_abi (hppritcha)
21a7577 hack on abi json file (hppritcha)
a2bc17e abi.h template - add MPI_T related structs (hppritcha)
69c928d binding framework: fixes for MPI T stuff (hppritcha)
ba56a10 binding framework: add a TS_LEVEL type (hppritcha)
dfa5f38 more hacks on the mpi-standard-5.0-abi.json (hppritcha)
19bbbb5 fix for enable-mca-dso (hppritcha)
1073417 makefile changes to add symbols to libmpi_abi (hppritcha)
3da4572 abi: fix problems with error handler converters (hppritcha)
1c63109 fix problems with makefiles and some symbols (hppritcha)
dc44841 abi: move mpi_type_get_envelope etc. into templates (hppritcha)
ae2cb8f rebase fixup (hppritcha)
632980d add abi variantes of mpi_aint_diff and add (hppritcha)
a278aa4 add abi_set/get_fortran_info (hppritcha)
2aed67f abi_fortran_stuff: fix up the imp of these (hppritcha)
a9cef94 abi_converters: add fortran datatypes to (hppritcha)
873c979 fix for mac-os CI (hppritcha)
5ac0810 pr feedback on add/diff for aints (hppritcha)
91a9df5 configury: discover fortran logical false (hppritcha)
0a3858b configury fix (hppritcha)
e3e6536 squashme: temporary commit (hppritcha)
2924479 add abi_get/set_fortran_booleans c interfaces (hppritcha)
6158671 abi_fortran: add support for LOGICAL16 (hppritcha)
cbc60ab logical16 patch (hppritcha)
b4f7e3d add comm_from/toint (hppritcha)
b066003 complete toint/fromint interfaces (hppritcha)
3a8b0e2 fix error return values for ABI routines (hppritcha)
6acb24c minor fixup for toint/fromint (hppritcha)
3ad3056 rebase fix (hppritcha)
031c219 handle TAG more correctly (hppritcha)
5d5e239 squash compiler warning (hppritcha)
66d2232 add hooks for TAG_OUT type (hppritcha)
bbbceae add better support for MPI_ROOT and source (hppritcha)
e56f2ff squash a compiler warning (hppritcha)
46302f1 some fixes to comm attributes wrappers (hppritcha)
59450a0 fix bug in comm attr copy code (hppritcha)
3a273c3 some fixes for attributes and more (hppritcha)
d4e6d20 checkpoint (hppritcha)
0c6544e fix for datatype converters (hppritcha)
9183e23 distcheck fix (hppritcha)
4ff7ad1 EVENT_INSTANCE: arg type cast fix (hppritcha)
d4009d5 MPI_ROOT: capture proc null type too (hppritcha)
0238be8 requests: fixes to some multirequest test functions (hppritcha)
20d3588 weights and source out support/fixes (hppritcha)
895df0c some fixes for message related functions (hppritcha)
ff2ac2f fix mpi4py break (hppritcha)
d69af57 fix a few problems with datatype bindings (hppritcha)
47eae52 update gitignore (hppritcha)
24a00e6 add replacements to code bodies for various string lengths (hppritcha)
34788ad add support for distrib array and order (hppritcha)
025a035 add support for mode bits - in only (hppritcha)
d668021 add support for amode out (hppritcha)
fcf7b2f add support for whence (hppritcha)
13bd733 add support for some win attributes (hppritcha)
1595c7c handle special case of MPI_DISPLACEMENT_CURRENT (hppritcha)
a384ed5 add support for combiner, typeclass, win lock assert (hppritcha)
e342242 c_header: comment out deprecated functions (hppritcha)
749c6e4 fix problem with special attrs for windows (hppritcha)
8ad1745 fix for win_shared_query (hppritcha)
7ebcc06 fixes to rget/rget_accumulate (hppritcha)
2299970 fix problem with code gen for win create keyval (hppritcha)
48d3fa9 fix rank problem in rput/raccumulate (hppritcha)
13f772c add MPI_GROUP_EMPTY to predefined group handles (hppritcha)
1b123ac handle user error classes and codes (hppritcha)
dea9a8e temporary WAR for non-blocking alltoallw (hppritcha)
eeecd1e add support for comm topos (hppritcha)
3d4bbd1 support INOUT attribute for all handles (hppritcha)
c9cd969 fix problem with attribute callback handling (hppritcha)
6ed862e catch use of special buffer consts (hppritcha)
ac67000 remove some debug statements (hppritcha)
21d1d2b swat nit (hppritcha)
df30cd6 patch get_address (hppritcha)
5fe5231 add new type to handle out void stars (hppritcha)
f6b9480 fixes for datatypes for neighbor collectives (hppritcha)
439ec21 add inouts for op, errhandler, info (hppritcha)
38fb64f cleanup datatype tmps for ialltoallw and friends (hppritcha)
79ceb9c a logical16 fix (hppritcha)
c9c32a0 abi_get_version/get_info add to ompi abi lib (hppritcha)
7a2bcae toint fixes (hppritcha)
f50928e fix issue with ASYNC data arrays cleanup (hppritcha)
0c60182 various fixes from dalcinl (hppritcha)
96c507d fix a problem (hppritcha)
e275324 ompi-codegen.patch from dalcinl (hppritcha)
d63a63f ompi-abiinfo.patch from dalcinl (hppritcha)
3ac45aa ompi-status.patch from dalcinl (hppritcha)
8dea027 move some deprecated funcs out of libmpi-abi (hppritcha)
f145131 add support for typeclass (hppritcha)
8466a54 ops: convert data from internal to abi (hppritcha)
14514d4 split rdma modes out from modes for files (hppritcha)
9ca1097 add SOURCE_ARRAY type (hppritcha)
4bab097 fixes to abi_get_fortran_info (hppritcha)
bb6b477 apply patch omp-op-inout.patch (hppritcha)
6c9ab02 apply patch ompi-abi-fortran.patch (hppritcha)
9bf2914 fix for FD_INOUT (hppritcha)
01e2e23 fix for isend dst arg (hppritcha)
daf1419 apply patch ompi-query-thread.patch (hppritcha)
c6dbcb2 various pt2pt fixes to handle proc_null etc. (hppritcha)
d2cec64 rma: updates to args to handle proc_null etc (hppritcha)
3dc59ca adjustments for improbe and iprobe (hppritcha)
ea98001 fix for status array for MPI_Request_testsome (hppritcha)
2d80888 switch to using a malloc wrapper (hppritcha)
5bd0849 plug memory leak handling REQUEST_INOUT type (hppritcha)
5eba4d7 patch status out to handle copy in (hppritcha)
974106a handle dargs for darray_create correctly (hppritcha)
33ac09c fixes for spawn multiple - array of info args (hppritcha)
6f76e2f disable async NBC-based cleanup for now (hppritcha)
892ec37 turn back on async array cleanup stuff (hppritcha)
6ab84db cleanup: patch the coll libnbc to free data arrays (hppritcha)
513516b small patch from dalcinl (hppritcha)
ae7c854 info array tweak from dalcinl (ompi-info-array.patch) (hppritcha)
70bfcb3 MPI_Info_toint/fromint allow to be calleable (hppritcha)
88a0957 first pass at errhandler support (hppritcha)
1d505a4 gitignore - add a file (hppritcha)
ca91828 simplify python support for user-defined err handlers (hppritcha)
51fde87 minor compiler warning cleanups (hppritcha)
c7067ca VERSION - add support for versioning libmpi_abi (hppritcha)
1ce30e7 add man pages for new MPI_Abi and fromint/toint functions (hppritcha)
37ddb2c attributes - clean up extra helper data (hppritcha)
0799d0c minor cleanup (hppritcha)
06bb520 a bit more cleanup (hppritcha)
221a043 add a readme about the MPI ABI support (hppritcha)
31f6c32 add short blurb for users about c ABI support (hppritcha)
ca05d9d fix up blurb about libmpi_abi versioning (hppritcha)
ff26f79 fortran: add the MPI_Abi entry points (hppritcha)
a27a1b4 python framework: improve debugging (hppritcha)
f66ef68 fortran: fix it (hppritcha)
18f3b3e pr feedback (hppritcha)
0fcd866 corrections etc. to ABI README (hppritcha)
64edd0c pr feedback for README_ABI (hppritcha)
2a8c9f1 python framework: fix problem with source order (hppritcha)
Submodule pympistandard updated, 4 files:

| Changes | File |
|---|---|
| +7 −0 | src/pympistandard/_kinds.py |
| +7,265 −3,104 | src/pympistandard/data/apis.json |
| +22 −7 | src/pympistandard/isoc.py |
| +6 −0 | tests/test_iso_c.py |
I built MPICH 5.0b1 and see that it installs libmpi_abi.so.0[.0.0].
Per our conventions, will we be shifting this CRA value to something other than 0:0:0 on release branches (e.g., 0:1:0, so that C-A is still 0, and we still produce libmpi_abi.so.0 to match MPICH)? I ask this only because Libtool recommends not shipping 0:0:0.
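As a rough illustration of the C:R:A question above, the sketch below applies the naming rule GNU Libtool uses on Linux, where the soname major number is current minus age and the file name suffix is `major.age.revision`. The function name `linux_soname` is hypothetical and is not part of Libtool, Open MPI, or MPICH; other platforms name shared libraries differently.

```python
def linux_soname(libname, current, revision, age):
    """Return (soname, file name) for a Libtool C:R:A triple on Linux.

    Hypothetical helper for illustration only. It mirrors the Linux
    convention: major = current - age, suffix = major.age.revision.
    """
    if age > current:
        raise ValueError("age must not exceed current")
    major = current - age
    soname = f"{libname}.so.{major}"
    filename = f"{soname}.{age}.{revision}"
    return soname, filename


if __name__ == "__main__":
    # Compare the 0:0:0 triple with the 0:1:0 triple raised in the comment.
    for cra in [(0, 0, 0), (0, 1, 0)]:
        print(cra, linux_soname("libmpi_abi", *cra))
```

Under this rule both 0:0:0 and 0:1:0 keep the soname `libmpi_abi.so.0`; only the trailing revision digit of the file name changes.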
Thinking a little further down this rabbit hole: does having both ABI-enabled Open MPI and MPICH allow installing Open MPI and MPICH into the same `$prefix` (with all default sub directories, like `$includedir` being `$prefix/include`)? I'm thinking "no" for at least the following reasons:

- `mpi.h` will have all the ABI things being identical, but we'll have other differences from MPICH's `mpi.h` (right?).
- `$prefix/include/mpi.h`.
- `./configure --includedir=...` could workaround that.
- `libmpi_abi.so.*` -- you couldn't tell if it was from Open MPI or MPICH. From a user perspective, that might be ok, but from a package manager and/or system administrator point of view -- that might get a little weird. For example, what if we both ship `libmpi_abi.so.0.a.b` with the same `a` and `b` values?
- `libmpi_abi.so.0.0.0`.
- `a` and `b` values are different than MPICH's somehow? (I really haven't thought this through to know if this is even possible in a sustainable way over time -- nor what the consequences are outside of Linux)

I guess I'm wondering if it's useful to build Open MPI and MPICH with something like:

This would keep a single libdir so that we don't introduce (more) LD_LIBRARY_PATH complexity, but still allow unique `mpi.h`.

But then again, there's still problems with `mpirun` and `mpiexec` filename clashes in `$bindir` (not to mention CLI flag differences). Maybe something like Linux-style alternatives could be useful here...? Shrug.

I understand how ABI between MPI libraries solves some perceived problems for users, but trying to go the next steps to actually hide the differences between MPI implementations gets pretty tricky pretty quickly.
The point of having a unique ABI is to be able to switch between different backends, which effectively requires same sonames (and filenames), so that's pretty much by design. You'd either have a single MPI library at a time in a single-prefix scenario, or as many as you want in a multiple-prefix scenario (think of Spack).
another use case I could see is the usual "modules" setup on an HPC cluster. the user could just switch between the mpich module and the openmpi module without needing to recompile/relink. that's basically how one would use spack modules system.
You cannot just blindly switch from one module to another UNLESS the ABI's are an exact match - in other words, you have to ensure that the version of the ABI you are switching to is the same as the one you currently are using. An ABI is a rigid specification - there is no such thing as a minor revision to it. Any change - be it an addition, subtraction, or (heaven forbid) a modification - results in a new ABI, and the version number (in libtool parlance, the .so number) must change.
That's the entire point of the libtool .so number - to guide the linker to picking the library that matches the signature required by the executable. In this case, that's the ABI.
People who have been building ABIs learned this the hard way. As @jsquyres pointed out, there are a ton of other problems - but setting the .so number to the ABI version is a basic necessity. Having an unchanging .so number even when the ABI changes is a disaster. The linker will basically be playing russian roulette, and users will rapidly find it...let's politely say, less than useful.
In this case, you want the ABI library to have a .so that matches the ABI it supports. You benefit from having a second library - the actual backend implementation - that can change as it is modified. Key is to tie the ABI library to the matching backend, and then change that connection as you update backends. In other words, you update the ABI-backend combination when the backend gets updated.
So the "module" is an ABI-backend combination, and the user picks the ABI they want supported along with the underlying implementation that supports that ABI. In other words, "give me MPI v2 ABI and the OMPI v6.2.1 backend". If you don't care about ABI, then just pick the implementation library. If you don't care about backend, then select the ABI module and let it pick the default backend.
@jsquyres IMHO, the way this should be handled is the following: The CURRENT and AGE number are implicitly defined from the MPI_ABI_VERSION/MPI_ABI_SUBVERSION numbers, that is, by the set of backward-compatible additions or the backward-incompatible changes. I am assuming that `MPI_ABI_VERSION` will stay at `1` as long as there are no backward-incompatible changes, and `MPI_ABI_SUBVERSION` will bump on backward-compatible additions/updates.

The REVISION number could be left for use by the MPI implementation, this way multiple revisions can be installed in the same prefix location, with ldconfig going through its usual cache update rules.

I tried to layout the rules here mpi-forum/mpi-abi-stubs#28. The "formulas" there would produce a soname `libmpi_abi.so.1`, but if we want a `.0` suffix, that's trivial to fix by subtracting `1` from the formula for current.

Can you point me to such recommendation?
Just to be clear: I know what the use cases are for ABI (e.g., env modules-style or spack-style environments). My question about whether two installs could share a common `$prefix` was probably more of a hypothetical musing more than anything else. I even answered my own question -- the answer is probably "no", for the reasons already discussed.

I admit that I don't quite understand the purpose of `MPI_ABI_VERSION` and `MPI_ABI_SUBVERSION`. As you pointed out earlier in this thread, there's essentially 2 common styles of maintaining binary compatibility these days:

How exactly do a pair of compile-time constants fit into either of those schemes? It's not described in MPI-5.0, nor is an alternate (i.e., 3rd) scheme described. MPI-5.0:20.2 loosely implies that `MPI_ABI_SUBVERSION` can be used as a proxy for `MPI_VERSION` and `MPI_SUBVERSION` (i.e., be used for conditional compilation of various MPI symbols / types / functions / etc.). But that seems odd -- why have new constants for a mechanism that has worked for decades?

Sure, `MPI_ABI_VERSION` could be a proxy for a Linux SONAME. But then what's the point of `MPI_ABI_SUBVERSION` at compile time (or even run time, via `MPI_ABI_GET_VERSION()`)?

If we intend `MPI_ABI_VERSION` to be a proxy for Linux SONAME, that seems fine. Is there a scheme for how `MPI_ABI_SUBVERSION` should factor in here? The way that `MPI_ABI_SUBVERSION` is (loosely) defined in MPI-5.0 does not seem like a hypothetical scheme such as -- for example -- `(MPI_ABI_VERSION * 100 + MPI_ABI_SUBVERSION)` would be a good candidate as a proxy for the Linux SONAME. So what do implementations and/or users use `MPI_ABI_SUBVERSION` for?

I'm digressing from the main question here, and I don't mean to open a whole debate about these 2 compile-time constants here in OMPI -- such issues can be discussed at the Forum level.
For an ABI to satisfy the use cases described above (e.g., swapping out the back end), the questions of how to create linker-compatible shared library versions should really be resolved into some kind of scheme that both Open MPI and MPICH -- and our various derivative implementations -- follow. This doesn't necessarily have to be in the MPI spec itself; it's probably better as an agreement between the Open MPI and MPICH communities. My point: we need to have the discussion and then publicly document the scheme so that anyone can follow it (e.g., even outside of Open MPI and MPICH).
Doh! I just re-read the LT docs and I cannot find such a recommendation. So I guess I'm wrong here. Perhaps I'm remembering some super-old conversation about how we (OMPI?) didn't want to release with 0:0:0 because that's what's on `main` and we don't release off `main` (similar to the project version number) -- i.e., more of a release philosophy kind of thing than a strict technical requirement.
My assumption at that time is now part of the standard, MPI 5.0 says (sec 20.2 pp 844):
> Backwards-compatible changes, such as the addition of new handle types, will increment the minor version. Backwards-incompatible changes will increment the major version. The addition of new functions to the MPI API does not change the ABI version.
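One way to read the quoted rule, sketched here as an assumption rather than anything the standard or this PR defines: a program built against ABI version (major, minor) can run on a library whose ABI has the same major version and an equal or newer minor version. The helper name `abi_satisfies` is hypothetical.

```python
def abi_satisfies(required, provided):
    """Check whether a library's ABI version can serve a built program.

    Hypothetical helper. `required` is the (major, minor) ABI version the
    program was built against; `provided` is the (major, minor) the
    library implements. The rule encoded here is one interpretation of
    the MPI 5.0 text quoted above:
      - a different major version means an incompatible ABI;
      - a higher minor version only adds backwards-compatible items.
    """
    req_major, req_minor = required
    prov_major, prov_minor = provided
    return prov_major == req_major and prov_minor >= req_minor


if __name__ == "__main__":
    print(abi_satisfies((1, 0), (1, 2)))  # same major, newer minor
    print(abi_satisfies((1, 2), (1, 0)))  # library is older than required
    print(abi_satisfies((1, 0), (2, 0)))  # major version differs
```

This is the same compatibility relation that the Libtool current/age pair expresses, which is why the thread discusses deriving one from the other.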
Yes, I saw that. But can you provide an example of how it would be useful / used?
I.e., how exactly is it different than `MPI_VERSION` and `MPI_SUBVERSION`?
Long ago, somewhere, I claimed that MPI_ABI_VERSION/SUBVERSION was not that useful. As usual, I was ignored ;'-), and now we have these version numbers in the standard. Anyway, now I believe it is still good to have the version defined (though not necessarily the C macros). The MPI ABI version/subversion values are to be updated following similar rules as libtool, therefore we can use them to define the C/R/A tuple the following way:

and then under these rules we get a SONAME `libmpi_abi.so.0` and then all the planets are aligned.

Could you please give a bit of thought to this claim of mine? Think again about the rules for updating the MPI ABI version/subversion, the libtool c/r/a update rules, my formulas above, my claim about the SONAME, and confirm whether I am right?

The other obvious use is conditional compilation with the macros. I use the presence of MPI_ABI_VERSION in mpi4py to conditionally-compile if building against the MPI standard ABI. Regarding the use of the values of MPI_ABI_VERSION/SUBVERSION, there is definitely some overlap with MPI_VERSION/SUBVERSION.
MPI_VERSION/SUBVERSION follow the version of the MPI standard, this is unrelated to ABI or even API. The MPI standard version is not only about API but also about runtime behavior changes. MPI_VERSION/SUBVERSION evolve in ways that are totally unrelated to whether the API/ABI changes are backward compatible or not, while MPI_ABI_VERSION will stay at 1 for as long as the MPI Forum does not introduce backward incompatible changes.
Interesting CRA proposal. I don't think it's quite right, though -- I think you have to give the MPI_ABI_[SUB]VERSION their own distinct digits (somewhat akin to bit mapping). E.g.:
This gives you unique values. Otherwise, you could end up with
Put differently: the original scheme only works if MPI_ABI_VERSION never increases (which would be morally equivalent to hard-coding C=MPI_ABI_SUBVERSION-1, A=MPI_ABI_SUBVERSION). Otherwise, we can get repeat C and A values for different values of MPI_ABI_[SUB]VERSION.
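The formulas exchanged in this part of the thread did not survive on this page, so the sketch below does not reproduce them. It only illustrates the general uniqueness concern with two stand-in encodings that are assumptions for the example: a "distinct digits" encoding in the style of `MPI_ABI_VERSION * 100 + MPI_ABI_SUBVERSION` (mentioned earlier in the thread) and a purely additive one. All function names are hypothetical.

```python
from itertools import product


def digit_encoding(version, subversion):
    # Assumed example: each number gets its own digits (subversion < 100).
    return version * 100 + subversion


def additive_encoding(version, subversion):
    # Assumed example: a plain sum, used here only to show how repeats arise.
    return version + subversion


def collisions(encode, versions, subversions):
    """Return pairs of (version, subversion) tuples that encode identically."""
    seen = {}
    clashes = []
    for pair in product(versions, subversions):
        key = encode(*pair)
        if key in seen:
            clashes.append((seen[key], pair))
        else:
            seen[key] = pair
    return clashes


if __name__ == "__main__":
    versions, subversions = range(1, 3), range(0, 3)
    print("digit:", collisions(digit_encoding, versions, subversions))
    print("additive:", collisions(additive_encoding, versions, subversions))
```

With the additive stand-in, repeats appear as soon as the major version is allowed to increase, while the digit-style stand-in stays unique over the same range.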
I guess what I'm asking for: can you give an example of something you'd need to #if on that is based on ABI and not API. This might be a failure of imagination on my part to come up with a useful example here...
And FWIW, I tend to prefer always defining preprocessor macros (as opposed to undefining them vs. defining them). If you always define them, you protect against typos:
whereas this will result in a compilation error: