[Flang] Move builtin .mod generation into runtimes #169497
Closed
Meinersbur
wants to merge
10,000
commits into
llvm:main
from
ROCm:users/meinersbur/rocm_flang_builtin-mods
[AMDGPU] Handle empty-except-for-DI regions in PreRARematerialize
The existing check for this case only comes after a dereference of what can be an iterator sentinel (leading to an assert).
This may not be purely NFC in that it also avoids queuing the effectively-empty region for rescheduling, but AFAICT this should be purely an optimization.
Testing this seems difficult, as the high-level scheduler avoids scheduling these "empty" regions. This means a reproducer has to depend on the behavior of the scheduler passes before PreRARematStage in order to craft a region which triggers the bug.
Since this is a release blocker I am posting a PR now, as both Shore Shen and I have manually verified that this resolves the particular crash from SWDEV-564142, but I am still working on making a reasonable test.
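The ordering problem described in this commit message (comparing against the end sentinel only after dereferencing) can be illustrated with a minimal, standalone sketch. It does not use LLVM's `MachineBasicBlock` iterators or the actual PreRARematStage code; `Instr`, `skipDebug`, `isEffectivelyEmpty`, and `firstNonDebug` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a machine instruction; only the debug flag matters.
struct Instr {
  bool isDebug;
};

using Iter = std::vector<Instr>::const_iterator;

// Advance past debug-only instructions without ever moving beyond End.
Iter skipDebug(Iter It, Iter End) {
  while (It != End && It->isDebug)
    ++It;
  return It;
}

// True when [Begin, End) holds no non-debug instruction. The comparison
// against End happens before any dereference, so an empty or debug-only
// region never touches the sentinel.
bool isEffectivelyEmpty(Iter Begin, Iter End) {
  return skipDebug(Begin, End) == End;
}

// Returns the first non-debug instruction of a region. Callers are expected
// to have ruled out the effectively-empty case first.
const Instr &firstNonDebug(Iter Begin, Iter End) {
  Iter It = skipDebug(Begin, End);
  assert(It != End && "region must contain a non-debug instruction");
  return *It;
}

// Counts the regions worth queuing, skipping effectively-empty ones.
std::size_t
countSchedulableRegions(const std::vector<std::vector<Instr>> &Regions) {
  std::size_t N = 0;
  for (const auto &R : Regions)
    if (!isEffectivelyEmpty(R.begin(), R.end()))
      ++N;
  return N;
}
```

In this sketch a region containing only debug instructions is treated the same as an empty one, so it is neither dereferenced nor counted for rescheduling.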
Print out the loaded envars when `LIBOMPTARGET_DEBUG=1`
example output:
```
TARGET AMDGPU RTL --> Envar config for MI210 is used.
TARGET AMDGPU RTL --> Loaded envar: OMPX_UseMultipleSdmaEngines=1, OMPX_AdjustNumTeamsForXteamRedSmallBlockSize=0
```
---------
Co-authored-by: Ron Lieberman <ron.lieberman@amd.com>
…er Box types (llvm#165954)
Currently we handle BoxChars separately from, and a little differently to, the other BoxTypes. Realistically they can be handled the same way, and should be, to simplify the pass as much as we can.
Summary: This was a lot of code that was only used for upstream LLVM builds of AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so just use that. Simplifies code, can be added back if we start providing alternate forms but I don't think there's a single use-case that would justify it yet. Co-authored-by: Joseph Huber <huberjn@outlook.com>
…hain (llvm#168135)" breaks build of CK This reverts commit 9e9fe08.
This PR introduces an additional type of map lowering for record types that Clang currently supports, in which a user can map a top-level record type and then individual members with different mappings, effectively creating a sort of "overlapping" mapping that we attempt to cut around.
This is currently used most predominantly in Fortran when mapping descriptors and their data: we map the descriptor and its data with separate map modifiers and "cut around" the pointer data, so that we do not overwrite it unless the runtime deems it a necessary action based on its reference counting mechanism. However, it is a mechanism that will also come in handy/trigger when a user explicitly maps a record type (derived type or structure) and then explicitly maps a member with a different map type.
These additions were predominantly in the OpenMPToLLVMIRTranslation.cpp file and phase; however, one Flang test that checks end-to-end IR compilation (as far as we care for now at least) was altered.
2/3 required PRs to enable declare target to mapping; look at PR 3/3 to check for full green passes (this one will fail a number of tests due to some dependencies).
Co-authored-by: Raghu Maddhipatla raghu.maddhipatla@amd.com
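The "cut around" idea can be sketched as simple range arithmetic: given the size of a record and the byte ranges of members that are mapped separately, compute the remaining pieces of the record that the parent mapping may still cover. This is only an illustrative sketch under assumed names (`ByteRange`, `cutAround`); it is not the logic in OpenMPToLLVMIRTranslation.cpp or Clang, and it ignores map types and reference counting entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open byte range [Begin, End) inside a record. Hypothetical type.
struct ByteRange {
  std::size_t Begin;
  std::size_t End;
};

// Returns the parts of [0, RecordSize) not covered by any separately mapped
// member, in ascending order. Overlapping or adjacent members are tolerated.
std::vector<ByteRange> cutAround(std::size_t RecordSize,
                                 std::vector<ByteRange> Members) {
  std::sort(Members.begin(), Members.end(),
            [](const ByteRange &A, const ByteRange &B) {
              return A.Begin < B.Begin;
            });

  std::vector<ByteRange> Remaining;
  std::size_t Cursor = 0; // First byte not yet accounted for.
  for (const ByteRange &M : Members) {
    assert(M.Begin <= M.End && M.End <= RecordSize &&
           "member must lie inside the record");
    if (M.Begin > Cursor)
      Remaining.push_back({Cursor, M.Begin}); // Gap before this member.
    Cursor = std::max(Cursor, M.End);
  }
  if (Cursor < RecordSize)
    Remaining.push_back({Cursor, RecordSize}); // Tail after the last member.
  return Remaining;
}
```

For example, with a hypothetical 48-byte descriptor whose base-address pointer occupies bytes 0 to 8, `cutAround(48, {ByteRange{0, 8}})` yields the single range 8 to 48, which is the part of the descriptor that can be transferred without overwriting the pointer.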
Using ${CMAKE_SOURCE_DIR}/DEBIAN for the directory
in which amd/hipcc/utils.cmake expects e.g. copyright.in
can break in some situations if hipcc is built with
llvm as an external project because the path evaluates
to llvm/DEBIAN instead of amd/hipcc/DEBIAN.
Define the path relative to the utils.cmake path. Extract a variable
for the build path for consistency.
…ntation
While the infrastructure for declare target to/enter and link for variables exists in the MLIR dialect and at the Flang level, the lowering from MLIR -> LLVM IR is currently only in place for variables that have the link clause applied. This PR aims to extend that lowering with an initial implementation that incorporates declare target to as well, which primarily requires changes in the OpenMPToLLVMIRTranslation phase. However, a minor addition to the OpenMP dialect was required to extend the declare target enumerator to include a default None field as well.
This also requires a minor change to the Flang lowering's MapInfoFinalization.cpp pass to alter the map type for descriptors, to deal with cases where a variable is marked declare to. Currently, when a descriptor variable is mapped declare target to, the descriptor component can become attached and cannot be updated; this results in issues when an unusual allocation range is specified (effectively an off-by-X error). The current solution is to map the descriptor always, as we always require an up-to-date version of this data. However, this also requires an interlinked PR that adds a more intricate type of mapping of structures/record types that Clang currently implements, to circumvent the overwriting of the pointer in the descriptor.
3/3 required PRs to enable declare target to mapping; this PR should pass all tests and provide an all green CI.
Co-authored-by: Raghu Maddhipatla raghu.maddhipatla@amd.com
#137828 merged into amd-staging