Conversation

@Meinersbur
Member

#137828 merged into amd-staging

lamb-j and others added 30 commits November 6, 2025 21:17
[AMDGPU] Handle empty-except-for-DI regions in PreRARematerialize

The existing check for this case only comes after a dereference of what
can be an iterator sentinel (leading to an assert).
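
As an illustration of the failure mode described above, here is a minimal sketch in plain C++ (not the actual PreRARematStage code). `Instr`, `Region`, `firstNonDebug`, `isEffectivelyEmpty`, and `firstRealOpcode` are all hypothetical names invented for this example. The point is only that the "is this the end iterator?" check has to come before the dereference, not after it.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a machine instruction; not an LLVM type.
struct Instr {
  bool IsDebug; // true for debug-info-only instructions
  int Opcode;   // arbitrary payload for the example
};

using Region = std::vector<Instr>;

// Skip leading debug instructions. The result may be R.end(), i.e. the
// sentinel, when the region holds nothing but debug instructions.
inline Region::const_iterator firstNonDebug(const Region &R) {
  return std::find_if(R.begin(), R.end(),
                      [](const Instr &I) { return !I.IsDebug; });
}

// A region is "empty except for DI" when no non-debug instruction exists.
inline bool isEffectivelyEmpty(const Region &R) {
  return firstNonDebug(R) == R.end();
}

// Check for the sentinel first, and only then dereference.
inline int firstRealOpcode(const Region &R, int Fallback) {
  auto It = firstNonDebug(R);
  if (It == R.end())
    return Fallback; // effectively empty: nothing to inspect
  return It->Opcode;
}
```

In this sketch a caller would skip any region for which `isEffectivelyEmpty` returns true, which loosely corresponds to not queuing that region for rescheduling.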

This may not be purely NFC in that it also avoids queuing the
effectively-empty region for rescheduling, but AFAICT this should be
purely an optimization.

Testing this seems difficult, as the high-level scheduler avoids
scheduling these "empty" regions. This means a reproducer has to depend
on behavior of the scheduler passes before PreRARematStage in order to
craft a region which triggers the bug.

Since this is a release blocker, I am posting a PR now: both Shore
Shen and I have manually verified that this resolves the particular
crash from SWDEV-564142, but I am still working on making a reasonable
test.
Print out the loaded envars when `LIBOMPTARGET_DEBUG=1`

example output:
```
TARGET AMDGPU RTL --> Envar config for MI210 is used.
TARGET AMDGPU RTL --> Loaded envar: OMPX_UseMultipleSdmaEngines=1, OMPX_AdjustNumTeamsForXteamRedSmallBlockSize=0
```

---------

Co-authored-by: Ron Lieberman <ron.lieberman@amd.com>
…er Box types (llvm#165954)

Currently we handle BoxChars separately from, and a little differently to,
the other BoxTypes. Realistically they can be handled the same way, and
should be, to simplify the pass as much as we can.
Summary:
This was a lot of code that was only used for upstream LLVM builds of
AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so
just use that. Simplifies code, can be added back if we start providing
alternate forms but I don't think there's a single use-case that would
justify it yet.

Co-authored-by: Joseph Huber <huberjn@outlook.com>
z1-cciauto and others added 29 commits November 23, 2025 08:45
This PR introduces an additional type of map lowering for record types that Clang currently supports, in which a user can map a top-level record type and then individual members with different mappings, effectively creating a sort of "overlapping" mapping that we attempt to cut around.

This is currently used predominantly in Fortran when mapping descriptors and their data: we map the descriptor and its data with separate map modifiers and "cut around" the pointer data, so that we do not overwrite it unless the runtime deems it a necessary action based on its reference counting mechanism. However, it is a mechanism that will also trigger when a user explicitly maps a record type (derived type or structure) and then explicitly maps a member with a different map type.
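
To make the "cut around" idea concrete, here is a rough sketch in plain C++. It assumes the record and its separately mapped members can be described as half-open byte ranges. It is not the OpenMPToLLVMIRTranslation.cpp implementation; `Range` and `cutAround` are hypothetical names used only for illustration. Given the parent's range and the ranges of the members mapped on their own, it returns the pieces of the parent that remain.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open byte range [first, second). Hypothetical helper type.
using Range = std::pair<std::size_t, std::size_t>;

// Return the parts of Parent that are not covered by any member range.
// Members may overlap each other or extend outside Parent.
inline std::vector<Range> cutAround(Range Parent, std::vector<Range> Members) {
  std::sort(Members.begin(), Members.end());
  std::vector<Range> Out;
  std::size_t Cursor = Parent.first; // start of the next uncovered byte
  for (const Range &M : Members) {
    // Clamp the member to the parent's extent.
    std::size_t B = std::max(M.first, Parent.first);
    std::size_t E = std::min(M.second, Parent.second);
    if (B >= E)
      continue; // empty, or entirely outside the parent
    if (B > Cursor)
      Out.push_back({Cursor, B}); // gap before this member
    Cursor = std::max(Cursor, E);
  }
  if (Cursor < Parent.second)
    Out.push_back({Cursor, Parent.second}); // tail after the last member
  return Out;
}
```

For example, with a hypothetical 48-byte descriptor whose first 8 bytes hold the data pointer, cutting around `[0, 8)` leaves `[8, 48)` as the only segment to map with the parent's map type.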

These additions were predominantly in the OpenMPToLLVMIRTranslation.cpp file and phase, however, one Flang test that checks end-to-end IR compilation (as far as we care for now at least) was altered.

PR 2 of 3 required to enable declare target to mapping. See PR 3/3 for fully green passes (this one will fail a number of tests due to some dependencies).

Co-authored-by: Raghu Maddhipatla raghu.maddhipatla@amd.com
Using ${CMAKE_SOURCE_DIR}/DEBIAN for the directory
in which amd/hipcc/utils.cmake expects e.g. copyright.in
can break in some situations if hipcc is built with
llvm as an external project, because the path evaluates
to llvm/DEBIAN instead of amd/hipcc/DEBIAN.

Define the path relative to the utils.cmake path. Extract a variable
for the build path for consistency.
…ntation

While the infrastructure for declare target to/enter and link for variables exists in the MLIR dialect and at the Flang level, the lowering from MLIR -> LLVM IR is not yet in place; it exists only for variables that have the link clause applied.

This PR aims to extend that lowering to an initial implementation that incorporates declare target to as well, which primarily requires changes in the OpenMPToLLVMIRTranslation phase. However, a minor addition to the OpenMP dialect was required to extend the declare target enumerator to include a default None field as well.

This also requires a minor change to the Flang lowering's MapInfoFinalization.cpp pass to alter the map type for descriptors to deal with cases where a variable is marked declare target to. Currently, when a descriptor variable is mapped declare target to, the descriptor component can become attached and cannot be updated. This results in issues when an unusual allocation range is specified (effectively an off-by-X error). The current solution is to always map the descriptor, as we always require an up-to-date version of this data. However, this also requires an interlinked PR that adds a more intricate type of mapping of structures/record types that Clang currently implements, to circumvent the overwriting of the pointer in the descriptor.

PR 3 of 3 required to enable declare target to mapping. This PR should pass all tests and provide an all-green CI.

Co-authored-by: Raghu Maddhipatla raghu.maddhipatla@amd.com
@Meinersbur Meinersbur closed this Nov 25, 2025