
Optimize transformation of firstprivate clause in OpenMP GPU offloading #157

@ouankou


In both ROSE and REX, the private/shared/firstprivate clauses in a target region are first converted into map clauses. The lowering module handles only map clauses and ignores all other data clauses.

Given the following code as an example:

for (int j = 0; j < 100; j++)
#pragma omp target teams distribute parallel for map(to: x[0:200]) map(from: y[0:200]) firstprivate(a, n)
  for (int i = 0; i < 200; i++)
    y[i] += a * x[i];

REX will convert it to:

for (int j = 0; j < 100; j++)
#pragma omp target teams distribute parallel for map(to: x[0:200], a, n) map(from: y[0:200]) firstprivate(a, n)
  for (int i = 0; i < 200; i++)
    y[i] += a * x[i];

The firstprivate clause is still used for kernel generation (e.g., private variable initialization) but not for data transfer. LLVM, however, transforms the original code without this conversion.

For the example code, LLVM does not map a and n; it passes them to the kernel directly by value. REX instead creates a mapping between the host and the device for each of them. As a result, over the 100 iterations of the outer loop, LLVM performs 200 data transfers (100 for x and 100 for y), while REX performs 400 (100 each for x, y, a, and n). The extra transfers do not produce incorrect results, but they can cause a significant performance difference.

In the NeoRodinia nn benchmark, each iteration of a while loop launches an omp target region. Because of the issue described above, the REX version performs 12000 data transfers, while the LLVM version performs only 4000. The data transfer time is 24 ms vs. 6.7 ms on Carina. When we manually change the mapping type in the REX version from map(to) to firstprivate, the REX version also performs 4000 data transfers, which take 6.7 ms.

Therefore, we need to make significant changes to how REX handles data transfers in order to address this issue.

Labels

bug (Something isn't working)
