29 changes: 29 additions & 0 deletions gen/semantic-dcompute.cpp
@@ -212,6 +212,11 @@ struct DComputeSemanticAnalyser : public StoppableVisitor {
}
}
void visit(CallExp *e) override {
// Indirect calls via function pointers / delegates have no associated
// FuncDeclaration, so there is no module to check.
if (!e->f)
return;

// SynchronizedStatement is lowered to
// Critsec __critsec105; // 105 == line number
// _d_criticalenter(& __critsec105); <--
@@ -246,6 +251,16 @@ struct DComputeSemanticAnalyser : public StoppableVisitor {
return;
}

// Skip compiler-generated struct support functions (e.g. __xopEquals,
// __xopCmp, postblit, destructor). Their bodies may reference non-@compute
// templates (such as __equals for static array comparison) that are outside
Contributor

__equals is a template, which is specifically allowed; any non-template code it generates will still cause problems, though.

Contributor Author

Are you suggesting that __equals (and its generated code, like isEqual) should be included in isNonComputeCallExpVaild similar to dcReflect, rather than using the module-based check in visit(TemplateInstance *)?

Contributor Author

Or should we make __equals really work on GPU?

Contributor

Well, it should either fall back on the default element-wise comparison or use a user-supplied opEquals, both of which should be doable.

Contributor Author

Sorry if I've been going in circles here. With the current patch (skipping the semantic check for these generated template instances), __equals should actually be able to run on the GPU.

Are you suggesting that I should implement a new lowering for == on GPU targets, so that == doesn't rely on __equals from the CPU runtime at all?

Contributor

> Are you suggesting that __equals (and its generated code, like isEqual) should be included in isNonComputeCallExpVaild similar to dcReflect, rather than using the module-based check in visit(TemplateInstance *)?

No, templates should already work across host/device (assuming that they are leaf-level functions or only call other templates). If __equals does something like call memset, then that obviously will not work.

Why specifically does it not currently work? Is it because of memset?

Contributor Author

> Why specifically does it not currently work? Is it because of memset?

No, it's not about memset.
The problem isn't that __equals does anything GPU-incompatible (at first, I didn't quite figure it out either). The problem is how it gets instantiated. These template instances are added as members of the @compute module. Then dcomputeSemanticAnalysis walks them and either:

  • Crashes on the function pointer cast (cast(PureType)&isEqual)() inside __equals, because e->f is null for indirect calls.
  • Reports spurious errors on the nested __equals calls: because __equals is not tagged with @compute, dcomputeSemanticAnalysis reports "Error: can only call functions from other @compute modules in @compute code".
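
To make the two failure modes above concrete, here is a minimal, self-contained C++ sketch of the call-site check. It is only a model: `Module`, `FuncDeclaration`, `CallExp`, `CheckResult` and `checkCall` are hypothetical stand-ins, not LDC's real classes, and the real analyser applies further exceptions (such as allowing templates) that are left out here.

```cpp
#include <string>

// Hypothetical stand-ins for the compiler's AST types (assumption, not LDC's API).
struct Module {
  bool isCompute; // true if the module is tagged @compute
};

struct FuncDeclaration {
  std::string name;
  const Module *module; // declaring module
};

struct CallExp {
  // Null for indirect calls through a function pointer or delegate.
  const FuncDeclaration *f;
};

enum class CheckResult { Ok, Skipped, Error };

// Models the check on one call site. Indirect calls are skipped because there
// is no declaration to inspect; direct calls must target a @compute module.
CheckResult checkCall(const CallExp &e) {
  if (!e.f)
    return CheckResult::Skipped; // without this guard, e.f->module dereferences null
  if (!e.f->module || !e.f->module->isCompute)
    return CheckResult::Error; // "can only call functions from other @compute modules"
  return CheckResult::Ok;
}
```

In this model, the indirect call inside __equals corresponds to a `CallExp` with a null `f`, and the nested __equals call corresponds to a callee whose module is not @compute.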

// of user control. They are only codegenerated if actually referenced by
// user code, at which point the codegen layer will report any issues.
if (fd->isGenerated()) {
stop = true;
return;
}

Contributor Author

I just found that this guard is redundant.

IF_LOG Logger::println("current function = %s", fd->toChars());
currentFunction = fd;
}
@@ -260,6 +275,20 @@ struct DComputeSemanticAnalyser : public StoppableVisitor {
// as they contain unsupported global variables.
if (ti->tempdecl == Type::rtinfo || ti->tempdecl == Type::rtinfoImpl) {
stop = true;
return;
}

// Template instantiations for templates declared in non-@compute modules
// (e.g. __equals and isEqual from core.internal.array.equality) are
// created as a side effect of compiler-generated support functions. They
// contain calls back into their declaring (non-@compute) module, which
// would produce spurious errors. Skip them.
if (ti->tempdecl) {
Module *m = ti->tempdecl->getModule();
if (m && hasComputeAttr(m) == DComputeCompileFor::hostOnly) {
stop = true;
return;
}
}
}
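
For readers unfamiliar with the analyser, the module-based skip in this hunk can be modelled in isolation as follows. This is a sketch under stated assumptions: `CompileFor`, `Module`, `TemplateDeclaration`, `TemplateInstance` and `shouldSkipInstance` are hypothetical simplifications, and it assumes (as the comparison in the diff suggests) that a module without @compute is classified as host-only.

```cpp
// Hypothetical stand-ins; the names echo the diff but are not LDC's real types.
enum class CompileFor { hostOnly, deviceOnly, hostAndDevice };

struct Module {
  CompileFor target; // assumption: modules without @compute count as hostOnly
};

struct TemplateDeclaration {
  const Module *module; // module that declares the template (may be null)
};

struct TemplateInstance {
  const TemplateDeclaration *tempdecl; // may be null
};

// Returns true when the instance comes from a template declared in a
// host-only module, in which case the analysis should not descend into it.
bool shouldSkipInstance(const TemplateInstance &ti) {
  if (!ti.tempdecl)
    return false; // nothing known about the declaration: analyse as usual
  const Module *m = ti.tempdecl->module;
  return m && m->target == CompileFor::hostOnly;
}
```

The decision depends on where the template is declared, not on the module that instantiates it, which is why instances of __equals that end up as members of a @compute module are still skipped.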

19 changes: 19 additions & 0 deletions tests/compilable/issue5116.d
@@ -0,0 +1,19 @@
// Regression test for issue #5116: defining a struct with a static array
// field in a @compute module previously caused a spurious semantic error
// ("can only call functions from other `@compute` modules") followed by a
// null-pointer dereference crash in DComputeSemanticAnalyser::visit(CallExp*).
// The crash happened because compiler-generated support functions (__xopEquals)
// triggered instantiation of __equals templates from core.internal.array.equality,
// whose body contains an indirect call through a function pointer (e->f == null).

// REQUIRES: target_NVPTX
// RUN: %ldc -mdcompute-targets=cuda-350 %s

@compute(CompileFor.deviceOnly) module tests.compilable.issue5116;
import ldc.dcompute;

private enum N = 16u;

struct S {
float[N] data;
}