Skip to content

Fix #5116 - Skip the sematics checks of compiler-generated functions and template instances in dcompute#5131

Open
gulugulubing wants to merge 2 commits into
ldc-developers:masterfrom
gulugulubing:issue5116
Open

Fix #5116 - Skip the sematics checks of compiler-generated functions and template instances in dcompute#5131
gulugulubing wants to merge 2 commits into
ldc-developers:masterfrom
gulugulubing:issue5116

Conversation

@gulugulubing
Copy link
Copy Markdown
Contributor

It is related to #5116 and #5100.

Shared!(float[N]) tile; is actually a struct with static array, DMD's semantic analysis automatically generates support functions for such structs, and those functions drag in template instantiations from non-@compute modules, which triggers both the error message and the crash of compiler.

Comment thread gen/semantic-dcompute.cpp

// Skip compiler-generated struct support functions (e.g. __xopEquals,
// __xopCmp, postblit, destructor). Their bodies may reference non-@compute
// templates (such as __equals for static array comparison) that are outside
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__equals is a template which is specifically allowed, any non-template code it generates will still cause problems though.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting that __equals (and its generated code, like isEqual) should be included in isNonComputeCallExpVaild similar to dcReflect, rather than using the module-based check in visit(TemplateInstance *)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or should we make __equals really work on GPU?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well it should either fall back on default element-wise comparison, or use a user-supplied opEquals both of which should be doable.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I've been going in circles here--With the current patch (skipping the semantic check for these
generated template instances), __equals should actually be able to run on GPU.

Are you suggesting that I should implement the new lowering for == on GPU targets, so that == doesn't rely on __equals from the CPU runtime at all)?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting that __equals (and its generated code, like isEqual) should be included in isNonComputeCallExpVaild similar to dcReflect, rather than using the module-based check in visit(TemplateInstance *)?

No, templates should already work across host/device (assuming that they are leaf level function or only call other templates. If __equals does something like call memset then that obviously will not work.

Why specifically does it not currently work? is it because of memset.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why specifically does it not currently work? is it because of memset.

No, it's not about memset
The problem isn't that __equals does anything GPU-incompatible — at first, I didn't quite figure it out either. The problem is how it gets instantiated. These template instances are added as members of the @compute module. Then dcomputeSemanticAnalysis walks them and either:

  • Crashes on the function pointer cast (cast(PureType)&isEqual)() inside __equals because e->f is null for indirect calls.
  • Reports spurious errors on the nested __equals calls , because __eqauls is not tagged with @compute then dcomputeSemanticAnalysis triggers Error: can only call functions from other @compute modules in @compute code.

Comment thread gen/semantic-dcompute.cpp
stop = true;
return;
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just find this guard is redundant.

Comment thread gen/semantic-dcompute.cpp

Module *m = e->f->getModule();
if ((m == nullptr || (hasComputeAttr(m) == DComputeCompileFor::hostOnly)) &&
!isNonComputeCallExpVaild(e)) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is here a better place to fix the issue by skipping template-instantiated functions? Because When the compiler instantiates a template, it copies the function body into the instantiating scope, but the internal field that getModule() reads is never updated — it still references the module where the template was originally declared.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants