Fix #5116 - Skip the sematics checks of compiler-generated functions and template instances in dcompute#5131
Fix #5116 - Skip the sematics checks of compiler-generated functions and template instances in dcompute#5131gulugulubing wants to merge 2 commits into
Conversation
|
|
||
| // Skip compiler-generated struct support functions (e.g. __xopEquals, | ||
| // __xopCmp, postblit, destructor). Their bodies may reference non-@compute | ||
| // templates (such as __equals for static array comparison) that are outside |
There was a problem hiding this comment.
__equals is a template which is specifically allowed, any non-template code it generates will still cause problems though.
There was a problem hiding this comment.
Are you suggesting that __equals (and its generated code, like isEqual) should be included in isNonComputeCallExpVaild similar to dcReflect, rather than using the module-based check in visit(TemplateInstance *)?
There was a problem hiding this comment.
Or should we make __equals really work on GPU?
There was a problem hiding this comment.
well it should either fall back on default element-wise comparison, or use a user-supplied opEquals both of which should be doable.
There was a problem hiding this comment.
Sorry if I've been going in circles here--With the current patch (skipping the semantic check for these
generated template instances), __equals should actually be able to run on GPU.
Are you suggesting that I should implement the new lowering for == on GPU targets, so that == doesn't rely on __equals from the CPU runtime at all)?
There was a problem hiding this comment.
Are you suggesting that __equals (and its generated code, like isEqual) should be included in isNonComputeCallExpVaild similar to dcReflect, rather than using the module-based check in visit(TemplateInstance *)?
No, templates should already work across host/device (assuming that they are leaf level function or only call other templates. If __equals does something like call memset then that obviously will not work.
Why specifically does it not currently work? is it because of memset.
There was a problem hiding this comment.
Why specifically does it not currently work? is it because of
memset.
No, it's not about memset
The problem isn't that __equals does anything GPU-incompatible — at first, I didn't quite figure it out either. The problem is how it gets instantiated. These template instances are added as members of the @compute module. Then dcomputeSemanticAnalysis walks them and either:
- Crashes on the function pointer cast
(cast(PureType)&isEqual)() inside __equalsbecause e->f is null for indirect calls. - Reports spurious errors on the nested __equals calls , because __eqauls is not tagged with
@computethendcomputeSemanticAnalysistriggersError: can only call functions from other @compute modules in @compute code.
| stop = true; | ||
| return; | ||
| } | ||
|
|
There was a problem hiding this comment.
I just find this guard is redundant.
|
|
||
| Module *m = e->f->getModule(); | ||
| if ((m == nullptr || (hasComputeAttr(m) == DComputeCompileFor::hostOnly)) && | ||
| !isNonComputeCallExpVaild(e)) { |
There was a problem hiding this comment.
Is here a better place to fix the issue by skipping template-instantiated functions? Because When the compiler instantiates a template, it copies the function body into the instantiating scope, but the internal field that getModule() reads is never updated — it still references the module where the template was originally declared.
It is related to #5116 and #5100.
Shared!(float[N]) tile;is actually a struct with static array, DMD's semantic analysis automatically generates support functions for such structs, and those functions drag in template instantiations fromnon-@computemodules, which triggers both the error message and the crash of compiler.