Is this a duplicate?
Type of Bug
Runtime Error
Component
cuda.pathfinder
Describe the bug
Forwarding from cupy/cupy#9803. If everything else fails (e.g. for header discovery), cuda-pathfinder falls back to using:
_resolve_system_loaded_abs_path_in_subprocess
and:
run_in_spawned_child_process
However, if this is reached from inside a script that lacks an if __name__ == "__main__" guard, spawning the process is problematic, because the spawned child re-runs the script itself during bootstrapping.
I.e. there are (apparently) two possible outcomes here:
- Python raises an error pointing out that if __name__ == "__main__" seems to be missing (I got this when calling _resolve_system_loaded_abs_path_in_subprocess() directly).
- In the OP, spawning the process just triggered a second run of the script, with the first run hitting the 10 second time-out if the script runs for more than 10 seconds.
(I am not sure why the nested call succeeds spawning, but I doubt it matters.)
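For illustration, a minimal sketch of the first outcome that does not involve cuda-pathfinder at all. It writes a small unguarded script (the script content and the names probe and run_unguarded_script are made up for this example), runs it, and returns what lands on stderr. With the "spawn" start method I would expect the child to re-import the script and fail with a RuntimeError about the bootstrapping phase:

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# A script with NO `if __name__ == "__main__"` guard that starts a
# spawned process at module level (hypothetical stand-in for user code).
UNGUARDED_SCRIPT = textwrap.dedent(
    """
    import multiprocessing

    def probe():
        pass

    ctx = multiprocessing.get_context("spawn")
    proc = ctx.Process(target=probe)
    proc.start()
    proc.join()
    """
)


def run_unguarded_script():
    """Run the unguarded script in a fresh interpreter and return its stderr."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "unguarded.py"
        script.write_text(UNGUARDED_SCRIPT)
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
            timeout=60,
        )
    return result.stderr


if __name__ == "__main__":
    # The spawned child re-executes the script's top-level code and
    # refuses to start yet another process while still bootstrapping.
    print(run_unguarded_script())
```

Moving the four lines that create and start the process under an if __name__ == "__main__": guard makes the error go away, which is what the message itself suggests.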
Not sure if this is a priority, because I am not sure how likely it is for cuda-pathfinder to reach this path. The workaround of setting CUDA_PATH= is also straightforward. Still, the problem limits the reliability of this fallback.
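A sketch of that workaround from inside a script, assuming cuda-pathfinder consults CUDA_PATH before it reaches the spawn fallback (I have not checked exactly when it reads the variable, so this should run before any discovery call). The helper name ensure_cuda_path and the /usr/local/cuda location are assumptions for illustration only:

```python
import os


def ensure_cuda_path(default_root):
    """Set CUDA_PATH only if it is not already set; return the effective value."""
    # setdefault keeps a value the user exported and only fills in a missing one.
    return os.environ.setdefault("CUDA_PATH", default_root)


if __name__ == "__main__":
    # "/usr/local/cuda" is an assumed install location; adjust for your system.
    print(ensure_cuda_path("/usr/local/cuda"))
```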
How to Reproduce
I am not sure what needs to be unusual about the environment for cuda-pathfinder to go to such lengths to find the right paths.
However, calling:
cuda.pathfinder._headers.find_nvidia_headers._find_ctk_header_directory_via_canary("nvrtc", "nvrtc.h")
in a Python script reproduces the first error for me (on an older cuda-pathfinder version; main requires a different signature there, I think).
Expected behavior
Maybe rather than spawning, this needs to use a subprocess to avoid issues with scripts?
I.e. execute something like:
subprocess.check_output([sys.executable, "-c", f"from cuda.pathfinder import ...; probe_canary_abs_path_and_print_json('{libname}')"])
or maybe nicer and safer, via:
[sys.executable, "-m", "cuda.pathfinder._something", libname]
(I am not sure if there are subtleties around subprocess, but I guess spawn must be similar already?).
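To make the idea concrete, here is a rough sketch of the subprocess variant. It is not cuda-pathfinder code: the child program below uses ctypes.util.find_library purely as a stand-in for the real probe, and the names CHILD_CODE and probe_in_subprocess are invented for this example. The library name is passed through argv instead of being interpolated into the code string, which avoids quoting problems:

```python
import json
import subprocess
import sys

# Stand-in child program (an assumption for illustration): the real probe
# would live in cuda-pathfinder; here ctypes.util.find_library plays its role.
CHILD_CODE = """
import ctypes.util, json, sys
libname = sys.argv[1]
print(json.dumps({"libname": libname, "path": ctypes.util.find_library(libname)}))
"""


def probe_in_subprocess(libname, timeout=30.0):
    """Run the probe in a fresh interpreter and return the reported path or None."""
    completed = subprocess.run(
        # libname travels as an argument, not as part of the code string.
        [sys.executable, "-c", CHILD_CODE, libname],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return json.loads(completed.stdout)["path"]


if __name__ == "__main__":
    print(probe_in_subprocess("nvrtc"))
```

Because the child is started with -c, it never re-imports the parent's main script, so the missing-guard problem should not arise regardless of how the calling script is written.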
Operating System
No response
nvidia-smi output
No response