[GLUTEN-11828][VL] Use immutable gpu config and add cuda runtime detection #11830

Open
marin-ma wants to merge 1 commit into apache:main from marin-ma:gluten-11828

marin-ma (Contributor) commented Mar 25, 2026

Related issue: #11828

github-actions bot added the CORE (works for Gluten Core) and VELOX labels Mar 25, 2026

github-actions bot commented:

Run Gluten Clickhouse CI on x86

marin-ma (Contributor, Author) commented:

Verified with a GPU build and spark.gluten.sql.columnar.cudf=true on a CPU node. @jinchengchenghh, could you help review? Thanks!

bool hasCudaRuntimeAndDevice() {
#ifdef GLUTEN_ENABLE_GPU
  int count = 0;
  cudaError_t err = cudaGetDeviceCount(&count);
Contributor commented:

It looks like this code would also execute on a CPU node. Is the function provided by the header alone? If not, the CUDA library does not exist on a CPU node, and I'm not sure the call can run successfully. If you verify that it runs well on a CPU node without a CUDA environment, we should add a comment saying so.

A common approach is to first check whether the nvidia-smi command exists; if it does, we can check further.
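The nvidia-smi pre-check suggested above could be sketched roughly as follows. This is only an illustration, not code from the PR: `isExecutableOnPath` and `hasNvidiaSmi` are hypothetical names, and the sketch assumes a POSIX system where `PATH` is colon-separated. Finding the tool only hints that a driver is installed; it does not prove a device is usable, so a further check would still be needed.

```cpp
#include <cstdlib>   // std::getenv
#include <sstream>
#include <string>
#include <unistd.h>  // access(), X_OK (POSIX)

// Hypothetical helper (not part of the PR): returns true if an entry named
// `tool` with execute permission exists in one of the colon-separated
// directories listed in `pathEnv`.
bool isExecutableOnPath(const std::string& tool, const std::string& pathEnv) {
  std::istringstream dirs(pathEnv);
  std::string dir;
  while (std::getline(dirs, dir, ':')) {
    if (dir.empty()) {
      continue;  // Skip empty PATH entries instead of treating them as ".".
    }
    const std::string candidate = dir + "/" + tool;
    if (::access(candidate.c_str(), X_OK) == 0) {
      return true;
    }
  }
  return false;
}

// Hypothetical pre-check: true only if nvidia-smi is found on the current PATH.
bool hasNvidiaSmi() {
  const char* path = std::getenv("PATH");
  return path != nullptr && isExecutableOnPath("nvidia-smi", path);
}
```

A caller would run `hasNvidiaSmi()` first and only attempt a CUDA runtime query when it returns true.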

marin-ma (Contributor, Author) replied:

If we run the GPU build on a CPU node without the CUDA runtime installed, the process fails early while loading libvelox.so and reports that the CUDA library is missing.

Contributor replied:

This is a new problem. We should not require users to install CUDA on CPU nodes, and the build pipeline may also need to be updated.

Contributor replied:

Please create an issue to track this, thanks!
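For the follow-up issue, one possible direction is to probe for the CUDA runtime dynamically, so that the detection code itself has no link-time dependency on CUDA. The sketch below is an assumption-laden illustration, not the PR's implementation: `hasCudaRuntimeAndDeviceDynamic` is a hypothetical name, the default library name `libcudart.so` is an assumption (installs often ship only a versioned file name), and it relies on `cudaGetDeviceCount` taking an `int*` and returning `cudaSuccess` (0) on success. It would not by itself fix libvelox.so failing to load when that library is linked against CUDA; that part would still need build changes.

```cpp
#include <dlfcn.h>  // dlopen(), dlsym(), dlclose() (POSIX)

// Hypothetical helper (not the PR's code): reports whether a CUDA runtime can
// be loaded and sees at least one device, without linking against CUDA.
// `libName` is an assumption; adjust it to the runtime actually deployed.
bool hasCudaRuntimeAndDeviceDynamic(const char* libName = "libcudart.so") {
  void* handle = dlopen(libName, RTLD_LAZY | RTLD_LOCAL);
  if (handle == nullptr) {
    return false;  // No loadable CUDA runtime on this node.
  }

  // cudaGetDeviceCount(int*) returns cudaError_t, an enum where 0 means success.
  using GetDeviceCountFn = int (*)(int*);
  auto getDeviceCount =
      reinterpret_cast<GetDeviceCountFn>(dlsym(handle, "cudaGetDeviceCount"));

  int count = 0;
  const bool ok =
      getDeviceCount != nullptr && getDeviceCount(&count) == 0 && count > 0;

  dlclose(handle);
  return ok;
}
```

On a CPU node without the runtime, `dlopen` fails and the function returns false instead of aborting the process. On older glibc versions the program may need to be linked with `-ldl`.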
