feat(runtime): implement device runtime layer (Python) #277

Open
Peter9606 wants to merge 1 commit into ROCm:main from Deep-Spark:fujun.han/pluggable-compile-backend

Conversation

@Peter9606

  • Add flydsl.runtime.device_runtime: Device/DeviceRuntime/Event, RocmDeviceRuntime
  • FLYDSL_RUNTIME_KIND (env.runtime.kind), compile/runtime pairing validation
  • get_backend() triggers get_device_runtime() for single-stack invariant
  • register_device_runtime, register_compile_runtime_mapping for extensions
  • Unit tests.
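
For readers skimming the thread, the registry and pairing check described in the list above can be pictured roughly as follows. This is a minimal, self-contained sketch and not the code in this PR: the function bodies, the signatures of `register_device_runtime` and `register_compile_runtime_mapping`, the helper `validate_pairing`, and the default kind `"rocm"` are all assumptions made for illustration.

```python
import os

# Hypothetical stand-ins: the real flydsl.runtime.device_runtime API may differ.
_RUNTIMES = {}            # runtime kind -> factory returning a DeviceRuntime
_COMPILE_TO_RUNTIME = {}  # compile backend name -> required runtime kind
_current = None           # process-wide singleton (one GPU stack per process)


class DeviceRuntime:
    kind = "abstract"


class RocmDeviceRuntime(DeviceRuntime):
    kind = "rocm"


def register_device_runtime(kind, factory):
    """Let an extension plug in a new runtime kind."""
    _RUNTIMES[kind] = factory


def register_compile_runtime_mapping(backend, kind):
    """Declare which runtime kind a compile backend must be paired with."""
    _COMPILE_TO_RUNTIME[backend] = kind


def get_device_runtime():
    """Create the singleton on first use, based on FLYDSL_RUNTIME_KIND."""
    global _current
    if _current is None:
        # Assumed default; the PR does not state what the default is.
        kind = os.environ.get("FLYDSL_RUNTIME_KIND", "rocm")
        if kind not in _RUNTIMES:
            raise ValueError(f"unknown runtime kind: {kind!r}")
        _current = _RUNTIMES[kind]()
    return _current


def validate_pairing(backend):
    """Hypothetical helper: fail fast on a compile/runtime mismatch."""
    expected = _COMPILE_TO_RUNTIME.get(backend)
    actual = get_device_runtime().kind
    if expected is not None and expected != actual:
        raise RuntimeError(
            f"compile backend {backend!r} requires runtime {expected!r}, got {actual!r}"
        )
    return actual


register_device_runtime("rocm", RocmDeviceRuntime)
register_compile_runtime_mapping("rocm", "rocm")
```

In this sketch an out-of-tree backend would call the two `register_*` functions at import time, and any later mismatch surfaces as a `RuntimeError` from `validate_pairing`.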

code with cursor

@Peter9606
Author

This PR is intended to show the impact of the RFC: Device and runtime layer (aligned with pluggable compile backends).

# single GPU stack per process; validate against FLYDSL_RUNTIME_KIND.
from ...runtime.device_runtime import get_device_runtime

get_device_runtime()
Collaborator

Why does the compile path get the runtime here?

Author

Good point — get_backend() is the compile-backend resolver, and calling get_device_runtime() here couples compile resolution to the device-runtime singleton. We did it to fail fast on FLYDSL_COMPILE_BACKEND vs FLYDSL_RUNTIME_KIND mismatch (RFC).
If we’d rather keep compile resolution free of runtime, we can move this check to the first JIT compile / launch path (e.g. jit_function / CompiledArtifact) and leave get_backend() pure. Happy to adjust in the next push.
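
To make the alternative concrete, here is a rough sketch of what "leave get_backend() pure and validate on the first JIT compile" could look like. It is illustrative only: `ensure_stack_consistent`, `jit_compile`, the mapping table, and the `"rocm"` defaults are hypothetical names and values, not the code in this PR or the existing `jit_function` / `CompiledArtifact` paths.

```python
import os

# Hypothetical pairing table: compile backend name -> required runtime kind.
_COMPILE_TO_RUNTIME = {"rocm": "rocm"}
_validated = False


def get_backend():
    """Pure resolver: reads FLYDSL_COMPILE_BACKEND and touches no runtime state."""
    return os.environ.get("FLYDSL_COMPILE_BACKEND", "rocm")  # default is assumed


def _runtime_kind():
    return os.environ.get("FLYDSL_RUNTIME_KIND", "rocm")  # default is assumed


def ensure_stack_consistent():
    """Run the compile/runtime pairing check once, on first real use."""
    global _validated
    if _validated:
        return
    backend = get_backend()
    expected = _COMPILE_TO_RUNTIME.get(backend)
    kind = _runtime_kind()
    if expected is not None and expected != kind:
        raise RuntimeError(
            f"FLYDSL_COMPILE_BACKEND={backend!r} needs runtime {expected!r}, "
            f"but FLYDSL_RUNTIME_KIND={kind!r}"
        )
    _validated = True


def jit_compile(source):
    """Stand-in for the first JIT compile / launch entry point."""
    ensure_stack_consistent()
    return f"compiled[{get_backend()}]:{source}"
```

The trade-off in this shape is that a misconfigured environment is reported at the first compile rather than at backend resolution, in exchange for `get_backend()` having no dependency on the runtime module.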

n = 0
self._torch_device_count = max(n, 1)
return self._torch_device_count

Collaborator

So will the JIT runner functions be added here once this is done?

Author

RocmDeviceRuntime in this PR is intentionally minimal — it only exists so we have a process-wide runtime kind (rocm) and can validate it against FLYDSL_COMPILE_BACKEND.
device_count() is a small probe via PyTorch for now; we are not planning to add JIT runner / ExecutionEngine glue here. That stays in jit_executor / compiler as today.
Follow-up work (per RFC) could add stream/event helpers on DeviceRuntime, but still not duplicate the JIT launch path into this module.
We can change device_count() to return 0 when no CUDA/ROCm device is visible, instead of max(n, 1), so the no-device case is reported explicitly.
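
A possible shape for that change, as a sketch only: the class name `RocmDeviceRuntimeSketch`, the injectable `probe` parameter, and the `_device_count` attribute are assumptions for illustration and do not mirror the actual class in this PR. The probe relies on ROCm builds of PyTorch exposing devices through `torch.cuda`.

```python
def probe_device_count():
    """Ask PyTorch how many GPUs are visible; report 0 if it cannot tell."""
    try:
        import torch
    except ImportError:
        return 0
    try:
        return int(torch.cuda.device_count())
    except Exception:
        # Treat any probing failure as "no visible device".
        return 0


class RocmDeviceRuntimeSketch:
    kind = "rocm"

    def __init__(self, probe=probe_device_count):
        self._probe = probe          # injectable so tests need no GPU
        self._device_count = None    # cached after the first probe

    def device_count(self):
        """Return the real device count, including 0 (no max(n, 1) clamp)."""
        if self._device_count is None:
            self._device_count = max(int(self._probe()), 0)
        return self._device_count
```

Callers that previously relied on at least one device being reported would then need to handle `device_count() == 0` themselves.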

@Peter9606 force-pushed the fujun.han/pluggable-compile-backend branch 2 times, most recently from 0e746c2 to d47854a (March 25, 2026 02:08)
@Peter9606 changed the base branch from refactor/pluggable-compile-backend to main (March 25, 2026 05:26)
- Add flydsl.runtime.device_runtime: Device/DeviceRuntime/Event, RocmDeviceRuntime
- FLYDSL_RUNTIME_KIND (env.runtime.kind), compile/runtime pairing validation
- get_backend() triggers get_device_runtime() for single-stack invariant
- register_device_runtime, register_compile_runtime_mapping for extensions
- Unit tests.

Signed-off-by: Fujun Han <fujun.han@iluvatar.com>
@Peter9606 force-pushed the fujun.han/pluggable-compile-backend branch from 01cd244 to 0a4c4a8 (March 25, 2026 05:31)
@Peter9606
Author

Hi @coderfeli, I've updated the base branch of this PR from refactor/pluggable-compile-backend to main. The current diff now correctly reflects the changes against the main branch. Thanks.


2 participants