[Python] Fix mx.array DLPack dispatch #3476
Open
apstenku123 wants to merge 1 commit into ml-explore:main from
apstenku123 added a commit to DatasunriseOU/cppmega_mlx that referenced this pull request on May 4, 2026:
Files documenting the actual PRs we just opened upstream:
- PR #1: ml-explore/mlx#3476, from_dlpack Metal-aware consumer (against main, clean)
- PR #2: apache/tvm#19504, TVM_METAL_STORAGE_MODE env opt-in (against main, clean)
- PR #3: tile-ai/tilelang#2139, mixed-dtype T.gemm via scalar fallback (stacks on PR #2130)
- PR #4: tile-ai/tilelang#2140, FP8-input T.gemm scalar fallback routing (stacks on PR #2130)
- PR #5: tile-ai/tilelang#2141, T.Pipelined num_stages>1 3D buffer fix (stacks on PR #2130)
- PR #6: tile-ai/tilelang#2142, T.fp8_scaled_matmul DSL intrinsic (stacks on PR #2130)

Deferred (split into companion PRs needed): tilelang_metal_fp8 and tilelang_metal_fp8_vector each touch both the tilelang supermodule and the TileLang/tvm vendored submodule. These need 2 PRs each (one to tile-ai/tilelang, one to TileLang/tvm); separate filing round.

PRs #3-#6 are independent of each other; each branches directly from jorgecurious/tilelang:metal-gemm-upstream-rebase HEAD 971c17b, so they can be reviewed in any order. They DO depend on the upstream 4-PR Apple Metal landing chain (#1869, #2118, #2121, #2130) merging first; if any of those land separately, ours can be retargeted at main.
zcbenz (Collaborator) requested changes on May 4, 2026:
mx.array(tvm_metal_tensor) falls into create_array → to_array_with_accessor and fails with bad_cast.
This would be a bug and should be fixed instead of introducing a new API that does the same thing. The constructor of mx.array is supposed to take dlpack.
223f948 to 3c6e454
zcbenz (Collaborator) reviewed on May 5, 2026:
    const bool has_mlx_array = nb::hasattr(v, "__mlx_array__");
    const bool is_dlpack =
        PyCapsule_CheckExact(v.ptr()) || nb::hasattr(v, "__dlpack__");
The ndarray API of nanobind is supposed to handle dlpack:
Fixes the existing mx.array(...) constructor path for DLPack producers/capsules instead of adding a new public workaround API.

mx.array(obj) now detects raw DLPack PyCapsules and objects exposing __dlpack__ before falling through to the generic accessor conversion path, so DLPack producers do not hit to_array_with_accessor / bad_cast. __mlx_array__ remains higher precedence.

The shared consumer handles kDLCPU by copying into a fresh MLX allocation and kDLMetal by wrapping a shared MTLBuffer without copying. Unsupported devices, non-row-contiguous strides, non-zero byte offsets, invalid shapes/dtypes, and already-consumed capsules are rejected with explicit errors. Rejected capsules are left unconsumed.

Tests cover mx.array(...) from a DLPack producer, raw capsules, dtype override/default behavior, __mlx_array__ precedence, used capsule rejection, and preserving rejected capsules.
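For reviewers who want the rules above in one place, here is a minimal pure-Python sketch of the dispatch order and the acceptance checks. It is not the PR's C++ code: `dispatch_path`, `TensorInfo`, `row_major_strides`, and `import_mode` are hypothetical names invented for illustration, the returned strings are placeholders, and the stride check is a strict simplification. Only the device codes (`kDLCPU = 1`, `kDLMetal = 8`) come from `dlpack.h`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Device type codes as defined in dlpack.h.
K_DL_CPU = 1
K_DL_METAL = 8


def dispatch_path(obj) -> str:
    """Pick a conversion path (hypothetical helper; labels are illustrative)."""
    if hasattr(obj, "__mlx_array__"):
        return "mlx_array"  # keeps higher precedence than DLPack
    # CPython exposes raw capsules as instances of a type named "PyCapsule".
    if type(obj).__name__ == "PyCapsule" or hasattr(obj, "__dlpack__"):
        return "dlpack"
    return "generic"  # the accessor path that previously ended in bad_cast


@dataclass
class TensorInfo:
    """Hypothetical stand-in for the DLTensor fields a consumer inspects."""
    device_type: int
    shape: Tuple[int, ...]
    strides: Optional[Tuple[int, ...]] = None  # in elements; None = compact row-major
    byte_offset: int = 0


def row_major_strides(shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Element strides of a compact row-major layout for the given shape."""
    strides, running = [], 1
    for dim in reversed(shape):
        strides.append(running)
        running *= dim
    return tuple(reversed(strides))


def import_mode(t: TensorInfo) -> str:
    """Return "copy" for CPU or "wrap" for Metal; raise ValueError otherwise."""
    if t.device_type not in (K_DL_CPU, K_DL_METAL):
        raise ValueError(f"unsupported DLPack device type {t.device_type}")
    if any(dim < 0 for dim in t.shape):
        raise ValueError("invalid shape")
    if t.byte_offset != 0:
        raise ValueError("non-zero byte_offset is not supported")
    # Strict comparison; an assumption of this sketch, not the PR's exact rule.
    if t.strides is not None and tuple(t.strides) != row_major_strides(t.shape):
        raise ValueError("strides are not row-contiguous")
    return "copy" if t.device_type == K_DL_CPU else "wrap"
```

In this sketch every validation runs before anything is marked as consumed, which mirrors the stated behavior that rejected capsules are left unconsumed so the caller can hand them to another consumer.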