[xla:gpu] Use generic AsyncStart/Done thunks for host/device memcpy thunks #39428

Closed
ezhulenev wants to merge 2 commits into openxla:main from ezhulenev:async-execution-3
@ezhulenev
Contributor

Replace copy thunk async events with generic AsyncStartThunk/AsyncDoneThunk for H2D/D2H copies. Remove CopyDoneThunk.
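
For readers unfamiliar with the start/done split, here is a rough, self-contained sketch of the general pattern the description refers to: one generic start/done pair wraps an arbitrary operation and shares a completion event, so a copy does not need its own dedicated "done" type. Every name below (`ToyStream`, `ToyAsyncEvent`, `ToyAsyncStart`, `ToyAsyncDone`, `CopyViaStartDone`) is hypothetical and does not reflect the actual XLA class interfaces; the "stream" is a plain single-threaded task queue, assumed here only to keep the example deterministic.

```cpp
#include <algorithm>
#include <deque>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy types for illustration only; these are NOT the XLA classes.

// A toy stream: tasks run in FIFO order, and only when the stream is drained.
class ToyStream {
 public:
  void Enqueue(std::function<void()> task) { tasks_.push_back(std::move(task)); }
  void Drain() {
    while (!tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop_front();
      task();
    }
  }

 private:
  std::deque<std::function<void()>> tasks_;
};

// Completion state shared by exactly one start/done pair.
struct ToyAsyncEvent {
  ToyStream* stream = nullptr;  // stream the operation was enqueued on
  bool signaled = false;        // set once the operation has run
};

// Generic "start": enqueues any operation on a side stream, then an event marker.
class ToyAsyncStart {
 public:
  ToyAsyncStart(std::function<void()> op, std::shared_ptr<ToyAsyncEvent> event)
      : op_(std::move(op)), event_(std::move(event)) {}

  void Execute(ToyStream& side_stream) {
    event_->stream = &side_stream;
    side_stream.Enqueue(op_);
    side_stream.Enqueue([event = event_] { event->signaled = true; });
  }

 private:
  std::function<void()> op_;
  std::shared_ptr<ToyAsyncEvent> event_;
};

// Generic "done": waits for the paired event; knows nothing about the operation.
// Must be executed after the paired start, otherwise no stream is recorded.
class ToyAsyncDone {
 public:
  explicit ToyAsyncDone(std::shared_ptr<ToyAsyncEvent> event)
      : event_(std::move(event)) {}

  void Execute() {
    if (!event_->signaled) event_->stream->Drain();
  }

 private:
  std::shared_ptr<ToyAsyncEvent> event_;
};

// A copy expressed through the generic pair instead of a copy-specific "done".
std::vector<int> CopyViaStartDone(const std::vector<int>& src) {
  std::vector<int> dst(src.size());
  ToyStream side_stream;
  auto event = std::make_shared<ToyAsyncEvent>();
  ToyAsyncStart start(
      [&src, &dst] { std::copy(src.begin(), src.end(), dst.begin()); }, event);
  ToyAsyncDone done(event);
  start.Execute(side_stream);  // copy is enqueued but has not run yet
  done.Execute();              // drains the side stream until the event fires
  return dst;
}
```

In this sketch the done side depends only on the shared event, which is the property that makes a copy-specific done type unnecessary.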

@ezhulenev requested review from penpornk and seantalts on March 18, 2026 16:45
Member

@penpornk penpornk left a comment

Approving to test internally.

copybara-service Bot pushed a commit to tensorflow/tensorflow that referenced this pull request Mar 18, 2026
…e memcpy thunks

Imported from GitHub PR openxla/xla#39428

Replace copy thunk async events with generic AsyncStartThunk/AsyncDoneThunk for H2D/D2H copies. Remove CopyDoneThunk.

+ add a fix for correct execution stream resolving (previous attempt openxla/xla#39006)
Copybara import of the project:

--
e98a6f854b2ed999c77737799d8f9de5e637e180 by Eugene Zhulenev <ezhulenev@openxla.org>:

[xla:gpu] Use generic AsyncStart/Done thunks for host/device memcpy thunks

--
bbad4c5de99f45d7bcf8b6b05c09c6ddb53e7184 by Eugene Zhulenev <ezhulenev@openxla.org>:

Use params.stream instead of resolving it from attributes

Merging this change closes #39428

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#39428 from ezhulenev:async-execution-3 bbad4c5de99f45d7bcf8b6b05c09c6ddb53e7184
PiperOrigin-RevId: 885647980
copybara-service Bot pushed a commit that referenced this pull request Mar 18, 2026
…e memcpy thunks

Imported from GitHub PR #39428

Replace copy thunk async events with generic AsyncStartThunk/AsyncDoneThunk for H2D/D2H copies. Remove CopyDoneThunk.

+ add a fix for correct execution stream resolving (previous attempt #39006)
Copybara import of the project:

--
e98a6f8 by Eugene Zhulenev <ezhulenev@openxla.org>:

[xla:gpu] Use generic AsyncStart/Done thunks for host/device memcpy thunks

--
bbad4c5 by Eugene Zhulenev <ezhulenev@openxla.org>:

Use params.stream instead of resolving it from attributes

Merging this change closes #39428

FUTURE_COPYBARA_INTEGRATE_REVIEW=#39428 from ezhulenev:async-execution-3 bbad4c5
PiperOrigin-RevId: 885647980
copybara-service Bot pushed a commit that referenced this pull request Mar 20, 2026
…e memcpy thunks

Imported from GitHub PR #39428

Replace copy thunk async events with generic AsyncStartThunk/AsyncDoneThunk for H2D/D2H copies. Remove CopyDoneThunk.

+ add a fix for correct execution stream resolving (previous attempt #39006)
Copybara import of the project:

--
e98a6f8 by Eugene Zhulenev <ezhulenev@openxla.org>:

[xla:gpu] Use generic AsyncStart/Done thunks for host/device memcpy thunks

--
bbad4c5 by Eugene Zhulenev <ezhulenev@openxla.org>:

Use params.stream instead of resolving it from attributes

Merging this change closes #39428

FUTURE_COPYBARA_INTEGRATE_REVIEW=#39428 from ezhulenev:async-execution-3 bbad4c5
PiperOrigin-RevId: 885647980
copybara-service Bot pushed a commit to tensorflow/tensorflow that referenced this pull request Mar 20, 2026
…e memcpy thunks

Imported from GitHub PR openxla/xla#39428

Replace copy thunk async events with generic AsyncStartThunk/AsyncDoneThunk for H2D/D2H copies. Remove CopyDoneThunk.

+ add a fix for correct execution stream resolving (previous attempt openxla/xla#39006)
Copybara import of the project:

--
e98a6f854b2ed999c77737799d8f9de5e637e180 by Eugene Zhulenev <ezhulenev@openxla.org>:

[xla:gpu] Use generic AsyncStart/Done thunks for host/device memcpy thunks

--
bbad4c5de99f45d7bcf8b6b05c09c6ddb53e7184 by Eugene Zhulenev <ezhulenev@openxla.org>:

Use params.stream instead of resolving it from attributes

Merging this change closes #39428

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#39428 from ezhulenev:async-execution-3 bbad4c5de99f45d7bcf8b6b05c09c6ddb53e7184
PiperOrigin-RevId: 885647980
copybara-service Bot pushed a commit that referenced this pull request Mar 23, 2026
…e memcpy thunks

Imported from GitHub PR #39428

Replace copy thunk async events with generic AsyncStartThunk/AsyncDoneThunk for H2D/D2H copies. Remove CopyDoneThunk.

+ add a fix for correct execution stream resolving (previous attempt #39006)
Copybara import of the project:

--
c7b9c25 by Eugene Zhulenev <ezhulenev@openxla.org>:

[xla:gpu] Use generic AsyncStart/Done thunks for host/device memcpy thunks

--
1dc71b5 by Eugene Zhulenev <ezhulenev@openxla.org>:

Use params.stream instead of resolving it from attributes

Merging this change closes #39428

FUTURE_COPYBARA_INTEGRATE_REVIEW=#39428 from ezhulenev:async-execution-3 1dc71b5
PiperOrigin-RevId: 885647980
copybara-service Bot pushed a commit to tensorflow/tensorflow that referenced this pull request Mar 23, 2026
…e memcpy thunks

Imported from GitHub PR openxla/xla#39428

Replace copy thunk async events with generic AsyncStartThunk/AsyncDoneThunk for H2D/D2H copies. Remove CopyDoneThunk.

+ add a fix for correct execution stream resolving (previous attempt openxla/xla#39006)
Copybara import of the project:

--
c7b9c250652aa597cf822cde8f4b3144ca24d5ce by Eugene Zhulenev <ezhulenev@openxla.org>:

[xla:gpu] Use generic AsyncStart/Done thunks for host/device memcpy thunks

--
1dc71b5e3336a58936b35799eeff4f0d85772f9b by Eugene Zhulenev <ezhulenev@openxla.org>:

Use params.stream instead of resolving it from attributes

Merging this change closes #39428

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#39428 from ezhulenev:async-execution-3 1dc71b5e3336a58936b35799eeff4f0d85772f9b
PiperOrigin-RevId: 885647980
copybara-service Bot pushed a commit to tensorflow/tensorflow that referenced this pull request Mar 23, 2026
…e memcpy thunks

Imported from GitHub PR openxla/xla#39428

Replace copy thunk async events with generic AsyncStartThunk/AsyncDoneThunk for H2D/D2H copies. Remove CopyDoneThunk.

+ add a fix for correct execution stream resolving (previous attempt openxla/xla#39006)
Copybara import of the project:

--
c7b9c250652aa597cf822cde8f4b3144ca24d5ce by Eugene Zhulenev <ezhulenev@openxla.org>:

[xla:gpu] Use generic AsyncStart/Done thunks for host/device memcpy thunks

--
1dc71b5e3336a58936b35799eeff4f0d85772f9b by Eugene Zhulenev <ezhulenev@openxla.org>:

Use params.stream instead of resolving it from attributes

Merging this change closes #39428

PiperOrigin-RevId: 888338491
3 participants