Hi,
I noticed a potential point of confusion regarding the rule "only output tensors can be retained".
In Example 3 (Recomputation strategy), Tensor0 is used in both subgraphs:
- Subgraph 0: [0,1]
- Subgraph 1: [0,2]
However, Tensor0 is not listed in tensors_to_retain, and it is not an output of either subgraph.
From the description, it seems Tensor0 is reloaded from slow memory in each subgraph rather than retained in fast memory.
So my understanding is:
- tensors_to_retain only refers to tensors that persist in fast memory across subgraphs
- tensors not in tensors_to_retain can still be reused, but must be reloaded from slow memory
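To make my interpretation concrete, here is a minimal Python sketch of how I picture the memory behavior. All names and values here are my own illustration, not from the actual problem statement; I am only assuming that retained tensors persist in fast memory between subgraphs while everything else is fetched from slow memory each time it is needed.

```python
# Hypothetical illustration of my interpretation (not the real spec).
tensors_to_retain = {1}           # assume Tensor1 is a retained output
subgraphs = [[0, 1], [0, 2]]      # Example 3: Tensor0 appears in both

fast_memory = set()
reloads = []                      # tensors fetched from slow memory, per subgraph

for needed in subgraphs:
    # Anything not already sitting in fast memory must come from slow memory.
    loaded = [t for t in needed if t not in fast_memory]
    reloads.append(loaded)
    # Only tensors in tensors_to_retain survive into the next subgraph.
    fast_memory = {t for t in needed if t in tensors_to_retain}

print(reloads)  # Tensor0 shows up in both subgraphs' reload lists
```

Under this reading, Tensor0 is loaded from slow memory twice (once per subgraph), even though it is reused, because it was never eligible for retention.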
Could you confirm if this interpretation is correct?
Thanks!