I noticed that the code uses two noisy_lq_latents, which are concatenated along the sequence dimension before being fed into the transformer. This increases the computational cost, because the token sequence becomes longer. May I ask:
- Would channel-wise concatenation be a better alternative?
- With only one noisy_lq_latent, restoration still works, but the results seem worse than with your sequence-concatenation approach. Is my understanding correct?
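
To make the cost concern concrete, here is a rough sketch of the comparison I have in mind. It is not taken from the repository: the helper names (`concat_shapes`, `attention_pairs`) and the sizes (1024 tokens, 64 channels) are made up for illustration, and it only counts how many token pairs self-attention has to score, ignoring everything else in the model.

```python
def concat_shapes(batch, tokens, channels, num_latents):
    """Shapes after joining `num_latents` latents of shape (batch, tokens, channels).

    Returns (sequence_concat_shape, channel_concat_shape).
    All sizes are hypothetical; they are not read from the actual code.
    """
    seq_shape = (batch, tokens * num_latents, channels)   # longer sequence
    chan_shape = (batch, tokens, channels * num_latents)  # wider tokens
    return seq_shape, chan_shape


def attention_pairs(shape):
    """Token pairs scored by full self-attention per sample (quadratic in length)."""
    _, seq_len, _ = shape
    return seq_len * seq_len


# Hypothetical example: two latents, 1024 tokens each, 64 channels.
seq_shape, chan_shape = concat_shapes(batch=1, tokens=1024, channels=64, num_latents=2)
seq_cost = attention_pairs(seq_shape)
chan_cost = attention_pairs(chan_shape)
print(seq_shape, chan_shape, seq_cost // chan_cost)
```

Under these assumptions, sequence concatenation doubles the sequence length and so quadruples the attention pair count, while channel-wise concatenation keeps the sequence length fixed and would only need a wider input projection. What this sketch cannot show is whether the cheaper variant restores as well, which is what I am asking about.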
Thank you for your reply!