First of all, thank you for sharing such great work and code.
I have two questions:
- When running inference with video_gen (image_pair), why is the reconstructed rendering result passed in instead of the original image?
- The `noise_timestep` appears to be fixed in the inference code, and I can't find any DDPNet-related code. Is it included somewhere?