Dear Authors,
Thanks for this interesting work. I've tested your code, and it works quite well.
However, one thing worth discussing is that the method seems to over-rely on AdaIN. That is, the method loses robustness (and sometimes fails to work at all) once I disable the get_adain_callback function in the cactif_model, while leaving all other mechanisms intact.
Here are some examples before disabling AdaIN:


These are examples after I commented out lines 75-82 and replaced them with a bare return:
def get_adain_callback(self) -> Callable:
    """
    Returns a callback function for AdaIN or class-AdaIN based on the current step and config.
    """
    def callback(st: int, t: int, latents: torch.FloatTensor) -> None:
        return
        # self.step = st
        # if self.config.class_adain_range.start <= self.step < self.config.class_adain_range.end and self.config.adain_class:
        #     # Apply class-wise AdaIN
        #     latents[0] = custom_adain_pixel(latents[0], latents[1], self.label_content_adain, self.label_style_adain)
        # else:
        #     # Apply standard AdaIN
        #     latents[0] = adain(latents[0], latents[1])
    return callback
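For context, here is a minimal sketch of what the standard adain call in the commented-out branch presumably does: aligning the per-channel mean and standard deviation of the content latent to those of the style latent. This follows the usual AdaIN formulation and is an assumption on my part; the repository's actual adain implementation may differ in details.

```python
import torch

def adain(content: torch.FloatTensor, style: torch.FloatTensor, eps: float = 1e-5) -> torch.FloatTensor:
    """Standard AdaIN sketch (assumed): shift/scale `content` so its per-channel
    spatial mean and std match those of `style`. Shapes: (C, H, W) or (B, C, H, W)."""
    c_mean = content.mean(dim=(-2, -1), keepdim=True)
    c_std = content.std(dim=(-2, -1), keepdim=True) + eps  # eps avoids division by zero
    s_mean = style.mean(dim=(-2, -1), keepdim=True)
    s_std = style.std(dim=(-2, -1), keepdim=True)
    return s_std * (content - c_mean) / c_std + s_mean
```

If the method's output collapses without this normalization step, that suggests the cross-image attention alone does not keep the content latent's statistics in a range the denoiser can handle.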


Since this method is heavily based on cross-image attention, I'm not sure whether this is a limitation inherited from it?