Hi, great work! I find your training setup quite interesting and would like a bit more detail on how you managed to get such a flexible inference-time editing mechanism.
I understand that during training you use batches containing both bounding-box and free-form ROIs, as well as empty channels, each accompanied by the corresponding conditioning vector.
I would like to know how you decide which ROI mode to use for each sample. Are they selected randomly? Did some of the ROI modes perform better than others? Does "empty channel" mean a tensor of zeros?
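To make the question concrete, here is a rough sketch of what I imagine the per-sample ROI selection might look like. Everything in it is my own assumption rather than something taken from your code: the mode names, the uniform random choice between modes, the random-walk stand-in for a free-form ROI, and the all-zeros mask for the empty case. The conditioning vector is left out because I do not know how it is built.

```python
import random

# Hypothetical mode names; the real training code may use different ones.
ROI_MODES = ("bbox", "free_form", "empty")


def make_roi_mask(mode, height, width, rng):
    """Return a height x width binary mask (list of lists) for one ROI mode."""
    mask = [[0] * width for _ in range(height)]
    if mode == "empty":
        # Assumption: an "empty channel" is simply an all-zeros mask.
        return mask
    if mode == "bbox":
        # Random axis-aligned rectangle covering at least one pixel.
        y0, x0 = rng.randrange(height), rng.randrange(width)
        y1, x1 = rng.randrange(y0, height), rng.randrange(x0, width)
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                mask[y][x] = 1
        return mask
    if mode == "free_form":
        # Stand-in for a free-form ROI: a short random walk over the grid.
        y, x = rng.randrange(height), rng.randrange(width)
        for _ in range(max(1, height * width // 4)):
            mask[y][x] = 1
            y = min(max(y + rng.choice((-1, 0, 1)), 0), height - 1)
            x = min(max(x + rng.choice((-1, 0, 1)), 0), width - 1)
        return mask
    raise ValueError(f"unknown ROI mode: {mode}")


def sample_batch_rois(batch_size, height, width, seed=None):
    """Pick one ROI mode per sample uniformly at random and build its mask."""
    rng = random.Random(seed)
    modes = [rng.choice(ROI_MODES) for _ in range(batch_size)]
    masks = [make_roi_mask(mode, height, width, rng) for mode in modes]
    return modes, masks
```

Is this close to what you do, or are the modes mixed with fixed proportions or on some schedule?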
Also, do you plan to publish a quantitative evaluation?
Thanks!