Thank you for sharing the paper and code.
While reading the experimental settings in Section 5.2 (Implementation), I had a question about the fine-tuning time.
Could you please let me know the approximate fine-tuning time for Multimodal-CoT, if you remember?
I am trying to understand the paper and code for re-implementation.
However, due to limited computing resources (no multi-GPU setup), I have to use cloud services.
Because cloud providers charge by the hour, I need to estimate the fine-tuning time in advance.