Encountered when running dryml code in a Jupyter notebook.
I decorated a training method with `@dryml.compute`, then realized I had left debug print statements in the `forward` method of a PyTorch model. Because `forward` is called very frequently, the cell output was flooded with messages. I interrupted the cell, but the subprocess doesn't receive the interrupt signal, and it won't stop on its own because it waits for the closing signal from one of the queues. I had to terminate the process.
Could this be more reason to move to Ray for this type of execution?
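
For discussion, here is a minimal sketch of one way the parent side could handle this without leaving `multiprocessing`. It is not how `@dryml.compute` is implemented; `run_interruptibly`, `_worker`, and `poll_interval` are hypothetical names, and the sketch assumes the POSIX-only `fork` start method. The idea is that the parent polls the result queue with a timeout and terminates the child if it is interrupted, rather than relying on the child to see the signal or a closing message.

```python
import multiprocessing as mp
import queue

# Assumption: POSIX-only "fork" start method, so functions defined in a
# notebook cell can be passed to the child without being pickled.
_ctx = mp.get_context("fork")


def _worker(fn, args, result_queue):
    # Runs in the child: compute the result and send it back to the parent.
    result_queue.put(fn(*args))


def run_interruptibly(fn, *args, poll_interval=0.1):
    """Run fn(*args) in a child process; terminate the child if the parent is interrupted."""
    result_queue = _ctx.Queue()
    proc = _ctx.Process(target=_worker, args=(fn, args, result_queue))
    proc.start()
    try:
        while True:
            try:
                # Poll with a timeout so the loop can also notice a dead child.
                result = result_queue.get(timeout=poll_interval)
                break
            except queue.Empty:
                if proc.is_alive():
                    continue
                # The child is gone: check once more for a result it sent
                # just before exiting, then give up.
                try:
                    result = result_queue.get(timeout=poll_interval)
                    break
                except queue.Empty:
                    raise RuntimeError("worker exited without returning a result") from None
        proc.join()
        return result
    except BaseException:
        # Covers KeyboardInterrupt raised in the parent by a notebook interrupt:
        # stop the child explicitly instead of waiting on the queue.
        proc.terminate()
        proc.join()
        raise
```

With something like this, interrupting the cell would raise `KeyboardInterrupt` in the parent, which then terminates the child before re-raising. Whether this is preferable to moving to Ray is a separate question.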