GPU utilization with your code seems low and unstable. I noticed this while training on a small dataset (about 1000 images).