forked from guoyww/AnimateDiff
training_final.log
6387 lines (5176 loc) · 477 KB
loaded 3D unet's pretrained weights from runwayml/stable-diffusion-v1-5 ...
### missing keys: 520;
### unexpected keys: 0;
### Motion Module Parameters: 417.1376 M
11/14/2025 06:08:39 - INFO - root - ***** Running training *****
11/14/2025 06:08:39 - INFO - root - Num examples = 32
11/14/2025 06:08:39 - INFO - root - DataLoader length = 32
11/14/2025 06:08:39 - INFO - root - Num update steps per epoch = 32
11/14/2025 06:08:39 - INFO - root - Num Epochs = 63
11/14/2025 06:08:39 - INFO - root - Instantaneous batch size per device = 1
11/14/2025 06:08:39 - INFO - root - Total train batch size (w. parallel, distributed & accumulation) = 1
11/14/2025 06:08:39 - INFO - root - Gradient Accumulation steps = 1
11/14/2025 06:08:39 - INFO - root - Total optimization steps = 2000
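Note on the header values above: the reported epoch count is consistent with the step budget. A minimal sketch of the arithmetic, assuming steps per epoch and epoch count are derived by ceiling division as is common in diffusers-style training loops (the variable names are illustrative and not taken from the training script):

```python
import math

# Values copied from the "Running training" header of this log.
num_examples = 32
batch_size_per_device = 1
gradient_accumulation_steps = 1
max_train_steps = 2000

# Assumption: one dataloader batch per example group, one update per accumulated batch.
dataloader_length = math.ceil(num_examples / batch_size_per_device)
steps_per_epoch = math.ceil(dataloader_length / gradient_accumulation_steps)
num_epochs = math.ceil(max_train_steps / steps_per_epoch)

# The last epoch is cut short once the step budget is exhausted.
steps_in_last_epoch = max_train_steps - (num_epochs - 1) * steps_per_epoch

print(steps_per_epoch, num_epochs, steps_in_last_epoch)
```

Under these assumptions the run performs 62 full epochs of 32 steps and a final partial epoch, which matches "Num Epochs = 63" for 2000 optimization steps.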
Steps: 0%| | 0/2000 [00:00<?, ?it/s]
11/14/2025 06:08:39 - INFO - root - ### DEBUG: Starting epoch 0/63, global_step=0, max_train_steps=2000
/home/takahashit/.local/lib/python3.11/site-packages/torch/utils/checkpoint.py:464: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants.
warnings.warn(
/home/takahashit/.local/lib/python3.11/site-packages/torch/utils/checkpoint.py:91: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn(
Steps: 0%| | 1/2000 [00:01<44:57, 1.35s/it, lr=0.0001, step_loss=0.011]
Steps: 0%| | 2/2000 [00:02<33:37, 1.01s/it, lr=0.0001, step_loss=0.00576]
Steps: 0%| | 3/2000 [00:02<30:17, 1.10it/s, lr=0.0001, step_loss=0.1]
Steps: 0%| | 4/2000 [00:03<28:25, 1.17it/s, lr=0.0001, step_loss=0.193]
Steps: 0%| | 5/2000 [00:04<27:22, 1.21it/s, lr=0.0001, step_loss=0.138]
Steps: 0%| | 6/2000 [00:05<26:44, 1.24it/s, lr=0.0001, step_loss=0.0671]
Steps: 0%| | 7/2000 [00:06<26:21, 1.26it/s, lr=0.0001, step_loss=0.00197]
Steps: 0%| | 8/2000 [00:06<26:05, 1.27it/s, lr=0.0001, step_loss=0.00339]
Steps: 0%| | 9/2000 [00:07<25:53, 1.28it/s, lr=0.0001, step_loss=0.0887]
Steps: 0%| | 10/2000 [00:08<25:46, 1.29it/s, lr=0.0001, step_loss=0.0887]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 31.26it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.26it/s]
11/14/2025 06:09:20 - INFO - root - Saved samples to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/samples/sample-10.gif
Steps: 0%| | 10/2000 [00:40<25:46, 1.29it/s, lr=0.0001, step_loss=0.00157]
Steps: 1%| | 11/2000 [00:41<5:55:10, 10.71s/it, lr=0.0001, step_loss=0.00148]
Steps: 1%| | 12/2000 [00:42<4:14:44, 7.69s/it, lr=0.0001, step_loss=0.00125]
Steps: 1%| | 13/2000 [00:43<3:05:11, 5.59s/it, lr=0.0001, step_loss=0.0166]
Steps: 1%| | 14/2000 [00:43<2:16:52, 4.14s/it, lr=0.0001, step_loss=0.112]
Steps: 1%| | 15/2000 [00:44<1:43:14, 3.12s/it, lr=0.0001, step_loss=0.268]
Steps: 1%| | 16/2000 [00:45<1:19:46, 2.41s/it, lr=0.0001, step_loss=0.011]
Steps: 1%| | 17/2000 [00:46<1:03:23, 1.92s/it, lr=0.0001, step_loss=0.00293]
Steps: 1%| | 18/2000 [00:46<51:57, 1.57s/it, lr=0.0001, step_loss=0.00147]
Steps: 1%| | 19/2000 [00:47<44:05, 1.34s/it, lr=0.0001, step_loss=0.0172]
Steps: 1%| | 20/2000 [00:48<38:33, 1.17s/it, lr=0.0001, step_loss=0.0292]
Steps: 1%| | 21/2000 [00:49<34:34, 1.05s/it, lr=0.0001, step_loss=0.00174]
Steps: 1%| | 22/2000 [00:50<31:49, 1.04it/s, lr=0.0001, step_loss=0.0968]
Steps: 1%| | 23/2000 [00:50<29:53, 1.10it/s, lr=0.0001, step_loss=0.0203]
Steps: 1%| | 24/2000 [00:51<28:30, 1.15it/s, lr=0.0001, step_loss=0.0939]
Steps: 1%|▏ | 25/2000 [00:52<27:31, 1.20it/s, lr=0.0001, step_loss=0.163]
Steps: 1%|▏ | 26/2000 [00:53<26:50, 1.23it/s, lr=0.0001, step_loss=0.0677]
Steps: 1%|▏ | 27/2000 [00:53<26:22, 1.25it/s, lr=0.0001, step_loss=0.0741]
Steps: 1%|▏ | 28/2000 [00:54<26:01, 1.26it/s, lr=0.0001, step_loss=0.0356]
Steps: 1%|▏ | 29/2000 [00:55<25:46, 1.27it/s, lr=0.0001, step_loss=0.00114]
Steps: 2%|▏ | 30/2000 [00:56<25:36, 1.28it/s, lr=0.0001, step_loss=0.195]
Steps: 2%|▏ | 31/2000 [00:56<25:29, 1.29it/s, lr=0.0001, step_loss=0.00376]
Steps: 2%|▏ | 32/2000 [00:57<25:24, 1.29it/s, lr=0.0001, step_loss=0.00376]
11/14/2025 06:09:43 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 32)
Steps: 2%|▏ | 32/2000 [01:03<25:24, 1.29it/s, lr=0.0001, step_loss=0.000869]
11/14/2025 06:09:43 - INFO - root - ### DEBUG: Finished epoch 0, epoch_steps=32, global_step=32
11/14/2025 06:09:43 - INFO - root - ### DEBUG: Starting epoch 1/63, global_step=32, max_train_steps=2000
Steps: 2%|▏ | 33/2000 [01:04<1:22:42, 2.52s/it, lr=0.0001, step_loss=0.0756]
Steps: 2%|▏ | 34/2000 [01:05<1:05:23, 2.00s/it, lr=0.0001, step_loss=0.0235]
Steps: 2%|▏ | 35/2000 [01:05<53:20, 1.63s/it, lr=0.0001, step_loss=0.168]
Steps: 2%|▏ | 36/2000 [01:06<44:54, 1.37s/it, lr=0.0001, step_loss=0.00462]
Steps: 2%|▏ | 37/2000 [01:07<38:59, 1.19s/it, lr=0.0001, step_loss=0.0833]
Steps: 2%|▏ | 38/2000 [01:08<34:51, 1.07s/it, lr=0.0001, step_loss=0.234]
Steps: 2%|▏ | 39/2000 [01:08<31:59, 1.02it/s, lr=0.0001, step_loss=0.00324]
Steps: 2%|▏ | 40/2000 [01:09<30:01, 1.09it/s, lr=0.0001, step_loss=0.0303]
Steps: 2%|▏ | 41/2000 [01:10<28:32, 1.14it/s, lr=0.0001, step_loss=0.0956]
Steps: 2%|▏ | 42/2000 [01:11<27:31, 1.19it/s, lr=0.0001, step_loss=0.0411]
Steps: 2%|▏ | 43/2000 [01:12<26:47, 1.22it/s, lr=0.0001, step_loss=0.0031]
Steps: 2%|▏ | 44/2000 [01:12<26:16, 1.24it/s, lr=0.0001, step_loss=0.000841]
Steps: 2%|▏ | 45/2000 [01:13<25:56, 1.26it/s, lr=0.0001, step_loss=0.203]
Steps: 2%|▏ | 46/2000 [01:14<25:41, 1.27it/s, lr=0.0001, step_loss=0.00922]
Steps: 2%|▏ | 47/2000 [01:15<25:29, 1.28it/s, lr=0.0001, step_loss=0.00128]
Steps: 2%|▏ | 48/2000 [01:15<25:20, 1.28it/s, lr=0.0001, step_loss=0.00505]
Steps: 2%|▏ | 49/2000 [01:16<25:15, 1.29it/s, lr=0.0001, step_loss=0.0714]
Steps: 2%|▎ | 50/2000 [01:17<25:11, 1.29it/s, lr=0.0001, step_loss=0.0714]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.21it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.24it/s]
11/14/2025 06:10:29 - INFO - root - Saved samples to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/samples/sample-50.gif
Steps: 2%|▎ | 50/2000 [01:49<25:11, 1.29it/s, lr=0.0001, step_loss=0.0235]
Steps: 3%|▎ | 51/2000 [01:50<5:39:48, 10.46s/it, lr=0.0001, step_loss=0.00466]
Steps: 3%|▎ | 52/2000 [01:51<4:05:13, 7.55s/it, lr=0.0001, step_loss=0.0394]
Steps: 3%|▎ | 53/2000 [01:52<2:59:02, 5.52s/it, lr=0.0001, step_loss=0.00107]
Steps: 3%|▎ | 54/2000 [01:52<2:12:44, 4.09s/it, lr=0.0001, step_loss=0.468]
Steps: 3%|▎ | 55/2000 [01:53<1:40:20, 3.10s/it, lr=0.0001, step_loss=0.0223]
Steps: 3%|▎ | 56/2000 [01:54<1:17:42, 2.40s/it, lr=0.0001, step_loss=0.0425]
Steps: 3%|▎ | 57/2000 [01:55<1:01:49, 1.91s/it, lr=0.0001, step_loss=0.1]
Steps: 3%|▎ | 58/2000 [01:55<50:44, 1.57s/it, lr=0.0001, step_loss=0.0417]
Steps: 3%|▎ | 59/2000 [01:56<42:58, 1.33s/it, lr=0.0001, step_loss=0.00307]
Steps: 3%|▎ | 60/2000 [01:57<37:32, 1.16s/it, lr=0.0001, step_loss=0.547]
Steps: 3%|▎ | 61/2000 [01:58<33:43, 1.04s/it, lr=0.0001, step_loss=0.269]
Steps: 3%|▎ | 62/2000 [01:58<31:03, 1.04it/s, lr=0.0001, step_loss=0.0686]
Steps: 3%|▎ | 63/2000 [01:59<29:11, 1.11it/s, lr=0.0001, step_loss=0.0722]
Steps: 3%|▎ | 64/2000 [02:00<27:53, 1.16it/s, lr=0.0001, step_loss=0.0722]
11/14/2025 06:10:48 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 64)
Steps: 3%|▎ | 64/2000 [02:08<27:53, 1.16it/s, lr=0.0001, step_loss=0.0645]
11/14/2025 06:10:48 - INFO - root - ### DEBUG: Finished epoch 1, epoch_steps=32, global_step=64
11/14/2025 06:10:48 - INFO - root - ### DEBUG: Starting epoch 2/63, global_step=64, max_train_steps=2000
Steps: 3%|▎ | 65/2000 [02:09<1:45:42, 3.28s/it, lr=0.0001, step_loss=0.124]
Steps: 3%|▎ | 66/2000 [02:10<1:21:24, 2.53s/it, lr=0.0001, step_loss=0.00084]
Steps: 3%|▎ | 67/2000 [02:10<1:04:22, 2.00s/it, lr=0.0001, step_loss=0.072]
Steps: 3%|▎ | 68/2000 [02:11<52:28, 1.63s/it, lr=0.0001, step_loss=0.0249]
Steps: 3%|▎ | 69/2000 [02:12<44:08, 1.37s/it, lr=0.0001, step_loss=0.27]
Steps: 4%|▎ | 70/2000 [02:13<38:18, 1.19s/it, lr=0.0001, step_loss=0.0913]
Steps: 4%|▎ | 71/2000 [02:14<34:14, 1.07s/it, lr=0.0001, step_loss=0.00956]
Steps: 4%|▎ | 72/2000 [02:14<31:23, 1.02it/s, lr=0.0001, step_loss=0.108]
Steps: 4%|▎ | 73/2000 [02:15<29:20, 1.09it/s, lr=0.0001, step_loss=0.0088]
Steps: 4%|▎ | 74/2000 [02:16<27:56, 1.15it/s, lr=0.0001, step_loss=0.0186]
Steps: 4%|▍ | 75/2000 [02:17<26:56, 1.19it/s, lr=0.0001, step_loss=0.159]
Steps: 4%|▍ | 76/2000 [02:17<26:14, 1.22it/s, lr=0.0001, step_loss=0.000974]
Steps: 4%|▍ | 77/2000 [02:18<25:44, 1.24it/s, lr=0.0001, step_loss=0.0223]
Steps: 4%|▍ | 78/2000 [02:19<25:24, 1.26it/s, lr=0.0001, step_loss=0.027]
Steps: 4%|▍ | 79/2000 [02:20<25:09, 1.27it/s, lr=0.0001, step_loss=0.000742]
Steps: 4%|▍ | 80/2000 [02:20<24:58, 1.28it/s, lr=0.0001, step_loss=0.021]
Steps: 4%|▍ | 81/2000 [02:21<24:51, 1.29it/s, lr=0.0001, step_loss=0.00129]
Steps: 4%|▍ | 82/2000 [02:22<24:45, 1.29it/s, lr=0.0001, step_loss=0.0593]
Steps: 4%|▍ | 83/2000 [02:23<24:41, 1.29it/s, lr=0.0001, step_loss=0.0538]
Steps: 4%|▍ | 84/2000 [02:24<24:39, 1.30it/s, lr=0.0001, step_loss=0.0101]
Steps: 4%|▍ | 85/2000 [02:24<24:36, 1.30it/s, lr=0.0001, step_loss=0.00522]
Steps: 4%|▍ | 86/2000 [02:25<24:34, 1.30it/s, lr=0.0001, step_loss=0.195]
Steps: 4%|▍ | 87/2000 [02:26<24:33, 1.30it/s, lr=0.0001, step_loss=0.0957]
Steps: 4%|▍ | 88/2000 [02:27<24:31, 1.30it/s, lr=0.0001, step_loss=0.0416]
Steps: 4%|▍ | 89/2000 [02:27<24:29, 1.30it/s, lr=0.0001, step_loss=0.11]
Steps: 4%|▍ | 90/2000 [02:28<24:28, 1.30it/s, lr=0.0001, step_loss=0.00359]
Steps: 5%|▍ | 91/2000 [02:29<24:28, 1.30it/s, lr=0.0001, step_loss=0.00548]
Steps: 5%|▍ | 92/2000 [02:30<24:28, 1.30it/s, lr=0.0001, step_loss=0.0275]
Steps: 5%|▍ | 93/2000 [02:30<24:27, 1.30it/s, lr=0.0001, step_loss=0.000776]
Steps: 5%|▍ | 94/2000 [02:31<24:26, 1.30it/s, lr=0.0001, step_loss=0.379]
Steps: 5%|▍ | 95/2000 [02:32<24:25, 1.30it/s, lr=0.0001, step_loss=0.0384]
Steps: 5%|▍ | 96/2000 [02:33<24:23, 1.30it/s, lr=0.0001, step_loss=0.0384]
11/14/2025 06:11:20 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 96)
Steps: 5%|▍ | 96/2000 [02:41<24:23, 1.30it/s, lr=0.0001, step_loss=0.0339]
11/14/2025 06:11:20 - INFO - root - ### DEBUG: Finished epoch 2, epoch_steps=32, global_step=96
11/14/2025 06:11:20 - INFO - root - ### DEBUG: Starting epoch 3/63, global_step=96, max_train_steps=2000
Steps: 5%|▍ | 97/2000 [02:41<1:39:25, 3.13s/it, lr=0.0001, step_loss=0.0175]
Steps: 5%|▍ | 98/2000 [02:42<1:16:53, 2.43s/it, lr=0.0001, step_loss=0.00688]
Steps: 5%|▍ | 99/2000 [02:43<1:01:09, 1.93s/it, lr=0.0001, step_loss=0.0814]
Steps: 5%|▌ | 100/2000 [02:44<50:09, 1.58s/it, lr=0.0001, step_loss=0.00387]
Steps: 5%|▌ | 101/2000 [02:45<42:29, 1.34s/it, lr=0.0001, step_loss=0.158]
Steps: 5%|▌ | 102/2000 [02:45<37:04, 1.17s/it, lr=0.0001, step_loss=0.000858]
Steps: 5%|▌ | 103/2000 [02:46<33:18, 1.05s/it, lr=0.0001, step_loss=0.0353]
Steps: 5%|▌ | 104/2000 [02:47<30:38, 1.03it/s, lr=0.0001, step_loss=0.00691]
Steps: 5%|▌ | 105/2000 [02:48<28:48, 1.10it/s, lr=0.0001, step_loss=0.00218]
Steps: 5%|▌ | 106/2000 [02:48<27:30, 1.15it/s, lr=0.0001, step_loss=0.00367]
Steps: 5%|▌ | 107/2000 [02:49<26:33, 1.19it/s, lr=0.0001, step_loss=0.042]
Steps: 5%|▌ | 108/2000 [02:50<25:53, 1.22it/s, lr=0.0001, step_loss=0.0739]
Steps: 5%|▌ | 109/2000 [02:51<25:26, 1.24it/s, lr=0.0001, step_loss=0.00441]
Steps: 6%|▌ | 110/2000 [02:52<25:05, 1.26it/s, lr=0.0001, step_loss=0.000771]
Steps: 6%|▌ | 111/2000 [02:52<24:53, 1.26it/s, lr=0.0001, step_loss=0.0251]
Steps: 6%|▌ | 112/2000 [02:53<24:44, 1.27it/s, lr=0.0001, step_loss=0.0174]
Steps: 6%|▌ | 113/2000 [02:54<24:36, 1.28it/s, lr=0.0001, step_loss=0.0851]
Steps: 6%|▌ | 114/2000 [02:55<24:29, 1.28it/s, lr=0.0001, step_loss=0.0159]
Steps: 6%|▌ | 115/2000 [02:55<24:28, 1.28it/s, lr=0.0001, step_loss=0.0731]
Steps: 6%|▌ | 116/2000 [02:56<24:25, 1.29it/s, lr=0.0001, step_loss=0.11]
Steps: 6%|▌ | 117/2000 [02:57<24:23, 1.29it/s, lr=0.0001, step_loss=0.015]
Steps: 6%|▌ | 118/2000 [02:58<24:22, 1.29it/s, lr=0.0001, step_loss=0.0097]
Steps: 6%|▌ | 119/2000 [02:58<24:18, 1.29it/s, lr=0.0001, step_loss=0.00252]
Steps: 6%|▌ | 120/2000 [02:59<24:17, 1.29it/s, lr=0.0001, step_loss=0.00117]
Steps: 6%|▌ | 121/2000 [03:00<24:13, 1.29it/s, lr=0.0001, step_loss=0.268]
Steps: 6%|▌ | 122/2000 [03:01<24:13, 1.29it/s, lr=0.0001, step_loss=0.122]
Steps: 6%|▌ | 123/2000 [03:02<24:10, 1.29it/s, lr=0.0001, step_loss=0.148]
Steps: 6%|▌ | 124/2000 [03:02<24:09, 1.29it/s, lr=0.0001, step_loss=0.000954]
Steps: 6%|▋ | 125/2000 [03:03<24:06, 1.30it/s, lr=0.0001, step_loss=0.169]
Steps: 6%|▋ | 126/2000 [03:04<24:06, 1.30it/s, lr=0.0001, step_loss=0.00108]
Steps: 6%|▋ | 127/2000 [03:05<24:04, 1.30it/s, lr=0.0001, step_loss=0.0712]
Steps: 6%|▋ | 128/2000 [03:05<24:02, 1.30it/s, lr=0.0001, step_loss=0.0712]
11/14/2025 06:11:53 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 128)
Steps: 6%|▋ | 128/2000 [03:13<24:02, 1.30it/s, lr=0.0001, step_loss=0.0296]
11/14/2025 06:11:53 - INFO - root - ### DEBUG: Finished epoch 3, epoch_steps=32, global_step=128
11/14/2025 06:11:53 - INFO - root - ### DEBUG: Starting epoch 4/63, global_step=128, max_train_steps=2000
Steps: 6%|▋ | 129/2000 [03:14<1:36:04, 3.08s/it, lr=0.0001, step_loss=0.0911]
Steps: 6%|▋ | 130/2000 [03:15<1:14:25, 2.39s/it, lr=0.0001, step_loss=0.119]
Steps: 7%|▋ | 131/2000 [03:15<59:16, 1.90s/it, lr=0.0001, step_loss=0.00186]
Steps: 7%|▋ | 132/2000 [03:16<48:38, 1.56s/it, lr=0.0001, step_loss=0.00904]
Steps: 7%|▋ | 133/2000 [03:17<41:14, 1.33s/it, lr=0.0001, step_loss=0.00655]
Steps: 7%|▋ | 134/2000 [03:18<36:01, 1.16s/it, lr=0.0001, step_loss=0.0291]
Steps: 7%|▋ | 135/2000 [03:19<32:23, 1.04s/it, lr=0.0001, step_loss=0.00143]
Steps: 7%|▋ | 136/2000 [03:19<29:50, 1.04it/s, lr=0.0001, step_loss=0.0127]
Steps: 7%|▋ | 137/2000 [03:20<28:05, 1.11it/s, lr=0.0001, step_loss=0.221]
Steps: 7%|▋ | 138/2000 [03:21<26:48, 1.16it/s, lr=0.0001, step_loss=0.00137]
Steps: 7%|▋ | 139/2000 [03:22<25:56, 1.20it/s, lr=0.0001, step_loss=0.382]
Steps: 7%|▋ | 140/2000 [03:22<25:19, 1.22it/s, lr=0.0001, step_loss=0.017]
Steps: 7%|▋ | 141/2000 [03:23<24:52, 1.25it/s, lr=0.0001, step_loss=0.203]
Steps: 7%|▋ | 142/2000 [03:24<24:32, 1.26it/s, lr=0.0001, step_loss=0.0584]
Steps: 7%|▋ | 143/2000 [03:25<24:18, 1.27it/s, lr=0.0001, step_loss=0.00766]
Steps: 7%|▋ | 144/2000 [03:25<24:08, 1.28it/s, lr=0.0001, step_loss=0.0398]
Steps: 7%|▋ | 145/2000 [03:26<24:02, 1.29it/s, lr=0.0001, step_loss=0.000945]
Steps: 7%|▋ | 146/2000 [03:27<23:57, 1.29it/s, lr=0.0001, step_loss=0.00776]
Steps: 7%|▋ | 147/2000 [03:28<23:53, 1.29it/s, lr=0.0001, step_loss=0.00103]
Steps: 7%|▋ | 148/2000 [03:29<23:49, 1.30it/s, lr=0.0001, step_loss=0.0107]
Steps: 7%|▋ | 149/2000 [03:29<23:48, 1.30it/s, lr=0.0001, step_loss=0.4]
Steps: 8%|▊ | 150/2000 [03:30<23:48, 1.30it/s, lr=0.0001, step_loss=0.00538]
Steps: 8%|▊ | 151/2000 [03:31<23:45, 1.30it/s, lr=0.0001, step_loss=0.00349]
Steps: 8%|▊ | 152/2000 [03:32<23:44, 1.30it/s, lr=0.0001, step_loss=0.136]
Steps: 8%|▊ | 153/2000 [03:32<23:44, 1.30it/s, lr=0.0001, step_loss=0.0492]
Steps: 8%|▊ | 154/2000 [03:33<23:42, 1.30it/s, lr=0.0001, step_loss=0.00119]
Steps: 8%|▊ | 155/2000 [03:34<23:40, 1.30it/s, lr=0.0001, step_loss=0.00142]
Steps: 8%|▊ | 156/2000 [03:35<23:39, 1.30it/s, lr=0.0001, step_loss=0.0662]
Steps: 8%|▊ | 157/2000 [03:35<23:37, 1.30it/s, lr=0.0001, step_loss=0.00608]
Steps: 8%|▊ | 158/2000 [03:36<23:38, 1.30it/s, lr=0.0001, step_loss=0.101]
Steps: 8%|▊ | 159/2000 [03:37<23:37, 1.30it/s, lr=0.0001, step_loss=0.000848]
Steps: 8%|▊ | 160/2000 [03:38<23:38, 1.30it/s, lr=0.0001, step_loss=0.000848]
11/14/2025 06:12:25 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 160)
Steps: 8%|▊ | 160/2000 [03:45<23:38, 1.30it/s, lr=0.0001, step_loss=0.0067]
11/14/2025 06:12:25 - INFO - root - ### DEBUG: Finished epoch 4, epoch_steps=32, global_step=160
11/14/2025 06:12:25 - INFO - root - ### DEBUG: Starting epoch 5/63, global_step=160, max_train_steps=2000
Steps: 8%|▊ | 161/2000 [03:46<1:33:36, 3.05s/it, lr=0.0001, step_loss=0.0067]Steps: 8%|▊ | 161/2000 [03:46<1:33:36, 3.05s/it, lr=0.0001, step_loss=0.00727]Steps: 8%|▊ | 162/2000 [03:47<1:12:34, 2.37s/it, lr=0.0001, step_loss=0.00727]Steps: 8%|▊ | 162/2000 [03:47<1:12:34, 2.37s/it, lr=0.0001, step_loss=0.035] Steps: 8%|▊ | 163/2000 [03:48<57:50, 1.89s/it, lr=0.0001, step_loss=0.035] Steps: 8%|▊ | 163/2000 [03:48<57:50, 1.89s/it, lr=0.0001, step_loss=0.00426]Steps: 8%|▊ | 164/2000 [03:48<47:32, 1.55s/it, lr=0.0001, step_loss=0.00426]Steps: 8%|▊ | 164/2000 [03:48<47:32, 1.55s/it, lr=0.0001, step_loss=0.0434] Steps: 8%|▊ | 165/2000 [03:49<40:17, 1.32s/it, lr=0.0001, step_loss=0.0434]Steps: 8%|▊ | 165/2000 [03:49<40:17, 1.32s/it, lr=0.0001, step_loss=0.00402]Steps: 8%|▊ | 166/2000 [03:50<35:16, 1.15s/it, lr=0.0001, step_loss=0.00402]Steps: 8%|▊ | 166/2000 [03:50<35:16, 1.15s/it, lr=0.0001, step_loss=0.0764] Steps: 8%|▊ | 167/2000 [03:51<32:31, 1.06s/it, lr=0.0001, step_loss=0.0764]Steps: 8%|▊ | 167/2000 [03:51<32:31, 1.06s/it, lr=0.0001, step_loss=0.0179]Steps: 8%|▊ | 168/2000 [03:52<29:47, 1.02it/s, lr=0.0001, step_loss=0.0179]Steps: 8%|▊ | 168/2000 [03:52<29:47, 1.02it/s, lr=0.0001, step_loss=0.058] Steps: 8%|▊ | 169/2000 [03:52<27:53, 1.09it/s, lr=0.0001, step_loss=0.058]Steps: 8%|▊ | 169/2000 [03:52<27:53, 1.09it/s, lr=0.0001, step_loss=0.00373]Steps: 8%|▊ | 170/2000 [03:53<26:33, 1.15it/s, lr=0.0001, step_loss=0.00373]Steps: 8%|▊ | 170/2000 [03:53<26:33, 1.15it/s, lr=0.0001, step_loss=0.0344] Steps: 9%|▊ | 171/2000 [03:54<25:37, 1.19it/s, lr=0.0001, step_loss=0.0344]Steps: 9%|▊ | 171/2000 [03:54<25:37, 1.19it/s, lr=0.0001, step_loss=0.00058]Steps: 9%|▊ | 172/2000 [03:55<24:56, 1.22it/s, lr=0.0001, step_loss=0.00058]Steps: 9%|▊ | 172/2000 [03:55<24:56, 1.22it/s, lr=0.0001, step_loss=0.0546] Steps: 9%|▊ | 173/2000 [03:55<24:27, 1.24it/s, lr=0.0001, step_loss=0.0546]Steps: 9%|▊ | 173/2000 [03:55<24:27, 1.24it/s, lr=0.0001, step_loss=0.0322]Steps: 9%|▊ | 174/2000 
[03:56<24:08, 1.26it/s, lr=0.0001, step_loss=0.0322]Steps: 9%|▊ | 174/2000 [03:56<24:08, 1.26it/s, lr=0.0001, step_loss=0.00828]Steps: 9%|▉ | 175/2000 [03:57<23:57, 1.27it/s, lr=0.0001, step_loss=0.00828]Steps: 9%|▉ | 175/2000 [03:57<23:57, 1.27it/s, lr=0.0001, step_loss=0.0227] Steps: 9%|▉ | 176/2000 [03:58<23:48, 1.28it/s, lr=0.0001, step_loss=0.0227]Steps: 9%|▉ | 176/2000 [03:58<23:48, 1.28it/s, lr=0.0001, step_loss=0.0101]Steps: 9%|▉ | 177/2000 [03:59<23:40, 1.28it/s, lr=0.0001, step_loss=0.0101]Steps: 9%|▉ | 177/2000 [03:59<23:40, 1.28it/s, lr=0.0001, step_loss=0.00308]Steps: 9%|▉ | 178/2000 [03:59<23:37, 1.29it/s, lr=0.0001, step_loss=0.00308]Steps: 9%|▉ | 178/2000 [03:59<23:37, 1.29it/s, lr=0.0001, step_loss=0.055] Steps: 9%|▉ | 179/2000 [04:00<23:34, 1.29it/s, lr=0.0001, step_loss=0.055]Steps: 9%|▉ | 179/2000 [04:00<23:34, 1.29it/s, lr=0.0001, step_loss=0.0765]Steps: 9%|▉ | 180/2000 [04:01<23:30, 1.29it/s, lr=0.0001, step_loss=0.0765]Steps: 9%|▉ | 180/2000 [04:01<23:30, 1.29it/s, lr=0.0001, step_loss=0.0188]Steps: 9%|▉ | 181/2000 [04:02<23:26, 1.29it/s, lr=0.0001, step_loss=0.0188]Steps: 9%|▉ | 181/2000 [04:02<23:26, 1.29it/s, lr=0.0001, step_loss=0.114] Steps: 9%|▉ | 182/2000 [04:02<23:23, 1.30it/s, lr=0.0001, step_loss=0.114]Steps: 9%|▉ | 182/2000 [04:02<23:23, 1.30it/s, lr=0.0001, step_loss=0.051]Steps: 9%|▉ | 183/2000 [04:03<23:20, 1.30it/s, lr=0.0001, step_loss=0.051]Steps: 9%|▉ | 183/2000 [04:03<23:20, 1.30it/s, lr=0.0001, step_loss=0.000768]Steps: 9%|▉ | 184/2000 [04:04<23:19, 1.30it/s, lr=0.0001, step_loss=0.000768]Steps: 9%|▉ | 184/2000 [04:04<23:19, 1.30it/s, lr=0.0001, step_loss=0.142] Steps: 9%|▉ | 185/2000 [04:05<23:17, 1.30it/s, lr=0.0001, step_loss=0.142]Steps: 9%|▉ | 185/2000 [04:05<23:17, 1.30it/s, lr=0.0001, step_loss=0.125]Steps: 9%|▉ | 186/2000 [04:05<23:16, 1.30it/s, lr=0.0001, step_loss=0.125]Steps: 9%|▉ | 186/2000 [04:05<23:16, 1.30it/s, lr=0.0001, step_loss=0.00197]Steps: 9%|▉ | 187/2000 [04:06<23:16, 1.30it/s, lr=0.0001, 
step_loss=0.00197]Steps: 9%|▉ | 187/2000 [04:06<23:16, 1.30it/s, lr=0.0001, step_loss=0.0097] Steps: 9%|▉ | 188/2000 [04:07<23:14, 1.30it/s, lr=0.0001, step_loss=0.0097]Steps: 9%|▉ | 188/2000 [04:07<23:14, 1.30it/s, lr=0.0001, step_loss=0.163] Steps: 9%|▉ | 189/2000 [04:08<23:13, 1.30it/s, lr=0.0001, step_loss=0.163]Steps: 9%|▉ | 189/2000 [04:08<23:13, 1.30it/s, lr=0.0001, step_loss=0.0211]Steps: 10%|▉ | 190/2000 [04:09<23:13, 1.30it/s, lr=0.0001, step_loss=0.0211]Steps: 10%|▉ | 190/2000 [04:09<23:13, 1.30it/s, lr=0.0001, step_loss=0.0711]Steps: 10%|▉ | 191/2000 [04:09<23:11, 1.30it/s, lr=0.0001, step_loss=0.0711]Steps: 10%|▉ | 191/2000 [04:09<23:11, 1.30it/s, lr=0.0001, step_loss=0.00813]Steps: 10%|▉ | 192/2000 [04:10<23:10, 1.30it/s, lr=0.0001, step_loss=0.00813]11/14/2025 06:12:58 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 192)
Steps: 10%|▉ | 192/2000 [04:18<23:10, 1.30it/s, lr=0.0001, step_loss=0.119]
11/14/2025 06:12:58 - INFO - root - ### DEBUG: Finished epoch 5, epoch_steps=32, global_step=192
11/14/2025 06:12:58 - INFO - root - ### DEBUG: Starting epoch 6/63, global_step=192, max_train_steps=2000
Steps: 10%|▉ | 193/2000 [04:19<1:35:40, 3.18s/it, lr=0.0001, step_loss=0.119]Steps: 10%|▉ | 193/2000 [04:19<1:35:40, 3.18s/it, lr=0.0001, step_loss=0.000375]Steps: 10%|▉ | 194/2000 [04:20<1:13:52, 2.45s/it, lr=0.0001, step_loss=0.000375]Steps: 10%|▉ | 194/2000 [04:20<1:13:52, 2.45s/it, lr=0.0001, step_loss=0.00181] Steps: 10%|▉ | 195/2000 [04:20<58:37, 1.95s/it, lr=0.0001, step_loss=0.00181] Steps: 10%|▉ | 195/2000 [04:20<58:37, 1.95s/it, lr=0.0001, step_loss=0.172] Steps: 10%|▉ | 196/2000 [04:21<47:56, 1.59s/it, lr=0.0001, step_loss=0.172]Steps: 10%|▉ | 196/2000 [04:21<47:56, 1.59s/it, lr=0.0001, step_loss=0.00809]Steps: 10%|▉ | 197/2000 [04:22<40:28, 1.35s/it, lr=0.0001, step_loss=0.00809]Steps: 10%|▉ | 197/2000 [04:22<40:28, 1.35s/it, lr=0.0001, step_loss=0.0143] Steps: 10%|▉ | 198/2000 [04:23<35:14, 1.17s/it, lr=0.0001, step_loss=0.0143]Steps: 10%|▉ | 198/2000 [04:23<35:14, 1.17s/it, lr=0.0001, step_loss=0.0173]Steps: 10%|▉ | 199/2000 [04:23<31:36, 1.05s/it, lr=0.0001, step_loss=0.0173]Steps: 10%|▉ | 199/2000 [04:24<31:36, 1.05s/it, lr=0.0001, step_loss=0.00897]Steps: 10%|█ | 200/2000 [04:24<29:02, 1.03it/s, lr=0.0001, step_loss=0.00897]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.20it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.17it/s]
11/14/2025 06:13:36 - INFO - root - Saved samples to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/samples/sample-200.gif
Steps: 10%|█ | 200/2000 [04:56<29:02, 1.03it/s, lr=0.0001, step_loss=0.0101] Steps: 10%|█ | 201/2000 [04:57<5:16:13, 10.55s/it, lr=0.0001, step_loss=0.0101]Steps: 10%|█ | 201/2000 [04:57<5:16:13, 10.55s/it, lr=0.0001, step_loss=0.00372]Steps: 10%|█ | 202/2000 [04:58<3:48:04, 7.61s/it, lr=0.0001, step_loss=0.00372]Steps: 10%|█ | 202/2000 [04:58<3:48:04, 7.61s/it, lr=0.0001, step_loss=0.000773]Steps: 10%|█ | 203/2000 [04:59<2:46:23, 5.56s/it, lr=0.0001, step_loss=0.000773]Steps: 10%|█ | 203/2000 [04:59<2:46:23, 5.56s/it, lr=0.0001, step_loss=0.00541] Steps: 10%|█ | 204/2000 [04:59<2:03:14, 4.12s/it, lr=0.0001, step_loss=0.00541]Steps: 10%|█ | 204/2000 [04:59<2:03:14, 4.12s/it, lr=0.0001, step_loss=0.00551]Steps: 10%|█ | 205/2000 [05:00<1:33:03, 3.11s/it, lr=0.0001, step_loss=0.00551]Steps: 10%|█ | 205/2000 [05:00<1:33:03, 3.11s/it, lr=0.0001, step_loss=0.000453]Steps: 10%|█ | 206/2000 [05:01<1:11:57, 2.41s/it, lr=0.0001, step_loss=0.000453]Steps: 10%|█ | 206/2000 [05:01<1:11:57, 2.41s/it, lr=0.0001, step_loss=0.334] Steps: 10%|█ | 207/2000 [05:02<57:10, 1.91s/it, lr=0.0001, step_loss=0.334] Steps: 10%|█ | 207/2000 [05:02<57:10, 1.91s/it, lr=0.0001, step_loss=0.0202]Steps: 10%|█ | 208/2000 [05:02<46:50, 1.57s/it, lr=0.0001, step_loss=0.0202]Steps: 10%|█ | 208/2000 [05:03<46:50, 1.57s/it, lr=0.0001, step_loss=0.00784]Steps: 10%|█ | 209/2000 [05:03<39:35, 1.33s/it, lr=0.0001, step_loss=0.00784]Steps: 10%|█ | 209/2000 [05:03<39:35, 1.33s/it, lr=0.0001, step_loss=0.0778] Steps: 10%|█ | 210/2000 [05:04<34:31, 1.16s/it, lr=0.0001, step_loss=0.0778]Steps: 10%|█ | 210/2000 [05:04<34:31, 1.16s/it, lr=0.0001, step_loss=0.0253]Steps: 11%|█ | 211/2000 [05:05<30:57, 1.04s/it, lr=0.0001, step_loss=0.0253]Steps: 11%|█ | 211/2000 [05:05<30:57, 1.04s/it, lr=0.0001, step_loss=0.195] Steps: 11%|█ | 212/2000 [05:06<28:27, 1.05it/s, lr=0.0001, step_loss=0.195]Steps: 11%|█ | 212/2000 [05:06<28:27, 1.05it/s, lr=0.0001, step_loss=0.102]Steps: 11%|█ | 213/2000 [05:06<26:43, 1.11it/s, 
lr=0.0001, step_loss=0.102]Steps: 11%|█ | 213/2000 [05:06<26:43, 1.11it/s, lr=0.0001, step_loss=0.001]Steps: 11%|█ | 214/2000 [05:07<25:31, 1.17it/s, lr=0.0001, step_loss=0.001]Steps: 11%|█ | 214/2000 [05:07<25:31, 1.17it/s, lr=0.0001, step_loss=0.005]Steps: 11%|█ | 215/2000 [05:08<24:38, 1.21it/s, lr=0.0001, step_loss=0.005]Steps: 11%|█ | 215/2000 [05:08<24:38, 1.21it/s, lr=0.0001, step_loss=0.238]Steps: 11%|█ | 216/2000 [05:09<24:02, 1.24it/s, lr=0.0001, step_loss=0.238]Steps: 11%|█ | 216/2000 [05:09<24:02, 1.24it/s, lr=0.0001, step_loss=0.00114]Steps: 11%|█ | 217/2000 [05:09<23:36, 1.26it/s, lr=0.0001, step_loss=0.00114]Steps: 11%|█ | 217/2000 [05:09<23:36, 1.26it/s, lr=0.0001, step_loss=0.0169] Steps: 11%|█ | 218/2000 [05:10<23:18, 1.27it/s, lr=0.0001, step_loss=0.0169]Steps: 11%|█ | 218/2000 [05:10<23:18, 1.27it/s, lr=0.0001, step_loss=0.00102]Steps: 11%|█ | 219/2000 [05:11<23:07, 1.28it/s, lr=0.0001, step_loss=0.00102]Steps: 11%|█ | 219/2000 [05:11<23:07, 1.28it/s, lr=0.0001, step_loss=0.173] Steps: 11%|█ | 220/2000 [05:12<23:00, 1.29it/s, lr=0.0001, step_loss=0.173]Steps: 11%|█ | 220/2000 [05:12<23:00, 1.29it/s, lr=0.0001, step_loss=0.0018]Steps: 11%|█ | 221/2000 [05:12<22:55, 1.29it/s, lr=0.0001, step_loss=0.0018]Steps: 11%|█ | 221/2000 [05:12<22:55, 1.29it/s, lr=0.0001, step_loss=0.0603]Steps: 11%|█ | 222/2000 [05:13<22:48, 1.30it/s, lr=0.0001, step_loss=0.0603]Steps: 11%|█ | 222/2000 [05:13<22:48, 1.30it/s, lr=0.0001, step_loss=0.1] Steps: 11%|█ | 223/2000 [05:14<22:46, 1.30it/s, lr=0.0001, step_loss=0.1]Steps: 11%|█ | 223/2000 [05:14<22:46, 1.30it/s, lr=0.0001, step_loss=0.0537]Steps: 11%|█ | 224/2000 [05:15<22:43, 1.30it/s, lr=0.0001, step_loss=0.0537]11/14/2025 06:14:02 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 224)
Steps: 11%|█ | 224/2000 [05:23<22:43, 1.30it/s, lr=0.0001, step_loss=0.0169]
11/14/2025 06:14:02 - INFO - root - ### DEBUG: Finished epoch 6, epoch_steps=32, global_step=224
11/14/2025 06:14:02 - INFO - root - ### DEBUG: Starting epoch 7/63, global_step=224, max_train_steps=2000
Steps: 11%|█▏ | 225/2000 [05:23<1:33:26, 3.16s/it, lr=0.0001, step_loss=0.0169]Steps: 11%|█▏ | 225/2000 [05:23<1:33:26, 3.16s/it, lr=0.0001, step_loss=0.00117]Steps: 11%|█▏ | 226/2000 [05:24<1:12:08, 2.44s/it, lr=0.0001, step_loss=0.00117]Steps: 11%|█▏ | 226/2000 [05:24<1:12:08, 2.44s/it, lr=0.0001, step_loss=0.0495] Steps: 11%|█▏ | 227/2000 [05:25<57:13, 1.94s/it, lr=0.0001, step_loss=0.0495] Steps: 11%|█▏ | 227/2000 [05:25<57:13, 1.94s/it, lr=0.0001, step_loss=0.0457]Steps: 11%|█▏ | 228/2000 [05:26<46:47, 1.58s/it, lr=0.0001, step_loss=0.0457]Steps: 11%|█▏ | 228/2000 [05:26<46:47, 1.58s/it, lr=0.0001, step_loss=0.0336]Steps: 11%|█▏ | 229/2000 [05:26<39:28, 1.34s/it, lr=0.0001, step_loss=0.0336]Steps: 11%|█▏ | 229/2000 [05:27<39:28, 1.34s/it, lr=0.0001, step_loss=0.00347]Steps: 12%|█▏ | 230/2000 [05:27<34:22, 1.17s/it, lr=0.0001, step_loss=0.00347]Steps: 12%|█▏ | 230/2000 [05:27<34:22, 1.17s/it, lr=0.0001, step_loss=0.192] Steps: 12%|█▏ | 231/2000 [05:28<30:46, 1.04s/it, lr=0.0001, step_loss=0.192]Steps: 12%|█▏ | 231/2000 [05:28<30:46, 1.04s/it, lr=0.0001, step_loss=0.0123]Steps: 12%|█▏ | 232/2000 [05:29<28:16, 1.04it/s, lr=0.0001, step_loss=0.0123]Steps: 12%|█▏ | 232/2000 [05:29<28:16, 1.04it/s, lr=0.0001, step_loss=0.00302]Steps: 12%|█▏ | 233/2000 [05:30<26:31, 1.11it/s, lr=0.0001, step_loss=0.00302]Steps: 12%|█▏ | 233/2000 [05:30<26:31, 1.11it/s, lr=0.0001, step_loss=0.00346]Steps: 12%|█▏ | 234/2000 [05:30<25:17, 1.16it/s, lr=0.0001, step_loss=0.00346]Steps: 12%|█▏ | 234/2000 [05:30<25:17, 1.16it/s, lr=0.0001, step_loss=0.096] Steps: 12%|█▏ | 235/2000 [05:31<24:25, 1.20it/s, lr=0.0001, step_loss=0.096]Steps: 12%|█▏ | 235/2000 [05:31<24:25, 1.20it/s, lr=0.0001, step_loss=0.0583]Steps: 12%|█▏ | 236/2000 [05:32<23:48, 1.23it/s, lr=0.0001, step_loss=0.0583]Steps: 12%|█▏ | 236/2000 [05:32<23:48, 1.23it/s, lr=0.0001, step_loss=0.00168]Steps: 12%|█▏ | 237/2000 [05:33<23:23, 1.26it/s, lr=0.0001, step_loss=0.00168]Steps: 12%|█▏ | 237/2000 [05:33<23:23, 1.26it/s, 
lr=0.0001, step_loss=0.0579] Steps: 12%|█▏ | 238/2000 [05:33<23:05, 1.27it/s, lr=0.0001, step_loss=0.0579]Steps: 12%|█▏ | 238/2000 [05:33<23:05, 1.27it/s, lr=0.0001, step_loss=0.135] Steps: 12%|█▏ | 239/2000 [05:34<22:53, 1.28it/s, lr=0.0001, step_loss=0.135]Steps: 12%|█▏ | 239/2000 [05:34<22:53, 1.28it/s, lr=0.0001, step_loss=0.0825]Steps: 12%|█▏ | 240/2000 [05:35<22:42, 1.29it/s, lr=0.0001, step_loss=0.0825]Steps: 12%|█▏ | 240/2000 [05:35<22:42, 1.29it/s, lr=0.0001, step_loss=0.00814]Steps: 12%|█▏ | 241/2000 [05:36<22:34, 1.30it/s, lr=0.0001, step_loss=0.00814]Steps: 12%|█▏ | 241/2000 [05:36<22:34, 1.30it/s, lr=0.0001, step_loss=0.0239] Steps: 12%|█▏ | 242/2000 [05:36<22:29, 1.30it/s, lr=0.0001, step_loss=0.0239]Steps: 12%|█▏ | 242/2000 [05:36<22:29, 1.30it/s, lr=0.0001, step_loss=0.157] Steps: 12%|█▏ | 243/2000 [05:37<22:24, 1.31it/s, lr=0.0001, step_loss=0.157]Steps: 12%|█▏ | 243/2000 [05:37<22:24, 1.31it/s, lr=0.0001, step_loss=0.00278]Steps: 12%|█▏ | 244/2000 [05:38<22:22, 1.31it/s, lr=0.0001, step_loss=0.00278]Steps: 12%|█▏ | 244/2000 [05:38<22:22, 1.31it/s, lr=0.0001, step_loss=0.00487]Steps: 12%|█▏ | 245/2000 [05:39<22:21, 1.31it/s, lr=0.0001, step_loss=0.00487]Steps: 12%|█▏ | 245/2000 [05:39<22:21, 1.31it/s, lr=0.0001, step_loss=0.00122]Steps: 12%|█▏ | 246/2000 [05:39<22:18, 1.31it/s, lr=0.0001, step_loss=0.00122]Steps: 12%|█▏ | 246/2000 [05:39<22:18, 1.31it/s, lr=0.0001, step_loss=0.246] Steps: 12%|█▏ | 247/2000 [05:40<22:16, 1.31it/s, lr=0.0001, step_loss=0.246]Steps: 12%|█▏ | 247/2000 [05:40<22:16, 1.31it/s, lr=0.0001, step_loss=0.083]Steps: 12%|█▏ | 248/2000 [05:41<22:16, 1.31it/s, lr=0.0001, step_loss=0.083]Steps: 12%|█▏ | 248/2000 [05:41<22:16, 1.31it/s, lr=0.0001, step_loss=0.0286]Steps: 12%|█▏ | 249/2000 [05:42<22:15, 1.31it/s, lr=0.0001, step_loss=0.0286]Steps: 12%|█▏ | 249/2000 [05:42<22:15, 1.31it/s, lr=0.0001, step_loss=0.0292]Steps: 12%|█▎ | 250/2000 [05:43<22:14, 1.31it/s, lr=0.0001, step_loss=0.0292]Steps: 12%|█▎ | 250/2000 [05:43<22:14, 
1.31it/s, lr=0.0001, step_loss=0.0443]Steps: 13%|█▎ | 251/2000 [05:43<22:14, 1.31it/s, lr=0.0001, step_loss=0.0443]Steps: 13%|█▎ | 251/2000 [05:43<22:14, 1.31it/s, lr=0.0001, step_loss=0.0203]Steps: 13%|█▎ | 252/2000 [05:44<22:12, 1.31it/s, lr=0.0001, step_loss=0.0203]Steps: 13%|█▎ | 252/2000 [05:44<22:12, 1.31it/s, lr=0.0001, step_loss=0.000595]Steps: 13%|█▎ | 253/2000 [05:45<22:10, 1.31it/s, lr=0.0001, step_loss=0.000595]Steps: 13%|█▎ | 253/2000 [05:45<22:10, 1.31it/s, lr=0.0001, step_loss=0.00148] Steps: 13%|█▎ | 254/2000 [05:46<22:10, 1.31it/s, lr=0.0001, step_loss=0.00148]Steps: 13%|█▎ | 254/2000 [05:46<22:10, 1.31it/s, lr=0.0001, step_loss=0.0723] Steps: 13%|█▎ | 255/2000 [05:46<22:09, 1.31it/s, lr=0.0001, step_loss=0.0723]Steps: 13%|█▎ | 255/2000 [05:46<22:09, 1.31it/s, lr=0.0001, step_loss=0.0751]Steps: 13%|█▎ | 256/2000 [05:47<22:08, 1.31it/s, lr=0.0001, step_loss=0.0751]11/14/2025 06:14:35 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 256)
Steps: 13%|█▎ | 256/2000 [05:55<22:08, 1.31it/s, lr=0.0001, step_loss=0.0332]
11/14/2025 06:14:35 - INFO - root - ### DEBUG: Finished epoch 7, epoch_steps=32, global_step=256
11/14/2025 06:14:35 - INFO - root - ### DEBUG: Starting epoch 8/63, global_step=256, max_train_steps=2000
Steps: 13%|█▎ | 257/2000 [05:56<1:31:44, 3.16s/it, lr=0.0001, step_loss=0.0332]Steps: 13%|█▎ | 257/2000 [05:56<1:31:44, 3.16s/it, lr=0.0001, step_loss=0.00349]Steps: 13%|█▎ | 258/2000 [05:57<1:10:48, 2.44s/it, lr=0.0001, step_loss=0.00349]Steps: 13%|█▎ | 258/2000 [05:57<1:10:48, 2.44s/it, lr=0.0001, step_loss=0.00125]Steps: 13%|█▎ | 259/2000 [05:57<56:09, 1.94s/it, lr=0.0001, step_loss=0.00125] Steps: 13%|█▎ | 259/2000 [05:57<56:09, 1.94s/it, lr=0.0001, step_loss=0.0122] Steps: 13%|█▎ | 260/2000 [05:58<45:55, 1.58s/it, lr=0.0001, step_loss=0.0122]Steps: 13%|█▎ | 260/2000 [05:58<45:55, 1.58s/it, lr=0.0001, step_loss=0.000564]Steps: 13%|█▎ | 261/2000 [05:59<38:45, 1.34s/it, lr=0.0001, step_loss=0.000564]Steps: 13%|█▎ | 261/2000 [05:59<38:45, 1.34s/it, lr=0.0001, step_loss=0.0537] Steps: 13%|█▎ | 262/2000 [06:00<33:44, 1.17s/it, lr=0.0001, step_loss=0.0537]Steps: 13%|█▎ | 262/2000 [06:00<33:44, 1.17s/it, lr=0.0001, step_loss=0.00126]Steps: 13%|█▎ | 263/2000 [06:00<30:13, 1.04s/it, lr=0.0001, step_loss=0.00126]Steps: 13%|█▎ | 263/2000 [06:00<30:13, 1.04s/it, lr=0.0001, step_loss=0.00137]Steps: 13%|█▎ | 264/2000 [06:01<27:44, 1.04it/s, lr=0.0001, step_loss=0.00137]Steps: 13%|█▎ | 264/2000 [06:01<27:44, 1.04it/s, lr=0.0001, step_loss=0.0768] Steps: 13%|█▎ | 265/2000 [06:02<26:00, 1.11it/s, lr=0.0001, step_loss=0.0768]Steps: 13%|█▎ | 265/2000 [06:02<26:00, 1.11it/s, lr=0.0001, step_loss=0.000693]Steps: 13%|█▎ | 266/2000 [06:03<24:48, 1.16it/s, lr=0.0001, step_loss=0.000693]Steps: 13%|█▎ | 266/2000 [06:03<24:48, 1.16it/s, lr=0.0001, step_loss=0.00216] Steps: 13%|█▎ | 267/2000 [06:03<23:56, 1.21it/s, lr=0.0001, step_loss=0.00216]Steps: 13%|█▎ | 267/2000 [06:03<23:56, 1.21it/s, lr=0.0001, step_loss=0.00327]Steps: 13%|█▎ | 268/2000 [06:04<23:21, 1.24it/s, lr=0.0001, step_loss=0.00327]Steps: 13%|█▎ | 268/2000 [06:04<23:21, 1.24it/s, lr=0.0001, step_loss=0.00465]Steps: 13%|█▎ | 269/2000 [06:05<22:56, 1.26it/s, lr=0.0001, step_loss=0.00465]Steps: 13%|█▎ | 269/2000 [06:05<22:56, 
1.26it/s, lr=0.0001, step_loss=0.0109] Steps: 14%|█▎ | 270/2000 [06:06<22:37, 1.27it/s, lr=0.0001, step_loss=0.0109]Steps: 14%|█▎ | 270/2000 [06:06<22:37, 1.27it/s, lr=0.0001, step_loss=0.0126]Steps: 14%|█▎ | 271/2000 [06:06<22:24, 1.29it/s, lr=0.0001, step_loss=0.0126]Steps: 14%|█▎ | 271/2000 [06:07<22:24, 1.29it/s, lr=0.0001, step_loss=0.117] Steps: 14%|█▎ | 272/2000 [06:07<22:16, 1.29it/s, lr=0.0001, step_loss=0.117]Steps: 14%|█▎ | 272/2000 [06:07<22:16, 1.29it/s, lr=0.0001, step_loss=0.00134]Steps: 14%|█▎ | 273/2000 [06:08<22:09, 1.30it/s, lr=0.0001, step_loss=0.00134]Steps: 14%|█▎ | 273/2000 [06:08<22:09, 1.30it/s, lr=0.0001, step_loss=0.0152] Steps: 14%|█▎ | 274/2000 [06:09<22:03, 1.30it/s, lr=0.0001, step_loss=0.0152]Steps: 14%|█▎ | 274/2000 [06:09<22:03, 1.30it/s, lr=0.0001, step_loss=0.0382]Steps: 14%|█▍ | 275/2000 [06:10<22:00, 1.31it/s, lr=0.0001, step_loss=0.0382]Steps: 14%|█▍ | 275/2000 [06:10<22:00, 1.31it/s, lr=0.0001, step_loss=0.127] Steps: 14%|█▍ | 276/2000 [06:10<21:58, 1.31it/s, lr=0.0001, step_loss=0.127]Steps: 14%|█▍ | 276/2000 [06:10<21:58, 1.31it/s, lr=0.0001, step_loss=0.00171]Steps: 14%|█▍ | 277/2000 [06:11<21:56, 1.31it/s, lr=0.0001, step_loss=0.00171]Steps: 14%|█▍ | 277/2000 [06:11<21:56, 1.31it/s, lr=0.0001, step_loss=0.0252] Steps: 14%|█▍ | 278/2000 [06:12<21:54, 1.31it/s, lr=0.0001, step_loss=0.0252]Steps: 14%|█▍ | 278/2000 [06:12<21:54, 1.31it/s, lr=0.0001, step_loss=0.0162]Steps: 14%|█▍ | 279/2000 [06:13<21:53, 1.31it/s, lr=0.0001, step_loss=0.0162]Steps: 14%|█▍ | 279/2000 [06:13<21:53, 1.31it/s, lr=0.0001, step_loss=0.012] Steps: 14%|█▍ | 280/2000 [06:13<21:51, 1.31it/s, lr=0.0001, step_loss=0.012]Steps: 14%|█▍ | 280/2000 [06:13<21:51, 1.31it/s, lr=0.0001, step_loss=0.00593]Steps: 14%|█▍ | 281/2000 [06:14<21:50, 1.31it/s, lr=0.0001, step_loss=0.00593]Steps: 14%|█▍ | 281/2000 [06:14<21:50, 1.31it/s, lr=0.0001, step_loss=0.0669] Steps: 14%|█▍ | 282/2000 [06:15<21:50, 1.31it/s, lr=0.0001, step_loss=0.0669]Steps: 14%|█▍ | 282/2000 
[06:15<21:50, 1.31it/s, lr=0.0001, step_loss=0.000589]Steps: 14%|█▍ | 283/2000 [06:16<21:48, 1.31it/s, lr=0.0001, step_loss=0.000589]Steps: 14%|█▍ | 283/2000 [06:16<21:48, 1.31it/s, lr=0.0001, step_loss=0.00142] Steps: 14%|█▍ | 284/2000 [06:16<21:46, 1.31it/s, lr=0.0001, step_loss=0.00142]Steps: 14%|█▍ | 284/2000 [06:16<21:46, 1.31it/s, lr=0.0001, step_loss=0.00052]Steps: 14%|█▍ | 285/2000 [06:17<21:45, 1.31it/s, lr=0.0001, step_loss=0.00052]Steps: 14%|█▍ | 285/2000 [06:17<21:45, 1.31it/s, lr=0.0001, step_loss=0.0231] Steps: 14%|█▍ | 286/2000 [06:18<21:43, 1.31it/s, lr=0.0001, step_loss=0.0231]Steps: 14%|█▍ | 286/2000 [06:18<21:43, 1.31it/s, lr=0.0001, step_loss=0.00532]Steps: 14%|█▍ | 287/2000 [06:19<21:42, 1.31it/s, lr=0.0001, step_loss=0.00532]Steps: 14%|█▍ | 287/2000 [06:19<21:42, 1.31it/s, lr=0.0001, step_loss=0.0654] Steps: 14%|█▍ | 288/2000 [06:19<21:42, 1.31it/s, lr=0.0001, step_loss=0.0654]11/14/2025 06:15:07 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 288)
Steps: 14%|█▍ | 288/2000 [06:27<21:42, 1.31it/s, lr=0.0001, step_loss=0.00988]
11/14/2025 06:15:07 - INFO - root - ### DEBUG: Finished epoch 8, epoch_steps=32, global_step=288
11/14/2025 06:15:07 - INFO - root - ### DEBUG: Starting epoch 9/63, global_step=288, max_train_steps=2000
Steps: 14%|█▍ | 289/2000 [06:28<1:30:06, 3.16s/it, lr=0.0001, step_loss=0.00988]Steps: 14%|█▍ | 289/2000 [06:28<1:30:06, 3.16s/it, lr=0.0001, step_loss=0.00285]Steps: 14%|█▍ | 290/2000 [06:29<1:09:32, 2.44s/it, lr=0.0001, step_loss=0.00285]Steps: 14%|█▍ | 290/2000 [06:29<1:09:32, 2.44s/it, lr=0.0001, step_loss=0.0423] Steps: 15%|█▍ | 291/2000 [06:30<55:09, 1.94s/it, lr=0.0001, step_loss=0.0423] Steps: 15%|█▍ | 291/2000 [06:30<55:09, 1.94s/it, lr=0.0001, step_loss=0.00178]Steps: 15%|█▍ | 292/2000 [06:30<45:05, 1.58s/it, lr=0.0001, step_loss=0.00178]Steps: 15%|█▍ | 292/2000 [06:31<45:05, 1.58s/it, lr=0.0001, step_loss=0.0233] Steps: 15%|█▍ | 293/2000 [06:31<38:02, 1.34s/it, lr=0.0001, step_loss=0.0233]Steps: 15%|█▍ | 293/2000 [06:31<38:02, 1.34s/it, lr=0.0001, step_loss=0.000842]Steps: 15%|█▍ | 294/2000 [06:32<33:06, 1.16s/it, lr=0.0001, step_loss=0.000842]Steps: 15%|█▍ | 294/2000 [06:32<33:06, 1.16s/it, lr=0.0001, step_loss=0.155] Steps: 15%|█▍ | 295/2000 [06:33<29:38, 1.04s/it, lr=0.0001, step_loss=0.155]Steps: 15%|█▍ | 295/2000 [06:33<29:38, 1.04s/it, lr=0.0001, step_loss=0.0015]Steps: 15%|█▍ | 296/2000 [06:34<27:14, 1.04it/s, lr=0.0001, step_loss=0.0015]Steps: 15%|█▍ | 296/2000 [06:34<27:14, 1.04it/s, lr=0.0001, step_loss=0.0918]Steps: 15%|█▍ | 297/2000 [06:34<25:31, 1.11it/s, lr=0.0001, step_loss=0.0918]Steps: 15%|█▍ | 297/2000 [06:34<25:31, 1.11it/s, lr=0.0001, step_loss=0.343] Steps: 15%|█▍ | 298/2000 [06:35<24:19, 1.17it/s, lr=0.0001, step_loss=0.343]Steps: 15%|█▍ | 298/2000 [06:35<24:19, 1.17it/s, lr=0.0001, step_loss=0.195]Steps: 15%|█▍ | 299/2000 [06:36<23:30, 1.21it/s, lr=0.0001, step_loss=0.195]Steps: 15%|█▍ | 299/2000 [06:36<23:30, 1.21it/s, lr=0.0001, step_loss=0.105]Steps: 15%|█▌ | 300/2000 [06:37<22:54, 1.24it/s, lr=0.0001, step_loss=0.105]Steps: 15%|█▌ | 300/2000 [06:37<22:54, 1.24it/s, lr=0.0001, step_loss=0.219]Steps: 15%|█▌ | 301/2000 [06:37<22:29, 1.26it/s, lr=0.0001, step_loss=0.219]Steps: 15%|█▌ | 301/2000 [06:37<22:29, 1.26it/s, lr=0.0001, 
step_loss=0.0707]Steps: 15%|█▌ | 302/2000 [06:38<22:11, 1.28it/s, lr=0.0001, step_loss=0.0707]Steps: 15%|█▌ | 302/2000 [06:38<22:11, 1.28it/s, lr=0.0001, step_loss=0.00122]Steps: 15%|█▌ | 303/2000 [06:39<21:59, 1.29it/s, lr=0.0001, step_loss=0.00122]Steps: 15%|█▌ | 303/2000 [06:39<21:59, 1.29it/s, lr=0.0001, step_loss=0.0358] Steps: 15%|█▌ | 304/2000 [06:40<21:50, 1.29it/s, lr=0.0001, step_loss=0.0358]Steps: 15%|█▌ | 304/2000 [06:40<21:50, 1.29it/s, lr=0.0001, step_loss=0.00823]Steps: 15%|█▌ | 305/2000 [06:40<21:43, 1.30it/s, lr=0.0001, step_loss=0.00823]Steps: 15%|█▌ | 305/2000 [06:40<21:43, 1.30it/s, lr=0.0001, step_loss=0.000543]Steps: 15%|█▌ | 306/2000 [06:41<21:39, 1.30it/s, lr=0.0001, step_loss=0.000543]Steps: 15%|█▌ | 306/2000 [06:41<21:39, 1.30it/s, lr=0.0001, step_loss=0.0155] Steps: 15%|█▌ | 307/2000 [06:42<21:35, 1.31it/s, lr=0.0001, step_loss=0.0155]Steps: 15%|█▌ | 307/2000 [06:42<21:35, 1.31it/s, lr=0.0001, step_loss=0.00722]Steps: 15%|█▌ | 308/2000 [06:43<21:32, 1.31it/s, lr=0.0001, step_loss=0.00722]Steps: 15%|█▌ | 308/2000 [06:43<21:32, 1.31it/s, lr=0.0001, step_loss=0.00065]Steps: 15%|█▌ | 309/2000 [06:43<21:29, 1.31it/s, lr=0.0001, step_loss=0.00065]Steps: 15%|█▌ | 309/2000 [06:43<21:29, 1.31it/s, lr=0.0001, step_loss=0.111] Steps: 16%|█▌ | 310/2000 [06:44<21:27, 1.31it/s, lr=0.0001, step_loss=0.111]Steps: 16%|█▌ | 310/2000 [06:44<21:27, 1.31it/s, lr=0.0001, step_loss=0.00322]Steps: 16%|█▌ | 311/2000 [06:45<21:26, 1.31it/s, lr=0.0001, step_loss=0.00322]Steps: 16%|█▌ | 311/2000 [06:45<21:26, 1.31it/s, lr=0.0001, step_loss=0.251] Steps: 16%|█▌ | 312/2000 [06:46<21:24, 1.31it/s, lr=0.0001, step_loss=0.251]Steps: 16%|█▌ | 312/2000 [06:46<21:24, 1.31it/s, lr=0.0001, step_loss=0.348]Steps: 16%|█▌ | 313/2000 [06:46<21:23, 1.31it/s, lr=0.0001, step_loss=0.348]Steps: 16%|█▌ | 313/2000 [06:46<21:23, 1.31it/s, lr=0.0001, step_loss=0.000879]Steps: 16%|█▌ | 314/2000 [06:47<21:22, 1.31it/s, lr=0.0001, step_loss=0.000879]Steps: 16%|█▌ | 314/2000 [06:47<21:22, 
1.31it/s, lr=0.0001, step_loss=0.0441] Steps: 16%|█▌ | 315/2000 [06:48<21:22, 1.31it/s, lr=0.0001, step_loss=0.0441]Steps: 16%|█▌ | 315/2000 [06:48<21:22, 1.31it/s, lr=0.0001, step_loss=0.0115]Steps: 16%|█▌ | 316/2000 [06:49<21:21, 1.31it/s, lr=0.0001, step_loss=0.0115]Steps: 16%|█▌ | 316/2000 [06:49<21:21, 1.31it/s, lr=0.0001, step_loss=0.000778]Steps: 16%|█▌ | 317/2000 [06:49<21:20, 1.31it/s, lr=0.0001, step_loss=0.000778]Steps: 16%|█▌ | 317/2000 [06:50<21:20, 1.31it/s, lr=0.0001, step_loss=0.011] Steps: 16%|█▌ | 318/2000 [06:50<21:19, 1.31it/s, lr=0.0001, step_loss=0.011]Steps: 16%|█▌ | 318/2000 [06:50<21:19, 1.31it/s, lr=0.0001, step_loss=0.0101]Steps: 16%|█▌ | 319/2000 [06:51<21:18, 1.32it/s, lr=0.0001, step_loss=0.0101]Steps: 16%|█▌ | 319/2000 [06:51<21:18, 1.32it/s, lr=0.0001, step_loss=0.00544]Steps: 16%|█▌ | 320/2000 [06:52<21:17, 1.31it/s, lr=0.0001, step_loss=0.00544]11/14/2025 06:15:40 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 320)
Steps: 16%|█▌ | 320/2000 [07:00<21:17, 1.31it/s, lr=0.0001, step_loss=0.0609] 11/14/2025 06:15:40 - INFO - root - ### DEBUG: Finished epoch 9, epoch_steps=32, global_step=320
11/14/2025 06:15:40 - INFO - root - ### DEBUG: Starting epoch 10/63, global_step=320, max_train_steps=2000
Steps: 16%|█▌ | 321/2000 [07:01<1:30:38, 3.24s/it, lr=0.0001, step_loss=0.0609]Steps: 16%|█▌ | 321/2000 [07:01<1:30:38, 3.24s/it, lr=0.0001, step_loss=0.012] Steps: 16%|█▌ | 322/2000 [07:02<1:09:47, 2.50s/it, lr=0.0001, step_loss=0.012]Steps: 16%|█▌ | 322/2000 [07:02<1:09:47, 2.50s/it, lr=0.0001, step_loss=0.0234]Steps: 16%|█▌ | 323/2000 [07:02<55:12, 1.98s/it, lr=0.0001, step_loss=0.0234] Steps: 16%|█▌ | 323/2000 [07:02<55:12, 1.98s/it, lr=0.0001, step_loss=0.00354]Steps: 16%|█▌ | 324/2000 [07:03<45:00, 1.61s/it, lr=0.0001, step_loss=0.00354]Steps: 16%|█▌ | 324/2000 [07:03<45:00, 1.61s/it, lr=0.0001, step_loss=0.00112]Steps: 16%|█▋ | 325/2000 [07:04<37:52, 1.36s/it, lr=0.0001, step_loss=0.00112]Steps: 16%|█▋ | 325/2000 [07:04<37:52, 1.36s/it, lr=0.0001, step_loss=0.00101]Steps: 16%|█▋ | 326/2000 [07:05<32:51, 1.18s/it, lr=0.0001, step_loss=0.00101]Steps: 16%|█▋ | 326/2000 [07:05<32:51, 1.18s/it, lr=0.0001, step_loss=0.0209] Steps: 16%|█▋ | 327/2000 [07:05<29:20, 1.05s/it, lr=0.0001, step_loss=0.0209]Steps: 16%|█▋ | 327/2000 [07:05<29:20, 1.05s/it, lr=0.0001, step_loss=0.00083]Steps: 16%|█▋ | 328/2000 [07:06<26:53, 1.04it/s, lr=0.0001, step_loss=0.00083]Steps: 16%|█▋ | 328/2000 [07:06<26:53, 1.04it/s, lr=0.0001, step_loss=0.00156]Steps: 16%|█▋ | 329/2000 [07:07<25:10, 1.11it/s, lr=0.0001, step_loss=0.00156]Steps: 16%|█▋ | 329/2000 [07:07<25:10, 1.11it/s, lr=0.0001, step_loss=0.404] Steps: 16%|█▋ | 330/2000 [07:08<23:57, 1.16it/s, lr=0.0001, step_loss=0.404]Steps: 16%|█▋ | 330/2000 [07:08<23:57, 1.16it/s, lr=0.0001, step_loss=0.00049]Steps: 17%|█▋ | 331/2000 [07:08<23:06, 1.20it/s, lr=0.0001, step_loss=0.00049]Steps: 17%|█▋ | 331/2000 [07:08<23:06, 1.20it/s, lr=0.0001, step_loss=0.00178]Steps: 17%|█▋ | 332/2000 [07:09<22:31, 1.23it/s, lr=0.0001, step_loss=0.00178]Steps: 17%|█▋ | 332/2000 [07:09<22:31, 1.23it/s, lr=0.0001, step_loss=0.000638]Steps: 17%|█▋ | 333/2000 [07:10<22:07, 1.26it/s, lr=0.0001, step_loss=0.000638]Steps: 17%|█▋ | 333/2000 [07:10<22:07, 1.26it/s, 
lr=0.0001, step_loss=0.0316] Steps: 17%|█▋ | 334/2000 [07:11<21:49, 1.27it/s, lr=0.0001, step_loss=0.0316]Steps: 17%|█▋ | 334/2000 [07:11<21:49, 1.27it/s, lr=0.0001, step_loss=0.0014]Steps: 17%|█▋ | 335/2000 [07:11<21:35, 1.28it/s, lr=0.0001, step_loss=0.0014]Steps: 17%|█▋ | 335/2000 [07:11<21:35, 1.28it/s, lr=0.0001, step_loss=0.0887]Steps: 17%|█▋ | 336/2000 [07:12<21:25, 1.29it/s, lr=0.0001, step_loss=0.0887]Steps: 17%|█▋ | 336/2000 [07:12<21:25, 1.29it/s, lr=0.0001, step_loss=0.0167]Steps: 17%|█▋ | 337/2000 [07:13<21:20, 1.30it/s, lr=0.0001, step_loss=0.0167]Steps: 17%|█▋ | 337/2000 [07:13<21:20, 1.30it/s, lr=0.0001, step_loss=0.02] Steps: 17%|█▋ | 338/2000 [07:14<21:14, 1.30it/s, lr=0.0001, step_loss=0.02]Steps: 17%|█▋ | 338/2000 [07:14<21:14, 1.30it/s, lr=0.0001, step_loss=0.158]Steps: 17%|█▋ | 339/2000 [07:14<21:10, 1.31it/s, lr=0.0001, step_loss=0.158]Steps: 17%|█▋ | 339/2000 [07:15<21:10, 1.31it/s, lr=0.0001, step_loss=0.0108]Steps: 17%|█▋ | 340/2000 [07:15<21:08, 1.31it/s, lr=0.0001, step_loss=0.0108]Steps: 17%|█▋ | 340/2000 [07:15<21:08, 1.31it/s, lr=0.0001, step_loss=0.0121]Steps: 17%|█▋ | 341/2000 [07:16<21:06, 1.31it/s, lr=0.0001, step_loss=0.0121]Steps: 17%|█▋ | 341/2000 [07:16<21:06, 1.31it/s, lr=0.0001, step_loss=0.0247]Steps: 17%|█▋ | 342/2000 [07:17<21:04, 1.31it/s, lr=0.0001, step_loss=0.0247]Steps: 17%|█▋ | 342/2000 [07:17<21:04, 1.31it/s, lr=0.0001, step_loss=0.00582]Steps: 17%|█▋ | 343/2000 [07:18<21:02, 1.31it/s, lr=0.0001, step_loss=0.00582]Steps: 17%|█▋ | 343/2000 [07:18<21:02, 1.31it/s, lr=0.0001, step_loss=0.00778]Steps: 17%|█▋ | 344/2000 [07:18<21:01, 1.31it/s, lr=0.0001, step_loss=0.00778]Steps: 17%|█▋ | 344/2000 [07:18<21:01, 1.31it/s, lr=0.0001, step_loss=0.0751] Steps: 17%|█▋ | 345/2000 [07:19<21:01, 1.31it/s, lr=0.0001, step_loss=0.0751]Steps: 17%|█▋ | 345/2000 [07:19<21:01, 1.31it/s, lr=0.0001, step_loss=0.0319]Steps: 17%|█▋ | 346/2000 [07:20<21:01, 1.31it/s, lr=0.0001, step_loss=0.0319]Steps: 17%|█▋ | 346/2000 [07:20<21:01, 
1.31it/s, lr=0.0001, step_loss=0.171] Steps: 17%|█▋ | 347/2000 [07:21<20:59, 1.31it/s, lr=0.0001, step_loss=0.171]Steps: 17%|█▋ | 347/2000 [07:21<20:59, 1.31it/s, lr=0.0001, step_loss=0.112]Steps: 17%|█▋ | 348/2000 [07:21<20:59, 1.31it/s, lr=0.0001, step_loss=0.112]Steps: 17%|█▋ | 348/2000 [07:21<20:59, 1.31it/s, lr=0.0001, step_loss=0.0205]Steps: 17%|█▋ | 349/2000 [07:22<20:58, 1.31it/s, lr=0.0001, step_loss=0.0205]Steps: 17%|█▋ | 349/2000 [07:22<20:58, 1.31it/s, lr=0.0001, step_loss=0.000828]Steps: 18%|█▊ | 350/2000 [07:23<20:56, 1.31it/s, lr=0.0001, step_loss=0.000828]Steps: 18%|█▊ | 350/2000 [07:23<20:56, 1.31it/s, lr=0.0001, step_loss=0.206] Steps: 18%|█▊ | 351/2000 [07:24<20:56, 1.31it/s, lr=0.0001, step_loss=0.206]Steps: 18%|█▊ | 351/2000 [07:24<20:56, 1.31it/s, lr=0.0001, step_loss=0.135]Steps: 18%|█▊ | 352/2000 [07:24<20:56, 1.31it/s, lr=0.0001, step_loss=0.135]11/14/2025 06:16:12 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 352)
Steps: 18%|█▊ | 352/2000 [07:33<20:56, 1.31it/s, lr=0.0001, step_loss=0.00125]11/14/2025 06:16:12 - INFO - root - ### DEBUG: Finished epoch 10, epoch_steps=32, global_step=352
11/14/2025 06:16:12 - INFO - root - ### DEBUG: Starting epoch 11/63, global_step=352, max_train_steps=2000
Steps: 18%|█▊ | 353/2000 [07:33<1:27:55, 3.20s/it, lr=0.0001, step_loss=0.00125]Steps: 18%|█▊ | 353/2000 [07:33<1:27:55, 3.20s/it, lr=0.0001, step_loss=0.00526]Steps: 18%|█▊ | 354/2000 [07:34<1:07:47, 2.47s/it, lr=0.0001, step_loss=0.00526]Steps: 18%|█▊ | 354/2000 [07:34<1:07:47, 2.47s/it, lr=0.0001, step_loss=0.0182] Steps: 18%|█▊ | 355/2000 [07:35<53:41, 1.96s/it, lr=0.0001, step_loss=0.0182] Steps: 18%|█▊ | 355/2000 [07:35<53:41, 1.96s/it, lr=0.0001, step_loss=0.0467]Steps: 18%|█▊ | 356/2000 [07:36<43:49, 1.60s/it, lr=0.0001, step_loss=0.0467]Steps: 18%|█▊ | 356/2000 [07:36<43:49, 1.60s/it, lr=0.0001, step_loss=0.377] Steps: 18%|█▊ | 357/2000 [07:36<36:55, 1.35s/it, lr=0.0001, step_loss=0.377]Steps: 18%|█▊ | 357/2000 [07:36<36:55, 1.35s/it, lr=0.0001, step_loss=0.03] Steps: 18%|█▊ | 358/2000 [07:37<32:05, 1.17s/it, lr=0.0001, step_loss=0.03]Steps: 18%|█▊ | 358/2000 [07:37<32:05, 1.17s/it, lr=0.0001, step_loss=0.00119]Steps: 18%|█▊ | 359/2000 [07:38<28:41, 1.05s/it, lr=0.0001, step_loss=0.00119]Steps: 18%|█▊ | 359/2000 [07:38<28:41, 1.05s/it, lr=0.0001, step_loss=0.00878]Steps: 18%|█▊ | 360/2000 [07:39<26:18, 1.04it/s, lr=0.0001, step_loss=0.00878]Steps: 18%|█▊ | 360/2000 [07:39<26:18, 1.04it/s, lr=0.0001, step_loss=0.00109]Steps: 18%|█▊ | 361/2000 [07:39<24:39, 1.11it/s, lr=0.0001, step_loss=0.00109]Steps: 18%|█▊ | 361/2000 [07:39<24:39, 1.11it/s, lr=0.0001, step_loss=0.0019] Steps: 18%|█▊ | 362/2000 [07:40<23:29, 1.16it/s, lr=0.0001, step_loss=0.0019]Steps: 18%|█▊ | 362/2000 [07:40<23:29, 1.16it/s, lr=0.0001, step_loss=0.000678]Steps: 18%|█▊ | 363/2000 [07:41<22:40, 1.20it/s, lr=0.0001, step_loss=0.000678]Steps: 18%|█▊ | 363/2000 [07:41<22:40, 1.20it/s, lr=0.0001, step_loss=0.0194] Steps: 18%|█▊ | 364/2000 [07:42<22:05, 1.23it/s, lr=0.0001, step_loss=0.0194]Steps: 18%|█▊ | 364/2000 [07:42<22:05, 1.23it/s, lr=0.0001, step_loss=0.000545]Steps: 18%|█▊ | 365/2000 [07:42<21:40, 1.26it/s, lr=0.0001, step_loss=0.000545]Steps: 18%|█▊ | 365/2000 [07:42<21:40, 1.26it/s, 
lr=0.0001, step_loss=0.000652]Steps: 18%|█▊ | 366/2000 [07:43<21:22, 1.27it/s, lr=0.0001, step_loss=0.000652]Steps: 18%|█▊ | 366/2000 [07:43<21:22, 1.27it/s, lr=0.0001, step_loss=0.0106] Steps: 18%|█▊ | 367/2000 [07:44<21:11, 1.28it/s, lr=0.0001, step_loss=0.0106]Steps: 18%|█▊ | 367/2000 [07:44<21:11, 1.28it/s, lr=0.0001, step_loss=0.00379]Steps: 18%|█▊ | 368/2000 [07:45<21:02, 1.29it/s, lr=0.0001, step_loss=0.00379]Steps: 18%|█▊ | 368/2000 [07:45<21:02, 1.29it/s, lr=0.0001, step_loss=0.0402] Steps: 18%|█▊ | 369/2000 [07:45<20:56, 1.30it/s, lr=0.0001, step_loss=0.0402]Steps: 18%|█▊ | 369/2000 [07:46<20:56, 1.30it/s, lr=0.0001, step_loss=0.0176]Steps: 18%|█▊ | 370/2000 [07:46<20:52, 1.30it/s, lr=0.0001, step_loss=0.0176]Steps: 18%|█▊ | 370/2000 [07:46<20:52, 1.30it/s, lr=0.0001, step_loss=0.0166]Steps: 19%|█▊ | 371/2000 [07:47<20:48, 1.30it/s, lr=0.0001, step_loss=0.0166]Steps: 19%|█▊ | 371/2000 [07:47<20:48, 1.30it/s, lr=0.0001, step_loss=0.00445]Steps: 19%|█▊ | 372/2000 [07:48<20:44, 1.31it/s, lr=0.0001, step_loss=0.00445]Steps: 19%|█▊ | 372/2000 [07:48<20:44, 1.31it/s, lr=0.0001, step_loss=0.00157]Steps: 19%|█▊ | 373/2000 [07:49<20:42, 1.31it/s, lr=0.0001, step_loss=0.00157]Steps: 19%|█▊ | 373/2000 [07:49<20:42, 1.31it/s, lr=0.0001, step_loss=0.00189]Steps: 19%|█▊ | 374/2000 [07:49<20:40, 1.31it/s, lr=0.0001, step_loss=0.00189]Steps: 19%|█▊ | 374/2000 [07:49<20:40, 1.31it/s, lr=0.0001, step_loss=0.299] Steps: 19%|█▉ | 375/2000 [07:50<20:38, 1.31it/s, lr=0.0001, step_loss=0.299]Steps: 19%|█▉ | 375/2000 [07:50<20:38, 1.31it/s, lr=0.0001, step_loss=0.0259]Steps: 19%|█▉ | 376/2000 [07:51<20:37, 1.31it/s, lr=0.0001, step_loss=0.0259]Steps: 19%|█▉ | 376/2000 [07:51<20:37, 1.31it/s, lr=0.0001, step_loss=0.00133]Steps: 19%|█▉ | 377/2000 [07:52<20:35, 1.31it/s, lr=0.0001, step_loss=0.00133]Steps: 19%|█▉ | 377/2000 [07:52<20:35, 1.31it/s, lr=0.0001, step_loss=0.0247] Steps: 19%|█▉ | 378/2000 [07:52<20:41, 1.31it/s, lr=0.0001, step_loss=0.0247]Steps: 19%|█▉ | 378/2000 
[07:52<20:41, 1.31it/s, lr=0.0001, step_loss=0.0318]Steps: 19%|█▉ | 379/2000 [07:53<20:38, 1.31it/s, lr=0.0001, step_loss=0.0318]Steps: 19%|█▉ | 379/2000 [07:53<20:38, 1.31it/s, lr=0.0001, step_loss=0.0066]Steps: 19%|█▉ | 380/2000 [07:54<20:36, 1.31it/s, lr=0.0001, step_loss=0.0066]Steps: 19%|█▉ | 380/2000 [07:54<20:36, 1.31it/s, lr=0.0001, step_loss=0.0141]Steps: 19%|█▉ | 381/2000 [07:55<20:34, 1.31it/s, lr=0.0001, step_loss=0.0141]Steps: 19%|█▉ | 381/2000 [07:55<20:34, 1.31it/s, lr=0.0001, step_loss=0.000462]Steps: 19%|█▉ | 382/2000 [07:55<20:33, 1.31it/s, lr=0.0001, step_loss=0.000462]Steps: 19%|█▉ | 382/2000 [07:55<20:33, 1.31it/s, lr=0.0001, step_loss=0.277] Steps: 19%|█▉ | 383/2000 [07:56<20:31, 1.31it/s, lr=0.0001, step_loss=0.277]Steps: 19%|█▉ | 383/2000 [07:56<20:31, 1.31it/s, lr=0.0001, step_loss=0.000431]Steps: 19%|█▉ | 384/2000 [07:57<20:30, 1.31it/s, lr=0.0001, step_loss=0.000431]11/14/2025 06:16:45 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 384)
Steps: 19%|█▉ | 384/2000 [08:05<20:30, 1.31it/s, lr=0.0001, step_loss=0.00546] 11/14/2025 06:16:45 - INFO - root - ### DEBUG: Finished epoch 11, epoch_steps=32, global_step=384
11/14/2025 06:16:45 - INFO - root - ### DEBUG: Starting epoch 12/63, global_step=384, max_train_steps=2000
Steps: 19%|█▉ | 385/2000 [08:06<1:26:14, 3.20s/it, lr=0.0001, step_loss=0.00546]Steps: 19%|█▉ | 385/2000 [08:06<1:26:14, 3.20s/it, lr=0.0001, step_loss=0.177] Steps: 19%|█▉ | 386/2000 [08:07<1:06:28, 2.47s/it, lr=0.0001, step_loss=0.177]Steps: 19%|█▉ | 386/2000 [08:07<1:06:28, 2.47s/it, lr=0.0001, step_loss=0.00449]Steps: 19%|█▉ | 387/2000 [08:07<52:38, 1.96s/it, lr=0.0001, step_loss=0.00449] Steps: 19%|█▉ | 387/2000 [08:07<52:38, 1.96s/it, lr=0.0001, step_loss=0.00484]Steps: 19%|█▉ | 388/2000 [08:08<42:57, 1.60s/it, lr=0.0001, step_loss=0.00484]Steps: 19%|█▉ | 388/2000 [08:08<42:57, 1.60s/it, lr=0.0001, step_loss=0.0192] Steps: 19%|█▉ | 389/2000 [08:09<36:11, 1.35s/it, lr=0.0001, step_loss=0.0192]Steps: 19%|█▉ | 389/2000 [08:09<36:11, 1.35s/it, lr=0.0001, step_loss=0.0281]Steps: 20%|█▉ | 390/2000 [08:10<31:27, 1.17s/it, lr=0.0001, step_loss=0.0281]Steps: 20%|█▉ | 390/2000 [08:10<31:27, 1.17s/it, lr=0.0001, step_loss=0.00178]Steps: 20%|█▉ | 391/2000 [08:10<28:07, 1.05s/it, lr=0.0001, step_loss=0.00178]Steps: 20%|█▉ | 391/2000 [08:10<28:07, 1.05s/it, lr=0.0001, step_loss=0.0126] Steps: 20%|█▉ | 392/2000 [08:11<25:48, 1.04it/s, lr=0.0001, step_loss=0.0126]Steps: 20%|█▉ | 392/2000 [08:11<25:48, 1.04it/s, lr=0.0001, step_loss=0.136] Steps: 20%|█▉ | 393/2000 [08:12<24:11, 1.11it/s, lr=0.0001, step_loss=0.136]Steps: 20%|█▉ | 393/2000 [08:12<24:11, 1.11it/s, lr=0.0001, step_loss=0.0211]Steps: 20%|█▉ | 394/2000 [08:13<23:03, 1.16it/s, lr=0.0001, step_loss=0.0211]Steps: 20%|█▉ | 394/2000 [08:13<23:03, 1.16it/s, lr=0.0001, step_loss=0.12] Steps: 20%|█▉ | 395/2000 [08:13<22:17, 1.20it/s, lr=0.0001, step_loss=0.12]Steps: 20%|█▉ | 395/2000 [08:13<22:17, 1.20it/s, lr=0.0001, step_loss=0.00874]Steps: 20%|█▉ | 396/2000 [08:14<21:41, 1.23it/s, lr=0.0001, step_loss=0.00874]Steps: 20%|█▉ | 396/2000 [08:14<21:41, 1.23it/s, lr=0.0001, step_loss=0.229] Steps: 20%|█▉ | 397/2000 [08:15<21:16, 1.26it/s, lr=0.0001, step_loss=0.229]Steps: 20%|█▉ | 397/2000 [08:15<21:16, 1.26it/s, lr=0.0001, 
step_loss=0.00121]Steps: 20%|█▉ | 398/2000 [08:16<20:59, 1.27it/s, lr=0.0001, step_loss=0.00121]Steps: 20%|█▉ | 398/2000 [08:16<20:59, 1.27it/s, lr=0.0001, step_loss=0.0678] Steps: 20%|█▉ | 399/2000 [08:17<20:47, 1.28it/s, lr=0.0001, step_loss=0.0678]Steps: 20%|█▉ | 399/2000 [08:17<20:47, 1.28it/s, lr=0.0001, step_loss=0.00426]Steps: 20%|██ | 400/2000 [08:17<20:37, 1.29it/s, lr=0.0001, step_loss=0.00426]
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.69it/s]
8%|▊ | 2/25 [00:01<00:13, 1.70it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:10, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]
44%|████▍ | 11/25 [00:06<00:08, 1.70it/s]
48%|████▊ | 12/25 [00:07<00:07, 1.70it/s]
52%|█████▏ | 13/25 [00:07<00:07, 1.70it/s]
56%|█████▌ | 14/25 [00:08<00:06, 1.70it/s]
60%|██████ | 15/25 [00:08<00:05, 1.70it/s]
64%|██████▍ | 16/25 [00:09<00:05, 1.70it/s]
68%|██████▊ | 17/25 [00:10<00:04, 1.70it/s]
72%|███████▏ | 18/25 [00:10<00:04, 1.70it/s]
76%|███████▌ | 19/25 [00:11<00:03, 1.70it/s]
80%|████████ | 20/25 [00:11<00:02, 1.70it/s]
84%|████████▍ | 21/25 [00:12<00:02, 1.70it/s]
88%|████████▊ | 22/25 [00:12<00:01, 1.70it/s]
92%|█████████▏| 23/25 [00:13<00:01, 1.70it/s]
96%|█████████▌| 24/25 [00:14<00:00, 1.70it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
0%| | 0/8 [00:00<?, ?it/s]
75%|███████▌ | 6/8 [00:00<00:00, 44.04it/s]
100%|██████████| 8/8 [00:00<00:00, 32.20it/s]
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.70it/s]
8%|▊ | 2/25 [00:01<00:13, 1.70it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:10, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]
44%|████▍ | 11/25 [00:06<00:08, 1.70it/s]
48%|████▊ | 12/25 [00:07<00:07, 1.70it/s]
52%|█████▏ | 13/25 [00:07<00:07, 1.70it/s]
56%|█████▌ | 14/25 [00:08<00:06, 1.70it/s]
60%|██████ | 15/25 [00:08<00:05, 1.70it/s]
64%|██████▍ | 16/25 [00:09<00:05, 1.70it/s]
68%|██████▊ | 17/25 [00:10<00:04, 1.70it/s]
72%|███████▏ | 18/25 [00:10<00:04, 1.70it/s]
76%|███████▌ | 19/25 [00:11<00:03, 1.70it/s]
80%|████████ | 20/25 [00:11<00:02, 1.70it/s]
84%|████████▍ | 21/25 [00:12<00:02, 1.70it/s]
88%|████████▊ | 22/25 [00:12<00:01, 1.70it/s]
92%|█████████▏| 23/25 [00:13<00:01, 1.70it/s]
96%|█████████▌| 24/25 [00:14<00:00, 1.70it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
0%| | 0/8 [00:00<?, ?it/s]
75%|███████▌ | 6/8 [00:00<00:00, 43.98it/s]
100%|██████████| 8/8 [00:00<00:00, 32.17it/s]
11/14/2025 06:17:29 - INFO - root - Saved samples to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/samples/sample-400.gif
Steps: 20%|██ | 400/2000 [08:50<20:37, 1.29it/s, lr=0.0001, step_loss=0.128] Steps: 20%|██ | 401/2000 [08:50<4:38:23, 10.45s/it, lr=0.0001, step_loss=0.128]Steps: 20%|██ | 401/2000 [08:50<4:38:23, 10.45s/it, lr=0.0001, step_loss=0.132]Steps: 20%|██ | 402/2000 [08:51<3:20:53, 7.54s/it, lr=0.0001, step_loss=0.132]Steps: 20%|██ | 402/2000 [08:51<3:20:53, 7.54s/it, lr=0.0001, step_loss=0.000625]Steps: 20%|██ | 403/2000 [08:52<2:26:40, 5.51s/it, lr=0.0001, step_loss=0.000625]Steps: 20%|██ | 403/2000 [08:52<2:26:40, 5.51s/it, lr=0.0001, step_loss=0.00239] Steps: 20%|██ | 404/2000 [08:53<1:48:46, 4.09s/it, lr=0.0001, step_loss=0.00239]Steps: 20%|██ | 404/2000 [08:53<1:48:46, 4.09s/it, lr=0.0001, step_loss=0.0028] Steps: 20%|██ | 405/2000 [08:53<1:22:15, 3.09s/it, lr=0.0001, step_loss=0.0028]Steps: 20%|██ | 405/2000 [08:53<1:22:15, 3.09s/it, lr=0.0001, step_loss=0.000564]Steps: 20%|██ | 406/2000 [08:54<1:03:40, 2.40s/it, lr=0.0001, step_loss=0.000564]Steps: 20%|██ | 406/2000 [08:54<1:03:40, 2.40s/it, lr=0.0001, step_loss=0.134] Steps: 20%|██ | 407/2000 [08:55<50:40, 1.91s/it, lr=0.0001, step_loss=0.134] Steps: 20%|██ | 407/2000 [08:55<50:40, 1.91s/it, lr=0.0001, step_loss=0.0132]Steps: 20%|██ | 408/2000 [08:56<41:34, 1.57s/it, lr=0.0001, step_loss=0.0132]Steps: 20%|██ | 408/2000 [08:56<41:34, 1.57s/it, lr=0.0001, step_loss=0.00838]Steps: 20%|██ | 409/2000 [08:56<35:12, 1.33s/it, lr=0.0001, step_loss=0.00838]Steps: 20%|██ | 409/2000 [08:56<35:12, 1.33s/it, lr=0.0001, step_loss=0.00407]Steps: 20%|██ | 410/2000 [08:57<30:46, 1.16s/it, lr=0.0001, step_loss=0.00407]Steps: 20%|██ | 410/2000 [08:57<30:46, 1.16s/it, lr=0.0001, step_loss=0.0011] Steps: 21%|██ | 411/2000 [08:58<27:40, 1.04s/it, lr=0.0001, step_loss=0.0011]Steps: 21%|██ | 411/2000 [08:58<27:40, 1.04s/it, lr=0.0001, step_loss=0.0531]Steps: 21%|██ | 412/2000 [08:59<25:29, 1.04it/s, lr=0.0001, step_loss=0.0531]Steps: 21%|██ | 412/2000 [08:59<25:29, 1.04it/s, lr=0.0001, step_loss=0.133] Steps: 21%|██ | 413/2000 
[09:00<23:58, 1.10it/s, lr=0.0001, step_loss=0.133]Steps: 21%|██ | 413/2000 [09:00<23:58, 1.10it/s, lr=0.0001, step_loss=0.00244]Steps: 21%|██ | 414/2000 [09:00<22:51, 1.16it/s, lr=0.0001, step_loss=0.00244]Steps: 21%|██ | 414/2000 [09:00<22:51, 1.16it/s, lr=0.0001, step_loss=0.00899]Steps: 21%|██ | 415/2000 [09:01<22:05, 1.20it/s, lr=0.0001, step_loss=0.00899]Steps: 21%|██ | 415/2000 [09:01<22:05, 1.20it/s, lr=0.0001, step_loss=0.3] Steps: 21%|██ | 416/2000 [09:02<21:32, 1.23it/s, lr=0.0001, step_loss=0.3]11/14/2025 06:17:49 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 416)
Steps: 21%|██ | 416/2000 [09:10<21:32, 1.23it/s, lr=0.0001, step_loss=0.155]11/14/2025 06:17:49 - INFO - root - ### DEBUG: Finished epoch 12, epoch_steps=32, global_step=416
11/14/2025 06:17:49 - INFO - root - ### DEBUG: Starting epoch 13/63, global_step=416, max_train_steps=2000
Steps: 21%|██ | 417/2000 [09:11<1:23:49, 3.18s/it, lr=0.0001, step_loss=0.155]Steps: 21%|██ | 417/2000 [09:11<1:23:49, 3.18s/it, lr=0.0001, step_loss=0.0776]Steps: 21%|██ | 418/2000 [09:11<1:04:44, 2.46s/it, lr=0.0001, step_loss=0.0776]Steps: 21%|██ | 418/2000 [09:11<1:04:44, 2.46s/it, lr=0.0001, step_loss=0.00121]Steps: 21%|██ | 419/2000 [09:12<51:22, 1.95s/it, lr=0.0001, step_loss=0.00121] Steps: 21%|██ | 419/2000 [09:12<51:22, 1.95s/it, lr=0.0001, step_loss=0.0599] Steps: 21%|██ | 420/2000 [09:13<42:02, 1.60s/it, lr=0.0001, step_loss=0.0599]Steps: 21%|██ | 420/2000 [09:13<42:02, 1.60s/it, lr=0.0001, step_loss=0.00327]Steps: 21%|██ | 421/2000 [09:14<35:29, 1.35s/it, lr=0.0001, step_loss=0.00327]Steps: 21%|██ | 421/2000 [09:14<35:29, 1.35s/it, lr=0.0001, step_loss=0.00376]Steps: 21%|██ | 422/2000 [09:14<30:54, 1.18s/it, lr=0.0001, step_loss=0.00376]Steps: 21%|██ | 422/2000 [09:14<30:54, 1.18s/it, lr=0.0001, step_loss=0.000539]Steps: 21%|██ | 423/2000 [09:15<27:41, 1.05s/it, lr=0.0001, step_loss=0.000539]Steps: 21%|██ | 423/2000 [09:15<27:41, 1.05s/it, lr=0.0001, step_loss=0.116] Steps: 21%|██ | 424/2000 [09:16<25:27, 1.03it/s, lr=0.0001, step_loss=0.116]Steps: 21%|██ | 424/2000 [09:16<25:27, 1.03it/s, lr=0.0001, step_loss=0.00208]Steps: 21%|██▏ | 425/2000 [09:17<23:53, 1.10it/s, lr=0.0001, step_loss=0.00208]Steps: 21%|██▏ | 425/2000 [09:17<23:53, 1.10it/s, lr=0.0001, step_loss=0.000588]Steps: 21%|██▏ | 426/2000 [09:17<22:44, 1.15it/s, lr=0.0001, step_loss=0.000588]Steps: 21%|██▏ | 426/2000 [09:17<22:44, 1.15it/s, lr=0.0001, step_loss=0.00428] Steps: 21%|██▏ | 427/2000 [09:18<21:58, 1.19it/s, lr=0.0001, step_loss=0.00428]Steps: 21%|██▏ | 427/2000 [09:18<21:58, 1.19it/s, lr=0.0001, step_loss=0.00146]Steps: 21%|██▏ | 428/2000 [09:19<21:25, 1.22it/s, lr=0.0001, step_loss=0.00146]Steps: 21%|██▏ | 428/2000 [09:19<21:25, 1.22it/s, lr=0.0001, step_loss=0.000773]Steps: 21%|██▏ | 429/2000 [09:20<21:02, 1.24it/s, lr=0.0001, step_loss=0.000773]Steps: 21%|██▏ | 429/2000 
[09:20<21:02, 1.24it/s, lr=0.0001, step_loss=0.0225] Steps: 22%|██▏ | 430/2000 [09:21<20:46, 1.26it/s, lr=0.0001, step_loss=0.0225]Steps: 22%|██▏ | 430/2000 [09:21<20:46, 1.26it/s, lr=0.0001, step_loss=0.0277]Steps: 22%|██▏ | 431/2000 [09:21<20:33, 1.27it/s, lr=0.0001, step_loss=0.0277]Steps: 22%|██▏ | 431/2000 [09:21<20:33, 1.27it/s, lr=0.0001, step_loss=0.00118]Steps: 22%|██▏ | 432/2000 [09:22<20:25, 1.28it/s, lr=0.0001, step_loss=0.00118]Steps: 22%|██▏ | 432/2000 [09:22<20:25, 1.28it/s, lr=0.0001, step_loss=0.0761] Steps: 22%|██▏ | 433/2000 [09:23<20:18, 1.29it/s, lr=0.0001, step_loss=0.0761]Steps: 22%|██▏ | 433/2000 [09:23<20:18, 1.29it/s, lr=0.0001, step_loss=0.0186]Steps: 22%|██▏ | 434/2000 [09:24<20:14, 1.29it/s, lr=0.0001, step_loss=0.0186]Steps: 22%|██▏ | 434/2000 [09:24<20:14, 1.29it/s, lr=0.0001, step_loss=0.0152]Steps: 22%|██▏ | 435/2000 [09:24<20:12, 1.29it/s, lr=0.0001, step_loss=0.0152]Steps: 22%|██▏ | 435/2000 [09:24<20:12, 1.29it/s, lr=0.0001, step_loss=0.058] Steps: 22%|██▏ | 436/2000 [09:25<20:09, 1.29it/s, lr=0.0001, step_loss=0.058]Steps: 22%|██▏ | 436/2000 [09:25<20:09, 1.29it/s, lr=0.0001, step_loss=0.0417]Steps: 22%|██▏ | 437/2000 [09:26<20:06, 1.30it/s, lr=0.0001, step_loss=0.0417]Steps: 22%|██▏ | 437/2000 [09:26<20:06, 1.30it/s, lr=0.0001, step_loss=0.0178]Steps: 22%|██▏ | 438/2000 [09:27<20:03, 1.30it/s, lr=0.0001, step_loss=0.0178]Steps: 22%|██▏ | 438/2000 [09:27<20:03, 1.30it/s, lr=0.0001, step_loss=0.122] Steps: 22%|██▏ | 439/2000 [09:27<20:04, 1.30it/s, lr=0.0001, step_loss=0.122]Steps: 22%|██▏ | 439/2000 [09:28<20:04, 1.30it/s, lr=0.0001, step_loss=0.00642]Steps: 22%|██▏ | 440/2000 [09:28<20:03, 1.30it/s, lr=0.0001, step_loss=0.00642]Steps: 22%|██▏ | 440/2000 [09:28<20:03, 1.30it/s, lr=0.0001, step_loss=0.0129] Steps: 22%|██▏ | 441/2000 [09:29<20:02, 1.30it/s, lr=0.0001, step_loss=0.0129]Steps: 22%|██▏ | 441/2000 [09:29<20:02, 1.30it/s, lr=0.0001, step_loss=0.0165]Steps: 22%|██▏ | 442/2000 [09:30<20:01, 1.30it/s, lr=0.0001, 
step_loss=0.0165]Steps: 22%|██▏ | 442/2000 [09:30<20:01, 1.30it/s, lr=0.0001, step_loss=0.00248]Steps: 22%|██▏ | 443/2000 [09:31<20:00, 1.30it/s, lr=0.0001, step_loss=0.00248]Steps: 22%|██▏ | 443/2000 [09:31<20:00, 1.30it/s, lr=0.0001, step_loss=0.00315]Steps: 22%|██▏ | 444/2000 [09:31<20:00, 1.30it/s, lr=0.0001, step_loss=0.00315]Steps: 22%|██▏ | 444/2000 [09:31<20:00, 1.30it/s, lr=0.0001, step_loss=0.00144]Steps: 22%|██▏ | 445/2000 [09:32<20:00, 1.30it/s, lr=0.0001, step_loss=0.00144]Steps: 22%|██▏ | 445/2000 [09:32<20:00, 1.30it/s, lr=0.0001, step_loss=0.0016] Steps: 22%|██▏ | 446/2000 [09:33<20:44, 1.25it/s, lr=0.0001, step_loss=0.0016]Steps: 22%|██▏ | 446/2000 [09:33<20:44, 1.25it/s, lr=0.0001, step_loss=0.00165]Steps: 22%|██▏ | 447/2000 [09:34<20:32, 1.26it/s, lr=0.0001, step_loss=0.00165]Steps: 22%|██▏ | 447/2000 [09:34<20:32, 1.26it/s, lr=0.0001, step_loss=0.0488] Steps: 22%|██▏ | 448/2000 [09:35<20:21, 1.27it/s, lr=0.0001, step_loss=0.0488]11/14/2025 06:18:22 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 448)
Steps: 22%|██▏ | 448/2000 [09:43<20:21, 1.27it/s, lr=0.0001, step_loss=0.0422]11/14/2025 06:18:22 - INFO - root - ### DEBUG: Finished epoch 13, epoch_steps=32, global_step=448
11/14/2025 06:18:22 - INFO - root - ### DEBUG: Starting epoch 14/63, global_step=448, max_train_steps=2000
Steps: 22%|██▏ | 449/2000 [09:44<1:23:58, 3.25s/it, lr=0.0001, step_loss=0.0422]Steps: 22%|██▏ | 449/2000 [09:44<1:23:58, 3.25s/it, lr=0.0001, step_loss=0.0639]Steps: 22%|██▎ | 450/2000 [09:44<1:04:42, 2.50s/it, lr=0.0001, step_loss=0.0639]Steps: 22%|██▎ | 450/2000 [09:44<1:04:42, 2.50s/it, lr=0.0001, step_loss=0.168] Steps: 23%|██▎ | 451/2000 [09:45<51:13, 1.98s/it, lr=0.0001, step_loss=0.168] Steps: 23%|██▎ | 451/2000 [09:45<51:13, 1.98s/it, lr=0.0001, step_loss=0.0756]Steps: 23%|██▎ | 452/2000 [09:46<41:46, 1.62s/it, lr=0.0001, step_loss=0.0756]Steps: 23%|██▎ | 452/2000 [09:46<41:46, 1.62s/it, lr=0.0001, step_loss=0.0224]Steps: 23%|██▎ | 453/2000 [09:47<35:11, 1.37s/it, lr=0.0001, step_loss=0.0224]Steps: 23%|██▎ | 453/2000 [09:47<35:11, 1.37s/it, lr=0.0001, step_loss=0.00817]Steps: 23%|██▎ | 454/2000 [09:47<30:37, 1.19s/it, lr=0.0001, step_loss=0.00817]Steps: 23%|██▎ | 454/2000 [09:47<30:37, 1.19s/it, lr=0.0001, step_loss=0.155] Steps: 23%|██▎ | 455/2000 [09:48<27:28, 1.07s/it, lr=0.0001, step_loss=0.155]Steps: 23%|██▎ | 455/2000 [09:48<27:28, 1.07s/it, lr=0.0001, step_loss=0.012]Steps: 23%|██▎ | 456/2000 [09:49<25:15, 1.02it/s, lr=0.0001, step_loss=0.012]Steps: 23%|██▎ | 456/2000 [09:49<25:15, 1.02it/s, lr=0.0001, step_loss=0.017]Steps: 23%|██▎ | 457/2000 [09:50<23:36, 1.09it/s, lr=0.0001, step_loss=0.017]Steps: 23%|██▎ | 457/2000 [09:50<23:36, 1.09it/s, lr=0.0001, step_loss=0.00203]Steps: 23%|██▎ | 458/2000 [09:50<22:26, 1.15it/s, lr=0.0001, step_loss=0.00203]Steps: 23%|██▎ | 458/2000 [09:51<22:26, 1.15it/s, lr=0.0001, step_loss=0.0145] Steps: 23%|██▎ | 459/2000 [09:51<21:36, 1.19it/s, lr=0.0001, step_loss=0.0145]Steps: 23%|██▎ | 459/2000 [09:51<21:36, 1.19it/s, lr=0.0001, step_loss=0.00375]Steps: 23%|██▎ | 460/2000 [09:52<21:02, 1.22it/s, lr=0.0001, step_loss=0.00375]Steps: 23%|██▎ | 460/2000 [09:52<21:02, 1.22it/s, lr=0.0001, step_loss=0.0407] Steps: 23%|██▎ | 461/2000 [09:53<20:38, 1.24it/s, lr=0.0001, step_loss=0.0407]Steps: 23%|██▎ | 461/2000 
[09:53<20:38, 1.24it/s, lr=0.0001, step_loss=0.409] Steps: 23%|██▎ | 462/2000 [09:54<20:22, 1.26it/s, lr=0.0001, step_loss=0.409]Steps: 23%|██▎ | 462/2000 [09:54<20:22, 1.26it/s, lr=0.0001, step_loss=0.06] Steps: 23%|██▎ | 463/2000 [09:54<20:10, 1.27it/s, lr=0.0001, step_loss=0.06]Steps: 23%|██▎ | 463/2000 [09:54<20:10, 1.27it/s, lr=0.0001, step_loss=0.0142]Steps: 23%|██▎ | 464/2000 [09:55<20:00, 1.28it/s, lr=0.0001, step_loss=0.0142]Steps: 23%|██▎ | 464/2000 [09:55<20:00, 1.28it/s, lr=0.0001, step_loss=0.441] Steps: 23%|██▎ | 465/2000 [09:56<19:53, 1.29it/s, lr=0.0001, step_loss=0.441]Steps: 23%|██▎ | 465/2000 [09:56<19:53, 1.29it/s, lr=0.0001, step_loss=0.000424]Steps: 23%|██▎ | 466/2000 [09:57<19:50, 1.29it/s, lr=0.0001, step_loss=0.000424]Steps: 23%|██▎ | 466/2000 [09:57<19:50, 1.29it/s, lr=0.0001, step_loss=0.153] Steps: 23%|██▎ | 467/2000 [09:57<19:45, 1.29it/s, lr=0.0001, step_loss=0.153]Steps: 23%|██▎ | 467/2000 [09:57<19:45, 1.29it/s, lr=0.0001, step_loss=0.000941]Steps: 23%|██▎ | 468/2000 [09:58<19:42, 1.30it/s, lr=0.0001, step_loss=0.000941]Steps: 23%|██▎ | 468/2000 [09:58<19:42, 1.30it/s, lr=0.0001, step_loss=0.0102] Steps: 23%|██▎ | 469/2000 [09:59<19:40, 1.30it/s, lr=0.0001, step_loss=0.0102]Steps: 23%|██▎ | 469/2000 [09:59<19:40, 1.30it/s, lr=0.0001, step_loss=0.118] Steps: 24%|██▎ | 470/2000 [10:00<19:38, 1.30it/s, lr=0.0001, step_loss=0.118]Steps: 24%|██▎ | 470/2000 [10:00<19:38, 1.30it/s, lr=0.0001, step_loss=0.123]Steps: 24%|██▎ | 471/2000 [10:00<19:37, 1.30it/s, lr=0.0001, step_loss=0.123]Steps: 24%|██▎ | 471/2000 [10:01<19:37, 1.30it/s, lr=0.0001, step_loss=0.00321]Steps: 24%|██▎ | 472/2000 [10:01<19:37, 1.30it/s, lr=0.0001, step_loss=0.00321]Steps: 24%|██▎ | 472/2000 [10:01<19:37, 1.30it/s, lr=0.0001, step_loss=0.000589]Steps: 24%|██▎ | 473/2000 [10:02<19:35, 1.30it/s, lr=0.0001, step_loss=0.000589]Steps: 24%|██▎ | 473/2000 [10:02<19:35, 1.30it/s, lr=0.0001, step_loss=0.001] Steps: 24%|██▎ | 474/2000 [10:03<19:34, 1.30it/s, lr=0.0001, 
step_loss=0.001]Steps: 24%|██▎ | 474/2000 [10:03<19:34, 1.30it/s, lr=0.0001, step_loss=0.0277]Steps: 24%|██▍ | 475/2000 [10:04<19:33, 1.30it/s, lr=0.0001, step_loss=0.0277]Steps: 24%|██▍ | 475/2000 [10:04<19:33, 1.30it/s, lr=0.0001, step_loss=0.00107]Steps: 24%|██▍ | 476/2000 [10:04<19:33, 1.30it/s, lr=0.0001, step_loss=0.00107]Steps: 24%|██▍ | 476/2000 [10:04<19:33, 1.30it/s, lr=0.0001, step_loss=0.0115] Steps: 24%|██▍ | 477/2000 [10:05<19:34, 1.30it/s, lr=0.0001, step_loss=0.0115]Steps: 24%|██▍ | 477/2000 [10:05<19:34, 1.30it/s, lr=0.0001, step_loss=0.0469]Steps: 24%|██▍ | 478/2000 [10:06<19:32, 1.30it/s, lr=0.0001, step_loss=0.0469]Steps: 24%|██▍ | 478/2000 [10:06<19:32, 1.30it/s, lr=0.0001, step_loss=0.00305]Steps: 24%|██▍ | 479/2000 [10:07<19:31, 1.30it/s, lr=0.0001, step_loss=0.00305]Steps: 24%|██▍ | 479/2000 [10:07<19:31, 1.30it/s, lr=0.0001, step_loss=0.0265] Steps: 24%|██▍ | 480/2000 [10:07<19:29, 1.30it/s, lr=0.0001, step_loss=0.0265]11/14/2025 06:18:55 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 480)
Steps: 24%|██▍ | 480/2000 [10:15<19:29, 1.30it/s, lr=0.0001, step_loss=0.0467]11/14/2025 06:18:55 - INFO - root - ### DEBUG: Finished epoch 14, epoch_steps=32, global_step=480
11/14/2025 06:18:55 - INFO - root - ### DEBUG: Starting epoch 15/63, global_step=480, max_train_steps=2000
Steps: 24%|██▍ | 481/2000 [10:16<1:16:17, 3.01s/it, lr=0.0001, step_loss=0.0467]Steps: 24%|██▍ | 481/2000 [10:16<1:16:17, 3.01s/it, lr=0.0001, step_loss=0.000709]Steps: 24%|██▍ | 482/2000 [10:16<59:13, 2.34s/it, lr=0.0001, step_loss=0.000709] Steps: 24%|██▍ | 482/2000 [10:16<59:13, 2.34s/it, lr=0.0001, step_loss=0.228] Steps: 24%|██▍ | 483/2000 [10:17<47:16, 1.87s/it, lr=0.0001, step_loss=0.228]Steps: 24%|██▍ | 483/2000 [10:17<47:16, 1.87s/it, lr=0.0001, step_loss=0.00339]Steps: 24%|██▍ | 484/2000 [10:18<38:53, 1.54s/it, lr=0.0001, step_loss=0.00339]Steps: 24%|██▍ | 484/2000 [10:18<38:53, 1.54s/it, lr=0.0001, step_loss=0.00549]Steps: 24%|██▍ | 485/2000 [10:19<33:01, 1.31s/it, lr=0.0001, step_loss=0.00549]Steps: 24%|██▍ | 485/2000 [10:19<33:01, 1.31s/it, lr=0.0001, step_loss=0.00197]Steps: 24%|██▍ | 486/2000 [10:20<28:56, 1.15s/it, lr=0.0001, step_loss=0.00197]Steps: 24%|██▍ | 486/2000 [10:20<28:56, 1.15s/it, lr=0.0001, step_loss=0.000546]Steps: 24%|██▍ | 487/2000 [10:20<26:03, 1.03s/it, lr=0.0001, step_loss=0.000546]Steps: 24%|██▍ | 487/2000 [10:20<26:03, 1.03s/it, lr=0.0001, step_loss=0.00236] Steps: 24%|██▍ | 488/2000 [10:21<24:02, 1.05it/s, lr=0.0001, step_loss=0.00236]Steps: 24%|██▍ | 488/2000 [10:21<24:02, 1.05it/s, lr=0.0001, step_loss=0.000922]Steps: 24%|██▍ | 489/2000 [10:22<22:37, 1.11it/s, lr=0.0001, step_loss=0.000922]Steps: 24%|██▍ | 489/2000 [10:22<22:37, 1.11it/s, lr=0.0001, step_loss=0.00971] Steps: 24%|██▍ | 490/2000 [10:23<21:37, 1.16it/s, lr=0.0001, step_loss=0.00971]Steps: 24%|██▍ | 490/2000 [10:23<21:37, 1.16it/s, lr=0.0001, step_loss=0.00535]Steps: 25%|██▍ | 491/2000 [10:23<20:56, 1.20it/s, lr=0.0001, step_loss=0.00535]Steps: 25%|██▍ | 491/2000 [10:23<20:56, 1.20it/s, lr=0.0001, step_loss=0.00346]Steps: 25%|██▍ | 492/2000 [10:24<20:26, 1.23it/s, lr=0.0001, step_loss=0.00346]Steps: 25%|██▍ | 492/2000 [10:24<20:26, 1.23it/s, lr=0.0001, step_loss=0.00166]Steps: 25%|██▍ | 493/2000 [10:25<20:06, 1.25it/s, lr=0.0001, step_loss=0.00166]Steps: 25%|██▍ 
| 493/2000 [10:25<20:06, 1.25it/s, lr=0.0001, step_loss=0.00052]Steps: 25%|██▍ | 494/2000 [10:26<19:51, 1.26it/s, lr=0.0001, step_loss=0.00052]Steps: 25%|██▍ | 494/2000 [10:26<19:51, 1.26it/s, lr=0.0001, step_loss=0.00211]Steps: 25%|██▍ | 495/2000 [10:26<19:40, 1.28it/s, lr=0.0001, step_loss=0.00211]Steps: 25%|██▍ | 495/2000 [10:26<19:40, 1.28it/s, lr=0.0001, step_loss=0.00406]Steps: 25%|██▍ | 496/2000 [10:27<19:33, 1.28it/s, lr=0.0001, step_loss=0.00406]Steps: 25%|██▍ | 496/2000 [10:27<19:33, 1.28it/s, lr=0.0001, step_loss=0.012] Steps: 25%|██▍ | 497/2000 [10:28<19:26, 1.29it/s, lr=0.0001, step_loss=0.012]Steps: 25%|██▍ | 497/2000 [10:28<19:26, 1.29it/s, lr=0.0001, step_loss=0.000866]Steps: 25%|██▍ | 498/2000 [10:29<19:24, 1.29it/s, lr=0.0001, step_loss=0.000866]Steps: 25%|██▍ | 498/2000 [10:29<19:24, 1.29it/s, lr=0.0001, step_loss=0.0527] Steps: 25%|██▍ | 499/2000 [10:30<19:20, 1.29it/s, lr=0.0001, step_loss=0.0527]Steps: 25%|██▍ | 499/2000 [10:30<19:20, 1.29it/s, lr=0.0001, step_loss=0.0659]Steps: 25%|██▌ | 500/2000 [10:30<19:18, 1.30it/s, lr=0.0001, step_loss=0.0659]11/14/2025 06:19:23 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 500)
Steps: 25%|██▌ | 500/2000 [10:44<19:18, 1.30it/s, lr=0.0001, step_loss=0.00187]Steps: 25%|██▌ | 501/2000 [10:44<1:59:54, 4.80s/it, lr=0.0001, step_loss=0.00187]Steps: 25%|██▌ | 501/2000 [10:45<1:59:54, 4.80s/it, lr=0.0001, step_loss=0.000429]Steps: 25%|██▌ | 502/2000 [10:45<1:29:39, 3.59s/it, lr=0.0001, step_loss=0.000429]Steps: 25%|██▌ | 502/2000 [10:45<1:29:39, 3.59s/it, lr=0.0001, step_loss=0.0395] Steps: 25%|██▌ | 503/2000 [10:46<1:08:29, 2.75s/it, lr=0.0001, step_loss=0.0395]Steps: 25%|██▌ | 503/2000 [10:46<1:08:29, 2.75s/it, lr=0.0001, step_loss=0.00135]Steps: 25%|██▌ | 504/2000 [10:47<53:39, 2.15s/it, lr=0.0001, step_loss=0.00135] Steps: 25%|██▌ | 504/2000 [10:47<53:39, 2.15s/it, lr=0.0001, step_loss=0.00091]Steps: 25%|██▌ | 505/2000 [10:48<43:17, 1.74s/it, lr=0.0001, step_loss=0.00091]Steps: 25%|██▌ | 505/2000 [10:48<43:17, 1.74s/it, lr=0.0001, step_loss=0.000394]Steps: 25%|██▌ | 506/2000 [10:48<36:04, 1.45s/it, lr=0.0001, step_loss=0.000394]Steps: 25%|██▌ | 506/2000 [10:48<36:04, 1.45s/it, lr=0.0001, step_loss=0.000773]Steps: 25%|██▌ | 507/2000 [10:49<31:02, 1.25s/it, lr=0.0001, step_loss=0.000773]Steps: 25%|██▌ | 507/2000 [10:49<31:02, 1.25s/it, lr=0.0001, step_loss=0.317] Steps: 25%|██▌ | 508/2000 [10:50<27:28, 1.10s/it, lr=0.0001, step_loss=0.317]Steps: 25%|██▌ | 508/2000 [10:50<27:28, 1.10s/it, lr=0.0001, step_loss=0.000946]Steps: 25%|██▌ | 509/2000 [10:51<25:00, 1.01s/it, lr=0.0001, step_loss=0.000946]Steps: 25%|██▌ | 509/2000 [10:51<25:00, 1.01s/it, lr=0.0001, step_loss=0.00198] Steps: 26%|██▌ | 510/2000 [10:51<23:15, 1.07it/s, lr=0.0001, step_loss=0.00198]Steps: 26%|██▌ | 510/2000 [10:51<23:15, 1.07it/s, lr=0.0001, step_loss=0.0127] Steps: 26%|██▌ | 511/2000 [10:52<22:01, 1.13it/s, lr=0.0001, step_loss=0.0127]Steps: 26%|██▌ | 511/2000 [10:52<22:01, 1.13it/s, lr=0.0001, step_loss=0.000623]Steps: 26%|██▌ | 512/2000 [10:53<21:12, 1.17it/s, lr=0.0001, step_loss=0.000623]11/14/2025 06:19:41 - INFO - root - Saved state to 
outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 512)
Steps: 26%|██▌ | 512/2000 [11:01<21:12, 1.17it/s, lr=0.0001, step_loss=0.00197] 11/14/2025 06:19:41 - INFO - root - ### DEBUG: Finished epoch 15, epoch_steps=32, global_step=512
11/14/2025 06:19:41 - INFO - root - ### DEBUG: Starting epoch 16/63, global_step=512, max_train_steps=2000
Steps: 26%|██▌ | 513/2000 [11:02<1:19:59, 3.23s/it, lr=0.0001, step_loss=0.00197]Steps: 26%|██▌ | 513/2000 [11:02<1:19:59, 3.23s/it, lr=0.0001, step_loss=0.0539] Steps: 26%|██▌ | 514/2000 [11:03<1:01:40, 2.49s/it, lr=0.0001, step_loss=0.0539]Steps: 26%|██▌ | 514/2000 [11:03<1:01:40, 2.49s/it, lr=0.0001, step_loss=0.00183]Steps: 26%|██▌ | 515/2000 [11:03<48:52, 1.97s/it, lr=0.0001, step_loss=0.00183] Steps: 26%|██▌ | 515/2000 [11:03<48:52, 1.97s/it, lr=0.0001, step_loss=0.406] Steps: 26%|██▌ | 516/2000 [11:04<39:53, 1.61s/it, lr=0.0001, step_loss=0.406]Steps: 26%|██▌ | 516/2000 [11:04<39:53, 1.61s/it, lr=0.0001, step_loss=0.00917]Steps: 26%|██▌ | 517/2000 [11:05<33:37, 1.36s/it, lr=0.0001, step_loss=0.00917]Steps: 26%|██▌ | 517/2000 [11:05<33:37, 1.36s/it, lr=0.0001, step_loss=0.044] Steps: 26%|██▌ | 518/2000 [11:06<29:16, 1.18s/it, lr=0.0001, step_loss=0.044]Steps: 26%|██▌ | 518/2000 [11:06<29:16, 1.18s/it, lr=0.0001, step_loss=0.000539]Steps: 26%|██▌ | 519/2000 [11:06<26:12, 1.06s/it, lr=0.0001, step_loss=0.000539]Steps: 26%|██▌ | 519/2000 [11:06<26:12, 1.06s/it, lr=0.0001, step_loss=0.218] Steps: 26%|██▌ | 520/2000 [11:07<24:03, 1.03it/s, lr=0.0001, step_loss=0.218]Steps: 26%|██▌ | 520/2000 [11:07<24:03, 1.03it/s, lr=0.0001, step_loss=0.0351]Steps: 26%|██▌ | 521/2000 [11:08<22:33, 1.09it/s, lr=0.0001, step_loss=0.0351]Steps: 26%|██▌ | 521/2000 [11:08<22:33, 1.09it/s, lr=0.0001, step_loss=0.0158]Steps: 26%|██▌ | 522/2000 [11:09<21:29, 1.15it/s, lr=0.0001, step_loss=0.0158]Steps: 26%|██▌ | 522/2000 [11:09<21:29, 1.15it/s, lr=0.0001, step_loss=0.115] Steps: 26%|██▌ | 523/2000 [11:09<20:44, 1.19it/s, lr=0.0001, step_loss=0.115]Steps: 26%|██▌ | 523/2000 [11:10<20:44, 1.19it/s, lr=0.0001, step_loss=0.0197]Steps: 26%|██▌ | 524/2000 [11:10<20:14, 1.22it/s, lr=0.0001, step_loss=0.0197]Steps: 26%|██▌ | 524/2000 [11:10<20:14, 1.22it/s, lr=0.0001, step_loss=0.0297]Steps: 26%|██▋ | 525/2000 [11:11<19:50, 1.24it/s, lr=0.0001, step_loss=0.0297]Steps: 26%|██▋ | 525/2000 
[11:11<19:50, 1.24it/s, lr=0.0001, step_loss=0.19] Steps: 26%|██▋ | 526/2000 [11:12<19:33, 1.26it/s, lr=0.0001, step_loss=0.19]Steps: 26%|██▋ | 526/2000 [11:12<19:33, 1.26it/s, lr=0.0001, step_loss=0.000595]Steps: 26%|██▋ | 527/2000 [11:13<19:21, 1.27it/s, lr=0.0001, step_loss=0.000595]Steps: 26%|██▋ | 527/2000 [11:13<19:21, 1.27it/s, lr=0.0001, step_loss=0.0054] Steps: 26%|██▋ | 528/2000 [11:13<19:11, 1.28it/s, lr=0.0001, step_loss=0.0054]Steps: 26%|██▋ | 528/2000 [11:13<19:11, 1.28it/s, lr=0.0001, step_loss=0.0296]Steps: 26%|██▋ | 529/2000 [11:14<19:04, 1.28it/s, lr=0.0001, step_loss=0.0296]Steps: 26%|██▋ | 529/2000 [11:14<19:04, 1.28it/s, lr=0.0001, step_loss=0.00118]Steps: 26%|██▋ | 530/2000 [11:15<18:59, 1.29it/s, lr=0.0001, step_loss=0.00118]Steps: 26%|██▋ | 530/2000 [11:15<18:59, 1.29it/s, lr=0.0001, step_loss=0.00086]Steps: 27%|██▋ | 531/2000 [11:16<18:56, 1.29it/s, lr=0.0001, step_loss=0.00086]Steps: 27%|██▋ | 531/2000 [11:16<18:56, 1.29it/s, lr=0.0001, step_loss=0.203] Steps: 27%|██▋ | 532/2000 [11:16<18:54, 1.29it/s, lr=0.0001, step_loss=0.203]Steps: 27%|██▋ | 532/2000 [11:16<18:54, 1.29it/s, lr=0.0001, step_loss=0.0264]Steps: 27%|██▋ | 533/2000 [11:17<18:51, 1.30it/s, lr=0.0001, step_loss=0.0264]Steps: 27%|██▋ | 533/2000 [11:17<18:51, 1.30it/s, lr=0.0001, step_loss=0.00207]Steps: 27%|██▋ | 534/2000 [11:18<18:49, 1.30it/s, lr=0.0001, step_loss=0.00207]Steps: 27%|██▋ | 534/2000 [11:18<18:49, 1.30it/s, lr=0.0001, step_loss=0.205] Steps: 27%|██▋ | 535/2000 [11:19<18:48, 1.30it/s, lr=0.0001, step_loss=0.205]Steps: 27%|██▋ | 535/2000 [11:19<18:48, 1.30it/s, lr=0.0001, step_loss=0.00069]Steps: 27%|██▋ | 536/2000 [11:19<18:48, 1.30it/s, lr=0.0001, step_loss=0.00069]Steps: 27%|██▋ | 536/2000 [11:20<18:48, 1.30it/s, lr=0.0001, step_loss=0.00473]Steps: 27%|██▋ | 537/2000 [11:20<18:47, 1.30it/s, lr=0.0001, step_loss=0.00473]Steps: 27%|██▋ | 537/2000 [11:20<18:47, 1.30it/s, lr=0.0001, step_loss=0.00386]Steps: 27%|██▋ | 538/2000 [11:21<18:45, 1.30it/s, lr=0.0001, 
step_loss=0.00386]Steps: 27%|██▋ | 538/2000 [11:21<18:45, 1.30it/s, lr=0.0001, step_loss=0.0101] Steps: 27%|██▋ | 539/2000 [11:22<18:45, 1.30it/s, lr=0.0001, step_loss=0.0101]Steps: 27%|██▋ | 539/2000 [11:22<18:45, 1.30it/s, lr=0.0001, step_loss=0.0707]Steps: 27%|██▋ | 540/2000 [11:23<18:43, 1.30it/s, lr=0.0001, step_loss=0.0707]Steps: 27%|██▋ | 540/2000 [11:23<18:43, 1.30it/s, lr=0.0001, step_loss=0.0161]Steps: 27%|██▋ | 541/2000 [11:23<18:42, 1.30it/s, lr=0.0001, step_loss=0.0161]Steps: 27%|██▋ | 541/2000 [11:23<18:42, 1.30it/s, lr=0.0001, step_loss=0.0481]Steps: 27%|██▋ | 542/2000 [11:24<18:41, 1.30it/s, lr=0.0001, step_loss=0.0481]Steps: 27%|██▋ | 542/2000 [11:24<18:41, 1.30it/s, lr=0.0001, step_loss=0.00429]Steps: 27%|██▋ | 543/2000 [11:25<18:40, 1.30it/s, lr=0.0001, step_loss=0.00429]Steps: 27%|██▋ | 543/2000 [11:25<18:40, 1.30it/s, lr=0.0001, step_loss=0.0311] Steps: 27%|██▋ | 544/2000 [11:26<18:39, 1.30it/s, lr=0.0001, step_loss=0.0311]11/14/2025 06:20:13 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 544)
Steps: 27%|██▋ | 544/2000 [11:33<18:39, 1.30it/s, lr=0.0001, step_loss=0.0193]11/14/2025 06:20:13 - INFO - root - ### DEBUG: Finished epoch 16, epoch_steps=32, global_step=544
11/14/2025 06:20:13 - INFO - root - ### DEBUG: Starting epoch 17/63, global_step=544, max_train_steps=2000
Steps: 27%|██▋ | 545/2000 [11:34<1:13:48, 3.04s/it, lr=0.0001, step_loss=0.0193]Steps: 27%|██▋ | 545/2000 [11:34<1:13:48, 3.04s/it, lr=0.0001, step_loss=0.00773]Steps: 27%|██▋ | 546/2000 [11:35<57:14, 2.36s/it, lr=0.0001, step_loss=0.00773] Steps: 27%|██▋ | 546/2000 [11:35<57:14, 2.36s/it, lr=0.0001, step_loss=0.00253]Steps: 27%|██▋ | 547/2000 [11:36<45:37, 1.88s/it, lr=0.0001, step_loss=0.00253]Steps: 27%|██▋ | 547/2000 [11:36<45:37, 1.88s/it, lr=0.0001, step_loss=0.0513] Steps: 27%|██▋ | 548/2000 [11:36<37:30, 1.55s/it, lr=0.0001, step_loss=0.0513]Steps: 27%|██▋ | 548/2000 [11:36<37:30, 1.55s/it, lr=0.0001, step_loss=0.00109]Steps: 27%|██▋ | 549/2000 [11:37<31:48, 1.32s/it, lr=0.0001, step_loss=0.00109]Steps: 27%|██▋ | 549/2000 [11:37<31:48, 1.32s/it, lr=0.0001, step_loss=0.00161]Steps: 28%|██▊ | 550/2000 [11:38<27:49, 1.15s/it, lr=0.0001, step_loss=0.00161]Steps: 28%|██▊ | 550/2000 [11:38<27:49, 1.15s/it, lr=0.0001, step_loss=0.000859]Steps: 28%|██▊ | 551/2000 [11:39<25:02, 1.04s/it, lr=0.0001, step_loss=0.000859]Steps: 28%|██▊ | 551/2000 [11:39<25:02, 1.04s/it, lr=0.0001, step_loss=0.0496] Steps: 28%|██▊ | 552/2000 [11:39<23:06, 1.04it/s, lr=0.0001, step_loss=0.0496]Steps: 28%|██▊ | 552/2000 [11:39<23:06, 1.04it/s, lr=0.0001, step_loss=0.002] Steps: 28%|██▊ | 553/2000 [11:40<21:44, 1.11it/s, lr=0.0001, step_loss=0.002]Steps: 28%|██▊ | 553/2000 [11:40<21:44, 1.11it/s, lr=0.0001, step_loss=0.0204]Steps: 28%|██▊ | 554/2000 [11:41<20:46, 1.16it/s, lr=0.0001, step_loss=0.0204]Steps: 28%|██▊ | 554/2000 [11:41<20:46, 1.16it/s, lr=0.0001, step_loss=0.0271]Steps: 28%|██▊ | 555/2000 [11:42<20:05, 1.20it/s, lr=0.0001, step_loss=0.0271]Steps: 28%|██▊ | 555/2000 [11:42<20:05, 1.20it/s, lr=0.0001, step_loss=0.0453]Steps: 28%|██▊ | 556/2000 [11:42<19:36, 1.23it/s, lr=0.0001, step_loss=0.0453]Steps: 28%|██▊ | 556/2000 [11:42<19:36, 1.23it/s, lr=0.0001, step_loss=0.0154]Steps: 28%|██▊ | 557/2000 [11:43<19:17, 1.25it/s, lr=0.0001, step_loss=0.0154]Steps: 28%|██▊ | 557/2000 
[11:43<19:17, 1.25it/s, lr=0.0001, step_loss=0.18] Steps: 28%|██▊ | 558/2000 [11:44<19:02, 1.26it/s, lr=0.0001, step_loss=0.18]Steps: 28%|██▊ | 558/2000 [11:44<19:02, 1.26it/s, lr=0.0001, step_loss=0.0409]Steps: 28%|██▊ | 559/2000 [11:45<18:51, 1.27it/s, lr=0.0001, step_loss=0.0409]Steps: 28%|██▊ | 559/2000 [11:45<18:51, 1.27it/s, lr=0.0001, step_loss=0.00103]Steps: 28%|██▊ | 560/2000 [11:46<18:43, 1.28it/s, lr=0.0001, step_loss=0.00103]Steps: 28%|██▊ | 560/2000 [11:46<18:43, 1.28it/s, lr=0.0001, step_loss=0.0153] Steps: 28%|██▊ | 561/2000 [11:46<18:38, 1.29it/s, lr=0.0001, step_loss=0.0153]Steps: 28%|██▊ | 561/2000 [11:46<18:38, 1.29it/s, lr=0.0001, step_loss=0.0161]Steps: 28%|██▊ | 562/2000 [11:47<18:35, 1.29it/s, lr=0.0001, step_loss=0.0161]Steps: 28%|██▊ | 562/2000 [11:47<18:35, 1.29it/s, lr=0.0001, step_loss=0.0139]Steps: 28%|██▊ | 563/2000 [11:48<18:31, 1.29it/s, lr=0.0001, step_loss=0.0139]Steps: 28%|██▊ | 563/2000 [11:48<18:31, 1.29it/s, lr=0.0001, step_loss=0.00125]Steps: 28%|██▊ | 564/2000 [11:49<18:29, 1.29it/s, lr=0.0001, step_loss=0.00125]Steps: 28%|██▊ | 564/2000 [11:49<18:29, 1.29it/s, lr=0.0001, step_loss=0.0212] Steps: 28%|██▊ | 565/2000 [11:49<18:27, 1.30it/s, lr=0.0001, step_loss=0.0212]Steps: 28%|██▊ | 565/2000 [11:49<18:27, 1.30it/s, lr=0.0001, step_loss=0.00399]Steps: 28%|██▊ | 566/2000 [11:50<18:25, 1.30it/s, lr=0.0001, step_loss=0.00399]Steps: 28%|██▊ | 566/2000 [11:50<18:25, 1.30it/s, lr=0.0001, step_loss=0.00526]Steps: 28%|██▊ | 567/2000 [11:51<18:24, 1.30it/s, lr=0.0001, step_loss=0.00526]Steps: 28%|██▊ | 567/2000 [11:51<18:24, 1.30it/s, lr=0.0001, step_loss=0.00834]Steps: 28%|██▊ | 568/2000 [11:52<18:25, 1.30it/s, lr=0.0001, step_loss=0.00834]Steps: 28%|██▊ | 568/2000 [11:52<18:25, 1.30it/s, lr=0.0001, step_loss=0.0496] Steps: 28%|██▊ | 569/2000 [11:52<18:25, 1.29it/s, lr=0.0001, step_loss=0.0496]Steps: 28%|██▊ | 569/2000 [11:53<18:25, 1.29it/s, lr=0.0001, step_loss=0.0513]Steps: 28%|██▊ | 570/2000 [11:53<18:22, 1.30it/s, lr=0.0001, 
step_loss=0.0513]Steps: 28%|██▊ | 570/2000 [11:53<18:22, 1.30it/s, lr=0.0001, step_loss=0.00508]Steps: 29%|██▊ | 571/2000 [11:54<18:21, 1.30it/s, lr=0.0001, step_loss=0.00508]Steps: 29%|██▊ | 571/2000 [11:54<18:21, 1.30it/s, lr=0.0001, step_loss=0.0259] Steps: 29%|██▊ | 572/2000 [11:55<18:21, 1.30it/s, lr=0.0001, step_loss=0.0259]Steps: 29%|██▊ | 572/2000 [11:55<18:21, 1.30it/s, lr=0.0001, step_loss=0.0109]Steps: 29%|██▊ | 573/2000 [11:56<18:21, 1.30it/s, lr=0.0001, step_loss=0.0109]Steps: 29%|██▊ | 573/2000 [11:56<18:21, 1.30it/s, lr=0.0001, step_loss=0.0898]Steps: 29%|██▊ | 574/2000 [11:56<18:21, 1.30it/s, lr=0.0001, step_loss=0.0898]Steps: 29%|██▊ | 574/2000 [11:56<18:21, 1.30it/s, lr=0.0001, step_loss=0.0108]Steps: 29%|██▉ | 575/2000 [11:57<18:18, 1.30it/s, lr=0.0001, step_loss=0.0108]Steps: 29%|██▉ | 575/2000 [11:57<18:18, 1.30it/s, lr=0.0001, step_loss=0.000769]Steps: 29%|██▉ | 576/2000 [11:58<18:17, 1.30it/s, lr=0.0001, step_loss=0.000769]11/14/2025 06:20:45 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 576)
Steps: 29%|██▉ | 576/2000 [12:05<18:17, 1.30it/s, lr=0.0001, step_loss=0.00223] 11/14/2025 06:20:45 - INFO - root - ### DEBUG: Finished epoch 17, epoch_steps=32, global_step=576
11/14/2025 06:20:45 - INFO - root - ### DEBUG: Starting epoch 18/63, global_step=576, max_train_steps=2000
Steps: 29%|██▉ | 577/2000 [12:06<1:12:12, 3.04s/it, lr=0.0001, step_loss=0.00223]Steps: 29%|██▉ | 577/2000 [12:06<1:12:12, 3.04s/it, lr=0.0001, step_loss=0.0437] Steps: 29%|██▉ | 578/2000 [12:07<55:59, 2.36s/it, lr=0.0001, step_loss=0.0437] Steps: 29%|██▉ | 578/2000 [12:07<55:59, 2.36s/it, lr=0.0001, step_loss=0.00303]Steps: 29%|██▉ | 579/2000 [12:08<44:37, 1.88s/it, lr=0.0001, step_loss=0.00303]Steps: 29%|██▉ | 579/2000 [12:08<44:37, 1.88s/it, lr=0.0001, step_loss=0.000592]Steps: 29%|██▉ | 580/2000 [12:09<36:40, 1.55s/it, lr=0.0001, step_loss=0.000592]Steps: 29%|██▉ | 580/2000 [12:09<36:40, 1.55s/it, lr=0.0001, step_loss=0.134] Steps: 29%|██▉ | 581/2000 [12:09<31:07, 1.32s/it, lr=0.0001, step_loss=0.134]Steps: 29%|██▉ | 581/2000 [12:09<31:07, 1.32s/it, lr=0.0001, step_loss=0.114]Steps: 29%|██▉ | 582/2000 [12:10<27:13, 1.15s/it, lr=0.0001, step_loss=0.114]Steps: 29%|██▉ | 582/2000 [12:10<27:13, 1.15s/it, lr=0.0001, step_loss=0.0492]Steps: 29%|██▉ | 583/2000 [12:11<24:30, 1.04s/it, lr=0.0001, step_loss=0.0492]Steps: 29%|██▉ | 583/2000 [12:11<24:30, 1.04s/it, lr=0.0001, step_loss=0.0203]Steps: 29%|██▉ | 584/2000 [12:12<22:36, 1.04it/s, lr=0.0001, step_loss=0.0203]Steps: 29%|██▉ | 584/2000 [12:12<22:36, 1.04it/s, lr=0.0001, step_loss=0.00778]Steps: 29%|██▉ | 585/2000 [12:12<21:14, 1.11it/s, lr=0.0001, step_loss=0.00778]Steps: 29%|██▉ | 585/2000 [12:12<21:14, 1.11it/s, lr=0.0001, step_loss=0.000575]Steps: 29%|██▉ | 586/2000 [12:13<20:18, 1.16it/s, lr=0.0001, step_loss=0.000575]Steps: 29%|██▉ | 586/2000 [12:13<20:18, 1.16it/s, lr=0.0001, step_loss=0.0997] Steps: 29%|██▉ | 587/2000 [12:14<19:39, 1.20it/s, lr=0.0001, step_loss=0.0997]Steps: 29%|██▉ | 587/2000 [12:14<19:39, 1.20it/s, lr=0.0001, step_loss=0.0696]Steps: 29%|██▉ | 588/2000 [12:15<19:10, 1.23it/s, lr=0.0001, step_loss=0.0696]Steps: 29%|██▉ | 588/2000 [12:15<19:10, 1.23it/s, lr=0.0001, step_loss=0.00126]Steps: 29%|██▉ | 589/2000 [12:15<18:50, 1.25it/s, lr=0.0001, step_loss=0.00126]Steps: 29%|██▉ | 589/2000 
[12:15<18:50, 1.25it/s, lr=0.0001, step_loss=0.07] Steps: 30%|██▉ | 590/2000 [12:16<18:35, 1.26it/s, lr=0.0001, step_loss=0.07]Steps: 30%|██▉ | 590/2000 [12:16<18:35, 1.26it/s, lr=0.0001, step_loss=0.177]Steps: 30%|██▉ | 591/2000 [12:17<18:26, 1.27it/s, lr=0.0001, step_loss=0.177]Steps: 30%|██▉ | 591/2000 [12:17<18:26, 1.27it/s, lr=0.0001, step_loss=0.0743]Steps: 30%|██▉ | 592/2000 [12:18<18:20, 1.28it/s, lr=0.0001, step_loss=0.0743]Steps: 30%|██▉ | 592/2000 [12:18<18:20, 1.28it/s, lr=0.0001, step_loss=0.0012]Steps: 30%|██▉ | 593/2000 [12:19<18:14, 1.29it/s, lr=0.0001, step_loss=0.0012]Steps: 30%|██▉ | 593/2000 [12:19<18:14, 1.29it/s, lr=0.0001, step_loss=0.0182]Steps: 30%|██▉ | 594/2000 [12:19<18:11, 1.29it/s, lr=0.0001, step_loss=0.0182]Steps: 30%|██▉ | 594/2000 [12:19<18:11, 1.29it/s, lr=0.0001, step_loss=0.00501]Steps: 30%|██▉ | 595/2000 [12:20<18:07, 1.29it/s, lr=0.0001, step_loss=0.00501]Steps: 30%|██▉ | 595/2000 [12:20<18:07, 1.29it/s, lr=0.0001, step_loss=0.0149] Steps: 30%|██▉ | 596/2000 [12:21<18:06, 1.29it/s, lr=0.0001, step_loss=0.0149]Steps: 30%|██▉ | 596/2000 [12:21<18:06, 1.29it/s, lr=0.0001, step_loss=0.00904]Steps: 30%|██▉ | 597/2000 [12:22<18:03, 1.29it/s, lr=0.0001, step_loss=0.00904]Steps: 30%|██▉ | 597/2000 [12:22<18:03, 1.29it/s, lr=0.0001, step_loss=0.0814] Steps: 30%|██▉ | 598/2000 [12:22<18:01, 1.30it/s, lr=0.0001, step_loss=0.0814]Steps: 30%|██▉ | 598/2000 [12:22<18:01, 1.30it/s, lr=0.0001, step_loss=0.0235]Steps: 30%|██▉ | 599/2000 [12:23<17:59, 1.30it/s, lr=0.0001, step_loss=0.0235]Steps: 30%|██▉ | 599/2000 [12:23<17:59, 1.30it/s, lr=0.0001, step_loss=0.0439]Steps: 30%|███ | 600/2000 [12:24<17:58, 1.30it/s, lr=0.0001, step_loss=0.0439]
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.70it/s]
8%|▊ | 2/25 [00:01<00:13, 1.70it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:10, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]
44%|████▍ | 11/25 [00:06<00:08, 1.70it/s]
48%|████▊ | 12/25 [00:07<00:07, 1.70it/s]
52%|█████▏ | 13/25 [00:07<00:07, 1.70it/s]
56%|█████▌ | 14/25 [00:08<00:06, 1.70it/s]
60%|██████ | 15/25 [00:08<00:05, 1.70it/s]
64%|██████▍ | 16/25 [00:09<00:05, 1.70it/s]
68%|██████▊ | 17/25 [00:10<00:04, 1.70it/s]
72%|███████▏ | 18/25 [00:10<00:04, 1.70it/s]
76%|███████▌ | 19/25 [00:11<00:03, 1.70it/s]
80%|████████ | 20/25 [00:11<00:02, 1.70it/s]
84%|████████▍ | 21/25 [00:12<00:02, 1.70it/s]
88%|████████▊ | 22/25 [00:12<00:01, 1.70it/s]
92%|█████████▏| 23/25 [00:13<00:01, 1.70it/s]
96%|█████████▌| 24/25 [00:14<00:00, 1.70it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
0%| | 0/8 [00:00<?, ?it/s]
75%|███████▌ | 6/8 [00:00<00:00, 44.01it/s]
100%|██████████| 8/8 [00:00<00:00, 32.18it/s]
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.69it/s]
8%|▊ | 2/25 [00:01<00:13, 1.69it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:10, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]
44%|████▍ | 11/25 [00:06<00:08, 1.70it/s]
48%|████▊ | 12/25 [00:07<00:07, 1.70it/s]
52%|█████▏ | 13/25 [00:07<00:07, 1.70it/s]
56%|█████▌ | 14/25 [00:08<00:06, 1.70it/s]
60%|██████ | 15/25 [00:08<00:05, 1.70it/s]
64%|██████▍ | 16/25 [00:09<00:05, 1.70it/s]
68%|██████▊ | 17/25 [00:10<00:04, 1.70it/s]
72%|███████▏ | 18/25 [00:10<00:04, 1.70it/s]
76%|███████▌ | 19/25 [00:11<00:03, 1.70it/s]
80%|████████ | 20/25 [00:11<00:02, 1.70it/s]
84%|████████▍ | 21/25 [00:12<00:02, 1.70it/s]
88%|████████▊ | 22/25 [00:12<00:01, 1.70it/s]
92%|█████████▏| 23/25 [00:13<00:01, 1.70it/s]
96%|█████████▌| 24/25 [00:14<00:00, 1.70it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
0%| | 0/8 [00:00<?, ?it/s]
75%|███████▌ | 6/8 [00:00<00:00, 44.02it/s]
100%|██████████| 8/8 [00:00<00:00, 32.19it/s]
11/14/2025 06:21:36 - INFO - root - Saved samples to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/samples/sample-600.gif
Steps: 30%|███ | 600/2000 [12:56<17:58, 1.30it/s, lr=0.0001, step_loss=0.0102]Steps: 30%|███ | 601/2000 [12:57<4:03:29, 10.44s/it, lr=0.0001, step_loss=0.0102]Steps: 30%|███ | 601/2000 [12:57<4:03:29, 10.44s/it, lr=0.0001, step_loss=0.0875]Steps: 30%|███ | 602/2000 [12:58<2:55:42, 7.54s/it, lr=0.0001, step_loss=0.0875]Steps: 30%|███ | 602/2000 [12:58<2:55:42, 7.54s/it, lr=0.0001, step_loss=0.0167]Steps: 30%|███ | 603/2000 [12:58<2:08:16, 5.51s/it, lr=0.0001, step_loss=0.0167]Steps: 30%|███ | 603/2000 [12:59<2:08:16, 5.51s/it, lr=0.0001, step_loss=0.0664]Steps: 30%|███ | 604/2000 [12:59<1:35:08, 4.09s/it, lr=0.0001, step_loss=0.0664]Steps: 30%|███ | 604/2000 [12:59<1:35:08, 4.09s/it, lr=0.0001, step_loss=0.0698]Steps: 30%|███ | 605/2000 [13:00<1:11:54, 3.09s/it, lr=0.0001, step_loss=0.0698]Steps: 30%|███ | 605/2000 [13:00<1:11:54, 3.09s/it, lr=0.0001, step_loss=0.00485]Steps: 30%|███ | 606/2000 [13:01<55:40, 2.40s/it, lr=0.0001, step_loss=0.00485] Steps: 30%|███ | 606/2000 [13:01<55:40, 2.40s/it, lr=0.0001, step_loss=0.0792] Steps: 30%|███ | 607/2000 [13:02<44:20, 1.91s/it, lr=0.0001, step_loss=0.0792]Steps: 30%|███ | 607/2000 [13:02<44:20, 1.91s/it, lr=0.0001, step_loss=0.00228]Steps: 30%|███ | 608/2000 [13:02<36:24, 1.57s/it, lr=0.0001, step_loss=0.00228]11/14/2025 06:21:50 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 608)
Steps: 30%|███ | 608/2000 [13:10<36:24, 1.57s/it, lr=0.0001, step_loss=0.14] 11/14/2025 06:21:50 - INFO - root - ### DEBUG: Finished epoch 18, epoch_steps=32, global_step=608
11/14/2025 06:21:50 - INFO - root - ### DEBUG: Starting epoch 19/63, global_step=608, max_train_steps=2000
Steps: 30%|███ | 609/2000 [13:11<1:26:31, 3.73s/it, lr=0.0001, step_loss=0.14]Steps: 30%|███ | 609/2000 [13:11<1:26:31, 3.73s/it, lr=0.0001, step_loss=0.00146]Steps: 30%|███ | 610/2000 [13:12<1:05:52, 2.84s/it, lr=0.0001, step_loss=0.00146]Steps: 30%|███ | 610/2000 [13:12<1:05:52, 2.84s/it, lr=0.0001, step_loss=0.009] Steps: 31%|███ | 611/2000 [13:13<51:26, 2.22s/it, lr=0.0001, step_loss=0.009] Steps: 31%|███ | 611/2000 [13:13<51:26, 2.22s/it, lr=0.0001, step_loss=0.0384]Steps: 31%|███ | 612/2000 [13:13<41:20, 1.79s/it, lr=0.0001, step_loss=0.0384]Steps: 31%|███ | 612/2000 [13:13<41:20, 1.79s/it, lr=0.0001, step_loss=0.000672]Steps: 31%|███ | 613/2000 [13:14<34:16, 1.48s/it, lr=0.0001, step_loss=0.000672]Steps: 31%|███ | 613/2000 [13:14<34:16, 1.48s/it, lr=0.0001, step_loss=0.0154] Steps: 31%|███ | 614/2000 [13:15<29:17, 1.27s/it, lr=0.0001, step_loss=0.0154]Steps: 31%|███ | 614/2000 [13:15<29:17, 1.27s/it, lr=0.0001, step_loss=0.018] Steps: 31%|███ | 615/2000 [13:16<25:50, 1.12s/it, lr=0.0001, step_loss=0.018]Steps: 31%|███ | 615/2000 [13:16<25:50, 1.12s/it, lr=0.0001, step_loss=0.157]Steps: 31%|███ | 616/2000 [13:17<23:25, 1.02s/it, lr=0.0001, step_loss=0.157]Steps: 31%|███ | 616/2000 [13:17<23:25, 1.02s/it, lr=0.0001, step_loss=0.000581]Steps: 31%|███ | 617/2000 [13:17<21:42, 1.06it/s, lr=0.0001, step_loss=0.000581]Steps: 31%|███ | 617/2000 [13:17<21:42, 1.06it/s, lr=0.0001, step_loss=0.0847] Steps: 31%|███ | 618/2000 [13:18<20:31, 1.12it/s, lr=0.0001, step_loss=0.0847]Steps: 31%|███ | 618/2000 [13:18<20:31, 1.12it/s, lr=0.0001, step_loss=0.00058]Steps: 31%|███ | 619/2000 [13:19<19:40, 1.17it/s, lr=0.0001, step_loss=0.00058]Steps: 31%|███ | 619/2000 [13:19<19:40, 1.17it/s, lr=0.0001, step_loss=0.0211] Steps: 31%|███ | 620/2000 [13:20<19:04, 1.21it/s, lr=0.0001, step_loss=0.0211]Steps: 31%|███ | 620/2000 [13:20<19:04, 1.21it/s, lr=0.0001, step_loss=0.0168]Steps: 31%|███ | 621/2000 [13:20<18:39, 1.23it/s, lr=0.0001, step_loss=0.0168]Steps: 31%|███ | 621/2000 
[13:20<18:39, 1.23it/s, lr=0.0001, step_loss=0.0845]Steps: 31%|███ | 622/2000 [13:21<18:20, 1.25it/s, lr=0.0001, step_loss=0.0845]Steps: 31%|███ | 622/2000 [13:21<18:20, 1.25it/s, lr=0.0001, step_loss=0.0048]Steps: 31%|███ | 623/2000 [13:22<18:08, 1.26it/s, lr=0.0001, step_loss=0.0048]Steps: 31%|███ | 623/2000 [13:22<18:08, 1.26it/s, lr=0.0001, step_loss=0.0039]Steps: 31%|███ | 624/2000 [13:23<17:58, 1.28it/s, lr=0.0001, step_loss=0.0039]Steps: 31%|███ | 624/2000 [13:23<17:58, 1.28it/s, lr=0.0001, step_loss=0.00736]Steps: 31%|███▏ | 625/2000 [13:23<17:52, 1.28it/s, lr=0.0001, step_loss=0.00736]Steps: 31%|███▏ | 625/2000 [13:23<17:52, 1.28it/s, lr=0.0001, step_loss=0.00913]Steps: 31%|███▏ | 626/2000 [13:24<17:48, 1.29it/s, lr=0.0001, step_loss=0.00913]Steps: 31%|███▏ | 626/2000 [13:24<17:48, 1.29it/s, lr=0.0001, step_loss=0.0538] Steps: 31%|███▏ | 627/2000 [13:25<17:43, 1.29it/s, lr=0.0001, step_loss=0.0538]Steps: 31%|███▏ | 627/2000 [13:25<17:43, 1.29it/s, lr=0.0001, step_loss=0.00482]Steps: 31%|███▏ | 628/2000 [13:26<17:40, 1.29it/s, lr=0.0001, step_loss=0.00482]Steps: 31%|███▏ | 628/2000 [13:26<17:40, 1.29it/s, lr=0.0001, step_loss=0.00108]Steps: 31%|███▏ | 629/2000 [13:27<17:39, 1.29it/s, lr=0.0001, step_loss=0.00108]Steps: 31%|███▏ | 629/2000 [13:27<17:39, 1.29it/s, lr=0.0001, step_loss=0.00631]Steps: 32%|███▏ | 630/2000 [13:27<17:38, 1.29it/s, lr=0.0001, step_loss=0.00631]Steps: 32%|███▏ | 630/2000 [13:27<17:38, 1.29it/s, lr=0.0001, step_loss=0.0052] Steps: 32%|███▏ | 631/2000 [13:28<17:37, 1.29it/s, lr=0.0001, step_loss=0.0052]Steps: 32%|███▏ | 631/2000 [13:28<17:37, 1.29it/s, lr=0.0001, step_loss=0.0612]Steps: 32%|███▏ | 632/2000 [13:29<17:35, 1.30it/s, lr=0.0001, step_loss=0.0612]Steps: 32%|███▏ | 632/2000 [13:29<17:35, 1.30it/s, lr=0.0001, step_loss=0.0159]Steps: 32%|███▏ | 633/2000 [13:30<17:35, 1.30it/s, lr=0.0001, step_loss=0.0159]Steps: 32%|███▏ | 633/2000 [13:30<17:35, 1.30it/s, lr=0.0001, step_loss=0.000538]Steps: 32%|███▏ | 634/2000 [13:30<17:33, 
1.30it/s, lr=0.0001, step_loss=0.000538]Steps: 32%|███▏ | 634/2000 [13:30<17:33, 1.30it/s, lr=0.0001, step_loss=0.00119] Steps: 32%|███▏ | 635/2000 [13:31<17:32, 1.30it/s, lr=0.0001, step_loss=0.00119]Steps: 32%|███▏ | 635/2000 [13:31<17:32, 1.30it/s, lr=0.0001, step_loss=0.00342]Steps: 32%|███▏ | 636/2000 [13:32<17:32, 1.30it/s, lr=0.0001, step_loss=0.00342]Steps: 32%|███▏ | 636/2000 [13:32<17:32, 1.30it/s, lr=0.0001, step_loss=0.000453]Steps: 32%|███▏ | 637/2000 [13:33<17:31, 1.30it/s, lr=0.0001, step_loss=0.000453]Steps: 32%|███▏ | 637/2000 [13:33<17:31, 1.30it/s, lr=0.0001, step_loss=0.012] Steps: 32%|███▏ | 638/2000 [13:33<17:30, 1.30it/s, lr=0.0001, step_loss=0.012]Steps: 32%|███▏ | 638/2000 [13:34<17:30, 1.30it/s, lr=0.0001, step_loss=0.162]Steps: 32%|███▏ | 639/2000 [13:34<17:28, 1.30it/s, lr=0.0001, step_loss=0.162]Steps: 32%|███▏ | 639/2000 [13:34<17:28, 1.30it/s, lr=0.0001, step_loss=0.0625]Steps: 32%|███▏ | 640/2000 [13:35<17:27, 1.30it/s, lr=0.0001, step_loss=0.0625]11/14/2025 06:22:23 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 640)
Steps: 32%|███▏ | 640/2000 [13:43<17:27, 1.30it/s, lr=0.0001, step_loss=0.21] 11/14/2025 06:22:23 - INFO - root - ### DEBUG: Finished epoch 19, epoch_steps=32, global_step=640
11/14/2025 06:22:23 - INFO - root - ### DEBUG: Starting epoch 20/63, global_step=640, max_train_steps=2000
Steps: 32%|███▏ | 641/2000 [13:44<1:11:14, 3.15s/it, lr=0.0001, step_loss=0.21]Steps: 32%|███▏ | 641/2000 [13:44<1:11:14, 3.15s/it, lr=0.0001, step_loss=0.0799]Steps: 32%|███▏ | 642/2000 [13:44<55:04, 2.43s/it, lr=0.0001, step_loss=0.0799] Steps: 32%|███▏ | 642/2000 [13:45<55:04, 2.43s/it, lr=0.0001, step_loss=0.0358]Steps: 32%|███▏ | 643/2000 [13:45<43:44, 1.93s/it, lr=0.0001, step_loss=0.0358]Steps: 32%|███▏ | 643/2000 [13:45<43:44, 1.93s/it, lr=0.0001, step_loss=0.00542]Steps: 32%|███▏ | 644/2000 [13:46<35:47, 1.58s/it, lr=0.0001, step_loss=0.00542]Steps: 32%|███▏ | 644/2000 [13:46<35:47, 1.58s/it, lr=0.0001, step_loss=0.094] Steps: 32%|███▏ | 645/2000 [13:47<30:15, 1.34s/it, lr=0.0001, step_loss=0.094]Steps: 32%|███▏ | 645/2000 [13:47<30:15, 1.34s/it, lr=0.0001, step_loss=0.00546]Steps: 32%|███▏ | 646/2000 [13:48<26:22, 1.17s/it, lr=0.0001, step_loss=0.00546]Steps: 32%|███▏ | 646/2000 [13:48<26:22, 1.17s/it, lr=0.0001, step_loss=0.00988]Steps: 32%|███▏ | 647/2000 [13:48<23:39, 1.05s/it, lr=0.0001, step_loss=0.00988]Steps: 32%|███▏ | 647/2000 [13:48<23:39, 1.05s/it, lr=0.0001, step_loss=0.0805] Steps: 32%|███▏ | 648/2000 [13:49<21:44, 1.04it/s, lr=0.0001, step_loss=0.0805]Steps: 32%|███▏ | 648/2000 [13:49<21:44, 1.04it/s, lr=0.0001, step_loss=0.075] Steps: 32%|███▏ | 649/2000 [13:50<20:25, 1.10it/s, lr=0.0001, step_loss=0.075]Steps: 32%|███▏ | 649/2000 [13:50<20:25, 1.10it/s, lr=0.0001, step_loss=0.00314]Steps: 32%|███▎ | 650/2000 [13:51<19:29, 1.15it/s, lr=0.0001, step_loss=0.00314]Steps: 32%|███▎ | 650/2000 [13:51<19:29, 1.15it/s, lr=0.0001, step_loss=0.0181] Steps: 33%|███▎ | 651/2000 [13:51<18:49, 1.19it/s, lr=0.0001, step_loss=0.0181]Steps: 33%|███▎ | 651/2000 [13:51<18:49, 1.19it/s, lr=0.0001, step_loss=0.044] Steps: 33%|███▎ | 652/2000 [13:52<18:22, 1.22it/s, lr=0.0001, step_loss=0.044]Steps: 33%|███▎ | 652/2000 [13:52<18:22, 1.22it/s, lr=0.0001, step_loss=0.0109]Steps: 33%|███▎ | 653/2000 [13:53<18:02, 1.24it/s, lr=0.0001, step_loss=0.0109]Steps: 
33%|███▎ | 653/2000 [13:53<18:02, 1.24it/s, lr=0.0001, step_loss=0.329] Steps: 33%|███▎ | 654/2000 [13:54<17:48, 1.26it/s, lr=0.0001, step_loss=0.329]Steps: 33%|███▎ | 654/2000 [13:54<17:48, 1.26it/s, lr=0.0001, step_loss=0.000611]Steps: 33%|███▎ | 655/2000 [13:54<17:37, 1.27it/s, lr=0.0001, step_loss=0.000611]Steps: 33%|███▎ | 655/2000 [13:55<17:37, 1.27it/s, lr=0.0001, step_loss=0.00632] Steps: 33%|███▎ | 656/2000 [13:55<17:29, 1.28it/s, lr=0.0001, step_loss=0.00632]Steps: 33%|███▎ | 656/2000 [13:55<17:29, 1.28it/s, lr=0.0001, step_loss=0.00081]Steps: 33%|███▎ | 657/2000 [13:56<17:25, 1.28it/s, lr=0.0001, step_loss=0.00081]Steps: 33%|███▎ | 657/2000 [13:56<17:25, 1.28it/s, lr=0.0001, step_loss=0.237] Steps: 33%|███▎ | 658/2000 [13:57<17:22, 1.29it/s, lr=0.0001, step_loss=0.237]Steps: 33%|███▎ | 658/2000 [13:57<17:22, 1.29it/s, lr=0.0001, step_loss=0.0845]Steps: 33%|███▎ | 659/2000 [13:58<17:19, 1.29it/s, lr=0.0001, step_loss=0.0845]Steps: 33%|███▎ | 659/2000 [13:58<17:19, 1.29it/s, lr=0.0001, step_loss=0.0307]Steps: 33%|███▎ | 660/2000 [13:58<17:16, 1.29it/s, lr=0.0001, step_loss=0.0307]Steps: 33%|███▎ | 660/2000 [13:58<17:16, 1.29it/s, lr=0.0001, step_loss=0.0298]Steps: 33%|███▎ | 661/2000 [13:59<17:14, 1.29it/s, lr=0.0001, step_loss=0.0298]Steps: 33%|███▎ | 661/2000 [13:59<17:14, 1.29it/s, lr=0.0001, step_loss=0.000845]Steps: 33%|███▎ | 662/2000 [14:00<17:13, 1.30it/s, lr=0.0001, step_loss=0.000845]Steps: 33%|███▎ | 662/2000 [14:00<17:13, 1.30it/s, lr=0.0001, step_loss=0.012] Steps: 33%|███▎ | 663/2000 [14:01<17:10, 1.30it/s, lr=0.0001, step_loss=0.012]Steps: 33%|███▎ | 663/2000 [14:01<17:10, 1.30it/s, lr=0.0001, step_loss=0.0081]Steps: 33%|███▎ | 664/2000 [14:01<17:10, 1.30it/s, lr=0.0001, step_loss=0.0081]Steps: 33%|███▎ | 664/2000 [14:01<17:10, 1.30it/s, lr=0.0001, step_loss=0.0126]Steps: 33%|███▎ | 665/2000 [14:02<17:09, 1.30it/s, lr=0.0001, step_loss=0.0126]Steps: 33%|███▎ | 665/2000 [14:02<17:09, 1.30it/s, lr=0.0001, step_loss=0.156] Steps: 33%|███▎ | 
666/2000 [14:03<17:09, 1.30it/s, lr=0.0001, step_loss=0.156]Steps: 33%|███▎ | 666/2000 [14:03<17:09, 1.30it/s, lr=0.0001, step_loss=0.0172]Steps: 33%|███▎ | 667/2000 [14:04<17:08, 1.30it/s, lr=0.0001, step_loss=0.0172]Steps: 33%|███▎ | 667/2000 [14:04<17:08, 1.30it/s, lr=0.0001, step_loss=0.000528]Steps: 33%|███▎ | 668/2000 [14:05<17:08, 1.30it/s, lr=0.0001, step_loss=0.000528]Steps: 33%|███▎ | 668/2000 [14:05<17:08, 1.30it/s, lr=0.0001, step_loss=0.047] Steps: 33%|███▎ | 669/2000 [14:05<17:08, 1.29it/s, lr=0.0001, step_loss=0.047]Steps: 33%|███▎ | 669/2000 [14:05<17:08, 1.29it/s, lr=0.0001, step_loss=0.000722]Steps: 34%|███▎ | 670/2000 [14:06<17:08, 1.29it/s, lr=0.0001, step_loss=0.000722]Steps: 34%|███▎ | 670/2000 [14:06<17:08, 1.29it/s, lr=0.0001, step_loss=0.0189] Steps: 34%|███▎ | 671/2000 [14:07<17:07, 1.29it/s, lr=0.0001, step_loss=0.0189]Steps: 34%|███▎ | 671/2000 [14:07<17:07, 1.29it/s, lr=0.0001, step_loss=0.00475]Steps: 34%|███▎ | 672/2000 [14:08<17:05, 1.30it/s, lr=0.0001, step_loss=0.00475]11/14/2025 06:22:55 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 672)
Steps: 34%|███▎ | 672/2000 [14:15<17:05, 1.30it/s, lr=0.0001, step_loss=0.0159] 11/14/2025 06:22:55 - INFO - root - ### DEBUG: Finished epoch 20, epoch_steps=32, global_step=672
11/14/2025 06:22:55 - INFO - root - ### DEBUG: Starting epoch 21/63, global_step=672, max_train_steps=2000
Steps: 34%|███▎ | 673/2000 [14:16<1:07:16, 3.04s/it, lr=0.0001, step_loss=0.0159]Steps: 34%|███▎ | 673/2000 [14:16<1:07:16, 3.04s/it, lr=0.0001, step_loss=0.0108]Steps: 34%|███▎ | 674/2000 [14:17<52:08, 2.36s/it, lr=0.0001, step_loss=0.0108] Steps: 34%|███▎ | 674/2000 [14:17<52:08, 2.36s/it, lr=0.0001, step_loss=0.00166]Steps: 34%|███▍ | 675/2000 [14:17<41:34, 1.88s/it, lr=0.0001, step_loss=0.00166]Steps: 34%|███▍ | 675/2000 [14:18<41:34, 1.88s/it, lr=0.0001, step_loss=0.00113]Steps: 34%|███▍ | 676/2000 [14:18<34:10, 1.55s/it, lr=0.0001, step_loss=0.00113]Steps: 34%|███▍ | 676/2000 [14:18<34:10, 1.55s/it, lr=0.0001, step_loss=0.347] Steps: 34%|███▍ | 677/2000 [14:19<29:00, 1.32s/it, lr=0.0001, step_loss=0.347]Steps: 34%|███▍ | 677/2000 [14:19<29:00, 1.32s/it, lr=0.0001, step_loss=0.0113]Steps: 34%|███▍ | 678/2000 [14:20<25:23, 1.15s/it, lr=0.0001, step_loss=0.0113]Steps: 34%|███▍ | 678/2000 [14:20<25:23, 1.15s/it, lr=0.0001, step_loss=0.0302]Steps: 34%|███▍ | 679/2000 [14:21<22:50, 1.04s/it, lr=0.0001, step_loss=0.0302]Steps: 34%|███▍ | 679/2000 [14:21<22:50, 1.04s/it, lr=0.0001, step_loss=0.0999]Steps: 34%|███▍ | 680/2000 [14:21<21:02, 1.05it/s, lr=0.0001, step_loss=0.0999]Steps: 34%|███▍ | 680/2000 [14:21<21:02, 1.05it/s, lr=0.0001, step_loss=0.0286]Steps: 34%|███▍ | 681/2000 [14:22<19:48, 1.11it/s, lr=0.0001, step_loss=0.0286]Steps: 34%|███▍ | 681/2000 [14:22<19:48, 1.11it/s, lr=0.0001, step_loss=0.128] Steps: 34%|███▍ | 682/2000 [14:23<18:55, 1.16it/s, lr=0.0001, step_loss=0.128]Steps: 34%|███▍ | 682/2000 [14:23<18:55, 1.16it/s, lr=0.0001, step_loss=0.0647]Steps: 34%|███▍ | 683/2000 [14:24<18:19, 1.20it/s, lr=0.0001, step_loss=0.0647]Steps: 34%|███▍ | 683/2000 [14:24<18:19, 1.20it/s, lr=0.0001, step_loss=0.0028]Steps: 34%|███▍ | 684/2000 [14:24<17:53, 1.23it/s, lr=0.0001, step_loss=0.0028]Steps: 34%|███▍ | 684/2000 [14:24<17:53, 1.23it/s, lr=0.0001, step_loss=0.0279]Steps: 34%|███▍ | 685/2000 [14:25<17:34, 1.25it/s, lr=0.0001, step_loss=0.0279]Steps: 34%|███▍ | 
685/2000 [14:25<17:34, 1.25it/s, lr=0.0001, step_loss=0.000516]Steps: 34%|███▍ | 686/2000 [14:26<17:20, 1.26it/s, lr=0.0001, step_loss=0.000516]Steps: 34%|███▍ | 686/2000 [14:26<17:20, 1.26it/s, lr=0.0001, step_loss=0.0541] Steps: 34%|███▍ | 687/2000 [14:27<17:11, 1.27it/s, lr=0.0001, step_loss=0.0541]Steps: 34%|███▍ | 687/2000 [14:27<17:11, 1.27it/s, lr=0.0001, step_loss=0.0158]Steps: 34%|███▍ | 688/2000 [14:27<17:04, 1.28it/s, lr=0.0001, step_loss=0.0158]Steps: 34%|███▍ | 688/2000 [14:28<17:04, 1.28it/s, lr=0.0001, step_loss=0.000933]Steps: 34%|███▍ | 689/2000 [14:28<16:58, 1.29it/s, lr=0.0001, step_loss=0.000933]Steps: 34%|███▍ | 689/2000 [14:28<16:58, 1.29it/s, lr=0.0001, step_loss=0.12] Steps: 34%|███▍ | 690/2000 [14:29<16:55, 1.29it/s, lr=0.0001, step_loss=0.12]Steps: 34%|███▍ | 690/2000 [14:29<16:55, 1.29it/s, lr=0.0001, step_loss=0.121]Steps: 35%|███▍ | 691/2000 [14:30<16:53, 1.29it/s, lr=0.0001, step_loss=0.121]Steps: 35%|███▍ | 691/2000 [14:30<16:53, 1.29it/s, lr=0.0001, step_loss=0.00396]Steps: 35%|███▍ | 692/2000 [14:31<16:50, 1.29it/s, lr=0.0001, step_loss=0.00396]Steps: 35%|███▍ | 692/2000 [14:31<16:50, 1.29it/s, lr=0.0001, step_loss=0.00429]Steps: 35%|███▍ | 693/2000 [14:31<16:48, 1.30it/s, lr=0.0001, step_loss=0.00429]Steps: 35%|███▍ | 693/2000 [14:31<16:48, 1.30it/s, lr=0.0001, step_loss=0.162] Steps: 35%|███▍ | 694/2000 [14:32<16:48, 1.30it/s, lr=0.0001, step_loss=0.162]Steps: 35%|███▍ | 694/2000 [14:32<16:48, 1.30it/s, lr=0.0001, step_loss=0.316]Steps: 35%|███▍ | 695/2000 [14:33<16:47, 1.30it/s, lr=0.0001, step_loss=0.316]Steps: 35%|███▍ | 695/2000 [14:33<16:47, 1.30it/s, lr=0.0001, step_loss=0.000837]Steps: 35%|███▍ | 696/2000 [14:34<16:46, 1.30it/s, lr=0.0001, step_loss=0.000837]Steps: 35%|███▍ | 696/2000 [14:34<16:46, 1.30it/s, lr=0.0001, step_loss=0.0858] Steps: 35%|███▍ | 697/2000 [14:34<16:44, 1.30it/s, lr=0.0001, step_loss=0.0858]Steps: 35%|███▍ | 697/2000 [14:34<16:44, 1.30it/s, lr=0.0001, step_loss=0.0117]Steps: 35%|███▍ | 698/2000 
[14:35<16:44, 1.30it/s, lr=0.0001, step_loss=0.0117]Steps: 35%|███▍ | 698/2000 [14:35<16:44, 1.30it/s, lr=0.0001, step_loss=0.0763]Steps: 35%|███▍ | 699/2000 [14:36<16:42, 1.30it/s, lr=0.0001, step_loss=0.0763]Steps: 35%|███▍ | 699/2000 [14:36<16:42, 1.30it/s, lr=0.0001, step_loss=0.0179]Steps: 35%|███▌ | 700/2000 [14:37<16:41, 1.30it/s, lr=0.0001, step_loss=0.0179]Steps: 35%|███▌ | 700/2000 [14:37<16:41, 1.30it/s, lr=0.0001, step_loss=0.105] Steps: 35%|███▌ | 701/2000 [14:38<16:40, 1.30it/s, lr=0.0001, step_loss=0.105]Steps: 35%|███▌ | 701/2000 [14:38<16:40, 1.30it/s, lr=0.0001, step_loss=0.00316]Steps: 35%|███▌ | 702/2000 [14:38<16:40, 1.30it/s, lr=0.0001, step_loss=0.00316]Steps: 35%|███▌ | 702/2000 [14:38<16:40, 1.30it/s, lr=0.0001, step_loss=0.091] Steps: 35%|███▌ | 703/2000 [14:39<16:39, 1.30it/s, lr=0.0001, step_loss=0.091]Steps: 35%|███▌ | 703/2000 [14:39<16:39, 1.30it/s, lr=0.0001, step_loss=0.0144]Steps: 35%|███▌ | 704/2000 [14:40<16:38, 1.30it/s, lr=0.0001, step_loss=0.0144]11/14/2025 06:23:27 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 704)
Steps: 35%|███▌ | 704/2000 [14:47<16:38, 1.30it/s, lr=0.0001, step_loss=0.00129]11/14/2025 06:23:27 - INFO - root - ### DEBUG: Finished epoch 21, epoch_steps=32, global_step=704
11/14/2025 06:23:27 - INFO - root - ### DEBUG: Starting epoch 22/63, global_step=704, max_train_steps=2000
Steps: 35%|███▌ | 705/2000 [14:48<1:04:51, 3.00s/it, lr=0.0001, step_loss=0.00129]Steps: 35%|███▌ | 705/2000 [14:48<1:04:51, 3.00s/it, lr=0.0001, step_loss=0.000522]Steps: 35%|███▌ | 706/2000 [14:49<50:46, 2.35s/it, lr=0.0001, step_loss=0.000522] Steps: 35%|███▌ | 706/2000 [14:49<50:46, 2.35s/it, lr=0.0001, step_loss=0.000469]Steps: 35%|███▌ | 707/2000 [14:50<40:30, 1.88s/it, lr=0.0001, step_loss=0.000469]Steps: 35%|███▌ | 707/2000 [14:50<40:30, 1.88s/it, lr=0.0001, step_loss=0.131] Steps: 35%|███▌ | 708/2000 [14:50<33:19, 1.55s/it, lr=0.0001, step_loss=0.131]Steps: 35%|███▌ | 708/2000 [14:50<33:19, 1.55s/it, lr=0.0001, step_loss=0.000718]Steps: 35%|███▌ | 709/2000 [14:51<28:16, 1.31s/it, lr=0.0001, step_loss=0.000718]Steps: 35%|███▌ | 709/2000 [14:51<28:16, 1.31s/it, lr=0.0001, step_loss=0.0103] Steps: 36%|███▌ | 710/2000 [14:52<24:45, 1.15s/it, lr=0.0001, step_loss=0.0103]Steps: 36%|███▌ | 710/2000 [14:52<24:45, 1.15s/it, lr=0.0001, step_loss=0.168] Steps: 36%|███▌ | 711/2000 [14:53<22:17, 1.04s/it, lr=0.0001, step_loss=0.168]Steps: 36%|███▌ | 711/2000 [14:53<22:17, 1.04s/it, lr=0.0001, step_loss=0.0612]Steps: 36%|███▌ | 712/2000 [14:54<20:33, 1.04it/s, lr=0.0001, step_loss=0.0612]Steps: 36%|███▌ | 712/2000 [14:54<20:33, 1.04it/s, lr=0.0001, step_loss=0.00189]Steps: 36%|███▌ | 713/2000 [14:54<19:20, 1.11it/s, lr=0.0001, step_loss=0.00189]Steps: 36%|███▌ | 713/2000 [14:54<19:20, 1.11it/s, lr=0.0001, step_loss=0.00352]Steps: 36%|███▌ | 714/2000 [14:55<18:28, 1.16it/s, lr=0.0001, step_loss=0.00352]Steps: 36%|███▌ | 714/2000 [14:55<18:28, 1.16it/s, lr=0.0001, step_loss=0.217] Steps: 36%|███▌ | 715/2000 [14:56<17:53, 1.20it/s, lr=0.0001, step_loss=0.217]Steps: 36%|███▌ | 715/2000 [14:56<17:53, 1.20it/s, lr=0.0001, step_loss=0.288]Steps: 36%|███▌ | 716/2000 [14:57<17:27, 1.23it/s, lr=0.0001, step_loss=0.288]Steps: 36%|███▌ | 716/2000 [14:57<17:27, 1.23it/s, lr=0.0001, step_loss=0.0709]Steps: 36%|███▌ | 717/2000 [14:57<17:10, 1.25it/s, lr=0.0001, step_loss=0.0709]Steps: 
36%|███▌ | 717/2000 [14:57<17:10, 1.25it/s, lr=0.0001, step_loss=0.264] Steps: 36%|███▌ | 718/2000 [14:58<16:56, 1.26it/s, lr=0.0001, step_loss=0.264]Steps: 36%|███▌ | 718/2000 [14:58<16:56, 1.26it/s, lr=0.0001, step_loss=0.00122]Steps: 36%|███▌ | 719/2000 [14:59<16:47, 1.27it/s, lr=0.0001, step_loss=0.00122]Steps: 36%|███▌ | 719/2000 [14:59<16:47, 1.27it/s, lr=0.0001, step_loss=0.00887]Steps: 36%|███▌ | 720/2000 [15:00<16:39, 1.28it/s, lr=0.0001, step_loss=0.00887]Steps: 36%|███▌ | 720/2000 [15:00<16:39, 1.28it/s, lr=0.0001, step_loss=0.0106] Steps: 36%|███▌ | 721/2000 [15:00<16:34, 1.29it/s, lr=0.0001, step_loss=0.0106]Steps: 36%|███▌ | 721/2000 [15:00<16:34, 1.29it/s, lr=0.0001, step_loss=0.0561]Steps: 36%|███▌ | 722/2000 [15:01<16:31, 1.29it/s, lr=0.0001, step_loss=0.0561]Steps: 36%|███▌ | 722/2000 [15:01<16:31, 1.29it/s, lr=0.0001, step_loss=0.0797]Steps: 36%|███▌ | 723/2000 [15:02<16:28, 1.29it/s, lr=0.0001, step_loss=0.0797]Steps: 36%|███▌ | 723/2000 [15:02<16:28, 1.29it/s, lr=0.0001, step_loss=0.00196]Steps: 36%|███▌ | 724/2000 [15:03<16:25, 1.29it/s, lr=0.0001, step_loss=0.00196]Steps: 36%|███▌ | 724/2000 [15:03<16:25, 1.29it/s, lr=0.0001, step_loss=0.0924] Steps: 36%|███▋ | 725/2000 [15:04<16:23, 1.30it/s, lr=0.0001, step_loss=0.0924]Steps: 36%|███▋ | 725/2000 [15:04<16:23, 1.30it/s, lr=0.0001, step_loss=0.00928]Steps: 36%|███▋ | 726/2000 [15:04<16:23, 1.30it/s, lr=0.0001, step_loss=0.00928]Steps: 36%|███▋ | 726/2000 [15:04<16:23, 1.30it/s, lr=0.0001, step_loss=0.0491] Steps: 36%|███▋ | 727/2000 [15:05<16:22, 1.30it/s, lr=0.0001, step_loss=0.0491]Steps: 36%|███▋ | 727/2000 [15:05<16:22, 1.30it/s, lr=0.0001, step_loss=0.000876]Steps: 36%|███▋ | 728/2000 [15:06<16:20, 1.30it/s, lr=0.0001, step_loss=0.000876]Steps: 36%|███▋ | 728/2000 [15:06<16:20, 1.30it/s, lr=0.0001, step_loss=0.000562]Steps: 36%|███▋ | 729/2000 [15:07<16:20, 1.30it/s, lr=0.0001, step_loss=0.000562]Steps: 36%|███▋ | 729/2000 [15:07<16:20, 1.30it/s, lr=0.0001, step_loss=0.00093] Steps: 
36%|███▋ | 730/2000 [15:07<16:18, 1.30it/s, lr=0.0001, step_loss=0.00093]Steps: 36%|███▋ | 730/2000 [15:07<16:18, 1.30it/s, lr=0.0001, step_loss=0.000913]Steps: 37%|███▋ | 731/2000 [15:08<16:17, 1.30it/s, lr=0.0001, step_loss=0.000913]Steps: 37%|███▋ | 731/2000 [15:08<16:17, 1.30it/s, lr=0.0001, step_loss=0.0312] Steps: 37%|███▋ | 732/2000 [15:09<16:19, 1.30it/s, lr=0.0001, step_loss=0.0312]Steps: 37%|███▋ | 732/2000 [15:09<16:19, 1.30it/s, lr=0.0001, step_loss=0.00104]Steps: 37%|███▋ | 733/2000 [15:10<16:18, 1.29it/s, lr=0.0001, step_loss=0.00104]Steps: 37%|███▋ | 733/2000 [15:10<16:18, 1.29it/s, lr=0.0001, step_loss=0.0573] Steps: 37%|███▋ | 734/2000 [15:10<16:17, 1.29it/s, lr=0.0001, step_loss=0.0573]Steps: 37%|███▋ | 734/2000 [15:10<16:17, 1.29it/s, lr=0.0001, step_loss=0.000848]Steps: 37%|███▋ | 735/2000 [15:11<16:17, 1.29it/s, lr=0.0001, step_loss=0.000848]Steps: 37%|███▋ | 735/2000 [15:11<16:17, 1.29it/s, lr=0.0001, step_loss=0.00309] Steps: 37%|███▋ | 736/2000 [15:12<16:16, 1.29it/s, lr=0.0001, step_loss=0.00309]11/14/2025 06:23:59 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 736)
Steps: 37%|███▋ | 736/2000 [15:20<16:16, 1.29it/s, lr=0.0001, step_loss=0.358] 11/14/2025 06:23:59 - INFO - root - ### DEBUG: Finished epoch 22, epoch_steps=32, global_step=736
11/14/2025 06:23:59 - INFO - root - ### DEBUG: Starting epoch 23/63, global_step=736, max_train_steps=2000
Steps: 37%|███▋ | 737/2000 [15:20<1:04:13, 3.05s/it, lr=0.0001, step_loss=0.358]Steps: 37%|███▋ | 737/2000 [15:20<1:04:13, 3.05s/it, lr=0.0001, step_loss=0.0612]Steps: 37%|███▋ | 738/2000 [15:21<49:47, 2.37s/it, lr=0.0001, step_loss=0.0612] Steps: 37%|███▋ | 738/2000 [15:21<49:47, 2.37s/it, lr=0.0001, step_loss=0.141] Steps: 37%|███▋ | 739/2000 [15:22<39:40, 1.89s/it, lr=0.0001, step_loss=0.141]Steps: 37%|███▋ | 739/2000 [15:22<39:40, 1.89s/it, lr=0.0001, step_loss=0.0102]Steps: 37%|███▋ | 740/2000 [15:23<32:35, 1.55s/it, lr=0.0001, step_loss=0.0102]Steps: 37%|███▋ | 740/2000 [15:23<32:35, 1.55s/it, lr=0.0001, step_loss=0.00747]Steps: 37%|███▋ | 741/2000 [15:23<27:39, 1.32s/it, lr=0.0001, step_loss=0.00747]Steps: 37%|███▋ | 741/2000 [15:23<27:39, 1.32s/it, lr=0.0001, step_loss=0.000815]Steps: 37%|███▋ | 742/2000 [15:24<24:12, 1.15s/it, lr=0.0001, step_loss=0.000815]Steps: 37%|███▋ | 742/2000 [15:24<24:12, 1.15s/it, lr=0.0001, step_loss=0.00071] Steps: 37%|███▋ | 743/2000 [15:25<21:45, 1.04s/it, lr=0.0001, step_loss=0.00071]Steps: 37%|███▋ | 743/2000 [15:25<21:45, 1.04s/it, lr=0.0001, step_loss=0.11] Steps: 37%|███▋ | 744/2000 [15:26<20:04, 1.04it/s, lr=0.0001, step_loss=0.11]Steps: 37%|███▋ | 744/2000 [15:26<20:04, 1.04it/s, lr=0.0001, step_loss=0.0056]Steps: 37%|███▋ | 745/2000 [15:27<18:52, 1.11it/s, lr=0.0001, step_loss=0.0056]Steps: 37%|███▋ | 745/2000 [15:27<18:52, 1.11it/s, lr=0.0001, step_loss=0.0717]Steps: 37%|███▋ | 746/2000 [15:27<18:05, 1.16it/s, lr=0.0001, step_loss=0.0717]Steps: 37%|███▋ | 746/2000 [15:27<18:05, 1.16it/s, lr=0.0001, step_loss=0.0975]Steps: 37%|███▋ | 747/2000 [15:28<17:28, 1.20it/s, lr=0.0001, step_loss=0.0975]Steps: 37%|███▋ | 747/2000 [15:28<17:28, 1.20it/s, lr=0.0001, step_loss=0.0139]Steps: 37%|███▋ | 748/2000 [15:29<17:01, 1.23it/s, lr=0.0001, step_loss=0.0139]Steps: 37%|███▋ | 748/2000 [15:29<17:01, 1.23it/s, lr=0.0001, step_loss=0.00146]Steps: 37%|███▋ | 749/2000 [15:30<16:43, 1.25it/s, lr=0.0001, step_loss=0.00146]Steps: 
37%|███▋ | 749/2000 [15:30<16:43, 1.25it/s, lr=0.0001, step_loss=0.0504] Steps: 38%|███▊ | 750/2000 [15:30<16:32, 1.26it/s, lr=0.0001, step_loss=0.0504]Steps: 38%|███▊ | 750/2000 [15:30<16:32, 1.26it/s, lr=0.0001, step_loss=0.056] Steps: 38%|███▊ | 751/2000 [15:31<16:22, 1.27it/s, lr=0.0001, step_loss=0.056]Steps: 38%|███▊ | 751/2000 [15:31<16:22, 1.27it/s, lr=0.0001, step_loss=0.0438]Steps: 38%|███▊ | 752/2000 [15:32<16:16, 1.28it/s, lr=0.0001, step_loss=0.0438]Steps: 38%|███▊ | 752/2000 [15:32<16:16, 1.28it/s, lr=0.0001, step_loss=0.00692]Steps: 38%|███▊ | 753/2000 [15:33<16:11, 1.28it/s, lr=0.0001, step_loss=0.00692]Steps: 38%|███▊ | 753/2000 [15:33<16:11, 1.28it/s, lr=0.0001, step_loss=0.00094]Steps: 38%|███▊ | 754/2000 [15:33<16:08, 1.29it/s, lr=0.0001, step_loss=0.00094]Steps: 38%|███▊ | 754/2000 [15:34<16:08, 1.29it/s, lr=0.0001, step_loss=0.00365]Steps: 38%|███▊ | 755/2000 [15:34<16:06, 1.29it/s, lr=0.0001, step_loss=0.00365]Steps: 38%|███▊ | 755/2000 [15:34<16:06, 1.29it/s, lr=0.0001, step_loss=0.000393]Steps: 38%|███▊ | 756/2000 [15:35<16:04, 1.29it/s, lr=0.0001, step_loss=0.000393]Steps: 38%|███▊ | 756/2000 [15:35<16:04, 1.29it/s, lr=0.0001, step_loss=0.000751]Steps: 38%|███▊ | 757/2000 [15:36<16:03, 1.29it/s, lr=0.0001, step_loss=0.000751]Steps: 38%|███▊ | 757/2000 [15:36<16:03, 1.29it/s, lr=0.0001, step_loss=0.172] Steps: 38%|███▊ | 758/2000 [15:37<16:02, 1.29it/s, lr=0.0001, step_loss=0.172]Steps: 38%|███▊ | 758/2000 [15:37<16:02, 1.29it/s, lr=0.0001, step_loss=0.0179]Steps: 38%|███▊ | 759/2000 [15:37<16:00, 1.29it/s, lr=0.0001, step_loss=0.0179]Steps: 38%|███▊ | 759/2000 [15:37<16:00, 1.29it/s, lr=0.0001, step_loss=0.0322]Steps: 38%|███▊ | 760/2000 [15:38<15:59, 1.29it/s, lr=0.0001, step_loss=0.0322]Steps: 38%|███▊ | 760/2000 [15:38<15:59, 1.29it/s, lr=0.0001, step_loss=0.405] Steps: 38%|███▊ | 761/2000 [15:39<15:58, 1.29it/s, lr=0.0001, step_loss=0.405]Steps: 38%|███▊ | 761/2000 [15:39<15:58, 1.29it/s, lr=0.0001, step_loss=0.0189]Steps: 38%|███▊ | 
762/2000 [15:40<15:57, 1.29it/s, lr=0.0001, step_loss=0.0189]Steps: 38%|███▊ | 762/2000 [15:40<15:57, 1.29it/s, lr=0.0001, step_loss=0.00069]Steps: 38%|███▊ | 763/2000 [15:40<15:55, 1.29it/s, lr=0.0001, step_loss=0.00069]Steps: 38%|███▊ | 763/2000 [15:40<15:55, 1.29it/s, lr=0.0001, step_loss=0.157] Steps: 38%|███▊ | 764/2000 [15:41<15:55, 1.29it/s, lr=0.0001, step_loss=0.157]Steps: 38%|███▊ | 764/2000 [15:41<15:55, 1.29it/s, lr=0.0001, step_loss=0.166]Steps: 38%|███▊ | 765/2000 [15:42<15:54, 1.29it/s, lr=0.0001, step_loss=0.166]Steps: 38%|███▊ | 765/2000 [15:42<15:54, 1.29it/s, lr=0.0001, step_loss=0.000861]Steps: 38%|███▊ | 766/2000 [15:43<15:53, 1.29it/s, lr=0.0001, step_loss=0.000861]Steps: 38%|███▊ | 766/2000 [15:43<15:53, 1.29it/s, lr=0.0001, step_loss=0.0443] Steps: 38%|███▊ | 767/2000 [15:44<15:51, 1.30it/s, lr=0.0001, step_loss=0.0443]Steps: 38%|███▊ | 767/2000 [15:44<15:51, 1.30it/s, lr=0.0001, step_loss=0.129] Steps: 38%|███▊ | 768/2000 [15:44<15:50, 1.30it/s, lr=0.0001, step_loss=0.129]11/14/2025 06:24:32 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 768)
Steps: 38%|███▊ | 768/2000 [15:52<15:50, 1.30it/s, lr=0.0001, step_loss=0.00553]11/14/2025 06:24:32 - INFO - root - ### DEBUG: Finished epoch 23, epoch_steps=32, global_step=768
11/14/2025 06:24:32 - INFO - root - ### DEBUG: Starting epoch 24/63, global_step=768, max_train_steps=2000
Steps: 38%|███▊ | 769/2000 [15:53<1:02:25, 3.04s/it, lr=0.0001, step_loss=0.00553]Steps: 38%|███▊ | 769/2000 [15:53<1:02:25, 3.04s/it, lr=0.0001, step_loss=0.00083]Steps: 38%|███▊ | 770/2000 [15:53<48:24, 2.36s/it, lr=0.0001, step_loss=0.00083] Steps: 38%|███▊ | 770/2000 [15:53<48:24, 2.36s/it, lr=0.0001, step_loss=0.0113] Steps: 39%|███▊ | 771/2000 [15:54<38:34, 1.88s/it, lr=0.0001, step_loss=0.0113]Steps: 39%|███▊ | 771/2000 [15:54<38:34, 1.88s/it, lr=0.0001, step_loss=0.0017]Steps: 39%|███▊ | 772/2000 [15:55<31:43, 1.55s/it, lr=0.0001, step_loss=0.0017]Steps: 39%|███▊ | 772/2000 [15:55<31:43, 1.55s/it, lr=0.0001, step_loss=0.00146]Steps: 39%|███▊ | 773/2000 [15:56<26:55, 1.32s/it, lr=0.0001, step_loss=0.00146]Steps: 39%|███▊ | 773/2000 [15:56<26:55, 1.32s/it, lr=0.0001, step_loss=0.0259] Steps: 39%|███▊ | 774/2000 [15:56<23:32, 1.15s/it, lr=0.0001, step_loss=0.0259]Steps: 39%|███▊ | 774/2000 [15:57<23:32, 1.15s/it, lr=0.0001, step_loss=0.00294]Steps: 39%|███▉ | 775/2000 [15:57<21:11, 1.04s/it, lr=0.0001, step_loss=0.00294]Steps: 39%|███▉ | 775/2000 [15:57<21:11, 1.04s/it, lr=0.0001, step_loss=0.0673] Steps: 39%|███▉ | 776/2000 [15:58<19:32, 1.04it/s, lr=0.0001, step_loss=0.0673]Steps: 39%|███▉ | 776/2000 [15:58<19:32, 1.04it/s, lr=0.0001, step_loss=0.015] Steps: 39%|███▉ | 777/2000 [15:59<18:23, 1.11it/s, lr=0.0001, step_loss=0.015]Steps: 39%|███▉ | 777/2000 [15:59<18:23, 1.11it/s, lr=0.0001, step_loss=0.0587]Steps: 39%|███▉ | 778/2000 [16:00<17:33, 1.16it/s, lr=0.0001, step_loss=0.0587]Steps: 39%|███▉ | 778/2000 [16:00<17:33, 1.16it/s, lr=0.0001, step_loss=0.00321]Steps: 39%|███▉ | 779/2000 [16:00<16:58, 1.20it/s, lr=0.0001, step_loss=0.00321]Steps: 39%|███▉ | 779/2000 [16:00<16:58, 1.20it/s, lr=0.0001, step_loss=0.00298]Steps: 39%|███▉ | 780/2000 [16:01<16:33, 1.23it/s, lr=0.0001, step_loss=0.00298]Steps: 39%|███▉ | 780/2000 [16:01<16:33, 1.23it/s, lr=0.0001, step_loss=0.000398]Steps: 39%|███▉ | 781/2000 [16:02<16:16, 1.25it/s, lr=0.0001, 
step_loss=0.000398]Steps: 39%|███▉ | 781/2000 [16:02<16:16, 1.25it/s, lr=0.0001, step_loss=0.00356] Steps: 39%|███▉ | 782/2000 [16:03<16:04, 1.26it/s, lr=0.0001, step_loss=0.00356]Steps: 39%|███▉ | 782/2000 [16:03<16:04, 1.26it/s, lr=0.0001, step_loss=0.00138]Steps: 39%|███▉ | 783/2000 [16:03<15:55, 1.27it/s, lr=0.0001, step_loss=0.00138]Steps: 39%|███▉ | 783/2000 [16:03<15:55, 1.27it/s, lr=0.0001, step_loss=0.00154]Steps: 39%|███▉ | 784/2000 [16:04<15:49, 1.28it/s, lr=0.0001, step_loss=0.00154]Steps: 39%|███▉ | 784/2000 [16:04<15:49, 1.28it/s, lr=0.0001, step_loss=0.00578]Steps: 39%|███▉ | 785/2000 [16:05<15:44, 1.29it/s, lr=0.0001, step_loss=0.00578]Steps: 39%|███▉ | 785/2000 [16:05<15:44, 1.29it/s, lr=0.0001, step_loss=0.000563]Steps: 39%|███▉ | 786/2000 [16:06<15:41, 1.29it/s, lr=0.0001, step_loss=0.000563]Steps: 39%|███▉ | 786/2000 [16:06<15:41, 1.29it/s, lr=0.0001, step_loss=0.168] Steps: 39%|███▉ | 787/2000 [16:07<15:38, 1.29it/s, lr=0.0001, step_loss=0.168]Steps: 39%|███▉ | 787/2000 [16:07<15:38, 1.29it/s, lr=0.0001, step_loss=0.00151]Steps: 39%|███▉ | 788/2000 [16:07<15:37, 1.29it/s, lr=0.0001, step_loss=0.00151]Steps: 39%|███▉ | 788/2000 [16:07<15:37, 1.29it/s, lr=0.0001, step_loss=0.0701] Steps: 39%|███▉ | 789/2000 [16:08<15:35, 1.29it/s, lr=0.0001, step_loss=0.0701]Steps: 39%|███▉ | 789/2000 [16:08<15:35, 1.29it/s, lr=0.0001, step_loss=0.0568]Steps: 40%|███▉ | 790/2000 [16:09<15:33, 1.30it/s, lr=0.0001, step_loss=0.0568]Steps: 40%|███▉ | 790/2000 [16:09<15:33, 1.30it/s, lr=0.0001, step_loss=0.0237]Steps: 40%|███▉ | 791/2000 [16:10<15:31, 1.30it/s, lr=0.0001, step_loss=0.0237]Steps: 40%|███▉ | 791/2000 [16:10<15:31, 1.30it/s, lr=0.0001, step_loss=0.0104]Steps: 40%|███▉ | 792/2000 [16:10<15:31, 1.30it/s, lr=0.0001, step_loss=0.0104]Steps: 40%|███▉ | 792/2000 [16:10<15:31, 1.30it/s, lr=0.0001, step_loss=0.0102]Steps: 40%|███▉ | 793/2000 [16:11<15:30, 1.30it/s, lr=0.0001, step_loss=0.0102]Steps: 40%|███▉ | 793/2000 [16:11<15:30, 1.30it/s, lr=0.0001, 
step_loss=0.000429]Steps: 40%|███▉ | 794/2000 [16:12<15:28, 1.30it/s, lr=0.0001, step_loss=0.000429]Steps: 40%|███▉ | 794/2000 [16:12<15:28, 1.30it/s, lr=0.0001, step_loss=0.00697] Steps: 40%|███▉ | 795/2000 [16:13<15:28, 1.30it/s, lr=0.0001, step_loss=0.00697]Steps: 40%|███▉ | 795/2000 [16:13<15:28, 1.30it/s, lr=0.0001, step_loss=0.117] Steps: 40%|███▉ | 796/2000 [16:13<15:28, 1.30it/s, lr=0.0001, step_loss=0.117]Steps: 40%|███▉ | 796/2000 [16:13<15:28, 1.30it/s, lr=0.0001, step_loss=0.00134]Steps: 40%|███▉ | 797/2000 [16:14<15:27, 1.30it/s, lr=0.0001, step_loss=0.00134]Steps: 40%|███▉ | 797/2000 [16:14<15:27, 1.30it/s, lr=0.0001, step_loss=0.00105]Steps: 40%|███▉ | 798/2000 [16:15<15:27, 1.30it/s, lr=0.0001, step_loss=0.00105]Steps: 40%|███▉ | 798/2000 [16:15<15:27, 1.30it/s, lr=0.0001, step_loss=0.000803]Steps: 40%|███▉ | 799/2000 [16:16<15:26, 1.30it/s, lr=0.0001, step_loss=0.000803]Steps: 40%|███▉ | 799/2000 [16:16<15:26, 1.30it/s, lr=0.0001, step_loss=0.00139] Steps: 40%|████ | 800/2000 [16:17<15:26, 1.30it/s, lr=0.0001, step_loss=0.00139]11/14/2025 06:25:11 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 800)
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.22it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.18it/s]
11/14/2025 06:25:44 - INFO - root - Saved samples to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/samples/sample-800.gif
Steps: 40%|████ | 800/2000 [17:04<15:26, 1.30it/s, lr=0.0001, step_loss=0.0579] 11/14/2025 06:25:44 - INFO - root - ### DEBUG: Finished epoch 24, epoch_steps=32, global_step=800
11/14/2025 06:25:44 - INFO - root - ### DEBUG: Starting epoch 25/63, global_step=800, max_train_steps=2000
Steps: 40%|████ | 801/2000 [17:05<5:00:28, 15.04s/it, lr=0.0001, step_loss=0.0579]Steps: 40%|████ | 801/2000 [17:05<5:00:28, 15.04s/it, lr=0.0001, step_loss=0.0011]Steps: 40%|████ | 802/2000 [17:06<3:34:43, 10.75s/it, lr=0.0001, step_loss=0.0011]Steps: 40%|████ | 802/2000 [17:06<3:34:43, 10.75s/it, lr=0.0001, step_loss=0.0929]Steps: 40%|████ | 803/2000 [17:06<2:34:46, 7.76s/it, lr=0.0001, step_loss=0.0929]Steps: 40%|████ | 803/2000 [17:06<2:34:46, 7.76s/it, lr=0.0001, step_loss=0.00216]Steps: 40%|████ | 804/2000 [17:07<1:52:48, 5.66s/it, lr=0.0001, step_loss=0.00216]Steps: 40%|████ | 804/2000 [17:07<1:52:48, 5.66s/it, lr=0.0001, step_loss=0.000366]Steps: 40%|████ | 805/2000 [17:08<1:23:29, 4.19s/it, lr=0.0001, step_loss=0.000366]Steps: 40%|████ | 805/2000 [17:08<1:23:29, 4.19s/it, lr=0.0001, step_loss=0.0665] Steps: 40%|████ | 806/2000 [17:09<1:02:57, 3.16s/it, lr=0.0001, step_loss=0.0665]Steps: 40%|████ | 806/2000 [17:09<1:02:57, 3.16s/it, lr=0.0001, step_loss=0.00522]Steps: 40%|████ | 807/2000 [17:09<48:36, 2.44s/it, lr=0.0001, step_loss=0.00522] Steps: 40%|████ | 807/2000 [17:09<48:36, 2.44s/it, lr=0.0001, step_loss=0.016] Steps: 40%|████ | 808/2000 [17:10<38:33, 1.94s/it, lr=0.0001, step_loss=0.016]Steps: 40%|████ | 808/2000 [17:10<38:33, 1.94s/it, lr=0.0001, step_loss=0.103]Steps: 40%|████ | 809/2000 [17:11<31:31, 1.59s/it, lr=0.0001, step_loss=0.103]Steps: 40%|████ | 809/2000 [17:11<31:31, 1.59s/it, lr=0.0001, step_loss=0.115]Steps: 40%|████ | 810/2000 [17:12<26:35, 1.34s/it, lr=0.0001, step_loss=0.115]Steps: 40%|████ | 810/2000 [17:12<26:35, 1.34s/it, lr=0.0001, step_loss=0.0397]Steps: 41%|████ | 811/2000 [17:13<23:11, 1.17s/it, lr=0.0001, step_loss=0.0397]Steps: 41%|████ | 811/2000 [17:13<23:11, 1.17s/it, lr=0.0001, step_loss=0.0458]Steps: 41%|████ | 812/2000 [17:13<20:44, 1.05s/it, lr=0.0001, step_loss=0.0458]Steps: 41%|████ | 812/2000 [17:13<20:44, 1.05s/it, lr=0.0001, step_loss=0.0072]Steps: 41%|████ | 813/2000 [17:14<19:02, 1.04it/s, lr=0.0001, 
step_loss=0.0072]Steps: 41%|████ | 813/2000 [17:14<19:02, 1.04it/s, lr=0.0001, step_loss=0.0896]Steps: 41%|████ | 814/2000 [17:15<17:51, 1.11it/s, lr=0.0001, step_loss=0.0896]Steps: 41%|████ | 814/2000 [17:15<17:51, 1.11it/s, lr=0.0001, step_loss=0.022] Steps: 41%|████ | 815/2000 [17:16<17:00, 1.16it/s, lr=0.0001, step_loss=0.022]Steps: 41%|████ | 815/2000 [17:16<17:00, 1.16it/s, lr=0.0001, step_loss=0.0232]Steps: 41%|████ | 816/2000 [17:16<16:25, 1.20it/s, lr=0.0001, step_loss=0.0232]Steps: 41%|████ | 816/2000 [17:16<16:25, 1.20it/s, lr=0.0001, step_loss=0.0057]Steps: 41%|████ | 817/2000 [17:17<16:00, 1.23it/s, lr=0.0001, step_loss=0.0057]Steps: 41%|████ | 817/2000 [17:17<16:00, 1.23it/s, lr=0.0001, step_loss=0.0826]Steps: 41%|████ | 818/2000 [17:18<15:44, 1.25it/s, lr=0.0001, step_loss=0.0826]Steps: 41%|████ | 818/2000 [17:18<15:44, 1.25it/s, lr=0.0001, step_loss=0.0178]Steps: 41%|████ | 819/2000 [17:19<15:31, 1.27it/s, lr=0.0001, step_loss=0.0178]Steps: 41%|████ | 819/2000 [17:19<15:31, 1.27it/s, lr=0.0001, step_loss=0.152] Steps: 41%|████ | 820/2000 [17:19<15:20, 1.28it/s, lr=0.0001, step_loss=0.152]Steps: 41%|████ | 820/2000 [17:19<15:20, 1.28it/s, lr=0.0001, step_loss=0.103]Steps: 41%|████ | 821/2000 [17:20<15:13, 1.29it/s, lr=0.0001, step_loss=0.103]Steps: 41%|████ | 821/2000 [17:20<15:13, 1.29it/s, lr=0.0001, step_loss=0.0397]Steps: 41%|████ | 822/2000 [17:21<15:11, 1.29it/s, lr=0.0001, step_loss=0.0397]Steps: 41%|████ | 822/2000 [17:21<15:11, 1.29it/s, lr=0.0001, step_loss=0.000717]Steps: 41%|████ | 823/2000 [17:22<15:07, 1.30it/s, lr=0.0001, step_loss=0.000717]Steps: 41%|████ | 823/2000 [17:22<15:07, 1.30it/s, lr=0.0001, step_loss=0.00425] Steps: 41%|████ | 824/2000 [17:22<15:04, 1.30it/s, lr=0.0001, step_loss=0.00425]Steps: 41%|████ | 824/2000 [17:22<15:04, 1.30it/s, lr=0.0001, step_loss=0.00915]Steps: 41%|████▏ | 825/2000 [17:23<15:04, 1.30it/s, lr=0.0001, step_loss=0.00915]Steps: 41%|████▏ | 825/2000 [17:23<15:04, 1.30it/s, lr=0.0001, step_loss=0.0718] 
Steps: 41%|████▏ | 826/2000 [17:24<15:03, 1.30it/s, lr=0.0001, step_loss=0.0718]Steps: 41%|████▏ | 826/2000 [17:24<15:03, 1.30it/s, lr=0.0001, step_loss=0.000832]Steps: 41%|████▏ | 827/2000 [17:25<15:02, 1.30it/s, lr=0.0001, step_loss=0.000832]Steps: 41%|████▏ | 827/2000 [17:25<15:02, 1.30it/s, lr=0.0001, step_loss=0.122] Steps: 41%|████▏ | 828/2000 [17:26<15:03, 1.30it/s, lr=0.0001, step_loss=0.122]Steps: 41%|████▏ | 828/2000 [17:26<15:03, 1.30it/s, lr=0.0001, step_loss=0.00803]Steps: 41%|████▏ | 829/2000 [17:26<14:59, 1.30it/s, lr=0.0001, step_loss=0.00803]Steps: 41%|████▏ | 829/2000 [17:26<14:59, 1.30it/s, lr=0.0001, step_loss=0.135] Steps: 42%|████▏ | 830/2000 [17:27<14:56, 1.30it/s, lr=0.0001, step_loss=0.135]Steps: 42%|████▏ | 830/2000 [17:27<14:56, 1.30it/s, lr=0.0001, step_loss=0.00494]Steps: 42%|████▏ | 831/2000 [17:28<14:55, 1.31it/s, lr=0.0001, step_loss=0.00494]Steps: 42%|████▏ | 831/2000 [17:28<14:55, 1.31it/s, lr=0.0001, step_loss=0.00287]Steps: 42%|████▏ | 832/2000 [17:29<14:52, 1.31it/s, lr=0.0001, step_loss=0.00287]11/14/2025 06:26:16 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 832)
Steps: 42%|████▏ | 832/2000 [17:36<14:52, 1.31it/s, lr=0.0001, step_loss=0.000985]11/14/2025 06:26:16 - INFO - root - ### DEBUG: Finished epoch 25, epoch_steps=32, global_step=832
11/14/2025 06:26:16 - INFO - root - ### DEBUG: Starting epoch 26/63, global_step=832, max_train_steps=2000
Steps: 42%|████▏ | 833/2000 [17:37<58:47, 3.02s/it, lr=0.0001, step_loss=0.000985]Steps: 42%|████▏ | 833/2000 [17:37<58:47, 3.02s/it, lr=0.0001, step_loss=0.0484] Steps: 42%|████▏ | 834/2000 [17:38<45:32, 2.34s/it, lr=0.0001, step_loss=0.0484]Steps: 42%|████▏ | 834/2000 [17:38<45:32, 2.34s/it, lr=0.0001, step_loss=0.000933]Steps: 42%|████▏ | 835/2000 [17:38<36:17, 1.87s/it, lr=0.0001, step_loss=0.000933]Steps: 42%|████▏ | 835/2000 [17:38<36:17, 1.87s/it, lr=0.0001, step_loss=0.00133] Steps: 42%|████▏ | 836/2000 [17:39<29:49, 1.54s/it, lr=0.0001, step_loss=0.00133]Steps: 42%|████▏ | 836/2000 [17:39<29:49, 1.54s/it, lr=0.0001, step_loss=0.172] Steps: 42%|████▏ | 837/2000 [17:40<25:17, 1.30s/it, lr=0.0001, step_loss=0.172]Steps: 42%|████▏ | 837/2000 [17:40<25:17, 1.30s/it, lr=0.0001, step_loss=0.0468]Steps: 42%|████▏ | 838/2000 [17:41<22:06, 1.14s/it, lr=0.0001, step_loss=0.0468]Steps: 42%|████▏ | 838/2000 [17:41<22:06, 1.14s/it, lr=0.0001, step_loss=0.00491]Steps: 42%|████▏ | 839/2000 [17:41<19:53, 1.03s/it, lr=0.0001, step_loss=0.00491]Steps: 42%|████▏ | 839/2000 [17:41<19:53, 1.03s/it, lr=0.0001, step_loss=0.126] Steps: 42%|████▏ | 840/2000 [17:42<18:20, 1.05it/s, lr=0.0001, step_loss=0.126]Steps: 42%|████▏ | 840/2000 [17:42<18:20, 1.05it/s, lr=0.0001, step_loss=0.00804]Steps: 42%|████▏ | 841/2000 [17:43<17:14, 1.12it/s, lr=0.0001, step_loss=0.00804]Steps: 42%|████▏ | 841/2000 [17:43<17:14, 1.12it/s, lr=0.0001, step_loss=0.0238] Steps: 42%|████▏ | 842/2000 [17:44<16:27, 1.17it/s, lr=0.0001, step_loss=0.0238]Steps: 42%|████▏ | 842/2000 [17:44<16:27, 1.17it/s, lr=0.0001, step_loss=0.00521]Steps: 42%|████▏ | 843/2000 [17:44<15:55, 1.21it/s, lr=0.0001, step_loss=0.00521]Steps: 42%|████▏ | 843/2000 [17:45<15:55, 1.21it/s, lr=0.0001, step_loss=0.135] Steps: 42%|████▏ | 844/2000 [17:45<15:31, 1.24it/s, lr=0.0001, step_loss=0.135]Steps: 42%|████▏ | 844/2000 [17:45<15:31, 1.24it/s, lr=0.0001, step_loss=0.00163]Steps: 42%|████▏ | 845/2000 [17:46<15:15, 1.26it/s, lr=0.0001, 
step_loss=0.00163]Steps: 42%|████▏ | 845/2000 [17:46<15:15, 1.26it/s, lr=0.0001, step_loss=0.0463] Steps: 42%|████▏ | 846/2000 [17:47<15:03, 1.28it/s, lr=0.0001, step_loss=0.0463]Steps: 42%|████▏ | 846/2000 [17:47<15:03, 1.28it/s, lr=0.0001, step_loss=0.00141]Steps: 42%|████▏ | 847/2000 [17:48<14:55, 1.29it/s, lr=0.0001, step_loss=0.00141]Steps: 42%|████▏ | 847/2000 [17:48<14:55, 1.29it/s, lr=0.0001, step_loss=0.0104] Steps: 42%|████▏ | 848/2000 [17:48<14:48, 1.30it/s, lr=0.0001, step_loss=0.0104]Steps: 42%|████▏ | 848/2000 [17:48<14:48, 1.30it/s, lr=0.0001, step_loss=0.0995]Steps: 42%|████▏ | 849/2000 [17:49<14:44, 1.30it/s, lr=0.0001, step_loss=0.0995]Steps: 42%|████▏ | 849/2000 [17:49<14:44, 1.30it/s, lr=0.0001, step_loss=0.00193]Steps: 42%|████▎ | 850/2000 [17:50<14:40, 1.31it/s, lr=0.0001, step_loss=0.00193]Steps: 42%|████▎ | 850/2000 [17:50<14:40, 1.31it/s, lr=0.0001, step_loss=0.00134]Steps: 43%|████▎ | 851/2000 [17:51<14:38, 1.31it/s, lr=0.0001, step_loss=0.00134]Steps: 43%|████▎ | 851/2000 [17:51<14:38, 1.31it/s, lr=0.0001, step_loss=0.00144]Steps: 43%|████▎ | 852/2000 [17:51<14:36, 1.31it/s, lr=0.0001, step_loss=0.00144]Steps: 43%|████▎ | 852/2000 [17:51<14:36, 1.31it/s, lr=0.0001, step_loss=0.0122] Steps: 43%|████▎ | 853/2000 [17:52<14:34, 1.31it/s, lr=0.0001, step_loss=0.0122]Steps: 43%|████▎ | 853/2000 [17:52<14:34, 1.31it/s, lr=0.0001, step_loss=0.00245]Steps: 43%|████▎ | 854/2000 [17:53<14:33, 1.31it/s, lr=0.0001, step_loss=0.00245]Steps: 43%|████▎ | 854/2000 [17:53<14:33, 1.31it/s, lr=0.0001, step_loss=0.236] Steps: 43%|████▎ | 855/2000 [17:54<14:32, 1.31it/s, lr=0.0001, step_loss=0.236]Steps: 43%|████▎ | 855/2000 [17:54<14:32, 1.31it/s, lr=0.0001, step_loss=0.000367]Steps: 43%|████▎ | 856/2000 [17:54<14:31, 1.31it/s, lr=0.0001, step_loss=0.000367]Steps: 43%|████▎ | 856/2000 [17:54<14:31, 1.31it/s, lr=0.0001, step_loss=0.00104] Steps: 43%|████▎ | 857/2000 [17:55<14:31, 1.31it/s, lr=0.0001, step_loss=0.00104]Steps: 43%|████▎ | 857/2000 [17:55<14:31, 
1.31it/s, lr=0.0001, step_loss=0.0208] Steps: 43%|████▎ | 858/2000 [17:56<14:30, 1.31it/s, lr=0.0001, step_loss=0.0208]Steps: 43%|████▎ | 858/2000 [17:56<14:30, 1.31it/s, lr=0.0001, step_loss=0.0103]Steps: 43%|████▎ | 859/2000 [17:57<14:29, 1.31it/s, lr=0.0001, step_loss=0.0103]Steps: 43%|████▎ | 859/2000 [17:57<14:29, 1.31it/s, lr=0.0001, step_loss=0.00507]Steps: 43%|████▎ | 860/2000 [17:57<14:28, 1.31it/s, lr=0.0001, step_loss=0.00507]Steps: 43%|████▎ | 860/2000 [17:57<14:28, 1.31it/s, lr=0.0001, step_loss=0.000537]Steps: 43%|████▎ | 861/2000 [17:58<14:27, 1.31it/s, lr=0.0001, step_loss=0.000537]Steps: 43%|████▎ | 861/2000 [17:58<14:27, 1.31it/s, lr=0.0001, step_loss=0.12] Steps: 43%|████▎ | 862/2000 [17:59<14:26, 1.31it/s, lr=0.0001, step_loss=0.12]Steps: 43%|████▎ | 862/2000 [17:59<14:26, 1.31it/s, lr=0.0001, step_loss=0.00084]Steps: 43%|████▎ | 863/2000 [18:00<14:26, 1.31it/s, lr=0.0001, step_loss=0.00084]Steps: 43%|████▎ | 863/2000 [18:00<14:26, 1.31it/s, lr=0.0001, step_loss=0.0451] Steps: 43%|████▎ | 864/2000 [18:00<14:26, 1.31it/s, lr=0.0001, step_loss=0.0451]11/14/2025 06:26:47 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 864)
Steps: 43%|████▎ | 864/2000 [18:08<14:26, 1.31it/s, lr=0.0001, step_loss=0.22] 11/14/2025 06:26:47 - INFO - root - ### DEBUG: Finished epoch 26, epoch_steps=32, global_step=864
11/14/2025 06:26:47 - INFO - root - ### DEBUG: Starting epoch 27/63, global_step=864, max_train_steps=2000
Steps: 43%|████▎ | 865/2000 [18:09<55:41, 2.94s/it, lr=0.0001, step_loss=0.22]Steps: 43%|████▎ | 865/2000 [18:09<55:41, 2.94s/it, lr=0.0001, step_loss=0.0472]Steps: 43%|████▎ | 866/2000 [18:09<43:16, 2.29s/it, lr=0.0001, step_loss=0.0472]Steps: 43%|████▎ | 866/2000 [18:09<43:16, 2.29s/it, lr=0.0001, step_loss=0.0121]Steps: 43%|████▎ | 867/2000 [18:10<34:34, 1.83s/it, lr=0.0001, step_loss=0.0121]Steps: 43%|████▎ | 867/2000 [18:10<34:34, 1.83s/it, lr=0.0001, step_loss=0.00154]Steps: 43%|████▎ | 868/2000 [18:11<28:29, 1.51s/it, lr=0.0001, step_loss=0.00154]Steps: 43%|████▎ | 868/2000 [18:11<28:29, 1.51s/it, lr=0.0001, step_loss=0.00574]Steps: 43%|████▎ | 869/2000 [18:12<24:13, 1.29s/it, lr=0.0001, step_loss=0.00574]Steps: 43%|████▎ | 869/2000 [18:12<24:13, 1.29s/it, lr=0.0001, step_loss=0.0226] Steps: 44%|████▎ | 870/2000 [18:12<21:15, 1.13s/it, lr=0.0001, step_loss=0.0226]Steps: 44%|████▎ | 870/2000 [18:12<21:15, 1.13s/it, lr=0.0001, step_loss=0.0448]Steps: 44%|████▎ | 871/2000 [18:13<19:09, 1.02s/it, lr=0.0001, step_loss=0.0448]Steps: 44%|████▎ | 871/2000 [18:13<19:09, 1.02s/it, lr=0.0001, step_loss=0.00739]Steps: 44%|████▎ | 872/2000 [18:14<17:41, 1.06it/s, lr=0.0001, step_loss=0.00739]Steps: 44%|████▎ | 872/2000 [18:14<17:41, 1.06it/s, lr=0.0001, step_loss=0.00339]Steps: 44%|████▎ | 873/2000 [18:15<16:39, 1.13it/s, lr=0.0001, step_loss=0.00339]Steps: 44%|████▎ | 873/2000 [18:15<16:39, 1.13it/s, lr=0.0001, step_loss=0.027] Steps: 44%|████▎ | 874/2000 [18:15<15:56, 1.18it/s, lr=0.0001, step_loss=0.027]Steps: 44%|████▎ | 874/2000 [18:15<15:56, 1.18it/s, lr=0.0001, step_loss=0.00226]Steps: 44%|████▍ | 875/2000 [18:16<15:26, 1.21it/s, lr=0.0001, step_loss=0.00226]Steps: 44%|████▍ | 875/2000 [18:16<15:26, 1.21it/s, lr=0.0001, step_loss=0.000755]Steps: 44%|████▍ | 876/2000 [18:17<15:04, 1.24it/s, lr=0.0001, step_loss=0.000755]Steps: 44%|████▍ | 876/2000 [18:17<15:04, 1.24it/s, lr=0.0001, step_loss=0.00113] Steps: 44%|████▍ | 877/2000 [18:18<14:48, 1.26it/s, lr=0.0001, 
step_loss=0.00113]Steps: 44%|████▍ | 877/2000 [18:18<14:48, 1.26it/s, lr=0.0001, step_loss=0.00284]Steps: 44%|████▍ | 878/2000 [18:18<14:38, 1.28it/s, lr=0.0001, step_loss=0.00284]Steps: 44%|████▍ | 878/2000 [18:18<14:38, 1.28it/s, lr=0.0001, step_loss=0.000966]Steps: 44%|████▍ | 879/2000 [18:19<14:31, 1.29it/s, lr=0.0001, step_loss=0.000966]Steps: 44%|████▍ | 879/2000 [18:19<14:31, 1.29it/s, lr=0.0001, step_loss=0.173] Steps: 44%|████▍ | 880/2000 [18:20<14:26, 1.29it/s, lr=0.0001, step_loss=0.173]Steps: 44%|████▍ | 880/2000 [18:20<14:26, 1.29it/s, lr=0.0001, step_loss=0.00136]Steps: 44%|████▍ | 881/2000 [18:21<14:25, 1.29it/s, lr=0.0001, step_loss=0.00136]Steps: 44%|████▍ | 881/2000 [18:21<14:25, 1.29it/s, lr=0.0001, step_loss=0.00474]Steps: 44%|████▍ | 882/2000 [18:21<14:22, 1.30it/s, lr=0.0001, step_loss=0.00474]Steps: 44%|████▍ | 882/2000 [18:22<14:22, 1.30it/s, lr=0.0001, step_loss=0.00171]Steps: 44%|████▍ | 883/2000 [18:22<14:18, 1.30it/s, lr=0.0001, step_loss=0.00171]Steps: 44%|████▍ | 883/2000 [18:22<14:18, 1.30it/s, lr=0.0001, step_loss=0.00141]Steps: 44%|████▍ | 884/2000 [18:23<14:15, 1.31it/s, lr=0.0001, step_loss=0.00141]Steps: 44%|████▍ | 884/2000 [18:23<14:15, 1.31it/s, lr=0.0001, step_loss=0.00442]Steps: 44%|████▍ | 885/2000 [18:24<14:13, 1.31it/s, lr=0.0001, step_loss=0.00442]Steps: 44%|████▍ | 885/2000 [18:24<14:13, 1.31it/s, lr=0.0001, step_loss=0.127] Steps: 44%|████▍ | 886/2000 [18:25<14:12, 1.31it/s, lr=0.0001, step_loss=0.127]Steps: 44%|████▍ | 886/2000 [18:25<14:12, 1.31it/s, lr=0.0001, step_loss=0.0646]Steps: 44%|████▍ | 887/2000 [18:25<14:11, 1.31it/s, lr=0.0001, step_loss=0.0646]Steps: 44%|████▍ | 887/2000 [18:25<14:11, 1.31it/s, lr=0.0001, step_loss=0.0132]Steps: 44%|████▍ | 888/2000 [18:26<14:10, 1.31it/s, lr=0.0001, step_loss=0.0132]Steps: 44%|████▍ | 888/2000 [18:26<14:10, 1.31it/s, lr=0.0001, step_loss=0.00387]Steps: 44%|████▍ | 889/2000 [18:27<14:09, 1.31it/s, lr=0.0001, step_loss=0.00387]Steps: 44%|████▍ | 889/2000 [18:27<14:09, 
1.31it/s, lr=0.0001, step_loss=0.00277]Steps: 44%|████▍ | 890/2000 [18:28<14:08, 1.31it/s, lr=0.0001, step_loss=0.00277]Steps: 44%|████▍ | 890/2000 [18:28<14:08, 1.31it/s, lr=0.0001, step_loss=0.00688]Steps: 45%|████▍ | 891/2000 [18:28<14:08, 1.31it/s, lr=0.0001, step_loss=0.00688]Steps: 45%|████▍ | 891/2000 [18:28<14:08, 1.31it/s, lr=0.0001, step_loss=0.0284] Steps: 45%|████▍ | 892/2000 [18:29<14:07, 1.31it/s, lr=0.0001, step_loss=0.0284]Steps: 45%|████▍ | 892/2000 [18:29<14:07, 1.31it/s, lr=0.0001, step_loss=0.0382]Steps: 45%|████▍ | 893/2000 [18:30<14:05, 1.31it/s, lr=0.0001, step_loss=0.0382]Steps: 45%|████▍ | 893/2000 [18:30<14:05, 1.31it/s, lr=0.0001, step_loss=0.115] Steps: 45%|████▍ | 894/2000 [18:31<14:04, 1.31it/s, lr=0.0001, step_loss=0.115]Steps: 45%|████▍ | 894/2000 [18:31<14:04, 1.31it/s, lr=0.0001, step_loss=0.0186]Steps: 45%|████▍ | 895/2000 [18:31<14:03, 1.31it/s, lr=0.0001, step_loss=0.0186]Steps: 45%|████▍ | 895/2000 [18:31<14:03, 1.31it/s, lr=0.0001, step_loss=0.0245]Steps: 45%|████▍ | 896/2000 [18:32<14:02, 1.31it/s, lr=0.0001, step_loss=0.0245]11/14/2025 06:27:19 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 896)
Steps: 45%|████▍ | 896/2000 [18:40<14:02, 1.31it/s, lr=0.0001, step_loss=0.181] 11/14/2025 06:27:19 - INFO - root - ### DEBUG: Finished epoch 27, epoch_steps=32, global_step=896
11/14/2025 06:27:19 - INFO - root - ### DEBUG: Starting epoch 28/63, global_step=896, max_train_steps=2000
Steps: 45%|████▍ | 897/2000 [18:40<55:42, 3.03s/it, lr=0.0001, step_loss=0.181]Steps: 45%|████▍ | 897/2000 [18:41<55:42, 3.03s/it, lr=0.0001, step_loss=0.000694]Steps: 45%|████▍ | 898/2000 [18:41<43:09, 2.35s/it, lr=0.0001, step_loss=0.000694]Steps: 45%|████▍ | 898/2000 [18:41<43:09, 2.35s/it, lr=0.0001, step_loss=0.00225] Steps: 45%|████▍ | 899/2000 [18:42<34:22, 1.87s/it, lr=0.0001, step_loss=0.00225]Steps: 45%|████▍ | 899/2000 [18:42<34:22, 1.87s/it, lr=0.0001, step_loss=0.00617]Steps: 45%|████▌ | 900/2000 [18:43<28:13, 1.54s/it, lr=0.0001, step_loss=0.00617]Steps: 45%|████▌ | 900/2000 [18:43<28:13, 1.54s/it, lr=0.0001, step_loss=0.126] Steps: 45%|████▌ | 901/2000 [18:44<23:55, 1.31s/it, lr=0.0001, step_loss=0.126]Steps: 45%|████▌ | 901/2000 [18:44<23:55, 1.31s/it, lr=0.0001, step_loss=0.0544]Steps: 45%|████▌ | 902/2000 [18:44<20:54, 1.14s/it, lr=0.0001, step_loss=0.0544]Steps: 45%|████▌ | 902/2000 [18:44<20:54, 1.14s/it, lr=0.0001, step_loss=0.00088]Steps: 45%|████▌ | 903/2000 [18:45<18:47, 1.03s/it, lr=0.0001, step_loss=0.00088]Steps: 45%|████▌ | 903/2000 [18:45<18:47, 1.03s/it, lr=0.0001, step_loss=0.00469]Steps: 45%|████▌ | 904/2000 [18:46<17:18, 1.06it/s, lr=0.0001, step_loss=0.00469]Steps: 45%|████▌ | 904/2000 [18:46<17:18, 1.06it/s, lr=0.0001, step_loss=0.137] Steps: 45%|████▌ | 905/2000 [18:47<16:16, 1.12it/s, lr=0.0001, step_loss=0.137]Steps: 45%|████▌ | 905/2000 [18:47<16:16, 1.12it/s, lr=0.0001, step_loss=0.0183]Steps: 45%|████▌ | 906/2000 [18:47<15:32, 1.17it/s, lr=0.0001, step_loss=0.0183]Steps: 45%|████▌ | 906/2000 [18:47<15:32, 1.17it/s, lr=0.0001, step_loss=0.000807]Steps: 45%|████▌ | 907/2000 [18:48<15:01, 1.21it/s, lr=0.0001, step_loss=0.000807]Steps: 45%|████▌ | 907/2000 [18:48<15:01, 1.21it/s, lr=0.0001, step_loss=0.12] Steps: 45%|████▌ | 908/2000 [18:49<14:39, 1.24it/s, lr=0.0001, step_loss=0.12]Steps: 45%|████▌ | 908/2000 [18:49<14:39, 1.24it/s, lr=0.0001, step_loss=0.208]Steps: 45%|████▌ | 909/2000 [18:50<14:24, 1.26it/s, lr=0.0001, 
step_loss=0.208]Steps: 45%|████▌ | 909/2000 [18:50<14:24, 1.26it/s, lr=0.0001, step_loss=0.000674]Steps: 46%|████▌ | 910/2000 [18:50<14:13, 1.28it/s, lr=0.0001, step_loss=0.000674]Steps: 46%|████▌ | 910/2000 [18:50<14:13, 1.28it/s, lr=0.0001, step_loss=0.00172] Steps: 46%|████▌ | 911/2000 [18:51<14:05, 1.29it/s, lr=0.0001, step_loss=0.00172]Steps: 46%|████▌ | 911/2000 [18:51<14:05, 1.29it/s, lr=0.0001, step_loss=0.0932] Steps: 46%|████▌ | 912/2000 [18:52<13:59, 1.30it/s, lr=0.0001, step_loss=0.0932]Steps: 46%|████▌ | 912/2000 [18:52<13:59, 1.30it/s, lr=0.0001, step_loss=0.000509]Steps: 46%|████▌ | 913/2000 [18:53<13:55, 1.30it/s, lr=0.0001, step_loss=0.000509]Steps: 46%|████▌ | 913/2000 [18:53<13:55, 1.30it/s, lr=0.0001, step_loss=0.000701]Steps: 46%|████▌ | 914/2000 [18:53<13:52, 1.30it/s, lr=0.0001, step_loss=0.000701]Steps: 46%|████▌ | 914/2000 [18:53<13:52, 1.30it/s, lr=0.0001, step_loss=0.00888] Steps: 46%|████▌ | 915/2000 [18:54<13:50, 1.31it/s, lr=0.0001, step_loss=0.00888]Steps: 46%|████▌ | 915/2000 [18:54<13:50, 1.31it/s, lr=0.0001, step_loss=0.0232] Steps: 46%|████▌ | 916/2000 [18:55<13:47, 1.31it/s, lr=0.0001, step_loss=0.0232]Steps: 46%|████▌ | 916/2000 [18:55<13:47, 1.31it/s, lr=0.0001, step_loss=0.0559]Steps: 46%|████▌ | 917/2000 [18:56<13:46, 1.31it/s, lr=0.0001, step_loss=0.0559]Steps: 46%|████▌ | 917/2000 [18:56<13:46, 1.31it/s, lr=0.0001, step_loss=0.00571]Steps: 46%|████▌ | 918/2000 [18:56<13:45, 1.31it/s, lr=0.0001, step_loss=0.00571]Steps: 46%|████▌ | 918/2000 [18:57<13:45, 1.31it/s, lr=0.0001, step_loss=0.0386] Steps: 46%|████▌ | 919/2000 [18:57<13:44, 1.31it/s, lr=0.0001, step_loss=0.0386]Steps: 46%|████▌ | 919/2000 [18:57<13:44, 1.31it/s, lr=0.0001, step_loss=0.00152]Steps: 46%|████▌ | 920/2000 [18:58<13:43, 1.31it/s, lr=0.0001, step_loss=0.00152]Steps: 46%|████▌ | 920/2000 [18:58<13:43, 1.31it/s, lr=0.0001, step_loss=0.000719]Steps: 46%|████▌ | 921/2000 [18:59<13:42, 1.31it/s, lr=0.0001, step_loss=0.000719]Steps: 46%|████▌ | 921/2000 
[18:59<13:42, 1.31it/s, lr=0.0001, step_loss=0.0017] Steps: 46%|████▌ | 922/2000 [19:00<13:41, 1.31it/s, lr=0.0001, step_loss=0.0017]Steps: 46%|████▌ | 922/2000 [19:00<13:41, 1.31it/s, lr=0.0001, step_loss=0.0673]Steps: 46%|████▌ | 923/2000 [19:00<13:40, 1.31it/s, lr=0.0001, step_loss=0.0673]Steps: 46%|████▌ | 923/2000 [19:00<13:40, 1.31it/s, lr=0.0001, step_loss=0.0381]Steps: 46%|████▌ | 924/2000 [19:01<13:39, 1.31it/s, lr=0.0001, step_loss=0.0381]Steps: 46%|████▌ | 924/2000 [19:01<13:39, 1.31it/s, lr=0.0001, step_loss=0.277] Steps: 46%|████▋ | 925/2000 [19:02<13:38, 1.31it/s, lr=0.0001, step_loss=0.277]Steps: 46%|████▋ | 925/2000 [19:02<13:38, 1.31it/s, lr=0.0001, step_loss=0.00154]Steps: 46%|████▋ | 926/2000 [19:03<13:38, 1.31it/s, lr=0.0001, step_loss=0.00154]Steps: 46%|████▋ | 926/2000 [19:03<13:38, 1.31it/s, lr=0.0001, step_loss=0.0617] Steps: 46%|████▋ | 927/2000 [19:03<13:37, 1.31it/s, lr=0.0001, step_loss=0.0617]Steps: 46%|████▋ | 927/2000 [19:03<13:37, 1.31it/s, lr=0.0001, step_loss=0.0935]Steps: 46%|████▋ | 928/2000 [19:04<13:35, 1.31it/s, lr=0.0001, step_loss=0.0935]11/14/2025 06:27:51 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 928)
Steps: 46%|████▋ | 928/2000 [19:11<13:35, 1.31it/s, lr=0.0001, step_loss=0.00667]11/14/2025 06:27:51 - INFO - root - ### DEBUG: Finished epoch 28, epoch_steps=32, global_step=928
11/14/2025 06:27:51 - INFO - root - ### DEBUG: Starting epoch 29/63, global_step=928, max_train_steps=2000
Steps: 46%|████▋ | 929/2000 [19:12<51:57, 2.91s/it, lr=0.0001, step_loss=0.00667]Steps: 46%|████▋ | 929/2000 [19:12<51:57, 2.91s/it, lr=0.0001, step_loss=0.104] Steps: 46%|████▋ | 930/2000 [19:13<40:25, 2.27s/it, lr=0.0001, step_loss=0.104]Steps: 46%|████▋ | 930/2000 [19:13<40:25, 2.27s/it, lr=0.0001, step_loss=0.0513]Steps: 47%|████▋ | 931/2000 [19:14<32:21, 1.82s/it, lr=0.0001, step_loss=0.0513]Steps: 47%|████▋ | 931/2000 [19:14<32:21, 1.82s/it, lr=0.0001, step_loss=0.0156]Steps: 47%|████▋ | 932/2000 [19:14<26:42, 1.50s/it, lr=0.0001, step_loss=0.0156]Steps: 47%|████▋ | 932/2000 [19:14<26:42, 1.50s/it, lr=0.0001, step_loss=0.00531]Steps: 47%|████▋ | 933/2000 [19:15<22:44, 1.28s/it, lr=0.0001, step_loss=0.00531]Steps: 47%|████▋ | 933/2000 [19:15<22:44, 1.28s/it, lr=0.0001, step_loss=0.00135]Steps: 47%|████▋ | 934/2000 [19:16<19:57, 1.12s/it, lr=0.0001, step_loss=0.00135]Steps: 47%|████▋ | 934/2000 [19:16<19:57, 1.12s/it, lr=0.0001, step_loss=0.0118] Steps: 47%|████▋ | 935/2000 [19:17<18:01, 1.02s/it, lr=0.0001, step_loss=0.0118]Steps: 47%|████▋ | 935/2000 [19:17<18:01, 1.02s/it, lr=0.0001, step_loss=0.00426]Steps: 47%|████▋ | 936/2000 [19:17<16:39, 1.06it/s, lr=0.0001, step_loss=0.00426]Steps: 47%|████▋ | 936/2000 [19:17<16:39, 1.06it/s, lr=0.0001, step_loss=0.00497]Steps: 47%|████▋ | 937/2000 [19:18<15:41, 1.13it/s, lr=0.0001, step_loss=0.00497]Steps: 47%|████▋ | 937/2000 [19:18<15:41, 1.13it/s, lr=0.0001, step_loss=0.00893]Steps: 47%|████▋ | 938/2000 [19:19<15:01, 1.18it/s, lr=0.0001, step_loss=0.00893]Steps: 47%|████▋ | 938/2000 [19:19<15:01, 1.18it/s, lr=0.0001, step_loss=0.429] Steps: 47%|████▋ | 939/2000 [19:20<14:33, 1.21it/s, lr=0.0001, step_loss=0.429]Steps: 47%|████▋ | 939/2000 [19:20<14:33, 1.21it/s, lr=0.0001, step_loss=0.0016]Steps: 47%|████▋ | 940/2000 [19:20<14:13, 1.24it/s, lr=0.0001, step_loss=0.0016]Steps: 47%|████▋ | 940/2000 [19:20<14:13, 1.24it/s, lr=0.0001, step_loss=0.151] Steps: 47%|████▋ | 941/2000 [19:21<13:59, 1.26it/s, lr=0.0001, 
step_loss=0.151]Steps: 47%|████▋ | 941/2000 [19:21<13:59, 1.26it/s, lr=0.0001, step_loss=0.000991]Steps: 47%|████▋ | 942/2000 [19:22<13:48, 1.28it/s, lr=0.0001, step_loss=0.000991]Steps: 47%|████▋ | 942/2000 [19:22<13:48, 1.28it/s, lr=0.0001, step_loss=0.000809]Steps: 47%|████▋ | 943/2000 [19:23<13:40, 1.29it/s, lr=0.0001, step_loss=0.000809]Steps: 47%|████▋ | 943/2000 [19:23<13:40, 1.29it/s, lr=0.0001, step_loss=0.00417] Steps: 47%|████▋ | 944/2000 [19:23<13:35, 1.30it/s, lr=0.0001, step_loss=0.00417]Steps: 47%|████▋ | 944/2000 [19:23<13:35, 1.30it/s, lr=0.0001, step_loss=0.0103] Steps: 47%|████▋ | 945/2000 [19:24<13:31, 1.30it/s, lr=0.0001, step_loss=0.0103]Steps: 47%|████▋ | 945/2000 [19:24<13:31, 1.30it/s, lr=0.0001, step_loss=0.0535]Steps: 47%|████▋ | 946/2000 [19:25<13:28, 1.30it/s, lr=0.0001, step_loss=0.0535]Steps: 47%|████▋ | 946/2000 [19:25<13:28, 1.30it/s, lr=0.0001, step_loss=0.0888]Steps: 47%|████▋ | 947/2000 [19:26<13:25, 1.31it/s, lr=0.0001, step_loss=0.0888]Steps: 47%|████▋ | 947/2000 [19:26<13:25, 1.31it/s, lr=0.0001, step_loss=0.00387]Steps: 47%|████▋ | 948/2000 [19:27<13:23, 1.31it/s, lr=0.0001, step_loss=0.00387]Steps: 47%|████▋ | 948/2000 [19:27<13:23, 1.31it/s, lr=0.0001, step_loss=0.00155]Steps: 47%|████▋ | 949/2000 [19:27<13:22, 1.31it/s, lr=0.0001, step_loss=0.00155]Steps: 47%|████▋ | 949/2000 [19:27<13:22, 1.31it/s, lr=0.0001, step_loss=0.00273]Steps: 48%|████▊ | 950/2000 [19:28<13:20, 1.31it/s, lr=0.0001, step_loss=0.00273]Steps: 48%|████▊ | 950/2000 [19:28<13:20, 1.31it/s, lr=0.0001, step_loss=0.0232] Steps: 48%|████▊ | 951/2000 [19:29<13:19, 1.31it/s, lr=0.0001, step_loss=0.0232]Steps: 48%|████▊ | 951/2000 [19:29<13:19, 1.31it/s, lr=0.0001, step_loss=0.00396]Steps: 48%|████▊ | 952/2000 [19:30<13:18, 1.31it/s, lr=0.0001, step_loss=0.00396]Steps: 48%|████▊ | 952/2000 [19:30<13:18, 1.31it/s, lr=0.0001, step_loss=0.014] Steps: 48%|████▊ | 953/2000 [19:30<13:17, 1.31it/s, lr=0.0001, step_loss=0.014]Steps: 48%|████▊ | 953/2000 [19:30<13:17, 
1.31it/s, lr=0.0001, step_loss=0.0228]Steps: 48%|████▊ | 954/2000 [19:31<13:16, 1.31it/s, lr=0.0001, step_loss=0.0228]Steps: 48%|████▊ | 954/2000 [19:31<13:16, 1.31it/s, lr=0.0001, step_loss=0.0548]Steps: 48%|████▊ | 955/2000 [19:32<13:15, 1.31it/s, lr=0.0001, step_loss=0.0548]Steps: 48%|████▊ | 955/2000 [19:32<13:15, 1.31it/s, lr=0.0001, step_loss=0.00111]Steps: 48%|████▊ | 956/2000 [19:33<13:15, 1.31it/s, lr=0.0001, step_loss=0.00111]Steps: 48%|████▊ | 956/2000 [19:33<13:15, 1.31it/s, lr=0.0001, step_loss=0.00086]Steps: 48%|████▊ | 957/2000 [19:33<13:14, 1.31it/s, lr=0.0001, step_loss=0.00086]Steps: 48%|████▊ | 957/2000 [19:33<13:14, 1.31it/s, lr=0.0001, step_loss=0.000919]Steps: 48%|████▊ | 958/2000 [19:34<13:13, 1.31it/s, lr=0.0001, step_loss=0.000919]Steps: 48%|████▊ | 958/2000 [19:34<13:13, 1.31it/s, lr=0.0001, step_loss=0.00198] Steps: 48%|████▊ | 959/2000 [19:35<13:12, 1.31it/s, lr=0.0001, step_loss=0.00198]Steps: 48%|████▊ | 959/2000 [19:35<13:12, 1.31it/s, lr=0.0001, step_loss=0.0531] Steps: 48%|████▊ | 960/2000 [19:36<13:11, 1.31it/s, lr=0.0001, step_loss=0.0531]11/14/2025 06:28:22 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 960)
Steps: 48%|████▊ | 960/2000 [19:43<13:11, 1.31it/s, lr=0.0001, step_loss=0.0178]11/14/2025 06:28:22 - INFO - root - ### DEBUG: Finished epoch 29, epoch_steps=32, global_step=960
11/14/2025 06:28:22 - INFO - root - ### DEBUG: Starting epoch 30/63, global_step=960, max_train_steps=2000
Steps: 48%|████▊ | 961/2000 [19:44<50:27, 2.91s/it, lr=0.0001, step_loss=0.0178]Steps: 48%|████▊ | 961/2000 [19:44<50:27, 2.91s/it, lr=0.0001, step_loss=0.155] Steps: 48%|████▊ | 962/2000 [19:44<39:14, 2.27s/it, lr=0.0001, step_loss=0.155]Steps: 48%|████▊ | 962/2000 [19:44<39:14, 2.27s/it, lr=0.0001, step_loss=0.0164]Steps: 48%|████▊ | 963/2000 [19:45<31:24, 1.82s/it, lr=0.0001, step_loss=0.0164]Steps: 48%|████▊ | 963/2000 [19:45<31:24, 1.82s/it, lr=0.0001, step_loss=0.0957]Steps: 48%|████▊ | 964/2000 [19:46<25:54, 1.50s/it, lr=0.0001, step_loss=0.0957]Steps: 48%|████▊ | 964/2000 [19:46<25:54, 1.50s/it, lr=0.0001, step_loss=0.19] Steps: 48%|████▊ | 965/2000 [19:47<22:02, 1.28s/it, lr=0.0001, step_loss=0.19]Steps: 48%|████▊ | 965/2000 [19:47<22:02, 1.28s/it, lr=0.0001, step_loss=0.00704]Steps: 48%|████▊ | 966/2000 [19:47<19:21, 1.12s/it, lr=0.0001, step_loss=0.00704]Steps: 48%|████▊ | 966/2000 [19:47<19:21, 1.12s/it, lr=0.0001, step_loss=0.0222] Steps: 48%|████▊ | 967/2000 [19:48<17:28, 1.02s/it, lr=0.0001, step_loss=0.0222]Steps: 48%|████▊ | 967/2000 [19:48<17:28, 1.02s/it, lr=0.0001, step_loss=0.0503]Steps: 48%|████▊ | 968/2000 [19:49<16:09, 1.06it/s, lr=0.0001, step_loss=0.0503]Steps: 48%|████▊ | 968/2000 [19:49<16:09, 1.06it/s, lr=0.0001, step_loss=0.00547]Steps: 48%|████▊ | 969/2000 [19:50<15:13, 1.13it/s, lr=0.0001, step_loss=0.00547]Steps: 48%|████▊ | 969/2000 [19:50<15:13, 1.13it/s, lr=0.0001, step_loss=0.0583] Steps: 48%|████▊ | 970/2000 [19:50<14:34, 1.18it/s, lr=0.0001, step_loss=0.0583]Steps: 48%|████▊ | 970/2000 [19:50<14:34, 1.18it/s, lr=0.0001, step_loss=0.00859]Steps: 49%|████▊ | 971/2000 [19:51<14:07, 1.21it/s, lr=0.0001, step_loss=0.00859]Steps: 49%|████▊ | 971/2000 [19:51<14:07, 1.21it/s, lr=0.0001, step_loss=0.262] Steps: 49%|████▊ | 972/2000 [19:52<13:47, 1.24it/s, lr=0.0001, step_loss=0.262]Steps: 49%|████▊ | 972/2000 [19:52<13:47, 1.24it/s, lr=0.0001, step_loss=0.00998]Steps: 49%|████▊ | 973/2000 [19:53<13:32, 1.26it/s, lr=0.0001, 
step_loss=0.00998]Steps: 49%|████▊ | 973/2000 [19:53<13:32, 1.26it/s, lr=0.0001, step_loss=0.0992] Steps: 49%|████▊ | 974/2000 [19:53<13:22, 1.28it/s, lr=0.0001, step_loss=0.0992]Steps: 49%|████▊ | 974/2000 [19:54<13:22, 1.28it/s, lr=0.0001, step_loss=0.0622]Steps: 49%|████▉ | 975/2000 [19:54<13:16, 1.29it/s, lr=0.0001, step_loss=0.0622]Steps: 49%|████▉ | 975/2000 [19:54<13:16, 1.29it/s, lr=0.0001, step_loss=0.018] Steps: 49%|████▉ | 976/2000 [19:55<13:10, 1.30it/s, lr=0.0001, step_loss=0.018]Steps: 49%|████▉ | 976/2000 [19:55<13:10, 1.30it/s, lr=0.0001, step_loss=0.0021]Steps: 49%|████▉ | 977/2000 [19:56<13:07, 1.30it/s, lr=0.0001, step_loss=0.0021]Steps: 49%|████▉ | 977/2000 [19:56<13:07, 1.30it/s, lr=0.0001, step_loss=0.0905]Steps: 49%|████▉ | 978/2000 [19:57<13:04, 1.30it/s, lr=0.0001, step_loss=0.0905]Steps: 49%|████▉ | 978/2000 [19:57<13:04, 1.30it/s, lr=0.0001, step_loss=0.426] Steps: 49%|████▉ | 979/2000 [19:57<13:02, 1.31it/s, lr=0.0001, step_loss=0.426]Steps: 49%|████▉ | 979/2000 [19:57<13:02, 1.31it/s, lr=0.0001, step_loss=0.036]Steps: 49%|████▉ | 980/2000 [19:58<13:00, 1.31it/s, lr=0.0001, step_loss=0.036]Steps: 49%|████▉ | 980/2000 [19:58<13:00, 1.31it/s, lr=0.0001, step_loss=0.0252]Steps: 49%|████▉ | 981/2000 [19:59<12:58, 1.31it/s, lr=0.0001, step_loss=0.0252]Steps: 49%|████▉ | 981/2000 [19:59<12:58, 1.31it/s, lr=0.0001, step_loss=0.0133]Steps: 49%|████▉ | 982/2000 [20:00<12:56, 1.31it/s, lr=0.0001, step_loss=0.0133]Steps: 49%|████▉ | 982/2000 [20:00<12:56, 1.31it/s, lr=0.0001, step_loss=0.00115]Steps: 49%|████▉ | 983/2000 [20:00<12:55, 1.31it/s, lr=0.0001, step_loss=0.00115]Steps: 49%|████▉ | 983/2000 [20:00<12:55, 1.31it/s, lr=0.0001, step_loss=0.00236]Steps: 49%|████▉ | 984/2000 [20:01<12:53, 1.31it/s, lr=0.0001, step_loss=0.00236]Steps: 49%|████▉ | 984/2000 [20:01<12:53, 1.31it/s, lr=0.0001, step_loss=0.0102] Steps: 49%|████▉ | 985/2000 [20:02<13:13, 1.28it/s, lr=0.0001, step_loss=0.0102]Steps: 49%|████▉ | 985/2000 [20:02<13:13, 1.28it/s, 
lr=0.0001, step_loss=0.0354]Steps: 49%|████▉ | 986/2000 [20:03<13:07, 1.29it/s, lr=0.0001, step_loss=0.0354]Steps: 49%|████▉ | 986/2000 [20:03<13:07, 1.29it/s, lr=0.0001, step_loss=0.0891]Steps: 49%|████▉ | 987/2000 [20:03<13:02, 1.30it/s, lr=0.0001, step_loss=0.0891]Steps: 49%|████▉ | 987/2000 [20:03<13:02, 1.30it/s, lr=0.0001, step_loss=0.0231]Steps: 49%|████▉ | 988/2000 [20:04<12:58, 1.30it/s, lr=0.0001, step_loss=0.0231]Steps: 49%|████▉ | 988/2000 [20:04<12:58, 1.30it/s, lr=0.0001, step_loss=0.0563]Steps: 49%|████▉ | 989/2000 [20:05<12:55, 1.30it/s, lr=0.0001, step_loss=0.0563]Steps: 49%|████▉ | 989/2000 [20:05<12:55, 1.30it/s, lr=0.0001, step_loss=0.01] Steps: 50%|████▉ | 990/2000 [20:06<12:52, 1.31it/s, lr=0.0001, step_loss=0.01]Steps: 50%|████▉ | 990/2000 [20:06<12:52, 1.31it/s, lr=0.0001, step_loss=0.0203]Steps: 50%|████▉ | 991/2000 [20:06<12:50, 1.31it/s, lr=0.0001, step_loss=0.0203]Steps: 50%|████▉ | 991/2000 [20:07<12:50, 1.31it/s, lr=0.0001, step_loss=0.0492]Steps: 50%|████▉ | 992/2000 [20:07<12:49, 1.31it/s, lr=0.0001, step_loss=0.0492]11/14/2025 06:28:56 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 992)
Steps: 50%|████▉ | 992/2000 [20:16<12:49, 1.31it/s, lr=0.0001, step_loss=0.0969]11/14/2025 06:28:56 - INFO - root - ### DEBUG: Finished epoch 30, epoch_steps=32, global_step=992
11/14/2025 06:28:56 - INFO - root - ### DEBUG: Starting epoch 31/63, global_step=992, max_train_steps=2000
Steps: 50%|████▉ | 993/2000 [20:17<58:10, 3.47s/it, lr=0.0001, step_loss=0.0969]Steps: 50%|████▉ | 993/2000 [20:17<58:10, 3.47s/it, lr=0.0001, step_loss=0.408] Steps: 50%|████▉ | 994/2000 [20:18<44:31, 2.66s/it, lr=0.0001, step_loss=0.408]Steps: 50%|████▉ | 994/2000 [20:18<44:31, 2.66s/it, lr=0.0001, step_loss=0.0236]Steps: 50%|████▉ | 995/2000 [20:19<34:57, 2.09s/it, lr=0.0001, step_loss=0.0236]Steps: 50%|████▉ | 995/2000 [20:19<34:57, 2.09s/it, lr=0.0001, step_loss=0.00132]Steps: 50%|████▉ | 996/2000 [20:19<28:16, 1.69s/it, lr=0.0001, step_loss=0.00132]Steps: 50%|████▉ | 996/2000 [20:19<28:16, 1.69s/it, lr=0.0001, step_loss=0.0704] Steps: 50%|████▉ | 997/2000 [20:20<23:35, 1.41s/it, lr=0.0001, step_loss=0.0704]Steps: 50%|████▉ | 997/2000 [20:20<23:35, 1.41s/it, lr=0.0001, step_loss=0.00739]Steps: 50%|████▉ | 998/2000 [20:21<20:18, 1.22s/it, lr=0.0001, step_loss=0.00739]Steps: 50%|████▉ | 998/2000 [20:21<20:18, 1.22s/it, lr=0.0001, step_loss=0.00501]Steps: 50%|████▉ | 999/2000 [20:22<18:01, 1.08s/it, lr=0.0001, step_loss=0.00501]Steps: 50%|████▉ | 999/2000 [20:22<18:01, 1.08s/it, lr=0.0001, step_loss=0.0175] Steps: 50%|█████ | 1000/2000 [20:22<16:24, 1.02it/s, lr=0.0001, step_loss=0.0175]11/14/2025 06:29:16 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1000)
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.70it/s]
8%|▊ | 2/25 [00:01<00:13, 1.70it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:10, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]
44%|████▍ | 11/25 [00:06<00:08, 1.70it/s]
48%|████▊ | 12/25 [00:07<00:07, 1.70it/s]
52%|█████▏ | 13/25 [00:07<00:07, 1.70it/s]
56%|█████▌ | 14/25 [00:08<00:06, 1.70it/s]
60%|██████ | 15/25 [00:08<00:05, 1.70it/s]
64%|██████▍ | 16/25 [00:09<00:05, 1.70it/s]
68%|██████▊ | 17/25 [00:10<00:04, 1.70it/s]
72%|███████▏ | 18/25 [00:10<00:04, 1.70it/s]
76%|███████▌ | 19/25 [00:11<00:03, 1.70it/s]
80%|████████ | 20/25 [00:11<00:02, 1.70it/s]
84%|████████▍ | 21/25 [00:12<00:02, 1.70it/s]
88%|████████▊ | 22/25 [00:12<00:01, 1.70it/s]
92%|█████████▏| 23/25 [00:13<00:01, 1.70it/s]
96%|█████████▌| 24/25 [00:14<00:00, 1.70it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
0%| | 0/8 [00:00<?, ?it/s]
75%|███████▌ | 6/8 [00:00<00:00, 44.01it/s]
100%|██████████| 8/8 [00:00<00:00, 32.19it/s]
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.70it/s]
8%|▊ | 2/25 [00:01<00:13, 1.70it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:10, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]
44%|████▍ | 11/25 [00:06<00:08, 1.70it/s]
48%|████▊ | 12/25 [00:07<00:07, 1.70it/s]
52%|█████▏ | 13/25 [00:07<00:07, 1.70it/s]
56%|█████▌ | 14/25 [00:08<00:06, 1.70it/s]
60%|██████ | 15/25 [00:08<00:05, 1.70it/s]
64%|██████▍ | 16/25 [00:09<00:05, 1.70it/s]
68%|██████▊ | 17/25 [00:10<00:04, 1.70it/s]
72%|███████▏ | 18/25 [00:10<00:04, 1.70it/s]
76%|███████▌ | 19/25 [00:11<00:03, 1.70it/s]
80%|████████ | 20/25 [00:11<00:02, 1.70it/s]
84%|████████▍ | 21/25 [00:12<00:02, 1.70it/s]
88%|████████▊ | 22/25 [00:12<00:01, 1.70it/s]
92%|█████████▏| 23/25 [00:13<00:01, 1.70it/s]
96%|█████████▌| 24/25 [00:14<00:00, 1.70it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
0%| | 0/8 [00:00<?, ?it/s]
75%|███████▌ | 6/8 [00:00<00:00, 43.98it/s]
100%|██████████| 8/8 [00:00<00:00, 32.17it/s]
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.70it/s]
8%|▊ | 2/25 [00:01<00:13, 1.70it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:10, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]
44%|████▍ | 11/25 [00:06<00:08, 1.70it/s]
48%|████▊ | 12/25 [00:07<00:07, 1.70it/s]
52%|█████▏ | 13/25 [00:07<00:07, 1.70it/s]
56%|█████▌ | 14/25 [00:08<00:06, 1.70it/s]
60%|██████ | 15/25 [00:08<00:05, 1.70it/s]
64%|██████▍ | 16/25 [00:09<00:05, 1.70it/s]
68%|██████▊ | 17/25 [00:10<00:04, 1.70it/s]
72%|███████▏ | 18/25 [00:10<00:04, 1.70it/s]
76%|███████▌ | 19/25 [00:11<00:03, 1.70it/s]
80%|████████ | 20/25 [00:11<00:02, 1.70it/s]
84%|████████▍ | 21/25 [00:12<00:02, 1.70it/s]
88%|████████▊ | 22/25 [00:12<00:01, 1.70it/s]
92%|█████████▏| 23/25 [00:13<00:01, 1.70it/s]
96%|█████████▌| 24/25 [00:14<00:00, 1.70it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
0%| | 0/8 [00:00<?, ?it/s]
75%|███████▌ | 6/8 [00:00<00:00, 44.03it/s]
100%|██████████| 8/8 [00:00<00:00, 32.19it/s]
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.70it/s]
8%|▊ | 2/25 [00:01<00:13, 1.70it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:10, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]
44%|████▍ | 11/25 [00:06<00:08, 1.70it/s]
48%|████▊ | 12/25 [00:07<00:07, 1.70it/s]
52%|█████▏ | 13/25 [00:07<00:07, 1.70it/s]
56%|█████▌ | 14/25 [00:08<00:06, 1.70it/s]
60%|██████ | 15/25 [00:08<00:05, 1.70it/s]
64%|██████▍ | 16/25 [00:09<00:05, 1.70it/s]
68%|██████▊ | 17/25 [00:10<00:04, 1.70it/s]
72%|███████▏ | 18/25 [00:10<00:04, 1.70it/s]
76%|███████▌ | 19/25 [00:11<00:03, 1.70it/s]
80%|████████ | 20/25 [00:11<00:02, 1.70it/s]
84%|████████▍ | 21/25 [00:12<00:02, 1.70it/s]
88%|████████▊ | 22/25 [00:12<00:01, 1.70it/s]
92%|█████████▏| 23/25 [00:13<00:01, 1.70it/s]
96%|█████████▌| 24/25 [00:14<00:00, 1.70it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
0%| | 0/8 [00:00<?, ?it/s]
75%|███████▌ | 6/8 [00:00<00:00, 44.03it/s]
100%|██████████| 8/8 [00:00<00:00, 32.20it/s]
11/14/2025 06:30:20 - INFO - root - Saved samples to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/samples/sample-1000.gif
Steps: 50%|█████ | 1000/2000 [21:41<16:24, 1.02it/s, lr=0.0001, step_loss=0.00653]Steps: 50%|█████ | 1001/2000 [21:41<6:46:32, 24.42s/it, lr=0.0001, step_loss=0.00653]Steps: 50%|█████ | 1001/2000 [21:41<6:46:32, 24.42s/it, lr=0.0001, step_loss=0.0106] Steps: 50%|█████ | 1002/2000 [21:42<4:48:05, 17.32s/it, lr=0.0001, step_loss=0.0106]Steps: 50%|█████ | 1002/2000 [21:42<4:48:05, 17.32s/it, lr=0.0001, step_loss=0.0139]Steps: 50%|█████ | 1003/2000 [21:43<3:25:15, 12.35s/it, lr=0.0001, step_loss=0.0139]Steps: 50%|█████ | 1003/2000 [21:43<3:25:15, 12.35s/it, lr=0.0001, step_loss=0.142] Steps: 50%|█████ | 1004/2000 [21:44<2:27:20, 8.88s/it, lr=0.0001, step_loss=0.142]Steps: 50%|█████ | 1004/2000 [21:44<2:27:20, 8.88s/it, lr=0.0001, step_loss=0.00266]Steps: 50%|█████ | 1005/2000 [21:45<1:46:50, 6.44s/it, lr=0.0001, step_loss=0.00266]Steps: 50%|█████ | 1005/2000 [21:45<1:46:50, 6.44s/it, lr=0.0001, step_loss=0.000427]Steps: 50%|█████ | 1006/2000 [21:45<1:18:29, 4.74s/it, lr=0.0001, step_loss=0.000427]Steps: 50%|█████ | 1006/2000 [21:45<1:18:29, 4.74s/it, lr=0.0001, step_loss=0.0115] Steps: 50%|█████ | 1007/2000 [21:46<58:40, 3.55s/it, lr=0.0001, step_loss=0.0115] Steps: 50%|█████ | 1007/2000 [21:46<58:40, 3.55s/it, lr=0.0001, step_loss=0.00337]Steps: 50%|█████ | 1008/2000 [21:47<44:48, 2.71s/it, lr=0.0001, step_loss=0.00337]Steps: 50%|█████ | 1008/2000 [21:47<44:48, 2.71s/it, lr=0.0001, step_loss=0.156] Steps: 50%|█████ | 1009/2000 [21:48<35:06, 2.13s/it, lr=0.0001, step_loss=0.156]Steps: 50%|█████ | 1009/2000 [21:48<35:06, 2.13s/it, lr=0.0001, step_loss=0.000491]Steps: 50%|█████ | 1010/2000 [21:48<28:18, 1.72s/it, lr=0.0001, step_loss=0.000491]Steps: 50%|█████ | 1010/2000 [21:48<28:18, 1.72s/it, lr=0.0001, step_loss=0.161] Steps: 51%|█████ | 1011/2000 [21:49<23:34, 1.43s/it, lr=0.0001, step_loss=0.161]Steps: 51%|█████ | 1011/2000 [21:49<23:34, 1.43s/it, lr=0.0001, step_loss=0.00101]Steps: 51%|█████ | 1012/2000 [21:50<20:14, 1.23s/it, lr=0.0001, step_loss=0.00101]Steps: 
51%|█████ | 1012/2000 [21:50<20:14, 1.23s/it, lr=0.0001, step_loss=0.0145] Steps: 51%|█████ | 1013/2000 [21:51<17:55, 1.09s/it, lr=0.0001, step_loss=0.0145]Steps: 51%|█████ | 1013/2000 [21:51<17:55, 1.09s/it, lr=0.0001, step_loss=0.000866]Steps: 51%|█████ | 1014/2000 [21:51<16:17, 1.01it/s, lr=0.0001, step_loss=0.000866]Steps: 51%|█████ | 1014/2000 [21:51<16:17, 1.01it/s, lr=0.0001, step_loss=0.00952] Steps: 51%|█████ | 1015/2000 [21:52<15:09, 1.08it/s, lr=0.0001, step_loss=0.00952]Steps: 51%|█████ | 1015/2000 [21:52<15:09, 1.08it/s, lr=0.0001, step_loss=0.0416] Steps: 51%|█████ | 1016/2000 [21:53<14:20, 1.14it/s, lr=0.0001, step_loss=0.0416]Steps: 51%|█████ | 1016/2000 [21:53<14:20, 1.14it/s, lr=0.0001, step_loss=0.00421]Steps: 51%|█████ | 1017/2000 [21:54<13:45, 1.19it/s, lr=0.0001, step_loss=0.00421]Steps: 51%|█████ | 1017/2000 [21:54<13:45, 1.19it/s, lr=0.0001, step_loss=0.00332]Steps: 51%|█████ | 1018/2000 [21:54<13:21, 1.22it/s, lr=0.0001, step_loss=0.00332]Steps: 51%|█████ | 1018/2000 [21:54<13:21, 1.22it/s, lr=0.0001, step_loss=0.298] Steps: 51%|█████ | 1019/2000 [21:55<13:04, 1.25it/s, lr=0.0001, step_loss=0.298]Steps: 51%|█████ | 1019/2000 [21:55<13:04, 1.25it/s, lr=0.0001, step_loss=0.0195]Steps: 51%|█████ | 1020/2000 [21:56<12:53, 1.27it/s, lr=0.0001, step_loss=0.0195]Steps: 51%|█████ | 1020/2000 [21:56<12:53, 1.27it/s, lr=0.0001, step_loss=0.0865]Steps: 51%|█████ | 1021/2000 [21:57<12:44, 1.28it/s, lr=0.0001, step_loss=0.0865]Steps: 51%|█████ | 1021/2000 [21:57<12:44, 1.28it/s, lr=0.0001, step_loss=0.000677]Steps: 51%|█████ | 1022/2000 [21:57<12:38, 1.29it/s, lr=0.0001, step_loss=0.000677]Steps: 51%|█████ | 1022/2000 [21:57<12:38, 1.29it/s, lr=0.0001, step_loss=0.00326] Steps: 51%|█████ | 1023/2000 [21:58<12:34, 1.29it/s, lr=0.0001, step_loss=0.00326]Steps: 51%|█████ | 1023/2000 [21:58<12:34, 1.29it/s, lr=0.0001, step_loss=0.065] Steps: 51%|█████ | 1024/2000 [21:59<12:31, 1.30it/s, lr=0.0001, step_loss=0.065]11/14/2025 06:30:45 - INFO - root - Saved 
state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1024)
Steps: 51%|█████ | 1024/2000 [22:06<12:31, 1.30it/s, lr=0.0001, step_loss=0.0527]11/14/2025 06:30:45 - INFO - root - ### DEBUG: Finished epoch 31, epoch_steps=32, global_step=1024
11/14/2025 06:30:45 - INFO - root - ### DEBUG: Starting epoch 32/63, global_step=1024, max_train_steps=2000
Steps: 51%|█████▏ | 1025/2000 [22:07<45:57, 2.83s/it, lr=0.0001, step_loss=0.0527]Steps: 51%|█████▏ | 1025/2000 [22:07<45:57, 2.83s/it, lr=0.0001, step_loss=0.000709]Steps: 51%|█████▏ | 1026/2000 [22:07<35:51, 2.21s/it, lr=0.0001, step_loss=0.000709]Steps: 51%|█████▏ | 1026/2000 [22:07<35:51, 2.21s/it, lr=0.0001, step_loss=0.0058] Steps: 51%|█████▏ | 1027/2000 [22:08<28:46, 1.77s/it, lr=0.0001, step_loss=0.0058]Steps: 51%|█████▏ | 1027/2000 [22:08<28:46, 1.77s/it, lr=0.0001, step_loss=0.00077]Steps: 51%|█████▏ | 1028/2000 [22:09<23:50, 1.47s/it, lr=0.0001, step_loss=0.00077]Steps: 51%|█████▏ | 1028/2000 [22:09<23:50, 1.47s/it, lr=0.0001, step_loss=0.0315] Steps: 51%|█████▏ | 1029/2000 [22:10<20:21, 1.26s/it, lr=0.0001, step_loss=0.0315]Steps: 51%|█████▏ | 1029/2000 [22:10<20:21, 1.26s/it, lr=0.0001, step_loss=0.189] Steps: 52%|█████▏ | 1030/2000 [22:10<17:55, 1.11s/it, lr=0.0001, step_loss=0.189]Steps: 52%|█████▏ | 1030/2000 [22:10<17:55, 1.11s/it, lr=0.0001, step_loss=0.000406]Steps: 52%|█████▏ | 1031/2000 [22:11<16:13, 1.00s/it, lr=0.0001, step_loss=0.000406]Steps: 52%|█████▏ | 1031/2000 [22:11<16:13, 1.00s/it, lr=0.0001, step_loss=0.087] Steps: 52%|█████▏ | 1032/2000 [22:12<15:02, 1.07it/s, lr=0.0001, step_loss=0.087]Steps: 52%|█████▏ | 1032/2000 [22:12<15:02, 1.07it/s, lr=0.0001, step_loss=0.000913]Steps: 52%|█████▏ | 1033/2000 [22:13<14:11, 1.14it/s, lr=0.0001, step_loss=0.000913]Steps: 52%|█████▏ | 1033/2000 [22:13<14:11, 1.14it/s, lr=0.0001, step_loss=0.0013] Steps: 52%|█████▏ | 1034/2000 [22:13<13:36, 1.18it/s, lr=0.0001, step_loss=0.0013]Steps: 52%|█████▏ | 1034/2000 [22:14<13:36, 1.18it/s, lr=0.0001, step_loss=0.000448]Steps: 52%|█████▏ | 1035/2000 [22:14<13:11, 1.22it/s, lr=0.0001, step_loss=0.000448]Steps: 52%|█████▏ | 1035/2000 [22:14<13:11, 1.22it/s, lr=0.0001, step_loss=0.0437] Steps: 52%|█████▏ | 1036/2000 [22:15<12:53, 1.25it/s, lr=0.0001, step_loss=0.0437]Steps: 52%|█████▏ | 1036/2000 [22:15<12:53, 1.25it/s, lr=0.0001, step_loss=0.0609]Steps: 
52%|█████▏ | 1037/2000 [22:16<12:40, 1.27it/s, lr=0.0001, step_loss=0.0609]Steps: 52%|█████▏ | 1037/2000 [22:16<12:40, 1.27it/s, lr=0.0001, step_loss=0.17] Steps: 52%|█████▏ | 1038/2000 [22:17<12:31, 1.28it/s, lr=0.0001, step_loss=0.17]Steps: 52%|█████▏ | 1038/2000 [22:17<12:31, 1.28it/s, lr=0.0001, step_loss=0.0273]Steps: 52%|█████▏ | 1039/2000 [22:17<12:24, 1.29it/s, lr=0.0001, step_loss=0.0273]Steps: 52%|█████▏ | 1039/2000 [22:17<12:24, 1.29it/s, lr=0.0001, step_loss=0.0945]Steps: 52%|█████▏ | 1040/2000 [22:18<12:20, 1.30it/s, lr=0.0001, step_loss=0.0945]Steps: 52%|█████▏ | 1040/2000 [22:18<12:20, 1.30it/s, lr=0.0001, step_loss=0.00132]Steps: 52%|█████▏ | 1041/2000 [22:19<12:16, 1.30it/s, lr=0.0001, step_loss=0.00132]Steps: 52%|█████▏ | 1041/2000 [22:19<12:16, 1.30it/s, lr=0.0001, step_loss=0.00235]Steps: 52%|█████▏ | 1042/2000 [22:20<12:13, 1.31it/s, lr=0.0001, step_loss=0.00235]Steps: 52%|█████▏ | 1042/2000 [22:20<12:13, 1.31it/s, lr=0.0001, step_loss=0.00173]Steps: 52%|█████▏ | 1043/2000 [22:20<12:11, 1.31it/s, lr=0.0001, step_loss=0.00173]Steps: 52%|█████▏ | 1043/2000 [22:20<12:11, 1.31it/s, lr=0.0001, step_loss=0.0412] Steps: 52%|█████▏ | 1044/2000 [22:21<12:10, 1.31it/s, lr=0.0001, step_loss=0.0412]Steps: 52%|█████▏ | 1044/2000 [22:21<12:10, 1.31it/s, lr=0.0001, step_loss=0.0473]Steps: 52%|█████▏ | 1045/2000 [22:22<12:08, 1.31it/s, lr=0.0001, step_loss=0.0473]Steps: 52%|█████▏ | 1045/2000 [22:22<12:08, 1.31it/s, lr=0.0001, step_loss=0.0327]Steps: 52%|█████▏ | 1046/2000 [22:23<12:09, 1.31it/s, lr=0.0001, step_loss=0.0327]Steps: 52%|█████▏ | 1046/2000 [22:23<12:09, 1.31it/s, lr=0.0001, step_loss=0.000467]Steps: 52%|█████▏ | 1047/2000 [22:23<12:08, 1.31it/s, lr=0.0001, step_loss=0.000467]Steps: 52%|█████▏ | 1047/2000 [22:23<12:08, 1.31it/s, lr=0.0001, step_loss=0.0252] Steps: 52%|█████▏ | 1048/2000 [22:24<12:07, 1.31it/s, lr=0.0001, step_loss=0.0252]Steps: 52%|█████▏ | 1048/2000 [22:24<12:07, 1.31it/s, lr=0.0001, step_loss=0.00389]Steps: 52%|█████▏ | 
1049/2000 [22:25<12:06, 1.31it/s, lr=0.0001, step_loss=0.00389]Steps: 52%|█████▏ | 1049/2000 [22:25<12:06, 1.31it/s, lr=0.0001, step_loss=0.00203]Steps: 52%|█████▎ | 1050/2000 [22:26<12:04, 1.31it/s, lr=0.0001, step_loss=0.00203]Steps: 52%|█████▎ | 1050/2000 [22:26<12:04, 1.31it/s, lr=0.0001, step_loss=0.00184]Steps: 53%|█████▎ | 1051/2000 [22:26<12:04, 1.31it/s, lr=0.0001, step_loss=0.00184]Steps: 53%|█████▎ | 1051/2000 [22:26<12:04, 1.31it/s, lr=0.0001, step_loss=0.00134]Steps: 53%|█████▎ | 1052/2000 [22:27<12:03, 1.31it/s, lr=0.0001, step_loss=0.00134]Steps: 53%|█████▎ | 1052/2000 [22:27<12:03, 1.31it/s, lr=0.0001, step_loss=0.0019] Steps: 53%|█████▎ | 1053/2000 [22:28<12:02, 1.31it/s, lr=0.0001, step_loss=0.0019]Steps: 53%|█████▎ | 1053/2000 [22:28<12:02, 1.31it/s, lr=0.0001, step_loss=0.0142]Steps: 53%|█████▎ | 1054/2000 [22:29<12:02, 1.31it/s, lr=0.0001, step_loss=0.0142]Steps: 53%|█████▎ | 1054/2000 [22:29<12:02, 1.31it/s, lr=0.0001, step_loss=0.0072]Steps: 53%|█████▎ | 1055/2000 [22:29<12:01, 1.31it/s, lr=0.0001, step_loss=0.0072]Steps: 53%|█████▎ | 1055/2000 [22:30<12:01, 1.31it/s, lr=0.0001, step_loss=0.0123]Steps: 53%|█████▎ | 1056/2000 [22:30<12:00, 1.31it/s, lr=0.0001, step_loss=0.0123]11/14/2025 06:31:17 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1056)
Steps: 53%|█████▎ | 1056/2000 [22:37<12:00, 1.31it/s, lr=0.0001, step_loss=0.00628]11/14/2025 06:31:17 - INFO - root - ### DEBUG: Finished epoch 32, epoch_steps=32, global_step=1056
11/14/2025 06:31:17 - INFO - root - ### DEBUG: Starting epoch 33/63, global_step=1056, max_train_steps=2000
Steps: 53%|█████▎ | 1057/2000 [22:38<43:22, 2.76s/it, lr=0.0001, step_loss=0.00628]Steps: 53%|█████▎ | 1057/2000 [22:38<43:22, 2.76s/it, lr=0.0001, step_loss=0.464] Steps: 53%|█████▎ | 1058/2000 [22:38<33:54, 2.16s/it, lr=0.0001, step_loss=0.464]Steps: 53%|█████▎ | 1058/2000 [22:38<33:54, 2.16s/it, lr=0.0001, step_loss=0.0293]Steps: 53%|█████▎ | 1059/2000 [22:39<27:17, 1.74s/it, lr=0.0001, step_loss=0.0293]Steps: 53%|█████▎ | 1059/2000 [22:39<27:17, 1.74s/it, lr=0.0001, step_loss=0.0128]Steps: 53%|█████▎ | 1060/2000 [22:40<22:40, 1.45s/it, lr=0.0001, step_loss=0.0128]Steps: 53%|█████▎ | 1060/2000 [22:40<22:40, 1.45s/it, lr=0.0001, step_loss=0.0479]Steps: 53%|█████▎ | 1061/2000 [22:41<19:25, 1.24s/it, lr=0.0001, step_loss=0.0479]Steps: 53%|█████▎ | 1061/2000 [22:41<19:25, 1.24s/it, lr=0.0001, step_loss=0.204] Steps: 53%|█████▎ | 1062/2000 [22:41<17:09, 1.10s/it, lr=0.0001, step_loss=0.204]Steps: 53%|█████▎ | 1062/2000 [22:42<17:09, 1.10s/it, lr=0.0001, step_loss=0.00246]Steps: 53%|█████▎ | 1063/2000 [22:42<15:33, 1.00it/s, lr=0.0001, step_loss=0.00246]Steps: 53%|█████▎ | 1063/2000 [22:42<15:33, 1.00it/s, lr=0.0001, step_loss=0.026] Steps: 53%|█████▎ | 1064/2000 [22:43<14:26, 1.08it/s, lr=0.0001, step_loss=0.026]Steps: 53%|█████▎ | 1064/2000 [22:43<14:26, 1.08it/s, lr=0.0001, step_loss=0.000599]Steps: 53%|█████▎ | 1065/2000 [22:44<13:39, 1.14it/s, lr=0.0001, step_loss=0.000599]Steps: 53%|█████▎ | 1065/2000 [22:44<13:39, 1.14it/s, lr=0.0001, step_loss=0.0243] Steps: 53%|█████▎ | 1066/2000 [22:45<13:06, 1.19it/s, lr=0.0001, step_loss=0.0243]Steps: 53%|█████▎ | 1066/2000 [22:45<13:06, 1.19it/s, lr=0.0001, step_loss=0.0678]Steps: 53%|█████▎ | 1067/2000 [22:45<12:42, 1.22it/s, lr=0.0001, step_loss=0.0678]Steps: 53%|█████▎ | 1067/2000 [22:45<12:42, 1.22it/s, lr=0.0001, step_loss=0.00313]Steps: 53%|█████▎ | 1068/2000 [22:46<12:26, 1.25it/s, lr=0.0001, step_loss=0.00313]Steps: 53%|█████▎ | 1068/2000 [22:46<12:26, 1.25it/s, lr=0.0001, step_loss=0.0409] Steps: 53%|█████▎ | 
1069/2000 [22:47<12:14, 1.27it/s, lr=0.0001, step_loss=0.0409]Steps: 53%|█████▎ | 1069/2000 [22:47<12:14, 1.27it/s, lr=0.0001, step_loss=0.233] Steps: 54%|█████▎ | 1070/2000 [22:48<12:06, 1.28it/s, lr=0.0001, step_loss=0.233]Steps: 54%|█████▎ | 1070/2000 [22:48<12:06, 1.28it/s, lr=0.0001, step_loss=0.00119]Steps: 54%|█████▎ | 1071/2000 [22:48<11:59, 1.29it/s, lr=0.0001, step_loss=0.00119]Steps: 54%|█████▎ | 1071/2000 [22:48<11:59, 1.29it/s, lr=0.0001, step_loss=0.000749]Steps: 54%|█████▎ | 1072/2000 [22:49<11:55, 1.30it/s, lr=0.0001, step_loss=0.000749]Steps: 54%|█████▎ | 1072/2000 [22:49<11:55, 1.30it/s, lr=0.0001, step_loss=0.0235] Steps: 54%|█████▎ | 1073/2000 [22:50<11:52, 1.30it/s, lr=0.0001, step_loss=0.0235]Steps: 54%|█████▎ | 1073/2000 [22:50<11:52, 1.30it/s, lr=0.0001, step_loss=0.0011]Steps: 54%|█████▎ | 1074/2000 [22:51<11:49, 1.30it/s, lr=0.0001, step_loss=0.0011]Steps: 54%|█████▎ | 1074/2000 [22:51<11:49, 1.30it/s, lr=0.0001, step_loss=0.245] Steps: 54%|█████▍ | 1075/2000 [22:51<11:48, 1.31it/s, lr=0.0001, step_loss=0.245]Steps: 54%|█████▍ | 1075/2000 [22:51<11:48, 1.31it/s, lr=0.0001, step_loss=0.0197]Steps: 54%|█████▍ | 1076/2000 [22:52<11:46, 1.31it/s, lr=0.0001, step_loss=0.0197]Steps: 54%|█████▍ | 1076/2000 [22:52<11:46, 1.31it/s, lr=0.0001, step_loss=0.000789]Steps: 54%|█████▍ | 1077/2000 [22:53<11:45, 1.31it/s, lr=0.0001, step_loss=0.000789]Steps: 54%|█████▍ | 1077/2000 [22:53<11:45, 1.31it/s, lr=0.0001, step_loss=0.187] Steps: 54%|█████▍ | 1078/2000 [22:54<11:43, 1.31it/s, lr=0.0001, step_loss=0.187]Steps: 54%|█████▍ | 1078/2000 [22:54<11:43, 1.31it/s, lr=0.0001, step_loss=0.194]Steps: 54%|█████▍ | 1079/2000 [22:54<11:42, 1.31it/s, lr=0.0001, step_loss=0.194]Steps: 54%|█████▍ | 1079/2000 [22:54<11:42, 1.31it/s, lr=0.0001, step_loss=0.00745]Steps: 54%|█████▍ | 1080/2000 [22:55<11:41, 1.31it/s, lr=0.0001, step_loss=0.00745]Steps: 54%|█████▍ | 1080/2000 [22:55<11:41, 1.31it/s, lr=0.0001, step_loss=0.000451]Steps: 54%|█████▍ | 1081/2000 
[22:56<11:40, 1.31it/s, lr=0.0001, step_loss=0.000451]Steps: 54%|█████▍ | 1081/2000 [22:56<11:40, 1.31it/s, lr=0.0001, step_loss=0.00122] Steps: 54%|█████▍ | 1082/2000 [22:57<11:39, 1.31it/s, lr=0.0001, step_loss=0.00122]Steps: 54%|█████▍ | 1082/2000 [22:57<11:39, 1.31it/s, lr=0.0001, step_loss=0.00915]Steps: 54%|█████▍ | 1083/2000 [22:57<11:38, 1.31it/s, lr=0.0001, step_loss=0.00915]Steps: 54%|█████▍ | 1083/2000 [22:58<11:38, 1.31it/s, lr=0.0001, step_loss=0.0822] Steps: 54%|█████▍ | 1084/2000 [22:58<11:38, 1.31it/s, lr=0.0001, step_loss=0.0822]Steps: 54%|█████▍ | 1084/2000 [22:58<11:38, 1.31it/s, lr=0.0001, step_loss=0.0134]Steps: 54%|█████▍ | 1085/2000 [22:59<11:37, 1.31it/s, lr=0.0001, step_loss=0.0134]Steps: 54%|█████▍ | 1085/2000 [22:59<11:37, 1.31it/s, lr=0.0001, step_loss=0.0022]Steps: 54%|█████▍ | 1086/2000 [23:00<11:36, 1.31it/s, lr=0.0001, step_loss=0.0022]Steps: 54%|█████▍ | 1086/2000 [23:00<11:36, 1.31it/s, lr=0.0001, step_loss=0.0288]Steps: 54%|█████▍ | 1087/2000 [23:01<11:36, 1.31it/s, lr=0.0001, step_loss=0.0288]Steps: 54%|█████▍ | 1087/2000 [23:01<11:36, 1.31it/s, lr=0.0001, step_loss=0.000559]Steps: 54%|█████▍ | 1088/2000 [23:01<11:34, 1.31it/s, lr=0.0001, step_loss=0.000559]11/14/2025 06:31:47 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1088)
Steps: 54%|█████▍ | 1088/2000 [23:08<11:34, 1.31it/s, lr=0.0001, step_loss=0.00276] 11/14/2025 06:31:47 - INFO - root - ### DEBUG: Finished epoch 33, epoch_steps=32, global_step=1088
11/14/2025 06:31:47 - INFO - root - ### DEBUG: Starting epoch 34/63, global_step=1088, max_train_steps=2000
Steps: 54%|█████▍ | 1089/2000 [23:08<40:25, 2.66s/it, lr=0.0001, step_loss=0.00276]Steps: 54%|█████▍ | 1089/2000 [23:08<40:25, 2.66s/it, lr=0.0001, step_loss=0.000845]Steps: 55%|█████▍ | 1090/2000 [23:09<31:43, 2.09s/it, lr=0.0001, step_loss=0.000845]Steps: 55%|█████▍ | 1090/2000 [23:09<31:43, 2.09s/it, lr=0.0001, step_loss=0.00385] Steps: 55%|█████▍ | 1091/2000 [23:10<25:38, 1.69s/it, lr=0.0001, step_loss=0.00385]Steps: 55%|█████▍ | 1091/2000 [23:10<25:38, 1.69s/it, lr=0.0001, step_loss=0.181] Steps: 55%|█████▍ | 1092/2000 [23:11<21:23, 1.41s/it, lr=0.0001, step_loss=0.181]Steps: 55%|█████▍ | 1092/2000 [23:11<21:23, 1.41s/it, lr=0.0001, step_loss=0.0649]Steps: 55%|█████▍ | 1093/2000 [23:11<18:24, 1.22s/it, lr=0.0001, step_loss=0.0649]Steps: 55%|█████▍ | 1093/2000 [23:11<18:24, 1.22s/it, lr=0.0001, step_loss=0.0223]Steps: 55%|█████▍ | 1094/2000 [23:12<16:19, 1.08s/it, lr=0.0001, step_loss=0.0223]Steps: 55%|█████▍ | 1094/2000 [23:12<16:19, 1.08s/it, lr=0.0001, step_loss=0.000877]Steps: 55%|█████▍ | 1095/2000 [23:13<14:51, 1.01it/s, lr=0.0001, step_loss=0.000877]Steps: 55%|█████▍ | 1095/2000 [23:13<14:51, 1.01it/s, lr=0.0001, step_loss=0.0741] Steps: 55%|█████▍ | 1096/2000 [23:14<13:50, 1.09it/s, lr=0.0001, step_loss=0.0741]Steps: 55%|█████▍ | 1096/2000 [23:14<13:50, 1.09it/s, lr=0.0001, step_loss=0.0159]Steps: 55%|█████▍ | 1097/2000 [23:14<13:06, 1.15it/s, lr=0.0001, step_loss=0.0159]Steps: 55%|█████▍ | 1097/2000 [23:15<13:06, 1.15it/s, lr=0.0001, step_loss=0.0655]Steps: 55%|█████▍ | 1098/2000 [23:15<12:36, 1.19it/s, lr=0.0001, step_loss=0.0655]Steps: 55%|█████▍ | 1098/2000 [23:15<12:36, 1.19it/s, lr=0.0001, step_loss=0.153] Steps: 55%|█████▍ | 1099/2000 [23:16<12:15, 1.23it/s, lr=0.0001, step_loss=0.153]Steps: 55%|█████▍ | 1099/2000 [23:16<12:15, 1.23it/s, lr=0.0001, step_loss=0.00223]Steps: 55%|█████▌ | 1100/2000 [23:17<11:59, 1.25it/s, lr=0.0001, step_loss=0.00223]Steps: 55%|█████▌ | 1100/2000 [23:17<11:59, 1.25it/s, lr=0.0001, step_loss=0.0473] Steps: 55%|█████▌ 
| 1101/2000 [23:18<11:48, 1.27it/s, lr=0.0001, step_loss=0.0473]Steps: 55%|█████▌ | 1101/2000 [23:18<11:48, 1.27it/s, lr=0.0001, step_loss=0.0911]Steps: 55%|█████▌ | 1102/2000 [23:18<11:40, 1.28it/s, lr=0.0001, step_loss=0.0911]Steps: 55%|█████▌ | 1102/2000 [23:18<11:40, 1.28it/s, lr=0.0001, step_loss=0.00954]Steps: 55%|█████▌ | 1103/2000 [23:19<11:34, 1.29it/s, lr=0.0001, step_loss=0.00954]Steps: 55%|█████▌ | 1103/2000 [23:19<11:34, 1.29it/s, lr=0.0001, step_loss=0.0777] Steps: 55%|█████▌ | 1104/2000 [23:20<11:30, 1.30it/s, lr=0.0001, step_loss=0.0777]Steps: 55%|█████▌ | 1104/2000 [23:20<11:30, 1.30it/s, lr=0.0001, step_loss=0.226] Steps: 55%|█████▌ | 1105/2000 [23:21<11:26, 1.30it/s, lr=0.0001, step_loss=0.226]Steps: 55%|█████▌ | 1105/2000 [23:21<11:26, 1.30it/s, lr=0.0001, step_loss=0.000418]Steps: 55%|█████▌ | 1106/2000 [23:21<11:24, 1.31it/s, lr=0.0001, step_loss=0.000418]Steps: 55%|█████▌ | 1106/2000 [23:21<11:24, 1.31it/s, lr=0.0001, step_loss=0.0145] Steps: 55%|█████▌ | 1107/2000 [23:22<11:22, 1.31it/s, lr=0.0001, step_loss=0.0145]Steps: 55%|█████▌ | 1107/2000 [23:22<11:22, 1.31it/s, lr=0.0001, step_loss=0.00489]Steps: 55%|█████▌ | 1108/2000 [23:23<11:21, 1.31it/s, lr=0.0001, step_loss=0.00489]Steps: 55%|█████▌ | 1108/2000 [23:23<11:21, 1.31it/s, lr=0.0001, step_loss=0.167] Steps: 55%|█████▌ | 1109/2000 [23:24<11:19, 1.31it/s, lr=0.0001, step_loss=0.167]Steps: 55%|█████▌ | 1109/2000 [23:24<11:19, 1.31it/s, lr=0.0001, step_loss=0.0031]Steps: 56%|█████▌ | 1110/2000 [23:24<11:18, 1.31it/s, lr=0.0001, step_loss=0.0031]Steps: 56%|█████▌ | 1110/2000 [23:24<11:18, 1.31it/s, lr=0.0001, step_loss=0.000338]Steps: 56%|█████▌ | 1111/2000 [23:25<11:17, 1.31it/s, lr=0.0001, step_loss=0.000338]Steps: 56%|█████▌ | 1111/2000 [23:25<11:17, 1.31it/s, lr=0.0001, step_loss=0.000985]Steps: 56%|█████▌ | 1112/2000 [23:26<11:16, 1.31it/s, lr=0.0001, step_loss=0.000985]Steps: 56%|█████▌ | 1112/2000 [23:26<11:16, 1.31it/s, lr=0.0001, step_loss=0.0982] Steps: 56%|█████▌ | 1113/2000 
[23:27<11:15, 1.31it/s, lr=0.0001, step_loss=0.0982]Steps: 56%|█████▌ | 1113/2000 [23:27<11:15, 1.31it/s, lr=0.0001, step_loss=0.0177]Steps: 56%|█████▌ | 1114/2000 [23:27<11:16, 1.31it/s, lr=0.0001, step_loss=0.0177]Steps: 56%|█████▌ | 1114/2000 [23:27<11:16, 1.31it/s, lr=0.0001, step_loss=0.0103]Steps: 56%|█████▌ | 1115/2000 [23:28<11:15, 1.31it/s, lr=0.0001, step_loss=0.0103]Steps: 56%|█████▌ | 1115/2000 [23:28<11:15, 1.31it/s, lr=0.0001, step_loss=0.00126]Steps: 56%|█████▌ | 1116/2000 [23:29<11:14, 1.31it/s, lr=0.0001, step_loss=0.00126]Steps: 56%|█████▌ | 1116/2000 [23:29<11:14, 1.31it/s, lr=0.0001, step_loss=0.0105] Steps: 56%|█████▌ | 1117/2000 [23:30<11:12, 1.31it/s, lr=0.0001, step_loss=0.0105]Steps: 56%|█████▌ | 1117/2000 [23:30<11:12, 1.31it/s, lr=0.0001, step_loss=0.000791]Steps: 56%|█████▌ | 1118/2000 [23:30<11:12, 1.31it/s, lr=0.0001, step_loss=0.000791]Steps: 56%|█████▌ | 1118/2000 [23:31<11:12, 1.31it/s, lr=0.0001, step_loss=0.0665] Steps: 56%|█████▌ | 1119/2000 [23:31<11:11, 1.31it/s, lr=0.0001, step_loss=0.0665]Steps: 56%|█████▌ | 1119/2000 [23:31<11:11, 1.31it/s, lr=0.0001, step_loss=0.000667]Steps: 56%|█████▌ | 1120/2000 [23:32<11:11, 1.31it/s, lr=0.0001, step_loss=0.000667]11/14/2025 06:32:18 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1120)
Steps: 56%|█████▌ | 1120/2000 [23:39<11:11, 1.31it/s, lr=0.0001, step_loss=0.169] 11/14/2025 06:32:18 - INFO - root - ### DEBUG: Finished epoch 34, epoch_steps=32, global_step=1120
11/14/2025 06:32:18 - INFO - root - ### DEBUG: Starting epoch 35/63, global_step=1120, max_train_steps=2000
Steps: 56%|█████▌ | 1121/2000 [23:39<40:46, 2.78s/it, lr=0.0001, step_loss=0.169]Steps: 56%|█████▌ | 1121/2000 [23:40<40:46, 2.78s/it, lr=0.0001, step_loss=0.0148]Steps: 56%|█████▌ | 1122/2000 [23:40<31:51, 2.18s/it, lr=0.0001, step_loss=0.0148]Steps: 56%|█████▌ | 1122/2000 [23:40<31:51, 2.18s/it, lr=0.0001, step_loss=0.00357]Steps: 56%|█████▌ | 1123/2000 [23:41<25:37, 1.75s/it, lr=0.0001, step_loss=0.00357]Steps: 56%|█████▌ | 1123/2000 [23:41<25:37, 1.75s/it, lr=0.0001, step_loss=0.161] Steps: 56%|█████▌ | 1124/2000 [23:42<21:15, 1.46s/it, lr=0.0001, step_loss=0.161]Steps: 56%|█████▌ | 1124/2000 [23:42<21:15, 1.46s/it, lr=0.0001, step_loss=0.0146]Steps: 56%|█████▋ | 1125/2000 [23:43<18:11, 1.25s/it, lr=0.0001, step_loss=0.0146]Steps: 56%|█████▋ | 1125/2000 [23:43<18:11, 1.25s/it, lr=0.0001, step_loss=0.0519]Steps: 56%|█████▋ | 1126/2000 [23:43<16:03, 1.10s/it, lr=0.0001, step_loss=0.0519]Steps: 56%|█████▋ | 1126/2000 [23:43<16:03, 1.10s/it, lr=0.0001, step_loss=0.00528]Steps: 56%|█████▋ | 1127/2000 [23:44<14:32, 1.00it/s, lr=0.0001, step_loss=0.00528]Steps: 56%|█████▋ | 1127/2000 [23:44<14:32, 1.00it/s, lr=0.0001, step_loss=0.0157] Steps: 56%|█████▋ | 1128/2000 [23:45<13:29, 1.08it/s, lr=0.0001, step_loss=0.0157]Steps: 56%|█████▋ | 1128/2000 [23:45<13:29, 1.08it/s, lr=0.0001, step_loss=0.00528]Steps: 56%|█████▋ | 1129/2000 [23:46<12:44, 1.14it/s, lr=0.0001, step_loss=0.00528]Steps: 56%|█████▋ | 1129/2000 [23:46<12:44, 1.14it/s, lr=0.0001, step_loss=0.0257] Steps: 56%|█████▋ | 1130/2000 [23:46<12:13, 1.19it/s, lr=0.0001, step_loss=0.0257]Steps: 56%|█████▋ | 1130/2000 [23:46<12:13, 1.19it/s, lr=0.0001, step_loss=0.00359]Steps: 57%|█████▋ | 1131/2000 [23:47<11:51, 1.22it/s, lr=0.0001, step_loss=0.00359]Steps: 57%|█████▋ | 1131/2000 [23:47<11:51, 1.22it/s, lr=0.0001, step_loss=0.00135]Steps: 57%|█████▋ | 1132/2000 [23:48<11:35, 1.25it/s, lr=0.0001, step_loss=0.00135]Steps: 57%|█████▋ | 1132/2000 [23:48<11:35, 1.25it/s, lr=0.0001, step_loss=0.136] Steps: 57%|█████▋ | 
1133/2000 [23:49<11:24, 1.27it/s, lr=0.0001, step_loss=0.136]Steps: 57%|█████▋ | 1133/2000 [23:49<11:24, 1.27it/s, lr=0.0001, step_loss=0.00522]Steps: 57%|█████▋ | 1134/2000 [23:49<11:16, 1.28it/s, lr=0.0001, step_loss=0.00522]Steps: 57%|█████▋ | 1134/2000 [23:49<11:16, 1.28it/s, lr=0.0001, step_loss=0.0413] Steps: 57%|█████▋ | 1135/2000 [23:50<11:10, 1.29it/s, lr=0.0001, step_loss=0.0413]Steps: 57%|█████▋ | 1135/2000 [23:50<11:10, 1.29it/s, lr=0.0001, step_loss=0.0204]Steps: 57%|█████▋ | 1136/2000 [23:51<11:06, 1.30it/s, lr=0.0001, step_loss=0.0204]Steps: 57%|█████▋ | 1136/2000 [23:51<11:06, 1.30it/s, lr=0.0001, step_loss=0.0108]Steps: 57%|█████▋ | 1137/2000 [23:52<11:02, 1.30it/s, lr=0.0001, step_loss=0.0108]Steps: 57%|█████▋ | 1137/2000 [23:52<11:02, 1.30it/s, lr=0.0001, step_loss=0.0213]Steps: 57%|█████▋ | 1138/2000 [23:52<11:00, 1.30it/s, lr=0.0001, step_loss=0.0213]Steps: 57%|█████▋ | 1138/2000 [23:52<11:00, 1.30it/s, lr=0.0001, step_loss=0.000612]Steps: 57%|█████▋ | 1139/2000 [23:53<10:58, 1.31it/s, lr=0.0001, step_loss=0.000612]Steps: 57%|█████▋ | 1139/2000 [23:53<10:58, 1.31it/s, lr=0.0001, step_loss=0.0134] Steps: 57%|█████▋ | 1140/2000 [23:54<10:56, 1.31it/s, lr=0.0001, step_loss=0.0134]Steps: 57%|█████▋ | 1140/2000 [23:54<10:56, 1.31it/s, lr=0.0001, step_loss=0.000446]Steps: 57%|█████▋ | 1141/2000 [23:55<10:55, 1.31it/s, lr=0.0001, step_loss=0.000446]Steps: 57%|█████▋ | 1141/2000 [23:55<10:55, 1.31it/s, lr=0.0001, step_loss=0.568] Steps: 57%|█████▋ | 1142/2000 [23:55<10:54, 1.31it/s, lr=0.0001, step_loss=0.568]Steps: 57%|█████▋ | 1142/2000 [23:56<10:54, 1.31it/s, lr=0.0001, step_loss=0.082]Steps: 57%|█████▋ | 1143/2000 [23:56<10:53, 1.31it/s, lr=0.0001, step_loss=0.082]Steps: 57%|█████▋ | 1143/2000 [23:56<10:53, 1.31it/s, lr=0.0001, step_loss=0.0305]Steps: 57%|█████▋ | 1144/2000 [23:57<10:52, 1.31it/s, lr=0.0001, step_loss=0.0305]Steps: 57%|█████▋ | 1144/2000 [23:57<10:52, 1.31it/s, lr=0.0001, step_loss=0.0208]Steps: 57%|█████▋ | 1145/2000 [23:58<10:50, 
1.31it/s, lr=0.0001, step_loss=0.0208]Steps: 57%|█████▋ | 1145/2000 [23:58<10:50, 1.31it/s, lr=0.0001, step_loss=0.0667]Steps: 57%|█████▋ | 1146/2000 [23:59<10:50, 1.31it/s, lr=0.0001, step_loss=0.0667]Steps: 57%|█████▋ | 1146/2000 [23:59<10:50, 1.31it/s, lr=0.0001, step_loss=0.0244]Steps: 57%|█████▋ | 1147/2000 [23:59<10:49, 1.31it/s, lr=0.0001, step_loss=0.0244]Steps: 57%|█████▋ | 1147/2000 [23:59<10:49, 1.31it/s, lr=0.0001, step_loss=0.000774]Steps: 57%|█████▋ | 1148/2000 [24:00<10:48, 1.31it/s, lr=0.0001, step_loss=0.000774]Steps: 57%|█████▋ | 1148/2000 [24:00<10:48, 1.31it/s, lr=0.0001, step_loss=0.0476] Steps: 57%|█████▋ | 1149/2000 [24:01<10:48, 1.31it/s, lr=0.0001, step_loss=0.0476]Steps: 57%|█████▋ | 1149/2000 [24:01<10:48, 1.31it/s, lr=0.0001, step_loss=0.0599]Steps: 57%|█████▊ | 1150/2000 [24:02<10:47, 1.31it/s, lr=0.0001, step_loss=0.0599]Steps: 57%|█████▊ | 1150/2000 [24:02<10:47, 1.31it/s, lr=0.0001, step_loss=0.00314]Steps: 58%|█████▊ | 1151/2000 [24:02<10:47, 1.31it/s, lr=0.0001, step_loss=0.00314]Steps: 58%|█████▊ | 1151/2000 [24:02<10:47, 1.31it/s, lr=0.0001, step_loss=0.0121] Steps: 58%|█████▊ | 1152/2000 [24:03<10:46, 1.31it/s, lr=0.0001, step_loss=0.0121]11/14/2025 06:32:51 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1152)
Steps: 58%|█████▊ | 1152/2000 [24:11<10:46, 1.31it/s, lr=0.0001, step_loss=0.0192]11/14/2025 06:32:51 - INFO - root - ### DEBUG: Finished epoch 35, epoch_steps=32, global_step=1152
11/14/2025 06:32:51 - INFO - root - ### DEBUG: Starting epoch 36/63, global_step=1152, max_train_steps=2000
Steps: 58%|█████▊ | 1153/2000 [24:12<43:43, 3.10s/it, lr=0.0001, step_loss=0.0192]Steps: 58%|█████▊ | 1153/2000 [24:12<43:43, 3.10s/it, lr=0.0001, step_loss=0.00306]Steps: 58%|█████▊ | 1154/2000 [24:12<33:47, 2.40s/it, lr=0.0001, step_loss=0.00306]Steps: 58%|█████▊ | 1154/2000 [24:12<33:47, 2.40s/it, lr=0.0001, step_loss=0.0141] Steps: 58%|█████▊ | 1155/2000 [24:13<26:50, 1.91s/it, lr=0.0001, step_loss=0.0141]Steps: 58%|█████▊ | 1155/2000 [24:13<26:50, 1.91s/it, lr=0.0001, step_loss=0.000633]Steps: 58%|█████▊ | 1156/2000 [24:14<21:59, 1.56s/it, lr=0.0001, step_loss=0.000633]Steps: 58%|█████▊ | 1156/2000 [24:14<21:59, 1.56s/it, lr=0.0001, step_loss=0.0567] Steps: 58%|█████▊ | 1157/2000 [24:15<18:35, 1.32s/it, lr=0.0001, step_loss=0.0567]Steps: 58%|█████▊ | 1157/2000 [24:15<18:35, 1.32s/it, lr=0.0001, step_loss=0.0166]Steps: 58%|█████▊ | 1158/2000 [24:15<16:12, 1.15s/it, lr=0.0001, step_loss=0.0166]Steps: 58%|█████▊ | 1158/2000 [24:16<16:12, 1.15s/it, lr=0.0001, step_loss=0.16] Steps: 58%|█████▊ | 1159/2000 [24:16<14:32, 1.04s/it, lr=0.0001, step_loss=0.16]Steps: 58%|█████▊ | 1159/2000 [24:16<14:32, 1.04s/it, lr=0.0001, step_loss=0.27]Steps: 58%|█████▊ | 1160/2000 [24:17<13:22, 1.05it/s, lr=0.0001, step_loss=0.27]Steps: 58%|█████▊ | 1160/2000 [24:17<13:22, 1.05it/s, lr=0.0001, step_loss=0.0286]Steps: 58%|█████▊ | 1161/2000 [24:18<12:32, 1.11it/s, lr=0.0001, step_loss=0.0286]Steps: 58%|█████▊ | 1161/2000 [24:18<12:32, 1.11it/s, lr=0.0001, step_loss=0.00661]Steps: 58%|█████▊ | 1162/2000 [24:19<11:58, 1.17it/s, lr=0.0001, step_loss=0.00661]Steps: 58%|█████▊ | 1162/2000 [24:19<11:58, 1.17it/s, lr=0.0001, step_loss=0.00192]Steps: 58%|█████▊ | 1163/2000 [24:19<11:33, 1.21it/s, lr=0.0001, step_loss=0.00192]Steps: 58%|█████▊ | 1163/2000 [24:19<11:33, 1.21it/s, lr=0.0001, step_loss=0.00145]Steps: 58%|█████▊ | 1164/2000 [24:20<11:15, 1.24it/s, lr=0.0001, step_loss=0.00145]Steps: 58%|█████▊ | 1164/2000 [24:20<11:15, 1.24it/s, lr=0.0001, step_loss=0.00593]Steps: 58%|█████▊ | 
1165/2000 [24:21<11:03, 1.26it/s, lr=0.0001, step_loss=0.00593]Steps: 58%|█████▊ | 1165/2000 [24:21<11:03, 1.26it/s, lr=0.0001, step_loss=0.428] Steps: 58%|█████▊ | 1166/2000 [24:22<10:54, 1.27it/s, lr=0.0001, step_loss=0.428]Steps: 58%|█████▊ | 1166/2000 [24:22<10:54, 1.27it/s, lr=0.0001, step_loss=0.0349]Steps: 58%|█████▊ | 1167/2000 [24:22<10:48, 1.29it/s, lr=0.0001, step_loss=0.0349]Steps: 58%|█████▊ | 1167/2000 [24:22<10:48, 1.29it/s, lr=0.0001, step_loss=0.0779]Steps: 58%|█████▊ | 1168/2000 [24:23<10:43, 1.29it/s, lr=0.0001, step_loss=0.0779]Steps: 58%|█████▊ | 1168/2000 [24:23<10:43, 1.29it/s, lr=0.0001, step_loss=0.000367]Steps: 58%|█████▊ | 1169/2000 [24:24<10:40, 1.30it/s, lr=0.0001, step_loss=0.000367]Steps: 58%|█████▊ | 1169/2000 [24:24<10:40, 1.30it/s, lr=0.0001, step_loss=0.0165] Steps: 58%|█████▊ | 1170/2000 [24:25<10:38, 1.30it/s, lr=0.0001, step_loss=0.0165]Steps: 58%|█████▊ | 1170/2000 [24:25<10:38, 1.30it/s, lr=0.0001, step_loss=0.00892]Steps: 59%|█████▊ | 1171/2000 [24:25<10:35, 1.30it/s, lr=0.0001, step_loss=0.00892]Steps: 59%|█████▊ | 1171/2000 [24:25<10:35, 1.30it/s, lr=0.0001, step_loss=0.000536]Steps: 59%|█████▊ | 1172/2000 [24:26<10:33, 1.31it/s, lr=0.0001, step_loss=0.000536]Steps: 59%|█████▊ | 1172/2000 [24:26<10:33, 1.31it/s, lr=0.0001, step_loss=0.0332] Steps: 59%|█████▊ | 1173/2000 [24:27<10:31, 1.31it/s, lr=0.0001, step_loss=0.0332]Steps: 59%|█████▊ | 1173/2000 [24:27<10:31, 1.31it/s, lr=0.0001, step_loss=0.00295]Steps: 59%|█████▊ | 1174/2000 [24:28<10:30, 1.31it/s, lr=0.0001, step_loss=0.00295]Steps: 59%|█████▊ | 1174/2000 [24:28<10:30, 1.31it/s, lr=0.0001, step_loss=0.106] Steps: 59%|█████▉ | 1175/2000 [24:28<10:29, 1.31it/s, lr=0.0001, step_loss=0.106]Steps: 59%|█████▉ | 1175/2000 [24:28<10:29, 1.31it/s, lr=0.0001, step_loss=0.0135]Steps: 59%|█████▉ | 1176/2000 [24:29<10:28, 1.31it/s, lr=0.0001, step_loss=0.0135]Steps: 59%|█████▉ | 1176/2000 [24:29<10:28, 1.31it/s, lr=0.0001, step_loss=0.00316]Steps: 59%|█████▉ | 1177/2000 
[24:30<10:28, 1.31it/s, lr=0.0001, step_loss=0.00316]Steps: 59%|█████▉ | 1177/2000 [24:30<10:28, 1.31it/s, lr=0.0001, step_loss=0.0123] Steps: 59%|█████▉ | 1178/2000 [24:31<10:28, 1.31it/s, lr=0.0001, step_loss=0.0123]Steps: 59%|█████▉ | 1178/2000 [24:31<10:28, 1.31it/s, lr=0.0001, step_loss=0.0393]Steps: 59%|█████▉ | 1179/2000 [24:31<10:27, 1.31it/s, lr=0.0001, step_loss=0.0393]Steps: 59%|█████▉ | 1179/2000 [24:32<10:27, 1.31it/s, lr=0.0001, step_loss=0.000596]Steps: 59%|█████▉ | 1180/2000 [24:32<10:27, 1.31it/s, lr=0.0001, step_loss=0.000596]Steps: 59%|█████▉ | 1180/2000 [24:32<10:27, 1.31it/s, lr=0.0001, step_loss=0.0105] Steps: 59%|█████▉ | 1181/2000 [24:33<10:26, 1.31it/s, lr=0.0001, step_loss=0.0105]Steps: 59%|█████▉ | 1181/2000 [24:33<10:26, 1.31it/s, lr=0.0001, step_loss=0.000464]Steps: 59%|█████▉ | 1182/2000 [24:34<10:26, 1.31it/s, lr=0.0001, step_loss=0.000464]Steps: 59%|█████▉ | 1182/2000 [24:34<10:26, 1.31it/s, lr=0.0001, step_loss=0.0363] Steps: 59%|█████▉ | 1183/2000 [24:35<10:25, 1.31it/s, lr=0.0001, step_loss=0.0363]Steps: 59%|█████▉ | 1183/2000 [24:35<10:25, 1.31it/s, lr=0.0001, step_loss=0.0808]Steps: 59%|█████▉ | 1184/2000 [24:35<10:24, 1.31it/s, lr=0.0001, step_loss=0.0808]11/14/2025 06:33:22 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1184)
Steps: 59%|█████▉ | 1184/2000 [24:42<10:24, 1.31it/s, lr=0.0001, step_loss=0.0099]11/14/2025 06:33:22 - INFO - root - ### DEBUG: Finished epoch 36, epoch_steps=32, global_step=1184
11/14/2025 06:33:22 - INFO - root - ### DEBUG: Starting epoch 37/63, global_step=1184, max_train_steps=2000
Steps: 59%|█████▉ | 1185/2000 [24:43<37:12, 2.74s/it, lr=0.0001, step_loss=0.0099]Steps: 59%|█████▉ | 1185/2000 [24:43<37:12, 2.74s/it, lr=0.0001, step_loss=0.0437]Steps: 59%|█████▉ | 1186/2000 [24:43<29:06, 2.15s/it, lr=0.0001, step_loss=0.0437]Steps: 59%|█████▉ | 1186/2000 [24:43<29:06, 2.15s/it, lr=0.0001, step_loss=0.00492]Steps: 59%|█████▉ | 1187/2000 [24:44<23:26, 1.73s/it, lr=0.0001, step_loss=0.00492]Steps: 59%|█████▉ | 1187/2000 [24:44<23:26, 1.73s/it, lr=0.0001, step_loss=0.0919] Steps: 59%|█████▉ | 1188/2000 [24:45<19:28, 1.44s/it, lr=0.0001, step_loss=0.0919]Steps: 59%|█████▉ | 1188/2000 [24:45<19:28, 1.44s/it, lr=0.0001, step_loss=0.00569]Steps: 59%|█████▉ | 1189/2000 [24:46<16:42, 1.24s/it, lr=0.0001, step_loss=0.00569]Steps: 59%|█████▉ | 1189/2000 [24:46<16:42, 1.24s/it, lr=0.0001, step_loss=0.111] Steps: 60%|█████▉ | 1190/2000 [24:46<14:45, 1.09s/it, lr=0.0001, step_loss=0.111]Steps: 60%|█████▉ | 1190/2000 [24:47<14:45, 1.09s/it, lr=0.0001, step_loss=0.0279]Steps: 60%|█████▉ | 1191/2000 [24:47<13:24, 1.01it/s, lr=0.0001, step_loss=0.0279]Steps: 60%|█████▉ | 1191/2000 [24:47<13:24, 1.01it/s, lr=0.0001, step_loss=0.000662]Steps: 60%|█████▉ | 1192/2000 [24:48<12:27, 1.08it/s, lr=0.0001, step_loss=0.000662]Steps: 60%|█████▉ | 1192/2000 [24:48<12:27, 1.08it/s, lr=0.0001, step_loss=0.0041] Steps: 60%|█████▉ | 1193/2000 [24:49<11:47, 1.14it/s, lr=0.0001, step_loss=0.0041]Steps: 60%|█████▉ | 1193/2000 [24:49<11:47, 1.14it/s, lr=0.0001, step_loss=0.265] Steps: 60%|█████▉ | 1194/2000 [24:50<11:19, 1.19it/s, lr=0.0001, step_loss=0.265]Steps: 60%|█████▉ | 1194/2000 [24:50<11:19, 1.19it/s, lr=0.0001, step_loss=0.0102]Steps: 60%|█████▉ | 1195/2000 [24:50<10:59, 1.22it/s, lr=0.0001, step_loss=0.0102]Steps: 60%|█████▉ | 1195/2000 [24:50<10:59, 1.22it/s, lr=0.0001, step_loss=0.0165]Steps: 60%|█████▉ | 1196/2000 [24:51<10:44, 1.25it/s, lr=0.0001, step_loss=0.0165]Steps: 60%|█████▉ | 1196/2000 [24:51<10:44, 1.25it/s, lr=0.0001, step_loss=0.00553]Steps: 60%|█████▉ | 
1197/2000 [24:52<10:34, 1.26it/s, lr=0.0001, step_loss=0.00553]Steps: 60%|█████▉ | 1197/2000 [24:52<10:34, 1.26it/s, lr=0.0001, step_loss=0.00407]Steps: 60%|█████▉ | 1198/2000 [24:53<10:27, 1.28it/s, lr=0.0001, step_loss=0.00407]Steps: 60%|█████▉ | 1198/2000 [24:53<10:27, 1.28it/s, lr=0.0001, step_loss=0.00313]Steps: 60%|█████▉ | 1199/2000 [24:53<10:21, 1.29it/s, lr=0.0001, step_loss=0.00313]Steps: 60%|█████▉ | 1199/2000 [24:53<10:21, 1.29it/s, lr=0.0001, step_loss=0.187] Steps: 60%|██████ | 1200/2000 [24:54<10:17, 1.30it/s, lr=0.0001, step_loss=0.187]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.22it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.19it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.15it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.16it/s]
11/14/2025 06:34:38 - INFO - root - Saved samples to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/samples/sample-1200.gif
Steps: 60%|██████ | 1200/2000 [25:58<10:17, 1.30it/s, lr=0.0001, step_loss=0.00489]Steps: 60%|██████ | 1201/2000 [25:59<4:26:13, 19.99s/it, lr=0.0001, step_loss=0.00489]Steps: 60%|██████ | 1201/2000 [25:59<4:26:13, 19.99s/it, lr=0.0001, step_loss=0.000511]Steps: 60%|██████ | 1202/2000 [26:00<3:09:10, 14.22s/it, lr=0.0001, step_loss=0.000511]Steps: 60%|██████ | 1202/2000 [26:00<3:09:10, 14.22s/it, lr=0.0001, step_loss=0.164] Steps: 60%|██████ | 1203/2000 [26:00<2:15:17, 10.18s/it, lr=0.0001, step_loss=0.164]Steps: 60%|██████ | 1203/2000 [26:00<2:15:17, 10.18s/it, lr=0.0001, step_loss=0.00208]Steps: 60%|██████ | 1204/2000 [26:01<1:37:37, 7.36s/it, lr=0.0001, step_loss=0.00208]Steps: 60%|██████ | 1204/2000 [26:01<1:37:37, 7.36s/it, lr=0.0001, step_loss=0.0264] Steps: 60%|██████ | 1205/2000 [26:02<1:11:16, 5.38s/it, lr=0.0001, step_loss=0.0264]Steps: 60%|██████ | 1205/2000 [26:02<1:11:16, 5.38s/it, lr=0.0001, step_loss=0.0493]Steps: 60%|██████ | 1206/2000 [26:03<52:50, 3.99s/it, lr=0.0001, step_loss=0.0493] Steps: 60%|██████ | 1206/2000 [26:03<52:50, 3.99s/it, lr=0.0001, step_loss=0.000635]Steps: 60%|██████ | 1207/2000 [26:04<39:58, 3.02s/it, lr=0.0001, step_loss=0.000635]Steps: 60%|██████ | 1207/2000 [26:04<39:58, 3.02s/it, lr=0.0001, step_loss=0.00898] Steps: 60%|██████ | 1208/2000 [26:04<30:58, 2.35s/it, lr=0.0001, step_loss=0.00898]Steps: 60%|██████ | 1208/2000 [26:04<30:58, 2.35s/it, lr=0.0001, step_loss=0.0365] Steps: 60%|██████ | 1209/2000 [26:05<24:40, 1.87s/it, lr=0.0001, step_loss=0.0365]Steps: 60%|██████ | 1209/2000 [26:05<24:40, 1.87s/it, lr=0.0001, step_loss=0.0023]Steps: 60%|██████ | 1210/2000 [26:06<20:16, 1.54s/it, lr=0.0001, step_loss=0.0023]Steps: 60%|██████ | 1210/2000 [26:06<20:16, 1.54s/it, lr=0.0001, step_loss=0.00309]Steps: 61%|██████ | 1211/2000 [26:07<17:10, 1.31s/it, lr=0.0001, step_loss=0.00309]Steps: 61%|██████ | 1211/2000 [26:07<17:10, 1.31s/it, lr=0.0001, step_loss=0.000586]Steps: 61%|██████ | 1212/2000 [26:07<15:01, 1.14s/it, lr=0.0001, 
step_loss=0.000586]Steps: 61%|██████ | 1212/2000 [26:07<15:01, 1.14s/it, lr=0.0001, step_loss=0.173] Steps: 61%|██████ | 1213/2000 [26:08<13:30, 1.03s/it, lr=0.0001, step_loss=0.173]Steps: 61%|██████ | 1213/2000 [26:08<13:30, 1.03s/it, lr=0.0001, step_loss=0.0538]Steps: 61%|██████ | 1214/2000 [26:09<12:26, 1.05it/s, lr=0.0001, step_loss=0.0538]Steps: 61%|██████ | 1214/2000 [26:09<12:26, 1.05it/s, lr=0.0001, step_loss=0.00509]Steps: 61%|██████ | 1215/2000 [26:10<11:41, 1.12it/s, lr=0.0001, step_loss=0.00509]Steps: 61%|██████ | 1215/2000 [26:10<11:41, 1.12it/s, lr=0.0001, step_loss=0.0227] Steps: 61%|██████ | 1216/2000 [26:10<11:10, 1.17it/s, lr=0.0001, step_loss=0.0227]11/14/2025 06:34:56 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1216)
Steps: 61%|██████ | 1216/2000 [26:17<11:10, 1.17it/s, lr=0.0001, step_loss=0.000582]
11/14/2025 06:34:56 - INFO - root - ### DEBUG: Finished epoch 37, epoch_steps=32, global_step=1216
11/14/2025 06:34:56 - INFO - root - ### DEBUG: Starting epoch 38/63, global_step=1216, max_train_steps=2000
Steps: 61%|██████ | 1217/2000 [26:18<36:01, 2.76s/it, lr=0.0001, step_loss=0.000582]Steps: 61%|██████ | 1217/2000 [26:18<36:01, 2.76s/it, lr=0.0001, step_loss=0.0232] Steps: 61%|██████ | 1218/2000 [26:18<28:10, 2.16s/it, lr=0.0001, step_loss=0.0232]Steps: 61%|██████ | 1218/2000 [26:18<28:10, 2.16s/it, lr=0.0001, step_loss=0.0158]Steps: 61%|██████ | 1219/2000 [26:19<22:40, 1.74s/it, lr=0.0001, step_loss=0.0158]Steps: 61%|██████ | 1219/2000 [26:19<22:40, 1.74s/it, lr=0.0001, step_loss=0.0256]Steps: 61%|██████ | 1220/2000 [26:20<18:49, 1.45s/it, lr=0.0001, step_loss=0.0256]Steps: 61%|██████ | 1220/2000 [26:20<18:49, 1.45s/it, lr=0.0001, step_loss=0.0045]Steps: 61%|██████ | 1221/2000 [26:21<16:07, 1.24s/it, lr=0.0001, step_loss=0.0045]Steps: 61%|██████ | 1221/2000 [26:21<16:07, 1.24s/it, lr=0.0001, step_loss=0.00712]Steps: 61%|██████ | 1222/2000 [26:21<14:14, 1.10s/it, lr=0.0001, step_loss=0.00712]Steps: 61%|██████ | 1222/2000 [26:21<14:14, 1.10s/it, lr=0.0001, step_loss=0.00275]Steps: 61%|██████ | 1223/2000 [26:22<12:55, 1.00it/s, lr=0.0001, step_loss=0.00275]Steps: 61%|██████ | 1223/2000 [26:22<12:55, 1.00it/s, lr=0.0001, step_loss=0.00151]Steps: 61%|██████ | 1224/2000 [26:23<11:59, 1.08it/s, lr=0.0001, step_loss=0.00151]Steps: 61%|██████ | 1224/2000 [26:23<11:59, 1.08it/s, lr=0.0001, step_loss=0.0433] Steps: 61%|██████▏ | 1225/2000 [26:24<11:20, 1.14it/s, lr=0.0001, step_loss=0.0433]Steps: 61%|██████▏ | 1225/2000 [26:24<11:20, 1.14it/s, lr=0.0001, step_loss=0.0013]Steps: 61%|██████▏ | 1226/2000 [26:24<10:53, 1.18it/s, lr=0.0001, step_loss=0.0013]Steps: 61%|██████▏ | 1226/2000 [26:24<10:53, 1.18it/s, lr=0.0001, step_loss=0.0238]Steps: 61%|██████▏ | 1227/2000 [26:25<10:33, 1.22it/s, lr=0.0001, step_loss=0.0238]Steps: 61%|██████▏ | 1227/2000 [26:25<10:33, 1.22it/s, lr=0.0001, step_loss=0.315] Steps: 61%|██████▏ | 1228/2000 [26:26<10:19, 1.25it/s, lr=0.0001, step_loss=0.315]Steps: 61%|██████▏ | 1228/2000 [26:26<10:19, 1.25it/s, lr=0.0001, step_loss=0.197]Steps: 
61%|██████▏ | 1229/2000 [26:27<10:09, 1.26it/s, lr=0.0001, step_loss=0.197]Steps: 61%|██████▏ | 1229/2000 [26:27<10:09, 1.26it/s, lr=0.0001, step_loss=0.117]Steps: 62%|██████▏ | 1230/2000 [26:28<10:02, 1.28it/s, lr=0.0001, step_loss=0.117]Steps: 62%|██████▏ | 1230/2000 [26:28<10:02, 1.28it/s, lr=0.0001, step_loss=0.0253]Steps: 62%|██████▏ | 1231/2000 [26:28<09:56, 1.29it/s, lr=0.0001, step_loss=0.0253]Steps: 62%|██████▏ | 1231/2000 [26:28<09:56, 1.29it/s, lr=0.0001, step_loss=0.00042]Steps: 62%|██████▏ | 1232/2000 [26:29<09:53, 1.29it/s, lr=0.0001, step_loss=0.00042]Steps: 62%|██████▏ | 1232/2000 [26:29<09:53, 1.29it/s, lr=0.0001, step_loss=0.00208]Steps: 62%|██████▏ | 1233/2000 [26:30<09:50, 1.30it/s, lr=0.0001, step_loss=0.00208]Steps: 62%|██████▏ | 1233/2000 [26:30<09:50, 1.30it/s, lr=0.0001, step_loss=0.0239] Steps: 62%|██████▏ | 1234/2000 [26:31<09:47, 1.30it/s, lr=0.0001, step_loss=0.0239]Steps: 62%|██████▏ | 1234/2000 [26:31<09:47, 1.30it/s, lr=0.0001, step_loss=0.00288]Steps: 62%|██████▏ | 1235/2000 [26:31<09:45, 1.31it/s, lr=0.0001, step_loss=0.00288]Steps: 62%|██████▏ | 1235/2000 [26:31<09:45, 1.31it/s, lr=0.0001, step_loss=0.00788]Steps: 62%|██████▏ | 1236/2000 [26:32<09:43, 1.31it/s, lr=0.0001, step_loss=0.00788]Steps: 62%|██████▏ | 1236/2000 [26:32<09:43, 1.31it/s, lr=0.0001, step_loss=0.375] Steps: 62%|██████▏ | 1237/2000 [26:33<09:42, 1.31it/s, lr=0.0001, step_loss=0.375]Steps: 62%|██████▏ | 1237/2000 [26:33<09:42, 1.31it/s, lr=0.0001, step_loss=0.049]Steps: 62%|██████▏ | 1238/2000 [26:34<09:41, 1.31it/s, lr=0.0001, step_loss=0.049]Steps: 62%|██████▏ | 1238/2000 [26:34<09:41, 1.31it/s, lr=0.0001, step_loss=0.0524]Steps: 62%|██████▏ | 1239/2000 [26:34<09:40, 1.31it/s, lr=0.0001, step_loss=0.0524]Steps: 62%|██████▏ | 1239/2000 [26:34<09:40, 1.31it/s, lr=0.0001, step_loss=0.00288]Steps: 62%|██████▏ | 1240/2000 [26:35<09:39, 1.31it/s, lr=0.0001, step_loss=0.00288]Steps: 62%|██████▏ | 1240/2000 [26:35<09:39, 1.31it/s, lr=0.0001, step_loss=0.000978]Steps: 
62%|██████▏ | 1241/2000 [26:36<09:38, 1.31it/s, lr=0.0001, step_loss=0.000978]Steps: 62%|██████▏ | 1241/2000 [26:36<09:38, 1.31it/s, lr=0.0001, step_loss=0.000657]Steps: 62%|██████▏ | 1242/2000 [26:37<09:38, 1.31it/s, lr=0.0001, step_loss=0.000657]Steps: 62%|██████▏ | 1242/2000 [26:37<09:38, 1.31it/s, lr=0.0001, step_loss=0.00245] Steps: 62%|██████▏ | 1243/2000 [26:37<09:37, 1.31it/s, lr=0.0001, step_loss=0.00245]Steps: 62%|██████▏ | 1243/2000 [26:37<09:37, 1.31it/s, lr=0.0001, step_loss=0.0258] Steps: 62%|██████▏ | 1244/2000 [26:38<09:36, 1.31it/s, lr=0.0001, step_loss=0.0258]Steps: 62%|██████▏ | 1244/2000 [26:38<09:36, 1.31it/s, lr=0.0001, step_loss=0.129] Steps: 62%|██████▏ | 1245/2000 [26:39<09:35, 1.31it/s, lr=0.0001, step_loss=0.129]Steps: 62%|██████▏ | 1245/2000 [26:39<09:35, 1.31it/s, lr=0.0001, step_loss=0.000765]Steps: 62%|██████▏ | 1246/2000 [26:40<09:34, 1.31it/s, lr=0.0001, step_loss=0.000765]Steps: 62%|██████▏ | 1246/2000 [26:40<09:34, 1.31it/s, lr=0.0001, step_loss=0.00348] Steps: 62%|██████▏ | 1247/2000 [26:40<09:34, 1.31it/s, lr=0.0001, step_loss=0.00348]Steps: 62%|██████▏ | 1247/2000 [26:41<09:34, 1.31it/s, lr=0.0001, step_loss=0.0925] Steps: 62%|██████▏ | 1248/2000 [26:41<09:33, 1.31it/s, lr=0.0001, step_loss=0.0925]11/14/2025 06:35:29 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1248)
Steps: 62%|██████▏ | 1248/2000 [26:49<09:33, 1.31it/s, lr=0.0001, step_loss=0.46]
11/14/2025 06:35:29 - INFO - root - ### DEBUG: Finished epoch 38, epoch_steps=32, global_step=1248
11/14/2025 06:35:29 - INFO - root - ### DEBUG: Starting epoch 39/63, global_step=1248, max_train_steps=2000
Steps: 62%|██████▏ | 1249/2000 [26:50<40:09, 3.21s/it, lr=0.0001, step_loss=0.46]Steps: 62%|██████▏ | 1249/2000 [26:50<40:09, 3.21s/it, lr=0.0001, step_loss=0.103]Steps: 62%|██████▎ | 1250/2000 [26:51<30:55, 2.47s/it, lr=0.0001, step_loss=0.103]Steps: 62%|██████▎ | 1250/2000 [26:51<30:55, 2.47s/it, lr=0.0001, step_loss=0.00301]Steps: 63%|██████▎ | 1251/2000 [26:52<24:28, 1.96s/it, lr=0.0001, step_loss=0.00301]Steps: 63%|██████▎ | 1251/2000 [26:52<24:28, 1.96s/it, lr=0.0001, step_loss=0.0141] Steps: 63%|██████▎ | 1252/2000 [26:52<19:57, 1.60s/it, lr=0.0001, step_loss=0.0141]Steps: 63%|██████▎ | 1252/2000 [26:52<19:57, 1.60s/it, lr=0.0001, step_loss=0.292] Steps: 63%|██████▎ | 1253/2000 [26:53<16:47, 1.35s/it, lr=0.0001, step_loss=0.292]Steps: 63%|██████▎ | 1253/2000 [26:53<16:47, 1.35s/it, lr=0.0001, step_loss=0.262]Steps: 63%|██████▎ | 1254/2000 [26:54<14:34, 1.17s/it, lr=0.0001, step_loss=0.262]Steps: 63%|██████▎ | 1254/2000 [26:54<14:34, 1.17s/it, lr=0.0001, step_loss=0.00113]Steps: 63%|██████▎ | 1255/2000 [26:55<13:01, 1.05s/it, lr=0.0001, step_loss=0.00113]Steps: 63%|██████▎ | 1255/2000 [26:55<13:01, 1.05s/it, lr=0.0001, step_loss=0.0497] Steps: 63%|██████▎ | 1256/2000 [26:55<11:56, 1.04it/s, lr=0.0001, step_loss=0.0497]Steps: 63%|██████▎ | 1256/2000 [26:56<11:56, 1.04it/s, lr=0.0001, step_loss=0.000523]Steps: 63%|██████▎ | 1257/2000 [26:56<11:10, 1.11it/s, lr=0.0001, step_loss=0.000523]Steps: 63%|██████▎ | 1257/2000 [26:56<11:10, 1.11it/s, lr=0.0001, step_loss=0.00223] Steps: 63%|██████▎ | 1258/2000 [26:57<10:38, 1.16it/s, lr=0.0001, step_loss=0.00223]Steps: 63%|██████▎ | 1258/2000 [26:57<10:38, 1.16it/s, lr=0.0001, step_loss=0.00137]Steps: 63%|██████▎ | 1259/2000 [26:58<10:15, 1.20it/s, lr=0.0001, step_loss=0.00137]Steps: 63%|██████▎ | 1259/2000 [26:58<10:15, 1.20it/s, lr=0.0001, step_loss=0.0429] Steps: 63%|██████▎ | 1260/2000 [26:59<09:59, 1.23it/s, lr=0.0001, step_loss=0.0429]Steps: 63%|██████▎ | 1260/2000 [26:59<09:59, 1.23it/s, lr=0.0001, 
step_loss=0.00117]Steps: 63%|██████▎ | 1261/2000 [26:59<09:47, 1.26it/s, lr=0.0001, step_loss=0.00117]Steps: 63%|██████▎ | 1261/2000 [26:59<09:47, 1.26it/s, lr=0.0001, step_loss=0.0539] Steps: 63%|██████▎ | 1262/2000 [27:00<09:40, 1.27it/s, lr=0.0001, step_loss=0.0539]Steps: 63%|██████▎ | 1262/2000 [27:00<09:40, 1.27it/s, lr=0.0001, step_loss=0.18] Steps: 63%|██████▎ | 1263/2000 [27:01<09:33, 1.28it/s, lr=0.0001, step_loss=0.18]Steps: 63%|██████▎ | 1263/2000 [27:01<09:33, 1.28it/s, lr=0.0001, step_loss=0.00109]Steps: 63%|██████▎ | 1264/2000 [27:02<09:29, 1.29it/s, lr=0.0001, step_loss=0.00109]Steps: 63%|██████▎ | 1264/2000 [27:02<09:29, 1.29it/s, lr=0.0001, step_loss=0.000581]Steps: 63%|██████▎ | 1265/2000 [27:02<09:25, 1.30it/s, lr=0.0001, step_loss=0.000581]Steps: 63%|██████▎ | 1265/2000 [27:02<09:25, 1.30it/s, lr=0.0001, step_loss=0.198] Steps: 63%|██████▎ | 1266/2000 [27:03<09:23, 1.30it/s, lr=0.0001, step_loss=0.198]Steps: 63%|██████▎ | 1266/2000 [27:03<09:23, 1.30it/s, lr=0.0001, step_loss=0.00958]Steps: 63%|██████▎ | 1267/2000 [27:04<09:21, 1.31it/s, lr=0.0001, step_loss=0.00958]Steps: 63%|██████▎ | 1267/2000 [27:04<09:21, 1.31it/s, lr=0.0001, step_loss=0.0317] Steps: 63%|██████▎ | 1268/2000 [27:05<09:19, 1.31it/s, lr=0.0001, step_loss=0.0317]Steps: 63%|██████▎ | 1268/2000 [27:05<09:19, 1.31it/s, lr=0.0001, step_loss=0.232] Steps: 63%|██████▎ | 1269/2000 [27:05<09:18, 1.31it/s, lr=0.0001, step_loss=0.232]Steps: 63%|██████▎ | 1269/2000 [27:05<09:18, 1.31it/s, lr=0.0001, step_loss=0.00327]Steps: 64%|██████▎ | 1270/2000 [27:06<09:16, 1.31it/s, lr=0.0001, step_loss=0.00327]Steps: 64%|██████▎ | 1270/2000 [27:06<09:16, 1.31it/s, lr=0.0001, step_loss=0.0221] Steps: 64%|██████▎ | 1271/2000 [27:07<09:15, 1.31it/s, lr=0.0001, step_loss=0.0221]Steps: 64%|██████▎ | 1271/2000 [27:07<09:15, 1.31it/s, lr=0.0001, step_loss=0.0188]Steps: 64%|██████▎ | 1272/2000 [27:08<09:14, 1.31it/s, lr=0.0001, step_loss=0.0188]Steps: 64%|██████▎ | 1272/2000 [27:08<09:14, 1.31it/s, 
lr=0.0001, step_loss=0.154] Steps: 64%|██████▎ | 1273/2000 [27:08<09:13, 1.31it/s, lr=0.0001, step_loss=0.154]Steps: 64%|██████▎ | 1273/2000 [27:08<09:13, 1.31it/s, lr=0.0001, step_loss=0.061]Steps: 64%|██████▎ | 1274/2000 [27:09<09:13, 1.31it/s, lr=0.0001, step_loss=0.061]Steps: 64%|██████▎ | 1274/2000 [27:09<09:13, 1.31it/s, lr=0.0001, step_loss=0.00246]Steps: 64%|██████▍ | 1275/2000 [27:10<09:12, 1.31it/s, lr=0.0001, step_loss=0.00246]Steps: 64%|██████▍ | 1275/2000 [27:10<09:12, 1.31it/s, lr=0.0001, step_loss=0.00169]Steps: 64%|██████▍ | 1276/2000 [27:11<09:11, 1.31it/s, lr=0.0001, step_loss=0.00169]Steps: 64%|██████▍ | 1276/2000 [27:11<09:11, 1.31it/s, lr=0.0001, step_loss=0.000558]Steps: 64%|██████▍ | 1277/2000 [27:11<09:11, 1.31it/s, lr=0.0001, step_loss=0.000558]Steps: 64%|██████▍ | 1277/2000 [27:12<09:11, 1.31it/s, lr=0.0001, step_loss=0.151] Steps: 64%|██████▍ | 1278/2000 [27:12<09:10, 1.31it/s, lr=0.0001, step_loss=0.151]Steps: 64%|██████▍ | 1278/2000 [27:12<09:10, 1.31it/s, lr=0.0001, step_loss=0.0022]Steps: 64%|██████▍ | 1279/2000 [27:13<09:09, 1.31it/s, lr=0.0001, step_loss=0.0022]Steps: 64%|██████▍ | 1279/2000 [27:13<09:09, 1.31it/s, lr=0.0001, step_loss=0.00165]Steps: 64%|██████▍ | 1280/2000 [27:14<09:09, 1.31it/s, lr=0.0001, step_loss=0.00165]11/14/2025 06:36:00 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1280)
Steps: 64%|██████▍ | 1280/2000 [27:21<09:09, 1.31it/s, lr=0.0001, step_loss=0.0322]
11/14/2025 06:36:00 - INFO - root - ### DEBUG: Finished epoch 39, epoch_steps=32, global_step=1280
11/14/2025 06:36:00 - INFO - root - ### DEBUG: Starting epoch 40/63, global_step=1280, max_train_steps=2000
Steps: 64%|██████▍ | 1281/2000 [27:21<34:03, 2.84s/it, lr=0.0001, step_loss=0.0322]Steps: 64%|██████▍ | 1281/2000 [27:21<34:03, 2.84s/it, lr=0.0001, step_loss=0.00704]Steps: 64%|██████▍ | 1282/2000 [27:22<26:32, 2.22s/it, lr=0.0001, step_loss=0.00704]Steps: 64%|██████▍ | 1282/2000 [27:22<26:32, 2.22s/it, lr=0.0001, step_loss=0.0656] Steps: 64%|██████▍ | 1283/2000 [27:23<21:17, 1.78s/it, lr=0.0001, step_loss=0.0656]Steps: 64%|██████▍ | 1283/2000 [27:23<21:17, 1.78s/it, lr=0.0001, step_loss=0.259] Steps: 64%|██████▍ | 1284/2000 [27:24<17:36, 1.48s/it, lr=0.0001, step_loss=0.259]Steps: 64%|██████▍ | 1284/2000 [27:24<17:36, 1.48s/it, lr=0.0001, step_loss=0.00513]Steps: 64%|██████▍ | 1285/2000 [27:25<15:01, 1.26s/it, lr=0.0001, step_loss=0.00513]Steps: 64%|██████▍ | 1285/2000 [27:25<15:01, 1.26s/it, lr=0.0001, step_loss=0.0109] Steps: 64%|██████▍ | 1286/2000 [27:25<13:13, 1.11s/it, lr=0.0001, step_loss=0.0109]Steps: 64%|██████▍ | 1286/2000 [27:25<13:13, 1.11s/it, lr=0.0001, step_loss=0.000761]Steps: 64%|██████▍ | 1287/2000 [27:26<11:57, 1.01s/it, lr=0.0001, step_loss=0.000761]Steps: 64%|██████▍ | 1287/2000 [27:26<11:57, 1.01s/it, lr=0.0001, step_loss=0.0147] Steps: 64%|██████▍ | 1288/2000 [27:27<11:03, 1.07it/s, lr=0.0001, step_loss=0.0147]Steps: 64%|██████▍ | 1288/2000 [27:27<11:03, 1.07it/s, lr=0.0001, step_loss=0.41] Steps: 64%|██████▍ | 1289/2000 [27:28<10:26, 1.13it/s, lr=0.0001, step_loss=0.41]Steps: 64%|██████▍ | 1289/2000 [27:28<10:26, 1.13it/s, lr=0.0001, step_loss=0.00358]Steps: 64%|██████▍ | 1290/2000 [27:28<10:01, 1.18it/s, lr=0.0001, step_loss=0.00358]Steps: 64%|██████▍ | 1290/2000 [27:28<10:01, 1.18it/s, lr=0.0001, step_loss=0.0066] Steps: 65%|██████▍ | 1291/2000 [27:29<09:42, 1.22it/s, lr=0.0001, step_loss=0.0066]Steps: 65%|██████▍ | 1291/2000 [27:29<09:42, 1.22it/s, lr=0.0001, step_loss=0.000684]Steps: 65%|██████▍ | 1292/2000 [27:30<09:29, 1.24it/s, lr=0.0001, step_loss=0.000684]Steps: 65%|██████▍ | 1292/2000 [27:30<09:29, 1.24it/s, lr=0.0001, 
step_loss=0.00544] Steps: 65%|██████▍ | 1293/2000 [27:31<09:19, 1.26it/s, lr=0.0001, step_loss=0.00544]Steps: 65%|██████▍ | 1293/2000 [27:31<09:19, 1.26it/s, lr=0.0001, step_loss=0.0541] Steps: 65%|██████▍ | 1294/2000 [27:31<09:12, 1.28it/s, lr=0.0001, step_loss=0.0541]Steps: 65%|██████▍ | 1294/2000 [27:31<09:12, 1.28it/s, lr=0.0001, step_loss=0.0337]Steps: 65%|██████▍ | 1295/2000 [27:32<09:07, 1.29it/s, lr=0.0001, step_loss=0.0337]Steps: 65%|██████▍ | 1295/2000 [27:32<09:07, 1.29it/s, lr=0.0001, step_loss=0.0258]Steps: 65%|██████▍ | 1296/2000 [27:33<09:03, 1.30it/s, lr=0.0001, step_loss=0.0258]Steps: 65%|██████▍ | 1296/2000 [27:33<09:03, 1.30it/s, lr=0.0001, step_loss=0.348] Steps: 65%|██████▍ | 1297/2000 [27:34<09:00, 1.30it/s, lr=0.0001, step_loss=0.348]Steps: 65%|██████▍ | 1297/2000 [27:34<09:00, 1.30it/s, lr=0.0001, step_loss=0.00209]Steps: 65%|██████▍ | 1298/2000 [27:34<08:58, 1.30it/s, lr=0.0001, step_loss=0.00209]Steps: 65%|██████▍ | 1298/2000 [27:34<08:58, 1.30it/s, lr=0.0001, step_loss=0.00299]Steps: 65%|██████▍ | 1299/2000 [27:35<08:56, 1.31it/s, lr=0.0001, step_loss=0.00299]Steps: 65%|██████▍ | 1299/2000 [27:35<08:56, 1.31it/s, lr=0.0001, step_loss=0.168] Steps: 65%|██████▌ | 1300/2000 [27:36<08:54, 1.31it/s, lr=0.0001, step_loss=0.168]Steps: 65%|██████▌ | 1300/2000 [27:36<08:54, 1.31it/s, lr=0.0001, step_loss=0.000781]Steps: 65%|██████▌ | 1301/2000 [27:37<08:53, 1.31it/s, lr=0.0001, step_loss=0.000781]Steps: 65%|██████▌ | 1301/2000 [27:37<08:53, 1.31it/s, lr=0.0001, step_loss=0.00197] Steps: 65%|██████▌ | 1302/2000 [27:37<08:52, 1.31it/s, lr=0.0001, step_loss=0.00197]Steps: 65%|██████▌ | 1302/2000 [27:37<08:52, 1.31it/s, lr=0.0001, step_loss=0.0493] Steps: 65%|██████▌ | 1303/2000 [27:38<08:50, 1.31it/s, lr=0.0001, step_loss=0.0493]Steps: 65%|██████▌ | 1303/2000 [27:38<08:50, 1.31it/s, lr=0.0001, step_loss=0.00107]Steps: 65%|██████▌ | 1304/2000 [27:39<08:49, 1.31it/s, lr=0.0001, step_loss=0.00107]Steps: 65%|██████▌ | 1304/2000 [27:39<08:49, 1.31it/s, 
lr=0.0001, step_loss=0.00318]Steps: 65%|██████▌ | 1305/2000 [27:40<08:49, 1.31it/s, lr=0.0001, step_loss=0.00318]Steps: 65%|██████▌ | 1305/2000 [27:40<08:49, 1.31it/s, lr=0.0001, step_loss=0.00255]Steps: 65%|██████▌ | 1306/2000 [27:40<08:48, 1.31it/s, lr=0.0001, step_loss=0.00255]Steps: 65%|██████▌ | 1306/2000 [27:41<08:48, 1.31it/s, lr=0.0001, step_loss=0.0298] Steps: 65%|██████▌ | 1307/2000 [27:41<08:47, 1.31it/s, lr=0.0001, step_loss=0.0298]Steps: 65%|██████▌ | 1307/2000 [27:41<08:47, 1.31it/s, lr=0.0001, step_loss=0.0253]Steps: 65%|██████▌ | 1308/2000 [27:42<08:47, 1.31it/s, lr=0.0001, step_loss=0.0253]Steps: 65%|██████▌ | 1308/2000 [27:42<08:47, 1.31it/s, lr=0.0001, step_loss=0.0153]Steps: 65%|██████▌ | 1309/2000 [27:43<08:46, 1.31it/s, lr=0.0001, step_loss=0.0153]Steps: 65%|██████▌ | 1309/2000 [27:43<08:46, 1.31it/s, lr=0.0001, step_loss=0.0217]Steps: 66%|██████▌ | 1310/2000 [27:44<08:46, 1.31it/s, lr=0.0001, step_loss=0.0217]Steps: 66%|██████▌ | 1310/2000 [27:44<08:46, 1.31it/s, lr=0.0001, step_loss=0.134] Steps: 66%|██████▌ | 1311/2000 [27:44<08:46, 1.31it/s, lr=0.0001, step_loss=0.134]Steps: 66%|██████▌ | 1311/2000 [27:44<08:46, 1.31it/s, lr=0.0001, step_loss=0.00524]Steps: 66%|██████▌ | 1312/2000 [27:45<08:46, 1.31it/s, lr=0.0001, step_loss=0.00524]11/14/2025 06:36:31 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1312)
Steps: 66%|██████▌ | 1312/2000 [27:51<08:46, 1.31it/s, lr=0.0001, step_loss=0.00205]
11/14/2025 06:36:31 - INFO - root - ### DEBUG: Finished epoch 40, epoch_steps=32, global_step=1312
11/14/2025 06:36:31 - INFO - root - ### DEBUG: Starting epoch 41/63, global_step=1312, max_train_steps=2000
Steps: 66%|██████▌ | 1313/2000 [27:52<30:34, 2.67s/it, lr=0.0001, step_loss=0.00205]Steps: 66%|██████▌ | 1313/2000 [27:52<30:34, 2.67s/it, lr=0.0001, step_loss=0.00123]Steps: 66%|██████▌ | 1314/2000 [27:53<23:58, 2.10s/it, lr=0.0001, step_loss=0.00123]Steps: 66%|██████▌ | 1314/2000 [27:53<23:58, 2.10s/it, lr=0.0001, step_loss=0.00117]Steps: 66%|██████▌ | 1315/2000 [27:54<19:22, 1.70s/it, lr=0.0001, step_loss=0.00117]Steps: 66%|██████▌ | 1315/2000 [27:54<19:22, 1.70s/it, lr=0.0001, step_loss=0.0484] Steps: 66%|██████▌ | 1316/2000 [27:54<16:09, 1.42s/it, lr=0.0001, step_loss=0.0484]Steps: 66%|██████▌ | 1316/2000 [27:55<16:09, 1.42s/it, lr=0.0001, step_loss=0.0272]Steps: 66%|██████▌ | 1317/2000 [27:55<13:53, 1.22s/it, lr=0.0001, step_loss=0.0272]Steps: 66%|██████▌ | 1317/2000 [27:55<13:53, 1.22s/it, lr=0.0001, step_loss=0.000655]Steps: 66%|██████▌ | 1318/2000 [27:56<12:18, 1.08s/it, lr=0.0001, step_loss=0.000655]Steps: 66%|██████▌ | 1318/2000 [27:56<12:18, 1.08s/it, lr=0.0001, step_loss=0.0275] Steps: 66%|██████▌ | 1319/2000 [27:57<11:12, 1.01it/s, lr=0.0001, step_loss=0.0275]Steps: 66%|██████▌ | 1319/2000 [27:57<11:12, 1.01it/s, lr=0.0001, step_loss=0.0663]Steps: 66%|██████▌ | 1320/2000 [27:58<10:25, 1.09it/s, lr=0.0001, step_loss=0.0663]Steps: 66%|██████▌ | 1320/2000 [27:58<10:25, 1.09it/s, lr=0.0001, step_loss=0.00707]Steps: 66%|██████▌ | 1321/2000 [27:58<09:51, 1.15it/s, lr=0.0001, step_loss=0.00707]Steps: 66%|██████▌ | 1321/2000 [27:58<09:51, 1.15it/s, lr=0.0001, step_loss=0.0104] Steps: 66%|██████▌ | 1322/2000 [27:59<09:28, 1.19it/s, lr=0.0001, step_loss=0.0104]Steps: 66%|██████▌ | 1322/2000 [27:59<09:28, 1.19it/s, lr=0.0001, step_loss=0.0128]Steps: 66%|██████▌ | 1323/2000 [28:00<09:12, 1.23it/s, lr=0.0001, step_loss=0.0128]Steps: 66%|██████▌ | 1323/2000 [28:00<09:12, 1.23it/s, lr=0.0001, step_loss=0.00896]Steps: 66%|██████▌ | 1324/2000 [28:01<09:00, 1.25it/s, lr=0.0001, step_loss=0.00896]Steps: 66%|██████▌ | 1324/2000 [28:01<09:00, 1.25it/s, lr=0.0001, 
step_loss=0.299] Steps: 66%|██████▋ | 1325/2000 [28:01<08:52, 1.27it/s, lr=0.0001, step_loss=0.299]Steps: 66%|██████▋ | 1325/2000 [28:01<08:52, 1.27it/s, lr=0.0001, step_loss=0.0493]Steps: 66%|██████▋ | 1326/2000 [28:02<08:46, 1.28it/s, lr=0.0001, step_loss=0.0493]Steps: 66%|██████▋ | 1326/2000 [28:02<08:46, 1.28it/s, lr=0.0001, step_loss=0.000709]Steps: 66%|██████▋ | 1327/2000 [28:03<08:42, 1.29it/s, lr=0.0001, step_loss=0.000709]Steps: 66%|██████▋ | 1327/2000 [28:03<08:42, 1.29it/s, lr=0.0001, step_loss=0.0549] Steps: 66%|██████▋ | 1328/2000 [28:04<08:38, 1.30it/s, lr=0.0001, step_loss=0.0549]Steps: 66%|██████▋ | 1328/2000 [28:04<08:38, 1.30it/s, lr=0.0001, step_loss=0.0353]Steps: 66%|██████▋ | 1329/2000 [28:04<08:35, 1.30it/s, lr=0.0001, step_loss=0.0353]Steps: 66%|██████▋ | 1329/2000 [28:04<08:35, 1.30it/s, lr=0.0001, step_loss=0.000814]Steps: 66%|██████▋ | 1330/2000 [28:05<08:33, 1.30it/s, lr=0.0001, step_loss=0.000814]Steps: 66%|██████▋ | 1330/2000 [28:05<08:33, 1.30it/s, lr=0.0001, step_loss=0.112] Steps: 67%|██████▋ | 1331/2000 [28:06<08:31, 1.31it/s, lr=0.0001, step_loss=0.112]Steps: 67%|██████▋ | 1331/2000 [28:06<08:31, 1.31it/s, lr=0.0001, step_loss=0.00161]Steps: 67%|██████▋ | 1332/2000 [28:07<08:30, 1.31it/s, lr=0.0001, step_loss=0.00161]Steps: 67%|██████▋ | 1332/2000 [28:07<08:30, 1.31it/s, lr=0.0001, step_loss=0.131] Steps: 67%|██████▋ | 1333/2000 [28:07<08:29, 1.31it/s, lr=0.0001, step_loss=0.131]Steps: 67%|██████▋ | 1333/2000 [28:07<08:29, 1.31it/s, lr=0.0001, step_loss=0.00263]Steps: 67%|██████▋ | 1334/2000 [28:08<08:28, 1.31it/s, lr=0.0001, step_loss=0.00263]Steps: 67%|██████▋ | 1334/2000 [28:08<08:28, 1.31it/s, lr=0.0001, step_loss=0.00809]Steps: 67%|██████▋ | 1335/2000 [28:09<08:27, 1.31it/s, lr=0.0001, step_loss=0.00809]Steps: 67%|██████▋ | 1335/2000 [28:09<08:27, 1.31it/s, lr=0.0001, step_loss=0.328] Steps: 67%|██████▋ | 1336/2000 [28:10<08:27, 1.31it/s, lr=0.0001, step_loss=0.328]Steps: 67%|██████▋ | 1336/2000 [28:10<08:27, 1.31it/s, 
lr=0.0001, step_loss=0.0405]Steps: 67%|██████▋ | 1337/2000 [28:10<08:26, 1.31it/s, lr=0.0001, step_loss=0.0405]Steps: 67%|██████▋ | 1337/2000 [28:11<08:26, 1.31it/s, lr=0.0001, step_loss=0.0458]Steps: 67%|██████▋ | 1338/2000 [28:11<08:25, 1.31it/s, lr=0.0001, step_loss=0.0458]Steps: 67%|██████▋ | 1338/2000 [28:11<08:25, 1.31it/s, lr=0.0001, step_loss=0.0964]Steps: 67%|██████▋ | 1339/2000 [28:12<08:24, 1.31it/s, lr=0.0001, step_loss=0.0964]Steps: 67%|██████▋ | 1339/2000 [28:12<08:24, 1.31it/s, lr=0.0001, step_loss=0.0338]Steps: 67%|██████▋ | 1340/2000 [28:13<08:23, 1.31it/s, lr=0.0001, step_loss=0.0338]Steps: 67%|██████▋ | 1340/2000 [28:13<08:23, 1.31it/s, lr=0.0001, step_loss=0.0025]Steps: 67%|██████▋ | 1341/2000 [28:14<08:22, 1.31it/s, lr=0.0001, step_loss=0.0025]Steps: 67%|██████▋ | 1341/2000 [28:14<08:22, 1.31it/s, lr=0.0001, step_loss=0.0464]Steps: 67%|██████▋ | 1342/2000 [28:14<08:21, 1.31it/s, lr=0.0001, step_loss=0.0464]Steps: 67%|██████▋ | 1342/2000 [28:14<08:21, 1.31it/s, lr=0.0001, step_loss=0.0598]Steps: 67%|██████▋ | 1343/2000 [28:15<08:20, 1.31it/s, lr=0.0001, step_loss=0.0598]Steps: 67%|██████▋ | 1343/2000 [28:15<08:20, 1.31it/s, lr=0.0001, step_loss=0.00287]Steps: 67%|██████▋ | 1344/2000 [28:16<08:19, 1.31it/s, lr=0.0001, step_loss=0.00287]11/14/2025 06:37:02 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1344)
Steps: 67%|██████▋ | 1344/2000 [28:22<08:19, 1.31it/s, lr=0.0001, step_loss=0.00122]
11/14/2025 06:37:02 - INFO - root - ### DEBUG: Finished epoch 41, epoch_steps=32, global_step=1344
11/14/2025 06:37:02 - INFO - root - ### DEBUG: Starting epoch 42/63, global_step=1344, max_train_steps=2000
Steps: 67%|██████▋ | 1345/2000 [28:23<28:29, 2.61s/it, lr=0.0001, step_loss=0.00122]Steps: 67%|██████▋ | 1345/2000 [28:23<28:29, 2.61s/it, lr=0.0001, step_loss=0.0499] Steps: 67%|██████▋ | 1346/2000 [28:24<22:23, 2.06s/it, lr=0.0001, step_loss=0.0499]Steps: 67%|██████▋ | 1346/2000 [28:24<22:23, 2.06s/it, lr=0.0001, step_loss=0.0428]Steps: 67%|██████▋ | 1347/2000 [28:24<18:08, 1.67s/it, lr=0.0001, step_loss=0.0428]Steps: 67%|██████▋ | 1347/2000 [28:24<18:08, 1.67s/it, lr=0.0001, step_loss=0.474] Steps: 67%|██████▋ | 1348/2000 [28:25<15:09, 1.40s/it, lr=0.0001, step_loss=0.474]Steps: 67%|██████▋ | 1348/2000 [28:25<15:09, 1.40s/it, lr=0.0001, step_loss=0.386]Steps: 67%|██████▋ | 1349/2000 [28:26<13:04, 1.21s/it, lr=0.0001, step_loss=0.386]Steps: 67%|██████▋ | 1349/2000 [28:26<13:04, 1.21s/it, lr=0.0001, step_loss=0.000469]Steps: 68%|██████▊ | 1350/2000 [28:27<11:36, 1.07s/it, lr=0.0001, step_loss=0.000469]Steps: 68%|██████▊ | 1350/2000 [28:27<11:36, 1.07s/it, lr=0.0001, step_loss=0.000669]Steps: 68%|██████▊ | 1351/2000 [28:27<10:35, 1.02it/s, lr=0.0001, step_loss=0.000669]Steps: 68%|██████▊ | 1351/2000 [28:27<10:35, 1.02it/s, lr=0.0001, step_loss=0.249] Steps: 68%|██████▊ | 1352/2000 [28:28<09:52, 1.09it/s, lr=0.0001, step_loss=0.249]Steps: 68%|██████▊ | 1352/2000 [28:28<09:52, 1.09it/s, lr=0.0001, step_loss=0.237]Steps: 68%|██████▊ | 1353/2000 [28:29<09:21, 1.15it/s, lr=0.0001, step_loss=0.237]Steps: 68%|██████▊ | 1353/2000 [28:29<09:21, 1.15it/s, lr=0.0001, step_loss=0.00154]Steps: 68%|██████▊ | 1354/2000 [28:30<09:00, 1.20it/s, lr=0.0001, step_loss=0.00154]Steps: 68%|██████▊ | 1354/2000 [28:30<09:00, 1.20it/s, lr=0.0001, step_loss=0.00698]Steps: 68%|██████▊ | 1355/2000 [28:30<08:44, 1.23it/s, lr=0.0001, step_loss=0.00698]Steps: 68%|██████▊ | 1355/2000 [28:30<08:44, 1.23it/s, lr=0.0001, step_loss=0.0868] Steps: 68%|██████▊ | 1356/2000 [28:31<08:33, 1.25it/s, lr=0.0001, step_loss=0.0868]Steps: 68%|██████▊ | 1356/2000 [28:31<08:33, 1.25it/s, lr=0.0001, 
step_loss=0.000921]Steps: 68%|██████▊ | 1357/2000 [28:32<08:26, 1.27it/s, lr=0.0001, step_loss=0.000921]Steps: 68%|██████▊ | 1357/2000 [28:32<08:26, 1.27it/s, lr=0.0001, step_loss=0.29] Steps: 68%|██████▊ | 1358/2000 [28:33<08:20, 1.28it/s, lr=0.0001, step_loss=0.29]Steps: 68%|██████▊ | 1358/2000 [28:33<08:20, 1.28it/s, lr=0.0001, step_loss=0.00681]Steps: 68%|██████▊ | 1359/2000 [28:33<08:16, 1.29it/s, lr=0.0001, step_loss=0.00681]Steps: 68%|██████▊ | 1359/2000 [28:33<08:16, 1.29it/s, lr=0.0001, step_loss=0.0223] Steps: 68%|██████▊ | 1360/2000 [28:34<08:12, 1.30it/s, lr=0.0001, step_loss=0.0223]Steps: 68%|██████▊ | 1360/2000 [28:34<08:12, 1.30it/s, lr=0.0001, step_loss=0.04] Steps: 68%|██████▊ | 1361/2000 [28:35<08:10, 1.30it/s, lr=0.0001, step_loss=0.04]Steps: 68%|██████▊ | 1361/2000 [28:35<08:10, 1.30it/s, lr=0.0001, step_loss=0.0637]Steps: 68%|██████▊ | 1362/2000 [28:36<08:08, 1.31it/s, lr=0.0001, step_loss=0.0637]Steps: 68%|██████▊ | 1362/2000 [28:36<08:08, 1.31it/s, lr=0.0001, step_loss=0.00496]Steps: 68%|██████▊ | 1363/2000 [28:36<08:06, 1.31it/s, lr=0.0001, step_loss=0.00496]Steps: 68%|██████▊ | 1363/2000 [28:36<08:06, 1.31it/s, lr=0.0001, step_loss=0.00205]Steps: 68%|██████▊ | 1364/2000 [28:37<08:05, 1.31it/s, lr=0.0001, step_loss=0.00205]Steps: 68%|██████▊ | 1364/2000 [28:37<08:05, 1.31it/s, lr=0.0001, step_loss=0.0459] Steps: 68%|██████▊ | 1365/2000 [28:38<08:05, 1.31it/s, lr=0.0001, step_loss=0.0459]Steps: 68%|██████▊ | 1365/2000 [28:38<08:05, 1.31it/s, lr=0.0001, step_loss=0.173] Steps: 68%|██████▊ | 1366/2000 [28:39<08:04, 1.31it/s, lr=0.0001, step_loss=0.173]Steps: 68%|██████▊ | 1366/2000 [28:39<08:04, 1.31it/s, lr=0.0001, step_loss=0.0182]Steps: 68%|██████▊ | 1367/2000 [28:40<08:02, 1.31it/s, lr=0.0001, step_loss=0.0182]Steps: 68%|██████▊ | 1367/2000 [28:40<08:02, 1.31it/s, lr=0.0001, step_loss=0.0103]Steps: 68%|██████▊ | 1368/2000 [28:40<08:01, 1.31it/s, lr=0.0001, step_loss=0.0103]Steps: 68%|██████▊ | 1368/2000 [28:40<08:01, 1.31it/s, lr=0.0001, 
step_loss=0.00497]Steps: 68%|██████▊ | 1369/2000 [28:41<08:00, 1.31it/s, lr=0.0001, step_loss=0.00497]Steps: 68%|██████▊ | 1369/2000 [28:41<08:00, 1.31it/s, lr=0.0001, step_loss=0.0416] Steps: 68%|██████▊ | 1370/2000 [28:42<08:00, 1.31it/s, lr=0.0001, step_loss=0.0416]Steps: 68%|██████▊ | 1370/2000 [28:42<08:00, 1.31it/s, lr=0.0001, step_loss=0.00119]Steps: 69%|██████▊ | 1371/2000 [28:43<07:59, 1.31it/s, lr=0.0001, step_loss=0.00119]Steps: 69%|██████▊ | 1371/2000 [28:43<07:59, 1.31it/s, lr=0.0001, step_loss=0.108] Steps: 69%|██████▊ | 1372/2000 [28:43<07:58, 1.31it/s, lr=0.0001, step_loss=0.108]Steps: 69%|██████▊ | 1372/2000 [28:43<07:58, 1.31it/s, lr=0.0001, step_loss=0.0987]Steps: 69%|██████▊ | 1373/2000 [28:44<07:57, 1.31it/s, lr=0.0001, step_loss=0.0987]Steps: 69%|██████▊ | 1373/2000 [28:44<07:57, 1.31it/s, lr=0.0001, step_loss=0.13] Steps: 69%|██████▊ | 1374/2000 [28:45<07:57, 1.31it/s, lr=0.0001, step_loss=0.13]Steps: 69%|██████▊ | 1374/2000 [28:45<07:57, 1.31it/s, lr=0.0001, step_loss=0.0602]Steps: 69%|██████▉ | 1375/2000 [28:46<07:57, 1.31it/s, lr=0.0001, step_loss=0.0602]Steps: 69%|██████▉ | 1375/2000 [28:46<07:57, 1.31it/s, lr=0.0001, step_loss=0.0195]Steps: 69%|██████▉ | 1376/2000 [28:46<07:56, 1.31it/s, lr=0.0001, step_loss=0.0195]11/14/2025 06:37:33 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1376)
Steps: 69%|██████▉ | 1376/2000 [28:54<07:56, 1.31it/s, lr=0.0001, step_loss=0.000522]11/14/2025 06:37:33 - INFO - root - ### DEBUG: Finished epoch 42, epoch_steps=32, global_step=1376
11/14/2025 06:37:33 - INFO - root - ### DEBUG: Starting epoch 43/63, global_step=1376, max_train_steps=2000
Steps: 69%|██████▉ | 1377/2000 [28:55<31:00, 2.99s/it, lr=0.0001, step_loss=0.000522]Steps: 69%|██████▉ | 1377/2000 [28:55<31:00, 2.99s/it, lr=0.0001, step_loss=0.0401] Steps: 69%|██████▉ | 1378/2000 [28:55<24:02, 2.32s/it, lr=0.0001, step_loss=0.0401]Steps: 69%|██████▉ | 1378/2000 [28:55<24:02, 2.32s/it, lr=0.0001, step_loss=0.0779]Steps: 69%|██████▉ | 1379/2000 [28:56<19:10, 1.85s/it, lr=0.0001, step_loss=0.0779]Steps: 69%|██████▉ | 1379/2000 [28:56<19:10, 1.85s/it, lr=0.0001, step_loss=0.0868]Steps: 69%|██████▉ | 1380/2000 [28:57<15:46, 1.53s/it, lr=0.0001, step_loss=0.0868]Steps: 69%|██████▉ | 1380/2000 [28:57<15:46, 1.53s/it, lr=0.0001, step_loss=0.00619]Steps: 69%|██████▉ | 1381/2000 [28:58<13:23, 1.30s/it, lr=0.0001, step_loss=0.00619]Steps: 69%|██████▉ | 1381/2000 [28:58<13:23, 1.30s/it, lr=0.0001, step_loss=0.00696]Steps: 69%|██████▉ | 1382/2000 [28:58<11:43, 1.14s/it, lr=0.0001, step_loss=0.00696]Steps: 69%|██████▉ | 1382/2000 [28:58<11:43, 1.14s/it, lr=0.0001, step_loss=0.00483]Steps: 69%|██████▉ | 1383/2000 [28:59<10:33, 1.03s/it, lr=0.0001, step_loss=0.00483]Steps: 69%|██████▉ | 1383/2000 [28:59<10:33, 1.03s/it, lr=0.0001, step_loss=0.00307]Steps: 69%|██████▉ | 1384/2000 [29:00<09:43, 1.06it/s, lr=0.0001, step_loss=0.00307]Steps: 69%|██████▉ | 1384/2000 [29:00<09:43, 1.06it/s, lr=0.0001, step_loss=0.124] Steps: 69%|██████▉ | 1385/2000 [29:01<09:09, 1.12it/s, lr=0.0001, step_loss=0.124]Steps: 69%|██████▉ | 1385/2000 [29:01<09:09, 1.12it/s, lr=0.0001, step_loss=0.00438]Steps: 69%|██████▉ | 1386/2000 [29:01<08:44, 1.17it/s, lr=0.0001, step_loss=0.00438]Steps: 69%|██████▉ | 1386/2000 [29:01<08:44, 1.17it/s, lr=0.0001, step_loss=0.000754]Steps: 69%|██████▉ | 1387/2000 [29:02<08:27, 1.21it/s, lr=0.0001, step_loss=0.000754]Steps: 69%|██████▉ | 1387/2000 [29:02<08:27, 1.21it/s, lr=0.0001, step_loss=0.00082] Steps: 69%|██████▉ | 1388/2000 [29:03<08:14, 1.24it/s, lr=0.0001, step_loss=0.00082]Steps: 69%|██████▉ | 1388/2000 [29:03<08:14, 1.24it/s, lr=0.0001, 
step_loss=0.000386]Steps: 69%|██████▉ | 1389/2000 [29:04<08:05, 1.26it/s, lr=0.0001, step_loss=0.000386]Steps: 69%|██████▉ | 1389/2000 [29:04<08:05, 1.26it/s, lr=0.0001, step_loss=0.000457]Steps: 70%|██████▉ | 1390/2000 [29:04<07:59, 1.27it/s, lr=0.0001, step_loss=0.000457]Steps: 70%|██████▉ | 1390/2000 [29:05<07:59, 1.27it/s, lr=0.0001, step_loss=0.00108] Steps: 70%|██████▉ | 1391/2000 [29:05<07:54, 1.28it/s, lr=0.0001, step_loss=0.00108]Steps: 70%|██████▉ | 1391/2000 [29:05<07:54, 1.28it/s, lr=0.0001, step_loss=0.229] Steps: 70%|██████▉ | 1392/2000 [29:06<07:51, 1.29it/s, lr=0.0001, step_loss=0.229]Steps: 70%|██████▉ | 1392/2000 [29:06<07:51, 1.29it/s, lr=0.0001, step_loss=0.0249]Steps: 70%|██████▉ | 1393/2000 [29:07<07:48, 1.30it/s, lr=0.0001, step_loss=0.0249]Steps: 70%|██████▉ | 1393/2000 [29:07<07:48, 1.30it/s, lr=0.0001, step_loss=0.068] Steps: 70%|██████▉ | 1394/2000 [29:08<07:46, 1.30it/s, lr=0.0001, step_loss=0.068]Steps: 70%|██████▉ | 1394/2000 [29:08<07:46, 1.30it/s, lr=0.0001, step_loss=0.00186]Steps: 70%|██████▉ | 1395/2000 [29:08<07:44, 1.30it/s, lr=0.0001, step_loss=0.00186]Steps: 70%|██████▉ | 1395/2000 [29:08<07:44, 1.30it/s, lr=0.0001, step_loss=0.00784]Steps: 70%|██████▉ | 1396/2000 [29:09<07:42, 1.30it/s, lr=0.0001, step_loss=0.00784]Steps: 70%|██████▉ | 1396/2000 [29:09<07:42, 1.30it/s, lr=0.0001, step_loss=0.0601] Steps: 70%|██████▉ | 1397/2000 [29:10<07:41, 1.31it/s, lr=0.0001, step_loss=0.0601]Steps: 70%|██████▉ | 1397/2000 [29:10<07:41, 1.31it/s, lr=0.0001, step_loss=0.0477]Steps: 70%|██████▉ | 1398/2000 [29:11<07:40, 1.31it/s, lr=0.0001, step_loss=0.0477]Steps: 70%|██████▉ | 1398/2000 [29:11<07:40, 1.31it/s, lr=0.0001, step_loss=0.00129]Steps: 70%|██████▉ | 1399/2000 [29:11<07:39, 1.31it/s, lr=0.0001, step_loss=0.00129]Steps: 70%|██████▉ | 1399/2000 [29:11<07:39, 1.31it/s, lr=0.0001, step_loss=0.00948]Steps: 70%|███████ | 1400/2000 [29:12<07:39, 1.31it/s, lr=0.0001, step_loss=0.00948]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.18it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.17it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.17it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
100%|██████████| 8/8 [00:00<00:00, 32.19it/s]
11/14/2025 06:38:56 - INFO - root - Saved samples to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/samples/sample-1400.gif
Steps: 70%|███████ | 1400/2000 [30:16<07:39, 1.31it/s, lr=0.0001, step_loss=0.198] Steps: 70%|███████ | 1401/2000 [30:17<3:19:41, 20.00s/it, lr=0.0001, step_loss=0.198]Steps: 70%|███████ | 1401/2000 [30:17<3:19:41, 20.00s/it, lr=0.0001, step_loss=0.0374]Steps: 70%|███████ | 1402/2000 [30:18<2:21:49, 14.23s/it, lr=0.0001, step_loss=0.0374]Steps: 70%|███████ | 1402/2000 [30:18<2:21:49, 14.23s/it, lr=0.0001, step_loss=0.171] Steps: 70%|███████ | 1403/2000 [30:19<1:41:23, 10.19s/it, lr=0.0001, step_loss=0.171]Steps: 70%|███████ | 1403/2000 [30:19<1:41:23, 10.19s/it, lr=0.0001, step_loss=0.0013]Steps: 70%|███████ | 1404/2000 [30:19<1:13:07, 7.36s/it, lr=0.0001, step_loss=0.0013]Steps: 70%|███████ | 1404/2000 [30:19<1:13:07, 7.36s/it, lr=0.0001, step_loss=0.00559]Steps: 70%|███████ | 1405/2000 [30:20<53:22, 5.38s/it, lr=0.0001, step_loss=0.00559] Steps: 70%|███████ | 1405/2000 [30:20<53:22, 5.38s/it, lr=0.0001, step_loss=0.0642] Steps: 70%|███████ | 1406/2000 [30:21<39:33, 4.00s/it, lr=0.0001, step_loss=0.0642]Steps: 70%|███████ | 1406/2000 [30:21<39:33, 4.00s/it, lr=0.0001, step_loss=0.00181]Steps: 70%|███████ | 1407/2000 [30:22<29:54, 3.03s/it, lr=0.0001, step_loss=0.00181]Steps: 70%|███████ | 1407/2000 [30:22<29:54, 3.03s/it, lr=0.0001, step_loss=0.00294]Steps: 70%|███████ | 1408/2000 [30:22<23:09, 2.35s/it, lr=0.0001, step_loss=0.00294]11/14/2025 06:39:09 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1408)
Steps: 70%|███████ | 1408/2000 [30:29<23:09, 2.35s/it, lr=0.0001, step_loss=0.0044] 11/14/2025 06:39:09 - INFO - root - ### DEBUG: Finished epoch 43, epoch_steps=32, global_step=1408
11/14/2025 06:39:09 - INFO - root - ### DEBUG: Starting epoch 44/63, global_step=1408, max_train_steps=2000
Steps: 70%|███████ | 1409/2000 [30:30<37:49, 3.84s/it, lr=0.0001, step_loss=0.0044]Steps: 70%|███████ | 1409/2000 [30:30<37:49, 3.84s/it, lr=0.0001, step_loss=0.00563]Steps: 70%|███████ | 1410/2000 [30:30<28:41, 2.92s/it, lr=0.0001, step_loss=0.00563]Steps: 70%|███████ | 1410/2000 [30:30<28:41, 2.92s/it, lr=0.0001, step_loss=0.000783]Steps: 71%|███████ | 1411/2000 [30:31<22:17, 2.27s/it, lr=0.0001, step_loss=0.000783]Steps: 71%|███████ | 1411/2000 [30:31<22:17, 2.27s/it, lr=0.0001, step_loss=0.00551] Steps: 71%|███████ | 1412/2000 [30:32<17:48, 1.82s/it, lr=0.0001, step_loss=0.00551]Steps: 71%|███████ | 1412/2000 [30:32<17:48, 1.82s/it, lr=0.0001, step_loss=0.00204]Steps: 71%|███████ | 1413/2000 [30:33<14:41, 1.50s/it, lr=0.0001, step_loss=0.00204]Steps: 71%|███████ | 1413/2000 [30:33<14:41, 1.50s/it, lr=0.0001, step_loss=0.000709]Steps: 71%|███████ | 1414/2000 [30:33<12:30, 1.28s/it, lr=0.0001, step_loss=0.000709]Steps: 71%|███████ | 1414/2000 [30:34<12:30, 1.28s/it, lr=0.0001, step_loss=0.0196] Steps: 71%|███████ | 1415/2000 [30:34<10:58, 1.13s/it, lr=0.0001, step_loss=0.0196]Steps: 71%|███████ | 1415/2000 [30:34<10:58, 1.13s/it, lr=0.0001, step_loss=0.0282]Steps: 71%|███████ | 1416/2000 [30:35<09:53, 1.02s/it, lr=0.0001, step_loss=0.0282]Steps: 71%|███████ | 1416/2000 [30:35<09:53, 1.02s/it, lr=0.0001, step_loss=0.0136]Steps: 71%|███████ | 1417/2000 [30:36<09:07, 1.06it/s, lr=0.0001, step_loss=0.0136]Steps: 71%|███████ | 1417/2000 [30:36<09:07, 1.06it/s, lr=0.0001, step_loss=0.102] Steps: 71%|███████ | 1418/2000 [30:37<08:35, 1.13it/s, lr=0.0001, step_loss=0.102]Steps: 71%|███████ | 1418/2000 [30:37<08:35, 1.13it/s, lr=0.0001, step_loss=0.014]Steps: 71%|███████ | 1419/2000 [30:37<08:13, 1.18it/s, lr=0.0001, step_loss=0.014]Steps: 71%|███████ | 1419/2000 [30:37<08:13, 1.18it/s, lr=0.0001, step_loss=0.00181]Steps: 71%|███████ | 1420/2000 [30:38<07:57, 1.22it/s, lr=0.0001, step_loss=0.00181]Steps: 71%|███████ | 1420/2000 [30:38<07:57, 1.22it/s, lr=0.0001, 
step_loss=0.0036] Steps: 71%|███████ | 1421/2000 [30:39<07:45, 1.24it/s, lr=0.0001, step_loss=0.0036]Steps: 71%|███████ | 1421/2000 [30:39<07:45, 1.24it/s, lr=0.0001, step_loss=0.0266]Steps: 71%|███████ | 1422/2000 [30:40<07:37, 1.26it/s, lr=0.0001, step_loss=0.0266]Steps: 71%|███████ | 1422/2000 [30:40<07:37, 1.26it/s, lr=0.0001, step_loss=0.000903]Steps: 71%|███████ | 1423/2000 [30:40<07:31, 1.28it/s, lr=0.0001, step_loss=0.000903]Steps: 71%|███████ | 1423/2000 [30:40<07:31, 1.28it/s, lr=0.0001, step_loss=0.00125] Steps: 71%|███████ | 1424/2000 [30:41<07:27, 1.29it/s, lr=0.0001, step_loss=0.00125]Steps: 71%|███████ | 1424/2000 [30:41<07:27, 1.29it/s, lr=0.0001, step_loss=0.0628] Steps: 71%|███████▏ | 1425/2000 [30:42<07:24, 1.29it/s, lr=0.0001, step_loss=0.0628]Steps: 71%|███████▏ | 1425/2000 [30:42<07:24, 1.29it/s, lr=0.0001, step_loss=0.144] Steps: 71%|███████▏ | 1426/2000 [30:43<07:21, 1.30it/s, lr=0.0001, step_loss=0.144]Steps: 71%|███████▏ | 1426/2000 [30:43<07:21, 1.30it/s, lr=0.0001, step_loss=0.000478]Steps: 71%|███████▏ | 1427/2000 [30:43<07:20, 1.30it/s, lr=0.0001, step_loss=0.000478]Steps: 71%|███████▏ | 1427/2000 [30:43<07:20, 1.30it/s, lr=0.0001, step_loss=0.0126] Steps: 71%|███████▏ | 1428/2000 [30:44<07:18, 1.31it/s, lr=0.0001, step_loss=0.0126]Steps: 71%|███████▏ | 1428/2000 [30:44<07:18, 1.31it/s, lr=0.0001, step_loss=0.0354]Steps: 71%|███████▏ | 1429/2000 [30:45<07:16, 1.31it/s, lr=0.0001, step_loss=0.0354]Steps: 71%|███████▏ | 1429/2000 [30:45<07:16, 1.31it/s, lr=0.0001, step_loss=0.0196]Steps: 72%|███████▏ | 1430/2000 [30:46<07:15, 1.31it/s, lr=0.0001, step_loss=0.0196]Steps: 72%|███████▏ | 1430/2000 [30:46<07:15, 1.31it/s, lr=0.0001, step_loss=0.011] Steps: 72%|███████▏ | 1431/2000 [30:46<07:14, 1.31it/s, lr=0.0001, step_loss=0.011]Steps: 72%|███████▏ | 1431/2000 [30:46<07:14, 1.31it/s, lr=0.0001, step_loss=0.00124]Steps: 72%|███████▏ | 1432/2000 [30:47<07:13, 1.31it/s, lr=0.0001, step_loss=0.00124]Steps: 72%|███████▏ | 1432/2000 
[30:47<07:13, 1.31it/s, lr=0.0001, step_loss=0.082] Steps: 72%|███████▏ | 1433/2000 [30:48<07:13, 1.31it/s, lr=0.0001, step_loss=0.082]Steps: 72%|███████▏ | 1433/2000 [30:48<07:13, 1.31it/s, lr=0.0001, step_loss=0.00567]Steps: 72%|███████▏ | 1434/2000 [30:49<07:12, 1.31it/s, lr=0.0001, step_loss=0.00567]Steps: 72%|███████▏ | 1434/2000 [30:49<07:12, 1.31it/s, lr=0.0001, step_loss=0.0212] Steps: 72%|███████▏ | 1435/2000 [30:49<07:11, 1.31it/s, lr=0.0001, step_loss=0.0212]Steps: 72%|███████▏ | 1435/2000 [30:50<07:11, 1.31it/s, lr=0.0001, step_loss=0.00563]Steps: 72%|███████▏ | 1436/2000 [30:50<07:09, 1.31it/s, lr=0.0001, step_loss=0.00563]Steps: 72%|███████▏ | 1436/2000 [30:50<07:09, 1.31it/s, lr=0.0001, step_loss=0.00946]Steps: 72%|███████▏ | 1437/2000 [30:51<07:09, 1.31it/s, lr=0.0001, step_loss=0.00946]Steps: 72%|███████▏ | 1437/2000 [30:51<07:09, 1.31it/s, lr=0.0001, step_loss=0.0996] Steps: 72%|███████▏ | 1438/2000 [30:52<07:08, 1.31it/s, lr=0.0001, step_loss=0.0996]Steps: 72%|███████▏ | 1438/2000 [30:52<07:08, 1.31it/s, lr=0.0001, step_loss=0.00163]Steps: 72%|███████▏ | 1439/2000 [30:53<07:07, 1.31it/s, lr=0.0001, step_loss=0.00163]Steps: 72%|███████▏ | 1439/2000 [30:53<07:07, 1.31it/s, lr=0.0001, step_loss=0.000638]Steps: 72%|███████▏ | 1440/2000 [30:53<07:06, 1.31it/s, lr=0.0001, step_loss=0.000638]11/14/2025 06:39:39 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1440)
Steps: 72%|███████▏ | 1440/2000 [31:00<07:06, 1.31it/s, lr=0.0001, step_loss=0.00603] 11/14/2025 06:39:39 - INFO - root - ### DEBUG: Finished epoch 44, epoch_steps=32, global_step=1440
11/14/2025 06:39:39 - INFO - root - ### DEBUG: Starting epoch 45/63, global_step=1440, max_train_steps=2000
Steps: 72%|███████▏ | 1441/2000 [31:00<24:33, 2.64s/it, lr=0.0001, step_loss=0.00603]Steps: 72%|███████▏ | 1441/2000 [31:00<24:33, 2.64s/it, lr=0.0001, step_loss=0.163] Steps: 72%|███████▏ | 1442/2000 [31:01<19:17, 2.07s/it, lr=0.0001, step_loss=0.163]Steps: 72%|███████▏ | 1442/2000 [31:01<19:17, 2.07s/it, lr=0.0001, step_loss=0.0508]Steps: 72%|███████▏ | 1443/2000 [31:02<15:36, 1.68s/it, lr=0.0001, step_loss=0.0508]Steps: 72%|███████▏ | 1443/2000 [31:02<15:36, 1.68s/it, lr=0.0001, step_loss=0.0324]Steps: 72%|███████▏ | 1444/2000 [31:03<13:01, 1.40s/it, lr=0.0001, step_loss=0.0324]Steps: 72%|███████▏ | 1444/2000 [31:03<13:01, 1.40s/it, lr=0.0001, step_loss=0.0172]Steps: 72%|███████▏ | 1445/2000 [31:03<11:12, 1.21s/it, lr=0.0001, step_loss=0.0172]Steps: 72%|███████▏ | 1445/2000 [31:03<11:12, 1.21s/it, lr=0.0001, step_loss=0.126] Steps: 72%|███████▏ | 1446/2000 [31:04<09:56, 1.08s/it, lr=0.0001, step_loss=0.126]Steps: 72%|███████▏ | 1446/2000 [31:04<09:56, 1.08s/it, lr=0.0001, step_loss=0.0554]Steps: 72%|███████▏ | 1447/2000 [31:05<09:03, 1.02it/s, lr=0.0001, step_loss=0.0554]Steps: 72%|███████▏ | 1447/2000 [31:05<09:03, 1.02it/s, lr=0.0001, step_loss=0.000998]Steps: 72%|███████▏ | 1448/2000 [31:06<08:25, 1.09it/s, lr=0.0001, step_loss=0.000998]Steps: 72%|███████▏ | 1448/2000 [31:06<08:25, 1.09it/s, lr=0.0001, step_loss=0.0273] Steps: 72%|███████▏ | 1449/2000 [31:06<07:58, 1.15it/s, lr=0.0001, step_loss=0.0273]Steps: 72%|███████▏ | 1449/2000 [31:06<07:58, 1.15it/s, lr=0.0001, step_loss=0.0219]Steps: 72%|███████▎ | 1450/2000 [31:07<07:40, 1.20it/s, lr=0.0001, step_loss=0.0219]Steps: 72%|███████▎ | 1450/2000 [31:07<07:40, 1.20it/s, lr=0.0001, step_loss=0.039] Steps: 73%|███████▎ | 1451/2000 [31:08<07:26, 1.23it/s, lr=0.0001, step_loss=0.039]Steps: 73%|███████▎ | 1451/2000 [31:08<07:26, 1.23it/s, lr=0.0001, step_loss=0.51] Steps: 73%|███████▎ | 1452/2000 [31:09<07:17, 1.25it/s, lr=0.0001, step_loss=0.51]Steps: 73%|███████▎ | 1452/2000 [31:09<07:17, 1.25it/s, lr=0.0001, 
step_loss=0.315]Steps: 73%|███████▎ | 1453/2000 [31:09<07:10, 1.27it/s, lr=0.0001, step_loss=0.315]Steps: 73%|███████▎ | 1453/2000 [31:09<07:10, 1.27it/s, lr=0.0001, step_loss=0.0053]Steps: 73%|███████▎ | 1454/2000 [31:10<07:05, 1.28it/s, lr=0.0001, step_loss=0.0053]Steps: 73%|███████▎ | 1454/2000 [31:10<07:05, 1.28it/s, lr=0.0001, step_loss=0.0269]Steps: 73%|███████▎ | 1455/2000 [31:11<07:01, 1.29it/s, lr=0.0001, step_loss=0.0269]Steps: 73%|███████▎ | 1455/2000 [31:11<07:01, 1.29it/s, lr=0.0001, step_loss=0.00507]Steps: 73%|███████▎ | 1456/2000 [31:12<06:58, 1.30it/s, lr=0.0001, step_loss=0.00507]Steps: 73%|███████▎ | 1456/2000 [31:12<06:58, 1.30it/s, lr=0.0001, step_loss=0.236] Steps: 73%|███████▎ | 1457/2000 [31:12<06:56, 1.30it/s, lr=0.0001, step_loss=0.236]Steps: 73%|███████▎ | 1457/2000 [31:13<06:56, 1.30it/s, lr=0.0001, step_loss=0.00394]Steps: 73%|███████▎ | 1458/2000 [31:13<06:55, 1.30it/s, lr=0.0001, step_loss=0.00394]Steps: 73%|███████▎ | 1458/2000 [31:13<06:55, 1.30it/s, lr=0.0001, step_loss=0.0876] Steps: 73%|███████▎ | 1459/2000 [31:14<06:54, 1.31it/s, lr=0.0001, step_loss=0.0876]Steps: 73%|███████▎ | 1459/2000 [31:14<06:54, 1.31it/s, lr=0.0001, step_loss=0.0529]Steps: 73%|███████▎ | 1460/2000 [31:15<06:52, 1.31it/s, lr=0.0001, step_loss=0.0529]Steps: 73%|███████▎ | 1460/2000 [31:15<06:52, 1.31it/s, lr=0.0001, step_loss=0.00489]Steps: 73%|███████▎ | 1461/2000 [31:16<06:51, 1.31it/s, lr=0.0001, step_loss=0.00489]Steps: 73%|███████▎ | 1461/2000 [31:16<06:51, 1.31it/s, lr=0.0001, step_loss=0.000659]Steps: 73%|███████▎ | 1462/2000 [31:16<06:50, 1.31it/s, lr=0.0001, step_loss=0.000659]Steps: 73%|███████▎ | 1462/2000 [31:16<06:50, 1.31it/s, lr=0.0001, step_loss=0.0186] Steps: 73%|███████▎ | 1463/2000 [31:17<06:49, 1.31it/s, lr=0.0001, step_loss=0.0186]Steps: 73%|███████▎ | 1463/2000 [31:17<06:49, 1.31it/s, lr=0.0001, step_loss=0.00119]Steps: 73%|███████▎ | 1464/2000 [31:18<06:48, 1.31it/s, lr=0.0001, step_loss=0.00119]Steps: 73%|███████▎ | 1464/2000 
[31:18<06:48, 1.31it/s, lr=0.0001, step_loss=0.394] Steps: 73%|███████▎ | 1465/2000 [31:19<06:48, 1.31it/s, lr=0.0001, step_loss=0.394]Steps: 73%|███████▎ | 1465/2000 [31:19<06:48, 1.31it/s, lr=0.0001, step_loss=0.351]Steps: 73%|███████▎ | 1466/2000 [31:19<06:47, 1.31it/s, lr=0.0001, step_loss=0.351]Steps: 73%|███████▎ | 1466/2000 [31:19<06:47, 1.31it/s, lr=0.0001, step_loss=0.0055]Steps: 73%|███████▎ | 1467/2000 [31:20<06:46, 1.31it/s, lr=0.0001, step_loss=0.0055]Steps: 73%|███████▎ | 1467/2000 [31:20<06:46, 1.31it/s, lr=0.0001, step_loss=0.00235]Steps: 73%|███████▎ | 1468/2000 [31:21<06:45, 1.31it/s, lr=0.0001, step_loss=0.00235]Steps: 73%|███████▎ | 1468/2000 [31:21<06:45, 1.31it/s, lr=0.0001, step_loss=0.0127] Steps: 73%|███████▎ | 1469/2000 [31:22<06:44, 1.31it/s, lr=0.0001, step_loss=0.0127]Steps: 73%|███████▎ | 1469/2000 [31:22<06:44, 1.31it/s, lr=0.0001, step_loss=0.0112]Steps: 74%|███████▎ | 1470/2000 [31:22<06:43, 1.31it/s, lr=0.0001, step_loss=0.0112]Steps: 74%|███████▎ | 1470/2000 [31:22<06:43, 1.31it/s, lr=0.0001, step_loss=0.022] Steps: 74%|███████▎ | 1471/2000 [31:23<06:42, 1.31it/s, lr=0.0001, step_loss=0.022]Steps: 74%|███████▎ | 1471/2000 [31:23<06:42, 1.31it/s, lr=0.0001, step_loss=0.0132]Steps: 74%|███████▎ | 1472/2000 [31:24<06:42, 1.31it/s, lr=0.0001, step_loss=0.0132]11/14/2025 06:40:10 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1472)
Steps: 74%|███████▎ | 1472/2000 [31:30<06:42, 1.31it/s, lr=0.0001, step_loss=0.215] 11/14/2025 06:40:10 - INFO - root - ### DEBUG: Finished epoch 45, epoch_steps=32, global_step=1472
11/14/2025 06:40:10 - INFO - root - ### DEBUG: Starting epoch 46/63, global_step=1472, max_train_steps=2000
Steps: 74%|███████▎ | 1473/2000 [31:31<23:28, 2.67s/it, lr=0.0001, step_loss=0.215]Steps: 74%|███████▎ | 1473/2000 [31:31<23:28, 2.67s/it, lr=0.0001, step_loss=0.00581]Steps: 74%|███████▎ | 1474/2000 [31:32<18:24, 2.10s/it, lr=0.0001, step_loss=0.00581]Steps: 74%|███████▎ | 1474/2000 [31:32<18:24, 2.10s/it, lr=0.0001, step_loss=0.255] Steps: 74%|███████▍ | 1475/2000 [31:33<14:51, 1.70s/it, lr=0.0001, step_loss=0.255]Steps: 74%|███████▍ | 1475/2000 [31:33<14:51, 1.70s/it, lr=0.0001, step_loss=0.00154]Steps: 74%|███████▍ | 1476/2000 [31:33<12:22, 1.42s/it, lr=0.0001, step_loss=0.00154]Steps: 74%|███████▍ | 1476/2000 [31:33<12:22, 1.42s/it, lr=0.0001, step_loss=0.00958]Steps: 74%|███████▍ | 1477/2000 [31:34<10:38, 1.22s/it, lr=0.0001, step_loss=0.00958]Steps: 74%|███████▍ | 1477/2000 [31:34<10:38, 1.22s/it, lr=0.0001, step_loss=0.0114] Steps: 74%|███████▍ | 1478/2000 [31:35<09:25, 1.08s/it, lr=0.0001, step_loss=0.0114]Steps: 74%|███████▍ | 1478/2000 [31:35<09:25, 1.08s/it, lr=0.0001, step_loss=0.00221]Steps: 74%|███████▍ | 1479/2000 [31:36<08:33, 1.01it/s, lr=0.0001, step_loss=0.00221]Steps: 74%|███████▍ | 1479/2000 [31:36<08:33, 1.01it/s, lr=0.0001, step_loss=0.00501]Steps: 74%|███████▍ | 1480/2000 [31:36<07:57, 1.09it/s, lr=0.0001, step_loss=0.00501]Steps: 74%|███████▍ | 1480/2000 [31:36<07:57, 1.09it/s, lr=0.0001, step_loss=0.000796]Steps: 74%|███████▍ | 1481/2000 [31:37<07:32, 1.15it/s, lr=0.0001, step_loss=0.000796]Steps: 74%|███████▍ | 1481/2000 [31:37<07:32, 1.15it/s, lr=0.0001, step_loss=0.0406] Steps: 74%|███████▍ | 1482/2000 [31:38<07:14, 1.19it/s, lr=0.0001, step_loss=0.0406]Steps: 74%|███████▍ | 1482/2000 [31:38<07:14, 1.19it/s, lr=0.0001, step_loss=0.000993]Steps: 74%|███████▍ | 1483/2000 [31:39<07:02, 1.22it/s, lr=0.0001, step_loss=0.000993]Steps: 74%|███████▍ | 1483/2000 [31:39<07:02, 1.22it/s, lr=0.0001, step_loss=0.12] Steps: 74%|███████▍ | 1484/2000 [31:40<07:03, 1.22it/s, lr=0.0001, step_loss=0.12]Steps: 74%|███████▍ | 1484/2000 [31:40<07:03, 
1.22it/s, lr=0.0001, step_loss=0.0345]Steps: 74%|███████▍ | 1485/2000 [31:40<06:53, 1.25it/s, lr=0.0001, step_loss=0.0345]Steps: 74%|███████▍ | 1485/2000 [31:40<06:53, 1.25it/s, lr=0.0001, step_loss=0.00335]Steps: 74%|███████▍ | 1486/2000 [31:41<06:46, 1.26it/s, lr=0.0001, step_loss=0.00335]Steps: 74%|███████▍ | 1486/2000 [31:41<06:46, 1.26it/s, lr=0.0001, step_loss=0.000673]Steps: 74%|███████▍ | 1487/2000 [31:42<06:41, 1.28it/s, lr=0.0001, step_loss=0.000673]Steps: 74%|███████▍ | 1487/2000 [31:42<06:41, 1.28it/s, lr=0.0001, step_loss=0.00284] Steps: 74%|███████▍ | 1488/2000 [31:43<06:37, 1.29it/s, lr=0.0001, step_loss=0.00284]Steps: 74%|███████▍ | 1488/2000 [31:43<06:37, 1.29it/s, lr=0.0001, step_loss=0.00083]Steps: 74%|███████▍ | 1489/2000 [31:43<06:34, 1.30it/s, lr=0.0001, step_loss=0.00083]Steps: 74%|███████▍ | 1489/2000 [31:43<06:34, 1.30it/s, lr=0.0001, step_loss=0.0988] Steps: 74%|███████▍ | 1490/2000 [31:44<06:32, 1.30it/s, lr=0.0001, step_loss=0.0988]Steps: 74%|███████▍ | 1490/2000 [31:44<06:32, 1.30it/s, lr=0.0001, step_loss=0.00136]Steps: 75%|███████▍ | 1491/2000 [31:45<06:31, 1.30it/s, lr=0.0001, step_loss=0.00136]Steps: 75%|███████▍ | 1491/2000 [31:45<06:31, 1.30it/s, lr=0.0001, step_loss=0.0642] Steps: 75%|███████▍ | 1492/2000 [31:46<06:29, 1.30it/s, lr=0.0001, step_loss=0.0642]Steps: 75%|███████▍ | 1492/2000 [31:46<06:29, 1.30it/s, lr=0.0001, step_loss=0.00324]Steps: 75%|███████▍ | 1493/2000 [31:46<06:28, 1.31it/s, lr=0.0001, step_loss=0.00324]Steps: 75%|███████▍ | 1493/2000 [31:46<06:28, 1.31it/s, lr=0.0001, step_loss=0.00376]Steps: 75%|███████▍ | 1494/2000 [31:47<06:26, 1.31it/s, lr=0.0001, step_loss=0.00376]Steps: 75%|███████▍ | 1494/2000 [31:47<06:26, 1.31it/s, lr=0.0001, step_loss=0.0386] Steps: 75%|███████▍ | 1495/2000 [31:48<06:25, 1.31it/s, lr=0.0001, step_loss=0.0386]Steps: 75%|███████▍ | 1495/2000 [31:48<06:25, 1.31it/s, lr=0.0001, step_loss=0.00659]Steps: 75%|███████▍ | 1496/2000 [31:49<06:24, 1.31it/s, lr=0.0001, step_loss=0.00659]Steps: 
75%|███████▍ | 1496/2000 [31:49<06:24, 1.31it/s, lr=0.0001, step_loss=0.152] Steps: 75%|███████▍ | 1497/2000 [31:49<06:23, 1.31it/s, lr=0.0001, step_loss=0.152]Steps: 75%|███████▍ | 1497/2000 [31:49<06:23, 1.31it/s, lr=0.0001, step_loss=0.0511]Steps: 75%|███████▍ | 1498/2000 [31:50<06:22, 1.31it/s, lr=0.0001, step_loss=0.0511]Steps: 75%|███████▍ | 1498/2000 [31:50<06:22, 1.31it/s, lr=0.0001, step_loss=0.0565]Steps: 75%|███████▍ | 1499/2000 [31:51<06:22, 1.31it/s, lr=0.0001, step_loss=0.0565]Steps: 75%|███████▍ | 1499/2000 [31:51<06:22, 1.31it/s, lr=0.0001, step_loss=0.0346]Steps: 75%|███████▌ | 1500/2000 [31:52<06:21, 1.31it/s, lr=0.0001, step_loss=0.0346]11/14/2025 06:40:43 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1500)
Steps: 75%|███████▌ | 1500/2000 [32:03<06:21, 1.31it/s, lr=0.0001, step_loss=0.00572]Steps: 75%|███████▌ | 1501/2000 [32:04<35:18, 4.25s/it, lr=0.0001, step_loss=0.00572]Steps: 75%|███████▌ | 1501/2000 [32:04<35:18, 4.25s/it, lr=0.0001, step_loss=0.00206]Steps: 75%|███████▌ | 1502/2000 [32:05<26:34, 3.20s/it, lr=0.0001, step_loss=0.00206]Steps: 75%|███████▌ | 1502/2000 [32:05<26:34, 3.20s/it, lr=0.0001, step_loss=0.000867]Steps: 75%|███████▌ | 1503/2000 [32:06<20:27, 2.47s/it, lr=0.0001, step_loss=0.000867]Steps: 75%|███████▌ | 1503/2000 [32:06<20:27, 2.47s/it, lr=0.0001, step_loss=0.0346] Steps: 75%|███████▌ | 1504/2000 [32:06<16:10, 1.96s/it, lr=0.0001, step_loss=0.0346]11/14/2025 06:40:53 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1504)
Steps: 75%|███████▌ | 1504/2000 [32:13<16:10, 1.96s/it, lr=0.0001, step_loss=0.287] 11/14/2025 06:40:53 - INFO - root - ### DEBUG: Finished epoch 46, epoch_steps=32, global_step=1504
11/14/2025 06:40:53 - INFO - root - ### DEBUG: Starting epoch 47/63, global_step=1504, max_train_steps=2000
Steps: 75%|███████▌ | 1505/2000 [32:14<29:26, 3.57s/it, lr=0.0001, step_loss=0.287]Steps: 75%|███████▌ | 1505/2000 [32:14<29:26, 3.57s/it, lr=0.0001, step_loss=0.208]Steps: 75%|███████▌ | 1506/2000 [32:14<22:27, 2.73s/it, lr=0.0001, step_loss=0.208]Steps: 75%|███████▌ | 1506/2000 [32:14<22:27, 2.73s/it, lr=0.0001, step_loss=0.0102]Steps: 75%|███████▌ | 1507/2000 [32:15<17:33, 2.14s/it, lr=0.0001, step_loss=0.0102]Steps: 75%|███████▌ | 1507/2000 [32:15<17:33, 2.14s/it, lr=0.0001, step_loss=0.0934]Steps: 75%|███████▌ | 1508/2000 [32:16<14:08, 1.72s/it, lr=0.0001, step_loss=0.0934]Steps: 75%|███████▌ | 1508/2000 [32:16<14:08, 1.72s/it, lr=0.0001, step_loss=0.0947]Steps: 75%|███████▌ | 1509/2000 [32:17<11:45, 1.44s/it, lr=0.0001, step_loss=0.0947]Steps: 75%|███████▌ | 1509/2000 [32:17<11:45, 1.44s/it, lr=0.0001, step_loss=0.0162]Steps: 76%|███████▌ | 1510/2000 [32:18<10:05, 1.23s/it, lr=0.0001, step_loss=0.0162]Steps: 76%|███████▌ | 1510/2000 [32:18<10:05, 1.23s/it, lr=0.0001, step_loss=0.0174]Steps: 76%|███████▌ | 1511/2000 [32:18<08:54, 1.09s/it, lr=0.0001, step_loss=0.0174]Steps: 76%|███████▌ | 1511/2000 [32:18<08:54, 1.09s/it, lr=0.0001, step_loss=0.19] Steps: 76%|███████▌ | 1512/2000 [32:19<08:05, 1.00it/s, lr=0.0001, step_loss=0.19]Steps: 76%|███████▌ | 1512/2000 [32:19<08:05, 1.00it/s, lr=0.0001, step_loss=0.00384]Steps: 76%|███████▌ | 1513/2000 [32:20<07:30, 1.08it/s, lr=0.0001, step_loss=0.00384]Steps: 76%|███████▌ | 1513/2000 [32:20<07:30, 1.08it/s, lr=0.0001, step_loss=0.257] Steps: 76%|███████▌ | 1514/2000 [32:21<07:06, 1.14it/s, lr=0.0001, step_loss=0.257]Steps: 76%|███████▌ | 1514/2000 [32:21<07:06, 1.14it/s, lr=0.0001, step_loss=0.00329]Steps: 76%|███████▌ | 1515/2000 [32:21<06:48, 1.19it/s, lr=0.0001, step_loss=0.00329]Steps: 76%|███████▌ | 1515/2000 [32:21<06:48, 1.19it/s, lr=0.0001, step_loss=0.0146] Steps: 76%|███████▌ | 1516/2000 [32:22<06:36, 1.22it/s, lr=0.0001, step_loss=0.0146]Steps: 76%|███████▌ | 1516/2000 [32:22<06:36, 1.22it/s, lr=0.0001, 
step_loss=0.0684]Steps: 76%|███████▌ | 1517/2000 [32:23<06:28, 1.24it/s, lr=0.0001, step_loss=0.0684]Steps: 76%|███████▌ | 1517/2000 [32:23<06:28, 1.24it/s, lr=0.0001, step_loss=0.00514]Steps: 76%|███████▌ | 1518/2000 [32:24<06:21, 1.26it/s, lr=0.0001, step_loss=0.00514]Steps: 76%|███████▌ | 1518/2000 [32:24<06:21, 1.26it/s, lr=0.0001, step_loss=0.175] Steps: 76%|███████▌ | 1519/2000 [32:24<06:16, 1.28it/s, lr=0.0001, step_loss=0.175]Steps: 76%|███████▌ | 1519/2000 [32:24<06:16, 1.28it/s, lr=0.0001, step_loss=0.0456]Steps: 76%|███████▌ | 1520/2000 [32:25<06:13, 1.29it/s, lr=0.0001, step_loss=0.0456]Steps: 76%|███████▌ | 1520/2000 [32:25<06:13, 1.29it/s, lr=0.0001, step_loss=0.03] Steps: 76%|███████▌ | 1521/2000 [32:26<06:10, 1.29it/s, lr=0.0001, step_loss=0.03]Steps: 76%|███████▌ | 1521/2000 [32:26<06:10, 1.29it/s, lr=0.0001, step_loss=0.002]Steps: 76%|███████▌ | 1522/2000 [32:27<06:08, 1.30it/s, lr=0.0001, step_loss=0.002]Steps: 76%|███████▌ | 1522/2000 [32:27<06:08, 1.30it/s, lr=0.0001, step_loss=0.106]Steps: 76%|███████▌ | 1523/2000 [32:27<06:06, 1.30it/s, lr=0.0001, step_loss=0.106]Steps: 76%|███████▌ | 1523/2000 [32:27<06:06, 1.30it/s, lr=0.0001, step_loss=0.000478]Steps: 76%|███████▌ | 1524/2000 [32:28<06:05, 1.30it/s, lr=0.0001, step_loss=0.000478]Steps: 76%|███████▌ | 1524/2000 [32:28<06:05, 1.30it/s, lr=0.0001, step_loss=0.0236] Steps: 76%|███████▋ | 1525/2000 [32:29<06:03, 1.31it/s, lr=0.0001, step_loss=0.0236]Steps: 76%|███████▋ | 1525/2000 [32:29<06:03, 1.31it/s, lr=0.0001, step_loss=0.00131]Steps: 76%|███████▋ | 1526/2000 [32:30<06:02, 1.31it/s, lr=0.0001, step_loss=0.00131]Steps: 76%|███████▋ | 1526/2000 [32:30<06:02, 1.31it/s, lr=0.0001, step_loss=0.00314]Steps: 76%|███████▋ | 1527/2000 [32:31<06:01, 1.31it/s, lr=0.0001, step_loss=0.00314]Steps: 76%|███████▋ | 1527/2000 [32:31<06:01, 1.31it/s, lr=0.0001, step_loss=0.289] Steps: 76%|███████▋ | 1528/2000 [32:31<06:00, 1.31it/s, lr=0.0001, step_loss=0.289]Steps: 76%|███████▋ | 1528/2000 [32:31<06:00, 
1.31it/s, lr=0.0001, step_loss=0.00309]Steps: 76%|███████▋ | 1529/2000 [32:32<05:59, 1.31it/s, lr=0.0001, step_loss=0.00309]Steps: 76%|███████▋ | 1529/2000 [32:32<05:59, 1.31it/s, lr=0.0001, step_loss=0.000715]Steps: 76%|███████▋ | 1530/2000 [32:33<05:58, 1.31it/s, lr=0.0001, step_loss=0.000715]Steps: 76%|███████▋ | 1530/2000 [32:33<05:58, 1.31it/s, lr=0.0001, step_loss=0.00428] Steps: 77%|███████▋ | 1531/2000 [32:34<05:57, 1.31it/s, lr=0.0001, step_loss=0.00428]Steps: 77%|███████▋ | 1531/2000 [32:34<05:57, 1.31it/s, lr=0.0001, step_loss=0.00564]Steps: 77%|███████▋ | 1532/2000 [32:34<05:57, 1.31it/s, lr=0.0001, step_loss=0.00564]Steps: 77%|███████▋ | 1532/2000 [32:34<05:57, 1.31it/s, lr=0.0001, step_loss=0.00246]Steps: 77%|███████▋ | 1533/2000 [32:35<05:56, 1.31it/s, lr=0.0001, step_loss=0.00246]Steps: 77%|███████▋ | 1533/2000 [32:35<05:56, 1.31it/s, lr=0.0001, step_loss=0.0154] Steps: 77%|███████▋ | 1534/2000 [32:36<05:55, 1.31it/s, lr=0.0001, step_loss=0.0154]Steps: 77%|███████▋ | 1534/2000 [32:36<05:55, 1.31it/s, lr=0.0001, step_loss=0.0319]Steps: 77%|███████▋ | 1535/2000 [32:37<05:54, 1.31it/s, lr=0.0001, step_loss=0.0319]Steps: 77%|███████▋ | 1535/2000 [32:37<05:54, 1.31it/s, lr=0.0001, step_loss=0.0688]Steps: 77%|███████▋ | 1536/2000 [32:37<05:53, 1.31it/s, lr=0.0001, step_loss=0.0688]11/14/2025 06:41:24 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1536)
Steps: 77%|███████▋ | 1536/2000 [32:44<05:53, 1.31it/s, lr=0.0001, step_loss=0.106]
11/14/2025 06:41:24 - INFO - root - ### DEBUG: Finished epoch 47, epoch_steps=32, global_step=1536
11/14/2025 06:41:24 - INFO - root - ### DEBUG: Starting epoch 48/63, global_step=1536, max_train_steps=2000
Steps: 77%|███████▋ | 1537/2000 [32:45<21:06, 2.74s/it, lr=0.0001, step_loss=0.106]Steps: 77%|███████▋ | 1537/2000 [32:45<21:06, 2.74s/it, lr=0.0001, step_loss=0.00635]Steps: 77%|███████▋ | 1538/2000 [32:45<16:30, 2.14s/it, lr=0.0001, step_loss=0.00635]Steps: 77%|███████▋ | 1538/2000 [32:46<16:30, 2.14s/it, lr=0.0001, step_loss=0.00266]Steps: 77%|███████▋ | 1539/2000 [32:46<13:17, 1.73s/it, lr=0.0001, step_loss=0.00266]Steps: 77%|███████▋ | 1539/2000 [32:46<13:17, 1.73s/it, lr=0.0001, step_loss=0.127] Steps: 77%|███████▋ | 1540/2000 [32:47<11:02, 1.44s/it, lr=0.0001, step_loss=0.127]Steps: 77%|███████▋ | 1540/2000 [32:47<11:02, 1.44s/it, lr=0.0001, step_loss=0.00156]Steps: 77%|███████▋ | 1541/2000 [32:48<09:27, 1.24s/it, lr=0.0001, step_loss=0.00156]Steps: 77%|███████▋ | 1541/2000 [32:48<09:27, 1.24s/it, lr=0.0001, step_loss=0.0818] Steps: 77%|███████▋ | 1542/2000 [32:49<08:21, 1.09s/it, lr=0.0001, step_loss=0.0818]Steps: 77%|███████▋ | 1542/2000 [32:49<08:21, 1.09s/it, lr=0.0001, step_loss=0.117] Steps: 77%|███████▋ | 1543/2000 [32:49<07:34, 1.00it/s, lr=0.0001, step_loss=0.117]Steps: 77%|███████▋ | 1543/2000 [32:49<07:34, 1.00it/s, lr=0.0001, step_loss=0.151]Steps: 77%|███████▋ | 1544/2000 [32:50<07:01, 1.08it/s, lr=0.0001, step_loss=0.151]Steps: 77%|███████▋ | 1544/2000 [32:50<07:01, 1.08it/s, lr=0.0001, step_loss=0.0497]Steps: 77%|███████▋ | 1545/2000 [32:51<06:38, 1.14it/s, lr=0.0001, step_loss=0.0497]Steps: 77%|███████▋ | 1545/2000 [32:51<06:38, 1.14it/s, lr=0.0001, step_loss=0.0161]Steps: 77%|███████▋ | 1546/2000 [32:52<06:21, 1.19it/s, lr=0.0001, step_loss=0.0161]Steps: 77%|███████▋ | 1546/2000 [32:52<06:21, 1.19it/s, lr=0.0001, step_loss=0.0142]Steps: 77%|███████▋ | 1547/2000 [32:52<06:10, 1.22it/s, lr=0.0001, step_loss=0.0142]Steps: 77%|███████▋ | 1547/2000 [32:52<06:10, 1.22it/s, lr=0.0001, step_loss=0.216] Steps: 77%|███████▋ | 1548/2000 [32:53<06:01, 1.25it/s, lr=0.0001, step_loss=0.216]Steps: 77%|███████▋ | 1548/2000 [32:53<06:01, 1.25it/s, lr=0.0001, 
step_loss=0.0034]Steps: 77%|███████▋ | 1549/2000 [32:54<05:55, 1.27it/s, lr=0.0001, step_loss=0.0034]Steps: 77%|███████▋ | 1549/2000 [32:54<05:55, 1.27it/s, lr=0.0001, step_loss=0.00121]Steps: 78%|███████▊ | 1550/2000 [32:55<05:51, 1.28it/s, lr=0.0001, step_loss=0.00121]Steps: 78%|███████▊ | 1550/2000 [32:55<05:51, 1.28it/s, lr=0.0001, step_loss=0.0667] Steps: 78%|███████▊ | 1551/2000 [32:55<05:47, 1.29it/s, lr=0.0001, step_loss=0.0667]Steps: 78%|███████▊ | 1551/2000 [32:55<05:47, 1.29it/s, lr=0.0001, step_loss=0.000931]Steps: 78%|███████▊ | 1552/2000 [32:56<05:45, 1.30it/s, lr=0.0001, step_loss=0.000931]Steps: 78%|███████▊ | 1552/2000 [32:56<05:45, 1.30it/s, lr=0.0001, step_loss=0.128] Steps: 78%|███████▊ | 1553/2000 [32:57<05:44, 1.30it/s, lr=0.0001, step_loss=0.128]Steps: 78%|███████▊ | 1553/2000 [32:57<05:44, 1.30it/s, lr=0.0001, step_loss=0.000517]Steps: 78%|███████▊ | 1554/2000 [32:58<05:42, 1.30it/s, lr=0.0001, step_loss=0.000517]Steps: 78%|███████▊ | 1554/2000 [32:58<05:42, 1.30it/s, lr=0.0001, step_loss=0.0412] Steps: 78%|███████▊ | 1555/2000 [32:58<05:40, 1.31it/s, lr=0.0001, step_loss=0.0412]Steps: 78%|███████▊ | 1555/2000 [32:58<05:40, 1.31it/s, lr=0.0001, step_loss=0.0033]Steps: 78%|███████▊ | 1556/2000 [32:59<05:39, 1.31it/s, lr=0.0001, step_loss=0.0033]Steps: 78%|███████▊ | 1556/2000 [32:59<05:39, 1.31it/s, lr=0.0001, step_loss=0.000517]Steps: 78%|███████▊ | 1557/2000 [33:00<05:38, 1.31it/s, lr=0.0001, step_loss=0.000517]Steps: 78%|███████▊ | 1557/2000 [33:00<05:38, 1.31it/s, lr=0.0001, step_loss=0.0509] Steps: 78%|███████▊ | 1558/2000 [33:01<05:37, 1.31it/s, lr=0.0001, step_loss=0.0509]Steps: 78%|███████▊ | 1558/2000 [33:01<05:37, 1.31it/s, lr=0.0001, step_loss=0.0706]Steps: 78%|███████▊ | 1559/2000 [33:01<05:36, 1.31it/s, lr=0.0001, step_loss=0.0706]Steps: 78%|███████▊ | 1559/2000 [33:02<05:36, 1.31it/s, lr=0.0001, step_loss=0.000381]Steps: 78%|███████▊ | 1560/2000 [33:02<05:35, 1.31it/s, lr=0.0001, step_loss=0.000381]Steps: 78%|███████▊ | 
1560/2000 [33:02<05:35, 1.31it/s, lr=0.0001, step_loss=0.000942]Steps: 78%|███████▊ | 1561/2000 [33:03<05:34, 1.31it/s, lr=0.0001, step_loss=0.000942]Steps: 78%|███████▊ | 1561/2000 [33:03<05:34, 1.31it/s, lr=0.0001, step_loss=0.000966]Steps: 78%|███████▊ | 1562/2000 [33:04<05:33, 1.31it/s, lr=0.0001, step_loss=0.000966]Steps: 78%|███████▊ | 1562/2000 [33:04<05:33, 1.31it/s, lr=0.0001, step_loss=0.00235] Steps: 78%|███████▊ | 1563/2000 [33:05<05:32, 1.31it/s, lr=0.0001, step_loss=0.00235]Steps: 78%|███████▊ | 1563/2000 [33:05<05:32, 1.31it/s, lr=0.0001, step_loss=0.000709]Steps: 78%|███████▊ | 1564/2000 [33:05<05:31, 1.31it/s, lr=0.0001, step_loss=0.000709]Steps: 78%|███████▊ | 1564/2000 [33:05<05:31, 1.31it/s, lr=0.0001, step_loss=0.0371] Steps: 78%|███████▊ | 1565/2000 [33:06<05:31, 1.31it/s, lr=0.0001, step_loss=0.0371]Steps: 78%|███████▊ | 1565/2000 [33:06<05:31, 1.31it/s, lr=0.0001, step_loss=0.121] Steps: 78%|███████▊ | 1566/2000 [33:07<05:30, 1.31it/s, lr=0.0001, step_loss=0.121]Steps: 78%|███████▊ | 1566/2000 [33:07<05:30, 1.31it/s, lr=0.0001, step_loss=0.0093]Steps: 78%|███████▊ | 1567/2000 [33:08<05:29, 1.31it/s, lr=0.0001, step_loss=0.0093]Steps: 78%|███████▊ | 1567/2000 [33:08<05:29, 1.31it/s, lr=0.0001, step_loss=0.00261]Steps: 78%|███████▊ | 1568/2000 [33:08<05:29, 1.31it/s, lr=0.0001, step_loss=0.00261]11/14/2025 06:41:55 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1568)
Steps: 78%|███████▊ | 1568/2000 [33:15<05:29, 1.31it/s, lr=0.0001, step_loss=0.000514]
11/14/2025 06:41:55 - INFO - root - ### DEBUG: Finished epoch 48, epoch_steps=32, global_step=1568
11/14/2025 06:41:55 - INFO - root - ### DEBUG: Starting epoch 49/63, global_step=1568, max_train_steps=2000
Steps: 78%|███████▊ | 1569/2000 [33:16<19:36, 2.73s/it, lr=0.0001, step_loss=0.000514]Steps: 78%|███████▊ | 1569/2000 [33:16<19:36, 2.73s/it, lr=0.0001, step_loss=0.00353] Steps: 78%|███████▊ | 1570/2000 [33:16<15:20, 2.14s/it, lr=0.0001, step_loss=0.00353]Steps: 78%|███████▊ | 1570/2000 [33:16<15:20, 2.14s/it, lr=0.0001, step_loss=0.0302] Steps: 79%|███████▊ | 1571/2000 [33:17<12:20, 1.73s/it, lr=0.0001, step_loss=0.0302]Steps: 79%|███████▊ | 1571/2000 [33:17<12:20, 1.73s/it, lr=0.0001, step_loss=0.00146]Steps: 79%|███████▊ | 1572/2000 [33:18<10:14, 1.44s/it, lr=0.0001, step_loss=0.00146]Steps: 79%|███████▊ | 1572/2000 [33:18<10:14, 1.44s/it, lr=0.0001, step_loss=0.00829]Steps: 79%|███████▊ | 1573/2000 [33:19<08:46, 1.23s/it, lr=0.0001, step_loss=0.00829]Steps: 79%|███████▊ | 1573/2000 [33:19<08:46, 1.23s/it, lr=0.0001, step_loss=0.00371]Steps: 79%|███████▊ | 1574/2000 [33:19<07:45, 1.09s/it, lr=0.0001, step_loss=0.00371]Steps: 79%|███████▊ | 1574/2000 [33:19<07:45, 1.09s/it, lr=0.0001, step_loss=0.0124] Steps: 79%|███████▉ | 1575/2000 [33:20<07:01, 1.01it/s, lr=0.0001, step_loss=0.0124]Steps: 79%|███████▉ | 1575/2000 [33:20<07:01, 1.01it/s, lr=0.0001, step_loss=0.00119]Steps: 79%|███████▉ | 1576/2000 [33:21<06:31, 1.08it/s, lr=0.0001, step_loss=0.00119]Steps: 79%|███████▉ | 1576/2000 [33:21<06:31, 1.08it/s, lr=0.0001, step_loss=0.00225]Steps: 79%|███████▉ | 1577/2000 [33:22<06:10, 1.14it/s, lr=0.0001, step_loss=0.00225]Steps: 79%|███████▉ | 1577/2000 [33:22<06:10, 1.14it/s, lr=0.0001, step_loss=0.00195]Steps: 79%|███████▉ | 1578/2000 [33:22<05:54, 1.19it/s, lr=0.0001, step_loss=0.00195]Steps: 79%|███████▉ | 1578/2000 [33:23<05:54, 1.19it/s, lr=0.0001, step_loss=0.121] Steps: 79%|███████▉ | 1579/2000 [33:23<05:44, 1.22it/s, lr=0.0001, step_loss=0.121]Steps: 79%|███████▉ | 1579/2000 [33:23<05:44, 1.22it/s, lr=0.0001, step_loss=0.0728]Steps: 79%|███████▉ | 1580/2000 [33:24<05:36, 1.25it/s, lr=0.0001, step_loss=0.0728]Steps: 79%|███████▉ | 1580/2000 [33:24<05:36, 
1.25it/s, lr=0.0001, step_loss=0.041] Steps: 79%|███████▉ | 1581/2000 [33:25<05:30, 1.27it/s, lr=0.0001, step_loss=0.041]Steps: 79%|███████▉ | 1581/2000 [33:25<05:30, 1.27it/s, lr=0.0001, step_loss=0.0302]Steps: 79%|███████▉ | 1582/2000 [33:26<05:26, 1.28it/s, lr=0.0001, step_loss=0.0302]Steps: 79%|███████▉ | 1582/2000 [33:26<05:26, 1.28it/s, lr=0.0001, step_loss=0.0481]Steps: 79%|███████▉ | 1583/2000 [33:26<05:23, 1.29it/s, lr=0.0001, step_loss=0.0481]Steps: 79%|███████▉ | 1583/2000 [33:26<05:23, 1.29it/s, lr=0.0001, step_loss=0.0379]Steps: 79%|███████▉ | 1584/2000 [33:27<05:20, 1.30it/s, lr=0.0001, step_loss=0.0379]Steps: 79%|███████▉ | 1584/2000 [33:27<05:20, 1.30it/s, lr=0.0001, step_loss=0.102] Steps: 79%|███████▉ | 1585/2000 [33:28<05:18, 1.30it/s, lr=0.0001, step_loss=0.102]Steps: 79%|███████▉ | 1585/2000 [33:28<05:18, 1.30it/s, lr=0.0001, step_loss=0.0536]Steps: 79%|███████▉ | 1586/2000 [33:29<05:16, 1.31it/s, lr=0.0001, step_loss=0.0536]Steps: 79%|███████▉ | 1586/2000 [33:29<05:16, 1.31it/s, lr=0.0001, step_loss=0.00901]Steps: 79%|███████▉ | 1587/2000 [33:29<05:15, 1.31it/s, lr=0.0001, step_loss=0.00901]Steps: 79%|███████▉ | 1587/2000 [33:29<05:15, 1.31it/s, lr=0.0001, step_loss=0.0484] Steps: 79%|███████▉ | 1588/2000 [33:30<05:14, 1.31it/s, lr=0.0001, step_loss=0.0484]Steps: 79%|███████▉ | 1588/2000 [33:30<05:14, 1.31it/s, lr=0.0001, step_loss=0.00685]Steps: 79%|███████▉ | 1589/2000 [33:31<05:13, 1.31it/s, lr=0.0001, step_loss=0.00685]Steps: 79%|███████▉ | 1589/2000 [33:31<05:13, 1.31it/s, lr=0.0001, step_loss=0.00634]Steps: 80%|███████▉ | 1590/2000 [33:32<05:12, 1.31it/s, lr=0.0001, step_loss=0.00634]Steps: 80%|███████▉ | 1590/2000 [33:32<05:12, 1.31it/s, lr=0.0001, step_loss=0.00367]Steps: 80%|███████▉ | 1591/2000 [33:32<05:11, 1.31it/s, lr=0.0001, step_loss=0.00367]Steps: 80%|███████▉ | 1591/2000 [33:32<05:11, 1.31it/s, lr=0.0001, step_loss=0.121] Steps: 80%|███████▉ | 1592/2000 [33:33<05:10, 1.31it/s, lr=0.0001, step_loss=0.121]Steps: 80%|███████▉ | 
1592/2000 [33:33<05:10, 1.31it/s, lr=0.0001, step_loss=0.00852]Steps: 80%|███████▉ | 1593/2000 [33:34<05:10, 1.31it/s, lr=0.0001, step_loss=0.00852]Steps: 80%|███████▉ | 1593/2000 [33:34<05:10, 1.31it/s, lr=0.0001, step_loss=0.00031]Steps: 80%|███████▉ | 1594/2000 [33:35<05:09, 1.31it/s, lr=0.0001, step_loss=0.00031]Steps: 80%|███████▉ | 1594/2000 [33:35<05:09, 1.31it/s, lr=0.0001, step_loss=0.0293] Steps: 80%|███████▉ | 1595/2000 [33:35<05:08, 1.31it/s, lr=0.0001, step_loss=0.0293]Steps: 80%|███████▉ | 1595/2000 [33:35<05:08, 1.31it/s, lr=0.0001, step_loss=0.00187]Steps: 80%|███████▉ | 1596/2000 [33:36<05:07, 1.31it/s, lr=0.0001, step_loss=0.00187]Steps: 80%|███████▉ | 1596/2000 [33:36<05:07, 1.31it/s, lr=0.0001, step_loss=0.0646] Steps: 80%|███████▉ | 1597/2000 [33:37<05:06, 1.31it/s, lr=0.0001, step_loss=0.0646]Steps: 80%|███████▉ | 1597/2000 [33:37<05:06, 1.31it/s, lr=0.0001, step_loss=0.014] Steps: 80%|███████▉ | 1598/2000 [33:38<05:06, 1.31it/s, lr=0.0001, step_loss=0.014]Steps: 80%|███████▉ | 1598/2000 [33:38<05:06, 1.31it/s, lr=0.0001, step_loss=0.00487]Steps: 80%|███████▉ | 1599/2000 [33:38<05:05, 1.31it/s, lr=0.0001, step_loss=0.00487]Steps: 80%|███████▉ | 1599/2000 [33:39<05:05, 1.31it/s, lr=0.0001, step_loss=0.177] Steps: 80%|████████ | 1600/2000 [33:39<05:04, 1.31it/s, lr=0.0001, step_loss=0.177]11/14/2025 06:42:30 - INFO - root - Saved state to outputs/actor01_training/training_actor01-2025-11-14T06-08-25/checkpoints (global_step: 1600)
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.70it/s]
8%|▊ | 2/25 [00:01<00:13, 1.70it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:10, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]
44%|████▍ | 11/25 [00:06<00:08, 1.70it/s]
48%|████▊ | 12/25 [00:07<00:07, 1.70it/s]
52%|█████▏ | 13/25 [00:07<00:07, 1.70it/s]
56%|█████▌ | 14/25 [00:08<00:06, 1.70it/s]
60%|██████ | 15/25 [00:08<00:05, 1.70it/s]
64%|██████▍ | 16/25 [00:09<00:05, 1.70it/s]
68%|██████▊ | 17/25 [00:10<00:04, 1.70it/s]
72%|███████▏ | 18/25 [00:10<00:04, 1.70it/s]
76%|███████▌ | 19/25 [00:11<00:03, 1.70it/s]
80%|████████ | 20/25 [00:11<00:02, 1.70it/s]
84%|████████▍ | 21/25 [00:12<00:02, 1.70it/s]
88%|████████▊ | 22/25 [00:12<00:01, 1.70it/s]
92%|█████████▏| 23/25 [00:13<00:01, 1.70it/s]
96%|█████████▌| 24/25 [00:14<00:00, 1.70it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
0%| | 0/8 [00:00<?, ?it/s]
75%|███████▌ | 6/8 [00:00<00:00, 44.01it/s]
100%|██████████| 8/8 [00:00<00:00, 32.20it/s]
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.70it/s]
8%|▊ | 2/25 [00:01<00:13, 1.70it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:09, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]
44%|████▍ | 11/25 [00:06<00:08, 1.70it/s]
48%|████▊ | 12/25 [00:07<00:07, 1.70it/s]
52%|█████▏ | 13/25 [00:07<00:07, 1.70it/s]
56%|█████▌ | 14/25 [00:08<00:06, 1.70it/s]
60%|██████ | 15/25 [00:08<00:05, 1.70it/s]
64%|██████▍ | 16/25 [00:09<00:05, 1.70it/s]
68%|██████▊ | 17/25 [00:10<00:04, 1.70it/s]
72%|███████▏ | 18/25 [00:10<00:04, 1.70it/s]
76%|███████▌ | 19/25 [00:11<00:03, 1.70it/s]
80%|████████ | 20/25 [00:11<00:02, 1.70it/s]
84%|████████▍ | 21/25 [00:12<00:02, 1.70it/s]
88%|████████▊ | 22/25 [00:12<00:01, 1.70it/s]
92%|█████████▏| 23/25 [00:13<00:01, 1.70it/s]
96%|█████████▌| 24/25 [00:14<00:00, 1.70it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
0%| | 0/8 [00:00<?, ?it/s]
75%|███████▌ | 6/8 [00:00<00:00, 43.91it/s]
100%|██████████| 8/8 [00:00<00:00, 32.14it/s]
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.70it/s]
8%|▊ | 2/25 [00:01<00:13, 1.70it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:10, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]
44%|████▍ | 11/25 [00:06<00:08, 1.70it/s]
48%|████▊ | 12/25 [00:07<00:07, 1.70it/s]
52%|█████▏ | 13/25 [00:07<00:07, 1.70it/s]
56%|█████▌ | 14/25 [00:08<00:06, 1.70it/s]
60%|██████ | 15/25 [00:08<00:05, 1.70it/s]
64%|██████▍ | 16/25 [00:09<00:05, 1.70it/s]
68%|██████▊ | 17/25 [00:10<00:04, 1.70it/s]
72%|███████▏ | 18/25 [00:10<00:04, 1.70it/s]
76%|███████▌ | 19/25 [00:11<00:03, 1.70it/s]
80%|████████ | 20/25 [00:11<00:02, 1.70it/s]
84%|████████▍ | 21/25 [00:12<00:02, 1.69it/s]
88%|████████▊ | 22/25 [00:12<00:01, 1.70it/s]
92%|█████████▏| 23/25 [00:13<00:01, 1.69it/s]
96%|█████████▌| 24/25 [00:14<00:00, 1.70it/s]
100%|██████████| 25/25 [00:14<00:00, 1.70it/s]
0%| | 0/8 [00:00<?, ?it/s]
75%|███████▌ | 6/8 [00:00<00:00, 44.01it/s]
100%|██████████| 8/8 [00:00<00:00, 32.19it/s]
0%| | 0/25 [00:00<?, ?it/s]
4%|▍ | 1/25 [00:00<00:14, 1.70it/s]
8%|▊ | 2/25 [00:01<00:13, 1.70it/s]
12%|█▏ | 3/25 [00:01<00:12, 1.70it/s]
16%|█▌ | 4/25 [00:02<00:12, 1.70it/s]
20%|██ | 5/25 [00:02<00:11, 1.70it/s]
24%|██▍ | 6/25 [00:03<00:11, 1.70it/s]
28%|██▊ | 7/25 [00:04<00:10, 1.70it/s]
32%|███▏ | 8/25 [00:04<00:10, 1.70it/s]
36%|███▌ | 9/25 [00:05<00:09, 1.70it/s]
40%|████ | 10/25 [00:05<00:08, 1.70it/s]