Ran a quick multi-turn eval at step 0:
(TaskRunner pid=693237) ("Initial validation metrics: {'val-core/multi_turn_miss_param/reward/mean@1': "
(TaskRunner pid=693237) "0.19333333333333333, 'val-core/multi_turn_long_context/reward/mean@1': "
(TaskRunner pid=693237) "0.11333333333333333, 'val-core/multi_turn_miss_func/reward/mean@1': "
(TaskRunner pid=693237) "0.17333333333333334, 'val-core/multi_turn_base/reward/mean@1': "
(TaskRunner pid=693237) '0.26666666666666666}')
(TaskRunner pid=693237) step:0 - val-core/multi_turn_miss_param/reward/mean@1:0.193 - val-core/multi_turn_long_context/reward/mean@1:0.113 - val-core/multi_turn_miss_func/reward/mean@1:0.173 - val-core/multi_turn_base/reward/mean@1:0.267
I didn't think it would be this high.
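To check these numbers against whatever reference scores you have in mind, it may help to pull them out of the log line first. A minimal sketch, assuming the `step:N - name:value - ...` layout shown above; `parse_step_line` is a hypothetical helper, not part of any training framework:

```python
# The step:0 line copied from the log above, including the TaskRunner prefix.
LOG_LINE = (
    "(TaskRunner pid=693237) step:0"
    " - val-core/multi_turn_miss_param/reward/mean@1:0.193"
    " - val-core/multi_turn_long_context/reward/mean@1:0.113"
    " - val-core/multi_turn_miss_func/reward/mean@1:0.173"
    " - val-core/multi_turn_base/reward/mean@1:0.267"
)


def parse_step_line(line):
    """Return (step, {metric_name: value}) from a 'step:N - name:value - ...' line.

    Hypothetical helper: assumes fields are separated by ' - ' and that the
    value follows the last ':' in each field.
    """
    # Drop anything before 'step:' (e.g. the '(TaskRunner pid=...)' prefix).
    line = line[line.index("step:"):]
    fields = line.split(" - ")
    step = int(fields[0].split(":", 1)[1])
    metrics = {}
    for field in fields[1:]:
        name, value = field.rsplit(":", 1)
        metrics[name] = float(value)
    return step, metrics


step, metrics = parse_step_line(LOG_LINE)

# Print the categories from lowest to highest reward for a quick look.
for name, value in sorted(metrics.items(), key=lambda kv: kv[1]):
    print(f"{value:.3f}  {name}")
```

The resulting `metrics` dict can then be compared by key against a dict of expected scores, if one is available.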