--gradient_merge_steps $(expr 67584 \/ $batch_size \/ 8)"
The line above has an unterminated quote.
I modified the command as follows:
$CMD --max_predictions_per_seq 80 \
--learning_rate 5e-5 \
--weight_decay 0.0 \
--adam_epsilon 1e-8 \
--warmup_steps 0 \
--output_dir ./tmp2/ \
--logging_steps 10 \
--save_steps 20000 \
--input_dir=$DATA_DIR \
--model_type bert \
--model_name_or_path bert-base-uncased \
--batch_size ${batch_size} \
--use_amp ${use_amp} \
--gradient_merge_steps $(expr 67584 \/ $batch_size \/ 8)
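As a quick check of what the last flag evaluates to, the arithmetic can be run on its own. This is only a sketch: `batch_size=32` is an illustrative value, not one taken from the original script, and the divisor 8 is presumably the number of cards.

```shell
# Reproduce the gradient_merge_steps arithmetic from the command above.
# ASSUMPTION: batch_size=32 is an example value chosen for illustration.
batch_size=32

# expr does integer division left to right: 67584 / 32 = 2112, then 2112 / 8 = 264.
gradient_merge_steps=$(expr 67584 / "$batch_size" / 8)

echo "$gradient_merge_steps"   # prints 264
```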
With that change, it fails with another error:
Traceback (most recent call last):
  File "./run_pretrain.py", line 439, in <module>
    do_train(args)
  File "./run_pretrain.py", line 316, in do_train
    train_data_loader) * args.num_train_epochs
UnboundLocalError: local variable 'train_data_loader' referenced before assignment
I then used https://github.com/PaddlePaddle/Perf/blob/master/Bert/scripts/paddle_base_pre_training.sh instead, and that shell script worked.
One more question: how do I get the 8-card training throughput (sequences/sec)?
Do I add up the values from all eight worklog files? Is there a quick way to sum them?
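If the per-card values are indeed meant to be summed (not confirmed here), one possible way to total them is sketched below. It assumes each log prints lines containing a field such as `ips: 123.4`; that pattern is hypothetical and has to be adapted to whatever the real logs print. The helper name `sum_throughput` is also made up for this example. It averages the field per file and then sums the per-file averages.

```shell
# Sum the average per-card throughput across several log files.
# ASSUMPTION: the lines of interest contain "ips: <number>"; this
# pattern is hypothetical, so adjust the regex to the real log format.
sum_throughput() {
  awk '
    match($0, /ips: [0-9.]+/) {
      # Skip the 5 characters of "ips: " and keep the numeric part.
      v = substr($0, RSTART + 5, RLENGTH - 5) + 0
      sum[FILENAME] += v
      n[FILENAME]++
    }
    END {
      total = 0
      for (f in sum) total += sum[f] / n[f]   # per-file mean, then sum
      printf "%.2f\n", total
    }
  ' "$@"
}

# Usage: pass the eight log files as arguments, for example
#   sum_throughput /path/to/logs/*
# where /path/to/logs is a placeholder for the real log directory.
```

Averaging per file first keeps a card that logged more steps from counting more than the others.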
Source of the quoted line: Perf/Bert/README.md, line 191 at commit 2106324.