Description
Hi,
Thanks for this great work. I am having trouble understanding how parallel decoding is enabled for LLaDA.
In eval_gsm8k.sh, this is the suggested command for dual cache + parallel decoding:

```sh
# dual cache+parallel
accelerate launch eval_llada.py --tasks ${task} --num_fewshot ${num_fewshot} \
    --confirm_run_unsafe_code --model llada_dist \
    --model_args model_path=${model_path},gen_length=${length},steps=${length},block_length=${block_length},use_cache=True,dual_cache=True,threshold=0.9,show_speed=True
```

Note that here steps=${length}.
In eval.md, however, the command is:

```sh
accelerate launch eval_llada.py --tasks ${task} --num_fewshot ${num_fewshot} \
    --confirm_run_unsafe_code --model llada_dist \
    --model_args model_path='GSAI-ML/LLaDA-8B-Instruct',gen_length=${length},steps=${steps},block_length=${block_length},use_cache=True,dual_cache=True,threshold=0.9,show_speed=True
```
To the best of my knowledge, the steps setting is what should enable parallel decoding. However, when I set steps=${steps} for LLaDA, the generations make no sense. Do you have any suggestions on this? Thank you so much!
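To make sure I'm asking about the right mechanism: my current mental model of threshold-based parallel decoding is roughly the sketch below. This is a pure-Python simplification I wrote for illustration; `parallel_decode_step` and its signature are my own, not the repo's API, and the fallback rule is an assumption on my part.

```python
import math

def parallel_decode_step(logits, masked, threshold=0.9):
    """Hypothetical sketch of one confidence-thresholded decoding step.

    logits: list of per-position logit lists (seq_len x vocab).
    masked: list of bools, True where the position is still masked.
    Returns (predicted_tokens, accepted_flags): every masked position
    whose top-token probability reaches `threshold` is unmasked at once,
    which is what makes the step "parallel".
    """
    tokens, confs = [], []
    for pos_logits in logits:
        # numerically stable softmax over the vocabulary
        m = max(pos_logits)
        exps = [math.exp(x - m) for x in pos_logits]
        z = sum(exps)
        probs = [e / z for e in exps]
        conf = max(probs)
        tokens.append(probs.index(conf))
        confs.append(conf)
    accept = [msk and c >= threshold for msk, c in zip(masked, confs)]
    # assumed fallback: always unmask at least the single most confident
    # masked position so the sampler makes progress every step
    if any(masked) and not any(accept):
        best = max((i for i, msk in enumerate(masked) if msk),
                   key=lambda i: confs[i])
        accept[best] = True
    return tokens, accept
```

Under this model, threshold=0.9 alone would control how many tokens are committed per step, and steps would only cap the number of iterations, which is why the steps=${length} vs steps=${steps} discrepancy confuses me.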