2 changes: 1 addition & 1 deletion docs/source/common_options.rst
@@ -525,7 +525,7 @@ Only the adapter is saved. Merge it back with the base model to deploy:
--lora_path ./checkpoint/llama3-8b-rm \
--output_path ./checkpoint/llama-3-8b-rm-combined \
--is_rm \
- --param_dtype bf16
+ --ds.param_dtype bf16

medium

The `lora_combiner` script is a standalone utility that does not use the DeepSpeed trainer's dotted argument parser. It does not support `--ds.param_dtype bf16`. Instead, it uses a simple `--bf16` boolean flag to enable bfloat16 precision during merging.

Suggested change
- --ds.param_dtype bf16
+ --bf16

Author

This does not square with the output I got while running the code:

usage: lora_combiner.py [-h] --model_path MODEL_PATH --lora_path LORA_PATH
                        --output_path OUTPUT_PATH [--is_rm]
                        [--ds.param_dtype {bf16,fp16}]
lora_combiner.py: error: unrecognized arguments: --param_dtype bf16
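
For anyone who wants to reproduce this without the repository, here is a minimal sketch of a parser that would print the same usage line. It is reconstructed only from the usage text above and is not the actual `lora_combiner.py` source; `build_parser`, `unrecognized`, and the `dest="param_dtype"` choice are hypothetical names picked for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction from the usage text, not the real script.
    parser = argparse.ArgumentParser(prog="lora_combiner.py")
    parser.add_argument("--model_path", required=True)
    parser.add_argument("--lora_path", required=True)
    parser.add_argument("--output_path", required=True)
    parser.add_argument("--is_rm", action="store_true")
    # The dotted name is the only registered spelling of this option.
    parser.add_argument(
        "--ds.param_dtype", dest="param_dtype", choices=["bf16", "fp16"]
    )
    return parser


def unrecognized(argv: list[str]) -> list[str]:
    """Return the arguments that the parser does not recognize."""
    _, extras = build_parser().parse_known_args(argv)
    return extras


base = ["--model_path", "m", "--lora_path", "l", "--output_path", "o"]
print(unrecognized(base + ["--param_dtype", "bf16"]))     # old spelling
print(unrecognized(base + ["--ds.param_dtype", "bf16"]))  # new spelling
```

Under these assumptions the undotted `--param_dtype` is left over as unrecognized, because argparse only accepts prefixes of a registered option as abbreviations, and `--param_dtype` is not a prefix of `--ds.param_dtype`.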


Use ``--is_rm`` when merging a reward model (preserves the score head).
