From fe6b0739981be2293d88991270da5032f808487f Mon Sep 17 00:00:00 2001
From: Adam H <74554328+excepto64@users.noreply.github.com>
Date: Tue, 16 Jun 2026 11:50:01 +0100
Subject: [PATCH] Fix lora_combiner docs.

---
 docs/source/common_options.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/common_options.rst b/docs/source/common_options.rst
index dd35acb..785dfce 100644
--- a/docs/source/common_options.rst
+++ b/docs/source/common_options.rst
@@ -525,7 +525,7 @@ Only the adapter is saved. Merge it back with the base model to deploy:
  --lora_path ./checkpoint/llama3-8b-rm \
  --output_path ./checkpoint/llama-3-8b-rm-combined \
  --is_rm \
- --param_dtype bf16
+ --ds.param_dtype bf16
 
 Use ``--is_rm`` when merging a reward model (preserves the score head).