Commit dc0d886

docs: clarify --ctx-size and --parallel interaction in arg.cpp
When using --parallel N, the --ctx-size value is the TOTAL context divided among all slots, not the per-slot context. This is a common source of confusion (see #11681, #5732).

Examples:
- --ctx-size 4096 --parallel 4 → each slot gets 1024 tokens
- To get 4096 tokens per slot with 4 parallel slots, use --ctx-size 16384

Updated the help text in arg.cpp (the source for auto-generated docs) for both --ctx-size and --parallel flags to clarify this behavior.

Fixes #11681
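The arithmetic in the examples above can be sketched in a few lines of C++. This is only an illustration of the division described in the commit message: the helper names `ctx_per_slot` and `ctx_size_for` are hypothetical and do not exist in the codebase, and plain integer division is an assumption about how a remainder would be handled.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper (not part of arg.cpp): tokens available to each slot
// when the total context is split evenly among n_parallel slots.
// Assumption: integer division, so any remainder is simply dropped.
int32_t ctx_per_slot(int32_t n_ctx, int32_t n_parallel) {
    if (n_parallel <= 0) {
        throw std::invalid_argument("n_parallel must be positive");
    }
    return n_ctx / n_parallel;
}

// Hypothetical helper (not part of arg.cpp): the --ctx-size value to pass
// so that each of n_parallel slots gets per_slot tokens (X*N in the help text).
int32_t ctx_size_for(int32_t per_slot, int32_t n_parallel) {
    if (n_parallel <= 0) {
        throw std::invalid_argument("n_parallel must be positive");
    }
    return per_slot * n_parallel;
}
```

With the values from the commit message, `ctx_per_slot(4096, 4)` gives 1024 and `ctx_size_for(4096, 4)` gives 16384.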
1 parent c42712b commit dc0d886

1 file changed: +5 -2 lines changed

common/arg.cpp

Lines changed: 5 additions & 2 deletions
@@ -920,7 +920,9 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
     ).set_examples({LLAMA_EXAMPLE_LOOKUP}));
     add_opt(common_arg(
         {"-c", "--ctx-size"}, "N",
-        string_format("size of the prompt context (default: %d, 0 = loaded from model)", params.n_ctx),
+        string_format("size of the prompt context (default: %d, 0 = loaded from model). "
+                      "Note: when using --parallel N, this is the TOTAL context divided among all slots, "
+                      "not per-slot. For X tokens per slot with N parallel slots, use --ctx-size X*N", params.n_ctx),
         [](common_params & params, int value) {
             params.n_ctx = value;
         }
@@ -1756,7 +1758,8 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
     ).set_env("LLAMA_ARG_DEFRAG_THOLD"));
     add_opt(common_arg(
         {"-np", "--parallel"}, "N",
-        string_format("number of parallel sequences to decode (default: %d)", params.n_parallel),
+        string_format("number of parallel sequences to decode (default: %d). "
+                      "Note: total context (--ctx-size) is divided equally among parallel slots", params.n_parallel),
         [](common_params & params, int value) {
             params.n_parallel = value;
         }
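Both hunks split the help text across adjacent string literals, which the compiler joins into a single format string with one `%d` placeholder. The sketch below reproduces the --ctx-size text with `std::snprintf` as a stand-in for `string_format`; the function name `ctx_size_help` and the fixed buffer size are assumptions made for illustration, not part of arg.cpp.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the string_format call in the --ctx-size hunk.
// Adjacent string literals are concatenated at compile time, so the trailing
// space inside each literal is what separates the sentences.
std::string ctx_size_help(int n_ctx) {
    char buf[512]; // assumed large enough for this fixed text plus one integer
    std::snprintf(buf, sizeof(buf),
        "size of the prompt context (default: %d, 0 = loaded from model). "
        "Note: when using --parallel N, this is the TOTAL context divided among all slots, "
        "not per-slot. For X tokens per slot with N parallel slots, use --ctx-size X*N",
        n_ctx);
    return std::string(buf);
}
```

Dropping the trailing space in the first or second literal would run two sentences together in the rendered help output, which is worth checking when editing these strings.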
