This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Commit 5973770

add customization for batch size

1 parent: c2a0ff9

File tree: 2 files changed (+2, -1 lines)

README.md (1 addition, 0 deletions)

````diff
@@ -107,6 +107,7 @@ Table of parameters
 | `system_prompt` | String | The prompt to use for system rules. |
 | `pre_prompt` | String | The prompt to use for internal configuration. |
 | `cpu_threads` | Integer | The number of threads to use for inferencing (CPU MODE ONLY) |
+| `n_batch` | Integer | The batch size for prompt eval step |
 
 ***OPTIONAL***: You can run Nitro on a different port like 5000 instead of 3928 by running it manually in terminal
 ```zsh
````
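Taken together with the parameters already in the table, a load-model request body that sets the new `n_batch` field might look like the sketch below (values are illustrative defaults drawn from this commit, not a complete or authoritative request):

```json
{
  "ctx_len": 2048,
  "embedding": true,
  "cpu_threads": 4,
  "n_parallel": 1,
  "n_batch": 512
}
```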

controllers/llamaCPP.cc (1 addition, 1 deletion)

```diff
@@ -376,7 +376,7 @@ void llamaCPP::loadModel(
   params.n_ctx = (*jsonBody).get("ctx_len", 2048).asInt();
   params.embedding = (*jsonBody).get("embedding", true).asBool();
   // Check if n_parallel exists in jsonBody, if not, set to drogon_thread
-
+  params.n_batch = (*jsonBody).get("n_batch",512).asInt();
   params.n_parallel = (*jsonBody).get("n_parallel", drogon_thread).asInt();
   params.n_threads =
       (*jsonBody)
```

0 commit comments