NVIDIA-NeMo · lbliii · May 29, 2026 · May 29, 2026
diff --git a/fern/versions/latest/pages/model-server/local-vllm-proxy.mdx b/fern/versions/latest/pages/model-server/local-vllm-proxy.mdx
@@ -10,12 +10,12 @@ It is a subclass of VLLMModel, so it accepts the same configuration fields, but
 ## When to use it
 
 Use a proxy when you need several model servers that share **one** vLLM deployment but differ in their request-time configuration.
-For example, one server with reasoning enabled and one with reasoning disabled in through the request params, or servers with different sampling parameters.
+For example, one server with reasoning enabled and one with reasoning disabled through the request params, or servers with different sampling parameters.
 Without the proxy you would have to launch a separate vLLM engine (and duplicate GPUs) for each variation.
 
 At startup the proxy waits for its referenced LocalVLLMModel to come up, reads that server's inner vLLM endpoint (`base_url`, `api_key`, `model`), and routes all of its own requests there.
 
-If you are working with an extising vLLM endpoint that you manage outside of Gym, use [VLLMModel](/model-server/vllm) instead.
+If you are working with an existing vLLM endpoint that you manage outside of Gym, use [VLLMModel](/model-server/vllm) instead.
 
 ## Configuration
 
@@ -54,4 +54,4 @@ ng_run "+config_paths=[${config_paths}]" \
 | `model_server` | `ModelServerRef` | — | **Required.** The LocalVLLMModel server to forward requests to, by `type` and `name`. |
 
 `base_url`, `api_key`, and `model` are populated automatically from the backing server and should **not** be set in your config.
-All other VLLMModel fields (`chat_template_kwargs`, `extra_body`, `return_token_id_information`, etc.) behave as documented in the [VLLMModel configuration reference](/model-server/vllm#vllmmodel-configuration-reference).
+All other VLLMModel fields (`chat_template_kwargs`, `extra_body`, `return_token_id_information`, and so on) behave as documented in the [VLLMModel configuration reference](/model-server/vllm#vllmmodel-configuration-reference).
diff --git a/fern/versions/latest/pages/model-server/local-vllm.mdx b/fern/versions/latest/pages/model-server/local-vllm.mdx
@@ -8,7 +8,7 @@ NeMo Gym can launch and manage the vLLM server for you using LocalVLLMModel (in
 LocalVLLMModel is a subclass of VLLMModel that spawns the vLLM engine and auto-configures the model server to use it.
 The Chat Completions to Responses API conversion is inherited from VLLMModel. See [VLLMModel](/model-server/vllm) for details.
 
-A single LocalVLLMModel deployment can back multiple model servers, even when they need different request-time settings (e.g. sampling parameters or reasoning on/off).
+A single LocalVLLMModel deployment can back multiple model servers, even when they need different request-time settings (for example, sampling parameters or reasoning on or off).
 See [Local vLLM Proxy](/model-server/local-vllm-proxy) for this configuration.
 
 <Note>
@@ -49,7 +49,7 @@ LocalVLLMModel inherits all fields from VLLMModel (see [VLLMModel configuration
 |-----------|------|---------|-------------|
 | `vllm_serve_kwargs` | `dict` | — | **Required.** Arguments passed through to `vllm serve`. See `vllm_serve_kwargs` below. |
 | `vllm_serve_env_vars` | `dict` | — | **Required.** Environment variables for the vLLM process. Must include `VLLM_RAY_DP_PACK_STRATEGY`. |
-| `hf_home` | `str` | `<cwd>/.cache/huggingface` | Hugging Face cache directory. Set this if you've already downloaded weights elsewhere. |
+| `hf_home` | `str` | `<cwd>/.cache/huggingface` | Hugging Face cache directory. Set this if you have already downloaded weights elsewhere. |
 | `debug` | `bool` | `false` | Print vLLM server logs to stderr. |
 | `show_vllm_engine_stats` | `bool` | `false` | Periodically log vLLM engine throughput stats. |
 | `ray_worker_py_executable` | `str` | `sys.executable` | Python interpreter Ray uses for worker processes. |
@@ -152,7 +152,7 @@ ng_run "+config_paths=[${config_paths}]"
 The following capabilities work the same as in VLLMModel. See [VLLMModel configuration reference](/model-server/vllm#vllmmodel-configuration-reference) for details.
 
 - **`chat_template_kwargs`**: override chat template behavior per model.
-- **`extra_body`**: pass vLLM-specific request parameters (e.g. `guided_json`, `reasoning.effort`).
+- **`extra_body`**: pass vLLM-specific request parameters (for example, `guided_json`, `reasoning.effort`).
 - **`return_token_id_information`**: enable for training workflows that need `prompt_token_ids`, `generation_token_ids`, and `generation_log_probs`.
 
 <Note>

diff --git a/fern/versions/latest/pages/model-server/vllm.mdx b/fern/versions/latest/pages/model-server/vllm.mdx
@@ -17,7 +17,7 @@ ng_run "+config_paths=[$config_paths]"
 ```
 
 <Note>
-VLLMModel connects NeMo Gym to a vLLM server that you start and manage yourself. If you'd prefer NeMo Gym to launch and manage vLLM itself, use LocalVLLMModel instead. See [LocalVLLMModel](/model-server/local-vllm) to learn more.
+VLLMModel connects NeMo Gym to a vLLM server that you start and manage yourself. If you would prefer NeMo Gym to launch and manage vLLM itself, use LocalVLLMModel instead. See [LocalVLLMModel](/model-server/local-vllm) to learn more.
 </Note>
 
 ## Use VLLMModel

diff --git a/responses_api_models/vllm_model/README.md b/responses_api_models/vllm_model/README.md
@@ -16,7 +16,7 @@ View the logs
 tail -f temp.log
 ```
 
-Once you see that server instances are up, call the server. If you see a model response here, then everything is working as intended!
+Once you see that server instances are up, call the server. If you see a model response here, then everything is working as intended.
 ```bash
 python responses_api_agents/simple_agent/client.py
 ```