
docs: showcase advanced training parameters in README #1

Open

AndresNinou wants to merge 1 commit into main from docs/advanced-training-params

Conversation


AndresNinou commented May 9, 2026

Summary

Adds an "Advanced training" section between "Run" and "Expected output" demonstrating the 8 new optional kwargs on `client.training.lora_finetune(...)` shipped in reflex-sdk 0.2.0:

  • `lora_alpha` — LoRA scaling factor [1, 256]
  • `lora_dropout` — LoRA dropout [0.0, 0.5]
  • `target_modules` — whitelisted pi0.5 modules to adapt
  • `warmup_steps` — LR schedule warmup [0, max_steps/2]
  • `gradient_checkpointing` — trade compute for VRAM
  • `freeze_vision_encoder` — freeze vision tower
  • `dtype` — `{"bfloat16", "float32"}`
  • `save_freq` — checkpoint cadence [50, max_steps]

Includes a single self-contained example with all 8 wired up to production-grade defaults, plus a bounds table so users know exactly what the server will accept. Calls out that LoRA-only kwargs are rejected on full fine-tunes.
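
For illustration, a minimal sketch of the rejection behavior described above. The client pattern follows the PR's README example, and the exact exception raised is an assumption, not documented SDK behavior:

```python
# Hypothetical sketch: LoRA-only kwargs are documented as rejected on
# full fine-tunes. Client pattern mirrors the PR's README example; the
# exception type is an assumption, not confirmed SDK behavior.
from reflex import Reflex

client = Reflex()
try:
    client.training.full_finetune(
        hf_source_uri="hf://lerobot/aloha_sim_transfer_cube_human",
        model_name="pi05-aloha-cube",
        lora_alpha=32,  # LoRA-only kwarg: expected to be rejected
    )
except Exception as err:  # exact error type/shape not specified in the PR
    print(f"rejected as documented: {err}")
```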

Requires `reflex-sdk >= 0.2.0`.

Test plan

  • README renders cleanly on GitHub
  • Code example matches the SDK function signature exactly
  • Bounds match server-side validation in `platform2/convex/trainingRuns.ts` (see the bounds-check sketch after this list)
  • Once 0.2.0 is published to PyPI, run the example end-to-end against the live API
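
A minimal sketch of the bounds being verified, mirroring the ranges documented in this PR's bounds table; the helper and its name are illustrative and not part of reflex-sdk:

```python
# Illustrative pre-flight check mirroring the documented server-side bounds.
# The ranges come from the PR's bounds table; this helper is hypothetical.
def validate_advanced_kwargs(max_steps: int, **kwargs) -> None:
    bounds = {
        "lora_alpha": (1, 256),
        "lora_dropout": (0.0, 0.5),
        "warmup_steps": (0, max_steps / 2),
        "save_freq": (50, max_steps),
    }
    for name, value in kwargs.items():
        if name in bounds:
            lo, hi = bounds[name]
            if not lo <= value <= hi:
                raise ValueError(f"{name}={value} outside [{lo}, {hi}]")
        elif name == "dtype" and value not in {"bfloat16", "float32"}:
            raise ValueError(f"dtype={value!r} must be 'bfloat16' or 'float32'")

validate_advanced_kwargs(max_steps=2000, lora_alpha=32, warmup_steps=200)
```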

🤖 Generated with Claude Code



Summary by CodeRabbit

  • Documentation
    • Added comprehensive documentation for advanced LoRA fine-tuning parameters, including usage examples, parameter bounds, defaults, and supported target modules.

Review Change Stack

Adds an "Advanced training" section between "Run" and "Expected
output" demonstrating the 8 new optional kwargs on
`client.training.lora_finetune(...)`: lora_alpha, lora_dropout,
target_modules, warmup_steps, gradient_checkpointing,
freeze_vision_encoder, dtype, save_freq.

Includes a single end-to-end example wiring up all 8 with
production-grade defaults, plus a bounds table and the pi0.5
target_modules whitelist. Notes that LoRA-only kwargs are
rejected on full fine-tunes.

Requires reflex-sdk >= 0.2.0.

coderabbitai Bot commented May 9, 2026

📝 Walkthrough

This PR adds an "Advanced training" section to the README documenting optional LoRA finetune kwargs supported by client.training.lora_finetune(). The update includes a usage example, a reference table with parameter defaults and bounds, the pi0.5 target_modules whitelist, and a note that these LoRA-specific kwargs are rejected when using client.training.full_finetune().

Changes

| Layer / File(s) | Summary |
|---|---|
| Advanced Training Documentation (`README.md`) | Adds "Advanced training" section describing eight optional LoRA finetune kwargs with Python usage example, parameter reference table with defaults/bounds, target_modules whitelist for pi0.5, and note that LoRA-specific kwargs are unsupported in full finetune. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding documentation about advanced training parameters to the README. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@README.md`:
- Around line 155-164: Clarify the README's `lora_alpha` default by replacing
"derived from `lora_rank`" with the explicit derivation formula or an example;
update the table row for `lora_alpha` to state the exact rule (e.g., "defaults
to 2 * lora_rank" or "defaults to lora_rank — typically 2 * lora_rank") and, if
applicable, add a short parenthetical example to show numeric behavior when
`lora_rank` is X; edit the README entry for `lora_alpha` (and nearby `lora_rank`
docs if present) so users see the concrete default computation and an example.
- Line 152: The README example uses job["training_job_id"] which is inconsistent
with the documented identifier name run_id; update the example to access the
returned identifier using job["run_id"] (or the equivalent run_id key) so the
snippet matches other references to run_id throughout the docs and examples.
- Around line 128-153: The README example uses a non-matching API surface
(Reflex()/lots of params) — update it to match quickstart.py by replacing the
import/instantiation and call to reflect the real SDK: use
reflex.Client(api_key=...) instead of Reflex(), and call training.lora_finetune
with the actual supported parameters (hf_source_uri, model_name, base_model,
epochs) or, if you want to keep the extended knobs, explicitly document that
only reflex-sdk>=0.1.4 supports them and list which parameters are accepted;
ensure the README's example and parameter names (training.lora_finetune,
hf_source_uri, model_name, base_model, epochs, reflex.Client) exactly match the
implementation in quickstart.py.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 5adbd97f-686a-4cfe-90ac-c505aa088e56

📥 Commits

Reviewing files that changed from the base of the PR and between 36fb33d and 52242f1.

📒 Files selected for processing (1)
  • README.md

Comment thread README.md
Comment on lines +128 to +153
```python
from reflex import Reflex

client = Reflex()
job = client.training.lora_finetune(
    hf_source_uri="hf://lerobot/aloha_sim_transfer_cube_human",
    model_name="pi05-aloha-cube",
    model_version="v1",
    # Core knobs:
    max_steps=2000,
    batch_size=8,
    learning_rate=1e-4,
    lora_rank=16,
    # Advanced — all optional, all bounds-checked server-side:
    lora_alpha=32,                # [1, 256]
    lora_dropout=0.05,            # [0.0, 0.5]
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    warmup_steps=200,             # [0, max_steps/2]
    gradient_checkpointing=True,  # save VRAM
    freeze_vision_encoder=True,   # standard for LoRA
    dtype="bfloat16",             # {"bfloat16", "float32"}
    save_freq=500,                # [50, max_steps]
)

print(job["training_job_id"])
```

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

🧩 Analysis chain

🌐 Web query:

What is the latest published version of reflex-sdk on PyPI?

💡 Result:

The latest published version of reflex-sdk on PyPI is v0.1.4.[1]


🏁 Script executed:

```bash
# First, understand the repository structure
git ls-files | head -20
```

Repository: reflex-inc/quickstart

Length of output: 151


🏁 Script executed:

```bash
# Look for SDK-related files, particularly the Reflex class and training API
fd -t f "reflex" | grep -E "\.(py|toml|md)$" | head -20
```

Repository: reflex-inc/quickstart

Length of output: 47


🏁 Script executed:

```bash
# Check for setup.py or pyproject.toml to understand version info
fd -t f -E "__pycache__" "(setup|pyproject)" | head -10
```

Repository: reflex-inc/quickstart

Length of output: 80


🏁 Script executed:

```bash
# Search for lora_finetune method definition
rg "lora_finetune" -t py
```

Repository: reflex-inc/quickstart

Length of output: 127


🏁 Script executed:

```bash
# Look for the Reflex class definition
rg "class Reflex" -t py
```

Repository: reflex-inc/quickstart

Length of output: 47


🏁 Script executed:

```bash
cat -n pyproject.toml
```

Repository: reflex-inc/quickstart

Length of output: 2101


🏁 Script executed:

```bash
cat -n quickstart.py
```

Repository: reflex-inc/quickstart

Length of output: 25221


🏁 Script executed:

```bash
# Get the specific lines from README to compare
sed -n '128,153p' README.md | cat -n
```

Repository: reflex-inc/quickstart

Length of output: 1240


Update the README code example to match the actual SDK in quickstart.py.

The documented API in README.md (lines 128–153) does not match the working implementation in quickstart.py (lines 154–159). The README shows parameters like model_version, max_steps, batch_size, lora_rank, lora_alpha, lora_dropout, target_modules, warmup_steps, gradient_checkpointing, freeze_vision_encoder, dtype, and save_freq, plus a different import and instantiation pattern (from reflex import Reflex / Reflex()), but the actual code uses only hf_source_uri, model_name, base_model, and epochs with reflex.Client(api_key=...). Developers following the README will encounter errors. Replace the README example with the actual working call pattern from quickstart.py, or document which parameters are genuinely supported by reflex-sdk>=0.1.4.
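
A minimal sketch of the corrected example this comment is asking for, using only the call pattern it says quickstart.py implements (`reflex.Client(api_key=...)` with `hf_source_uri`, `model_name`, `base_model`, `epochs`); the argument values are placeholders:

```python
# Sketch of the corrected README snippet per this review comment.
# Call pattern mirrors quickstart.py as described above; values are placeholders.
import reflex

client = reflex.Client(api_key="YOUR_API_KEY")
job = client.training.lora_finetune(
    hf_source_uri="hf://lerobot/aloha_sim_transfer_cube_human",
    model_name="pi05-aloha-cube",
    base_model="pi0.5",  # placeholder; actual base model string not shown in the review
    epochs=10,           # placeholder value
)
```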

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 128 - 153, The README example uses a non-matching API
surface (Reflex()/lots of params) — update it to match quickstart.py by
replacing the import/instantiation and call to reflect the real SDK: use
reflex.Client(api_key=...) instead of Reflex(), and call training.lora_finetune
with the actual supported parameters (hf_source_uri, model_name, base_model,
epochs) or, if you want to keep the extended knobs, explicitly document that
only reflex-sdk>=0.1.4 supports them and list which parameters are accepted;
ensure the README's example and parameter names (training.lora_finetune,
hf_source_uri, model_name, base_model, epochs, reflex.Client) exactly match the
implementation in quickstart.py.

Comment thread README.md
```python
    save_freq=500,                # [50, max_steps]
)

print(job["training_job_id"])
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Description: Check for identifier naming patterns in the README

rg -n 'run_id|training_job_id' README.md
```

Repository: reflex-inc/quickstart

Length of output: 653


Fix inconsistent key name in code example.

Line 152 accesses job["training_job_id"], but the documentation consistently references run_id as the returned identifier (lines 26, 192, 196, 246, 249). Update the code example to use the correct key name for consistency.
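
The one-line fix being requested, assuming the returned dict exposes the identifier under the `run_id` key as the rest of the docs imply:

```python
print(job["run_id"])  # was: print(job["training_job_id"])
```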

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` at line 152, The README example uses job["training_job_id"] which
is inconsistent with the documented identifier name run_id; update the example
to access the returned identifier using job["run_id"] (or the equivalent run_id
key) so the snippet matches other references to run_id throughout the docs and
examples.

Comment thread README.md
Comment on lines +155 to +164
| Kwarg | Default if omitted | Bounds | What it does |
|---|---|---|---|
| `lora_alpha` | derived from `lora_rank` | [1, 256] | LoRA scaling factor |
| `lora_dropout` | `0.0` | [0.0, 0.5] | Dropout on LoRA layers |
| `target_modules` | full pi0.5 set | whitelist below | Modules to LoRA-adapt |
| `warmup_steps` | `100` | [0, `max_steps/2`] | LR warmup length |
| `gradient_checkpointing` | `False` | bool | Trade compute for VRAM |
| `freeze_vision_encoder` | `True` | bool | Freeze the vision tower |
| `dtype` | `"bfloat16"` | `{"bfloat16", "float32"}` | Compute dtype |
| `save_freq` | `500` | [50, `max_steps`] | Steps between checkpoints |

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Consider clarifying the "derived from lora_rank" default.

The table states that lora_alpha defaults to a value "derived from lora_rank" when omitted. For users to understand the actual default behavior, consider documenting the derivation formula or providing an example (e.g., "typically 2 * lora_rank" or "defaults to lora_rank").
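
For example, under the convention this comment suggests (an assumption, since the actual derivation is not documented), the default works out to the value the README example sets explicitly:

```python
# Assumed convention: lora_alpha defaults to 2 * lora_rank.
# This mirrors the review's suggestion, not confirmed SDK behavior.
lora_rank = 16
lora_alpha = 2 * lora_rank  # -> 32, matching the README example's explicit value
```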

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 155 - 164, Clarify the README's `lora_alpha` default
by replacing "derived from `lora_rank`" with the explicit derivation formula or
an example; update the table row for `lora_alpha` to state the exact rule (e.g.,
"defaults to 2 * lora_rank" or "defaults to lora_rank — typically 2 *
lora_rank") and, if applicable, add a short parenthetical example to show
numeric behavior when `lora_rank` is X; edit the README entry for `lora_alpha`
(and nearby `lora_rank` docs if present) so users see the concrete default
computation and an example.

