
docs: showcase advanced training parameters in README #1

Open

AndresNinou wants to merge 1 commit into main from docs/advanced-training-params

Conversation


AndresNinou commented May 9, 2026

Summary

Adds an "Advanced training" section between "Run" and "Expected output" demonstrating the 8 new optional kwargs on `client.training.lora_finetune(...)` shipped in reflex-sdk 0.2.0:

  • `lora_alpha` — LoRA scaling factor [1, 256]
  • `lora_dropout` — LoRA dropout [0.0, 0.5]
  • `target_modules` — whitelisted pi0.5 modules to adapt
  • `warmup_steps` — LR schedule warmup [0, max_steps/2]
  • `gradient_checkpointing` — trade compute for VRAM
  • `freeze_vision_encoder` — freeze vision tower
  • `dtype` — `{"bfloat16", "float32"}`
  • `save_freq` — checkpoint cadence [50, max_steps]

Includes a single self-contained example with all 8 wired up to production-grade defaults, plus a bounds table so users know exactly what the server will accept. Calls out that LoRA-only kwargs are rejected on full fine-tunes.
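
For illustration, a minimal sketch of the rejection behavior described above. The client pattern follows the PR's README example, and the exact exception raised is an assumption, not documented SDK behavior:

```python
# Hypothetical sketch: LoRA-only kwargs are documented as rejected on
# full fine-tunes. Client pattern mirrors the PR's README example; the
# exception type is an assumption, not confirmed SDK behavior.
from reflex import Reflex

client = Reflex()
try:
    client.training.full_finetune(
        hf_source_uri="hf://lerobot/aloha_sim_transfer_cube_human",
        model_name="pi05-aloha-cube",
        lora_alpha=32,  # LoRA-only kwarg: expected to be rejected
    )
except Exception as err:  # exact error type/shape not specified in the PR
    print(f"rejected as documented: {err}")
```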

Requires `reflex-sdk >= 0.2.0`.

Test plan

  • README renders cleanly on GitHub
  • Code example matches the SDK function signature exactly
  • Bounds match server-side validation in `platform2/convex/trainingRuns.ts` (see the bounds-check sketch after this list)
  • Once 0.2.0 is published to PyPI, run the example end-to-end against the live API
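
A minimal sketch of the bounds being verified, mirroring the ranges documented in this PR's bounds table; the helper and its name are illustrative and not part of reflex-sdk:

```python
# Illustrative pre-flight check mirroring the documented server-side bounds.
# The ranges come from the PR's bounds table; this helper is hypothetical.
def validate_advanced_kwargs(max_steps: int, **kwargs) -> None:
    bounds = {
        "lora_alpha": (1, 256),
        "lora_dropout": (0.0, 0.5),
        "warmup_steps": (0, max_steps / 2),
        "save_freq": (50, max_steps),
    }
    for name, value in kwargs.items():
        if name in bounds:
            lo, hi = bounds[name]
            if not lo <= value <= hi:
                raise ValueError(f"{name}={value} outside [{lo}, {hi}]")
        elif name == "dtype" and value not in {"bfloat16", "float32"}:
            raise ValueError(f"dtype={value!r} must be 'bfloat16' or 'float32'")

validate_advanced_kwargs(max_steps=2000, lora_alpha=32, warmup_steps=200)
```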

🤖 Generated with Claude Code



Summary by CodeRabbit

  • Documentation
    • Added comprehensive documentation for advanced LoRA fine-tuning parameters, including usage examples, parameter bounds, defaults, and supported target modules.

Review Change Stack

Adds an "Advanced training" section between "Run" and "Expected
output" demonstrating the 8 new optional kwargs on
`client.training.lora_finetune(...)`: lora_alpha, lora_dropout,
target_modules, warmup_steps, gradient_checkpointing,
freeze_vision_encoder, dtype, save_freq.

Includes a single end-to-end example wiring up all 8 with
production-grade defaults, plus a bounds table and the pi0.5
target_modules whitelist. Notes that LoRA-only kwargs are
rejected on full fine-tunes.

Requires reflex-sdk >= 0.2.0.

coderabbitai Bot commented May 9, 2026

📝 Walkthrough

This PR adds an "Advanced training" section to the README documenting optional LoRA finetune kwargs supported by client.training.lora_finetune(). The update includes a usage example, a reference table with parameter defaults and bounds, the pi0.5 target_modules whitelist, and a note that these LoRA-specific kwargs are rejected when using client.training.full_finetune().

Changes

| Layer / File(s) | Summary |
|---|---|
| Advanced Training Documentation (`README.md`) | Adds "Advanced training" section describing eight optional LoRA finetune kwargs with Python usage example, parameter reference table with defaults/bounds, target_modules whitelist for pi0.5, and note that LoRA-specific kwargs are unsupported in full finetune. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding documentation about advanced training parameters to the README. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@README.md`:
- Around line 155-164: Clarify the README's `lora_alpha` default by replacing
"derived from `lora_rank`" with the explicit derivation formula or an example;
update the table row for `lora_alpha` to state the exact rule (e.g., "defaults
to 2 * lora_rank" or "defaults to lora_rank — typically 2 * lora_rank") and, if
applicable, add a short parenthetical example to show numeric behavior when
`lora_rank` is X; edit the README entry for `lora_alpha` (and nearby `lora_rank`
docs if present) so users see the concrete default computation and an example.
- Line 152: The README example uses job["training_job_id"] which is inconsistent
with the documented identifier name run_id; update the example to access the
returned identifier using job["run_id"] (or the equivalent run_id key) so the
snippet matches other references to run_id throughout the docs and examples.
- Around line 128-153: The README example uses a non-matching API surface
(Reflex()/lots of params) — update it to match quickstart.py by replacing the
import/instantiation and call to reflect the real SDK: use
reflex.Client(api_key=...) instead of Reflex(), and call training.lora_finetune
with the actual supported parameters (hf_source_uri, model_name, base_model,
epochs) or, if you want to keep the extended knobs, explicitly document that
only reflex-sdk>=0.1.4 supports them and list which parameters are accepted;
ensure the README's example and parameter names (training.lora_finetune,
hf_source_uri, model_name, base_model, epochs, reflex.Client) exactly match the
implementation in quickstart.py.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 5adbd97f-686a-4cfe-90ac-c505aa088e56

📥 Commits

Reviewing files that changed from the base of the PR and between 36fb33d and 52242f1.

📒 Files selected for processing (1)
  • README.md

Comment thread README.md
Comment on lines +128 to +153
```python
from reflex import Reflex

client = Reflex()
job = client.training.lora_finetune(
    hf_source_uri="hf://lerobot/aloha_sim_transfer_cube_human",
    model_name="pi05-aloha-cube",
    model_version="v1",
    # Core knobs:
    max_steps=2000,
    batch_size=8,
    learning_rate=1e-4,
    lora_rank=16,
    # Advanced — all optional, all bounds-checked server-side:
    lora_alpha=32,                # [1, 256]
    lora_dropout=0.05,            # [0.0, 0.5]
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    warmup_steps=200,             # [0, max_steps/2]
    gradient_checkpointing=True,  # save VRAM
    freeze_vision_encoder=True,   # standard for LoRA
    dtype="bfloat16",             # {"bfloat16", "float32"}
    save_freq=500,                # [50, max_steps]
)

print(job["training_job_id"])
```

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

🧩 Analysis chain

🌐 Web query:

What is the latest published version of reflex-sdk on PyPI?

💡 Result:

The latest published version of reflex-sdk on PyPI is v0.1.4.[1]


🏁 Script executed:

```bash
# First, understand the repository structure
git ls-files | head -20
```

Repository: reflex-inc/quickstart

Length of output: 151


🏁 Script executed:

```bash
# Look for SDK-related files, particularly the Reflex class and training API
fd -t f "reflex" | grep -E "\.(py|toml|md)$" | head -20
```

Repository: reflex-inc/quickstart

Length of output: 47


🏁 Script executed:

```bash
# Check for setup.py or pyproject.toml to understand version info
fd -t f -E "__pycache__" "(setup|pyproject)" | head -10
```

Repository: reflex-inc/quickstart

Length of output: 80


🏁 Script executed:

```bash
# Search for lora_finetune method definition
rg "lora_finetune" -t py
```

Repository: reflex-inc/quickstart

Length of output: 127


🏁 Script executed:

```bash
# Look for the Reflex class definition
rg "class Reflex" -t py
```

Repository: reflex-inc/quickstart

Length of output: 47


🏁 Script executed:

```bash
cat -n pyproject.toml
```

Repository: reflex-inc/quickstart

Length of output: 2101


🏁 Script executed:

```bash
cat -n quickstart.py
```

Repository: reflex-inc/quickstart

Length of output: 25221


🏁 Script executed:

```bash
# Get the specific lines from README to compare
sed -n '128,153p' README.md | cat -n
```

Repository: reflex-inc/quickstart

Length of output: 1240


Update the README code example to match the actual SDK in quickstart.py.

The documented API in README.md (lines 128–153) does not match the working implementation in quickstart.py (lines 154–159). The README shows parameters like model_version, max_steps, batch_size, lora_rank, lora_alpha, lora_dropout, target_modules, warmup_steps, gradient_checkpointing, freeze_vision_encoder, dtype, and save_freq, plus a different import and instantiation pattern (from reflex import Reflex / Reflex()), but the actual code uses only hf_source_uri, model_name, base_model, and epochs with reflex.Client(api_key=...). Developers following the README will encounter errors. Replace the README example with the actual working call pattern from quickstart.py, or document which parameters are genuinely supported by reflex-sdk>=0.1.4.
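
A minimal sketch of the corrected example this comment is asking for, using only the call pattern it says quickstart.py implements (`reflex.Client(api_key=...)` with `hf_source_uri`, `model_name`, `base_model`, `epochs`); the argument values are placeholders:

```python
# Sketch of the corrected README snippet per this review comment.
# Call pattern mirrors quickstart.py as described above; values are placeholders.
import reflex

client = reflex.Client(api_key="YOUR_API_KEY")
job = client.training.lora_finetune(
    hf_source_uri="hf://lerobot/aloha_sim_transfer_cube_human",
    model_name="pi05-aloha-cube",
    base_model="pi0.5",  # placeholder; actual base model string not shown in the review
    epochs=10,           # placeholder value
)
```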

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 128 - 153, The README example uses a non-matching API
surface (Reflex()/lots of params) — update it to match quickstart.py by
replacing the import/instantiation and call to reflect the real SDK: use
reflex.Client(api_key=...) instead of Reflex(), and call training.lora_finetune
with the actual supported parameters (hf_source_uri, model_name, base_model,
epochs) or, if you want to keep the extended knobs, explicitly document that
only reflex-sdk>=0.1.4 supports them and list which parameters are accepted;
ensure the README's example and parameter names (training.lora_finetune,
hf_source_uri, model_name, base_model, epochs, reflex.Client) exactly match the
implementation in quickstart.py.

Comment thread README.md
```python
    save_freq=500,                # [50, max_steps]
)

print(job["training_job_id"])
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Description: Check for identifier naming patterns in the README

rg -n 'run_id|training_job_id' README.md
```

Repository: reflex-inc/quickstart

Length of output: 653


Fix inconsistent key name in code example.

Line 152 accesses job["training_job_id"], but the documentation consistently references run_id as the returned identifier (lines 26, 192, 196, 246, 249). Update the code example to use the correct key name for consistency.
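
The one-line fix being requested, assuming the returned dict exposes the identifier under the `run_id` key as the rest of the docs imply:

```python
print(job["run_id"])  # was: print(job["training_job_id"])
```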

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` at line 152, The README example uses job["training_job_id"] which
is inconsistent with the documented identifier name run_id; update the example
to access the returned identifier using job["run_id"] (or the equivalent run_id
key) so the snippet matches other references to run_id throughout the docs and
examples.

Comment thread README.md
Comment on lines +155 to +164
| Kwarg | Default if omitted | Bounds | What it does |
|---|---|---|---|
| `lora_alpha` | derived from `lora_rank` | [1, 256] | LoRA scaling factor |
| `lora_dropout` | `0.0` | [0.0, 0.5] | Dropout on LoRA layers |
| `target_modules` | full pi0.5 set | whitelist below | Modules to LoRA-adapt |
| `warmup_steps` | `100` | [0, `max_steps/2`] | LR warmup length |
| `gradient_checkpointing` | `False` | bool | Trade compute for VRAM |
| `freeze_vision_encoder` | `True` | bool | Freeze the vision tower |
| `dtype` | `"bfloat16"` | `{"bfloat16", "float32"}` | Compute dtype |
| `save_freq` | `500` | [50, `max_steps`] | Steps between checkpoints |

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Consider clarifying the "derived from lora_rank" default.

The table states that lora_alpha defaults to a value "derived from lora_rank" when omitted. For users to understand the actual default behavior, consider documenting the derivation formula or providing an example (e.g., "typically 2 * lora_rank" or "defaults to lora_rank").
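
For example, under the convention this comment suggests (an assumption, since the actual derivation is not documented), the default works out to the value the README example sets explicitly:

```python
# Assumed convention: lora_alpha defaults to 2 * lora_rank.
# This mirrors the review's suggestion, not confirmed SDK behavior.
lora_rank = 16
lora_alpha = 2 * lora_rank  # -> 32, matching the README example's explicit value
```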

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 155 - 164, Clarify the README's `lora_alpha` default
by replacing "derived from `lora_rank`" with the explicit derivation formula or
an example; update the table row for `lora_alpha` to state the exact rule (e.g.,
"defaults to 2 * lora_rank" or "defaults to lora_rank — typically 2 *
lora_rank") and, if applicable, add a short parenthetical example to show
numeric behavior when `lora_rank` is X; edit the README entry for `lora_alpha`
(and nearby `lora_rank` docs if present) so users see the concrete default
computation and an example.

