docs: showcase advanced training parameters in README #1
base: main
@@ -116,6 +116,63 @@ sudo python quickstart.py --prompt "pick up the red cube"

---

## Advanced training

The quickstart submits a one-epoch LoRA fine-tune with sensible defaults. For real production work, `client.training.lora_finetune(...)` accepts eight extra optional kwargs that thread straight through to the LeRobot training pipeline. Every value is bounds-checked server-side before a GPU is provisioned, so a bad config fails fast and you don't pay for it.

```python
from reflex import Reflex

client = Reflex()
job = client.training.lora_finetune(
    hf_source_uri="hf://lerobot/aloha_sim_transfer_cube_human",
    model_name="pi05-aloha-cube",
    model_version="v1",
    # Core knobs:
    max_steps=2000,
    batch_size=8,
    learning_rate=1e-4,
    lora_rank=16,
    # Advanced — all optional, all bounds-checked server-side:
    lora_alpha=32,                # [1, 256]
    lora_dropout=0.05,            # [0.0, 0.5]
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    warmup_steps=200,             # [0, max_steps/2]
    gradient_checkpointing=True,  # save VRAM
    freeze_vision_encoder=True,   # standard for LoRA
    dtype="bfloat16",             # {"bfloat16", "float32"}
    save_freq=500,                # [50, max_steps]
)

print(job["training_job_id"])
```
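The server-side validator itself is not part of this diff; as a rough client-side mirror of the documented ranges (the function name and signature here are hypothetical, not the SDK's), the bounds checks could look like:

```python
def validate_lora_kwargs(max_steps, lora_alpha=None, lora_dropout=0.0,
                         warmup_steps=100, save_freq=500, dtype="bfloat16"):
    """Fail fast on out-of-range values, per the bounds documented above."""
    if lora_alpha is not None and not (1 <= lora_alpha <= 256):
        raise ValueError("lora_alpha must be in [1, 256]")
    if not (0.0 <= lora_dropout <= 0.5):
        raise ValueError("lora_dropout must be in [0.0, 0.5]")
    if not (0 <= warmup_steps <= max_steps / 2):
        raise ValueError("warmup_steps must be in [0, max_steps/2]")
    if not (50 <= save_freq <= max_steps):
        raise ValueError("save_freq must be in [50, max_steps]")
    if dtype not in ("bfloat16", "float32"):
        raise ValueError('dtype must be "bfloat16" or "float32"')
```

The values in the example above (`lora_alpha=32`, `warmup_steps=200` against `max_steps=2000`) all fall inside these ranges.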
|
Comment on lines
+128
to
+153
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🧩 Analysis chain🌐 Web query:
💡 Result: The latest published version of reflex-sdk on PyPI is v0.1.4.[1] Citations: 🏁 Script executed: # First, understand the repository structure
git ls-files | head -20Repository: reflex-inc/quickstart Length of output: 151 🏁 Script executed: # Look for SDK-related files, particularly the Reflex class and training API
fd -t f "reflex" | grep -E "\.(py|toml|md)$" | head -20Repository: reflex-inc/quickstart Length of output: 47 🏁 Script executed: # Check for setup.py or pyproject.toml to understand version info
fd -t f -E "__pycache__" "(setup|pyproject)" | head -10Repository: reflex-inc/quickstart Length of output: 80 🏁 Script executed: # Search for lora_finetune method definition
rg "lora_finetune" -t pyRepository: reflex-inc/quickstart Length of output: 127 🏁 Script executed: # Look for the Reflex class definition
rg "class Reflex" -t pyRepository: reflex-inc/quickstart Length of output: 47 🏁 Script executed: cat -n pyproject.tomlRepository: reflex-inc/quickstart Length of output: 2101 🏁 Script executed: cat -n quickstart.pyRepository: reflex-inc/quickstart Length of output: 25221 🏁 Script executed: # Get the specific lines from README to compare
sed -n '128,153p' README.md | cat -nRepository: reflex-inc/quickstart Length of output: 1240 Update the README code example to match the actual SDK in quickstart.py. The documented API in README.md (lines 128–153) does not match the working implementation in quickstart.py (lines 154–159). The README shows parameters like 🤖 Prompt for AI Agents |
| Kwarg | Default if omitted | Bounds | What it does |
|---|---|---|---|
| `lora_alpha` | derived from `lora_rank` | [1, 256] | LoRA scaling factor |
| `lora_dropout` | `0.0` | [0.0, 0.5] | Dropout on LoRA layers |
| `target_modules` | full pi0.5 set | whitelist below | Modules to LoRA-adapt |
| `warmup_steps` | `100` | [0, `max_steps/2`] | LR warmup length |
| `gradient_checkpointing` | `False` | bool | Trade compute for VRAM |
| `freeze_vision_encoder` | `True` | bool | Freeze the vision tower |
| `dtype` | `"bfloat16"` | `{"bfloat16", "float32"}` | Compute dtype |
| `save_freq` | `500` | [50, `max_steps`] | Steps between checkpoints |
Comment on lines +155 to +164 (🧹 Nitpick, 🔵 Trivial):

Consider clarifying the "derived from `lora_rank`" default. The table states that
`target_modules` whitelist for pi0.5: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`, `action_in_proj`, `action_out_proj`. Pass any subset.
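Because the whitelist is fixed, a subset can be checked locally before a job is submitted. A minimal sketch (the helper name is hypothetical):

```python
# Whitelist for pi0.5, as listed above:
PI05_TARGET_MODULES = {
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
    "action_in_proj", "action_out_proj",
}

def check_target_modules(modules):
    """Reject any module name outside the pi0.5 whitelist."""
    unknown = set(modules) - PI05_TARGET_MODULES
    if unknown:
        raise ValueError(f"not LoRA-adaptable on pi0.5: {sorted(unknown)}")
```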
The same kwargs work on `client.training.full_finetune(...)` except for the LoRA-specific ones (`lora_alpha`, `lora_dropout`, `target_modules`) — those are rejected on full fine-tunes.
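That rejection behavior could be sketched like this (the set and helper below are illustrative, not the SDK's actual code):

```python
# LoRA-specific kwargs, per the docs above:
LORA_ONLY_KWARGS = {"lora_alpha", "lora_dropout", "target_modules"}

def reject_lora_kwargs_on_full_finetune(**kwargs):
    """Full fine-tunes accept the shared kwargs but not the LoRA-specific ones."""
    bad = LORA_ONLY_KWARGS & kwargs.keys()
    if bad:
        raise ValueError(f"rejected on full fine-tunes: {sorted(bad)}")
```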
---

## Expected output
```
```
Fix inconsistent key name in code example.

Line 152 accesses `job["training_job_id"]`, but the documentation consistently references `run_id` as the returned identifier (lines 26, 192, 196, 246, 249). Update the code example to use the correct key name for consistency.
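Assuming the job payload is keyed the way the rest of the docs describe (the payload shape and value below are a hypothetical illustration), the reviewer's fix amounts to:

```python
# Hypothetical job payload; the docs reference "run_id", not "training_job_id":
job = {"run_id": "example-run"}  # placeholder value
print(job["run_id"])             # rather than job["training_job_id"]
```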