compare to remote HEAD #1

Draft
CharlieFRuan wants to merge 38 commits into remote-origin from main

@CharlieFRuan
Collaborator

No description provided.

joyemang33 and others added 26 commits March 19, 2026 17:56
Adds examples/evolve/ with the SkyRL training integration for the
EvolveAgent advisor RL loop (main_evolve.py + train_evolve.sh).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s=10

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
enable_auto_tool_choice + tool_call_parser=qwen3_coder (advisor uses get_call_code tool)
language_model_only=true + attention_backend=FLASH_ATTN

Intentionally omitting reasoning_parser so thinking tokens stay in
content and are captured in the training token sequence.
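
A minimal sketch of why leaving out the reasoning parser matters for training. It assumes Qwen-style `<think>...</think>` delimiters, and `split_reasoning` is a hypothetical stand-in for what a server-side reasoning parser does, not vLLM's actual implementation. The sample string is made up for illustration.

```python
import re

# Hypothetical stand-in for a reasoning parser: it separates the thinking
# span from the final answer. Assumes <think>...</think> delimiters.
THINK_RE = re.compile(r"<think>(.*?)</think>\s*", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, content) roughly the way a reasoning parser would."""
    match = THINK_RE.search(text)
    if match is None:
        return None, text
    reasoning = match.group(1).strip()
    content = text[:match.start()] + text[match.end():]
    return reasoning, content


# Illustrative model output (not taken from a real run).
raw = "<think>Try a smaller step size.</think>Call get_call_code next."

# With a parser enabled, the thinking span is moved out of `content`.
reasoning, parsed_content = split_reasoning(raw)

# With no parser (the choice in this commit), `content` is the raw output,
# so the thinking tokens remain part of the sequence used for training.
unparsed_content = raw
```

With the parser off, anything that consumes `content` sees the thinking span too, which is what lets it end up in the training token sequence.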
- CKPTS_DIR, EXPORTS_DIR, LOG_DIR → /data/qmang/outputs/ (avoid ~18GB checkpoints in home)
- HF_HOME → /data/qmang/hf_cache
- TRITON_CACHE_DIR → /data/qmang/triton_cache
- TORCH_HOME → /data/qmang/torch_cache
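
The cache redirection above could be expressed in Python roughly as follows. This is a sketch, not the code in the PR: `CACHE_ENV` and `apply_cache_env` are hypothetical names, and the output directories (CKPTS_DIR, EXPORTS_DIR, LOG_DIR) are left out because the commit message does not give their exact subpaths.

```python
import os

# Cache locations copied from the commit message above.
CACHE_ENV = {
    "HF_HOME": "/data/qmang/hf_cache",
    "TRITON_CACHE_DIR": "/data/qmang/triton_cache",
    "TORCH_HOME": "/data/qmang/torch_cache",
}


def apply_cache_env(env=os.environ):
    """Write the cache variables into `env` (a dict-like mapping) and return it."""
    for name, path in CACHE_ENV.items():
        env[name] = path
    return env
```

A call such as `apply_cache_env()` would most likely need to run before the libraries that read these variables are imported, since some of them resolve their cache paths at import time.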
…pyarrow fixes

- Configure all 8 GPUs for advisor vLLM + FSDP training (frozen solver uses GPT-5 via OpenAI API)
- Pin pyarrow>=20,<22 to avoid jemalloc background thread segfault in multiprocessing.spawn
- Set ARROW_DEFAULT_MEMORY_POOL=system and disable jemalloc background thread in runtime env
- Guard eval when eval_dataloader is None in trainer
- Add Qwen3.5 accuracy+thinking jinja2 template
- Add binary_search full_context example config
- Update uv.lock
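
Two of the bullets above can be sketched in a few lines. Both helpers are hypothetical and only illustrate the idea: `arrow_runtime_env` assumes that "runtime env" refers to a Ray-style `runtime_env` dict with an `env_vars` key, and `maybe_evaluate` assumes the trainer holds an `eval_dataloader` that may be `None`. The jemalloc background-thread setting is omitted because the commit message does not name the variable it uses.

```python
def arrow_runtime_env():
    """Environment for worker processes: use the system allocator for Arrow."""
    # Assumption: a Ray-style runtime_env dict; only the variable named in
    # the commit message is set here.
    return {"env_vars": {"ARROW_DEFAULT_MEMORY_POOL": "system"}}


def maybe_evaluate(eval_dataloader, evaluate_fn):
    """Run evaluation only when an eval dataloader was configured."""
    if eval_dataloader is None:
        # No eval data: skip instead of failing inside the eval loop.
        return None
    return evaluate_fn(eval_dataloader)
```

The guard returns `None` rather than raising, so a training run configured without evaluation data can proceed through its eval hooks unchanged.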
3 participants