Hi,
I trained a model using the Flow-GRPO training pipeline following the README setup:
- Base model: `Qwen/Qwen2.5-7B-Instruct` (served via vLLM)
- `MODEL_ENGINE` in `config.yaml`: `["trainable", "gpt-4o-mini", "gpt-4o-mini", "gpt-4o-mini"]` (i.e., I used `gpt-4o-mini` for the non-trainable model engines, while the trainable engine is the local vLLM-served model)
Now I'm trying to evaluate the trained checkpoint using the "AgentFlow Benchmark" section in the README. My understanding of the flow is:
- Serve the model checkpoint with vLLM
- Run the benchmark `run.sh` script

Question: Is that the correct evaluation flow?
If so, I'm confused about how to use my local training checkpoint with the provided benchmark scripts. The example scripts seem to be written for the published model on Hugging Face (`AgentFlow-7B/agentflow-planner-7b`). I'm not sure what to change in:
`serve_vllm.sh` and `run.sh`
…so they load my checkpoint saved under:
`checkpoints/AgentFlow_pro/AgentFlow_pro/global_step_*`
Could you please clarify what edits are needed and the recommended way to point the benchmark scripts to a locally trained checkpoint?
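For context, my current guess (purely an assumption on my part, since I couldn't find this spelled out in the README) is that I only need to swap the model path in `serve_vllm.sh` for the local checkpoint directory while keeping the served model name the benchmark expects, something like:

```shell
# My guess at the edit — the checkpoint subpath and flags below are
# assumptions, not taken from the repo's scripts.
# <N> is a placeholder for whichever global step I want to evaluate.
CKPT="checkpoints/AgentFlow_pro/AgentFlow_pro/global_step_<N>"

# Serve the local checkpoint under the same name the benchmark scripts
# use for the published model, so run.sh needs no changes.
vllm serve "$CKPT" \
  --served-model-name agentflow-planner-7b \
  --port 8000
```

Is that roughly right, or does the checkpoint first need to be converted/merged into a HuggingFace-format directory before vLLM can load it?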
Thanks!