Main repo for PRM-o1 research project

TODO

  1. Run an experiment on the steps produced by Llama 3.1 8B Instruct (how many exceed the length limit, what the distribution of step lengths is, etc.)
  2. Implement an LLM step-expansion routine that generates N candidate steps, each shorter than the max token limit, such that the resulting paths are semantically novel
  3. Implement A* search using the QVM and a PRM (maybe use the Skywork PRM?)
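
As a starting point for item 3, here is a minimal sketch of A* over partial reasoning paths. Nothing in it is implemented in this repo yet. It assumes that the PRM returns a per-step correctness probability in (0, 1], that "QVM" refers to a value model estimating the probability that a partial path reaches a correct answer, and that path cost is the sum of negative log PRM scores. The names `a_star_steps`, `expand`, `prm_score`, `qvm_value`, and `is_terminal` are all hypothetical, and the toy scorers stand in for real model calls.

```python
import heapq
import itertools
import math
from typing import Callable, List, Optional, Tuple

Path = Tuple[str, ...]  # a partial solution: the reasoning steps taken so far


def a_star_steps(
    question: str,
    expand: Callable[[str, Path], List[str]],    # hypothetical: proposes next steps
    prm_score: Callable[[str, Path], float],     # hypothetical: P(last step is correct)
    qvm_value: Callable[[str, Path], float],     # hypothetical: P(path leads to a correct answer)
    is_terminal: Callable[[Path], bool],
    max_expansions: int = 1000,
) -> Optional[Path]:
    """Return the first terminal path popped from the frontier, or None."""
    tie = itertools.count()  # tie-breaker so the heap never compares paths
    frontier = [(0.0, next(tie), 0.0, ())]  # entries are (f, tie, g, path)
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, _, g, path = heapq.heappop(frontier)
        if is_terminal(path):
            return path
        for step in expand(question, path):
            child = path + (step,)
            # g: accumulated cost, the sum of -log(PRM score) over steps.
            p = min(max(prm_score(question, child), 1e-9), 1.0)
            g_child = g - math.log(p)
            # h: estimated remaining cost, taken from the value model (assumption).
            v = min(max(qvm_value(question, child), 1e-9), 1.0)
            h_child = -math.log(v)
            heapq.heappush(frontier, (g_child + h_child, next(tie), g_child, child))
    return None


# Toy stand-ins for the models, only to make the sketch runnable.
def toy_expand(question: str, path: Path) -> List[str]:
    return ["good", "bad"]


def toy_prm(question: str, path: Path) -> float:
    return 0.9 if path[-1] == "good" else 0.3


def toy_qvm(question: str, path: Path) -> float:
    return 1.0  # gives h = 0, so the search reduces to uniform-cost search


def toy_terminal(path: Path) -> bool:
    return len(path) == 3


best = a_star_steps("toy question", toy_expand, toy_prm, toy_qvm, toy_terminal)
print(best)
```

A* only guarantees the lowest-cost path when the heuristic never overestimates the remaining cost. A learned value model gives no such guarantee, so with a real QVM this behaves as a best-first search guided by PRM and QVM scores.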

Notes