Updated Bayesian Optimization Workflow #14
…standard ReactCA simulation and Bayesian Optimization simulation.
Updated docstring to clarify jobflow usage.
Copied the original jobs.py file here.
Added example usage for creating a simulation flow in the docstring.
…etting temperatures; rxn-network uses string instead of float/int so the decimal point caused KeyErrors.
…orrectly, while making sure the reaction library path is exported in BO workflow when it is first generated to prevent mongoDB memory errors.
Update campaign handling to write JSON to a file for next trial.
Added functionality to optimize individual precursor amounts using ratio parameters.
workflow to build 1 common reaction library and share
Ray auto-detects the full node CPU count (e.g. 256) instead of the SLURM-allocated CPUs, causing it to spawn task workers that hang on startup. Initialize Ray with SLURM_NTASKS * SLURM_CPUS_PER_TASK before run_enumerators / get_scored_rxns to keep tasks within the allocated pool.
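As a rough illustration of the Ray fix described in the last commit message, the CPU budget could be derived from the SLURM environment along these lines. This is only a sketch: the helper names `slurm_cpu_budget` and `init_ray_for_slurm` are hypothetical and are not taken from the PR, and falling back to 1 when a variable is unset is an assumption.

```python
import os


def slurm_cpu_budget(env=None, default=1):
    """Return SLURM_NTASKS * SLURM_CPUS_PER_TASK (hypothetical helper).

    Missing variables fall back to `default`; that fallback is an assumption.
    """
    env = os.environ if env is None else env
    ntasks = int(env.get("SLURM_NTASKS", default))
    cpus_per_task = int(env.get("SLURM_CPUS_PER_TASK", default))
    return max(1, ntasks * cpus_per_task)


def init_ray_for_slurm():
    """Start Ray with an explicit CPU count instead of auto-detection."""
    import ray  # imported lazily so the helper above works without Ray

    if not ray.is_initialized():
        ray.init(num_cpus=slurm_cpu_budget())
```

Calling `init_ray_for_slurm()` before the enumeration and scoring steps should keep Ray from sizing its worker pool to the whole node.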
mcgalcode left a comment
This looks good to me! Not sure I can understand completely enough to be really critical, but nothing stood out to me.
Discrete rather than continuous: BayBE enumerates all candidate values
and selects via Thompson Sampling, avoiding the L-BFGS-B boundary-hang
that occurs when a continuous parameter's optimum sits at its lower bound
Just out of interest: was that boundary hang thing a problem? Did you resolve it somehow by using this Thompson Sampling?
I think I need to do more testing on this. Making the precursor ratio a discrete variable was a semi-temporary fix and I was planning on looking at it more, but after talking to KP she said it'd be okay to not touch precursor ratios at all for now.
import multiprocessing as mp

_scoring_globals = {}
def _pool_initializer(data: dict):
Thanks, this is a lot better than the naked global declaration lol
#!/usr/bin/env python3
"""Quick local test for the BOFlowMaker jobflow.

Runs 2 initial + 2 BO trials on a tiny 5x5 grid with 1 realization.
Did you mean to include this file? It looks like a debug script.
Oh shoot, yes, it's not supposed to be here, my bad. Can you get rid of it from your side, or do I have to do it?
You can just delete it, make a commit, and push again.
Okie, I did that, and I also made some changes on my fork. I'm assuming they automatically get added here? Or should I PR again?
search_space = SearchSpace.from_dict(search_space_config)

stub_objective = MockObjectiveFunction(
This seems out of place - shouldn't this be a real objective function?
I added this because the BayesianOptimizer that is needed to initialize the campaign requires an objective argument, even though _build_campaign doesn't require it. So it was meant as a placeholder, but I definitely think this can be written better.
"to prevent this (see search_space.add_precursor_ratio)."
)
from baybe.recommenders.pure.nonpredictive.sampling import RandomRecommender
recommendation = RandomRecommender().recommend(
Again out of curiosity, do you know what the different recommender options are?
I used the random recommender as a starting point, but there are ~10 recommenders that come with BayBE, so this is room for future improvement of the BO workflow. Perhaps it would be better to let users choose which recommender to use, instead of fixing it to RandomRecommender?
Yeah, maybe making this a parameter would be good? RandomRecommender could be the default.
…to improve speed.
This was added previously, but it's no longer needed and is causing errors, so it is being reverted.
Revamped the Bayesian Optimization workflows under /workflow/flows and /workflow/jobs to support precursor, temperature, and time-based optimization of reaction recipes.