Skip to content

Updated Bayesian Optimization Workflow#14

Open
jaeminy00 wants to merge 32 commits into
mcgalcode:masterfrom
jaeminy00:master
Open

Updated Bayesian Optimization Workflow#14
jaeminy00 wants to merge 32 commits into
mcgalcode:masterfrom
jaeminy00:master

Conversation

@jaeminy00
Copy link
Copy Markdown

Revamped the Bayesian Optimization workflows under /workflow/flows and /workflow/jobs to support precursor, temperature, and time-based optimization of reaction recipes.

jaeminy00 and others added 26 commits April 9, 2026 14:23
…standard ReactCA simulation and Bayesian Optimization simulation.
Updated docstring to clarify jobflow usage.
Copied original jobs.py file to here
Added example usage for creating a simulation flow in the docstring.
…etting temperatures; rxn-network uses string instead of float/int so the decimal point caused KeyErrors.
…orrectly, while making sure the reaction library path is exported in BO workflow when it is first generated to prevent mongoDB memory errors.
Update campaign handling to write JSON to a file for next trial.
Added functionality to optimize individual precursor amounts using ratio parameters.
workflow to build 1 common reaction library and share
  Ray auto-detects full node CPU count (e.g. 256) instead of the SLURM-
  allocated CPUs, causing it to spawn task workers that hang on startup.
  Initialize Ray with SLURM_NTASKS * SLURM_CPUS_PER_TASK before
  run_enumerators / get_scored_rxns to keep tasks within the allocated pool.
Copy link
Copy Markdown
Owner

@mcgalcode mcgalcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! Not sure I can understand completely enough to be really critical, but nothing stood out to me.


Discrete rather than continuous: BayBE enumerates all candidate values
and selects via Thompson Sampling, avoiding the L-BFGS-B boundary-hang
that occurs when a continuous parameter's optimum sits at its lower bound
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just out of interest: was that boundary hang thing a problem? Did you resolve it somehow by using this Thompson Sampling?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I need to do more testing on this; making precursor ratio as a discrete variable was a semi-temporary fix and I was planning on looking at it more, but after talking to KP she said it'd be okay to not touch precursor ratios at all for now.

import multiprocessing as mp

_scoring_globals = {}
def _pool_initializer(data: dict):
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is a lot better than the naked global declaration lol

Comment thread src/rxn_ca/workflow/flows/ Outdated
#!/usr/bin/env python3
"""Quick local test for the BOFlowMaker jobflow.

Runs 2 initial + 2 BO trials on a tiny 5x5 grid with 1 realization.
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to include this file? It looks like a debug script.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh shoot yes it's not supposed to be here my bad–can you get rid of it from your side or do I have to do it?

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just delete it, make a commit, and push again.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okie I did that and I also made some changes on my fork, I'm assuming they automatically get added here? Or should I PR again?


search_space = SearchSpace.from_dict(search_space_config)

stub_objective = MockObjectiveFunction(
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems out of place - shouldn't this be a real objective function?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this because BayesianOptimizer needed to initialize the campaign requires objective argument, but _build_campaign doesn't require it. So it was like a placeholder, but I definitely think this can be written better.

"to prevent this (see search_space.add_precursor_ratio)."
)
from baybe.recommenders.pure.nonpredictive.sampling import RandomRecommender
recommendation = RandomRecommender().recommend(
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again out of curiosity, do you know what the different recommender options are?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the random recommender as a starting point, but there's ~10 recommenders that come with BayBE–this is future room for improvement of this BO workflow. Perhaps it would be better if we allow users to choose which recommender to use, instead of fixing it to ReandomRecommender?

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, maybe making this a parameter would be good? RandomRecommender could be the default.

jaeminy00 added 3 commits May 27, 2026 14:26
This was added previously but it's nolonger needed and is causing
errors. Therefore reverting it back.
@jaeminy00
Copy link
Copy Markdown
Author

Note that the force-push from 77f533c to 74bcf32 was done to change the github account and email that was used to commit (HPC configuration wasn't done properly, so a different, automatically generated github account was used for the commits; overwrote the authorship)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants