As drafted by Claude:
Agent Ensemble: Orchestrating LLM Agents with libEnsemble
Goal
Brainstorm and plan how libEnsemble can orchestrate ensembles of LLM agents, with generators as planning/reasoning agents and simulators as tool-using/execution agents.
Constraints & Preferences
- Use the modern gest-api `Generator` class and dataclass specs (`SimSpecs`, `GenSpecs`, etc.), not legacy dicts.
- Prefer zero or minimal changes to libEnsemble core for initial implementation.
- VOCS from `gest_api.vocs`, not xopt.
- Keep it as simple as possible.
Architecture Mapping
| libEnsemble Concept | Agent Concept | Notes |
|---|---|---|
| Generator | Planning/reasoning agent | `suggest()` produces tasks, `ingest()` learns from results |
| Simulator | Execution/tool-using agent | Receives task, performs work, returns results |
| Manager | Orchestrator | Routes tasks, maintains history, enforces exit criteria |
| VOCS | Task schema | Defines variables, objectives, constraints |
| History | Shared memory | NumPy structured array tracking all inputs/outputs |
| Allocation function | Scheduling policy | Decides which workers get which tasks |
Levels of Ambition
Level 1: LLM as Generator
The generator is an LLM that produces candidate configurations (e.g., hyperparameters, prompts, code variants). The simulator is a traditional function that evaluates them. The LLM replaces a numerical optimization algorithm.
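A minimal sketch of Level 1, assuming a hypothetical `llm` callable that takes a prompt string and returns a JSON array as text. The class below does not subclass the gest-api `Generator`; it only mirrors the `suggest`/`ingest` shape described in this note, and `fake_llm` stands in for a real model call.

```python
import json
from typing import Callable


class LLMGeneratorSketch:
    """Mirrors the suggest/ingest shape; not a real gest-api Generator subclass."""

    def __init__(self, variable_names: list[str], llm: Callable[[str], str]):
        self.variable_names = variable_names
        self.llm = llm  # hypothetical: prompt in, JSON text out
        self.history: list[dict] = []

    def suggest(self, num_points: int) -> list[dict]:
        prompt = (
            f"Propose {num_points} candidates as a JSON array of objects "
            f"with keys {self.variable_names}. "
            f"Past results: {json.dumps(self.history)}"
        )
        candidates = json.loads(self.llm(prompt))
        # Keep only well-formed candidates that carry every expected key.
        expected = set(self.variable_names)
        valid = [c for c in candidates if isinstance(c, dict) and expected <= set(c)]
        return valid[:num_points]

    def ingest(self, results: list[dict]) -> None:
        # Evaluated points are stored and fed back into the next prompt.
        self.history.extend(results)


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call; always returns two fixed candidates."""
    return json.dumps([{"x1": 0.2, "x2": 3.0}, {"x1": 0.8, "x2": 7.5}])


gen = LLMGeneratorSketch(["x1", "x2"], fake_llm)
points = gen.suggest(2)
gen.ingest([{**p, "y1": p["x1"] + p["x2"]} for p in points])
```

Filtering malformed candidates inside `suggest` is one possible design choice; a real implementation might instead re-prompt the model until enough valid points are returned.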
Level 2: LLM as Simulator
The simulator is an LLM that evaluates or processes inputs (e.g., code review, text analysis, classification). The generator can be a traditional sampling function. Workers run LLM inference calls in parallel.
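A rough sketch of Level 2, assuming a hypothetical `score_with_llm` helper in place of a real inference call. Threads stand in for libEnsemble workers here purely to illustrate parallel, I/O-bound calls; the free-text `text` field is illustrative only and is not a VOCS variable.

```python
from concurrent.futures import ThreadPoolExecutor


def score_with_llm(text: str) -> float:
    """Hypothetical stand-in for an LLM call; scores by word count."""
    return float(len(text.split()))


def llm_sim(input_dict: dict, **kwargs) -> dict:
    """Dict-in, dict-out simulator shape: one input point, one result."""
    return {"quality_score": score_with_llm(input_dict["text"])}


inputs = [{"text": "fix the off by one bug"}, {"text": "add tests"}]

# Each call runs in its own thread, so slow model calls can overlap.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(llm_sim, inputs))
```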
Level 3: Full Agent Loop
Both generator and simulator are LLM-backed. The generator is a planner that reasons about what to try next, and the simulator is an executor that carries out tasks using tools. This is a multi-agent system orchestrated by libEnsemble's manager.
Key Findings
Natural Fit
- The modern gest-api interface is already `list[dict]`-based, which maps naturally to LLM structured output and function-calling features.
  - `Generator.suggest(num_points) -> list[dict]`: each dict has VOCS variable names as keys, scalar values.
  - `Generator.ingest(results: list[dict]) -> None`: receives all VOCS fields back. Default is no-op.
  - `Generator.returns_id: bool = False`: if True, `suggest` includes an `_id` key and `ingest` receives it back.
- Modern simulator: `def sim_f(input_dict: dict, **kwargs) -> dict`, auto-wrapped via `gest_api_sim`.
- libEnsemble supports async returns, persistent workers, `active_recv`, and multiple comm backends (local/threads, MPI, TCP), all helpful for slow LLM calls.
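To illustrate how bounded numeric variables could map to an LLM structured-output schema, here is a small sketch. It assumes a plain `{name: [low, high]}` dict shaped like the `variables` argument used elsewhere in this note; it does not read from a real VOCS object, and `schema_from_bounds` and `is_valid` are hypothetical helper names.

```python
def schema_from_bounds(bounds: dict[str, list[float]]) -> dict:
    """Build a JSON-Schema-style object description from {name: [low, high]}."""
    return {
        "type": "object",
        "properties": {
            name: {"type": "number", "minimum": low, "maximum": high}
            for name, (low, high) in bounds.items()
        },
        "required": list(bounds),
        "additionalProperties": False,
    }


def is_valid(point: dict, bounds: dict[str, list[float]]) -> bool:
    """Check that a candidate has exactly the expected keys, each within bounds."""
    if set(point) != set(bounds):
        return False
    return all(low <= point[name] <= high for name, (low, high) in bounds.items())


bounds = {"x1": [0.0, 1.0], "x2": [0.0, 10.0]}
schema = schema_from_bounds(bounds)
```

The schema dict could be passed to whichever structured-output or function-calling feature the chosen LLM provider offers, while `is_valid` gives a cheap local check before points are handed to the manager.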
Friction Points
- VOCS variables are numeric with bounds. `ContinuousVariable` has `domain=[low, high]`, `DiscreteVariable` is set-based. There is no free-text/string variable type.
- Workarounds for unstructured data:
  - Task-ID indexing: use a numeric `task_id` in VOCS; the generator maintains an internal mapping of `task_id -> task_description`.
  - Extend VOCS with `StringVariable` (upstream gest-api change).
  - Use `user`/`constants` side channels for metadata.
- Context window limits: `ingest()` receives results, but LLMs have finite context. May need summarization or RAG over History.
- Token/cost management: no built-in mechanism; could use VOCS constraints or History tracking.
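Two of the workarounds above can be sketched in a few lines: a registry that maps numeric `task_id` values to free-text descriptions, and a simple top-k summary that bounds how much history is placed in the next prompt. `TaskRegistry` and `summarize` are hypothetical names, and top-k selection is only one stand-in for summarization or RAG.

```python
class TaskRegistry:
    """Maps numeric task_id values to free-text task descriptions."""

    def __init__(self):
        self._tasks: dict[int, str] = {}

    def register(self, description: str) -> int:
        # IDs are assigned sequentially, so they stay numeric and bounded.
        task_id = len(self._tasks)
        self._tasks[task_id] = description
        return task_id

    def describe(self, task_id: int) -> str:
        return self._tasks[int(task_id)]


def summarize(results: list[dict], objective: str, keep: int = 3) -> str:
    """Keep only the `keep` best results (highest objective) for the next prompt."""
    best = sorted(results, key=lambda r: r[objective], reverse=True)[:keep]
    return "\n".join(f"task {r['task_id']}: {objective}={r[objective]}" for r in best)


registry = TaskRegistry()
ids = [registry.register(d) for d in ("refactor parser", "add caching", "write docs")]
results = [{"task_id": i, "quality_score": s} for i, s in zip(ids, (0.4, 0.9, 0.1))]
context = summarize(results, "quality_score", keep=2)
```

Only the numeric `task_id` would travel through VOCS and History; the descriptions stay inside the generator, which looks them up with `describe` when building prompts.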
Key Decisions
- Start with Level 1 + Level 2 (LLM as Generator + LLM as Simulator), requiring zero core changes and using the existing gest-api interface.
- Use numeric `task_id` indexing initially to avoid needing string VOCS variables.
- Generator maintains an internal mapping of `task_id -> task_description`.
- LLM structured output / function-calling features map naturally to VOCS-derived JSON schemas.
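Next Steps
- `LLMGenerator` (Generator subclass) and `llm_simulator` function
- `ingest()` (summarization/RAG over History)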
Relevant Files
| File | Description |
|---|---|
| `libensemble/generators.py` | `LibensembleGenerator`, `PersistentGenInterfacer` base classes |
| `libensemble/gen_classes/external/sampling.py` | Pure gest-api `UniformSample`, `UniformSampleArray` |
| `libensemble/gen_classes/sampling.py` | `UniformSample` via `LibensembleGenerator` |
| `libensemble/gen_classes/gpCAM.py` | `GP_CAM`, `GP_CAM_Covar` (complex generator example with `ingest`) |
| `libensemble/gen_classes/aposmm.py` | `APOSMM` via `PersistentGenInterfacer` |
| `libensemble/specs.py` | `SimSpecs`, `GenSpecs`, `AllocSpecs`, `ExitCriteria`, `LibeSpecs` dataclasses |
| `libensemble/ensemble.py` | Primary `Ensemble` interface |
| `libensemble/worker.py` | Worker execution loop (recv -> handle -> send) |
| `libensemble/manager.py` | Manager coordination, History updates, allocation calls |
| `libensemble/comms/` | Communication backends (local, MPI, TCP) |
| `libensemble/tests/regression_tests/test_1d_sampling.py` | Minimal example |
```python
from gest_api.vocs import VOCS
import numpy as np

# Placeholder helpers (application_output, init_model, call_llm, init_llm) are
# not defined here; they stand in for application- and model-specific code.

typical_vocs = VOCS(
    variables={"x1": [0, 1.0], "x2": [0, 10.0]},
    objectives={"y1": "MINIMIZE"},
    constraints={"c1": ["GREATER_THAN", 0.5]},
    constants={"constant1": 1.0},
)

...

def typical_simulator(Input, persis_info, sim_specs, libE_info):
    Output = np.zeros(Input.shape, dtype=sim_specs.output_dtype)
    for i in range(Input.shape[0]):
        # Write to the objective named in typical_vocs; x1 is used for illustration.
        Output["y1"][i] = Input["x1"][i] * application_output()
    return Output

...

class TypicalGenerator:
    def __init__(self, vocs):
        self.vocs = vocs
        self.model = init_model(self.vocs)

    def suggest(self, num_points):
        return self.model.suggest(num_points)

    def ingest(self, points):
        self.model.ingest(points)

...

agentic_vocs = VOCS(
    variables={
        "possibilities": ["AMD example", "CUDA example"],
        "strategies": ["single gpu", "multi gpu"],
    },
    objectives={"quality_score": "MAXIMIZE"},
    constraints={"cost": ["LESS_THAN", 100]},
)

...

def agentic_simulator(Input, persis_info, sim_specs, libE_info):
    Output = np.zeros(Input.shape, dtype=sim_specs.output_dtype)
    prompt_base = "Run the {} using {}"
    for i in range(Input.shape[0]):
        prompt = prompt_base.format(Input[i]["possibilities"], Input[i]["strategies"])
        # call_llm is assumed to return a numeric quality score.
        Output["quality_score"][i] = call_llm(prompt)
    return Output

...

class AgenticGenerator:
    def __init__(self, vocs):
        self.vocs = vocs
        self.model = init_llm("You're an agent that proposes GPU strategies.")
        # List the allowed choices for each variable in the prompt.
        self.prompt = ""
        for var, obj in vocs.variables.items():
            self.prompt += f"{var}: {', '.join(obj)}\n"
        self.prompt += (
            "You must choose from the above possibilities and strategies, "
            "but explain why you've made your choice."
        )

    def suggest(self, num_points):
        # One chat call per requested point; each reply is assumed to be a dict
        # keyed by VOCS variable names.
        request = self.prompt + "\nPlease provide a strategy"
        return [self.model.chat(request) for _ in range(num_points)]

    def ingest(self, points):
        self.model.chat("Here are the results: " + str(points))
```