NVIDIA-NeMo · cmunley1 · May 30, 2026 · May 30, 2026
diff --git a/environments/ether0/README.md b/environments/ether0/README.md
@@ -0,0 +1,41 @@
+# ether0 benchmark environment
+
+[Benchmark](https://huggingface.co/datasets/futurehouse/ether0-benchmark) and [paper](https://arxiv.org/pdf/2506.17238).
+
+325 chemistry reasoning questions across 14 task types, with around 25 questions per task. Every answer is a molecule. Task types include:
+
+- Completing SMILES fragments
+- Designing molecules adhering to molecular formula and functional group constraints
+- Predicting reaction outcomes
+- Proposing one-step synthesis pathways
+- Editing the solubility of a molecule
+- Converting IUPAC name to SMILES
+- Answering multiple-choice questions about safety, ADME properties, BBB permeability, toxicity, scent, and pKa
+
+Note that the retro-synthesis and oracle-solubility tasks require an additional verifier server (see `ether0-serve` in the [ether0 repo](https://github.com/Future-House/ether0/)).
+
+## Quickstart
+
+Create `env.yaml`:
+```yaml
+policy_base_url: http://localhost:8000/v1
+policy_api_key: EMPTY
+policy_model_name: futurehouse/ether0
+```
+
+Start the servers and collect rollouts:
+```bash
+# start vllm and nemo gym servers
+vllm serve futurehouse/ether0 &
+ng_run "+config_paths=[environments/ether0/config.yaml,responses_api_models/vllm_model/configs/vllm_model.yaml]" &
+
+# wait for above to be ready
+ng_collect_rollouts \
+    +agent_name=ether0_simple_agent \
+    +input_jsonl_fpath=environments/ether0/data/example.jsonl \
+    +output_jsonl_fpath=environments/ether0/data/ether0_rollouts.jsonl
+
+tail -n 1 environments/ether0/data/ether0_rollouts.jsonl | jq . | less
+```
+
+See `prepare.py` to prepare the full dataset.
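
The quickstart leaves the "wait for above to be ready" step manual. A minimal sketch of how that wait could be scripted is shown here, assuming the vLLM server answers `GET /v1/models` under the `policy_base_url` from `env.yaml` once the model is loaded. The helper names `http_ok` and `wait_until_ready` are hypothetical and not part of NeMo Gym or vLLM, and the NeMo Gym server started by `ng_run` would need its own check, which is not covered here.

```python
import http.client
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP errors all mean "not ready yet".
        return False


def wait_until_ready(
    url: str,
    probe: Callable[[str], bool] = http_ok,
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe(url)` until it succeeds or `timeout_s` seconds have passed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Assumed endpoint: policy_base_url from env.yaml plus "/models".
    ready = wait_until_ready("http://localhost:8000/v1/models")
    print("vLLM ready" if ready else "timed out waiting for vLLM")
```

Running this between `ng_run` and `ng_collect_rollouts` would block until the policy model responds or the timeout expires. The probe and sleep functions are parameters so the loop can be exercised without a live server.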
diff --git a/environments/ether0/__init__.py b/environments/ether0/__init__.py
diff --git a/environments/ether0/config.yaml b/environments/ether0/config.yaml
@@ -0,0 +1,26 @@
+ether0:
+  resources_servers:
+    ether0:
+      entrypoint: app.py
+      domain: knowledge
+      verified: false
+      description: ether0 chemistry benchmark verifiers
+      value: Evaluate chemistry knowledge and reasoning with the ether0 benchmark
+ether0_simple_agent:
+  responses_api_agents:
+    simple_agent:
+      entrypoint: app.py
+      resources_server:
+        type: resources_servers
+        name: ether0
+      model_server:
+        type: responses_api_models
+        name: policy_model
+      datasets:
+      - name: example
+        type: example
+        jsonl_fpath: environments/ether0/data/example.jsonl
+      - name: val
+        type: validation
+        jsonl_fpath: environments/ether0/data/val.jsonl
+        license: Creative Commons Attribution 4.0 International
diff --git a/environments/ether0/data/.gitignore b/environments/ether0/data/.gitignore
@@ -0,0 +1,6 @@
+*train.jsonl
+*validation.jsonl
+*val.jsonl
+*train_prepare.jsonl
+*validation_prepare.jsonl
+*example_prepare.jsonl
diff --git a/environments/ether0/data/example.jsonl b/environments/ether0/data/example.jsonl
@@ -0,0 +1,5 @@
+{"responses_create_params": {"input": [{"role": "system", "content": "You are a scientific reasoning agent. Think step by step, then place your final answer inside <answer></answer> tags. For example: <answer>CCO</answer>"}, {"role": "user", "content": "Generate a SMILES representation for a molecule containing groups: charged and nitro. It should also have formula C13H12N6O5."}]}, "verifier_metadata": {"solution": "functional_group_eval!:!('C13H12N6O5', ['charged', 'nitro'])!:!functional-group", "problem_type": "functional-group", "ideal": "Cc1ncc([N+](=O)[O-])n1CC(=O)N/N=C/c1ccc([N+](=O)[O-])cc1", "id": "00c8bc2d-0bb3-53c2-8bdf-cd19616d4536"}, "agent_ref": {"type": "responses_api_agents", "name": "ether0_simple_agent"}}
+{"responses_create_params": {"input": [{"role": "system", "content": "You are a scientific reasoning agent. Think step by step, then place your final answer inside <answer></answer> tags. For example: <answer>CCO</answer>"}, {"role": "user", "content": "Among the following, which molecule is predicted to have a permeability in MDCK cells in MDR1-MDCK efflux ratio B-A/A-B close to 1.04?\nFC(C1=NC(C(NC2C(OCC)=CC3=NN(CCC(O)(C)C)C=C3C=2)=O)=CC=C1)F\nC(NC1=CC2=CN(N=C2C=C1C(C)(O)C)CCC(C)(O)C)(=O)C1N=C(C=CC=1)C(F)(F)F\nC12C=C(NC(=O)C3C=CN(C(F)F)N=3)C(OC)=CC1=NN(C=2)CCC(C)(C)O"}]}, "verifier_metadata": {"solution": "str_eval!:!FC(C1=NC(C(NC2C(OCC)=CC3=NN(CCC(O)(C)C)C=C3C=2)=O)=CC=C1)F!:!property-regression-adme/log_mdr1-mdck_er", "problem_type": "property-regression-adme/log_mdr1-mdck_er", "ideal": "CCOc1cc2nn(CCC(C)(C)O)cc2cc1NC(=O)c1cccc(C(F)F)n1", "id": "066b28c7-c991-5095-8045-a5da176c150a"}, "agent_ref": {"type": "responses_api_agents", "name": "ether0_simple_agent"}}
+{"responses_create_params": {"input": [{"role": "system", "content": "You are a scientific reasoning agent. Think step by step, then place your final answer inside <answer></answer> tags. For example: <answer>CCO</answer>"}, {"role": "user", "content": "A compound with formula C30H44O7 was isolated from Nerium oleander L.. What is a plausible SMILES for it given this organism?\nspecies: Nerium oleander L.\ntaxonomicGroup: Angiosperms\nhabitat: Temperate and subtropical areas, along river banks, stream beds in river valleys, roadsides, parks, coastal gardens\nlifestyle: Free-living\nmetabolicType: Photoautotrophic\ncellularOrganization: Multicellular\npresenceOfOrganelles: Mitochondria, chloroplasts\ncellWallComposition: Cellulose"}]}, "verifier_metadata": {"solution": "formula_eval!:!CO[C@@H]1C[C@H](O[C@H]2CC[C@@]3(C)[C@H](CC[C@]45CC[C@H](C6=CC(=O)OC6)[C@@](C)(CC[C@H]34)C5=O)C2)O[C@H](C)[C@@H]1O!:!molecule-formula", "problem_type": "molecule-formula", "ideal": "CO[C@@H]1C[C@H](O[C@H]2CC[C@@]3(C)[C@H](CC[C@]45CC[C@H](C6=CC(=O)OC6)[C@@](C)(CC[C@H]34)C5=O)C2)O[C@H](C)[C@@H]1O", "id": "a0e5657a-901a-5888-af3b-87c8c8471ea8"}, "agent_ref": {"type": "responses_api_agents", "name": "ether0_simple_agent"}}
+{"responses_create_params": {"input": [{"role": "system", "content": "You are a scientific reasoning agent. Think step by step, then place your final answer inside <answer></answer> tags. For example: <answer>CCO</answer>"}, {"role": "user", "content": "Identify which of the following molecules will most likely have a rat LD50 oral in mg/kg of 6.09:\nC(CS)(=O)O.C(O)CN\nFCC(=O)O\nClCC(=O)O"}]}, "verifier_metadata": {"solution": "str_eval!:!FCC(=O)O!:!property-regression-ld50", "problem_type": "property-regression-ld50", "ideal": "FCC(=O)O", "id": "6af247d8-aaec-5047-8b57-af1b42f9d38a"}, "agent_ref": {"type": "responses_api_agents", "name": "ether0_simple_agent"}}
+{"responses_create_params": {"input": [{"role": "system", "content": "You are a scientific reasoning agent. Think step by step, then place your final answer inside <answer></answer> tags. For example: <answer>CCO</answer>"}, {"role": "user", "content": "Given that molecule [N+](=O)([O-])C1=CC=CC2=CC=CC=C12 is toxic, select from below the molecule most expected to not have this characteristic:\n[N+](=O)([O-])C1=C(C)C(=CC=C1)[N+](=O)[O-]\n[N+](=O)([O-])C1=CC=CC2=N[Se]N=C21\nIC1=C(C=CC=C1[N+](=O)[O-])[N+](=O)[O-]\nBrC=1C=C(C2=CC=CC=C2C1)[N+](=O)[O-]"}]}, "verifier_metadata": {"solution": "str_eval!:!BrC=1C=C(C2=CC=CC=C2C1)[N+](=O)[O-]!:!property-cat-safety/delta-toxic", "problem_type": "property-cat-safety/delta-toxic", "ideal": "BrC=1C=C(C2=CC=CC=C2C1)[N+](=O)[O-]", "id": "ea5a9ab5-7207-5016-8465-4634f7db5437"}, "agent_ref": {"type": "responses_api_agents", "name": "ether0_simple_agent"}}
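
Each record above packs its grading information into `verifier_metadata.solution` as three fields joined by `!:!`, and the system prompt asks the model to wrap its final answer in `<answer></answer>` tags. The sketch below shows one way to take those two pieces apart for inspection. The names `parse_solution` and `extract_answer` are hypothetical, using the last `<answer>` tag when several appear is an assumption, and the actual scoring is done by the ether0 verifiers rather than by this code.

```python
import ast
import re
from typing import NamedTuple, Optional

SEPARATOR = "!:!"
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


class Solution(NamedTuple):
    eval_name: str     # e.g. "str_eval", "formula_eval", "functional_group_eval"
    payload: str       # reference SMILES or a Python-literal constraint tuple
    problem_type: str  # e.g. "property-regression-ld50"


def parse_solution(solution: str) -> Solution:
    """Split a solution string into its three "!:!"-separated fields."""
    parts = solution.split(SEPARATOR, 2)
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {solution!r}")
    return Solution(*parts)


def extract_answer(text: str) -> Optional[str]:
    """Return the contents of the last <answer> tag, or None if there is none."""
    matches = ANSWER_RE.findall(text)
    return matches[-1].strip() if matches else None


if __name__ == "__main__":
    # Solution strings copied from two of the example records.
    ld50 = parse_solution("str_eval!:!FCC(=O)O!:!property-regression-ld50")
    print(ld50.eval_name, ld50.payload, ld50.problem_type)

    group = parse_solution(
        "functional_group_eval!:!('C13H12N6O5', ['charged', 'nitro'])!:!functional-group"
    )
    # The functional-group payload is a Python literal: (formula, [groups]).
    formula, groups = ast.literal_eval(group.payload)
    print(formula, groups)

    print(extract_answer("Reasoning... <answer>FCC(=O)O</answer>"))
```

Comparing the extracted answer to the payload as raw strings would miss equivalent SMILES written differently, so this is useful for debugging rollouts rather than for computing rewards.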
diff --git a/environments/ether0/data/example_metrics.json b/environments/ether0/data/example_metrics.json
@@ -0,0 +1,38 @@
+{
+    "name": "example",
+    "type": "example",
+    "jsonl_fpath": "resources_servers/ether0/data/example.jsonl",
+    "num_repeats": 1,
+    "gitlab_identifier": null,
+    "huggingface_identifier": null,
+    "license": null,
+    "Number of examples": 5,
+    "Number of tools": {
+        "Total # non-null values": 0,
+        "Average": 0.0,
+        "Min": 0.0,
+        "Max": 0.0,
+        "Standard deviation": 0.0
+    },
+    "Json-dumped number of words (proxy for token count)": {
+        "Total # non-null values": 5,
+        "Average": 52.6,
+        "Min": 46.0,
+        "Max": 75.0,
+        "Standard deviation": 12.64
+    },
+    "Number of turns": {
+        "Total # non-null values": 5,
+        "Average": 1.0,
+        "Min": 1.0,
+        "Max": 1.0,
+        "Standard deviation": 0.0
+    },
+    "Temperature": {
+        "Total # non-null values": 0,
+        "Average": 0.0,
+        "Min": 0.0,
+        "Max": 0.0,
+        "Standard deviation": 0.0
+    }
+}