Emmi-AI · kinggongzilla · Mar 10, 2026 · Mar 11, 2026 · Mar 11, 2026 · Mar 11, 2026
@@ -123,7 +123,30 @@ You might be in a situation when your venv won't be configured as intended anymo
 ---
 # Quickstart
 
-You can run a training job immediately using the [tutorial](./tutorial/README.MD) configuration. For local development (Mac/CPU), use:
+> [!IMPORTANT]
+> Before training, you need a prepared dataset. To get started with the ShapeNet-Car dataset,
+> follow the download and preprocessing steps in the
+> [ShapeNet-Car dataset README](./src/noether/data/datasets/cfd/shapenet_car/README.MD).
+
+## Scaffold a New Project
+
+Use `noether-init` to generate a complete training project:
+
+```console
+uv run noether-init my_project --model upt --dataset shapenet_car --dataset-path /path/to/shapenet_car
+```
+
+Then train with:
+
+```console
+uv run noether-train --config-dir my_project/configs --config-name train +experiment=upt
+```
+
+See the [scaffolding tutorial](https://noether-docs.emmi.ai/tutorials/scaffolding_a_new_project.html) for all options and the generated project structure.
+
+## Run the Tutorial Example
+
+You can also run a training job immediately using the [tutorial](./tutorial/README.MD) configuration. For local development (Mac/CPU), use:
 
 ```console
 uv run noether-train --hp tutorial/configs/train_shapenet.yaml \

@@ -1,5 +1,8 @@
 ## `Noether` Starter Kit Project
 ----
+
+> You can use `noether-init` to automatically scaffold a complete project with your choice of model, dataset, and configuration. See the [scaffolding tutorial](https://noether-docs.emmi.ai/tutorials/scaffolding_a_new_project.html) for details.
+
 This folder contains skeleton/boilerplate code for a minimal working `Noether` training pipeline, including all required components.
 
 1.  A dataset that loads (and generates) dummy data.

@@ -183,6 +183,7 @@
     "**/*.ipynb",
     "**/*.md",
     "**/.venv/**",
+    "**/scaffold/template_files/**",
 ]
 
 

@@ -65,3 +65,32 @@ Verify your setup by running the ``estimate`` command, which fetches metadata an
     noether-data aws estimate noaa-goes16 ABI-L1b-RadC/2023/001/00/
 
 If you see no errors — congratulations, your setup works!
+
+Scaffolding a New Project
+-------------------------
+
+The ``noether-init`` command generates a complete Noether training project with all required modules and configurations.
+
+.. code-block:: bash
+
+   uv run noether-init my_project \
+       --model upt \
+       --dataset shapenet_car \
+       --dataset-path /path/to/shapenet_car
+
+**Required arguments:**
+
+- ``project_name`` (positional) — project name, e.g. ``my_project``
+- ``--model, -m`` — model architecture, e.g. ``ab_upt``
+- ``--dataset, -d`` — dataset, e.g. ``shapenet_car``
+- ``--dataset-path`` — path to dataset on disk
+
+**Optional arguments:**
+
+- ``--optimizer, -o`` — optimizer, e.g. ``adamw`` (default)
+- ``--tracker, -t`` — experiment tracker, e.g. ``wandb``
+- ``--hardware`` — hardware target, e.g. ``gpu`` (default)
+- ``--project-dir, -l`` — parent directory for the project folder
+- ``--wandb-entity`` — W&B entity name (only used with ``--tracker wandb``)
+
+For all available options, see :doc:`/tutorials/scaffolding_a_new_project`.
@@ -37,8 +37,9 @@ Welcome to the Noether Framework documentation. Here you will find available API
    tutorials/training_first_model_with_code
    tutorials/full_code_tutorial
    tutorials/how_to_initialize
-   
+
    Walkthrough <https://github.com/Emmi-AI/noether/blob/main/tutorial/README.MD>
+   tutorials/scaffolding_a_new_project
 
 
 .. toctree::

@@ -8,3 +8,4 @@ Step-by-step instructions to get you up and running with Noether.
 * :doc:`training_first_model_with_configs`: Learn how to train models by simply editing configuration files.
 * :doc:`training_first_model_with_code`: Understand how to use Noether as a library to build custom training scripts.
 * `Walkthrough <https://github.com/Emmi-AI/noether/blob/main/tutorial/README.MD>`_: A hands-on guide through the repository's tutorial examples.
+* :doc:`scaffolding_a_new_project`: Use ``noether-init`` to generate a complete training project from scratch.
@@ -0,0 +1,109 @@
+Scaffolding a New Project
+=========================
+
+The ``noether-init`` command generates a complete, ready-to-train Noether project for
+models and datasets supported out of the box by the framework. It creates all required Python modules, Hydra configuration
+files, schemas, data pipelines, trainers, and callbacks, giving you a working starting point that you
+can adapt to your own use case.
+
+Prerequisites
+-------------
+
+Before scaffolding, download and preprocess the dataset you want to use. Each dataset has its own
+fetching and preprocessing instructions — see the
+`Dataset Zoo README <https://github.com/Emmi-AI/noether/blob/main/src/noether/data/datasets/README.md>`_
+for an overview and links to dataset-specific guides.
+
+Example Usage
+-------------
+
+.. code-block:: bash
+
+   uv run noether-init my_project \
+       --model upt \
+       --dataset shapenet_car \
+       --dataset-path /path/to/shapenet_car
+
+This creates a ``my_project/`` directory in the current working directory with a UPT model and the ``shapenet_car`` dataset.
+After completion, ``noether-init`` prints a summary of the configuration and the corresponding
+``noether-train`` command to start training.
+
+Arguments
+---------
+
+.. list-table::
+   :header-rows: 1
+   :widths: 25 50 25
+
+   * - Option
+     - Values
+     - Default
+   * - ``project_name`` *(required)*
+     - Positional argument. Must be a valid Python identifier (no hyphens).
+     -
+   * - ``--model, -m`` *(required)*
+     - ``transformer``, ``upt``, ``ab_upt``, ``transolver``
+     -
+   * - ``--dataset, -d`` *(required)*
+     - ``shapenet_car``, ``drivaernet``, ``drivaerml``, ``ahmedml``, ``emmi_wing``
+     -
+   * - ``--dataset-path`` *(required)*
+     - Path to the dataset on disk
+     -
+   * - ``--optimizer, -o``
+     - ``adamw``, ``lion``
+     - ``adamw``
+   * - ``--tracker, -t``
+     - ``wandb``, ``trackio``, ``tensorboard``, ``disabled``
+     - ``disabled``
+   * - ``--hardware``
+     - ``gpu``, ``mps``, ``cpu``
+     - ``gpu``
+   * - ``--project-dir, -l``
+     - Parent directory for the project folder
+     - current directory
+   * - ``--wandb-entity``
+     - W&B entity name (only with ``--tracker wandb``)
+     - your W&B username
+
+Generated Project Structure
+---------------------------
+
+The generated project contains:
+
+.. code-block:: text
+
+   my_project/
+   ├── configs/
+   │   ├── callbacks/          # Training callback configs
+   │   ├── data_specs/         # Data specification configs
+   │   ├── dataset_normalizers/
+   │   ├── dataset_statistics/
+   │   ├── datasets/           # Dataset configs
+   │   ├── experiment/         # Experiment configs (one per model)
+   │   ├── model/              # Model architecture config
+   │   ├── optimizer/          # Optimizer config
+   │   ├── pipeline/           # Data pipeline config
+   │   ├── tracker/            # Experiment tracker config
+   │   ├── trainer/            # Trainer config
+   │   └── train.yaml          # Main training config
+   ├── model/                  # Model implementation
+   ├── schemas/                # Configuration dataclasses
+   ├── pipeline/               # Data processing (collators, sample processors)
+   ├── trainers/               # Training loop implementation
+   └── callbacks/              # Training callbacks
+
+All Python files are wired up with correct imports for your chosen model, and all Hydra configs reference
+your dataset path, optimizer, and tracker selections.
+
+Running Training
+----------------
+
+After scaffolding, start training with:
+
+.. code-block:: bash
+
+   uv run noether-train \
+       --config-dir my_project/configs \
+       --config-name train \
+       +experiment=upt
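
Before launching training, it can help to confirm that the scaffold produced the layout listed under "Generated Project Structure". The following is a minimal sketch and not part of `noether-init`: the helper name `missing_entries` and the `EXPECTED` list are assumptions, transcribed by hand from the tree shown in the tutorial.

```python
from pathlib import Path

# Entries transcribed from the "Generated Project Structure" listing.
# This list is an assumption for illustration; the real generator may
# create additional files that are not checked here.
EXPECTED = [
    "configs/callbacks",
    "configs/data_specs",
    "configs/dataset_normalizers",
    "configs/dataset_statistics",
    "configs/datasets",
    "configs/experiment",
    "configs/model",
    "configs/optimizer",
    "configs/pipeline",
    "configs/tracker",
    "configs/trainer",
    "configs/train.yaml",
    "model",
    "schemas",
    "pipeline",
    "trainers",
    "callbacks",
]


def missing_entries(project_root: Path) -> list[str]:
    """Return the expected entries that do not exist under project_root."""
    return [entry for entry in EXPECTED if not (project_root / entry).exists()]


if __name__ == "__main__":
    # "my_project" matches the project name used in the tutorial commands.
    missing = missing_entries(Path("my_project"))
    print("missing:", missing or "none")
```

An empty result only shows that the directories and `train.yaml` exist; it says nothing about whether the generated configs are valid.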
@@ -43,12 +43,16 @@ Docs = "https://noether-docs.emmi.ai/"
 [tool.setuptools_scm]
 write_to = "src/noether/_version.py"
 
+[tool.setuptools.package-data]
+"noether.scaffold" = ["references/*.yaml", "template_files/**/*"]
+
 [project.scripts]
 noether-train = "noether.training.cli.main_train:main"
 noether-train-submit-job = "noether.training.cli.submit_job:main"
 noether-eval = "noether.inference.cli.main_inference:main"
 noether-data = "noether.io.cli.cli:app"
 noether-dataset-stats = "noether.data.tools.calculate_statistics:main"
+noether-init = "noether.scaffold.cli:app"
 
 # --- Centralized Development & Tooling Dependencies ---
 # These are dependencies for developing the *entire* workspace.
@@ -130,6 +134,10 @@ module = [
     #    "rtree.*"
 ]
 
+[[tool.mypy.overrides]]
+module = ["noether.scaffold.template_files.*"]
+ignore_errors = true
+
 [tool.pytest.ini_options]
 testpaths = ["tests"]
 pythonpath = ["src"]

@@ -0,0 +1 @@
+#  Copyright © 2025 Emmi AI GmbH. All rights reserved.
@@ -0,0 +1,61 @@
+#  Copyright © 2025 Emmi AI GmbH. All rights reserved.
+
+from __future__ import annotations
+
+from enum import StrEnum
+
+_MODEL_CLASS_NAMES: dict[str, str] = {
+    "transformer": "Transformer",
+    "upt": "UPT",
+    "ab_upt": "ABUPT",
+    "transolver": "Transolver",
+}
+
+
+class ModelChoice(StrEnum):
+    TRANSFORMER = "transformer"
+    UPT = "upt"
+    AB_UPT = "ab_upt"
+    TRANSOLVER = "transolver"
+
+    @property
+    def class_name(self) -> str:
+        return _MODEL_CLASS_NAMES[self.value]
+
+    @property
+    def module_name(self) -> str:
+        return self.value
+
+    @property
+    def schema_module(self) -> str:
+        return f"{self.value}_config"
+
+    @property
+    def config_class_name(self) -> str:
+        return f"{self.class_name}Config"
+
+
+class DatasetChoice(StrEnum):
+    SHAPENET_CAR = "shapenet_car"
+    DRIVAERNET = "drivaernet"
+    DRIVAERML = "drivaerml"
+    AHMEDML = "ahmedml"
+    EMMI_WING = "emmi_wing"
+
+
+class OptimizerChoice(StrEnum):
+    ADAMW = "adamw"
+    LION = "lion"
+
+
+class TrackerChoice(StrEnum):
+    WANDB = "wandb"
+    TRACKIO = "trackio"
+    TENSORBOARD = "tensorboard"
+    DISABLED = "disabled"
+
+
+class HardwareChoice(StrEnum):
+    GPU = "gpu"
+    MPS = "mps"
+    CPU = "cpu"
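
To see how the derived names in `ModelChoice` fit together, here is a condensed, standalone sketch of the same pattern with two of the four models. It uses `class ModelChoice(str, Enum)` instead of `StrEnum` only so that it also runs on Python versions before 3.11; that substitution is an assumption of this sketch, not what the PR does.

```python
from enum import Enum

# Subset of the mapping from the PR, kept short for illustration.
_MODEL_CLASS_NAMES = {"upt": "UPT", "ab_upt": "ABUPT"}


class ModelChoice(str, Enum):
    UPT = "upt"
    AB_UPT = "ab_upt"

    @property
    def class_name(self) -> str:
        # Explicit lookup, because "ab_upt" -> "ABUPT" cannot be derived
        # by simple capitalization rules.
        return _MODEL_CLASS_NAMES[self.value]

    @property
    def schema_module(self) -> str:
        return f"{self.value}_config"

    @property
    def config_class_name(self) -> str:
        return f"{self.class_name}Config"


choice = ModelChoice("ab_upt")
print(choice.class_name, choice.schema_module, choice.config_class_name)
# prints: ABUPT ab_upt_config ABUPTConfig
```

Keeping the class names in an explicit dictionary trades a little duplication for predictable output when a model name contains an acronym.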
@@ -0,0 +1,96 @@
+#  Copyright © 2025 Emmi AI GmbH. All rights reserved.
+
+from pathlib import Path
+from typing import Annotated
+
+import typer
+
+from .choices import DatasetChoice, HardwareChoice, ModelChoice, OptimizerChoice, TrackerChoice
+from .config import ScaffoldConfig, resolve_config
+from .generator import generate_project
+
+app = typer.Typer(
+    name="noether-init",
+    help="Scaffold a new Noether training project.",
+    add_completion=False,
+)
+
+
+@app.command()
+def main(
+    project_name: Annotated[
+        str,
+        typer.Argument(
+            help="Project name (a valid Python identifier, e.g. 'my_project' or 'MyProject1'). No hyphens allowed."
+        ),
+    ],
+    model: Annotated[ModelChoice, typer.Option("--model", "-m", help="Model architecture")] = ...,  # type: ignore[assignment]
+    dataset: Annotated[DatasetChoice, typer.Option("--dataset", "-d", help="Dataset")] = ...,  # type: ignore[assignment]
+    dataset_path: Annotated[str, typer.Option("--dataset-path", help="Path to dataset")] = ...,  # type: ignore[assignment]
+    optimizer: Annotated[OptimizerChoice, typer.Option("--optimizer", "-o", help="Optimizer")] = OptimizerChoice.ADAMW,
+    tracker: Annotated[
+        TrackerChoice, typer.Option("--tracker", "-t", help="Experiment tracker")
+    ] = TrackerChoice.DISABLED,
+    hardware: Annotated[HardwareChoice, typer.Option("--hardware", help="Hardware target")] = HardwareChoice.GPU,
+    project_dir: Annotated[Path, typer.Option("--project-dir", "-l", help="Where to create project dir")] = Path("."),
+    wandb_entity: Annotated[
+        str | None, typer.Option("--wandb-entity", help="W&B entity, e.g. 'my-team' (defaults to your W&B username)")
+    ] = None,
+) -> None:
+    """Scaffold a new Noether training project."""
+    # Validate project name
+    if not project_name.isidentifier():
+        typer.echo(f"Error: '{project_name}' is not a valid Python identifier.", err=True)
+        raise typer.Exit(1)
+
+    # Resolve to absolute path
+    project_dir = (project_dir / project_name).resolve()
+
+    # Check if project dir already exists
+    if project_dir.exists():
+        typer.echo(f"Error: Directory already exists: {project_dir}", err=True)
+        raise typer.Exit(1)
+
+    # Build config
+    config = resolve_config(
+        project_name=project_name,
+        model=model,
+        dataset=dataset,
+        dataset_path=dataset_path,
+        optimizer=optimizer,
+        tracker=tracker,
+        hardware=hardware,
+        project_dir=project_dir,
+        wandb_entity=wandb_entity,
+    )
+
+    # Generate
+    typer.echo(f"Creating project '{project_name}' at {project_dir}")
+    generate_project(config)
+
+    # Print summary
+    _print_summary(config)
+
+
+def _print_summary(config: ScaffoldConfig) -> None:
+    typer.echo(
+        "\nProject created successfully!\n"
+        "Configuration:\n"
+        f"  Project:   {config.project_name}\n"
+        f"  Model:     {config.model.value}\n"
+        f"  Dataset:   {config.dataset.value}\n"
+        f"  Optimizer: {config.optimizer.value}\n"
+        f"  Tracker:   {config.tracker.value}\n"
+        f"  Hardware:  {config.hardware.value}\n"
+        f"  Path:      {config.project_dir}\n"
+    )
+    # Suggest run command
+    typer.echo(
+        "To train, run:\n"
+        f"  uv run noether-train --config-dir {config.project_dir}/configs --config-name train +experiment={config.model.value}\n\n"
+        "Experiment configs for all models are in configs/experiment/."
+    )
+
+
+if __name__ == "__main__":
+    app()
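
The project-name check above relies on `str.isidentifier()`, which rejects hyphens and leading digits but accepts reserved words such as `class`. The sketch below illustrates that behavior with the standard library only. The helper `is_safe_project_name` is hypothetical and not part of this PR; it shows one way a stricter check could look.

```python
import keyword


def is_safe_project_name(name: str) -> bool:
    """Hypothetical stricter check: a valid identifier that is not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["my_project", "MyProject1", "my-project", "1project", "class"]:
    print(
        f"{candidate!r}: isidentifier={candidate.isidentifier()}, "
        f"safe={is_safe_project_name(candidate)}"
    )
```

Whether a keyword-named project actually causes problems depends on how the generated modules are imported, so treat the extra check as optional.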
@@ -0,0 +1 @@
+# Copyright © 2025 Emmi AI GmbH. All rights reserved.