ThermalForge

LLM-Powered Agents for Modeling Thermal Dynamics of Buildings

ThermalForge is a framework for automatically generating hybrid neuro-physical thermal dynamics models for residential buildings, in which modeling decisions are made by a Large Language Model (LLM). Specifically, ThermalForge implements a bi-level optimization approach: an LLM proposes model structures (physics-based equations or neural architectures), and their parameters are then calibrated against smart thermostat data.

This repository contains the source code for ThermalForge and the configurations needed to replicate the experiments in the accompanying paper:

Exploring LLM-Powered Agents for Modeling Thermal Dynamics of Buildings in Proceedings of ACM BuildSys 2026

How It Works

ThermalForge runs a two-phase modeling pipeline for each building:

Physics-Based Modeling — The LLM generates grey-box thermal models (RC networks) as PyTorch code. Each candidate is calibrated against observational data, and the LLM receives feedback to improve subsequent proposals.
Neural Residual Modeling — The best physics model's rolled-out predictions become input to a neural model that learns to model the residual. The LLM can generate full architectures, modify a U-Net template, or tune hyperparameters.
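
As a rough illustration of the two phases, the sketch below rolls out a single-resistor, single-capacitor (1R1C) grey-box model and computes the residual that a neural model would then learn. It is written in plain Python rather than the PyTorch code that ThermalForge generates; the function name `rollout_1r1c`, the parameter values, the units, and the toy data are all assumptions made for this example.

```python
def rollout_1r1c(t_in0, t_out, hvac_on, r, c, q_hvac, dt_hours=1.0):
    """Forward-Euler rollout of C * dT/dt = (T_out - T) / R + Q * u.

    All names and units are illustrative: r in degC/kW, c in kWh/degC,
    q_hvac in kW, hvac_on as a duty cycle in [0, 1] per step.
    """
    temps = [t_in0]
    for t_o, u in zip(t_out, hvac_on):
        t = temps[-1]
        d_temp = ((t_o - t) / r + q_hvac * u) / c
        temps.append(t + dt_hours * d_temp)
    return temps[1:]  # predictions for the steps after the initial state


# Toy hourly data (made up for this example, not from the Ecobee dataset).
outdoor = [0.0, 0.0, 0.0, 0.0]
hvac = [0.0, 1.0, 1.0, 0.0]
measured = [19.2, 19.9, 20.7, 20.0]

# Phase 1: physics roll-out from a known initial indoor temperature.
physics_pred = rollout_1r1c(20.0, outdoor, hvac, r=5.0, c=4.0, q_hvac=6.0)

# Phase 2 target: the part of the measurement the physics model did not explain.
residual = [m - p for m, p in zip(measured, physics_pred)]
```

In the actual pipeline the parameters (here `r`, `c`, `q_hvac`) would be calibrated against data rather than fixed by hand.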

The framework is implemented as a LangGraph state graph with 7 nodes:

create_phy_dataloader → generate_phy_model ⇄ evaluate_phy_model
                              ↓ (converged)
                        select_phy_model → generate_nn_model ⇄ evaluate_nn_model
                                                                    ↓ (converged)
                                                              select_nn_model → END
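
The control flow in the diagram can be sketched without LangGraph as a generate/evaluate loop followed by a selection step, run once per phase. The code below is a plain-Python stand-in, not the repository's `graph.py`: `run_phase`, `fake_propose`, `fake_errors`, and the fixed iteration budget are hypothetical names and simplifications chosen for illustration.

```python
from typing import Callable, List, Tuple


def run_phase(
    propose: Callable[[List[Tuple[str, float]]], str],
    evaluate: Callable[[str], float],
    max_iters: int,
) -> Tuple[str, float]:
    """Alternate generate -> evaluate, then select the lowest-error candidate."""
    history: List[Tuple[str, float]] = []
    for _ in range(max_iters):
        candidate = propose(history)   # generate_*_model (an LLM call in practice)
        error = evaluate(candidate)    # evaluate_*_model (calibration + metric)
        history.append((candidate, error))
    return min(history, key=lambda item: item[1])  # select_*_model


# Stub "LLM" and "trainer" so the sketch runs end to end.
def fake_propose(history):
    return f"model_{len(history)}"


fake_errors = {"model_0": 1.4, "model_1": 0.9, "model_2": 1.1}

best_phy = run_phase(fake_propose, fake_errors.__getitem__, max_iters=3)
```

Passing the history back to `propose` mirrors the feedback loop described above, where each new proposal can take earlier results into account.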

Installation

Requires Python ≥ 3.12. From the repository root:

# Option 1: conda + pip
conda create -n thermal_forge python=3.12 -y
conda activate thermal_forge
pip install -e .

# Option 2: venv + pip
python3.12 -m venv .venv
source .venv/bin/activate
pip install -e .

# Option 3: uv
uv sync

All ThermalForge settings — experiment parameters, LLM options, and infrastructure paths — are controlled through a single YAML config file. See src/thermal_forge/experiments/default.yaml for the full template with all available fields. The Configuration section has details on priority and overrides.

AWS Setup Prerequisites

AWS Account Setup

To use AWS services (such as S3 for model input/output storage, or SageMaker for running experiments at scale in the cloud), first configure a personal AWS account as follows. All steps assume you are signed in as the account root user or an existing admin.

1. Choose a region. Pick a single AWS region and use it consistently throughout these instructions. Make sure the region selector in the AWS console (top-right) is set accordingly.
2. Grant permissions for an IAM user. In the IAM console, go to Users → Create user (unless you have one already). On the permissions page of that user, choose Attach policies directly and attach AdministratorAccess. This single policy covers everything the experiments need (Bedrock, S3, SageMaker, ECR, and IAM role passing for the Docker and SageMaker steps below).
3. Install the AWS CLI. Download and install the AWS CLI on your local machine before continuing.
4. Create an access key for CLI use. On the new user's page, go to Security credentials → Create access key and select Command Line Interface (CLI) as the use case. Acknowledge the warning and click Next. Copy the Access Key ID and Secret Access Key since the secret is shown only once. Then run aws configure on your local machine and provide the keys, the region you chose in step 1, and json as the default output format.

SageMaker Execution Role

5. Create an IAM role for SageMaker. When SageMaker runs your training jobs, it assumes a separate IAM role (distinct from your user permissions). To create it, go to Roles → Create role in the IAM console, select AWS service as the trusted entity type, then choose SageMaker as the use case (specifically, SageMaker - Execution). On the permissions page, AmazonSageMakerFullAccess will be pre-selected. Name the role (e.g., SageMakerThermalForgeRole) and create it. Note the Role ARN; you will need it when launching jobs. Once the role exists, go back to it and also attach the AmazonBedrockFullAccess policy (needed for LLM calls from within the training jobs).

S3 Bucket Creation

6. Create an S3 bucket. In the S3 console, click Create bucket, give it a name that contains the string sagemaker (e.g., sagemaker-thermalforge-<your-suffix>), leave the defaults, and click Create. The SageMaker execution role (created in the previous step) is granted S3 access by default only on buckets whose names contain sagemaker. If you prefer a different name, you will need to attach an inline policy granting the execution role s3:GetObject and s3:PutObject on that bucket once it exists.
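
If you go the inline-policy route, the policy document could look roughly like the one generated below. This is a minimal sketch: the helper `s3_object_policy` and the bucket name `my-thermalforge-data` are placeholders invented for this example, and only the two actions named above are included.

```python
import json


def s3_object_policy(bucket_name: str) -> str:
    """Return an IAM policy document allowing object reads and writes on one bucket."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                # Object-level actions apply to keys inside the bucket, hence "/*".
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# "my-thermalforge-data" is a placeholder bucket name.
policy_json = s3_object_policy("my-thermalforge-data")
```

The resulting JSON can be pasted into the IAM console when adding an inline policy to the execution role.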

Docker Image

7. Build a custom Docker image. The Dockerfile extends an AWS Deep Learning Container (DLC) base image hosted in an AWS-managed ECR registry. You must authenticate Docker to that registry before building so that it can pull the base image. First, run the configuration script to set the correct base image for your region:

# Configure the Dockerfile (prompts for region and DLC account ID)
python scripts/configure_docker.py

# Or specify explicitly (DLC_ACCOUNT is the registry account for your region,
# found at https://docs.aws.amazon.com/sagemaker/latest/dg-ecr-paths/sagemaker-algo-docker-registry-paths.html)
python scripts/configure_docker.py --region REGION --account DLC_ACCOUNT

# Authenticate Docker to the DLC ECR registry (use the same DLC_ACCOUNT and REGION from above)
aws ecr get-login-password --region REGION | \
docker login --username AWS --password-stdin DLC_ACCOUNT.dkr.ecr.REGION.amazonaws.com

# Build the image
cd docker/
docker build -t thermalforge .
cd ..

8. Push to ECR. Authenticate Docker to your own account's Amazon Elastic Container Registry (ECR) and push the image. This is a different registry from the DLC one used in step 7, so a separate docker login is required. Replace ACCOUNT with your 12-digit AWS account ID (run aws sts get-caller-identity --query Account --output text to find it) and REGION with the region you chose in step 1:

# Authenticate Docker to your account's ECR registry
aws ecr get-login-password --region REGION | \
docker login --username AWS --password-stdin ACCOUNT.dkr.ecr.REGION.amazonaws.com

# Create the repository (first time only)
aws ecr create-repository --repository-name thermalforge --region REGION

# Tag and push
docker tag thermalforge ACCOUNT.dkr.ecr.REGION.amazonaws.com/thermalforge
docker push ACCOUNT.dkr.ecr.REGION.amazonaws.com/thermalforge

The resulting image URI (ACCOUNT.dkr.ecr.REGION.amazonaws.com/thermalforge) is what you'll set as ecr_image_uri in your config YAML (or via the ECR_IMAGE_URI environment variable) when launching SageMaker jobs.
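
To avoid typos when filling in `ecr_image_uri`, the URI can be assembled programmatically. The helper below is a hypothetical convenience function, not part of ThermalForge; the account ID and region shown are placeholders.

```python
import re


def ecr_image_uri(account_id: str, region: str, repository: str = "thermalforge") -> str:
    """Compose ACCOUNT.dkr.ecr.REGION.amazonaws.com/REPOSITORY."""
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS account IDs are 12 digits")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}"


# Placeholder account ID and region; substitute your own.
uri = ecr_image_uri("123456789012", "us-east-1")
```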

Data

The paper uses a subset of the Ecobee Donate Your Data dataset, which contains data from Ecobee smart thermostats in North American homes. The specific subset used in the paper includes:

300 homes with ≥180 heating days
Variables: indoor temperature (from 1 to 6 sensors), outdoor temperature, HVAC run time (stages 1–3), fan run time
Resolution: 5-minute intervals, downsampled to 60-minute for modeling
Split: 80% train / 10% validation / 10% test (by day)
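
The resolution change and the by-day split can be sketched as follows. Two details are assumptions, since they are not specified above: downsampling is done here by averaging each block of 12 samples (rather than, say, keeping every 12th sample), and days are assigned to splits by a seeded random shuffle (rather than chronologically). The function names are hypothetical.

```python
import random


def downsample(values, factor=12):
    """Average consecutive blocks of `factor` samples (12 x 5 min = 60 min)."""
    usable = len(values) - len(values) % factor  # drop a trailing partial block
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


def split_days(day_ids, seed=None, train=0.8, val=0.1):
    """Shuffle day identifiers and split them into train/validation/test lists."""
    days = list(day_ids)
    random.Random(seed).shuffle(days)
    n_train = int(len(days) * train)
    n_val = int(len(days) * val)
    return days[:n_train], days[n_train:n_train + n_val], days[n_train + n_val:]


# One synthetic day at 5-minute resolution: 288 samples -> 24 hourly values.
hourly = downsample([20.0 + 0.01 * i for i in range(288)])
train_days, val_days, test_days = split_days(range(190), seed=42)
```

Splitting by day keeps all samples from a given day in the same partition, so no day contributes to both training and evaluation.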

Data Preparation

To prepare the dataset, first install the data preprocessing dependency:

# If using pip/conda
pip install -e ".[data]"

# If using uv
uv sync --extra data

Then:

Create an account at Building Benchmark Datasets and download the "processed" dataset (~282 MB), which arrives as ecobee.zip.
Unzip ecobee.zip and extract the clean_data.rar archive inside it to obtain the *.nc (netCDF) files.
Run the preprocessing script, which takes two arguments — the path to the local folder containing the netCDF files, and an S3 path for uploading the output .npz files:

python -m thermal_forge.dataset.run_data_prep /path/to/netcdf/files/ s3://your-bucket/ecobee_data/

This filters for homes with ≥180 heating-only days, removes days with missing data, and uploads the resulting .npz files to S3.
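
The filtering logic can be pictured with the simplified sketch below. It assumes that the 180-day threshold is applied after incomplete days are dropped and that missing samples appear as `None` or NaN; neither detail is confirmed here, and `complete_days` and `keep_home` are hypothetical names rather than functions from `run_data_prep.py`.

```python
import math

MIN_HEATING_DAYS = 180  # threshold stated above


def complete_days(days):
    """Keep only days whose samples contain no missing (None/NaN) values."""
    def is_missing(x):
        return x is None or (isinstance(x, float) and math.isnan(x))
    return [day for day in days if not any(is_missing(x) for x in day)]


def keep_home(heating_days):
    """A home qualifies if enough complete heating days remain."""
    return len(complete_days(heating_days)) >= MIN_HEATING_DAYS


# Synthetic example: 185 heating days, 3 of which have a gap.
good_day = [20.0] * 24
bad_day = [20.0] * 23 + [float("nan")]
home = [good_day] * 182 + [bad_day] * 3
```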

Quick Start

Local (single home)

python -m thermal_forge.agent.agent \
    --config default.yaml \
    --datafile home_190.npz \
    --seed 42

This requires:

AWS credentials configured for Bedrock access
sm_channel_train set in the YAML config (or via the SM_CHANNEL_TRAIN environment variable) pointing to the directory containing the .npz data file
sm_output_data_dir set in the YAML config (or via SM_OUTPUT_DATA_DIR); defaults to "output"
--datafile is the filename only (not a full path), and must follow the naming convention <prefix>_<num_days>.npz

Note: The sm_ prefix on these field names reflects SageMaker's conventions — SageMaker containers automatically set SM_CHANNEL_TRAIN and SM_OUTPUT_DATA_DIR at runtime. Using the same names locally means the same config works in both environments without modification.
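
A quick way to check a filename against the `<prefix>_<num_days>.npz` convention before launching a run is sketched below. The regular expression and the `parse_datafile` helper are assumptions for illustration; the agent's own validation may differ.

```python
import re
from pathlib import PurePath

# Greedy prefix, then an underscore, then the day count, then the extension.
DATAFILE_PATTERN = re.compile(r"(?P<prefix>.+)_(?P<num_days>\d+)\.npz")


def parse_datafile(name: str):
    """Split '<prefix>_<num_days>.npz' into (prefix, num_days)."""
    if PurePath(name).name != name:
        raise ValueError("--datafile must be a bare filename, not a path")
    match = DATAFILE_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"{name!r} does not match <prefix>_<num_days>.npz")
    return match.group("prefix"), int(match.group("num_days"))


prefix, num_days = parse_datafile("home_190.npz")
```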

At Scale (SageMaker)

To run across hundreds of homes in parallel, we suggest using SageMaker. The config YAML must be placed in src/thermal_forge/experiments/ so that it is bundled and available inside each SageMaker container. This command must be run from the src/ directory:

cd src/

# Launch jobs (config and seed are passed to each training job)
python -m thermal_forge.sm.run_agent --config default.yaml --seed 42

Infrastructure settings (sagemaker_role, ecr_image_uri, s3_data_folder, s3_output_base) can be set either as environment variables or in the YAML config file (see Configuration).

Evaluating resulting models

The Jupyter notebook scripts/plot_results.ipynb can be used as a reference for evaluating the resulting models with statistics and graphs like those shown in the paper (e.g., RMSE percentiles for 24-hour roll-outs).
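
For orientation, the kind of statistic mentioned above can be computed with the standard library alone. This is a generic sketch and not the notebook's code: `rmse`, `rmse_percentiles`, and the per-home scores are invented for the example.

```python
import math
import statistics


def rmse(predicted, target):
    """Root-mean-square error over one roll-out."""
    return math.sqrt(
        sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
    )


def rmse_percentiles(per_home_rmse, n=4):
    """Cut points dividing the per-home RMSE values into `n` equal groups."""
    return statistics.quantiles(per_home_rmse, n=n, method="inclusive")


# Made-up per-home RMSE values, only to exercise the functions.
scores = [0.4, 0.5, 0.6, 0.8, 1.2]
q1, median, q3 = rmse_percentiles(scores)
```

Calling `rmse_percentiles(scores, n=100)` would return the 1st through 99th percentiles instead of quartiles.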

Configuration

All settings are controlled via YAML files in src/thermal_forge/experiments/. The default configuration matches the paper's primary experiment.

Configuration Priority

Settings are resolved in this order, from highest to lowest priority:

YAML config file — the experiment definition (selected via --config)
Environment variables — for infrastructure fields (see below)
Defaults — sensible values built into the code

The --seed CLI flag is the one exception: when provided, it overrides the seed value from the YAML file.
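
The precedence rules can be summarized in a few lines of Python. This is a simplified model of the behavior described above, not the actual loader in `config/config.py`; `resolve_config` and the two dictionaries are hypothetical, and only three fields are shown.

```python
# Built-in defaults and the environment variables that back infrastructure fields.
DEFAULTS = {"sm_channel_train": "", "sm_output_data_dir": "output", "seed": None}
ENV_VARS = {
    "sm_channel_train": "SM_CHANNEL_TRAIN",
    "sm_output_data_dir": "SM_OUTPUT_DATA_DIR",
}


def resolve_config(yaml_values, environ, cli_seed=None):
    """Merge settings: YAML beats environment variables, which beat defaults."""
    resolved = dict(DEFAULTS)
    for field, env_name in ENV_VARS.items():
        if env_name in environ:
            resolved[field] = environ[env_name]
    resolved.update(yaml_values)      # YAML has the highest priority
    if cli_seed is not None:          # except for --seed, which overrides YAML
        resolved["seed"] = cli_seed
    return resolved


cfg = resolve_config(
    yaml_values={"sm_output_data_dir": "./output", "seed": 7},
    environ={"SM_CHANNEL_TRAIN": "/path/to/data", "SM_OUTPUT_DATA_DIR": "/tmp/out"},
    cli_seed=42,
)
```

Here `sm_channel_train` comes from the environment, `sm_output_data_dir` from the YAML values, and `seed` from the command line.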

Experiment Configurations

| Config File | Paper Section | Description |
| --- | --- | --- |
| `default.yaml` | §4.2–4.3 | Expert physics prompt + full NN architecture search |
| `basic_prompt.yaml` | §4.2 | Basic physics prompt (no expert knowledge) |
| `agent_control.yaml` | §4.4 | LLM-controlled transitions (δ=1) |
| `unet_hparam.yaml` | §4.3 | U-Net hyperparameter optimization only |
| `unet_code.yaml` | §4.3 | U-Net code generation via LLM |
| `fixed_model.yaml` | §4.3 | Baseline RC model (no LLM physics generation) |

Experiment Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `fixed_model` | `false` | Skip LLM physics generation, use baseline RC model |
| `nn_arch_search` | `true` | LLM generates full NN architectures (vs. hparam tuning) |
| `agent_control` | `false` | LLM decides when to stop iterating (δ parameter) |
| `llm_model_id` | Claude Sonnet 4.5 | Anthropic model via Bedrock |
| `llm_temperature` | `0.5` | LLM sampling temperature |
| `max_phy_gen_extra` | `10` | Physics model generation iterations |
| `max_nn_gen_extra` | `5` | Neural model generation iterations |
| `downsampling_factor` | `12` | 5-min → 60-min resolution (12×) |
| `seed` | `null` | Random seed for reproducible data splits |

Infrastructure Settings

These fields default to their corresponding environment variable, so they work automatically in SageMaker containers. Setting the same field in a YAML file takes precedence over the environment variable.

| Parameter | Environment Variable | Default | Description |
| --- | --- | --- | --- |
| `sm_channel_train` | `SM_CHANNEL_TRAIN` | `""` | Directory containing `.npz` data files |
| `sm_output_data_dir` | `SM_OUTPUT_DATA_DIR` | `"output"` | Directory for saving outputs |
| `s3_data_folder` | `S3_DATA_FOLDER` | `""` | S3 path to `.npz` data files |
| `s3_output_base` | `S3_OUTPUT_BASE` | `""` | S3 path for job outputs |
| `sagemaker_role` | `SAGEMAKER_ROLE` | `""` | IAM role ARN for SageMaker execution |
| `ecr_image_uri` | `ECR_IMAGE_URI` | `""` | Docker image URI in ECR |

Examples

Create a custom YAML with both experiment and infrastructure settings:

exp_id: "my-experiment"
agent_control: true
max_phy_gen_extra: 5
llm_temperature: 0.8

# Infrastructure (overrides environment variables if set)
sm_channel_train: "/path/to/data"
sm_output_data_dir: "./output"

Or use environment variables for infrastructure and YAML for experiments:

export SM_CHANNEL_TRAIN="/path/to/data"
export SM_OUTPUT_DATA_DIR="./output"
python -m thermal_forge.agent.agent --config default.yaml --datafile home_190.npz --seed 42

Project Structure

├── docker/
│   └── Dockerfile              # SageMaker-compatible container
├── src/thermal_forge/
│   ├── agent/
│   │   ├── agent.py            # Entry point (CLI)
│   │   ├── graph.py            # LangGraph agent (Algorithm 1)
│   │   ├── llm.py              # Bedrock LLM client
│   │   ├── prompts.py          # All prompt templates (see prompts/ folder)
│   │   └── state.py            # Agent state definition
│   ├── config/
│   │   └── config.py           # ThermalForgeConfig dataclass + YAML loading
│   ├── experiments/            # YAML experiment configurations
│   │   ├── default.yaml        #   Paper's primary experiment
│   │   ├── basic_prompt.yaml   #   §4.2 basic prompt
│   │   ├── agent_control.yaml  #   §4.4 δ=1
│   │   ├── unet_hparam.yaml    #   §4.3 option 1
│   │   ├── unet_code.yaml      #   §4.3 option 2
│   │   └── fixed_model.yaml    #   Baseline
│   ├── prompts/                # Full prompt text (readable markdown)
│   │   ├── README.md           #   Index with paper notation mapping
│   │   ├── gen_phy_basic.md    #   P_pg: physics model generation
│   │   ├── gen_phy_expert.md   #   P_pg: physics generation with expert knowledge
│   │   ├── eval_phy.md         #   P_pz: physics feedback elicitation
│   │   ├── route_phy.md        #   P_pa: physics convergence check
│   │   ├── gen_nn_full_search.md   #   P_ng: full neural architecture search
│   │   ├── gen_nn_unet_search.md   #   P_ng: U-Net code generation
│   │   ├── eval_nn_hparams.md      #   P_ng: U-Net hyperparameter optimization
│   │   ├── eval_nn_full_search.md  #   P_nz: neural feedback (full search)
│   │   ├── eval_nn_unet_search.md  #   P_nz: neural feedback (U-Net)
│   │   └── route_nn.md             #   P_na: neural convergence check
│   ├── dataset/
│   │   ├── ecobee_phy_dataset.py   # Physics-phase dataset
│   │   ├── ecobee_nn_dataset.py    # Neural-phase dataset
│   │   └── run_data_prep.py        # netCDF → .npz preprocessing + S3 upload
│   ├── model/
│   │   ├── rc_thermal/         # Baseline RC thermal model
│   │   └── unet1d/             # 1D U-Net (encoder, decoder)
│   ├── train/
│   │   ├── train_phy.py        # Physics model training loop
│   │   └── train_nn.py         # Neural model training loop
│   ├── sm/
│   │   └── run_agent.py        # SageMaker job launcher
│   └── utils/
│       ├── agent_utils.py      # Model instantiation, predictions, token stats
│       └── data_utils.py       # S3 file listing
├── pyproject.toml              # Package metadata and dependencies
└── LICENSE                     # CC-BY-NC-4.0

Mapping to the Paper

| Paper Concept | Code |
| --- | --- |
| Algorithm 1 (agent loop) | `agent/graph.py`: LangGraph `StateGraph` with 7 nodes |
| Bi-level optimization (Eq. 1–2) | Outer: LLM calls in `generate_*` nodes; Inner: `train/train_*.py` |
| Prompt P_pg (physics generation) | `agent/prompts.py::gen_phy_instructions` / `gen_phy_instructions_expert` |
| Prompt P_ng (neural generation) | `agent/prompts.py::gen_nn_instructions_full_search` / `_unet_search` |
| Prompt P_pz / P_nz (feedback) | `agent/prompts.py::eval_phy_instructions` / `eval_nn_instructions_*` |
| Prompt P_pa / P_na (agent check) | `agent/prompts.py::route_phy_instructions` / `route_nn_instructions` |
| δ parameter (agent control) | `config: agent_control` → `graph.py::route_phy_eval` / `route_nn_eval` |
| Figure 2 (prediction example) | Generated from `outputs.npy` / `targets.npy` saved by `agent_utils.py` |
| Table 1 (token usage) | `token_stats.pkl` saved by `agent_utils.py::save_token_stats` |

Citation

If you use ThermalForge in your research, please cite it as follows:

@inproceedings{thermalforge2026,
  title     = {Exploring LLM-Powered Agents for Modeling Thermal Dynamics of Buildings},
  author    = {Walczak, Krzysztof and Bergés, Mario},
  booktitle = {Proceedings of the 13th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (BuildSys '26)},
  year      = {2026},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  doi       = {10.1145/3744256.3812580},
  url       = {https://doi.org/10.1145/3744256.3812580},
  location  = {Banff, AB, Canada},
  series    = {BuildSys '26}
}

License

CC-BY-NC-4.0 — see LICENSE.
