amazon-science/thermal-forge

ThermalForge

LLM-Powered Agents for Modeling Thermal Dynamics of Buildings

ThermalForge is a framework for automatically generating hybrid neuro-physical thermal dynamics models of residential buildings, in which the modeling decisions are made by a Large Language Model (LLM). Specifically, ThermalForge implements a bi-level optimization approach: an LLM proposes model structures (physics-based equations or neural architectures), and their parameters are then calibrated against smart thermostat data.

This repository contains the source code for ThermalForge and the configurations needed to replicate the experiments in the accompanying paper:

Exploring LLM-Powered Agents for Modeling Thermal Dynamics of Buildings, in Proceedings of ACM BuildSys 2026.


How It Works

ThermalForge runs a two-phase modeling pipeline for each building:

  1. Physics-Based Modeling — The LLM generates grey-box thermal models (RC networks) as PyTorch code. Each candidate is calibrated against observational data, and the LLM receives feedback to improve subsequent proposals.

  2. Neural Residual Modeling — The best physics model's rolled-out predictions become input to a neural model that learns to model the residual. The LLM can generate full architectures, modify a U-Net template, or tune hyperparameters.
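To make the residual idea in phase 2 concrete, here is a minimal sketch in plain Python. It assumes a single-resistance, single-capacitance (1R1C) model integrated with forward Euler. The function names, the parameter values, and the 1R1C structure itself are illustrative assumptions, not the models ThermalForge generates (those are PyTorch modules proposed by the LLM).

```python
def rollout_1r1c(t_in0, t_out, hvac, r, c, p_heat, dt_hours=1.0):
    """Roll out a hypothetical 1R1C model with forward Euler.

    dT_in/dt = (T_out - T_in) / (R * C) + P_heat * u / C
    """
    temps = []
    t_in = t_in0
    for t_o, u in zip(t_out, hvac):
        t_in = t_in + dt_hours * ((t_o - t_in) / (r * c) + p_heat * u / c)
        temps.append(t_in)
    return temps


def residual_targets(measured, predicted):
    """Residuals that a neural model would be trained to predict."""
    return [m - p for m, p in zip(measured, predicted)]


# Illustrative values only (not calibrated to any real home).
outdoor = [10.0, 10.0, 10.0]
heating = [0.0, 1.0, 1.0]      # fraction of each hour the heater ran
measured = [19.2, 19.4, 19.7]  # made-up "observed" indoor temperatures

physics = rollout_1r1c(20.0, outdoor, heating, r=2.0, c=5.0, p_heat=5.0)
residuals = residual_targets(measured, physics)
```

In this sketch the neural model's job would be to predict `residuals` given `physics` and the exogenous inputs, so that the sum of the two tracks the measurements.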

The framework is implemented as a LangGraph state graph with 7 nodes:

create_phy_dataloader → generate_phy_model ⇄ evaluate_phy_model
                              ↓ (converged)
                        select_phy_model → generate_nn_model ⇄ evaluate_nn_model
                                                                    ↓ (converged)
                                                              select_nn_model → END
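The generate/evaluate/select pattern in the diagram can be sketched without LangGraph as an ordinary loop. This is only an illustration of the control flow under stated assumptions: `run_phase`, `Candidate`, and the stub functions are hypothetical names, and the real nodes call an LLM and a training loop rather than the stand-ins used here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    code: str        # model source proposed by the (stubbed) LLM
    val_loss: float  # validation loss after calibration


def run_phase(propose, calibrate, max_iters, converged=None):
    """Alternate generate and evaluate steps, then select the best candidate."""
    history = []
    for _ in range(max_iters):
        code = propose(history)                           # generate_* node
        history.append(Candidate(code, calibrate(code)))  # evaluate_* node
        if converged is not None and converged(history):
            break
    best = min(history, key=lambda cand: cand.val_loss)   # select_* node
    return best, history


# Stubs standing in for the LLM call and the training loop.
def fake_propose(history):
    return f"model_v{len(history)}"


def fake_calibrate(code):
    losses = {"model_v0": 0.9, "model_v1": 0.6, "model_v2": 0.7}
    return losses.get(code, 1.0)


best, history = run_phase(fake_propose, fake_calibrate, max_iters=3)
```

The same function would be called twice, once for the physics phase and once for the neural phase. Passing a `converged` callback corresponds loosely to letting the agent decide when to stop iterating.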

Installation

Requires Python ≥ 3.12. From the repository root:

# Option 1: conda + pip
conda create -n thermal_forge python=3.12 -y
conda activate thermal_forge
pip install -e .

# Option 2: venv + pip
python3.12 -m venv .venv
source .venv/bin/activate
pip install -e .

# Option 3: uv
uv sync

All ThermalForge settings — experiment parameters, LLM options, and infrastructure paths — are controlled through a single YAML config file. See src/thermal_forge/experiments/default.yaml for the full template with all available fields. The Configuration section has details on priority and overrides.

AWS Setup Prerequisites

AWS Account Setup

To use AWS services (such as S3 for model input/output storage, or SageMaker for running experiments at scale in the cloud), first configure a personal AWS account as follows. All steps assume you are signed in as the account root user or as an existing admin.

  1. Choose a region. Pick a single AWS region and use it consistently throughout these instructions. Make sure the region selector in the AWS console (top-right) is set accordingly.

  2. Grant permissions for an IAM user. In the IAM console, go to Users → Create user (unless you have one already). On the permissions page of that user, choose Attach policies directly and attach AdministratorAccess. This single policy covers everything the experiments need (Bedrock, S3, SageMaker, ECR, and IAM role passing for the Docker and SageMaker steps below).

  3. Install the AWS CLI. Download and install the AWS CLI on your local machine before continuing.

  4. Create an access key for CLI use. On the new user's page, go to Security credentials → Create access key and select Command Line Interface (CLI) as the use case. Acknowledge the warning and click Next. Copy the Access Key ID and Secret Access Key since the secret is shown only once. Then run aws configure on your local machine and provide the keys, the region you chose in step 1, and json as the default output format.

SageMaker Execution Role

  5. Create an IAM role for SageMaker. When SageMaker runs your training jobs, it assumes a separate IAM role (distinct from your user permissions). To create it, go to Roles → Create role in the IAM console, select AWS service as the trusted entity type, and choose SageMaker as the use case (specifically, SageMaker - Execution). On the permissions page, AmazonSageMakerFullAccess is pre-selected. Name the role (e.g., SageMakerThermalForgeRole) and create it. Note the Role ARN; you will need it when launching jobs. Then go back to the role and also attach the AmazonBedrockFullAccess policy, which is needed for LLM calls from within the training jobs.

S3 Bucket Creation

  6. Create an S3 bucket. In the S3 console, click Create bucket, give it a name that contains the string sagemaker (e.g., sagemaker-thermalforge-<your-suffix>), leave the defaults, and click Create. The SageMaker execution role (created in the previous step) is granted S3 access by default only on buckets whose names contain sagemaker. If you prefer a different name, you will need to attach an inline policy granting the execution role s3:GetObject and s3:PutObject on that bucket once it exists.
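If you do use a bucket name without sagemaker in it, the inline policy mentioned above could look like the document produced by this sketch. The helper name and the example bucket name are hypothetical; the policy grants only the two actions this README names.

```python
import json


def bucket_access_policy(bucket_name):
    """Build an inline IAM policy allowing object reads and writes on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                # Object-level actions apply to keys inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# "my-thermalforge-data" is a placeholder bucket name.
policy_json = json.dumps(bucket_access_policy("my-thermalforge-data"), indent=2)
print(policy_json)
```

The printed JSON can be pasted into the IAM console when adding an inline policy to the execution role. If a job also needs to list objects, s3:ListBucket on the bucket ARN itself (without the trailing /*) may be required as well; this README does not say whether that applies here.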

Docker Image

  7. Build a custom Docker image. The Dockerfile extends an AWS Deep Learning Container (DLC) base image hosted in an AWS-managed ECR registry. You must authenticate Docker to that registry before building so that it can pull the base image. First, run the configuration script to set the correct base image for your region:
# Configure the Dockerfile (prompts for region and DLC account ID)
python scripts/configure_docker.py

# Or specify explicitly (DLC_ACCOUNT is the registry account for your region,
# found at https://docs.aws.amazon.com/sagemaker/latest/dg-ecr-paths/sagemaker-algo-docker-registry-paths.html)
python scripts/configure_docker.py --region REGION --account DLC_ACCOUNT

# Authenticate Docker to the DLC ECR registry (use the same DLC_ACCOUNT and REGION from above)
aws ecr get-login-password --region REGION | \
docker login --username AWS --password-stdin DLC_ACCOUNT.dkr.ecr.REGION.amazonaws.com

# Build the image
cd docker/
docker build -t thermalforge .
cd ..
  8. Push to ECR. Authenticate Docker to your own account's Amazon Elastic Container Registry (ECR) and push the image. This is a different registry than the DLC one used in step 7, so a separate docker login is required. Replace ACCOUNT with your 12-digit AWS account ID (run aws sts get-caller-identity --query Account --output text to find it) and REGION with the region you chose in step 1:
# Authenticate Docker to your account's ECR registry
aws ecr get-login-password --region REGION | \
docker login --username AWS --password-stdin ACCOUNT.dkr.ecr.REGION.amazonaws.com

# Create the repository (first time only)
aws ecr create-repository --repository-name thermalforge --region REGION

# Tag and push
docker tag thermalforge ACCOUNT.dkr.ecr.REGION.amazonaws.com/thermalforge
docker push ACCOUNT.dkr.ecr.REGION.amazonaws.com/thermalforge

The resulting image URI (ACCOUNT.dkr.ecr.REGION.amazonaws.com/thermalforge) is what you'll set as ecr_image_uri in your config YAML (or via the ECR_IMAGE_URI environment variable) when launching SageMaker jobs.
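As a small convenience, the URI pattern above can be assembled and sanity-checked in Python before it goes into the config. The helper below is a hypothetical illustration and is not part of the ThermalForge package.

```python
import re


def ecr_image_uri(account_id, region, repository="thermalforge"):
    """Assemble an ECR image URI of the form ACCOUNT.dkr.ecr.REGION.amazonaws.com/REPO."""
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("account_id must be a 12-digit string")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}"


# Placeholder account ID and region.
uri = ecr_image_uri("123456789012", "us-east-1")
```

Passing the account ID as a string preserves any leading zeros.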

Data

The paper uses a subset of the Ecobee Donate Your Data dataset, which contains data from Ecobee smart thermostats in North American homes. The specific subset used in the paper includes:

  • 300 homes with ≥180 heating days
  • Variables: indoor temperature (from 1 to 6 sensors), outdoor temperature, HVAC run time (stages 1–3), fan run time
  • Resolution: 5-minute intervals, downsampled to 60-minute for modeling
  • Split: 80% train / 10% validation / 10% test (by day)
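The resolution and split bullets above can be illustrated with a short sketch. It assumes that downsampling averages each block of 12 samples and that days are shuffled before splitting; the README states neither detail, so treat both as assumptions (run-time variables, for example, might instead be summed, and the split might be chronological). The function names are hypothetical.

```python
import random


def downsample_mean(samples, factor=12):
    """Average consecutive blocks of `factor` samples (5-min to 60-min for factor=12)."""
    usable = len(samples) - len(samples) % factor  # drop a trailing partial block
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


def split_days(num_days, seed=None, train_frac=0.8, val_frac=0.1):
    """Split day indices into train/validation/test lists (the test set gets the remainder)."""
    days = list(range(num_days))
    random.Random(seed).shuffle(days)
    n_train = int(num_days * train_frac)
    n_val = int(num_days * val_frac)
    return (
        days[:n_train],
        days[n_train:n_train + n_val],
        days[n_train + n_val:],
    )


hourly = downsample_mean(list(range(24)))    # two hours of 5-minute samples
train, val, test = split_days(190, seed=42)  # 190 days, as in home_190.npz
```

Splitting by day rather than by sample keeps every 24-hour window entirely inside one partition.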

Data Preparation

To prepare the dataset, first install the data preprocessing dependency:

# If using pip/conda
pip install -e ".[data]"

# If using uv
uv sync --extra data

Then:

  1. Create an account at Building Benchmark Datasets and download the "processed" dataset (~282 MB), which arrives as ecobee.zip.
  2. Unzip ecobee.zip and extract the clean_data.rar archive inside it to obtain the *.nc (netCDF) files.
  3. Run the preprocessing script, which takes two arguments — the path to the local folder containing the netCDF files, and an S3 path for uploading the output .npz files:
python -m thermal_forge.dataset.run_data_prep /path/to/netcdf/files/ s3://your-bucket/ecobee_data/

This filters for homes with ≥180 heating-only days, removes days with missing data, and uploads the resulting .npz files to S3.

Quick Start

Local (single home)

python -m thermal_forge.agent.agent \
    --config default.yaml \
    --datafile home_190.npz \
    --seed 42

This requires:

  • AWS credentials configured for Bedrock access
  • sm_channel_train set in the YAML config (or via the SM_CHANNEL_TRAIN environment variable) pointing to the directory containing the .npz data file
  • sm_output_data_dir set in the YAML config (or via SM_OUTPUT_DATA_DIR); defaults to "output"
  • --datafile is the filename only (not a full path), and must follow the naming convention <prefix>_<num_days>.npz
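The naming convention in the last bullet can be checked with a small parser. This is a sketch of one way to validate the name; the function is hypothetical, and the package's own parsing may differ.

```python
import os
import re

# <prefix>_<num_days>.npz, where num_days is a run of digits.
_DATAFILE_RE = re.compile(r"^(?P<prefix>.+)_(?P<num_days>\d+)\.npz$")


def parse_datafile_name(name):
    """Return (prefix, num_days) for a bare filename such as 'home_190.npz'."""
    if os.path.basename(name) != name:
        raise ValueError("--datafile must be a bare filename, not a path")
    match = _DATAFILE_RE.match(name)
    if match is None:
        raise ValueError(f"expected <prefix>_<num_days>.npz, got {name!r}")
    return match.group("prefix"), int(match.group("num_days"))


prefix, num_days = parse_datafile_name("home_190.npz")
```

Because the prefix pattern is greedy, a name like my_home_200.npz is read as prefix my_home with 200 days.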

Note: The sm_ prefix on these field names reflects SageMaker's conventions — SageMaker containers automatically set SM_CHANNEL_TRAIN and SM_OUTPUT_DATA_DIR at runtime. Using the same names locally means the same config works in both environments without modification.

At Scale (SageMaker)

To run across hundreds of homes in parallel, we suggest using SageMaker. The config YAML must be placed in src/thermal_forge/experiments/ so that it is bundled and available inside each SageMaker container. This command must be run from the src/ directory:

cd src/

# Launch jobs (config and seed are passed to each training job)
python -m thermal_forge.sm.run_agent --config default.yaml --seed 42

Infrastructure settings (sagemaker_role, ecr_image_uri, s3_data_folder, s3_output_base) can be set either as environment variables or in the YAML config file (see Configuration).

Evaluating resulting models

The Jupyter Notebook in scripts/plot_results.ipynb can be used as a reference for how to evaluate the performance of the resulting models using statistics and graphs like those shown in the paper (e.g., percentile of RMSE for 24-hour roll-outs).
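For orientation, the kind of statistic mentioned above (percentiles of per-rollout RMSE) can be computed as in the following sketch. It uses plain Python lists and made-up numbers; the notebook's actual code, array shapes, and percentile method are not shown in this README, so the details here are assumptions.

```python
import math


def rmse(predicted, target):
    """Root-mean-square error between two equal-length, non-empty sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


def percentile(values, q):
    """Linear-interpolation percentile of a non-empty list, with q in [0, 100]."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


# Made-up rollouts: each pair is (predicted, target) for one window.
rollouts = [
    ([20.0, 20.5, 21.0], [20.0, 20.0, 21.0]),
    ([19.0, 19.0, 19.0], [19.5, 19.5, 19.5]),
    ([21.0, 21.0, 21.0], [21.0, 21.0, 21.0]),
]
errors = [rmse(pred, tgt) for pred, tgt in rollouts]
median_rmse = percentile(errors, 50)
```

With real outputs, each rollout would cover 24 hourly steps, and the percentiles would be taken across homes or test days.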

Configuration

All settings are controlled via YAML files in src/thermal_forge/experiments/. The default configuration matches the paper's primary experiment.

Configuration Priority

Settings are resolved in this order (highest priority first):

  1. YAML config file — the experiment definition (selected via --config)
  2. Environment variables — for infrastructure fields (see below)
  3. Defaults — sensible values built into the code

The --seed CLI flag is the one exception that overrides the YAML value directly.
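The priority rules above can be summarized in a few lines of Python. This is a sketch of the described behavior, not the code in config/config.py: `resolve_settings` and `INFRA_DEFAULTS` are hypothetical names, and the YAML file is assumed to have been parsed into a dict already.

```python
import os

# Built-in defaults for the infrastructure fields, as listed under
# Infrastructure Settings in this README.
INFRA_DEFAULTS = {
    "sm_channel_train": "",
    "sm_output_data_dir": "output",
    "s3_data_folder": "",
    "s3_output_base": "",
    "sagemaker_role": "",
    "ecr_image_uri": "",
}


def resolve_settings(yaml_values, environ=None, cli_seed=None):
    """Apply YAML > environment variable > default, with --seed overriding YAML."""
    environ = os.environ if environ is None else environ
    resolved = {}
    for field, default in INFRA_DEFAULTS.items():
        if field in yaml_values:
            resolved[field] = yaml_values[field]       # 1. YAML config file
        elif field.upper() in environ:
            resolved[field] = environ[field.upper()]   # 2. environment variable
        else:
            resolved[field] = default                  # 3. built-in default
    # The --seed flag is the one CLI value that overrides the YAML value.
    resolved["seed"] = cli_seed if cli_seed is not None else yaml_values.get("seed")
    return resolved


settings = resolve_settings(
    {"sm_channel_train": "/from/yaml", "seed": 7},
    environ={"SM_CHANNEL_TRAIN": "/from/env", "S3_DATA_FOLDER": "s3://bucket/data"},
    cli_seed=42,
)
```

Here `sm_channel_train` comes from the YAML dict, `s3_data_folder` from the environment, `sm_output_data_dir` from the default, and the seed from the CLI.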

Experiment Configurations

| Config File | Paper Section | Description |
| --- | --- | --- |
| default.yaml | §4.2–4.3 | Expert physics prompt + full NN architecture search |
| basic_prompt.yaml | §4.2 | Basic physics prompt (no expert knowledge) |
| agent_control.yaml | §4.4 | LLM-controlled transitions (δ=1) |
| unet_hparam.yaml | §4.3 | U-Net hyperparameter optimization only |
| unet_code.yaml | §4.3 | U-Net code generation via LLM |
| fixed_model.yaml | §4.3 | Baseline RC model (no LLM physics generation) |

Experiment Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| fixed_model | false | Skip LLM physics generation, use baseline RC model |
| nn_arch_search | true | LLM generates full NN architectures (vs. hparam tuning) |
| agent_control | false | LLM decides when to stop iterating (δ parameter) |
| llm_model_id | Claude Sonnet 4.5 | Anthropic model via Bedrock |
| llm_temperature | 0.5 | LLM sampling temperature |
| max_phy_gen_extra | 10 | Physics model generation iterations |
| max_nn_gen_extra | 5 | Neural model generation iterations |
| downsampling_factor | 12 | 5-min → 60-min resolution (12×) |
| seed | null | Random seed for reproducible data splits |

Infrastructure Settings

These fields default to their corresponding environment variable, so they work automatically in SageMaker containers. Setting the same field in a YAML file takes precedence over the environment variable.

| Parameter | Environment Variable | Default | Description |
| --- | --- | --- | --- |
| sm_channel_train | SM_CHANNEL_TRAIN | "" | Directory containing .npz data files |
| sm_output_data_dir | SM_OUTPUT_DATA_DIR | "output" | Directory for saving outputs |
| s3_data_folder | S3_DATA_FOLDER | "" | S3 path to .npz data files |
| s3_output_base | S3_OUTPUT_BASE | "" | S3 path for job outputs |
| sagemaker_role | SAGEMAKER_ROLE | "" | IAM role ARN for SageMaker execution |
| ecr_image_uri | ECR_IMAGE_URI | "" | Docker image URI in ECR |

Examples

Create a custom YAML with both experiment and infrastructure settings:

exp_id: "my-experiment"
agent_control: true
max_phy_gen_extra: 5
llm_temperature: 0.8

# Infrastructure (overrides environment variables if set)
sm_channel_train: "/path/to/data"
sm_output_data_dir: "./output"

Or use environment variables for infrastructure and YAML for experiments:

export SM_CHANNEL_TRAIN="/path/to/data"
export SM_OUTPUT_DATA_DIR="./output"
python -m thermal_forge.agent.agent --config default.yaml --datafile home_190.npz --seed 42

Project Structure

├── docker/
│   └── Dockerfile              # SageMaker-compatible container
├── src/thermal_forge/
│   ├── agent/
│   │   ├── agent.py            # Entry point (CLI)
│   │   ├── graph.py            # LangGraph agent (Algorithm 1)
│   │   ├── llm.py              # Bedrock LLM client
│   │   ├── prompts.py          # All prompt templates (see prompts/ folder)
│   │   └── state.py            # Agent state definition
│   ├── config/
│   │   └── config.py           # ThermalForgeConfig dataclass + YAML loading
│   ├── experiments/            # YAML experiment configurations
│   │   ├── default.yaml        #   Paper's primary experiment
│   │   ├── basic_prompt.yaml   #   §4.2 basic prompt
│   │   ├── agent_control.yaml  #   §4.4 δ=1
│   │   ├── unet_hparam.yaml    #   §4.3 option 1
│   │   ├── unet_code.yaml      #   §4.3 option 2
│   │   └── fixed_model.yaml    #   Baseline
│   ├── prompts/                # Full prompt text (readable markdown)
│   │   ├── README.md           #   Index with paper notation mapping
│   │   ├── gen_phy_basic.md    #   P_pg: physics model generation
│   │   ├── gen_phy_expert.md   #   P_pg: physics generation with expert knowledge
│   │   ├── eval_phy.md         #   P_pz: physics feedback elicitation
│   │   ├── route_phy.md        #   P_pa: physics convergence check
│   │   ├── gen_nn_full_search.md   #   P_ng: full neural architecture search
│   │   ├── gen_nn_unet_search.md   #   P_ng: U-Net code generation
│   │   ├── eval_nn_hparams.md      #   P_ng: U-Net hyperparameter optimization
│   │   ├── eval_nn_full_search.md  #   P_nz: neural feedback (full search)
│   │   ├── eval_nn_unet_search.md  #   P_nz: neural feedback (U-Net)
│   │   └── route_nn.md             #   P_na: neural convergence check
│   ├── dataset/
│   │   ├── ecobee_phy_dataset.py   # Physics-phase dataset
│   │   ├── ecobee_nn_dataset.py    # Neural-phase dataset
│   │   └── run_data_prep.py        # netCDF → .npz preprocessing + S3 upload
│   ├── model/
│   │   ├── rc_thermal/         # Baseline RC thermal model
│   │   └── unet1d/             # 1D U-Net (encoder, decoder)
│   ├── train/
│   │   ├── train_phy.py        # Physics model training loop
│   │   └── train_nn.py         # Neural model training loop
│   ├── sm/
│   │   └── run_agent.py        # SageMaker job launcher
│   └── utils/
│       ├── agent_utils.py      # Model instantiation, predictions, token stats
│       └── data_utils.py       # S3 file listing
├── pyproject.toml              # Package metadata and dependencies
└── LICENSE                     # CC-BY-NC-4.0

Mapping to the Paper

| Paper Concept | Code |
| --- | --- |
| Algorithm 1 (agent loop) | agent/graph.py: LangGraph StateGraph with 7 nodes |
| Bi-level optimization (Eq. 1–2) | Outer: LLM calls in generate_* nodes; Inner: train/train_*.py |
| Prompt P_pg (physics generation) | agent/prompts.py::gen_phy_instructions / gen_phy_instructions_expert |
| Prompt P_ng (neural generation) | agent/prompts.py::gen_nn_instructions_full_search / _unet_search |
| Prompt P_pz / P_nz (feedback) | agent/prompts.py::eval_phy_instructions / eval_nn_instructions_* |
| Prompt P_pa / P_na (agent check) | agent/prompts.py::route_phy_instructions / route_nn_instructions |
| δ parameter (agent control) | config: agent_control; graph.py::route_phy_eval / route_nn_eval |
| Figure 2 (prediction example) | Generated from outputs.npy / targets.npy saved by agent_utils.py |
| Table 1 (token usage) | token_stats.pkl saved by agent_utils.py::save_token_stats |

Citation

If you use ThermalForge in your research, please cite it as follows:

@inproceedings{thermalforge2026,
  title     = {Exploring LLM-Powered Agents for Modeling Thermal Dynamics of Buildings},
  author    = {Walczak, Krzysztof and Bergés, Mario},
  booktitle = {Proceedings of the 13th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (BuildSys '26)},
  year      = {2026},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  doi       = {10.1145/3744256.3812580},
  url       = {https://doi.org/10.1145/3744256.3812580},
  location  = {Banff, AB, Canada},
  series    = {BuildSys '26}
}

License

CC-BY-NC-4.0 — see LICENSE.
