Developing a sustainable AI system

Overview

The rapid growth of large language models (LLMs) has significantly increased the computational demand and energy consumption of modern GPU servers. This project develops a system-level GPU frequency control algorithm to improve power efficiency and reduce energy consumption for LLM inference workloads under varying throughput requirements.

The proposed system supports both:

Fixed-workload scheduling
Fixed-interval scheduling

A performance and power model is first derived from experimental data. It is then used as input to an optimization algorithm that determines the GPU frequency settings and workload allocations.

Objectives

Improve energy efficiency of multi-GPU LLM inference systems
Reduce overall energy consumption under throughput constraints
Develop system-level frequency and workload optimization strategies

Methodology

The system consists of three main stages:

1. Performance & Power Modeling

Empirical measurements are collected to model:

GPU performance and power under different frequency settings
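As an illustration of this stage, the sketch below fits a power model to (frequency, power) measurements with ordinary least squares. The model form P(f) = p_static + k * f^3 is an assumption chosen for the example, not necessarily the form used in this project, and the function name `fit_cubic_power_model` is hypothetical.

```python
from typing import Sequence, Tuple


def fit_cubic_power_model(
    freqs_ghz: Sequence[float], power_w: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of P(f) = p_static + k * f**3.

    ASSUMPTION: the cubic form is illustrative only; the project's
    actual model may use a different functional form.
    Returns (p_static, k).
    """
    if len(freqs_ghz) != len(power_w) or len(freqs_ghz) < 2:
        raise ValueError("need at least two matching (frequency, power) samples")

    # The model is linear in x = f**3, so simple linear regression applies.
    xs = [f ** 3 for f in freqs_ghz]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(power_w) / n

    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all frequencies are identical; cannot fit a slope")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, power_w))

    k = sxy / sxx
    p_static = mean_y - k * mean_x
    return p_static, k
```

Called with measured frequencies and average power readings, the function returns the two coefficients, which can then be evaluated at any candidate frequency. A performance model (throughput versus frequency) could be fitted in the same way.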

2. Optimization Algorithm

A system-level optimization framework is used to determine:

GPU frequency configuration
Workload allocation across GPUs
Idle-state configuration

Implemented in:

optimization.py
opt.py
calc.py
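The kind of decision this stage makes can be sketched as an exhaustive search. The example below is a simplified illustration, not the formulation in optimization.py: it assumes identical GPUs, an equal workload split across active GPUs, a single shared frequency, and a fixed idle power for inactive GPUs. All names (`Plan`, `choose_plan`, `throughput_at`, `power_at`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Plan:
    active_gpus: int       # GPUs that receive work (equal split assumed)
    freq_mhz: int          # shared frequency of the active GPUs
    total_power_w: float   # active power plus idle power of unused GPUs


def choose_plan(
    num_gpus: int,
    freq_levels_mhz: Sequence[int],
    throughput_at: Callable[[int], float],  # per-GPU throughput model
    power_at: Callable[[int], float],       # per-GPU active power model
    idle_power_w: float,
    required_throughput: float,
) -> Optional[Plan]:
    """Return the lowest-power feasible plan, or None if none exists.

    ASSUMPTION: brute-force search over (active GPU count, frequency);
    the project's optimizer may use a different method and richer models.
    """
    best: Optional[Plan] = None
    for active in range(1, num_gpus + 1):
        for freq in freq_levels_mhz:
            # Skip configurations that miss the throughput requirement.
            if active * throughput_at(freq) < required_throughput:
                continue
            total = active * power_at(freq) + (num_gpus - active) * idle_power_w
            if best is None or total < best.total_power_w:
                best = Plan(active, freq, total)
    return best
```

With a convex power model, running more GPUs at a lower frequency can draw less total power than running fewer GPUs at a higher one, which is why the search considers the active GPU count and the frequency jointly.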

3. Evaluation

The optimized scheduling strategy is evaluated on a multi-GPU system (ARC cluster) and compared against baseline configurations.

Implemented in:

final_test.py
final_test_boot.py
final_run.sh
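A comparison against a baseline typically reduces to computing energy from sampled power and reporting the relative saving. The sketch below shows one way to do that with trapezoidal integration. It is an assumption about how such a comparison could be computed, not a description of final_test.py, and both function names are hypothetical.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate sampled power over time with the trapezoidal rule."""
    if len(timestamps_s) != len(power_w) or len(timestamps_s) < 2:
        raise ValueError("need at least two matching (time, power) samples")
    total = 0.0
    for i in range(len(timestamps_s) - 1):
        dt = timestamps_s[i + 1] - timestamps_s[i]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid between two consecutive samples.
        total += dt * (power_w[i] + power_w[i + 1]) / 2.0
    return total


def relative_saving(baseline_j: float, optimized_j: float) -> float:
    """Fraction of baseline energy saved by the optimized run."""
    if baseline_j <= 0:
        raise ValueError("baseline energy must be positive")
    return (baseline_j - optimized_j) / baseline_j
```

Both runs should process the same workload (or cover the same interval) for the saving to be meaningful.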

📁 Repository Structure

.
├── calc.py                 # Efficiency and energy calculation
├── optimization.py         # Core optimization formulation
├── opt.py                  # Main optimization execution logic
├── final_test.py           # Main evaluation script
├── final_test_boot.py      # Reboot experiment script
├── final_run.sh            # Shell script for configuring GPU frequencies on ARC
├── results/                # Output results and figures
│   ├── benchmark/          # Benchmark measurements
│   ├── boot/               # Boot measurements
│   └── idle/               # Algorithm evaluation measurements
├── pre_results/            # Pre-experiment measurements for the performance and power models
└── README.md
