
Commit 9366cc4

Refactor Docker Deployment and Add Linux Server CPU Results (#5)
Refactor Docker setup, update README, and add Linux Server CPU benchmarks.
1 parent: 84678db, commit: 9366cc4

7 files changed: 111 additions & 43 deletions


Dockerfile

Lines changed: 15 additions & 12 deletions
````diff
@@ -1,24 +1,27 @@
-# Use an official TensorRT base image
-FROM nvcr.io/nvidia/tensorrt:23.08-py3
+# Argument for base image. Default is a neutral Python image.
+ARG BASE_IMAGE=python:3.8-slim
 
-# Install system packages
-RUN apt-get update && apt-get install -y \
-    python3-pip \
-    git \
-    libjpeg-dev \
-    libpng-dev
+# Use the base image specified by the BASE_IMAGE argument
+FROM $BASE_IMAGE
 
-# Copy the requirements.txt file into the container
+# Argument to determine environment: cpu or gpu (default is cpu)
+ARG ENVIRONMENT=cpu
+
+# Install required system packages conditionally
+RUN apt-get update && apt-get install -y python3-pip git && \
+    if [ "$ENVIRONMENT" = "gpu" ] ; then apt-get install -y libjpeg-dev libpng-dev ; fi
+
+# Copy the requirements file based on the environment into the container
 COPY requirements.txt /workspace/requirements.txt
 
 # Install Python packages
 RUN pip3 install --no-cache-dir -r /workspace/requirements.txt
 
-# Install torch-tensorrt from the special location
-RUN pip3 install torch-tensorrt -f https://github.com/NVIDIA/Torch-TensorRT/releases
+# Only install torch-tensorrt for GPU environment
+RUN if [ "$ENVIRONMENT" = "gpu" ] ; then pip3 install torch-tensorrt -f https://github.com/NVIDIA/Torch-TensorRT/releases ; fi
 
 # Set the working directory
 WORKDIR /workspace
 
 # Copy local project files to /workspace in the image
-COPY . /workspace
+COPY . /workspace
````
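The conditional `RUN` steps above rely on a plain POSIX `if` executed by the shell Docker invokes. A minimal sketch of that branch outside Docker (the `ENVIRONMENT` default and package list mirror the Dockerfile; the `extras` variable is illustrative, not part of the repo):

```shell
# Emulate the Dockerfile's ENVIRONMENT switch in plain POSIX sh.
ENVIRONMENT="${ENVIRONMENT:-cpu}"    # docker build --build-arg ENVIRONMENT=gpu overrides this

if [ "$ENVIRONMENT" = "gpu" ]; then
    extras="libjpeg-dev libpng-dev"  # image libraries only needed on the GPU path
else
    extras=""
fi

echo "extra packages: ${extras:-<none>}"
```

Note that `ARG BASE_IMAGE` is declared before `FROM`, which is the documented way to parameterize the base image; `ARG ENVIRONMENT` must be re-declared after `FROM` to be visible in later `RUN` steps.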

README.md

Lines changed: 75 additions & 25 deletions
````diff
@@ -6,44 +6,73 @@
 2. [Requirements](#requirements)
    - [Steps to Run](#steps-to-run)
    - [Example Command](#example-command)
-3. [RESULTS](#results) ![Static Badge](https://img.shields.io/badge/update-orange)
+3. [GPU-CUDA Results](#gpu-cuda-results) ![Static Badge](https://img.shields.io/badge/update-orange)
    - [Results explanation](#results-explanation)
    - [Example Input](#example-input)
    - [Example prediction results](#example-prediction-results)
+   - [PC Setup](#pc-setup)
 4. [Benchmark Implementation Details](#benchmark-implementation-details) ![New](https://img.shields.io/badge/-New-842E5B)
    - [PyTorch CPU & CUDA](#pytorch-cpu--cuda)
    - [TensorRT FP32 & FP16](#tensorrt-fp32--fp16)
    - [ONNX](#onnx)
    - [OpenVINO](#openvino)
-5. [Benchmarking and Visualization](#benchmarking-and-visualization) ![New](https://img.shields.io/badge/-New-842E5B)
+5. [Extra](#extra) ![New](https://img.shields.io/badge/-New-842E5B)
+   - [Remote Linux Server - CPU only - Inference](#remote-linux-server-cpu-only-inference)
+   - [Prediction results](#prediction-results)
 6. [Author](#author)
-7. [PC Setup](#pc-setup)
-8. [References](#references)
+7. [References](#references)
 
 
 <img src="./inference/plot_latest.png" width="100%">
 
 ## Overview
-This project demonstrates how to perform inference with a PyTorch model and optimize it using ONNX, OpenVINO, and NVIDIA TensorRT. The script loads a pre-trained ResNet-50 model from torch-vision, performs inference on a user-provided image, and prints the top-K predicted classes. Additionally, the script benchmarks the model's performance in the following configurations: PyTorch CPU, ONNX CPU, OpenVINO CPU, PyTorch CUDA, TensorRT-FP32, and TensorRT-FP16, providing insights into the speedup gained through optimization.
+This project showcases inference with a PyTorch ResNet-50 model and its optimization using ONNX, OpenVINO, and NVIDIA TensorRT. The script runs inference on a user-specified image and displays the top-K predictions. Benchmarking covers the PyTorch CPU, ONNX CPU, OpenVINO CPU, PyTorch CUDA, TensorRT-FP32, and TensorRT-FP16 configurations.
+
+The project is Dockerized for easy deployment:
+1. **CPU-only Deployment** - Suitable for non-GPU systems (supports the `PyTorch CPU`, `ONNX CPU`, and `OpenVINO CPU` models only).
+2. **GPU Deployment** - Optimized for NVIDIA GPUs (supports all models: `PyTorch CPU`, `ONNX CPU`, `OpenVINO CPU`, `PyTorch CUDA`, `TensorRT-FP32`, and `TensorRT-FP16`).
+
+For Docker instructions, refer to the [Steps to Run](#steps-to-run) section.
+
 
 ## Requirements
 - This repo cloned
 - Docker
 - NVIDIA GPU (for CUDA and TensorRT benchmarks and optimizations)
 - Python 3.x
-- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#install-guide) (for running the Docker container with GPU support)
-- ![New](https://img.shields.io/badge/-New-842E5B)[OpenVINO Toolkit](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/download.html) (for running OpenVINO model)
-
-### Steps to Run
-
+- NVIDIA drivers installed on the host machine.
+- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#install-guide) (for running the Docker container with GPU support). Pre-installed within the GPU Docker image.
+
+## Steps to Run
+### Building the Docker Image
+
+Depending on the target environment (CPU or GPU), choose a different base image.
+
+1. **CPU Deployment**:
+   For systems without a GPU or CUDA support, simply use the default base image.
+   ```bash
+   docker build -t my_image_cpu .
+   ```
+
+2. **GPU Deployment**:
+   If your system has GPU support and the NVIDIA Docker runtime installed, use the TensorRT base image to leverage GPU acceleration.
+   ```bash
+   docker build --build-arg ENVIRONMENT=gpu --build-arg BASE_IMAGE=nvcr.io/nvidia/tensorrt:23.08-py3 -t my_image_gpu .
+   ```
+
+### Running the Docker Container
+1. **CPU Version**:
+   ```bash
+   docker run -it --rm my_image_cpu
+   ```
+
+2. **GPU Version**:
+   ```bash
+   docker run --gpus all -it --rm my_image_gpu
+   ```
+
+### Run the Script inside the Container
 ```sh
-# 1. Build the Docker Image
-docker build -t awesome-tensorrt
-
-# 2. Run the Docker Container
-docker run --gpus all --rm -it awesome-tensorrt
-
-# 3. Run the Script inside the Container
 python main.py [--mode all]
 ```
 
````
````diff
@@ -59,7 +88,7 @@ python main.py --topk 3 --mode=all --image_path="./inference/train.jpg"
 
 This command will run predictions on the chosen image (`./inference/train.jpg`), show the top 3 predictions, and run all available models. Note: plot created only for `--mode=all` and results plotted and saved to `./inference/plot.png`
 
-## RESULTS
+## GPU-CUDA Results
 ### Inference Benchmark Results
 <img src="./inference/plot_latest.png" width="70%">
 
````
````diff
@@ -85,6 +114,11 @@ Here is an example of the input image to run predictions and benchmarks on:
 #5: 2% lynx
 ```
 
+### PC Setup
+- CPU: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz
+- RAM: 32 GB
+- GPU: GeForce RTX 3070
+
 ## Benchmark Implementation Details
 Here you can see the flow for each model and benchmark.
 
````
````diff
@@ -125,16 +159,32 @@ OpenVINO is a toolkit from Intel that optimizes deep learning model inference fo
 4. Perform inference on the provided image using the OpenVINO model.
 5. Benchmark results, including average inference time, are logged for the OpenVINO model.
 
-## Benchmarking and Visualization
-The results of the benchmarks for all modes are saved and visualized in a bar chart, showcasing the average inference times across different backends. The visualization aids in comparing the performance gains achieved with different optimizations.
+## Extra
+### Remote Linux Server - CPU only - Inference
+<img src="./inference/plot_linux_server.png" width="70%">
+
+### Prediction results
+`model.log` file content
+```
+Running prediction for OV model
+#1: 15% Egyptian cat
+#2: 14% tiger cat
+#3: 9% tabby
+#4: 2% doormat
+#5: 2% lynx
+
+
+Running prediction for ONNX model
+#1: 15% Egyptian cat
+#2: 14% tiger cat
+#3: 9% tabby
+#4: 2% doormat
+#5: 2% lynx
+```
+
 
 ## Author
 [DimaBir](https://github.com/DimaBir)
-
-## PC Setup
-- CPU: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz
-- RAM: 32 GB
-- GPU: GeForce RTX 3070
 
 ## References
 - **PyTorch**: [Official Documentation](https://pytorch.org/docs/stable/index.html)
````

benchmark/benchmark_models.py

Lines changed: 5 additions & 2 deletions
````diff
@@ -72,7 +72,9 @@ def run(self):
         with torch.no_grad():
             for _ in range(self.nwarmup):
                 features = self.model(input_data)
-        torch.cuda.synchronize()
+
+        if self.device == "cuda":
+            torch.cuda.synchronize()
 
         # Start timing
         print("Start timing ...")
@@ -81,7 +83,8 @@ def run(self):
         for i in range(1, self.nruns + 1):
             start_time = time.time()
             features = self.model(input_data)
-            torch.cuda.synchronize()
+            if self.device == "cuda":
+                torch.cuda.synchronize()
             end_time = time.time()
             timings.append(end_time - start_time)
 
````
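The guarded-synchronize pattern above generalizes to any backend: inject a `synchronize` callable and make it a no-op on CPU. A torch-free sketch of the idea (the `time_inference` helper and the dummy model are illustrative, not from the repo; on CUDA you would pass `torch.cuda.synchronize`):

```python
import time

def time_inference(model, inputs, nruns=100, synchronize=None):
    """Time nruns forward passes; call synchronize() after each if given.

    On CUDA, kernels launch asynchronously, so timing without a synchronize
    measures launch overhead rather than execution; on CPU there is nothing
    to synchronize, which is exactly what the guard in the diff above handles.
    """
    timings = []
    for _ in range(nruns):
        start = time.time()
        model(inputs)
        if synchronize is not None:  # only sync when the device needs it
            synchronize()
        timings.append(time.time() - start)
    return sum(timings) / len(timings)

# Usage with a dummy "model" on the CPU path (no synchronize callable).
avg = time_inference(lambda x: [v * 2 for v in x], [1, 2, 3], nruns=10)
print(f"avg inference time: {avg:.6f}s")
```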

benchmark/benchmark_utils.py

Lines changed: 3 additions & 1 deletion
````diff
@@ -6,7 +6,6 @@
 import seaborn as sns
 from typing import Dict, Any
 import torch
-import onnxruntime as ort
 
 from benchmark.benchmark_models import PyTorchBenchmark, ONNXBenchmark, OVBenchmark
 
@@ -43,6 +42,9 @@ def run_all_benchmarks(
         ("cuda", torch.float16, True),
     ]
     for device, precision, is_trt in configs:
+        if not torch.cuda.is_available() and device == "cuda":
+            continue
+
         model_to_use = models[f"PyTorch_{device}"].to(device)
 
         if not is_trt:
````
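The `continue` guard above can equivalently be written as an upfront filter over the config list. A torch-free sketch under stated assumptions (the `runnable_configs` helper is hypothetical; `configs` entries are `(device, precision, is_trt)` tuples as in `run_all_benchmarks`, and `cuda_available` stands in for `torch.cuda.is_available()`):

```python
def runnable_configs(configs, cuda_available):
    """Drop CUDA benchmark configs on CPU-only hosts.

    Mirrors the `continue` guard in the diff above, but as a single
    comprehension so the loop body never sees unrunnable configs.
    """
    return [c for c in configs if cuda_available or c[0] != "cuda"]

configs = [
    ("cpu", "float32", False),
    ("cuda", "float32", False),
    ("cuda", "float16", True),
]
print(runnable_configs(configs, cuda_available=False))  # only the CPU entry remains
```

Filtering up front keeps the skip logic in one place, at the cost of evaluating availability once rather than per iteration; either form behaves the same here.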

inference/plot_linux_server.png

25.4 KB (binary file added)

main.py

Lines changed: 13 additions & 1 deletion
````diff
@@ -1,6 +1,14 @@
 import logging
 import os.path
-import torch_tensorrt
+import torch
+
+CUDA_AVAILABLE = False
+if torch.cuda.is_available():
+    try:
+        import torch_tensorrt
+        CUDA_AVAILABLE = True
+    except ImportError:
+        print("torch-tensorrt is not installed. Running on CPU mode only.")
 
 from benchmark.benchmark_models import benchmark_onnx_model, benchmark_ov_model
 from benchmark.benchmark_utils import run_all_benchmarks, plot_benchmark_results
@@ -79,6 +87,10 @@ def main():
         precision = config["precision"]
         is_trt = config["is_trt"]
 
+        # check if CUDA is available
+        if device.lower() == "cuda" and not CUDA_AVAILABLE:
+            continue
+
         model = init_cuda_model(model_loader, device, precision)
 
         # If the configuration is not for TensorRT, store the model under a PyTorch key
````
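The `main.py` change is the standard optional-import guard: attempt the GPU-only import, and fall back to a flag when it is absent. A self-contained sketch of the same pattern (`some_gpu_only_lib` is a stand-in module name, assumed absent, not a real dependency of this repo):

```python
# Optional-import guard, as main.py now does for torch_tensorrt.
FEATURE_AVAILABLE = False
try:
    import some_gpu_only_lib  # noqa: F401 -- stand-in for a GPU-only dependency
    FEATURE_AVAILABLE = True
except ImportError:
    # The program keeps running; downstream code checks the flag instead
    # of assuming the import succeeded.
    print("some_gpu_only_lib is not installed. Running in CPU mode only.")

print(f"FEATURE_AVAILABLE={FEATURE_AVAILABLE}")
```

Guarding the import at module load, and then checking the flag per config (`if device.lower() == "cuda" and not CUDA_AVAILABLE: continue`), is what lets the same `main.py` run unmodified in both the CPU-only and GPU Docker images.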

prediction/prediction_utils.py

Lines changed: 0 additions & 2 deletions
````diff
@@ -4,8 +4,6 @@
 import torch
 import onnxruntime as ort
 import numpy as np
-import torch_tensorrt
-
 
 
 def make_prediction(
````
