
Commit 2fd7dfe

Enhancements to Inference Benchmarking (#6)
## Enhanced Inference Benchmarking

- Handled batched input data for benchmarking.
- Added throughput measurement in benchmark results.
- Fixed TensorRT precision-specific issues.
- Updated visualization to include both time and throughput.
1 parent 9078ea3 commit 2fd7dfe

29 files changed: +571 additions, −656 deletions

README.md

Lines changed: 31 additions & 49 deletions
````diff
@@ -6,25 +6,18 @@
 2. [Requirements](#requirements)
    - [Steps to Run](#steps-to-run)
    - [Example Command](#example-command)
-3. [GPU-CUDA Results](#gpu-cuda-results) ![Static Badge](https://img.shields.io/badge/update-orange)
-   - [Results explanation](#results-explanation)
-   - [Example Input](#example-input)
-   - [Example prediction results](#example-prediction-results)
-   - [PC Setup](#pc-setup)
-4. [Benchmark Implementation Details](#benchmark-implementation-details) ![New](https://img.shields.io/badge/-New-842E5B)
+3. [CPU Results](#cpu-results) ![Static Badge](https://img.shields.io/badge/update-orange)
+4. [GPU (CUDA) Results](#gpu-cuda-results) ![Static Badge](https://img.shields.io/badge/update-orange)
+5. [Benchmark Implementation Details](#benchmark-implementation-details) ![New](https://img.shields.io/badge/-New-842E5B)
    - [PyTorch CPU & CUDA](#pytorch-cpu--cuda)
    - [TensorRT FP32 & FP16](#tensorrt-fp32--fp16)
    - [ONNX](#onnx)
    - [OpenVINO](#openvino)
-5. [Extra](#extra) ![New](https://img.shields.io/badge/-New-842E5B)
-   - [Linux Server Inference](#linux-server-inference)
-   - [Prediction results](#prediction-results)
-   - [PC Setup Linux](#pc-setup-linux)
 6. [Author](#author)
 7. [References](#references)
 
 
-<img src="./inference/plot_latest.png" width="100%">
+<img src="./inference/plot_new_gpu.png" width="100%">
 
 ## Overview
 This project showcases inference with a PyTorch ResNet-50 model and its optimization using ONNX, OpenVINO, and NVIDIA TensorRT. The script infers a user-specified image and displays top-K predictions. Benchmarking covers configurations like PyTorch CPU, ONNX CPU, OpenVINO CPU, PyTorch CUDA, TensorRT-FP32, and TensorRT-FP16.
````
````diff
@@ -50,13 +43,13 @@ Refer to the [Steps to Run](#steps-to-run) section for Docker instructions.
 1. **CPU Deployment**:
    For systems without a GPU or CUDA support, simply use the default base image.
    ```bash
-   docker build -t my_image_cpu .
+   docker build -t cpu_img .
    ```
 
 2. **GPU Deployment**:
    If your system has GPU and CUDA support, you can use the TensorRT base image to leverage GPU acceleration.
    ```bash
-   docker build --build-arg ENVIRONMENT=gpu --build-arg BASE_IMAGE=nvcr.io/nvidia/tensorrt:23.08-py3 -t my_project_image_gpu .
+   docker build --build-arg ENVIRONMENT=gpu --build-arg BASE_IMAGE=nvcr.io/nvidia/tensorrt:23.08-py3 -t gpu_img .
    ```
 
 ### Running the Docker Container
````
````diff
@@ -78,7 +71,7 @@ python main.py [--mode all]
 ### Arguments
 - `--image_path`: (Optional) Specifies the path to the image you want to predict.
 - `--topk`: (Optional) Specifies the number of top predictions to show. Defaults to 5 if not provided.
-- `--mode`: (Optional) Specifies the model's mode for exporting and running. Choices are: `onnx`, `ov`, `cuda`, and `all`. If not provided, it defaults to `all`.
+- `--mode`: (Optional) Specifies the model's mode for exporting and running. Choices are: `onnx`, `ov`, `cpu`, `cuda`, `tensorrt`, and `all`. If not provided, it defaults to `all`.
 
 ### Example Command
 ```sh
````
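The argument handling described in the diff above can be sketched with `argparse`. This is an illustrative reconstruction, not the repository's actual `main.py`; the defaults (other than `--topk` and `--mode`, which the README states) are assumptions:

```python
# Hypothetical sketch of the CLI described above; the real main.py may differ.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="ResNet-50 inference benchmark")
    parser.add_argument("--image_path", default=None,
                        help="Path to the image to predict.")
    parser.add_argument("--topk", type=int, default=5,
                        help="Number of top predictions to show (default: 5).")
    parser.add_argument("--mode", default="all",
                        choices=["onnx", "ov", "cpu", "cuda", "tensorrt", "all"],
                        help="Which backend(s) to export and run (default: all).")
    return parser

# Parse an explicit argument list rather than sys.argv, for demonstration.
args = build_parser().parse_args(["--topk", "3", "--mode", "all"])
print(args.topk, args.mode)
```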
````diff
@@ -87,17 +80,33 @@ python main.py --topk 3 --mode=all --image_path="./inference/train.jpg"
 
 This command will run predictions on the chosen image (`./inference/train.jpg`), show the top 3 predictions, and run all available models. Note: the plot is created only for `--mode=all`; results are plotted and saved to `./inference/plot.png`.
 
-## GPU-CUDA Results
+## CPU Results
+<img src="./inference/plot.png" width="70%">
+
+### Prediction results
+```
+#1: 15% Egyptian cat
+#2: 14% tiger cat
+#3: 9% tabby
+#4: 2% doormat
+#5: 2% lynx
+```
+### PC Setup Linux
+- CPU: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
+- RAM: 16 GB
+- GPU: None
+
+## GPU (CUDA) Results
 ### Inference Benchmark Results
-<img src="./inference/plot_latest.png" width="70%">
+<img src="./inference/plot_new_gpu.png" width="100%">
 
 ### Results explanation
-- `PyTorch_cpu: 978.71 ms` indicates the average batch time when running the `PyTorch` model on the `CPU` device.
-- `PyTorch_cuda: 30.11 ms` indicates the average batch time when running the `PyTorch` model on the `CUDA` device.
-- `TRT_fp32: 19.20 ms` shows the average batch time when running the model with `TensorRT` using `float32` precision.
-- `TRT_fp16: 7.32 ms` indicates the average batch time when running the model with `TensorRT` using `float16` precision.
-- ![New](https://img.shields.io/badge/-New-842E5B) `ONNX: 15.95 ms` indicates the average batch inference time when running the `PyTorch` model converted to `ONNX` on the `CPU` device.
-- ![New](https://img.shields.io/badge/-New-842E5B) `OpenVINO: 13.37 ms` indicates the average batch inference time when running the `ONNX` model converted to `OpenVINO` on the `CPU` device.
+- `PyTorch_cpu: 32.83 ms` indicates the average batch time when running the `PyTorch` model on the `CPU` device.
+- `PyTorch_cuda: 5.59 ms` indicates the average batch time when running the `PyTorch` model on the `CUDA` device.
+- `TRT_fp32: 1.69 ms` shows the average batch time when running the model with `TensorRT` using `float32` precision.
+- `TRT_fp16: 1.69 ms` indicates the average batch time when running the model with `TensorRT` using `float16` precision.
+- ![New](https://img.shields.io/badge/-New-842E5B) `ONNX: 16.01 ms` indicates the average batch inference time when running the `PyTorch` model converted to `ONNX` on the `CPU` device.
+- ![New](https://img.shields.io/badge/-New-842E5B) `OpenVINO: 15.65 ms` indicates the average batch inference time when running the `ONNX` model converted to `OpenVINO` on the `CPU` device.
 
 ### Example Input
 Here is an example of the input image to run predictions and benchmarks on:
````
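The "average batch time" figures above, and the throughput measurement this commit adds, can be derived as in the following minimal sketch. The `benchmark` helper and its names are illustrative, not the repository's actual benchmark code:

```python
import time
from statistics import mean

def benchmark(infer, batch, n_warmup: int = 5, n_runs: int = 50):
    """Measure average batch latency (ms) and throughput (images/s).

    `infer` is any callable that runs inference on one batch; `batch` is a
    list of inputs. Illustrative helper, not the repository's actual code.
    """
    for _ in range(n_warmup):              # warm-up runs are excluded from timing
        infer(batch)
    times = []
    for _ in range(n_runs):
        start = time.perf_counter()
        infer(batch)
        times.append(time.perf_counter() - start)
    avg_s = mean(times)
    # Throughput follows directly from batch size and average batch time.
    return avg_s * 1000.0, len(batch) / avg_s  # (ms per batch, images per second)

latency_ms, throughput = benchmark(lambda b: sum(b), list(range(8)))
print(f"{latency_ms:.3f} ms/batch, {throughput:.1f} images/s")
```

Reporting both numbers matters because latency alone hides the effect of batch size: doubling the batch may raise per-batch time while still improving images per second.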
````diff
@@ -158,33 +167,6 @@ OpenVINO is a toolkit from Intel that optimizes deep learning model inference fo
 4. Perform inference on the provided image using the OpenVINO model.
 5. Benchmark results, including average inference time, are logged for the OpenVINO model.
 
-## Extra
-### Linux Server Inference
-<img src="./inference/plot_linux_server.png" width="70%">
-
-### Prediction results
-`model.log` file content
-```
-Running prediction for OV model
-#1: 15% Egyptian cat
-#2: 14% tiger cat
-#3: 9% tabby
-#4: 2% doormat
-#5: 2% lynx
-
-
-Running prediction for ONNX model
-#1: 15% Egyptian cat
-#2: 14% tiger cat
-#3: 9% tabby
-#4: 2% doormat
-#5: 2% lynx
-```
-### PC Setup Linux
-- CPU: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
-- RAM: 16 GB
-- GPU: None
-
 ## Author
 [DimaBir](https://github.com/DimaBir)
 
````
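The `#1: 15% Egyptian cat` lines in the prediction results come from a softmax-plus-top-K step over the model's raw scores. A stdlib-only sketch of that step (the scores and labels below are toy stand-ins for ResNet-50's 1000-class output, not real model data):

```python
import math

def topk_predictions(logits, labels, k=5):
    """Softmax over raw scores, then format the k highest-probability labels."""
    m = max(logits)                               # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(probs, labels), reverse=True)[:k]
    return [f"#{i + 1}: {p:.0%} {name}" for i, (p, name) in enumerate(ranked)]

# Toy scores and labels, standing in for the real ImageNet class list.
for line in topk_predictions([2.0, 1.9, 1.4, 0.1, 0.1, -1.0],
                             ["Egyptian cat", "tiger cat", "tabby",
                              "doormat", "lynx", "snowmobile"], k=5):
    print(line)
```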

benchmark/__init__.py

Whitespace-only changes.

benchmark/benchmark_models.py

Lines changed: 0 additions & 247 deletions
This file was deleted.

0 commit comments
