This project demonstrates how to perform inference with a PyTorch model and optimize it using ONNX, OpenVINO, and NVIDIA TensorRT. The script loads a pre-trained ResNet-50 model from torchvision, performs inference on a user-provided image, and prints the top-K predicted classes. Additionally, the script benchmarks the model's performance in the following configurations: PyTorch CPU, ONNX CPU, OpenVINO CPU, PyTorch CUDA, TensorRT-FP32, and TensorRT-FP16, providing insights into the speedup gained through optimization.
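The post-inference step of turning raw logits into top-K class probabilities can be sketched in plain Python (function and label names here are illustrative, not the script's actual API):

```python
import math

def topk_predictions(logits, labels, k=3):
    """Convert raw logits to probabilities and return the k best (label, prob) pairs."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]   # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda p: p[1], reverse=True)
    return ranked[:k]

# Toy example with three classes
print(topk_predictions([2.0, 1.0, 0.1], ["tabby", "lynx", "tiger"], k=2))
```

In the real script the logits come from the model's forward pass and the labels from the ImageNet class list.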
The project is Dockerized for easy deployment:
1. **CPU-only Deployment** - Suitable for non-GPU systems (supports `PyTorch CPU`, `ONNX CPU`, and `OpenVINO CPU` models only).
2. **GPU Deployment** - Optimized for NVIDIA GPUs (supports all models: `PyTorch CPU`, `ONNX CPU`, `OpenVINO CPU`, `PyTorch CUDA`, `TensorRT-FP32`, and `TensorRT-FP16`).
For Docker instructions, refer to the [Steps to Run](#steps-to-run) section.
## Requirements
- A clone of this repository
- Docker
- NVIDIA GPU (for CUDA and TensorRT benchmarks and optimizations)
- Python 3.x
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#install-guide) (for running the Docker container with GPU support)
This command runs predictions on the chosen image (`./inference/train.jpg`), shows the top 3 predictions, and runs all available models. Note: the plot is created only for `--mode=all`; results are plotted and saved to `./inference/plot.png`.
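The averaged-latency measurement behind these benchmarks can be sketched backend-agnostically; the `benchmark` helper and its defaults below are hypothetical, not the script's exact code:

```python
import time

def benchmark(run_inference, warmup=5, iters=50):
    """Return the average latency in milliseconds of run_inference()."""
    for _ in range(warmup):            # warm-up runs are excluded from timing
        run_inference()
    start = time.perf_counter()
    for _ in range(iters):
        run_inference()
    elapsed = time.perf_counter() - start
    return elapsed / iters * 1000.0

# Toy stand-in for a model forward pass
avg_ms = benchmark(lambda: sum(i * i for i in range(1000)))
print(f"average latency: {avg_ms:.3f} ms")
```

For GPU backends the real script must also synchronize the device before reading the clock, otherwise asynchronous kernel launches make the timings meaningless.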
## GPU-CUDA Results
### Inference Benchmark Results
<img src="./inference/plot_latest.png" width="70%">
Here is an example of the input image to run predictions and benchmarks on:
```
#5: 2% lynx
```
### PC Setup
- CPU: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz
- RAM: 32 GB
- GPU: GeForce RTX 3070
## Benchmark Implementation Details
Here you can see the flow for each model and benchmark.
OpenVINO is a toolkit from Intel that optimizes deep learning model inference for Intel hardware.
4. Perform inference on the provided image using the OpenVINO model.
5. Benchmark results, including average inference time, are logged for the OpenVINO model.
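Step 5's result logging can be sketched independently of any backend; the function name, dictionary layout, and message format here are hypothetical:

```python
def log_result(results, backend, times_ms):
    """Record the average of a list of per-iteration latencies for one backend."""
    avg = sum(times_ms) / len(times_ms)
    results[backend] = avg
    print(f"{backend}: {avg:.2f} ms avg over {len(times_ms)} runs")
    return avg

# Collect results for each benchmarked backend in one dictionary
results = {}
log_result(results, "OpenVINO CPU", [15.4, 15.1, 15.0])
```

Collecting every backend's average in a single dictionary makes the later plotting step a one-liner over the same data.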
## Benchmarking and Visualization
The results of the benchmarks for all modes are saved and visualized in a bar chart, showcasing the average inference times across different backends. The visualization aids in comparing the performance gains achieved with different optimizations.
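A bar chart like the one described can be produced with matplotlib; the timings below are made up for illustration, and the output filename is an assumption rather than the script's actual path:

```python
import os

import matplotlib
matplotlib.use("Agg")                  # headless backend; no display required
import matplotlib.pyplot as plt

# Made-up average inference times in milliseconds, one per backend
results = {
    "PyTorch CPU": 42.0,
    "ONNX CPU": 18.5,
    "OpenVINO CPU": 15.2,
    "PyTorch CUDA": 6.1,
    "TensorRT-FP32": 2.9,
    "TensorRT-FP16": 1.7,
}

plt.figure(figsize=(8, 4))
plt.bar(list(results), list(results.values()))
plt.ylabel("Average inference time (ms)")
plt.title("Inference benchmark by backend")
plt.xticks(rotation=30, ha="right")
plt.tight_layout()
plt.savefig("plot.png")
saved = os.path.exists("plot.png")
```

Sorting the dictionary by latency before plotting makes the speedup ordering visible at a glance.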