This repository was archived by the owner on Feb 3, 2025. It is now read-only.

Cuda synchronize alternative for profiling #304

@aimilefth


Greetings,

I am currently using TF-TRT and I want to measure the performance of my models (latency and throughput).

The TensorRT C++ API supports device synchronization through the CUDA events API: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#cuda-events

In addition, PyTorch provides torch.cuda.synchronize() for the same purpose:
https://pytorch.org/docs/stable/generated/torch.cuda.synchronize.html

However, I can't find anything similar in the TF-TRT docs, and in my opinion it is essential for measuring performance metrics correctly.
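
To illustrate why the synchronization matters, here is a minimal sketch of the measurement loop I have in mind. It is framework-agnostic: `benchmark`, `run`, and `sync` are hypothetical names of my own, not part of TF-TRT. `run` stands for one inference call and `sync` for whatever blocks until the GPU has finished. As an assumption for TensorFlow, copying the output to the host (for example calling `.numpy()` on the result tensor) could play that role, although it would also add the device-to-host copy to the measured time.

```python
import time
from statistics import mean


def benchmark(run, sync, batch_size, warmup=10, iters=100):
    """Time `run`, calling `sync` before each clock read.

    `run` launches one inference and `sync` must block until the device
    is idle. Both are hypothetical callables supplied by the caller.
    """
    for _ in range(warmup):
        run()
    sync()  # drain warm-up work so it is not charged to the first timed run

    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        run()
        sync()  # without this, only the asynchronous launch would be timed
        latencies.append(time.perf_counter() - start)

    avg = mean(latencies)  # average seconds per batch
    return {
        "latency_ms": avg * 1e3,
        "throughput_per_s": batch_size / avg,
    }
```

With a stand-in workload such as `benchmark(lambda: time.sleep(0.002), lambda: None, batch_size=8)` the loop reports a latency of at least 2 ms. With a real model and no `sync`, the reported latency would mostly reflect kernel launch overhead rather than actual execution time.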

Have I missed anything or are there plans to integrate such functionality?

Thank you
