Table of Contents

N/A: Benchmark not present in a round.

X: Change in benchmark. Submission results can be compared across rounds only when there has been no change in the benchmark between them.
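The comparability rule in the legend can be sketched in a few lines of Python. This is a minimal sketch under one assumption: each X marks the first round of a changed benchmark, so two rounds are comparable when no change falls after the earlier round and up to the later one. The function name `comparable` and the sample data `ROUNDS` and `CHANGES` are hypothetical and are not taken from the tables below; N/A rounds are not modeled.

```python
from bisect import bisect_right

# Hypothetical example data (an assumption, not read from the tables).
ROUNDS = ["0.5", "0.6", "0.7", "1.0"]
CHANGES = ["0.5", "0.7"]  # rounds in which the benchmark changed


def comparable(rounds, change_rounds, a, b):
    """Return True if rounds a and b fall in the same unchanged span."""
    position = {r: i for i, r in enumerate(rounds)}
    change_positions = sorted(position[r] for r in change_rounds)
    # A round's span is identified by how many changes happened
    # at or before it; equal counts mean no change in between.
    span_a = bisect_right(change_positions, position[a])
    span_b = bisect_right(change_positions, position[b])
    return span_a == span_b


if __name__ == "__main__":
    print(comparable(ROUNDS, CHANGES, "0.5", "0.6"))
    print(comparable(ROUNDS, CHANGES, "0.6", "0.7"))
```

With this sample data, 0.5 and 0.6 share a span, while 0.6 and 0.7 do not because the hypothetical benchmark changed in 0.7.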

Training

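Rounds: 0.5, 0.6, 0.7, 1.0, 1.1, 2.0, 2.1, 3.0, 3.1, 4.0, 4.1, 5.0

| Model | Entries in round order (an entry can span several rounds) |
| --- | --- |
| ResNet-50 v1.5 | X, X, N/A |
| SSD-ResNet34 | X, X, N/A |
| RetinaNet-ResNeXt50 | N/A, X |
| MaskRCNN | X, X, N/A |
| NCF | X, N/A |
| NMT | X, N/A |
| Transformer | X, N/A |
| MiniGo | X, N/A |
| DLRM | N/A, X, N/A |
| DLRM-dcnv2 | N/A, X |
| BERT | N/A, X |
| RNN-T | N/A, X, X, N/A |
| 3D U-Net | N/A, X, N/A |
| GPT3 | N/A, X, N/A |
| Stable Diffusionv2 | N/A, X |
| LLama70B-LoRA | N/A, X |
| RGAT | N/A, X |
| Llama3.1 405b | N/A, X |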

Metric: Time-to-train (measured in minutes)

Note: In v0.6, ResNet-50 v1.5, SSD-ResNet34, and NMT increased their accuracy targets, and all v0.6 benchmarks changed initialization timing. In v0.7, MiniGo moved to a 19x19 board.

HPC

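Rounds: 0.7, 1.0, 2.0

| Model | Entries in round order (an entry can span several rounds) |
| --- | --- |
| CosmoFlow | X, X, X |
| DeepCAM | X, X |
| Open Catalyst | N/A, X, X |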

Metrics: Time-to-train (measured in minutes) and throughput (weak scaling - measured in models/minute)
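The two HPC metrics differ only in what is divided by what. The sketch below assumes that time-to-train is the wall-clock span of one training run and that weak-scaling throughput is the number of model instances trained concurrently divided by the total elapsed minutes. Both helper names are hypothetical, and the exact timing boundaries are set by the benchmark rules, not by this code.

```python
def time_to_train_minutes(start_seconds: float, stop_seconds: float) -> float:
    """Wall-clock training time in minutes (assumed timing boundaries)."""
    if stop_seconds < start_seconds:
        raise ValueError("stop must not precede start")
    return (stop_seconds - start_seconds) / 60.0


def weak_scaling_throughput(num_models: int, elapsed_minutes: float) -> float:
    """Models trained per minute when num_models train concurrently."""
    if elapsed_minutes <= 0:
        raise ValueError("elapsed_minutes must be positive")
    return num_models / elapsed_minutes


if __name__ == "__main__":
    minutes = time_to_train_minutes(0.0, 2400.0)  # a 40-minute run
    print(minutes, weak_scaling_throughput(8, minutes))
```

Lower is better for time-to-train, while higher is better for throughput, so the two numbers should not be ranked on the same scale.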

Inference

Datacenter

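Rounds: 0.5, 0.7, 1.0, 1.1, 2.0, 2.1, 3.0, 3.1, 4.0, 4.1, 5.0

| Model | Entries in round order (an entry can span several rounds) |
| --- | --- |
| MobileNet-v1 | X, N/A |
| ResNet-50 v1.5 | X |
| SSD-MobileNets | X, N/A |
| SSD-ResNet34 | X, N/A |
| RetinaNet-ResNeXt50 | N/A, X |
| NMT | X, N/A |
| DLRM | N/A, X, N/A |
| DLRM-v2 | N/A, X |
| BERT | N/A, X, N/A |
| RNN-T | N/A, X, N/A |
| 3D U-Net | N/A, X |
| GPT-J | N/A, X |
| Llama2-70b | N/A, X |
| Stable-diffusion-xl | N/A, X |
| Mixtral-8x7b | N/A, X |
| Llama 3.1 405B | N/A, X |
| Llama 2 70B interactive | N/A, X |
| RGAT | N/A, X |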

Edge

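Rounds: 0.5, 0.7, 1.0, 1.1, 2.0, 2.1, 3.0, 3.1, 4.0, 4.1, 5.0

| Model | Entries in round order (an entry can span several rounds) |
| --- | --- |
| MobileNet-v1 | X, N/A |
| ResNet-50 v1.5 | X |
| SSD-MobileNets | X, N/A |
| SSD-ResNet34 | X, N/A |
| RetinaNet-ResNeXt50 | N/A, X |
| NMT | X, N/A |
| DLRM | N/A, X, N/A |
| DLRM-v2 | N/A, X, N/A |
| BERT | N/A, X |
| RNN-T | N/A, X, N/A |
| 3D U-Net | N/A, X |
| GPT-J | N/A, X |
| Stable-diffusion-xl | N/A, X |
| Automotive PointPainting | N/A, X |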

Metrics: Queries/second (server), samples/second (offline), latency in milliseconds (single stream), number of streams (multi-stream, v0.5-v1.1), latency in milliseconds (multi-stream, v2.0+)

Additional power metrics: System power in watts (server and offline), system energy per stream in joules (single stream and multi-stream)

Note: Performance metrics for inference and power submissions are not comparable

Note: Multi-stream v0.5-v1.1 is not compatible with v2.0 and newer

Note: The Inference over Network scenario was introduced in v2.1
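The power metrics above mix two units: watts for the throughput scenarios and joules per stream for the latency scenarios. The relation between them is plain physics (energy = power × time). The sketch below assumes average system power over the stream and the stream latency in milliseconds as inputs; the function name `energy_per_stream_joules` is hypothetical, and how the benchmark itself samples power is not covered here.

```python
def energy_per_stream_joules(avg_power_watts: float, latency_ms: float) -> float:
    """Energy in joules for one stream: watts times seconds."""
    if avg_power_watts < 0 or latency_ms < 0:
        raise ValueError("power and latency must be non-negative")
    # Convert milliseconds to seconds before multiplying by watts.
    return avg_power_watts * latency_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical numbers: a 50 W system answering one stream in 20 ms.
    print(energy_per_stream_joules(50.0, 20.0))
```

Reporting energy rather than power for single stream and multi-stream keeps a slow, low-power system from looking better than a fast one that finishes each stream with less total energy.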

Mobile

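Rounds: 0.7, 1.0, 1.1, 2.0, 2.1, 3.0

| Model | Entries in round order (an entry can span several rounds) |
| --- | --- |
| MobileNetEdge | X |
| SSD-MobileNetsV2 | X, N/A |
| MobileDET | N/A, X |
| DeeplabV3 | X, N/A |
| MOSAIC | N/A, X |
| MobileBERT | X |
| EDSR | N/A, X |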

Primary metrics: Latency (measured in milliseconds) (single stream), Samples/second (offline)

Note: A submission must include every benchmark in single stream, and MobileNetEdge in both single stream and offline

Tiny

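Rounds: 0.5, 0.7, 1.0

| Model | Entries in round order (an entry can span several rounds) |
| --- | --- |
| MobileNetV1 | X, X |
| ResNet-V1 | X*, X |
| DSCNN | X, X |
| FC Autoencoder | X, X |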

Primary metric: Latency (measured in milliseconds)

Secondary metric: Energy per inference (measured in microjoules)

*Latency compatible, but not accuracy compatible: v0.5 and v0.7 use the same model, but the evaluation set was changed to improve balance.