Performance

NV-CLIP NIM performance is measured as the end-to-end latency of an API call, averaged over 100 iterations.

Latency values are in seconds; throughput values are inputs per second.
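The measurement described above (mean end-to-end latency over 100 iterations) can be sketched as a small timing harness. This is a minimal illustration under stated assumptions, not the harness used to produce the published numbers: `measure_latency` and `fake_request` are hypothetical names, and `fake_request` is a stand-in that sleeps briefly instead of calling a real NV-CLIP NIM endpoint.

```python
import time
from statistics import mean
from typing import Callable


def measure_latency(call: Callable[[], object], iterations: int = 100) -> float:
    """Return the mean wall-clock latency of `call`, in seconds."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        call()  # one full request/response round trip
        samples.append(time.perf_counter() - start)
    return mean(samples)


def fake_request() -> None:
    """Hypothetical stand-in for a real API call (assumption, not the NIM API)."""
    time.sleep(0.001)


if __name__ == "__main__":
    latency = measure_latency(fake_request, iterations=100)
    print(f"mean latency: {latency:.4f} s")
```

To time a real deployment, replace `fake_request` with a function that sends one batch to the service and waits for the complete response, so that the timed interval covers the whole round trip.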

| GPU | Precision | Input Type | Resolution | Batch Size | Latency (s) | Throughput (inputs/s) |
|---|---|---|---|---|---|---|
| H100 | FP16 | Image | 350x197 | 64 | 0.2545 | 251.87 |
| H100 NVL | FP16 | Image | 350x197 | 64 | 0.2671 | 239.65 |
| H100 PCIe | FP16 | Image | 350x197 | 64 | 0.2422 | 264.22 |
| A100 SXM | FP16 | Image | 350x197 | 64 | 0.3678 | 174.01 |
| A100 PCIe | FP16 | Image | 350x197 | 64 | 0.3678 | 174.01 |
| L40S | FP16 | Image | 350x197 | 64 | 0.6943 | 92.18 |
| L4 | FP16 | Image | 350x197 | 64 | 0.6956 | 92.01 |
| A10G | FP16 | Image | 350x197 | 64 | 0.6079 | 105.28 |
| A6000 Ada | FP16 | Image | 350x197 | 64 | 0.3392 | 188.68 |
| RTX 4090 | FP16 | Image | 350x197 | 64 | 0.291 | 219.94 |
| RTX 5080-WSL | FP16 | Image | 350x197 | 64 | 0.2856 | 224.11 |
| RTX 5090-WSL | FP16 | Image | 350x197 | 64 | 0.2154 | 297.18 |
| GH 200 | FP16 | Image | 350x197 | 64 | 0.1775 | 360.5 |
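The throughput figures in the table appear to follow from dividing the batch size by the latency. The source does not state this formula, so treat it as an observation rather than the documented method. The sketch below checks it against three rows copied from the table; `throughput` is a hypothetical helper name.

```python
def throughput(batch_size: int, latency_s: float) -> float:
    """Inputs per second for one batch that took `latency_s` seconds."""
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return batch_size / latency_s


# (GPU, latency in seconds, reported throughput) copied from the table.
rows = [
    ("A100 SXM", 0.3678, 174.01),
    ("L40S", 0.6943, 92.18),
    ("A10G", 0.6079, 105.28),
]

for gpu, latency_s, reported in rows:
    derived = throughput(64, latency_s)  # batch size is 64 in every row
    print(f"{gpu}: derived {derived:.2f}, reported {reported:.2f}")
```

Not every row matches this closely. For H100, 64 / 0.2545 is roughly 251.47 against a reported 251.87, which may reflect rounding of the published latency or a different averaging order.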