Performance

NV-CLIP NIM performance is measured as the end-to-end latency of an API call, averaged over 100 iterations.

Latency values are in seconds; throughput values are inputs per second.
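The measurement described above can be sketched roughly as follows. This is a minimal illustration, not the harness used to produce these numbers: `benchmark` and `fake_request` are hypothetical names, the stand-in request only sleeps instead of calling the NIM endpoint, and computing throughput as batch size divided by mean latency is an assumption that the source does not state explicitly.

```python
import time
from statistics import mean
from typing import Callable, Tuple


def benchmark(call: Callable[[], object], batch_size: int,
              iterations: int = 100) -> Tuple[float, float]:
    """Return (mean latency in seconds, throughput in inputs per second).

    `call` is a hypothetical zero-argument function that performs one
    end-to-end API request carrying `batch_size` inputs.
    """
    if batch_size <= 0 or iterations <= 0:
        raise ValueError("batch_size and iterations must be positive")

    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        call()  # one full request/response round trip
        latencies.append(time.perf_counter() - start)

    avg_latency = mean(latencies)
    # Assumption: throughput = inputs per request / mean seconds per request.
    return avg_latency, batch_size / avg_latency


def fake_request() -> None:
    """Stand-in for a real API call; replace with an actual client request."""
    time.sleep(0.001)


if __name__ == "__main__":
    latency, throughput = benchmark(fake_request, batch_size=64, iterations=100)
    print(f"latency: {latency:.4f} s, throughput: {throughput:.2f} inputs/s")
```

To benchmark a real deployment, `fake_request` would be swapped for a function that sends one batch of 64 images to the service and waits for the response.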

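| GPU       | Precision | Input Type | Resolution | Batch Size | Latency (s) | Throughput (inputs/s) |
|-----------|-----------|------------|------------|------------|-------------|-----------------------|
| H100 SXM  | FP16      | Image      | 350x197    | 64         | 0.2568      | 249.22                |
| H100 PCIe | FP16      | Image      | 350x197    | 64         | 0.2568      | 249.22                |
| A100 SXM  | FP16      | Image      | 350x197    | 64         | 0.3968      | 160.57                |
| A100 PCIe | FP16      | Image      | 350x197    | 64         | 0.3968      | 160.57                |
| L40S      | FP16      | Image      | 350x197    | 64         | 0.3562      | 179.67                |
| A10G      | FP16      | Image      | 350x197    | 64         | 0.615       | 104.07                |
| A6000 Ada | FP16      | Image      | 350x197    | 64         | 0.3701      | 172.93                |
| RTX 4090  | FP16      | Image      | 350x197    | 64         | 0.339       | 188.78                |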