Performance

NV-CLIP NIM performance is measured as the end-to-end latency of an API call, averaged over 100 iterations.
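As a rough illustration of this kind of measurement, the sketch below times a request function over a fixed number of iterations and reports the mean latency. The names `benchmark`, `send_batch`, and `clock` are hypothetical and not part of the NIM API; `send_batch` stands in for whatever client code issues one API call with a full batch. Deriving throughput as batch size divided by mean latency is an assumption, although it agrees with the table rows to within rounding (for example, 64 / 0.3678 ≈ 174.01).

```python
import time
from statistics import mean
from typing import Callable, Tuple


def benchmark(
    send_batch: Callable[[], object],
    batch_size: int,
    iterations: int = 100,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float]:
    """Return (mean latency in seconds, throughput in inputs per second).

    send_batch is a hypothetical placeholder for one end-to-end API call
    that submits `batch_size` inputs and waits for the response.
    """
    if batch_size <= 0 or iterations <= 0:
        raise ValueError("batch_size and iterations must be positive")

    latencies = []
    for _ in range(iterations):
        start = clock()
        send_batch()  # one complete request/response round trip
        latencies.append(clock() - start)

    avg_latency = mean(latencies)
    # Assumption: throughput is derived from the mean latency.
    throughput = batch_size / avg_latency
    return avg_latency, throughput


if __name__ == "__main__":
    # Stand-in workload; replace with a real client call to measure a service.
    def fake_request() -> None:
        time.sleep(0.01)

    latency, throughput = benchmark(fake_request, batch_size=64, iterations=10)
    print(f"latency: {latency:.4f} s, throughput: {throughput:.2f} inputs/s")
```

The `clock` parameter exists only so the timing logic can be checked with a deterministic clock; in normal use the default `time.perf_counter` is appropriate because it is monotonic and high resolution.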

Latency values are in seconds; throughput values are inputs per second.

GPU	Precision	Input Type	Resolution	Batch Size	Latency (s)	Throughput (inputs/s)
H100	FP16	Image	350x197	64	0.2545	251.87
H100 NVL	FP16	Image	350x197	64	0.2671	239.65
H100 PCIe	FP16	Image	350x197	64	0.2422	264.22
A100 SXM	FP16	Image	350x197	64	0.3678	174.01
A100 PCIe	FP16	Image	350x197	64	0.3678	174.01
L40S	FP16	Image	350x197	64	0.6943	92.18
L4	FP16	Image	350x197	64	0.6956	92.01
A10G	FP16	Image	350x197	64	0.6079	105.28
A6000 Ada	FP16	Image	350x197	64	0.3392	188.68
RTX 4090	FP16	Image	350x197	64	0.291	219.94
RTX 5080-WSL	FP16	Image	350x197	64	0.2856	224.11
RTX 5090-WSL	FP16	Image	350x197	64	0.2154	297.18
GH 200	FP16	Image	350x197	64	0.1775	360.5