Performance
NV-CLIP NIM performance is measured as the end-to-end latency of an API call, averaged over 100 iterations.
Latency values are in seconds; throughput values are inputs per second.
| GPU | Precision | Input Type | Resolution | Batch Size | Latency (s) | Throughput (inputs/s) |
|---|---|---|---|---|---|---|
| H100 | FP16 | Image | 350x197 | 64 | 0.2545 | 251.87 |
| H100 NVL | FP16 | Image | 350x197 | 64 | 0.2671 | 239.65 |
| H100 PCIe | FP16 | Image | 350x197 | 64 | 0.2422 | 264.22 |
| A100 SXM | FP16 | Image | 350x197 | 64 | 0.3678 | 174.01 |
| A100 PCIe | FP16 | Image | 350x197 | 64 | 0.3678 | 174.01 |
| L40S | FP16 | Image | 350x197 | 64 | 0.6943 | 92.18 |
| L4 | FP16 | Image | 350x197 | 64 | 0.6956 | 92.01 |
| A10G | FP16 | Image | 350x197 | 64 | 0.6079 | 105.28 |
| A6000 Ada | FP16 | Image | 350x197 | 64 | 0.3392 | 188.68 |
| RTX 4090 | FP16 | Image | 350x197 | 64 | 0.291 | 219.94 |
| RTX 5080-WSL | FP16 | Image | 350x197 | 64 | 0.2856 | 224.11 |
| RTX 5090-WSL | FP16 | Image | 350x197 | 64 | 0.2154 | 297.18 |
| GH 200 | FP16 | Image | 350x197 | 64 | 0.1775 | 360.5 |
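The measurement described above can be approximated with a small timing harness. The following is a minimal sketch, not the benchmark tooling used to produce the table: `benchmark`, `throughput_from_latency`, and the `call` argument are hypothetical names, and `call` stands in for whatever function sends one batched request to your NV-CLIP NIM endpoint. It also assumes that throughput is derived as batch size divided by mean latency, which is consistent with most rows in the table to within rounding but is not stated in the source.

```python
import time
from statistics import mean
from typing import Callable, Tuple


def throughput_from_latency(batch_size: int, latency_s: float) -> float:
    """Inputs per second, assuming throughput = batch size / latency."""
    return batch_size / latency_s


def benchmark(
    call: Callable[[], None],
    batch_size: int,
    iterations: int = 100,
) -> Tuple[float, float]:
    """Time `call` end to end and return (mean latency in s, inputs per s).

    `call` is a hypothetical zero-argument function that issues one API
    request carrying `batch_size` inputs and blocks until the response
    has been received.
    """
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        call()
        latencies.append(time.perf_counter() - start)

    avg_latency = mean(latencies)
    return avg_latency, throughput_from_latency(batch_size, avg_latency)
```

For example, a batch size of 64 with a mean latency of 0.3678 s gives `throughput_from_latency(64, 0.3678)`, roughly 174.01 inputs per second, matching the A100 rows. Some rows differ in the second decimal place, so the published throughput may have been computed differently, for example by averaging per-iteration throughput.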