# Performance Results

## Performance data

The following table shows performance data for the LipSync NIM at varying numbers of concurrent streams on a single GPU, measured with the provided sample input files and the Python client. The number of concurrent streams per GPU is configured with the NIM_MAX_CONCURRENCY_PER_GPU environment variable (default: 1). Higher concurrency values consume more GPU memory and can cause out-of-memory errors.
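As a minimal sketch, the concurrency setting can be passed to the NIM container at launch time. The container image name below is a placeholder, not taken from this page; only the NIM_MAX_CONCURRENCY_PER_GPU variable comes from the documentation:

```python
def nim_run_command(image: str, concurrency: int = 1) -> list[str]:
    """Build a `docker run` command that sets NIM_MAX_CONCURRENCY_PER_GPU.

    Higher values allow more parallel streams per GPU but consume more
    GPU memory and can cause out-of-memory errors.
    """
    return [
        "docker", "run", "--rm", "--gpus", "all",
        "-e", f"NIM_MAX_CONCURRENCY_PER_GPU={concurrency}",
        image,
    ]

# "lipsync-nim-image" is a hypothetical placeholder; substitute the real image.
cmd = nim_run_command("lipsync-nim-image", concurrency=2)
print(" ".join(cmd))
```

This only constructs the command; running it requires Docker, a GPU, and the actual NIM image.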

| GPU | Input Video Resolution | Concurrency | Average FPS |
|-----|------------------------|-------------|-------------|
| RTX PRO 6000 Blackwell Server Edition | 720p | 1 | 50 |
| RTX PRO 6000 Blackwell Server Edition | 720p | 2 | 35 |
| RTX PRO 6000 Blackwell Server Edition | 720p | 4 | 27 |
| RTX PRO 6000 Blackwell Server Edition | 1080p | 1 | 42 |
| RTX PRO 6000 Blackwell Server Edition | 1080p | 2 | 27 |
| RTX PRO 6000 Blackwell Server Edition | 1080p | 4 | 17 |
| RTX PRO 6000 Blackwell Server Edition | 4K | 1 | 33 |
| RTX PRO 6000 Blackwell Server Edition | 4K | 2 | 21 |
| RTX L40S | 720p | 1 | 50 |
| RTX L40S | 720p | 2 | 35 |
| RTX L40S | 720p | 4 | 24 |
| RTX L40S | 1080p | 1 | 42 |
| RTX L40S | 1080p | 2 | 27 |
| RTX L40S | 1080p | 4 | 17 |
| RTX L40S | 4K | 1 | 32 |
| RTX L40S | 4K | 2 | 21 |
| RTX L4 | 720p | 1 | 22 |
| RTX L4 | 720p | 2 | 13 |
| RTX L4 | 720p | 4 | 7 |
| RTX L4 | 1080p | 1 | 20 |
| RTX L4 | 1080p | 2 | 13 |
| RTX L4 | 1080p | 4 | 6 |
| RTX L4 | 4K | 1 | 17 |
| RTX L4 | 4K | 2 | 13 |
| RTX A10G | 720p | 1 | 22 |
| RTX A10G | 720p | 2 | 12 |
| RTX A10G | 720p | 4 | 6 |
| RTX A10G | 1080p | 1 | 20 |
| RTX A10G | 1080p | 2 | 13 |
| RTX A10G | 1080p | 4 | 6 |
| RTX A10G | 4K | 1 | 17 |
| RTX A10G | 4K | 2 | 12 |
| RTX 5090 | 720p | 1 | 62 |
| RTX 5090 | 720p | 2 | 37 |
| RTX 5090 | 720p | 4 | 20 |
| RTX 5090 | 1080p | 1 | 47 |
| RTX 5090 | 1080p | 2 | 32 |
| RTX 5090 | 1080p | 4 | 20 |
| RTX 5090 | 4K | 1 | 33 |
| RTX 5090 | 4K | 2 | 21 |
| RTX 5080 | 720p | 1 | 52 |
| RTX 5080 | 720p | 2 | 29 |
| RTX 5080 | 720p | 4 | 15 |
| RTX 5080 | 1080p | 1 | 37 |
| RTX 5080 | 1080p | 2 | 28 |
| RTX 5080 | 1080p | 4 | 15 |
| RTX 5080 | 4K | 1 | 29 |
| RTX 5080 | 4K | 2 | 18 |
| RTX 4090 | 720p | 1 | 57 |
| RTX 4090 | 720p | 2 | 35 |
| RTX 4090 | 720p | 4 | 19 |
| RTX 4090 | 1080p | 1 | 45 |
| RTX 4090 | 1080p | 2 | 30 |
| RTX 4090 | 1080p | 4 | 18 |
| RTX 4090 | 4K | 1 | 32 |
| RTX 4090 | 4K | 2 | 20 |

Inference FPS is calculated by dividing the total number of frames in the input video file by the total inference time in seconds, measured from the time the request is sent until the complete output file is received by the client.
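As a sketch, this metric can be reproduced on the client side by timing the round trip. The frame count and timing values below are illustrative, not taken from the table:

```python
def inference_fps(total_frames: int, start: float, end: float) -> float:
    """Average inference FPS: total frames in the input video divided by
    the wall-clock seconds from sending the request to receiving the
    complete output file."""
    elapsed = end - start
    if elapsed <= 0:
        raise ValueError("end must be later than start")
    return total_frames / elapsed

# Illustrative: a 10-second 30-fps clip (300 frames) whose end-to-end
# inference took 7.1 seconds yields roughly 42 FPS.
fps = inference_fps(300, start=0.0, end=7.1)
print(round(fps, 1))  # 42.3
```

In practice `start` and `end` would come from `time.monotonic()` around the client request.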

Note: Video extension operations significantly increase processing time and memory usage due to frame buffering.

For more information, refer to the NIM clients GitHub repository: NVIDIA-Maxine/nim-clients.