# Performance Results

## Performance data

The following table shows performance data for the LipSync NIM at varying numbers of concurrent streams on a single GPU, measured with the provided sample input files and the Python client. The number of concurrent streams per GPU is configured with the NIM_MAX_CONCURRENCY_PER_GPU environment variable (default: 1). Higher concurrency values consume more GPU memory and can cause out-of-memory errors.
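As a minimal sketch, the concurrency setting can be passed to the NIM container at launch time. The container image name below is a placeholder, not taken from this page; only the NIM_MAX_CONCURRENCY_PER_GPU variable comes from the documentation:

```python
def nim_run_command(image: str, concurrency: int = 1) -> list[str]:
    """Build a `docker run` command that sets NIM_MAX_CONCURRENCY_PER_GPU.

    Higher values allow more parallel streams per GPU but consume more
    GPU memory and can cause out-of-memory errors.
    """
    return [
        "docker", "run", "--rm", "--gpus", "all",
        "-e", f"NIM_MAX_CONCURRENCY_PER_GPU={concurrency}",
        image,
    ]

# "lipsync-nim-image" is a hypothetical placeholder; substitute the real image.
cmd = nim_run_command("lipsync-nim-image", concurrency=2)
print(" ".join(cmd))
```

This only constructs the command; running it requires Docker, a GPU, and the actual NIM image.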

| GPU | Input Video Resolution | Concurrency | Average FPS |
|-----|------------------------|-------------|-------------|
| RTX PRO 6000 Blackwell Server Edition | 720p | 1 | 50 |
| RTX PRO 6000 Blackwell Server Edition | 720p | 2 | 35 |
| RTX PRO 6000 Blackwell Server Edition | 720p | 4 | 27 |
| RTX PRO 6000 Blackwell Server Edition | 1080p | 1 | 42 |
| RTX PRO 6000 Blackwell Server Edition | 1080p | 2 | 27 |
| RTX PRO 6000 Blackwell Server Edition | 1080p | 4 | 17 |
| RTX PRO 6000 Blackwell Server Edition | 4K | 1 | 33 |
| RTX PRO 6000 Blackwell Server Edition | 4K | 2 | 21 |
| RTX L40S | 720p | 1 | 50 |
| RTX L40S | 720p | 2 | 35 |
| RTX L40S | 720p | 4 | 24 |
| RTX L40S | 1080p | 1 | 42 |
| RTX L40S | 1080p | 2 | 27 |
| RTX L40S | 1080p | 4 | 17 |
| RTX L40S | 4K | 1 | 32 |
| RTX L40S | 4K | 2 | 21 |
| RTX L4 | 720p | 1 | 22 |
| RTX L4 | 720p | 2 | 13 |
| RTX L4 | 720p | 4 | 7 |
| RTX L4 | 1080p | 1 | 20 |
| RTX L4 | 1080p | 2 | 13 |
| RTX L4 | 1080p | 4 | 6 |
| RTX L4 | 4K | 1 | 17 |
| RTX L4 | 4K | 2 | 13 |
| RTX A10G | 720p | 1 | 22 |
| RTX A10G | 720p | 2 | 12 |
| RTX A10G | 720p | 4 | 6 |
| RTX A10G | 1080p | 1 | 20 |
| RTX A10G | 1080p | 2 | 13 |
| RTX A10G | 1080p | 4 | 6 |
| RTX A10G | 4K | 1 | 17 |
| RTX A10G | 4K | 2 | 12 |
| RTX 5090 | 720p | 1 | 62 |
| RTX 5090 | 720p | 2 | 37 |
| RTX 5090 | 720p | 4 | 20 |
| RTX 5090 | 1080p | 1 | 47 |
| RTX 5090 | 1080p | 2 | 32 |
| RTX 5090 | 1080p | 4 | 20 |
| RTX 5090 | 4K | 1 | 33 |
| RTX 5090 | 4K | 2 | 21 |
| RTX 5080 | 720p | 1 | 52 |
| RTX 5080 | 720p | 2 | 29 |
| RTX 5080 | 720p | 4 | 15 |
| RTX 5080 | 1080p | 1 | 37 |
| RTX 5080 | 1080p | 2 | 28 |
| RTX 5080 | 1080p | 4 | 15 |
| RTX 5080 | 4K | 1 | 29 |
| RTX 5080 | 4K | 2 | 18 |
| RTX 4090 | 720p | 1 | 57 |
| RTX 4090 | 720p | 2 | 35 |
| RTX 4090 | 720p | 4 | 19 |
| RTX 4090 | 1080p | 1 | 45 |
| RTX 4090 | 1080p | 2 | 30 |
| RTX 4090 | 1080p | 4 | 18 |
| RTX 4090 | 4K | 1 | 32 |
| RTX 4090 | 4K | 2 | 20 |

Inference FPS is calculated by dividing the total number of frames in the input video file by the total inference time in seconds, measured from the time the request is sent until the complete output file is received by the client.
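As a sketch, this metric can be reproduced on the client side by timing the round trip. The frame count and timing values below are illustrative, not taken from the table:

```python
def inference_fps(total_frames: int, start: float, end: float) -> float:
    """Average inference FPS: total frames in the input video divided by
    the wall-clock seconds from sending the request to receiving the
    complete output file."""
    elapsed = end - start
    if elapsed <= 0:
        raise ValueError("end must be later than start")
    return total_frames / elapsed

# Illustrative: a 10-second 30-fps clip (300 frames) whose end-to-end
# inference took 7.1 seconds yields roughly 42 FPS.
fps = inference_fps(300, start=0.0, end=7.1)
print(round(fps, 1))  # 42.3
```

In practice `start` and `end` would come from `time.monotonic()` around the client request.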

Note: Video extension operations significantly increase processing time and memory usage due to frame buffering.

For more information, refer to the NIM clients GitHub repository: NVIDIA-Maxine/nim-clients.