Performance
This section provides performance benchmarks for VSS across several video lengths and chunk sizes, measured on an 8x H100 PCIe system. Four GPUs are used for the LLM, and the other four are shared among the Embedding, ReRanker, and VLM.

Summarization latency is measured with Graph RAG both enabled and disabled. Enabling Graph RAG allows interactive Q&A after the summary has been generated, but it adds latency.

| Video Length (Minutes) | Chunk Size (Seconds) | Chunks | Summarization Latency - Graph RAG Off (Seconds) | Summarization Latency - Graph RAG On (Seconds) |
|---|---|---|---|---|
| 1 | 10 | 6 | 12.94 | 28.56 |
| 1 | 20 | 3 | 10.60 | 24.19 |
| 1 | 30 | 2 | 11.15 | 25.26 |
| 10 | 10 | 60 | 55.25 | 89.88 |
| 10 | 20 | 30 | 30.73 | 59.99 |
| 10 | 30 | 20 | 28.17 | 51.52 |
| 30 | 10 | 180 | 75.00 | 197.89 |
| 30 | 20 | 90 | 48.67 | 116.96 |
| 30 | 30 | 60 | 43.81 | 85.42 |
| 60 | 10 | 360 | 138.86 | 382.25 |
| 60 | 20 | 180 | 76.38 | 194.90 |
| 60 | 30 | 120 | 67.59 | 141.41 |
| 120 | 10 | 720 | 204.12 | 684.58 |
| 120 | 20 | 360 | 126.31 | 364.79 |
| 120 | 30 | 240 | 87.11 | 320.88 |

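
The Chunks column follows directly from the video length and the chunk size. As a minimal sketch of that arithmetic (the function name `chunk_count` is hypothetical and not part of VSS, and rounding up a partial final chunk is an assumption, since every benchmarked combination divides evenly):

```python
import math


def chunk_count(video_minutes: float, chunk_seconds: float) -> int:
    """Number of chunks a video is split into for a given chunk size.

    Assumption: a trailing partial chunk counts as one more chunk.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    return math.ceil(video_minutes * 60 / chunk_seconds)


# Reproduce the Chunks column for the 60-minute rows of the table.
for size in (10, 20, 30):
    print(f"60 min at {size}s chunks: {chunk_count(60, size)} chunks")
```
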
From this data, the summarization speed-up can be plotted. The speed-up compares summary generation time with the length of the video, showing how much faster VSS generates a summary than a person could by watching the full video.
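
The speed-up can also be computed directly from the table. The sketch below assumes the speed-up is defined as video duration divided by summarization latency; the function name `speedup` is illustrative, and the sample rows are the 30-second-chunk, Graph RAG off measurements listed above.

```python
def speedup(video_minutes: float, latency_seconds: float) -> float:
    """Video duration divided by summarization latency (both in seconds)."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return video_minutes * 60 / latency_seconds


# (video length in minutes, latency in seconds) for 30s chunks, Graph RAG off.
rows = [(1, 11.15), (10, 28.17), (30, 43.81), (60, 67.59), (120, 87.11)]

for minutes, latency in rows:
    print(f"{minutes:>4} min video: {speedup(minutes, latency):.1f}x faster than real time")
```

With these figures, the speed-up grows with video length, because latency rises more slowly than video duration.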

The following graph shows the relationship between latency, chunk size, and video length. Longer videos and smaller chunk sizes produce more chunks to process and therefore have higher latency.
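
One way to read this relationship from the table is to divide the total latency by the number of chunks. The sketch below does this for the 60-minute, Graph RAG off rows; the helper name `latency_per_chunk` is hypothetical, and the average it returns is only a derived figure, not a metric reported by VSS.

```python
def latency_per_chunk(latency_seconds: float, chunks: int) -> float:
    """Average summarization latency attributed to each chunk."""
    if chunks <= 0:
        raise ValueError("chunks must be positive")
    return latency_seconds / chunks


# (chunk size in seconds, chunks, latency in seconds) for the 60-minute video.
rows_60_min = [(10, 360, 138.86), (20, 180, 76.38), (30, 120, 67.59)]

for chunk_size, chunks, latency in rows_60_min:
    per_chunk = latency_per_chunk(latency, chunks)
    print(f"{chunk_size}s chunks: {latency:.2f}s total, {per_chunk:.3f}s per chunk")
```
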
