Performance

Training Performance Results

We measured the throughput of training VideoNeVA models on varying numbers of DGX H100 nodes and achieved near-linear scaling on the DGX H100 platform.

The following table and chart show the pretraining performance results on an NVIDIA DGX SuperPOD (up to 16 nodes x 8 H100 80GB GPUs per node) for VideoNeVA Llama2 Chat 13B model pretraining. Columns indicate the number of DGX H100 nodes used.

| VideoNeVA Llama2 Chat 13B        | 1   | 2     | 4     | 8     | 16     |
|----------------------------------|-----|-------|-------|-------|--------|
| Samples per Second               | 53  | 106   | 211   | 424   | 822    |
| Perfect Linear Scaling (Samples) | 37  | 107   | 214   | 428   | 857    |
| Speedup                          | 1x  | 1.99x | 3.94x | 7.93x | 15.36x |

Figure: VideoNeVA Llama2 Chat 13B NeMo Throughput (H100)
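As a minimal sketch of how the scaling figures above can be derived, the snippet below computes speedup and parallel efficiency from the measured samples-per-second values, taking the 1-node result as the baseline. Note that the table's published speedup column appears to use its own (unrounded) baseline throughput, so values computed from the rounded table entries may differ slightly.

```python
def scaling_stats(nodes, throughput):
    """Return (speedup, efficiency) per node count, relative to the first entry.

    Efficiency of 1.0 means perfect linear scaling.
    """
    base = throughput[0]
    return [(t / base, (t / base) / n) for n, t in zip(nodes, throughput)]

# Measured values from the table above (samples/s on 1-16 DGX H100 nodes).
nodes = [1, 2, 4, 8, 16]
samples_per_sec = [53, 106, 211, 424, 822]

for n, (speedup, eff) in zip(nodes, scaling_stats(nodes, samples_per_sec)):
    print(f"{n:>2} nodes: {speedup:5.2f}x speedup, {eff:6.1%} efficiency")
```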