Optimal Topology#
VSS Deployment Topologies#
VSS supports several deployment topologies, each optimized for particular GPU types and performance requirements. Which topology to choose depends on your hardware configuration.
Default Topology#
The default topology dedicates 4 GPUs to the LLM NIM, 2 GPUs to the VSS ingestion and retrieval pipeline, and 1 GPU each to the NeMo embedding and reranking NIMs. It is designed for systems where a single GPU cannot handle multiple NIMs, for example systems with L40S GPUs.
For details on the default topology configuration, see Default Deployment Topology and Models in Use.
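The GPU split in the default topology adds up to 8 GPUs (4 + 2 + 1 + 1). The following is a minimal Python sketch of that allocation, not the actual VSS deployment configuration. The service names, the `assign_gpu_indices` helper, and the contiguous index ordering are hypothetical and chosen only for illustration; only the per-service GPU counts come from the description above. `CUDA_VISIBLE_DEVICES` is the standard CUDA environment variable for restricting a process to specific GPUs, but whether VSS uses it for pinning is an assumption here.

```python
# Illustrative only: GPU counts come from the default topology description;
# the service names and contiguous index assignment are assumptions.
DEFAULT_TOPOLOGY = {
    "llm_nim": 4,                  # LLM NIM
    "vss_ingestion_retrieval": 2,  # VSS ingestion and retrieval pipeline
    "embedding_nim": 1,            # NeMo embedding NIM
    "reranking_nim": 1,            # NeMo reranking NIM
}


def assign_gpu_indices(topology):
    """Give each service a contiguous block of GPU indices, in dict order."""
    assignment = {}
    next_index = 0
    for service, count in topology.items():
        assignment[service] = list(range(next_index, next_index + count))
        next_index += count
    return assignment


def cuda_visible_devices(indices):
    """Format GPU indices as a CUDA_VISIBLE_DEVICES value, e.g. '0,1,2,3'."""
    return ",".join(str(i) for i in indices)


assignment = assign_gpu_indices(DEFAULT_TOPOLOGY)
total_gpus = sum(DEFAULT_TOPOLOGY.values())

for service, indices in assignment.items():
    print(f"{service}: CUDA_VISIBLE_DEVICES={cuda_visible_devices(indices)}")
print(f"total GPUs required: {total_gpus}")
```

A sketch like this can serve as a quick sanity check that a host has enough GPUs before deploying; the authoritative GPU assignments are those in the default topology configuration referenced above.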