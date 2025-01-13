For trtllm_buildable profiles, the memory requirements can approach the amount of memory used by the GPUs.

x86 processor with at least 8 cores (modern processor recommended)

NVIDIA NIMs for large language models should, but are not guaranteed to, run on any NVIDIA GPU with sufficient memory, or on multiple homogeneous NVIDIA GPUs with sufficient aggregate memory, provided the CUDA compute capability is greater than 7.0 (8.0 for bfloat16).
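The conditions above can be sketched as a small check. This is only an illustration, not part of NIM: the `Gpu` class and `meets_requirements` function are hypothetical names, "homogeneous" is approximated as "all GPUs share the same model name", and the bfloat16 threshold is read as "8.0 or higher", which is an assumption because the text does not say whether 8.0 itself qualifies. Passing this check does not guarantee that a NIM will run.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gpu:
    """Hypothetical description of one GPU (not a NIM or CUDA type)."""
    name: str
    memory_gb: float
    compute_capability: Tuple[int, int]  # (major, minor), e.g. (8, 0)


def meets_requirements(gpus: List[Gpu], required_memory_gb: float,
                       bfloat16: bool = False) -> bool:
    """Rough check of the GPU guidance: homogeneous GPUs, enough aggregate
    memory, and a high enough CUDA compute capability."""
    if not gpus:
        return False
    # Assumption: identical model names stand in for "homogeneous".
    homogeneous = len({g.name for g in gpus}) == 1
    total_memory = sum(g.memory_gb for g in gpus)
    lowest_cc = min(g.compute_capability for g in gpus)
    if bfloat16:
        # Assumption: 8.0 itself is accepted for bfloat16.
        capable = lowest_cc >= (8, 0)
    else:
        # Literal reading of the text: strictly greater than 7.0.
        capable = lowest_cc > (7, 0)
    return homogeneous and total_memory >= required_memory_gb and capable


if __name__ == "__main__":
    pair = [Gpu("A100 80GB", 80, (8, 0)), Gpu("A100 80GB", 80, (8, 0))]
    print(meets_requirements(pair, required_memory_gb=131, bfloat16=True))
```

Compute capabilities are compared as `(major, minor)` tuples rather than floats so that the comparison stays exact.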

You can approximate the amount of required memory using the following guidelines (these do not apply to trtllm_buildable profiles):

5–10 GB for OS and other processes

16 GB for Docker (16 GB of shared memory is required by Docker in multi-GPU, non-NVLink cases)

Number of model parameters (in billions) * 2 GB of memory, for example:

Llama 8B: ~15 GB

Llama 70B: ~131 GB

Mistral 7B Instruct v0.3: ~14 GB

Mixtral 8x7B Instruct v0.1: ~88 GB



These recommendations are a rough guideline; the actual memory required can be lower or higher depending on hardware and NIM configuration.
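As a rough illustration, the guideline can be added up in a few lines of Python. This is a minimal sketch, not an official sizing tool: the function name `estimate_memory_gb` and the constants are hypothetical, the three items are assumed to be summed, and the parameter count is assumed to be given in billions. The plain product of parameters and 2 GB comes out slightly higher than the per-model figures listed above (for example 16 GB rather than ~15 GB for an 8B model), so treat the result as an approximation only.

```python
from typing import Tuple

# Values taken from the guideline; the names are illustrative.
OS_OVERHEAD_GB = (5.0, 10.0)    # OS and other processes (low, high)
DOCKER_GB = 16.0                # Docker
GB_PER_BILLION_PARAMS = 2.0     # model parameters (in billions) * 2 GB


def estimate_memory_gb(params_billions: float) -> Tuple[float, float]:
    """Return a (low, high) rough memory estimate in GB.

    Does not apply to trtllm_buildable profiles.
    """
    weights_gb = params_billions * GB_PER_BILLION_PARAMS
    low = OS_OVERHEAD_GB[0] + DOCKER_GB + weights_gb
    high = OS_OVERHEAD_GB[1] + DOCKER_GB + weights_gb
    return low, high


if __name__ == "__main__":
    for label, billions in [("8B model", 8), ("70B model", 70)]:
        low, high = estimate_memory_gb(billions)
        print(f"{label}: roughly {low:.0f}-{high:.0f} GB")
```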

Some model/GPU combinations, including vGPU, are optimized. See the following Supported Models section for further information.