Multi-Node Deep Learning Training with TensorFlow (0.1.0)

NVIDIA AI Enterprise is an end-to-end, cloud-native suite of AI and data analytics software, optimized, certified, and supported by NVIDIA to run on VMware vSphere. The VMware + NVIDIA AI-Ready Enterprise Platform includes vital enabling technologies from NVIDIA for rapid deployment, management, and scaling of AI workloads.

This deployment guide describes how to set up a high-performance multi-node cluster of virtual machines. It introduces GPUDirect RDMA and Address Translation Services (ATS), and uses Docker as the platform for running high-performance multi-node deep learning training. ATS is a VMware PCIe support enhancement in vSphere 7 Update 2. GPUDirect RDMA benefits from ATS and is certified and supported by NVIDIA AI Enterprise.

© Copyright 2024, NVIDIA. Last updated on Apr 2, 2024.