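Introduction to NVIDIA GPU Cloud

NVIDIA GPU Cloud (NGC) is a GPU-accelerated cloud platform optimized for deep learning and scientific computing. In the current release, NGC comprises NGC containers, the NGC container registry, the NGC website, and the platform software used to run the deep learning containers. This document gives a basic introduction to NVIDIA GPU Cloud and explains how to use it.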

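NGC Containers

NGC containers are designed to provide a software platform with minimal operating system requirements: only Docker and the driver are installed on the server or workstation, and all application and SDK software is delivered through NGC containers from the NGC container registry.

NGC maintains a catalog of fully integrated and optimized deep learning framework containers for both single-GPU and multi-GPU configurations. These containers include the CUDA Toolkit, the DIGITS workflow, and the following deep learning frameworks: NVCaffe, Caffe2, Microsoft Cognitive Toolkit (CNTK), MXNet, PyTorch, TensorFlow, Theano, and Torch. The framework containers are delivered ready to run and include all required dependencies, such as the CUDA runtime, NVIDIA libraries, and an operating system environment.

Each framework container image also includes the framework source code, so that users can make custom modifications and enhancements, along with the complete software development stack.

NVIDIA updates these deep learning containers monthly to ensure that they deliver the best performance.

In addition to the deep learning framework containers, NGC provides a set of HPC visualization application containers built on industry-leading visualization tools, including ParaView with NVIDIA IndeX volume rendering, the NVIDIA OptiX ray-tracing library, and NVIDIA Holodeck, for high-quality, interactive, real-time visuals. These containers are currently in open beta.

NGC also provides popular third-party, GPU-ready HPC application containers that conform to NGC standards and best practices, so that users can get up and running in the shortest possible time.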

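NGC Container Registry

The NGC container registry uses nvcr.io to manage the storage and distribution of container images. With an NGC API key, users can download NGC containers from the registry and run them.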
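As an illustration of how the API key is used with the registry, the following minimal sketch builds the argument list for a Docker login to nvcr.io. It assumes the Docker CLI is installed and that the registry accepts the literal user name `$oauthtoken` with the API key as the password; the helper name `build_login_command` is hypothetical and not part of any NGC tool.

```python
from typing import List


def build_login_command(registry: str = "nvcr.io") -> List[str]:
    """Return the argv for logging Docker in to the given registry.

    The API key is not placed on the command line; with --password-stdin
    Docker reads it from standard input instead.
    """
    # Assumption: the user name is the literal string "$oauthtoken" and
    # the NGC API key serves as the password.
    return [
        "docker", "login",
        "--username", "$oauthtoken",
        "--password-stdin",
        registry,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_login_command()))
```

To actually log in, the key could be supplied on standard input, for example with `subprocess.run(build_login_command(), input=api_key.encode(), check=True)`, where `api_key` is a hypothetical variable holding the key. Passing the list directly, without a shell, keeps `$oauthtoken` from being expanded as a shell variable.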

NGC Website

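The NGC website (https://ngc.nvidia.com) is the portal for managing NGC. Use it to browse the NGC container registry, create an API key that is unique to each user, and see which cloud service providers offer virtual machine instances optimized for NGC containers.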

Optimized Accelerated Computing Environments

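All NGC containers are fully tested to take full advantage of NVIDIA GPUs and can run directly on supported accelerated computing environments (ACEs), such as NVIDIA DGX systems and NVIDIA GPU instances from verified cloud service providers.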

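Running Deep Learning Frameworks with NGC

The process of running a deep learning framework container is summarized below.

Prepare the accelerated computing environment

Prepare the accelerated computing environment (ACE) on which the NGC containers will run. For instructions on setting up an ACE, see the following documentation: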

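Obtain NGC credentials and select an NGC container

Sign up for an NGC account at https://ngc.nvidia.com, log in, and create your own NGC API key. This key authenticates you when you download NGC containers from the NGC container registry.

Browse the registry section of the NGC website and decide which container and tag to use.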
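Once a container and tag are chosen, the two are usually combined with the registry host into a single image reference. The sketch below shows one way to compose and validate such a reference. It assumes the common Docker form `registry/repository:tag`; the names `ImageRef` and `parse_image_ref` and the placeholder tag `xx.xx` are hypothetical, and the actual repository and tag should be taken from the NGC website.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    """A container image reference split into its three parts."""
    registry: str
    repository: str
    tag: str

    def __str__(self) -> str:
        return f"{self.registry}/{self.repository}:{self.tag}"


def parse_image_ref(text: str) -> ImageRef:
    """Split 'registry/repository:tag' and reject incomplete references."""
    name, sep, tag = text.rpartition(":")
    # Require an explicit tag; a "/" after the last ":" means the colon
    # belonged to something else (for example a registry port).
    if not sep or not tag or "/" in tag:
        raise ValueError(f"missing tag in image reference: {text!r}")
    registry, sep, repository = name.partition("/")
    if not sep or not registry or not repository:
        raise ValueError(f"missing registry or repository: {text!r}")
    return ImageRef(registry, repository, tag)


if __name__ == "__main__":
    # "xx.xx" is a placeholder, not a real tag.
    print(parse_image_ref("nvcr.io/nvidia/tensorflow:xx.xx"))
```

Requiring an explicit tag is a design choice in this sketch: it avoids silently falling back to a default tag that may not be the one selected on the website.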

For more details, see Getting Started with NGC.

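Run the container

Connect to the ACE, log in to nvcr.io, and enter the command to run the container that you identified in the registry section of the NGC website.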
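The pull and run steps can be sketched as follows, assuming the Docker CLI is available on the ACE and you are already logged in to nvcr.io. The helper `build_run_commands` is hypothetical, and the GPU flag is an assumption that depends on the installation: recent Docker versions accept `--gpus=all`, while older setups may need `--runtime=nvidia` or the `nvidia-docker` wrapper instead.

```python
from typing import List


def build_run_commands(image: str, gpu_flag: str = "--gpus=all") -> List[List[str]]:
    """Return the argv lists for pulling an image and starting it interactively."""
    if not image:
        raise ValueError("image reference must not be empty")
    pull = ["docker", "pull", image]
    # --rm removes the container on exit; -it attaches an interactive terminal.
    run = ["docker", "run", "--rm", "-it"]
    if gpu_flag:
        # Assumption: this flag exposes the GPUs; adjust it for your setup.
        run.append(gpu_flag)
    run.append(image)
    return [pull, run]


if __name__ == "__main__":
    # "xx.xx" is a placeholder tag; use the tag chosen on the NGC website.
    for argv in build_run_commands("nvcr.io/nvidia/tensorflow:xx.xx"):
        print(" ".join(argv))
```

The sketch only prints the commands. Each list could be passed to `subprocess.run(argv, check=True)` to execute it on the ACE.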

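For more information about running deep learning containers, see the NVIDIA Deep Learning Frameworks Docker Containers User Guide.

For more information about running other NGC containers, see the NGC Container User Guide.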

Notices

Notice

THE INFORMATION IN THIS GUIDE AND ALL OTHER INFORMATION CONTAINED IN NVIDIA DOCUMENTATION REFERENCED IN THIS GUIDE IS PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE INFORMATION FOR THE PRODUCT, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the product described in this guide shall be limited in accordance with the NVIDIA terms and conditions of sale for the product.

THE NVIDIA PRODUCT DESCRIBED IN THIS GUIDE IS NOT FAULT TOLERANT AND IS NOT DESIGNED, MANUFACTURED OR INTENDED FOR USE IN CONNECTION WITH THE DESIGN, CONSTRUCTION, MAINTENANCE, AND/OR OPERATION OF ANY SYSTEM WHERE THE USE OR A FAILURE OF SUCH SYSTEM COULD RESULT IN A SITUATION THAT THREATENS THE SAFETY OF HUMAN LIFE OR SEVERE PHYSICAL HARM OR PROPERTY DAMAGE (INCLUDING, FOR EXAMPLE, USE IN CONNECTION WITH ANY NUCLEAR, AVIONICS, LIFE SUPPORT OR OTHER LIFE CRITICAL APPLICATION). NVIDIA EXPRESSLY DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY OF FITNESS FOR SUCH HIGH RISK USES. NVIDIA SHALL NOT BE LIABLE TO CUSTOMER OR ANY THIRD PARTY, IN WHOLE OR IN PART, FOR ANY CLAIMS OR DAMAGES ARISING FROM SUCH HIGH RISK USES.

NVIDIA makes no representation or warranty that the product described in this guide will be suitable for any specified use without further testing or modification. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to ensure the product is suitable and fit for the application planned by customer and to do the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this guide. NVIDIA does not accept any liability related to any default, damage, costs or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this guide, or (ii) customer product designs.

Other than the right for customer to use the information in this guide with the product, no other license, either expressed or implied, is hereby granted by NVIDIA under this guide. Reproduction of information in this guide is permissible only if reproduction is approved by NVIDIA in writing, is reproduced without alteration, and is accompanied by all associated conditions, limitations, and notices.

Trademarks

NVIDIA, the NVIDIA logo, and Volta are trademarks and/or registered trademarks of NVIDIA Corporation in the United States and other countries.

Docker and the Docker logo are trademarks or registered trademarks of Docker, Inc. in the United States and/or other countries.

Other company and product names may be trademarks of the respective companies with which they are associated.