Inference Server Release 18.04 Beta

The NVIDIA container image of the Inference Server, release 18.04, is available as a beta release.

Contents of the Inference Server

This container image contains the Inference Server executable in /opt/inference_server.

The container also includes the following:
  • Ubuntu 16.04 with March 2018 updates
  • CUDA 9
  • NCCL 2.1.15
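
Because this is a beta release, invocation details are subject to change. The following is a minimal sketch of launching the server from inside the container; the exact binary name within /opt/inference_server, the --model-store flag, and the crude readiness wait are illustrative assumptions, not documented behavior.

    import subprocess
    import time

    # /opt/inference_server is the documented install location; the binary
    # name and flag below are assumptions for illustration only.
    SERVER_BINARY = "/opt/inference_server/inference_server"  # assumed name
    MODEL_STORE = "/models"  # assumed: a directory of models mounted into the container

    def launch_server():
        """Start the Inference Server and verify it did not exit immediately."""
        proc = subprocess.Popen(
            [SERVER_BINARY, f"--model-store={MODEL_STORE}"]  # assumed flag name
        )
        time.sleep(5)  # crude readiness wait; a real check would poll the server
        if proc.poll() is not None:
            raise RuntimeError(f"server exited early with code {proc.returncode}")
        return proc

    if __name__ == "__main__":
        server = launch_server()
        print("Inference Server started, pid", server.pid)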

Driver Requirements

Release 18.04 is based on CUDA 9, which requires NVIDIA Driver release 384.xx.

Key Features and Enhancements

This Inference Server release includes the following key features and enhancements.
  • This is the beta release of the Inference Server container.
  • The Inference Server container image version 18.04 is based on NVIDIA Inference Server 0.0.1 beta.
  • Multiple model support. The Inference Server can manage any number and mix of models, limited only by system disk and memory resources. It supports both TensorRT and TensorFlow GraphDef model formats (see the sketch after this list).
  • Multi-GPU support. The server can distribute inferencing across all system GPUs.
  • Multi-tenancy support. Multiple models (or multiple instances of the same model) can run simultaneously on the same GPU.
  • Batching support.
  • Latest version of NCCL 2.1.15
  • Ubuntu 16.04 with March 2018 updates
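
As a rough illustration of multiple-model support, the sketch below walks a hypothetical model store that mixes the two supported formats. The directory layout, mount point, and file extensions are assumptions chosen for illustration; they are not a documented model store format.

    from pathlib import Path

    # Hypothetical model store layout (assumed for illustration):
    #   /models/
    #     resnet50_trt/model.plan        <- TensorRT engine
    #     inception_tf/model.graphdef    <- TensorFlow GraphDef
    MODEL_STORE = Path("/models")  # assumed mount point

    # Assumed mapping from file extension to framework; the real server
    # determines the format from its own configuration.
    FORMATS = {".plan": "TensorRT", ".graphdef": "TensorFlow GraphDef"}

    def list_models(store: Path) -> None:
        """Report each model directory and the framework its file suggests."""
        if not store.is_dir():
            print(f"model store {store} not found")
            return
        for model_dir in sorted(p for p in store.iterdir() if p.is_dir()):
            for f in model_dir.iterdir():
                fmt = FORMATS.get(f.suffix)
                if fmt:
                    print(f"{model_dir.name}: {f.name} ({fmt})")

    if __name__ == "__main__":
        list_models(MODEL_STORE)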

Known Issues

This is a beta release of the Inference Server. All features are expected to be available; however, some aspects of functionality and performance will likely be limited compared to a non-beta release.