SGLang Release 26.06
This SGLang container release is intended for use on NVIDIA® Hopper Architecture GPUs (NVIDIA H100) and NVIDIA® Ampere Architecture GPUs (NVIDIA A100), together with the associated NVIDIA CUDA® 13 and NVIDIA cuDNN 9 libraries. The NVIDIA container image for the SGLang release is available on NGC.
Contents of the SGLang container
This container image contains the complete source of the included SGLang version in /opt/sglang. SGLang is pre-built and installed in the default Python environment (/usr/local/lib/python3.12/dist-packages/sglang/) in the container image. Visit SGLang Docs to learn more about SGLang.
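To confirm where a package is installed, a minimal sketch like the following can be run with the container's default Python interpreter. The helper name `find_package_dir` is hypothetical, and the script assumes only the standard library; inside the container it would be expected to report the dist-packages path given above, though that is not verified here.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_package_dir(name: str) -> Optional[Path]:
    """Return the directory a top-level package is imported from, or None if absent."""
    spec = importlib.util.find_spec(name)
    # Namespace packages have no origin, so treat them as "not found" here.
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


if __name__ == "__main__":
    location = find_package_dir("sglang")
    print(location if location else "sglang is not importable in this environment")
```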
The NVIDIA SGLang Container is optimized for use with NVIDIA GPUs, and contains the following software for GPU acceleration.
- See the CUDA section for the list of libraries inherited from the CUDA container.
- NVIDIA CUDA 13.3.0
- SGLang 0.5.12.post1
- flashinfer 0.6.12
- transformers 5.6.0
- flash-attention 2.7.4.post1
- xgrammar 0.2.0
- Torch 2.13.0a0+8145d630e8
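The versions listed above can be compared with what is actually installed. The sketch below uses `importlib.metadata` from the standard library. The distribution names in `EXPECTED` are assumptions (they are the usual PyPI names, but a container build may register them differently), and flashinfer and flash-attention are left out because their distribution names are less certain. The comparison is a loose prefix match, since a build may append its own suffix to the version string.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions copied from the list above; the distribution names are assumptions.
EXPECTED: Dict[str, str] = {
    "sglang": "0.5.12.post1",
    "transformers": "5.6.0",
    "xgrammar": "0.2.0",
    "torch": "2.13.0a0+8145d630e8",
}


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str], bool]]:
    """Map each distribution to (expected, installed, matches)."""
    report = {}
    for dist, want in expected.items():
        have = installed_version(dist)
        # Prefix match: tolerates a build-specific suffix after the listed version.
        report[dist] = (want, have, have is not None and have.startswith(want))
    return report


if __name__ == "__main__":
    for dist, (want, have, ok) in compare_versions(EXPECTED).items():
        print(f"{dist}: expected {want}, installed {have}, {'OK' if ok else 'MISMATCH'}")
```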
Driver Requirements
Release 26.06 is based on CUDA 13.3.0. For comprehensive and up-to-date driver compatibility information, please refer to the following documentation:
- NVIDIA CUDA Compatibility Guide - Compatibility information between CUDA versions and driver releases
- CUDA Toolkit Release Notes - Driver version requirements and compatibility matrices
- NVIDIA Drivers Download - Latest NVIDIA drivers.
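Once the minimum driver version has been looked up in the documentation above, the installed driver can be checked against it. The following is a sketch only: the helper names are hypothetical, the minimum version must be supplied by the reader (these notes do not state one), and it assumes `nvidia-smi` is on the PATH.

```python
import subprocess
from typing import List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '123.45.06' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two dotted driver versions component by component."""
    return parse_version(installed) >= parse_version(minimum)


def query_driver_versions() -> List[str]:
    """Ask nvidia-smi for the driver version; it prints one line per GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

A typical use would be `all(driver_meets_minimum(v, minimum) for v in query_driver_versions())`, where `minimum` is the value taken from the CUDA Toolkit Release Notes for the CUDA version this release is based on.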
Key Features and Enhancements
This SGLang release includes the following key features and enhancements.
- Support for multi-node configurations.
- GB300/B300 support.
- RTX PRO™ 6000 Blackwell Server Edition support.
- DGX Spark support.
- Jetson Thor support.
- Support for 8-bit floating point (FP8) precision on Hopper GPUs and above.
- Support for the NVIDIA 4-bit floating point (NVFP4) format on Blackwell GPUs (including Jetson Thor and DGX Spark), which provides better training and inference performance with lower memory utilization. NVFP4 is supported for DeepSeek-R1 and Llama-3.1-8B-Instruct.
- Support for openai/gpt-oss-20b and openai/gpt-oss-120b.
- Support for Nemotron-3 Nano Omni.
- Support for Qwen3.6-35B-A3B-FP8.
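The precision support described in the list above (FP8 on Hopper and above, NVFP4 on Blackwell) can be expressed as a check on the GPU's CUDA compute capability. This is a sketch, not how SGLang itself selects precisions. The thresholds are assumptions: Hopper is taken to report compute capability 9.0, and Blackwell-family devices are taken to report 10.0 or higher. The function name is hypothetical.

```python
from typing import List, Tuple

HOPPER = (9, 0)      # assumption: Hopper (H100) reports compute capability 9.0
BLACKWELL = (10, 0)  # assumption: Blackwell-family devices report 10.0 or higher


def supported_low_precision(capability: Tuple[int, int]) -> List[str]:
    """List the low-precision formats these notes describe for a (major, minor) capability."""
    formats = []
    if capability >= HOPPER:
        formats.append("FP8")
    if capability >= BLACKWELL:
        formats.append("NVFP4")
    return formats
```

With PyTorch available, the capability of the current device can be obtained from `torch.cuda.get_device_capability()`, which returns a `(major, minor)` tuple that can be passed directly to this function.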
Announcements
- None.
Known Issues
- MTP is not supported for NVIDIA-Nemotron-3-Super models.
- There is a known issue with Phi 4 Multimodal Instruct FP8.
- The 26.06 SGLang container release includes no known vulnerabilities (CVEs).