SGLang Release Notes
These release notes describe the key features, software enhancements, improvements, and known issues for this release of SGLang. SGLang is a high-performance runtime system and programming language designed for Large Language Models (LLMs). The framework enables developers to write complex, structured generation programs with simple Python syntax and seamlessly integrates with a wide array of models from hubs like Hugging Face. Through core innovations like RadixAttention and a dedicated LLM compiler, SGLang is designed to be expressive and exceptionally efficient for demanding, multi-step generation tasks. Common use cases include developing complex agents, implementing chain-of-thought reasoning, and creating sophisticated few-shot prompting strategies.
The SGLang container is released monthly to provide you with the latest NVIDIA deep learning software libraries and GitHub code contributions that have been sent upstream. The libraries and contributions have all been tested, tuned, and optimized.
Security Common Vulnerabilities and Exposures (CVEs)
Please review the Security Scanning tab on NGC to see the latest security scan results. For certain open-source vulnerabilities listed in the scan results, NVIDIA provides a response in the form of a Vulnerability Exploitability eXchange (VEX) document. The VEX information can be reviewed and downloaded from the Security Scanning tab.
For a complete view of the supported software and specific versions that are packaged with the frameworks based on the container image, see the Frameworks Support Matrix.