Container Security Development Lifecycle for NVIDIA AI Enterprise on Azure AI Foundry

NVIDIA’s Product Lifecycle Management (PLC) is an internally designed framework that defines the essential security activities for every product, service, feature, and component. Its established software development practices support compliance with industry standards and consist of clearly defined phases: planning, requirements gathering, design, coding, verification, release, and operations. Each phase mandates specific work streams, deliverables, and reviews. The PLC defines a set of security requirements aimed at mitigating risk throughout the lifecycle, and these requirements are regularly updated to address emerging security threats and trends. Manual and automated processes, including threat and vulnerability analysis, security assessments, and continuous security scanning, are integral to the PLC and are consistently monitored to ensure that products and services meet established security objectives before release.

Vulnerability Scanning

NVIDIA uses industry-standard vulnerability scanning methods for container images. All identified packages are scanned against security feeds to detect known vulnerabilities. To reduce the noise inherent in such scanning, the severity score is assigned using the following logic:

  1. For non-operating system packages (e.g., Python packages), GitHub Security Advisory (GHSA) matching and severity determination are applied to a known vulnerability in the package.

  2. For operating system packages (e.g., Debian packages), the severity score is the one determined by the security feed of the distribution on which the container image is based, such as Ubuntu Security Notices.

  3. For any remaining packages in the image that have no match from steps 1 and 2, the NIST National Vulnerability Database (NVD) and its severity scoring are applied.
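
The three-step precedence above can be illustrated with a minimal Python sketch. It is not NVIDIA's scanner: the `Finding` record, its field names, and the `resolve_severity` helper are hypothetical, and the sketch assumes each feed has already been queried so that a missing match is represented as `None`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Finding:
    """Hypothetical record for one known vulnerability in one package."""
    package: str
    is_os_package: bool
    ghsa_severity: Optional[str] = None    # GitHub Security Advisory match, if any
    distro_severity: Optional[str] = None  # distro security feed match, if any
    nvd_severity: Optional[str] = None     # NIST NVD match, if any


def resolve_severity(finding: Finding) -> Tuple[str, Optional[str]]:
    """Return (source, severity) following the precedence described above."""
    # Step 1: non-OS packages take the GHSA severity when a match exists.
    if not finding.is_os_package and finding.ghsa_severity is not None:
        return ("GHSA", finding.ghsa_severity)
    # Step 2: OS packages take the severity from the distro security feed.
    if finding.is_os_package and finding.distro_severity is not None:
        return ("distro", finding.distro_severity)
    # Step 3: anything unmatched so far falls back to NVD.
    if finding.nvd_severity is not None:
        return ("NVD", finding.nvd_severity)
    # Assumption: no feed matched at all, so there is no severity to report.
    return ("unmatched", None)
```

Preferring the ecosystem or distro feed over NVD is what reduces noise: the distro feed reflects the package as actually built and patched in that distribution, whereas NVD scores the upstream vulnerability generically.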

Vulnerability scanning at NVIDIA occurs in three distinct phases of the product lifecycle: iteratively during development, during publishing to the Model Catalog in Azure AI Foundry (images with critical or high vulnerabilities are not published), and continuously after publication as part of ongoing scanning of published containers.

When a vulnerability cannot be addressed, for example because it was recently discovered or no fix is available, a documented exception process is used.
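
As a rough illustration of the publishing rule and the exception process, the sketch below lists the findings that would block a release. The function name, the `(vulnerability_id, severity)` tuple shape, and the placeholder IDs are hypothetical. It also assumes that a documented exception removes a finding from the blocking set, which the text above does not state explicitly.

```python
from typing import Iterable, List, Set, Tuple

# Severities that block publishing, per the rule described above.
BLOCKING_SEVERITIES = {"critical", "high"}


def publish_blockers(
    findings: Iterable[Tuple[str, str]],
    documented_exceptions: Set[str] = frozenset(),
) -> List[str]:
    """Return the sorted IDs of findings that would block publishing.

    findings: (vulnerability_id, severity) pairs from a scan.
    documented_exceptions: IDs covered by a documented exception
        (assumption: such findings no longer block the release).
    """
    return sorted(
        vuln_id
        for vuln_id, severity in findings
        if severity.lower() in BLOCKING_SEVERITIES
        and vuln_id not in documented_exceptions
    )


# Placeholder IDs, not real advisories.
scan = [("VULN-0001", "Critical"), ("VULN-0002", "low"), ("VULN-0003", "HIGH")]
print(publish_blockers(scan))                 # publishing blocked while non-empty
print(publish_blockers(scan, {"VULN-0003"}))  # one finding covered by an exception
```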

In addition, application development for the container follows leading security practices as part of our PLC to further reduce the risk of vulnerable code.

Vulnerability Patching

At NVIDIA, we prioritize patching vulnerabilities rated critical or high severity, based on the Common Vulnerability Scoring System (CVSS) and our scanning logic, before any release.
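
For reference, the qualitative severity ratings in the CVSS v3.x specification map base scores to labels as sketched below. The helper name is hypothetical, and the use of CVSS v3.x is an assumption, since the text does not say which CVSS version applies. The effective severity of a finding may also differ once the scanning logic described earlier is applied.

```python
def cvss_v3_rating(score: float) -> str:
    """Map a CVSS v3.x base score (0.0 to 10.0) to its qualitative rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:    # 0.1 to 3.9
        return "Low"
    if score < 7.0:    # 4.0 to 6.9
        return "Medium"
    if score < 9.0:    # 7.0 to 8.9
        return "High"
    return "Critical"  # 9.0 to 10.0


def is_release_priority(score: float) -> bool:
    """True for the critical and high ratings prioritized before release."""
    return cvss_v3_rating(score) in {"High", "Critical"}
```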

NVIDIA and Microsoft collaborate closely to patch CVEs flagged within the NIM containers at a regular cadence. NVIDIA’s patching strategy adapts to the branch type: feature branches receive roll-forward fixes, and production branches receive 30-day scheduled updates while maintaining API stability.

For certain OSS packages, NVIDIA partners with Canonical to provide vulnerability patching support for enterprise containers.

NVIDIA strongly recommends updating to the latest version of NIM in Azure AI Foundry, which includes the most recent security updates and critical bug fixes.

Container Build Lifecycle Management

Container build lifecycle management is the process of defining, deploying, and maintaining the settings and parameters of containers. NVIDIA leverages Open Container Initiative (OCI) containers for a wide range of software products that we use internally and deliver externally.

At NVIDIA we perform secure build lifecycle management by:

  • Applying the principle of least privilege, including running as a non-root user and limiting file system permissions within containers

  • Periodically updating container images with the latest patches, scanning for vulnerabilities, and using trusted sources for image repositories
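
One way to check the non-root practice is to inspect the `User` field of an OCI image configuration, which the OCI image specification defines as the user (optionally followed by `:group`) that the container process runs as. The sketch below is an illustration only, not NVIDIA's tooling. The function name is hypothetical, and it assumes the image configuration JSON has already been loaded into a dictionary and that an empty `User` means the runtime defaults to root.

```python
def runs_as_non_root(image_config: dict) -> bool:
    """Return True if the OCI image config declares a non-root user.

    Assumption: an empty or missing "User" means the runtime default,
    which is treated here as root.
    """
    user_field = (image_config.get("config") or {}).get("User", "")
    # "User" may be "user", "uid", "user:group", or "uid:gid"; keep the user part.
    user = user_field.split(":", 1)[0].strip()
    return user not in ("", "0", "root")


# Hypothetical configs for illustration.
print(runs_as_non_root({"config": {"User": "1000:1000"}}))  # non-root UID
print(runs_as_non_root({"config": {"User": "root"}}))       # explicit root
print(runs_as_non_root({"config": {}}))                     # unset, treated as root
```

A check like this only covers the declared default user. A runtime can still override it, so it complements runtime policy rather than replacing it.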

We apply different build and hardening requirements to different categories of OCI containers to balance security risk against the value of their intended use case.

Coordinated Vulnerability Disclosure

The NVIDIA Product Security Incident Response Team (PSIRT) manages the receipt, investigation, internal coordination, remediation, and disclosure of potential security vulnerabilities related to NVIDIA products. The team’s goal is to minimize customers’ risk associated with security vulnerabilities by providing timely notification, guidance, and remediation of vulnerabilities in our products and services. PSIRT follows ISO/IEC 29147 (vulnerability disclosure) and ISO/IEC 30111 (vulnerability handling processes).

Security Incident Monitoring

Customers should contact NVIDIA Enterprise Support for proper routing of security-related issues in NVIDIA products.