Security Development Lifecycle for NVIDIA AI Enterprise#

NVIDIA’s Product Lifecycle Management (PLC) is an internally designed framework encompassing essential security activities for every product, service, feature, and component. These established software development practices ensure compliance with industry standards and consist of clearly defined phases: planning, requirement gathering, design, coding, verification, release, and operations. Each phase mandates specific work streams, deliverables, and reviews. The PLC outlines a set of security requirements aimed at mitigating risks throughout the lifecycle. These requirements are regularly updated to address emerging security threats and trends. Both manual and automated processes, including threat and vulnerability analysis, security assessments, and continuous security scanning, are integral to the PLC. These processes are consistently monitored to ensure that products and services meet established security objectives before release.

Container Build Lifecycle Management#

Container build lifecycle management is the process of defining, deploying, and maintaining the settings and parameters of containers. NVIDIA uses Open Container Initiative (OCI) containers for a wide range of software products, both those we use internally and those we deliver externally. We apply different sets of build and hardening requirements to these two categories of OCI containers to balance security risk against the value of the intended use case. In addition, application development inside the container follows the leading security practices in our PLC, further reducing the risk of vulnerable code.

Secure Design#

NVIDIA has developed an extensive secure software development process and applies many of its principles across our products. These principles include:

  • Threat analysis, to determine where risks are found and how likely they are to be exploited. This is based on factors such as attacker motivations, attack paths, and trust boundaries of the code.

  • Code scanning of various types before the software runs: checking for known vulnerabilities, detecting secrets such as API keys that might have been inadvertently hard-coded, and running static code analysis to identify issues such as syntax errors and coding standard violations.

  • Testing of the code while it is running, including penetration and fuzz testing, API security tests, and dynamic application security tests.

Together, these practices help ensure that our software is designed to resist attacks.
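
To make the secret-scanning step above concrete, here is a minimal sketch of a line-based scan for hard-coded credentials. The rule names, the regular expressions, and the `scan_text` function are illustrative assumptions, not NVIDIA's tooling; production scanners use much larger, tuned rule sets.

```python
import re

# Illustrative patterns only (assumptions); real scanners ship far larger rule sets.
SECRET_PATTERNS = {
    # A key-like name assigned a quoted value of 16 or more token characters.
    "generic_api_key": re.compile(
        r"""(?i)\b(api[_-]?key|secret|token)\b\s*[:=]\s*["']([A-Za-z0-9_\-]{16,})["']"""
    ),
    # The first line of a PEM-encoded private key.
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text):
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings
```

A scan like this runs before the software does, so a match can block a commit or a build rather than being discovered after release.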

Security Hardening#

Hardening is the process of applying security best practices when building and assembling software into delivery containers. It encompasses a few areas:

  • Applying the principle of least privilege, including the use of non-root users, limiting file system permissions within containers, and limiting the use of network ports.

  • Reducing the attack surface by eliminating all unused packages and files, cleaning up artifacts of the build process, and disabling any unused services.

  • Protecting against the execution of arbitrary code with measures such as sanitizing input, preventing download and execution of remote code, and defining controllable interfaces.
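
As an illustration of the least-privilege items above, the sketch below audits a parsed OCI image configuration for a root default user and for unexpected exposed ports. The `config`, `User`, and `ExposedPorts` fields come from the OCI image specification; the function name, the allowlist parameter, and the wording of the findings are assumptions made for this example.

```python
def audit_image_config(image_config, allowed_ports=frozenset()):
    """Return a list of hardening findings for an OCI image configuration dict."""
    findings = []
    runtime = image_config.get("config") or {}

    # User may be "name", "uid", or "uid:gid"; empty, "root", or "0" means root.
    user = (runtime.get("User") or "").split(":")[0]
    if user in ("", "root", "0"):
        findings.append("container runs as root")

    # ExposedPorts keys look like "8080/tcp"; flag anything not on the allowlist.
    for port in sorted(runtime.get("ExposedPorts") or {}):
        if port not in allowed_ports:
            findings.append(f"unexpected exposed port {port}")

    return findings
```

An empty result means the image passed these two checks only; it says nothing about file permissions, unused packages, or the other hardening steps listed above.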

NVIDIA architects containers from a trusted base container and audits the contents of the container as it’s built to support the model. This includes providing open-source acknowledgments about the container.

Aspects of this process are applied to various components across our software stack to provide comprehensive defense against system-level attacks.

NVIDIA has implemented an automated check to prevent software artifacts from being published unless they pass a rigorous security readiness check. This process is being rolled out as a standard across all NVIDIA AI Enterprise software.
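
The idea of a publication gate can be sketched as a function that fails closed: an artifact is publishable only when every required check has explicitly passed. The check names below are hypothetical placeholders, since this document does not list the checks in NVIDIA's readiness process.

```python
# Hypothetical check names (assumptions); the actual readiness checks are not listed here.
REQUIRED_CHECKS = ("vulnerability_scan", "secret_scan", "sbom_review", "signature")


def release_blockers(results, required=REQUIRED_CHECKS):
    """Return the names of required checks that did not explicitly pass."""
    # A missing or non-True result counts as a failure, so the gate fails closed.
    return [name for name in required if results.get(name) is not True]


def may_publish(results, required=REQUIRED_CHECKS):
    """Return True only when there are no release blockers."""
    return not release_blockers(results, required)
```

Treating a missing result as a failure means that a skipped or crashed scan cannot silently let an artifact through.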

Software Bill of Materials (SBOM)#

A Software Bill of Materials (SBOM) is a list of the components and dependencies that make up a software product. An SBOM helps software developers, vendors, and users identify and manage the security and licensing risks associated with software components. NVIDIA AI Enterprise containers undergo an SBOM review as part of the release process to help maintain an accurate list of components included in the build.

SBOMs for enterprise containers are available as JSON files for download through the NGC Catalog.
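
Once downloaded, an SBOM can be inspected programmatically. The sketch below assumes the file uses the CycloneDX JSON layout (a top-level `components` array with `name`, `version`, and `licenses` entries); this document does not state which SBOM format NGC uses, so adjust the field names if the files follow a different schema such as SPDX. The function name is hypothetical.

```python
import json


def summarize_sbom(sbom_json):
    """Return sorted (name, version, licenses) tuples from a CycloneDX JSON string."""
    sbom = json.loads(sbom_json)
    rows = []
    for component in sbom.get("components", []):
        licenses = []
        for entry in component.get("licenses", []):
            # CycloneDX allows a license object (id or name) or an SPDX expression.
            license_obj = entry.get("license", {})
            label = license_obj.get("id") or license_obj.get("name") or entry.get("expression")
            if label:
                licenses.append(label)
        rows.append((component.get("name", "?"), component.get("version", "?"), tuple(licenses)))
    return sorted(rows)
```

A summary like this is a convenient starting point for matching component versions against vulnerability advisories or for reviewing license obligations.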

_images/ai-enterprise-security-02.png

Figure 2 Software Bill of Materials for Enterprise Containers on NGC Catalog#

Container Signing#

Signing container images adds a digital signature to the image. This is a critical security measure that ensures the integrity and authenticity of the containers. It shows that the container originates from a trusted source and has not been tampered with. For customers, verifying these signatures before deployment is essential to prevent the introduction of malicious or compromised code into their environments. This process helps maintain trust in the supply chain and mitigates risks associated with deploying untrusted software.

_images/ai-enterprise-security-03.png

Figure 3 Public key signature for Enterprise Containers on NGC Catalog#

All NVIDIA container images published on the NGC Catalog are signed by NVIDIA. The signature can be verified with standard open-source tooling and NVIDIA’s public key.
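
As one possible way to automate verification before deployment, the sketch below wraps the open-source `cosign` CLI, whose `verify --key <public-key> <image>` form checks a signature against a public key. The choice of `cosign` is an assumption, since this document does not name a specific tool, and the function names and any image reference or key path passed in are hypothetical.

```python
import subprocess


def cosign_verify_command(image_ref, public_key_path):
    """Build the argv for `cosign verify --key <public-key> <image>`."""
    return ["cosign", "verify", "--key", public_key_path, image_ref]


def verify_image(image_ref, public_key_path):
    """Return True only if cosign exits successfully for this image.

    Raises FileNotFoundError if the cosign binary is not installed.
    """
    result = subprocess.run(
        cosign_verify_command(image_ref, public_key_path),
        capture_output=True,
        text=True,
        check=False,  # inspect the exit code ourselves instead of raising
    )
    return result.returncode == 0
```

A deployment script might call `verify_image(...)` and refuse to pull or run the image when it returns `False`.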

Model Signing#

Model signing is a cryptographic process that verifies the integrity of AI models, as well as supporting code, documentation, and datasets, ensuring they have not been altered. It also authenticates the origin of these artifacts, and embeds essential metadata about the model’s development, such as training data sources.

Model signing can reveal unauthorized modifications to the model or supporting code and can enable remote verification of authenticity to prevent model spoofing. In regulated industries, model signing supports compliance with governance requirements, aids in version control by uniquely identifying different versions, and improves traceability with a verifiable chain of custody.

NVIDIA has worked with industry partners within the Open Source Security Foundation (OpenSSF) to develop a standard for model signing that takes into account the unique characteristics of AI models, such as the relationship between multiple files and the large size of models and datasets. We have started signing models published on the NGC Catalog, with the eventual goal of complete coverage for all models.
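
The two characteristics mentioned above, many related files and very large files, can be illustrated with a simplified sketch: hash each file in fixed-size chunks, record the hashes in a manifest keyed by relative path, and derive a single digest over that manifest. This is not the OpenSSF model signing format or library; the function names and manifest layout are assumptions, and the cryptographic signing of the final digest is omitted.

```python
import hashlib
import json
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time so large files never load fully into memory


def hash_file(path):
    """Return the SHA-256 hex digest of a file, streaming it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(model_dir):
    """Map each file's path (relative to model_dir) to its SHA-256 digest."""
    root = Path(model_dir)
    return {
        path.relative_to(root).as_posix(): hash_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def manifest_digest(manifest):
    """Return one digest covering every file name and file hash in the manifest."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

Because the final digest covers both file names and contents, changing, adding, removing, or renaming any file in the model directory produces a different value, which is what a signature over that digest would then detect.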

_images/ai-enterprise-security-04.png

Figure 4 Public key signature for Enterprise Containers on NGC Catalog#