Introduction
Modern enterprise applications are increasingly cloud-native and built on a microservices architecture. A microservices application is, by definition, a collection of small, independent services that communicate over well-defined APIs. Most AI applications fit this architecture well, because training and inference workflows typically involve many components that must work together.
To deploy an application in a production environment, the application must also meet the following criteria:
- Reliability
- Security
- Performance
- Scalability
- Interoperability
NVIDIA AI Workflows provide reference solutions that show how to use NVIDIA frameworks to build AI solutions for common use cases. They offer guidance on tasks such as fine-tuning and AI model creation on top of NVIDIA frameworks. Each workflow highlights the pipelines used to create applications, and offers opinionated guidance on deploying customized applications and integrating them with components typically found in enterprise environments, such as orchestration and management, storage, security, and networking.
By following the example that an AI workflow provides for your use case, you can streamline development of AI solutions and:
- Reduce development time and cost
- Improve accuracy and performance
- Gain confidence in the outcome by drawing on NVIDIA AI expertise
Using the example workflow, you know which AI framework to use, how to bring data into the pipeline, and what to do with the data output. AI Workflows are designed as microservices, so they can be deployed on Kubernetes either alone or alongside other microservices to create a production-ready application that scales. The cloud-deployable workflow package can be used across different cloud instances, and is automatable and interoperable.
NVIDIA AI Workflows are available on NVIDIA NGC for NVIDIA AI Enterprise software customers.
NVIDIA AI Workflows are deployed as a package containing the AI framework and the tools for automating a cloud-native solution. AI Workflows also include packaged components with enterprise-ready implementations that follow best practices for reliability, security, performance, scalability, and interoperability, while leaving you room to deviate from them.
A typical workflow may look similar to the following diagram:
Within each workflow, opinionated guidance and example components are provided for each layer of this stack, along with information about how to integrate the AI solution with them:
- Hardware: NVIDIA AI Enterprise supported GPU-accelerated on-premises hardware or cloud instances are required. Specific requirements and specifications are provided for each workflow.
- Infrastructure and Orchestration: NVIDIA Cloud Native Stack is used as an example Kubernetes distribution on which the workflows can be deployed and orchestrated.
- Supporting Software: NVIDIA Cloud Native Service Add-on Pack is used to deploy a set of components that perform functions typically required in a production enterprise environment, such as authentication/authorization, monitoring, and storage/database.
- Applications: Example microservices are provided as a series of Helm charts and customized containers that are deployed as part of the workflow. They demonstrate how to customize and build an AI application using NVIDIA frameworks, and how to integrate it with other microservices and enterprise software components.
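Because the example microservices ship as Helm charts, deploying several of them usually comes down to repeating one Helm command per chart. The following is a minimal sketch, not the workflow's own tooling: the release names, chart paths, namespace, and values file are all hypothetical placeholders, and the script only prints the `helm upgrade --install` commands rather than running them.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChartRelease:
    """One Helm release to install (every value used below is hypothetical)."""
    release: str
    chart: str
    namespace: str
    values_file: Optional[str] = None


def helm_command(item: ChartRelease) -> List[str]:
    """Build a `helm upgrade --install` command, which installs or updates a release."""
    cmd = [
        "helm", "upgrade", "--install", item.release, item.chart,
        "--namespace", item.namespace, "--create-namespace",
    ]
    if item.values_file:
        # A values file is the usual place to put site-specific customization.
        cmd += ["--values", item.values_file]
    return cmd


# Hypothetical releases; substitute the charts shipped with your workflow.
RELEASES = [
    ChartRelease("example-inference", "./charts/example-inference", "example-workflow"),
    ChartRelease("example-frontend", "./charts/example-frontend", "example-workflow",
                 "custom-values.yaml"),
]

if __name__ == "__main__":
    for item in RELEASES:
        # Print rather than execute, so the sketch is safe to run anywhere.
        print(shlex.join(helm_command(item)))
```

Keeping the release list as data makes it easy to review what will be deployed before handing the commands to a shell or a CI job.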
This AI Workflow includes three guides. While all should be relatively easy to follow, each targets a different audience:
- Deployment Guide: for IT Admins and/or DevOps Engineers who will set up the infrastructure and deploy the Next Item Prediction reference application.
- Deployment/DevOps Customization Guide: for IT Admins and/or DevOps Engineers who will set up the enterprise sample implementations for authentication, secrets management, monitoring, reporting, etc.
- Development Guide: for Data Scientists and ML Engineers who will work on the modeling, feature engineering, and data engineering involved in generating the recommendation system using the NVIDIA Merlin framework.
The components and instructions used in the workflow are intended as examples for integration, and may not be sufficiently production-ready on their own as provided. The workflow should be customized and integrated into your own infrastructure, using the workflow as a reference. For example, all of the instructions in these workflows assume a single-node infrastructure, whereas production deployments should be performed in a high availability (HA) environment.
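One small, checkable part of moving from a single-node example toward HA is confirming that no service runs as a single replica. The sketch below is illustrative only and assumes the Kubernetes manifests have already been loaded into Python dictionaries. The function name and the sample manifests are hypothetical, and replica count is only one of several HA concerns (multiple nodes, storage, and the control plane are out of scope here).

```python
from typing import Any, Dict, List


def single_replica_deployments(manifests: List[Dict[str, Any]]) -> List[str]:
    """Return the names of Deployments that would run fewer than two replicas."""
    flagged = []
    for doc in manifests:
        if doc.get("kind") != "Deployment":
            continue
        # Kubernetes defaults spec.replicas to 1 when the field is omitted.
        replicas = doc.get("spec", {}).get("replicas", 1)
        if replicas < 2:
            flagged.append(doc.get("metadata", {}).get("name", "<unnamed>"))
    return flagged


# Hypothetical manifests, trimmed to the fields this check reads.
EXAMPLE_MANIFESTS = [
    {"kind": "Deployment", "metadata": {"name": "example-inference"},
     "spec": {"replicas": 3}},
    {"kind": "Deployment", "metadata": {"name": "example-frontend"}, "spec": {}},
    {"kind": "Service", "metadata": {"name": "example-frontend"}},
]

if __name__ == "__main__":
    for name in single_replica_deployments(EXAMPLE_MANIFESTS):
        print(f"Deployment {name} runs a single replica; review before production use")
```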
For more information about the detailed components and software stacks, please refer to the guides for each workflow.