Hardware Requirements
Intelligent Virtual Assistant
NVIDIA AI Workflows can be deployed on-premises or on a cloud service provider (CSP). This workflow requires a minimum of two GPU-enabled nodes to run the provided example workload. Production deployments should be performed in a high-availability (HA) environment.
The following hardware specification is recommended for both the transcription and virtual assistant workflows:
Two instances, each with the following:
1x A10/A30/A100 (any GPU with 24 GB or more of memory)
12 vCPU cores
64 GB RAM
500 GB HDD
Ports 443 and 80 allowed for ingress and egress
DNS - A wildcard DNS A record must be created for the system, in addition to the DNS A record for the system itself. Reverse lookup (PTR) records should also exist for both entries.
Note: If the DNS entries are only resolvable within a local network, such as within a corporate domain, and not directly resolvable by the VMIs, a manual reverse lookup entry can be made in /etc/hosts on the systems, pointing 127.0.0.1 to the DNS FQDN, as a workaround.
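As a quick sanity check, the following Python sketch verifies the forward (A) and reverse (PTR) lookups and confirms that ports 80 and 443 are reachable. The names iva.example.com and app.iva.example.com are hypothetical placeholders; substitute your system's FQDN and any name covered by the wildcard record.

import socket

FQDN = "iva.example.com"          # hypothetical: the A record for the system itself
WILDCARD = "app.iva.example.com"  # hypothetical: any name under the wildcard A record

for name in (FQDN, WILDCARD):
    addr = socket.gethostbyname(name)        # forward (A record) lookup
    print(f"{name} -> {addr}")
    host, _, _ = socket.gethostbyaddr(addr)  # reverse (PTR record) lookup
    print(f"{addr} -> {host}")

# Confirm the ingress ports are reachable on the instance.
for port in (80, 443):
    with socket.create_connection((FQDN, port), timeout=5):
        print(f"port {port} is reachable")

If the /etc/hosts workaround from the note above is used, the entry takes the form of 127.0.0.1 followed by the FQDN (for example, 127.0.0.1 iva.example.com).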
Follow the steps below to set up instances that meet the above requirements before proceeding to the next section.
We will start by deploying two On-Demand NVIDIA AI Enterprise VMIs in the cloud, meeting the above specifications. One of these instances will be used for the training pipeline, and the other will be used for the inference pipeline. You can find this VMI in the marketplace of each major CSP; follow the instructions from the listing to provision the instances.
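For illustration only, the sketch below shows how two such instances might be provisioned programmatically on AWS with boto3. The AMI ID is a hypothetical placeholder for the NVIDIA AI Enterprise VMI from your marketplace subscription, and g5.4xlarge (1x A10G, 16 vCPU, 64 GiB RAM) is one instance type that meets the specification above; the marketplace listing's own instructions take precedence.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical; use the VMI's AMI ID from the listing
    InstanceType="g5.4xlarge",        # assumption: meets the GPU/vCPU/RAM specification
    MinCount=2,
    MaxCount=2,                       # one instance per pipeline
    BlockDeviceMappings=[
        {"DeviceName": "/dev/sda1",   # root device name varies by AMI
         "Ebs": {"VolumeSize": 500}}, # 500 GB disk
    ],
)
for instance in response["Instances"]:
    print(instance["InstanceId"])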
Once the instances have been provisioned, if applicable, refer to the NVIDIA AI Enterprise Cloud Guide to authorize the instances and activate your subscription.
If applicable, once your subscription has been activated, review the Prerequisites section to ensure you can access the Enterprise Catalog and create an NGC API Key if you do not already have one.
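A common way to exercise the NGC API Key is logging in to the NGC container registry at nvcr.io, where the username is the literal string $oauthtoken and the password is the key itself. The minimal sketch below assumes the key has been exported as the NGC_API_KEY environment variable.

import os
import subprocess

api_key = os.environ["NGC_API_KEY"]  # assumes: export NGC_API_KEY=<your key>

# For nvcr.io the username is always the literal string "$oauthtoken";
# the API key is supplied as the password via stdin.
subprocess.run(
    ["docker", "login", "nvcr.io", "--username", "$oauthtoken", "--password-stdin"],
    input=api_key.encode(),
    check=True,
)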
After the Hardware Requirements and Prerequisites sections have been completed, move on to the Cloud Native Software Requirements section.