Quick Start Guide

This Jarvis Speech Skills Quick Start Guide is a starting point for trying out Jarvis; specifically, it enables users to quickly deploy pretrained models on a local workstation and run a sample client.

For more information and questions, visit the NVIDIA Jarvis Developer Forum.

Prerequisites

Before you begin using Jarvis AI Services, ensure you meet the following prerequisites.

  1. You have access and are logged into NVIDIA GPU Cloud (NGC). For step-by-step instructions, see the NGC Getting Started Guide.

  2. You have access to a Volta, Turing, or NVIDIA Ampere architecture-based A100 GPU. For more information, see the Support Matrix.

  3. You have Docker installed with support for NVIDIA GPUs. For more information, see the Support Matrix.

Models Available for Deployment

To deploy Jarvis AI Services, there are two options:

Option 1: You can use the Quick Start scripts to set up a local workstation and deploy the Jarvis services using Docker. Continue with this section to use the Quick Start scripts.

Option 2: You can use a Helm chart. Included in the NGC Helm Repository is a chart designed to automate the steps for push-button deployment to a Kubernetes cluster. For details, see Kubernetes.

With either push-button deployment option, Jarvis uses pretrained models from NGC. You can also fine-tune custom models with NVIDIA NeMo; creating a model repository from a model fine-tuned in NeMo is a more advanced workflow.

Local Deployment using Quick Start Scripts

This release of Jarvis includes Quick Start scripts to help you get started with Jarvis AI Services. These scripts are meant for deploying the services locally for testing and running the example applications.

  1. Download the scripts from the File Browser tab for Jarvis Quick Start, or download them via the command line with the NGC CLI tool by running:

    ngc registry resource download-version nvidia/jarvis/jarvis_quickstart:1.0.0-b.2
    
  2. Initialize and start Jarvis. The initialization step downloads and prepares Docker images and models; the start script launches the server. Before running them, modify the config.sh file within the quickstart directory with your preferred configuration. Options include which models to retrieve from NGC, where to store them, and which GPU to use if more than one is installed in your system (see Local (Docker) for more details).

    Note: This process may take quite a while depending on the speed of your Internet connection and the number of models being deployed. Each model is individually optimized for the target GPU after download.

    cd jarvis_quickstart_v1.0.0-b.2
    bash jarvis_init.sh
    bash jarvis_start.sh
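    For illustration, the configurable portion of config.sh might look like the excerpt below. The variable names shown (service_enabled_*, jarvis_model_loc, gpus_to_use) are assumptions for this sketch; confirm them against the config.sh shipped in your download.

    ```shell
    # Excerpt of a hypothetical quickstart config.sh (check the file in your
    # release for the authoritative variable names and defaults).

    # Enable or disable individual services before running jarvis_init.sh.
    service_enabled_asr=true
    service_enabled_nlp=true
    service_enabled_tts=true

    # Where downloaded models are stored (Docker volume or host path).
    jarvis_model_loc="jarvis-model-repo"

    # Which GPU to use when more than one is installed (Docker --gpus syntax).
    gpus_to_use="device=0"
    ```

    Disabling services you do not need shortens both the download and the per-model optimization time noted above.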
    
  3. Start a container with sample clients for each service.

    bash jarvis_start_client.sh
    
  4. From inside the client container, try the different services using the provided Jupyter notebooks.

    jupyter notebook --ip=0.0.0.0 --allow-root --notebook-dir=/work/notebooks
    

For further details on how to customize a local deployment, see Local Deployment (Docker).

Running the Jarvis Client and Transcribing Audio Files

For ASR, run the following commands from inside the Jarvis Client container to perform streaming and offline transcription of audio files.

  1. For offline recognition, run: jarvis_asr_client --audio_file=/work/wav/sample.wav

  2. For streaming recognition, run: jarvis_streaming_asr_client --audio_file=/work/wav/sample.wav
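For example, a short shell loop can wrap the offline client to transcribe every sample WAV in one pass. The /work/wav path and client name are taken from the steps above; the sample file names in your container may vary.

```shell
#!/usr/bin/env bash
# Run from inside the Jarvis Client container.
# With nullglob set, the loop is simply skipped if no WAV files match.
shopt -s nullglob
for f in /work/wav/*.wav; do
    echo "Transcribing: $f"
    jarvis_asr_client --audio_file="$f"
done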

Running the Jarvis Client and Converting Text to Audio Files

From within the Jarvis Client container, synthesize the audio files by running:

jarvis_tts_client --voice_name=ljspeech --text="Hello, this is a speech synthesizer." \
    --audio_file=/work/wav/output.wav

The audio files are stored in the /work/wav directory.

The streaming API can be tested by adding the command-line option --online=true. However, the two modes produce no observable difference with the command-line client, since it saves the entire audio to a WAV file either way.
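For reference, the streaming variant of the same request differs only in the --online flag; the output file name used here is a hypothetical choice for this sketch.

```shell
# Streaming synthesis: identical to the offline invocation except for --online=true.
jarvis_tts_client --online=true --voice_name=ljspeech \
    --text="Hello, this is a speech synthesizer." \
    --audio_file=/work/wav/output_streaming.wav
```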