Local (Docker)

Configuring

General configuration for the services deployed using the Quick Start script is done by editing the file config.sh. By default, the configuration file is set to launch all available services on the supported GPU.

Important: By default, the Jarvis Speech Services API server will listen on port 50051.

All of the configuration options are documented within the configuration file itself. Follow the instructions in config.sh to change the script's default deployment behavior. Advanced users can choose which models to deploy for each service by commenting out the lines that reference unwanted pre-built model configuration files.
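The exact contents of config.sh vary between releases, but the options follow a pattern like the sketch below. The variable and model names here are illustrative assumptions for demonstration, not the shipped file:

```shell
# Illustrative sketch of a config.sh layout -- variable and model names
# below are assumptions for demonstration, not the actual shipped file.

# Enable or disable whole services.
service_enabled_asr=true
service_enabled_nlp=true
service_enabled_tts=true

# Port the Jarvis Speech Services API server listens on (50051 by default).
jarvis_speech_api_port="50051"

# Pre-built model configurations to deploy. Advanced users comment out a
# line to skip deploying that model.
models_asr=(
    "jarvis_asr_pipeline"
#    "jarvis_asr_alternate_model"   # commented out: not deployed
)
```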

Downloading Required Models and Containers from NGC

The jarvis_init.sh script downloads all required models and containers from NGC and generates the model repository. The download can take some time depending on your internet bandwidth. An NGC API key is required for the script to work. The key can be provided either through the environment variable NGC_API_KEY or via the NGC configuration file, which is generated automatically by running ngc config set.

If the NGC key cannot be automatically discovered from your environment, the init script will prompt you to enter it.
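To avoid the interactive prompt, the key can be put in place before invoking the script. Both options below are configuration fragments; the key value is a placeholder, not a real key:

```shell
# Option 1: provide the key through the environment for the current
# shell session (the value below is a placeholder, not a real key):
export NGC_API_KEY="<your-ngc-api-key>"

# Option 2: store the key once with the NGC CLI, which writes a
# configuration file that jarvis_init.sh discovers automatically
# (this command prompts interactively):
#   ngc config set
```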

Run the script with the command bash jarvis_init.sh. Upon successful completion, users should see output similar to the following:

Logging into NGC Docker registry if necessary...
Pulling required Docker images if necessary...
 > Pulling Jarvis Speech Server images.
 > Pulling nvcr.io/nvidia/jarvis/jarvis-speech:1.0.0-b.2-server. This may take some time...
Jarvis initialization complete. Run bash jarvis_start.sh to launch services.

Launching the Servers and Client Container

After downloading the required models and containers, the Jarvis Services servers can be started by running bash jarvis_start.sh. This will launch the Jarvis Speech Service API server.

Example output:

Starting Jarvis Speech Services
 > Waiting for Triton server to load all models...retrying in 10 seconds
 > Waiting for Triton server to load all models...retrying in 10 seconds
 > Waiting for Triton server to load all models...retrying in 10 seconds
 > Triton server is ready...

To verify that the servers have started correctly, users can check that the output of docker logs jarvis-speech shows:

I0428 03:14:50.440943 1 jarvis_server.cc:66] TTS Server connected to Triton Inference Server at 0.0.0.0:8001
I0428 03:14:50.440943 1 jarvis_server.cc:66] NLP Server connected to Triton Inference Server at 0.0.0.0:8001
I0428 03:14:50.440951 1 jarvis_server.cc:68] ASR Server connected to Triton Inference Server at 0.0.0.0:8001
I0428 03:14:50.440955 1 jarvis_server.cc:71] Jarvis Conversational AI Server listening on 0.0.0.0:50051
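This check can also be scripted by searching the server log for the final "listening" line. The sketch below demonstrates the pattern on a captured sample line standing in for the live log; in a real deployment, replace the echo with docker logs jarvis-speech 2>&1:

```shell
# Readiness check sketch: look for the "listening" line in the server log.
# A sample log line stands in here for the output of:
#   docker logs jarvis-speech 2>&1
sample_log='I0428 03:14:50.440955 1 jarvis_server.cc:71] Jarvis Conversational AI Server listening on 0.0.0.0:50051'

if echo "$sample_log" | grep -q 'listening on 0.0.0.0:50051'; then
    echo "Jarvis server is ready"
else
    echo "Jarvis server not ready yet"
fi
```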

To start a container with sample clients for each service, run bash jarvis_start_client.sh. From inside the client container, users can try the different services using the provided Jupyter notebooks by running:

jupyter notebook --ip=0.0.0.0 --allow-root --notebook-dir=/work/notebooks

Stopping

To shut down the Jarvis Services server containers, run bash jarvis_stop.sh.

Clean-up

To clean up the local Jarvis installation, run bash jarvis_clean.sh. This stops and removes all Jarvis-related containers, as well as deleting the Docker volume used to store model files. The Docker images themselves are not removed.