General configuration for the services deployed using the Quick Start
script is done by editing the config.sh file. By default, the
configuration file is set to launch all available services.
Important: By default, the Riva Speech Services API server will listen on port 50051.
All of the configuration options are documented within the
configuration file itself. Follow the instructions in the
config.sh file to change
the default deployment behavior of the script. Advanced users can select
which specific models to deploy for each service by commenting out
lines corresponding to the pre-built model configuration files.
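As an illustration, a config.sh might contain service toggles and per-service model lists along these lines. The variable names below are illustrative, not authoritative; refer to the comments in your config.sh for the exact names used by your Quick Start version.

```shell
# Illustrative config.sh fragment -- variable names are hypothetical and
# may differ from the config.sh shipped with your Quick Start release.

# Enable or disable entire services:
service_enabled_asr=true
service_enabled_nlp=true
service_enabled_tts=false   # example: skip TTS deployment entirely

# Comment out a line below to skip deploying that pre-built model
# (model names here are placeholders):
models_asr=(
    "example_org/rmir_asr_model_a"
#    "example_org/rmir_asr_model_b"   # disabled: will not be deployed
)
```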
Downloading Required Models and Containers from NGC¶
The riva_init.sh script downloads all required models and
containers from NGC and generates the model repository. The download can take some time depending on your internet bandwidth. You
will need to provide an NGC API key for the script to work. The key can be provided either through the
NGC_API_KEY environment variable or via a configuration file (which
is automatically generated by running
ngc config set).
If the NGC API key cannot be automatically discovered from your
environment, the riva_init.sh script will prompt you to enter it.
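For non-interactive runs, one way to make the key discoverable is to export it before invoking the script. The key value below is a placeholder; substitute your own NGC API key.

```shell
# Placeholder value -- substitute your actual NGC API key.
export NGC_API_KEY="nvapi-example-placeholder"

# Alternatively, generate the NGC configuration file interactively:
#   ngc config set
```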
Run the script with the command
bash riva_init.sh. Upon
successful completion of this command, you should see the following output:
Logging into NGC Docker registry if necessary...
Pulling required Docker images if necessary...
> Pulling Riva Speech Server images.
  > Pulling nvcr.io/nvidia/riva/riva-speech:1.8.0-beta-server. This may take some time...
Riva initialization complete. Run bash riva_start.sh to launch services.
Launching the Servers and Client Container¶
After downloading the required models and containers, the Riva Speech
Services server can be started by running
bash riva_start.sh.
This launches the Riva Speech Services API server.
Starting Riva Speech Services
> Waiting for Triton server to load all models...retrying in 10 seconds
> Waiting for Triton server to load all models...retrying in 10 seconds
> Waiting for Triton server to load all models...retrying in 10 seconds
> Triton server is ready...
To verify that the servers have started correctly, check that
the output of
docker logs riva-speech shows:
I0428 03:14:50.440943 1 riva_server.cc:66] TTS Server connected to Triton Inference Server at 0.0.0.0:8001
I0428 03:14:50.440943 1 riva_server.cc:66] NLP Server connected to Triton Inference Server at 0.0.0.0:8001
I0428 03:14:50.440951 1 riva_server.cc:68] ASR Server connected to Triton Inference Server at 0.0.0.0:8001
I0428 03:14:50.440955 1 riva_server.cc:71] Riva Conversational AI Server listening on 0.0.0.0:50051
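This check can be scripted as a small shell sketch. The container name riva-speech matches the Quick Start default, and the grep pattern is taken from the log line above; the `riva_ready` helper is hypothetical.

```shell
# Sketch: succeed once the server log reports it is listening.
# In practice the log text would come from: docker logs riva-speech 2>&1
riva_ready() {
    echo "$1" | grep -q "Riva Conversational AI Server listening on 0.0.0.0:50051"
}

sample_log='I0428 03:14:50.440955 1 riva_server.cc:71] Riva Conversational AI Server listening on 0.0.0.0:50051'
if riva_ready "$sample_log"; then
    echo "server ready"
fi
```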
To start a container with sample clients for each service, run
bash riva_start_client.sh. From inside the client container, you
can try the different services using the provided Jupyter notebooks by running:
jupyter notebook --ip=0.0.0.0 --allow-root --notebook-dir=/work/notebooks
To shut down the Riva Speech Services server containers, run
bash riva_stop.sh.
To clean up the local Riva installation, run
bash riva_clean.sh.
This will stop and remove all Riva-related containers, as well as
delete the Docker volume used to store model files. The Docker images
themselves will not be removed.