Troubleshooting

To share feedback or ask questions about this release, visit the NVIDIA Jarvis Developer Forum.

NGC

  • Before pulling images or models from NGC, ensure you are logged in with your NGC API key by running the following commands:

    $ docker login nvcr.io
    $ ngc config set
    
  • To locate Jarvis-related materials, go to the NGC Catalog. Under the Name section on the right, select ea-jarvis-stage, then navigate to the Private Registry to find Jarvis-related items.

Model Export and ServiceMaker

  • If tlt ... export from the TLT launcher is used to export models from NeMo to Jarvis, training with the recent NeMo release (1.0.0.b4) is recommended.

  • If the .ejrvs file is encrypted, you must provide the encryption key to both jarvis-build and jarvis-deploy.

  • To overwrite a previously generated JMIR file or a directory containing Triton Inference Server artifacts, pass the -f flag to either jarvis-build or jarvis-deploy.
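Putting the two notes above together, a minimal dry-run sketch of the build/deploy sequence. The paths, the model name speech_recognition, the key value, and the --key flag spelling are placeholder assumptions; only the need for a key and the -f overwrite flag come from the notes above.

```shell
# Dry-run sketch for rebuilding from an encrypted .ejrvs and overwriting
# previous artifacts. The commands are echoed rather than executed so the
# sketch runs anywhere; paths, model name, key value, and the --key flag
# spelling are placeholder assumptions.
KEY="tlt_encode"                     # key the .ejrvs was exported with
EJRVS=/servicemaker-dev/asr.ejrvs
JMIR=/servicemaker-dev/asr.jmir

echo "jarvis-build speech_recognition $JMIR $EJRVS --key=$KEY -f"
echo "jarvis-deploy $JMIR /data/models --key=$KEY -f"
```

Using the same key and -f in both steps keeps a repeated build/deploy cycle idempotent: each run replaces the previous JMIR and Triton artifacts instead of failing.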

Services

ASR

  • For very long audio, it is recommended to use the streaming recognition service.

  • For noisy audio, the Jasper acoustic model is recommended for improved accuracy. The provided English-language Jasper model has been trained to be robust to moderate levels of background noise; the provided English QuartzNet model is less robust to background noise. The acoustic model can be selected in RecognitionConfig.

  • When using streaming recognition, the client sends chunks of audio, normally captured from a microphone. Clients can use any chunk size; the Jarvis server regroups the audio into chunks of 100 ms or 800 ms, depending on the server configuration. Note that streaming recognition mode uses more GPU memory than offline recognition mode.
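Those server-side chunk durations translate into byte counts with a little arithmetic; a sketch assuming 16 kHz, 16-bit mono PCM input (the sample format is an assumption here):

```shell
# Bytes per server-side chunk for 16 kHz, 16-bit (2-byte) mono PCM audio.
RATE=16000          # samples per second
BYTES_PER_SAMPLE=2  # 16-bit linear PCM
CHUNK_100MS=$(( RATE * BYTES_PER_SAMPLE / 10 ))      # 100 ms of audio
CHUNK_800MS=$(( RATE * BYTES_PER_SAMPLE * 8 / 10 ))  # 800 ms of audio
echo "100 ms chunk: $CHUNK_100MS bytes"   # 3200
echo "800 ms chunk: $CHUNK_800MS bytes"   # 25600
```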

  • In offline recognition mode, clients send a single request containing all of the audio, and the server segments it into chunks under the hood.

  • The server will automatically upsample 8 kHz audio to 16 kHz.

  • To use domain-specific ASR models, either fine-tune the acoustic model or train a domain-specific language model, then deploy it with Jarvis ServiceMaker.

  • To fully saturate the hardware, you might want to run multiple streams in real time. In the client example, specify the number of parallel streams by setting --num_parallel_requests.

TTS

  • For real-time applications, use the online streaming mode (the --online=True option in the command-line client, or the SynthesizeOnline function in the gRPC API).

  • The input is limited to 400 tokens (characters or ARPABET symbols). If you need to synthesize long paragraphs, break them down at the sentence level.
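One way to break a paragraph at the sentence level with standard tools; this naive sketch splits only on ". " and ignores abbreviations and other sentence terminators:

```shell
# Split a paragraph into one sentence per line so that each synthesis
# request stays well under the 400-token input limit. Naive: splits on
# ". " only, so abbreviations like "Dr." would be split incorrectly.
TEXT="This is a long paragraph. It has several sentences. Each one becomes a request."
SENTENCES=$(printf '%s\n' "$TEXT" | awk '{gsub(/\. /, ".\n"); print}')
printf '%s\n' "$SENTENCES"
```

Each output line can then be sent as a separate synthesis request.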

  • To fully saturate the hardware, you might want to run multiple streams in real time. In the client example, specify the number of parallel streams by setting --num_parallel_requests. Note that using more streams than your hardware configuration supports can cause issues, such as some requests timing out.

Client Integration

  • If you encounter the error Cannot create GRPC channel at uri localhost:50051, check whether the Jarvis server is running by inspecting its logs with docker logs jarvis-speech.

Jarvis Helm Deployment

A variety of issues can occur when installing with Helm. Some of the more common ones are captured below:

  • During installation, to watch the logs for the init container (where $POD is the name of the Jarvis API pod), run:

    kubectl logs $POD jarvis-model-init --follow
    
  • During installation, to watch the logs of the server (the server won’t start until the init container above finishes), run:

    kubectl logs $POD --follow
    
  • To ensure the pods are correctly launched, run:

    kubectl get pods -A
    
  • To verify which services are running, run:

    kubectl get services
    
  • If using load balancing from traefik to validate the ingress route, run:

    kubectl get ingressroutes
    
  • To ensure the ingress route points at the correct service, run:

    kubectl get service `kubectl get ingressroutes jarvis-ingressroute -o=jsonpath='{..spec.routes[0].services[0].name}'`
    

SpeechSquad

  • Verify that connectivity to the SpeechSquad container is functioning by opening a shell inside the pod:

    POD=`kubectl get pods | grep speechsquad | awk '{print $1}'`
    kubectl exec -it $POD -- /bin/bash