Running Inference

Verify Server Health

Perform a health check on the gRPC endpoint.

  1. Install grpcurl from its GitHub releases page: https://github.com/fullstorydev/grpcurl/releases

    Example commands to run on Ubuntu:

    wget https://github.com/fullstorydev/grpcurl/releases/download/v1.9.1/grpcurl_1.9.1_linux_amd64.deb
    sudo dpkg -i grpcurl_1.9.1_linux_amd64.deb
    
  2. Download the health checking proto:

    wget https://raw.githubusercontent.com/grpc/grpc/master/src/proto/grpc/health/v1/health.proto
    
  3. Run the health check on <server_ip> or localhost:

    grpcurl --plaintext --proto health.proto <server_ip>:8001 grpc.health.v1.Health/Check
    

If the service is ready, you will get a response similar to the following:

{ "status": "SERVING" }
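
When the health check is run from a script, the JSON printed by grpcurl can be parsed instead of read by eye. The following is a minimal sketch, assuming the response has the shape shown above; the function name `is_serving` is hypothetical and is not part of the NIM client.

```python
import json


def is_serving(response_text: str) -> bool:
    """Return True if a grpcurl health-check response reports SERVING."""
    try:
        response = json.loads(response_text)
    except json.JSONDecodeError:
        # Anything that is not valid JSON (for example an error message)
        # is treated as "not ready".
        return False
    return isinstance(response, dict) and response.get("status") == "SERVING"
```

For example, `is_serving('{ "status": "SERVING" }')` returns `True`, while any other status or unparseable output returns `False`.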

Note

To use grpcurl with an SSL-enabled server, omit the --plaintext argument and pass --cacert with the CA certificate, --key with the private key, and --cert with the certificate file. Refer to grpcurl --help for more details.
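
The choice between the plaintext and SSL variants of the health check can be sketched as a small helper that builds the grpcurl argument list. This is an illustration only: the function name `health_check_command` and its parameters are hypothetical, and it uses only the grpcurl flags mentioned in this section.

```python
from typing import List, Optional


def health_check_command(
    server: str,
    proto: str = "health.proto",
    cacert: Optional[str] = None,
    cert: Optional[str] = None,
    key: Optional[str] = None,
) -> List[str]:
    """Build the grpcurl argument list for the health check."""
    cmd = ["grpcurl"]
    if cacert is None:
        # No CA certificate given: assume a server without SSL.
        cmd.append("--plaintext")
    else:
        cmd += ["--cacert", cacert]
        # The client key and certificate are added only when both are given.
        if cert is not None and key is not None:
            cmd += ["--key", key, "--cert", cert]
    cmd += ["--proto", proto, server, "grpc.health.v1.Health/Check"]
    return cmd
```

The returned list can be passed to `subprocess.run`. Any certificate paths supplied to the helper are placeholders that depend on your deployment.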

Running Inference via Script

1. Clone the Repository

Download the Studio Voice Python client code by cloning the NVIDIA Maxine NIM Clients Repository:

git clone https://github.com/NVIDIA-Maxine/nim-clients.git

# Go to the 'studio-voice' folder

cd nim-clients/studio-voice/

2. Install Dependencies

sudo apt-get install python3-pip
pip install -r requirements.txt

3. Run the Python Client

You can use the sample client script in the Studio Voice GitHub repo to send a gRPC request to the hosted NIM server:

  1. Go to the scripts directory.

    cd scripts
    
  2. Run the following command to send a gRPC request:

    python studio_voice.py --target <server_ip:port> --input <input_audio_file_path> --output <output_audio_file_path>
    

Transactional mode: The following example command processes the packaged sample audio file and generates a studio_voice_48k_output.wav file in the current folder.

python studio_voice.py --target 127.0.0.1:8001 --input ../assets/studio_voice_48k_input.wav --output studio_voice_48k_output.wav

Streaming mode: The following example command processes the packaged sample audio file and generates a studio_voice_48k_output.wav file in the current folder.

python studio_voice.py --target 127.0.0.1:8001 --input ../assets/studio_voice_48k_input.wav --output studio_voice_48k_output.wav --streaming --model-type 48k-ll

Note

When using --streaming mode, make sure the selected --model-type (48k-hq, 48k-ll, or 16k-hq) matches the Model Type of the configured NIM_MODEL_PROFILE to maintain compatibility.

Note

To use the client in Streaming mode, launch the NIM in Streaming mode. Similarly, to use the client in Transactional mode, launch the NIM in Transactional mode.
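
The two invocation styles can be summarized in a small helper that assembles the client command line. This is a sketch under assumptions: the function name `client_command` is hypothetical, and, mirroring the example commands above, it passes --model-type only together with --streaming.

```python
from typing import List

# Model types listed in this section.
MODEL_TYPES = ("48k-hq", "48k-ll", "16k-hq")


def client_command(
    target: str,
    input_path: str,
    output_path: str,
    streaming: bool = False,
    model_type: str = "48k-hq",
) -> List[str]:
    """Build the studio_voice.py command for Transactional or Streaming mode."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"model_type must be one of {MODEL_TYPES}")
    cmd = [
        "python", "studio_voice.py",
        "--target", target,
        "--input", input_path,
        "--output", output_path,
    ]
    if streaming:
        # The model type must match the profile the NIM was launched with.
        cmd += ["--streaming", "--model-type", model_type]
    return cmd
```

The helper cannot verify how the NIM was launched, so matching the client mode and model type to the server configuration remains the caller's responsibility.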

Note

Only WAV files are supported.
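
An input file can be checked before it is sent to the server. The following is a rough sketch using Python's standard `wave` module; the function name `wav_sample_rate` is hypothetical, and because `wave` reads only PCM-encoded files, a rejection here does not necessarily mean the NIM would reject the file.

```python
import wave


def wav_sample_rate(path: str) -> int:
    """Return the sample rate of a PCM WAV file, or raise ValueError."""
    try:
        with wave.open(path, "rb") as wav_file:
            return wav_file.getframerate()
    except (wave.Error, EOFError) as exc:
        # wave.Error: not a RIFF/WAVE file or an unsupported encoding.
        # EOFError: the file is empty or truncated.
        raise ValueError(f"{path} is not a readable WAV file: {exc}") from exc
```

The returned sample rate can then be compared against the rate you expect for the model type in use.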

Note

The first inference is not indicative of the model’s performance.
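
When measuring latency, one way to account for this is to run one or more untimed warm-up requests first. The following is a generic sketch and not part of the client: `timed_runs` and `run_once` are hypothetical names, and `run_once` stands for any callable that performs a single inference request.

```python
import time
from typing import Callable, List


def timed_runs(run_once: Callable[[], object], runs: int = 3, warmup: int = 1) -> List[float]:
    """Call run_once warmup + runs times and return the measured durations."""
    for _ in range(warmup):
        run_once()  # Warm-up calls are executed but not timed.
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        durations.append(time.perf_counter() - start)
    return durations
```

With the defaults, `run_once` is called four times and three durations (in seconds) are returned.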

Usage for Preview API Request

python studio_voice.py --preview-mode \
    --ssl-mode TLS \
    --target grpc.nvcf.nvidia.com:443 \
    --function-id <function_id> \
    --api-key $API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC \
    --input <input_file_path> \
    --output <output_file_path>

Command Line Arguments

To view the details of command line arguments, run this command:

python studio_voice.py -h

  • --target <ip:port> - URI of NIM’s gRPC service. Use grpc.nvcf.nvidia.com:443 when hosted on NVCF. Default value is 127.0.0.1:8001.

  • --preview-mode - Flag to send request to preview NVCF server on Try API.

  • --ssl-mode - Set the SSL mode to TLS or MTLS. Defaults to no SSL. When running preview, TLS mode must be used with the default root certificate.

  • --ssl-key - The path to client’s PEM encoded private key. Only required for MTLS mode.

  • --ssl-cert - The path to client’s PEM encoded public certificate. Only required for MTLS mode.

  • --ssl-root-cert - The path to PEM encoded root certificate. Used only for TLS or MTLS modes.

  • --api-key <NGC_API_KEY> - NGC API key required for authentication. Used only with the Try API; ignored otherwise.

  • --function-id - NVCF function ID for the service. Used only with the Try API; ignored otherwise.

  • --input - The path to the input audio file. Default value is ../assets/studio_voice_48k_input.wav.

  • --output - The path for the output audio file. Default is ./studio_voice_48k_output.wav.

  • --streaming - Flag to enable gRPC streaming mode.

  • --model-type {48k-hq,48k-ll,16k-hq} - Studio Voice model type. Default value is 48k-hq.
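
To show how the documented options and defaults fit together, the list above can be mirrored in an `argparse` sketch. This is not the source of studio_voice.py; the names `build_parser` and `check_args` are hypothetical, and the two checks encode only rules stated in this section (MTLS needs a client key and certificate, and preview mode needs TLS).

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options and defaults."""
    parser = argparse.ArgumentParser(description="Sketch of the documented client options")
    parser.add_argument("--target", default="127.0.0.1:8001")
    parser.add_argument("--preview-mode", action="store_true")
    parser.add_argument("--ssl-mode", choices=["TLS", "MTLS"], default=None)
    parser.add_argument("--ssl-key")
    parser.add_argument("--ssl-cert")
    parser.add_argument("--ssl-root-cert")
    parser.add_argument("--api-key")
    parser.add_argument("--function-id")
    parser.add_argument("--input", default="../assets/studio_voice_48k_input.wav")
    parser.add_argument("--output", default="./studio_voice_48k_output.wav")
    parser.add_argument("--streaming", action="store_true")
    parser.add_argument("--model-type", choices=["48k-hq", "48k-ll", "16k-hq"], default="48k-hq")
    return parser


def check_args(args: argparse.Namespace) -> List[str]:
    """Return the problems implied by the rules documented above."""
    problems = []
    if args.ssl_mode == "MTLS" and not (args.ssl_key and args.ssl_cert):
        problems.append("MTLS mode requires --ssl-key and --ssl-cert")
    if args.preview_mode and args.ssl_mode != "TLS":
        problems.append("preview mode requires --ssl-mode TLS")
    return problems
```

Parsing an empty argument list yields the documented defaults, and `check_args` returns an empty list when no documented rule is violated.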