End-to-End Demo Chart#

Overview#

The nvidia-studio-voice-h4m-sample Helm chart deploys a complete Studio Voice demo pipeline on Holoscan for Media. This includes:

Sender pipeline.
Studio Voice NIM service (nvidia-studio-voice-h4m-service).
Receiver pipeline (SMPTE ST 2110 or NMOS).
SRT output for preview.

The Studio Voice NIM processes incoming audio streams and performs real-time AI-based speech enhancement (noise suppression, dereverberation, restoration of clarity). Enhanced audio is transmitted via ST 2110-30.

Installation#

Prerequisites#

Complete all prerequisite steps (Rivermax license, image pull and model pull secrets, high-speed network attachment) before running helm install. For details, refer to Getting Started.

Pull the Chart#

If you have not already done so, add the Helm repository, then pull the chart:

helm pull nim-repo/nvidia-studio-voice-h4m-sample --version 1.2.0

For the full repository setup, refer to Pull Helm Charts.

Helm Installation#

The chart includes a default values.yaml. For SMPTE ST 2110 static or NMOS pipelines, copy the configuration from Starter Configuration Files into values-st2110.yaml or values-nmos.yaml, adjust fields, and run the matching install command.

Important

The default values.yaml ships with example values for cluster-specific fields such as node selector (example-gpu-node), image pull secret, network name, and scheduler. Before deploying, you must update these fields to match your cluster. The easiest approach is to use the global overrides so that each value is specified once for all components (sender, receiver, and NIM service). For details, refer to Global Overrides.

Minimal deployment using global overrides (NMOS mode):

helm upgrade --install studio-voice-h4m-sample \
  nvidia-studio-voice-h4m-sample-1.2.0.tgz \
  --set global.nodeSelector.hostname=<gpu-node-name> \
  --set global.image.secret=<image-pull-secret> \
  --set global.network.name=<multus-net-attach-def> \
  --set global.schedulerName=<scheduler-name> \
  --set nvidia-studio-voice-h4m-service.ngc.secretName=<model-pull-secret>
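The same overrides can also be kept in a values file rather than repeated as --set flags. The following is a minimal sketch: the file name values-global.yaml is an assumption, the keys mirror the --set paths in the preceding command, and each angle-bracket placeholder must be replaced with a value from your cluster.

```shell
# Write the global overrides to a values file.
# The file name is an assumption; the keys mirror the --set paths above.
# Replace every <placeholder> before installing.
cat > values-global.yaml <<'EOF'
global:
  nodeSelector:
    hostname: <gpu-node-name>
  image:
    secret: <image-pull-secret>
  network:
    name: <multus-net-attach-def>
  schedulerName: <scheduler-name>
nvidia-studio-voice-h4m-service:
  ngc:
    secretName: <model-pull-secret>
EOF
```

Pass the file to helm upgrade --install with -f values-global.yaml in place of the individual --set flags.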

Alternatively, set values per component when the sender, receiver, and NIM service run on different nodes or use different pull secrets:

helm upgrade --install studio-voice-h4m-sample \
  nvidia-studio-voice-h4m-sample-1.2.0.tgz \
  --set sender.nodeSelector.hostname=<gpu-node-name> \
  --set receiver.nodeSelector.hostname=<gpu-node-name> \
  --set nvidia-studio-voice-h4m-service.nodeSelector.hostname=<gpu-node-name> \
  --set sender.image.secret=<image-pull-secret> \
  --set receiver.image.secret=<image-pull-secret> \
  --set nvidia-studio-voice-h4m-service.image.secret=<image-pull-secret> \
  --set nvidia-studio-voice-h4m-service.ngc.secretName=<model-pull-secret>

For all available Helm values, refer to the Configuration Reference.

Recommended: Two-Phase Deploy#

First, deploy the receiver and NIM service. When the pipeline is ready, enable the sender. This procedure prevents audio from accumulating in receiver queues before the NIM is ready.

# Phase 1: receiver + NIM service only
helm upgrade --install studio-voice-h4m-sample \
  nvidia-studio-voice-h4m-sample-1.2.0.tgz \
  --set sender.enabled=false

kubectl rollout status deployment/<receiver-appName> --timeout=180s
kubectl rollout status deployment/<nim-appName> --timeout=300s

# Phase 2: enable sender (set explicitly rather than relying on the chart default)
helm upgrade --install studio-voice-h4m-sample \
  nvidia-studio-voice-h4m-sample-1.2.0.tgz \
  --set sender.enabled=true

kubectl rollout status deployment/<sender-appName> --timeout=180s

On Red Hat OpenShift, replace kubectl with oc.
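The two phases can be wrapped in small shell functions so that the sequence is repeatable. This is a sketch under stated assumptions, not part of the chart: the function names, the DRY_RUN switch, and the KUBECTL variable are hypothetical helpers, and the deployment names are passed as arguments because they depend on the appName values in your configuration. Any -f or --set overrides you use must be added to both helm calls.

```shell
# Two-phase deploy helpers (sketch; names are illustrative assumptions).
RELEASE="studio-voice-h4m-sample"
CHART="nvidia-studio-voice-h4m-sample-1.2.0.tgz"
KUBECTL="${KUBECTL:-kubectl}"   # set KUBECTL=oc on Red Hat OpenShift

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Phase 1: receiver + NIM service only, then wait for both rollouts.
# $1 = receiver deployment name, $2 = NIM deployment name
phase1() {
  run helm upgrade --install "$RELEASE" "$CHART" --set sender.enabled=false &&
  run "$KUBECTL" rollout status "deployment/$1" --timeout=180s &&
  run "$KUBECTL" rollout status "deployment/$2" --timeout=300s
}

# Phase 2: enable the sender, then wait for its rollout.
# $1 = sender deployment name
phase2() {
  run helm upgrade --install "$RELEASE" "$CHART" --set sender.enabled=true &&
  run "$KUBECTL" rollout status "deployment/$1" --timeout=180s
}
```

Running DRY_RUN=1 phase1 <receiver-appName> <nim-appName> prints the commands without touching the cluster, which is a convenient way to review them first.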

Note

A rollout status timeout does not by itself indicate a failure; it means only that the pod did not become ready within the allotted time. For example, the NIM pod might still be pulling its image or initializing. If a timeout occurs, check the pod state before taking action:

kubectl get pods -o wide
kubectl describe pod <pod-name>

In the events, a Pulling image entry is normal (wait longer). CrashLoopBackOff or ErrImagePull requires action: check the pull secrets and node resources.
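The triage above can be sketched as a small helper that maps a container's waiting reason to a next step. The function name and the wait/act/inspect labels are illustrative assumptions; the reasons are standard Kubernetes container states, and ImagePullBackOff is included because it commonly follows ErrImagePull.

```shell
# Map a container waiting reason to a suggested next step.
#   wait        = normal startup, give it more time
#   act         = needs intervention (secrets, node resources)
#   not-waiting = container is not in a waiting state
#   inspect     = unrecognized reason, check `kubectl describe pod`
classify_waiting_reason() {
  case "$1" in
    "") echo "not-waiting" ;;
    ContainerCreating|PodInitializing) echo "wait" ;;
    CrashLoopBackOff|ErrImagePull|ImagePullBackOff) echo "act" ;;
    *) echo "inspect" ;;
  esac
}

# Usage (requires cluster access; <pod-name> is a placeholder):
#   reason="$(kubectl get pod <pod-name> \
#     -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}')"
#   classify_waiting_reason "$reason"
```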

After installation, verify all pods are running:

kubectl get pods

Confirm that the sender, NIM service, and receiver pods all show READY 1/1 before continuing.
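For scripted checks, the READY column can be evaluated instead of read by eye. A minimal sketch, assuming the default kubectl get pods column layout (NAME, then READY); the function name is hypothetical.

```shell
# Read `kubectl get pods --no-headers` output on stdin and succeed only if
# at least one pod is listed and every pod's READY column is N/N.
all_pods_ready() {
  awk '
    { n++; split($2, r, "/"); if (r[1] != r[2]) bad = 1 }
    END { if (n == 0 || bad) exit 1 }
  '
}

# Usage (requires cluster access):
#   kubectl get pods --no-headers | all_pods_ready && echo "all pods ready"
```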

Sender Sample Media#

Input Media Files#

The sender expects an MPEG Transport Stream (.ts) file containing an audio stream for Studio Voice enhancement.

Codecs#

Audio codec: Opus, 48 kHz, mono.

Bundled Sample Files#

The sender container includes a sample .ts file:

/workspace/assets/studio_voice_48k_2_looped_10min.ts

File	Sample Rate	Channels
`studio_voice_48k_2_looped_10min.ts`	48 kHz	Mono

Using a Custom Input Asset#

To test with your own audio, upload a .ts file to a PersistentVolumeClaim (PVC) and enable the built-in PVC mount on the sender:

sender:
  inputAssets:
    audioFile: /workspace/assets/my-custom-audio.ts
  inputData:
    pvc:
      enabled: true
      claimName: studiovoice-sender-input-data

The PVC is mounted at /workspace/assets inside the sender pod, overlaying the default bundled assets.
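Before uploading a custom asset, it can help to confirm that its first audio stream matches the documented input (Opus, 48 kHz, mono). The sketch below assumes ffprobe is installed on your workstation; check_audio_props is a hypothetical helper that parses ffprobe's key=value output.

```shell
# Check ffprobe key=value output (stdin) against the documented sender input:
# Opus, 48 kHz, mono. Prints "ok" or the first mismatch (non-zero exit).
check_audio_props() {
  awk -F= '
    $1 == "codec_name"  { codec = $2 }
    $1 == "sample_rate" { rate = $2 }
    $1 == "channels"    { ch = $2 }
    END {
      if (codec != "opus") { print "codec is " codec ", expected opus"; exit 1 }
      if (rate != 48000)   { print "sample rate is " rate ", expected 48000"; exit 1 }
      if (ch != 1)         { print "channel count is " ch ", expected 1"; exit 1 }
      print "ok"
    }
  '
}

# Usage (requires ffprobe; the file name is a placeholder):
#   ffprobe -v error -select_streams a:0 \
#     -show_entries stream=codec_name,sample_rate,channels \
#     -of default=noprint_wrappers=1 my-custom-audio.ts | check_audio_props
```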

Starter Configuration Files#

Copy the appropriate configuration into values-st2110.yaml or values-nmos.yaml, adjust fields for your environment, and pass it to helm upgrade --install with -f.

For a full reference of all Helm keys and pipeline tuning parameters, refer to the Configuration Reference.

SMPTE ST 2110 Configuration — Static (values-st2110.yaml)#

sender:
  enabled: true
  appName: nvidia-studio-voice-sender-st2110
  replicas: 1
  nodeSelector:
    hostname: example-gpu-node
  image:
    repository: nvcr.io/nim/nvidia/studio-voice-h4m-sample-client
    tag: "1.2.0"
    secret: ngc-image-secret
  network:
    name: media-a-tx-net
  inputAssets:
    audioFile: /workspace/assets/studio_voice_48k_2_looped_10min.ts
  st2110:
    multicastIp: "234.5.8.9"
    audioPort: 5003
    multicastTtl: 24
  nmos:
    enabled: false
    label: Studio Voice Audio Sender

receiver:
  enabled: true
  appName: nvidia-studio-voice-receiver-st2110
  replicas: 1
  nodeSelector:
    hostname: example-gpu-node
  image:
    repository: nvcr.io/nim/nvidia/studio-voice-h4m-sample-client
    tag: "1.2.0"
    secret: ngc-image-secret
  network:
    name: media-a-tx-net
  receiverMode: processed
  st2110:
    multicastIp: "234.5.8.9"
    audioPort: 5004
    multicastTtl: 24
  audioParams:
    opusBitrate: 128000
  nmos:
    enabled: false
    label: Studio Voice Audio Receiver
  srtPort:
    internal: 9090
    external: 32090

nvidia-studio-voice-h4m-service:
  enabled: true
  appName: nvidia-studio-voice-nim-st2110
  replicas: 1
  nodeSelector:
    hostname: example-gpu-node
  image:
    repository: nvcr.io/nim/nvidia/studio-voice-h4m
    tag: "1.2.0"
    pullPolicy: IfNotPresent
    secret: ngc-image-secret
  network:
    name: media-a-tx-net
  nmos:
    enabled: false
    httpPort: 9010
    seed: nvidia-studio-voice-nim-st2110
  audio:
    pcmFormat: S24BE
    samplingRate: 48000
    numChannels: 1
  input:
    audio:
      sessionName: audio_in
      localInterfaceName: net1
      hostIp: "234.5.8.9"
      hostPort: 5003
      hostNumSubnetBits: 24
  output:
    audio:
      sessionName: audio_out
      localInterfaceName: net1
      hostIp: "234.5.8.9"
      hostPort: 5004
      hostNumSubnetBits: 24
  studiovoice:
    nmosEnabled: false
    nimEnabled: true

NMOS Configuration (values-nmos.yaml)#

sender:
  enabled: true
  appName: nvidia-studio-voice-sender-nmos
  replicas: 1
  nodeSelector:
    hostname: example-gpu-node
  image:
    repository: nvcr.io/nim/nvidia/studio-voice-h4m-sample-client
    tag: "1.2.0"
    secret: ngc-image-secret
  network:
    name: media-a-tx-net
  inputAssets:
    audioFile: /workspace/assets/studio_voice_48k_2_looped_10min.ts
  nmos:
    enabled: true
    hostname: studio-voice-sender.local
    description: Studio Voice Audio Sender
    label: Studio Voice Audio Sender

receiver:
  enabled: true
  appName: nvidia-studio-voice-receiver-nmos
  replicas: 1
  nodeSelector:
    hostname: example-gpu-node
  image:
    repository: nvcr.io/nim/nvidia/studio-voice-h4m-sample-client
    tag: "1.2.0"
    secret: ngc-image-secret
  network:
    name: media-a-tx-net
  receiverMode: processed
  nmos:
    enabled: true
    hostname: studio-voice-receiver.local
    description: Studio Voice Audio Receiver
    label: Studio Voice Audio Receiver
  srtPort:
    internal: 9090
    external: 32090

nvidia-studio-voice-h4m-service:
  enabled: true
  appName: nvidia-studio-voice-nim-nmos
  replicas: 1
  nodeSelector:
    hostname: example-gpu-node
  image:
    repository: nvcr.io/nim/nvidia/studio-voice-h4m
    tag: "1.2.0"
    pullPolicy: IfNotPresent
    secret: ngc-image-secret
  network:
    name: media-a-tx-net
  nmos:
    enabled: true
    httpPort: 9010
    seed: nvidia-studio-voice-nim-nmos
    hostname: studio-voice-nim.local
    description: Studio-Voice-NIM
    label: Studio-Voice-NIM
  audio:
    pcmFormat: S24BE
    samplingRate: 48000
    numChannels: 1
  input:
    audio:
      sessionName: audio_in
  output:
    audio:
      sessionName: audio_out
  studiovoice:
    nmosEnabled: true
    nimEnabled: true

Note

In NMOS mode, the transport parameters (multicast IP address, port, subnet) are negotiated dynamically by the NMOS Connection Manager over IS-04 and IS-05, so the preceding NIM service block needs only sessionName for each input/output. The values are discovered at runtime.

On Holoscan for Media clusters, the NMOS Connection Manager web UI is typically reached from a browser on the cluster network. For remote graphical access, refer to Chrome Remote Desktop in Getting Started.

Note

In NMOS mode, use the NMOS Connection Manager UI to connect the receiver to the Studio Voice NIM before you connect the sender. In static SMPTE ST 2110 mode, ensure that the IP address and port chain is consistent across all values: the sender's multicast address and port must match the NIM input, and the NIM output must match the receiver's.
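For the static case, the chain can be checked mechanically before installing. A minimal sketch with a hypothetical helper that compares the four endpoints as ip:port strings; the sample call uses the addresses and ports from the starter values-st2110.yaml.

```shell
# Verify the static ST 2110 chain: sender -> NIM input, NIM output -> receiver.
# Arguments are "ip:port" endpoints in this order:
#   $1 sender, $2 NIM input, $3 NIM output, $4 receiver
check_st2110_chain() {
  rc=0
  if [ "$1" != "$2" ]; then
    echo "sender $1 does not match NIM input $2"
    rc=1
  fi
  if [ "$3" != "$4" ]; then
    echo "NIM output $3 does not match receiver $4"
    rc=1
  fi
  [ "$rc" -eq 0 ] && echo "chain consistent"
  return "$rc"
}

# Endpoints taken from the starter values-st2110.yaml:
check_st2110_chain "234.5.8.9:5003" "234.5.8.9:5003" \
                   "234.5.8.9:5004" "234.5.8.9:5004"
```

If you change a multicast address or port in one component, rerun the check with the updated values to catch a mismatch on the other side of the link.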

Install#

SMPTE ST 2110 (static):

helm upgrade --install studio-voice-h4m-sample \
  nvidia-studio-voice-h4m-sample-1.2.0.tgz \
  -f values-st2110.yaml

Note

NMOS is enabled by default. To deploy in fixed SMPTE ST 2110 mode (static), append the following flags to the install command:

--set sender.nmos.enabled=false \
--set receiver.nmos.enabled=false \
--set nvidia-studio-voice-h4m-service.nmos.enabled=false

NMOS:

helm upgrade --install studio-voice-h4m-sample \
  nvidia-studio-voice-h4m-sample-1.2.0.tgz \
  -f values-nmos.yaml

View Output via SRT#

Get the internal IP address of the node:

kubectl get nodes -o wide

The SRT output is on port 32090 by default. Open the stream in VLC or ffplay:

ffplay "srt://<internal-ip>:32090"
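The two steps above can be combined in a script. This is a sketch under assumptions: srt_url is a hypothetical helper, and the usage lines take the address from the receiver pod's status.hostIP field, which is usually but not necessarily the node's internal IP.

```shell
# Build the SRT preview URL from a node IP and an optional port
# (defaults to 32090, the documented default).
srt_url() {
  echo "srt://$1:${2:-32090}"
}

# Usage (requires cluster access; <receiver-pod> is a placeholder):
#   ip="$(kubectl get pod <receiver-pod> -o jsonpath='{.status.hostIP}')"
#   ffplay "$(srt_url "$ip")"
```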

Disabling Components#

Individual pipeline components can be disabled during helm install or helm upgrade without removing the release:

--set sender.enabled=false
--set receiver.enabled=false
--set nvidia-studio-voice-h4m-service.enabled=false

Only the selected component stops; the rest of the pipeline continues to run.

Verify#

helm status studio-voice-h4m-sample
kubectl get pods -o wide

Verify configured Helm values:

helm get values studio-voice-h4m-sample

On Red Hat OpenShift, replace kubectl with oc. For log access, refer to Observability.

Uninstall#

helm uninstall studio-voice-h4m-sample

Note

The *-model-cache and *-nim-logs PVCs carry helm.sh/resource-policy: keep and are not removed by helm uninstall. This preserves the cached model across reinstalls; clean them up manually when no longer needed:

kubectl delete pvc -l app.kubernetes.io/instance=studio-voice-h4m-sample -n <your-namespace>
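Because the label selector matches every PVC that belongs to the release, it can be worth listing the retained claims before deleting anything. A minimal sketch; retained_pvcs is a hypothetical filter that keeps only names ending in the two documented suffixes.

```shell
# Filter PVC names (stdin, one per line) down to the claims that
# helm uninstall keeps: *-model-cache and *-nim-logs.
retained_pvcs() {
  grep -E -- '-(model-cache|nim-logs)$'
}

# Usage (requires cluster access; <your-namespace> is a placeholder):
#   kubectl get pvc -n <your-namespace> -o name | retained_pvcs
```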

End-to-End Verification#

Create values-st2110.yaml or values-nmos.yaml from Starter Configuration Files and run the matching Install command.
Confirm all pods show READY 1/1 with kubectl get pods.
Open the SRT preview; refer to View Output via SRT. For VLC steps, NMOS wiring, and screenshots, refer to Verification.
Run helm uninstall studio-voice-h4m-sample.

For troubleshooting, refer to Advanced Usage.