Kubernetes Operator Chart#

Overview#

The nvidia-active-speaker-detection-h4m-operator Helm chart installs the Active Speaker Detection Kubernetes operator. The controller watches NvidiaActiveSpeakerDetectionMediaFunction custom resources and reconciles a deployment for the NIM, SDP ConfigMaps (/sdps), and status.

  • Operator chart: Deploys only the controller and CRDs. It does not install an NvidiaActiveSpeakerDetectionMediaFunction custom resource or the NIM workload. Apply a custom resource after installation to provision the NIM.

  • Operand: Each custom resource instance drives one NIM workload (video and ancillary output) according to the inputs, outputs, and parameters declared in that resource.


Installation#

Prerequisites#

Complete all prerequisite steps (Rivermax license, image pull and model pull secrets, GPU node, Multus attachment for SMPTE ST 2110) before running helm install. For more information, refer to Getting Started.

Pull the Chart#

If not already done, add the Helm repository and pull the chart:

helm pull nim-repo/nvidia-active-speaker-detection-h4m-operator --version 1.0.0

For the full repository setup, refer to Pull Helm Charts.

Helm Installation#

The chart installs the operator deployment and registers the NvidiaActiveSpeakerDetectionMediaFunction CRD. For all available Helm values and --set keys, refer to Operator Configuration.

The chart expects a pre-created image pull secret for nvcr.io (default name ngc-api-key). If your secret names differ, override as follows:

# Quote each --set argument so shells such as zsh do not treat [0] as a glob pattern.
helm install ai4m-asd-operator \
  nvidia-active-speaker-detection-h4m-operator-1.0.0.tgz \
  --set 'imagePullSecrets[0].name=<operator-pull-secret>' \
  --set 'mediaFunction.imagePullSecrets[0].name=<nim-pull-secret>'

When the defaults match your cluster, you can use the following command:

helm install ai4m-asd-operator \
  nvidia-active-speaker-detection-h4m-operator-1.0.0.tgz

| --set Flag | Purpose | Default |
| --- | --- | --- |
| imagePullSecrets[0].name | Image pull secret for the operator pod. | ngc-api-key |
| mediaFunction.imagePullSecrets[0].name | Image pull secret for NIM pods that the operator creates. | ngc-api-key |
| imageCredentials.password | Inline registry password for the operator (alternative to a pre-created secret). | "" |
| mediaFunction.imageCredentials.password | Inline registry password for NIM pods. | "" |

Note

The model pull secret and node hostname are set on the custom resource (spec.parameters.ngcModelDownload, spec.nodeSelector), not in Helm values. Refer to Create the Custom Resource.
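
The pull-secret overrides can also be kept in a values file rather than repeated as --set flags. The sketch below writes a file named asd-operator-values.yaml (a hypothetical name) with placeholder secret names. Its keys mirror the --set paths listed above, relying on Helm's rule that imagePullSecrets[0].name addresses the name of the first list entry.

```shell
# Write a values file equivalent to the two --set overrides.
# The file name and both secret names are placeholders, not chart defaults.
cat > asd-operator-values.yaml <<'EOF'
imagePullSecrets:
  - name: my-operator-pull-secret
mediaFunction:
  imagePullSecrets:
    - name: my-nim-pull-secret
EOF

# Install with the file instead of --set:
#   helm install ai4m-asd-operator \
#     nvidia-active-speaker-detection-h4m-operator-1.0.0.tgz \
#     -f asd-operator-values.yaml
```

A values file avoids shell quoting issues with the [0] index and can be reviewed alongside other deployment configuration.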

Wait for the controller to be ready:

kubectl rollout status deployment/ai4m-asd-operator-controller-manager --timeout=180s

On Red Hat OpenShift, replace kubectl with oc.


Create the Custom Resource#

After the operator controller is ready, save a manifest (for example, nim-media-function.yaml) and configure the following areas:

| Area | Field | Notes |
| --- | --- | --- |
| GPU node | spec.nodeSelector | Labels for the node that runs the NIM pod (for example, kubernetes.io/hostname). |
| Model pull secret | spec.parameters.ngcModelDownload.secretName | Must match the model pull secret created in Getting Started. |
| Pipeline I/O | spec.inputs / spec.outputs | SMPTE ST 2110 static (fixed sessions and addresses) or NMOS-managed connections. |
| Pipeline tuning | spec.parameters | Refer to Pipeline Tuning and Operator Configuration. |
| Audio inputs | spec.inputs.audio_inputs | Define audio_input_0 through audio_input_<n-1> for n streams, one block per stream. |
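
For larger stream counts, the audio_inputs section can be generated rather than typed by hand. The following is a minimal sketch, assuming a hypothetical helper named gen_audio_inputs that is not part of the operator. It emits one block per stream, numbered from 0 to n-1, and reuses the NMOS control, RTP transport, and audio format values from the example manifest in this section; adjust them if your streams differ.

```shell
# Print spec.inputs.audio_inputs with one block per stream:
# audio_input_0 through audio_input_<n-1>.
# gen_audio_inputs is a hypothetical helper, not part of the operator.
# Indentation matches a block nested under "spec:" -> "inputs:".
gen_audio_inputs() {
  n="$1"
  i=0
  printf '    audio_inputs:\n'
  while [ "$i" -lt "$n" ]; do
    cat <<EOF
      audio_input_${i}:
        control:
          nmos: {}
        transport:
          rtp: {}
        format:
          audio:
            sample_rate: 48000
            channel_count: 1
            sample_depth: 24
            media_type: "audio/L24"
EOF
    i=$((i + 1))
  done
}
```

Running gen_audio_inputs 4 prints blocks for audio_input_0 through audio_input_3, which can be pasted under spec.inputs in the manifest.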

Apply the manifest:

kubectl apply -f nim-media-function.yaml

On Red Hat OpenShift, replace kubectl with oc.
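
To check a manifest without creating anything, a server-side dry run sends it through the API server's validation but does not persist it. This is a sketch only: validate_manifest is a hypothetical helper name, while --dry-run=server is a standard kubectl flag. It assumes the operator chart is already installed, because the API server needs the CRD to recognize the resource kind.

```shell
# Validate the manifest against the installed CRD without persisting it.
# validate_manifest is a hypothetical helper; the default file name is the
# example name used in this section.
validate_manifest() {
  kubectl apply --dry-run=server -f "${1:-nim-media-function.yaml}"
}
```

A non-zero exit status indicates that the API server would reject the manifest as written.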

Example Manifest#

The following reference custom resource uses a placeholder node name, the ngc-model-pull-api-key secret for model download, and common video format defaults. The spec.parameters.resources block is part of this CR; it sets Kubernetes requests and limits for the operand workload, using the same numeric defaults as the standalone NIM service chart in Pipeline Configuration.

apiVersion: nvidia.com/v1alpha1
kind: NvidiaActiveSpeakerDetectionMediaFunction
metadata:
  labels:
    app.kubernetes.io/name: asd-operator
    app.kubernetes.io/managed-by: kustomize
  name: nvidia-active-speaker-detection-h4m-media-function
spec:
  schedulerName: topo-aware-scheduler
  nodeSelector:
    kubernetes.io/hostname: <gpu-node-name>
  parameters:
    highSpeedNetwork:
      - name: media-a-tx-net
    resources:
      requests:
        cpu: "12"
        memory: "8Gi"
        hugepages-2Mi: "8Gi"
        nvidia.com/gpu: "1"
      limits:
        cpu: "12"
        memory: "8Gi"
        hugepages-2Mi: "8Gi"
        nvidia.com/gpu: "1"
    ngcModelDownload:
      secretName: ngc-model-pull-api-key
      secretKey: NGC_API_KEY
    nimLogs:
      path: /workspace/nim-logs
      pvc:
        enabled: false
        claimName: asd-nim-logs
    nimModelCache:
      path: /opt/nim/.cache
      pvc:
        enabled: false
        claimName: asd-nim-model-cache
    securityContext:
      runAsUser: 1000
      runAsGroup: 1000
      runAsNonRoot: true
      allowPrivilegeEscalation: true
      capabilities:
        add:
          - IPC_LOCK
          - NET_RAW
          - SYS_NICE
          - DAC_READ_SEARCH
        drop:
          - ALL
    podSecurityContext:
      fsGroup: 1000
      seccompProfileType: RuntimeDefault
    label: "my-asd-instance"
    description: "Active Speaker Detection - 1080p30, 2 audio streams"
    syncTolerance: "0.5986"
    testFrameOverlayMode: false
    outputFrameBufferSize: 30
    useAudioThresholdToDetectActiveAudioStream: false
    audioThresholdDb: "-40.0"
    loggingLevel: 3

  inputs:
    video_input:
      control:
        nmos: {}
      transport:
        rtp: {}
      format:
        video:
          frame_width: 1920
          frame_height: 1080
          frame_rate: "30"
          # Optional — typical defaults:
          media_type: "video/raw"
          interlace_mode: "progressive"
          color_sampling: "YCbCr-4:2:2"
          component_depth: 10
          colorspace: "BT709"
          transfer_characteristic: "SDR"

    audio_inputs:
      audio_input_0:
        control:
          nmos: {}
        transport:
          rtp: {}
        format:
          audio:
            sample_rate: 48000
            channel_count: 1
            sample_depth: 24
            media_type: "audio/L24"

      audio_input_1:
        control:
          nmos: {}
        transport:
          rtp: {}
        format:
          audio:
            sample_rate: 48000
            channel_count: 1
            sample_depth: 24
            media_type: "audio/L24"

  outputs:
    video_output:
      control:
        nmos: {}
      transport:
        rtp: {}
      format:
        video:
          frame_width: 1920
          frame_height: 1080
          frame_rate: "30"
          media_type: "video/raw"
          interlace_mode: "progressive"
          color_sampling: "YCbCr-4:2:2"
          component_depth: 10
          colorspace: "BT709"
          transfer_characteristic: "SDR"

    ancillary_data_output:
      control:
        nmos: {}
      transport:
        rtp: {}
      format:
        data:
          media_type: "video/smpte291"

Extended Parameters (Optional)#

As needed, you can add optional spec.parameters fields, including resources (CPU, memory, GPU, hugepages), nimLogs / nimModelCache PVC settings, useAudioThresholdToDetectActiveAudioStream, and testFrameOverlayMode. For field-by-field details, refer to Operator Configuration in the configuration reference. For PVC configuration, refer to Advanced Usage.


Verify#

After applying the custom resource:

# List all custom resources with their Provisioned status.
kubectl get nvidiaactivespeakerdetectionmediafunctions

# List all pods managed by the operator.
kubectl get pods -l app.kubernetes.io/managed-by=asd-operator

Inspect custom resource status conditions:

kubectl get nvidiaactivespeakerdetectionmediafunction <cr-name> \
  -o json | jq '.status.conditions'
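
Where jq is not available, kubectl's built-in JSONPath output can print the same conditions in a compact form. This is a minimal sketch: show_conditions is a hypothetical helper, and it assumes the resource reports conditions with the conventional type, status, and message fields.

```shell
# Print one line per status condition: type, status, message (tab-separated).
# show_conditions is a hypothetical helper; the field names are the
# conventional Kubernetes condition fields and are assumed here.
show_conditions() {
  kubectl get nvidiaactivespeakerdetectionmediafunction "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}'
}
```

Call it with the resource name, for example show_conditions nvidia-active-speaker-detection-h4m-media-function.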

Confirm the CRD is installed:

kubectl get crd nvidiaactivespeakerdetectionmediafunctions.nvidia.com

On Red Hat OpenShift, replace kubectl with oc. For log access, see Observability.
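
In scripts, a blocking wait can stand in for the manual CRD check, for example between helm install and applying the custom resource. The following is a minimal sketch using a hypothetical helper named wait_for_crd. kubectl wait with --for=condition=Established is standard kubectl: it returns once the API server reports the CRD as established, and fails when the timeout expires.

```shell
# Block until the CRD is established, or fail after the timeout.
# wait_for_crd is a hypothetical helper; the 60s default is an arbitrary
# choice, not a value taken from the chart.
wait_for_crd() {
  kubectl wait --for=condition=Established --timeout="${1:-60s}" \
    crd/nvidiaactivespeakerdetectionmediafunctions.nvidia.com
}
```

Pass a different timeout as the first argument if needed, for example wait_for_crd 120s.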


Uninstall#

First, delete all custom resource instances so the operator can remove the managed deployments and ConfigMaps; then remove the Helm release. Delete the CRD cluster-wide only if coordinated with cluster administrators; it affects all namespaces.

kubectl delete nvidiaactivespeakerdetectionmediafunctions.nvidia.com --all
helm uninstall ai4m-asd-operator
# Optional — removes the custom resource definition cluster-wide:
kubectl delete crd nvidiaactivespeakerdetectionmediafunctions.nvidia.com

On Red Hat OpenShift, replace kubectl with oc.

For troubleshooting and known limitations, refer to Advanced Usage.


See Also#