Operator Configuration#

This page covers Helm values for the nvidia-active-speaker-detection-h4m-operator chart (controller pod) and fields for the NvidiaActiveSpeakerDetectionMediaFunction custom resource (NIM workload).

Pipeline tuning semantics on the custom resource match the Pipeline Configuration Helm keys.


Operator Helm Chart (nvidia-active-speaker-detection-h4m-operator)#

Values for the operator controller pod (not the NIM deployment):

| Configuration | Helm Key | Comment |
| --- | --- | --- |
| Operator image | image.repository, image.tag, image.pullPolicy | Defaults: nvcr.io/nim/nvidia/active-speaker-detection-h4m-operator, tag 1.0.0, pull Always. |
| Inline registry credentials (operator pod) | imageCredentials.* | Defaults: registry nvcr.io, username $oauthtoken, password "", email "". An empty password skips the chart-created pull secret. |
| Inline registry credentials (NIM operands) | mediaFunction.imageCredentials.* | Same shape as imageCredentials. An empty password skips the media-function registry secret. |
| Pull secrets | imagePullSecrets | Default: [{name: ngc-api-key}] (operator pod image pull). |
| Default NIM image for operands | mediaFunction.image, mediaFunction.imagePullSecrets | Defaults: image nvcr.io/nim/nvidia/active-speaker-detection-h4m-nim:1.0.0, pull secrets [{name: ngc-api-key}]. |
| Default NIM scheduler | mediaFunction.schedulerName | Default: topo-aware-scheduler. |
| Operator replicas | replicas | Default: 1. |
| Leader election | leaderElection | Default: true. |
| Operator pod resources | resources.* | Defaults: limits 500m CPU / 128Mi memory; requests 10m CPU / 64Mi memory. |
| Metrics | metrics.enabled, metrics.port | Default: true, port 8443. |
| ServiceAccount | serviceAccount.create, serviceAccount.name | Default: create true, name "". |
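The keys above can be collected into a values file. The following is a minimal sketch that restates the documented defaults. It assumes that resources.* follows the standard Kubernetes limits/requests layout, and the file name asd-operator-values.yaml is hypothetical. The release name and chart reference in the trailing comment are placeholders.

```shell
# Write a values file that restates the documented operator defaults.
# Assumption: resources.* uses the standard Kubernetes limits/requests layout.
# The file name is hypothetical.
cat > asd-operator-values.yaml <<'EOF'
image:
  repository: nvcr.io/nim/nvidia/active-speaker-detection-h4m-operator
  tag: "1.0.0"
  pullPolicy: Always
imagePullSecrets:
  - name: ngc-api-key
mediaFunction:
  image: nvcr.io/nim/nvidia/active-speaker-detection-h4m-nim:1.0.0
  imagePullSecrets:
    - name: ngc-api-key
  schedulerName: topo-aware-scheduler
replicas: 1
leaderElection: true
resources:
  limits:
    cpu: 500m
    memory: 128Mi
  requests:
    cpu: 10m
    memory: 64Mi
metrics:
  enabled: true
  port: 8443
serviceAccount:
  create: true
  name: ""
EOF

# Pass the file to Helm at install time (placeholders, not real names):
#   helm install <release-name> <chart-reference> -f asd-operator-values.yaml
```

In practice, keep only the keys that differ from the defaults so the file documents what was changed.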

Important

The media function image is configured when the operator is installed, so NvidiaActiveSpeakerDetectionMediaFunction instances managed by an operator use the same validated operator and media-function image combination. This approach ensures the operator and workload are always supported together and avoids compatibility issues.

As a result, CRs managed by an operator share the chart-level mediaFunction.image and mediaFunction.imagePullSecrets configured at install time. To run multiple custom resources with different NIM image tags, deploy multiple operators, each configured with its own mediaFunction.image.


Custom Resource (NvidiaActiveSpeakerDetectionMediaFunction)#

Full YAML example: Example Manifest.

Scheduling and Cluster#

| Configuration | CR Path | Comment |
| --- | --- | --- |
| Node selector | spec.nodeSelector | Kubernetes node labels (for example, kubernetes.io/hostname). |

Parameters: Network, Resources, Secrets, Security#

| Configuration | CR Path | Comment |
| --- | --- | --- |
| High-speed networks | spec.parameters.highSpeedNetwork | Default name: media-a-tx-net; for Multus, list of {name, ip?}. |
| Pod resources | spec.parameters.resources | Kubernetes requests and limits for the operand NIM workload. |
| NGC model download | spec.parameters.ngcModelDownload | secretName, secretKey; default key: NGC_API_KEY. |
| Model cache / logs | spec.parameters.nimModelCache / spec.parameters.nimLogs | Optional PVC-backed model cache and NIM logs. Subkeys include path, pvc.enabled, and pvc.claimName; refer to Advanced Usage. |
| Security context | spec.parameters.securityContext | Example: runAsUser: 1000, capabilities for Rivermax and RTP; refer to Getting Started. |
| Pod security context | spec.parameters.podSecurityContext | Example: fsGroup: 1000, seccompProfileType: RuntimeDefault; applies at the pod level. |
| NMOS metadata | spec.parameters.label, spec.parameters.description | NMOS IS-04 registration strings. |
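To illustrate how these paths nest, the following sketch writes a spec.parameters fragment to a file. The secret name ngc-api-secret, the file name, and the label and description strings are hypothetical. The other values are the examples quoted in the table. This is a fragment only, not a complete manifest; refer to Example Manifest for the full resource.

```shell
# Create the secret referenced by ngcModelDownload (requires cluster access;
# the secret name is hypothetical):
#   kubectl create secret generic ngc-api-secret \
#     --from-literal=NGC_API_KEY="<your-ngc-api-key>"

# Write a fragment showing how the documented CR paths nest.
cat > asd-parameters-fragment.yaml <<'EOF'
spec:
  parameters:
    ngcModelDownload:
      secretName: ngc-api-secret   # hypothetical secret name
      secretKey: NGC_API_KEY       # documented default key
    securityContext:
      runAsUser: 1000
    podSecurityContext:
      fsGroup: 1000
      seccompProfileType: RuntimeDefault
    label: asd-studio-a            # hypothetical NMOS label
    description: Active speaker detection for studio A   # hypothetical
EOF
```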

Pod Resources (spec.parameters.resources)#

| Resource | Requests | Limits |
| --- | --- | --- |
| cpu | 12 | 12 |
| hugepages-2Mi | 8Gi | 8Gi |
| memory | 8Gi | 8Gi |
| nvidia.com/gpu | 1 | 1 |
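Expressed as a CR fragment, the values above might look like the following sketch. It assumes that spec.parameters.resources takes the standard Kubernetes requests/limits layout, and the file name is hypothetical. Kubernetes requires requests and limits to be equal for huge pages and for extended resources such as nvidia.com/gpu, which is consistent with the identical columns in the table.

```shell
# Write a spec fragment with the documented operand requests and limits.
# Assumption: standard Kubernetes requests/limits layout under
# spec.parameters.resources. The file name is hypothetical.
cat > asd-resources-fragment.yaml <<'EOF'
spec:
  parameters:
    resources:
      requests:
        cpu: 12
        hugepages-2Mi: 8Gi
        memory: 8Gi
        nvidia.com/gpu: 1
      limits:
        cpu: 12
        hugepages-2Mi: 8Gi
        memory: 8Gi
        nvidia.com/gpu: 1
EOF
```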

Parameters: Pipeline Tuning#

| Configuration | CR Path (spec.parameters.*) | Environment Variable (on NIM Pod) | Comment |
| --- | --- | --- | --- |
| Sync tolerance | syncTolerance | AI4M_ACTIVESPEAKERDETECTION_SYNC_TOLERANCE | Default: 0.5986; valid range: (0, 1). Invalid values fall back to default. |
| Bounding box overlay | testFrameOverlayMode | AI4M_ACTIVESPEAKERDETECTION_TEST_FRAME_OVERLAY_MODE | true or false |
| Output frame buffer size | outputFrameBufferSize | AI4M_ACTIVESPEAKERDETECTION_OUTPUT_FRAME_BUFFER_SIZE | Default: 30 frames. |
| Audio silence detection | useAudioThresholdToDetectActiveAudioStream | AI4M_ACTIVESPEAKERDETECTION_USE_AUDIO_THRESHOLD_TO_DETECT_ACTIVE_AUDIO_STREAM | Default: false. |
| Audio silence threshold | audioThresholdDb | AI4M_ACTIVESPEAKERDETECTION_AUDIO_THRESHOLD_DB | Default: -40.0 dB. |
| Log level | loggingLevel | (mapped by operator to NIM env) | 0 FATAL, 1 ERROR, 2 WARN, 3 INFO (default), 4 DEBUG, 5 VERBOSE. |
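Because an out-of-range syncTolerance silently falls back to the default, it can help to check values before applying a CR. The helpers below are a client-side sketch only: the function names are hypothetical, and the logic mirrors the documented range and log-level names rather than reproducing the NIM's own validation.

```shell
# Print the sync tolerance that would take effect: the given value if it lies
# strictly inside (0, 1), otherwise the documented default 0.5986.
# Hypothetical helper; a pre-check, not the NIM's own code.
effective_sync_tolerance() {
  awk -v v="$1" 'BEGIN {
    # Accept only plain decimal numbers, then apply the open-interval check.
    if (v ~ /^[0-9]*\.?[0-9]+$/ && v + 0 > 0 && v + 0 < 1) { print v } else { print "0.5986" }
  }'
}

# Map a loggingLevel number to the name listed in the table above.
log_level_name() {
  case "$1" in
    0) echo FATAL ;;
    1) echo ERROR ;;
    2) echo WARN ;;
    3) echo INFO ;;
    4) echo DEBUG ;;
    5) echo VERBOSE ;;
    *) echo "unknown loggingLevel: $1" >&2; return 1 ;;
  esac
}

effective_sync_tolerance 0.75   # prints 0.75
effective_sync_tolerance 1.5    # prints 0.5986
log_level_name 4                # prints DEBUG
```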

Inputs and Outputs#

| Area | CR Path | Comment |
| --- | --- | --- |
| Video in | spec.inputs.video_input | control, transport, format.video; see subfields in next table. |
| Audio in | spec.inputs.audio_inputs (audio_input_0 through audio_input_<n-1>) | One block per stream; count must match pipeline. |
| Video out | spec.outputs.video_output | Optional; resolution must match input when present. |
| Ancillary out | spec.outputs.ancillary_data_output | Required for SMPTE ST 2110-40; format.data.media_type default: video/smpte291; payload layout: Ancillary Data Payload. |

Video Format Subfields#

These subfields apply to video input and video output.

| Field | Example Value | Comment |
| --- | --- | --- |
| media_type | video/raw | Fixed for uncompressed ST 2110-20. |
| frame_width | 1920 | Must match NIM video.width. |
| frame_height | 1080 | Must match NIM video.height. |
| frame_rate | "30" | Frames per second as a string. |
| interlace_mode | progressive | |
| color_sampling | YCbCr-4:2:2 | |
| component_depth | 10 | Bits per component. |
| colorspace | BT709 | |
| transfer_characteristic | SDR | |

frame_rate must be identical for spec.inputs.video_input.format.video and spec.outputs.video_output.format.video. Invalid values are rejected when the CR is applied. The following values are supported:

| Value | Rate |
| --- | --- |
| "24" | 24 fps |
| "25" | 25 fps |
| "30" | 30 fps |
| "30000/1001" | 29.97 fps (fractional) |
| "50" | 50 fps |
| "60" | 60 fps |
| "60000/1001" | 59.94 fps (fractional) |

Use a plain integer string for whole rates and a numerator/denominator string for fractional rates. Always quote the value in YAML so fractions are not interpreted as numbers.
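As a sketch of how the numerator/denominator form maps to a frame rate, the following helpers check a string against the supported list and convert it to frames per second. The function names are hypothetical, and the check is a local convenience that does not replace the validation performed when the CR is applied.

```shell
# Succeed only for the frame_rate strings listed as supported above.
is_supported_frame_rate() {
  case "$1" in
    24|25|30|30000/1001|50|60|60000/1001) return 0 ;;
    *) return 1 ;;
  esac
}

# Convert a supported frame_rate string to frames per second (two decimals).
frame_rate_fps() {
  is_supported_frame_rate "$1" || { echo "unsupported frame_rate: $1" >&2; return 1; }
  awk -v r="$1" 'BEGIN {
    n = split(r, parts, "/")        # "30000/1001" -> parts[1]=30000, parts[2]=1001
    den = (n == 2) ? parts[2] : 1   # whole rates have an implicit denominator of 1
    printf "%.2f\n", parts[1] / den
  }'
}

frame_rate_fps "30000/1001"   # prints 29.97
frame_rate_fps "50"           # prints 50.00
```

A decimal such as 29.97 is not in the supported list, so is_supported_frame_rate rejects it; use "30000/1001" instead.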

Audio Format Subfields#

These subfields apply to each audio_input_n stream.

| Field | Example Value | Comment |
| --- | --- | --- |
| sample_rate | 48000 | Must match NIM audio.samplingRate. |
| channel_count | 1 | Mono per stream; must match NIM audio.numChannels. |
| sample_depth | 24 | Bits; corresponds to PCM L24. |
| media_type | audio/L24 | Fixed for ST 2110-30 mono L24. |


CR Schema Reference#

With the nvidiaactivespeakerdetectionmediafunctions.nvidia.com CRD installed, inspect field types and OpenAPI details from the cluster:

kubectl explain nvidiaactivespeakerdetectionmediafunctions.nvidia.com --recursive
kubectl explain nvidiaactivespeakerdetectionmediafunctions.nvidia.com.spec
kubectl explain nvidiaactivespeakerdetectionmediafunctions.nvidia.com.spec.inputs.video_input.format.video
kubectl explain nvidiaactivespeakerdetectionmediafunctions.nvidia.com.spec.outputs.ancillary_data_output

Some validation rules might not appear in kubectl explain output but are still enforced when the CR is applied.

Status Conditions (status.conditions)#

Status Overview#

The operator reports health using Kubernetes conditions in status.conditions (type, status, reason, message).

A media function is considered healthy and ready when the following are all True:

  • Provisioned: Deployment and pod are running.

  • Registered: Successfully registered with NMOS / SDP.

  • Configured: Required configuration has been applied.

Important Statuses#

| Status | What It Means |
| --- | --- |
| Provisioned | App deployment is running. |
| Registered | Registration succeeded. |
| Configured | Configuration applied successfully. |
| Connected | Connections are established. |
| Active | Reserved for future media-flow status. |
| Degraded | Reserved for future health reporting. |

When to Alert#

Alert if any of these stay False:

  • Provisioned

  • Registered

  • Configured

The most important failure reason values are:

  • DeploymentNotAvailable

  • NmosQueryFailed

Connected False is usually informational or transient and can occur when connections are still being established.

Ignore Active and Degraded for now.

Check Status#

Use the custom resource's metadata.name, not a Pod name.

kubectl get nvidiaactivespeakerdetectionmediafunction <cr-name> \
  -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\n"}{end}'
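The tab-separated output of this command can be reduced to a single ready/not-ready answer. The following sketch applies the rule from Status Overview: Provisioned, Registered, and Configured must all be True, and the other condition types are ignored. The function name media_function_ready is hypothetical, and the sample input is made up for illustration rather than captured from an operator.

```shell
# Read "type<TAB>status<TAB>reason" lines on stdin and report whether
# Provisioned, Registered, and Configured are all True.
# Connected, Active, and Degraded are ignored. Hypothetical helper.
media_function_ready() {
  awk -F '\t' '
    BEGIN { bad = 0 }
    $1 == "Provisioned" || $1 == "Registered" || $1 == "Configured" {
      seen[$1] = 1
      if ($2 != "True") {
        printf "%s is %s (reason: %s)\n", $1, $2, $3
        bad = 1
      }
    }
    END {
      n = 0
      for (k in seen) { n++ }
      if (n < 3) { print "one or more required conditions are missing"; bad = 1 }
      if (!bad) { print "ready" }
      exit bad
    }'
}

# Usage against a live cluster (requires kubectl access): pipe the output of
# the kubectl get command above into media_function_ready.

# Made-up sample input; prints "Registered is False (reason: NmosQueryFailed)".
printf 'Provisioned\tTrue\t-\nRegistered\tFalse\tNmosQueryFailed\nConfigured\tTrue\t-\n' \
  | media_function_ready || true
```

The function exits non-zero when the media function is not ready, so it can gate a script or a monitoring probe.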

See Also#