Custom Resource Definition (CRD) Reference
The NVIDIA Nsight Operator uses Custom Resource Definitions (CRDs) as its
configuration API. When you install the operator using Helm, the Helm values are
templated into CRD instances (Custom Resources). After installation, you can also
create or modify CRs directly using kubectl.
There are two ways to configure the operator:
- During installation: set Helm values, which are templated into CRs.
- After installation: create or modify CRs directly with kubectl apply.
The sections below document each CRD field along with its corresponding Helm value path (where applicable).
NsightCoordinator
API Version: nvidia.com/v1alpha1 | Kind: NsightCoordinator
Manages the coordinator deployment for profiling session control. The operator controller reconciles this CR to deploy coordinator pods and services.
| Field | Description | Default | Helm Value |
|---|---|---|---|
| | Resource limits and requests for the coordinator container | | |
| | Kubernetes service type (LoadBalancer, NodePort, ClusterIP) | | |
| | Annotations for the coordinator service | | |
| | Labels for the coordinator service | | |
| | Enable REST API for the coordinator | | |
| | Port for the REST API service | | |
| | Number of REST API workers | | |
| | Enable Ingress for the REST API | | |
| | Authentication type for the coordinator | | |
| | Enable ZeroMQ CURVE encryption for secure communication | | |
| | Use custom encryption keys instead of auto-generated | | |
| | Node selector for pod assignment | | |
| | Tolerations for pod assignment | | |
| | Affinity rules for pod assignment | | |
| | Topology spread constraints for pods | | |
| | Pod-level security context (inherits from | (inherits global) | |
| | Container-level security context (inherits from | (inherits global) | |
Example:
apiVersion: nvidia.com/v1alpha1
kind: NsightCoordinator
metadata:
  name: nsight-coordinator
  namespace: my-namespace
spec:
  service:
    type: LoadBalancer
  curveEncryption:
    enabled: true
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
NsightCloudStorageConfig
API Version: nvidia.com/v1alpha1 | Kind: NsightCloudStorageConfig
Configures cloud storage (S3/MinIO) for profiling results. When MinIO is enabled, the operator deploys and manages MinIO instances.
| Field | Description | Default | Helm Value |
|---|---|---|---|
| | Enable this storage configuration | – | |
| | Storage backend type: | | |
| | S3 bucket name for storing reports | | |
| | Path where storage config is mounted in pods | | |
| | Reference to external S3 credentials secret | (auto-generated) | |
| | Enable operator-managed MinIO deployment. The CRD default is | | |
| | Enable persistent storage for MinIO | | |
| | Storage class for the PVC | | |
| | Size of the persistent volume | | |
| | Resource limits for MinIO container | | |
| | MinIO service type | | |
| | MinIO API port | | |
| | MinIO console port | | |
Example:
apiVersion: nvidia.com/v1alpha1
kind: NsightCloudStorageConfig
metadata:
  name: nsight-cloud-storage
  namespace: my-namespace
spec:
  enabled: true
  storage_type: s3
  bucketName: my-profiling-results
  minio:
    enabled: true
    persistence:
      enabled: true
      storageClassName: gp2
      size: 20Gi
NsightOtelCollector
API Version: nvidia.com/v1alpha1 | Kind: NsightOtelCollector | Short Name: noc
Configures OTLP collector infrastructure for trace mirroring. Creates a StatefulSet with the OpenTelemetry collector and optional converter sidecar.
| Field | Description | Default | Helm Value |
|---|---|---|---|
| | Number of replicas for the collector StatefulSet | | |
| | Resource limits for the collector container | | |
| | Kubernetes service type | | |
| | OTLP gRPC receiver port | | |
| | OTLP HTTP receiver port | | |
| | Enable persistent storage for OTLP data | | |
| | Size of the persistent volume | | |
| | Enable OTLP-to-Nsight report converter sidecar | | |
| | Resource limits for the converter container | | |
| | How often to check for conversion requests | | – |
| | Grace period for conversion completion | | |
Example:
apiVersion: nvidia.com/v1alpha1
kind: NsightOtelCollector
metadata:
  name: nsight-otel-collector
  namespace: my-namespace
spec:
  replicas: 1
  receivers:
    otlpGRPCPort: 4317
    otlpHTTPPort: 4318
  persistentStorage:
    enabled: true
    size: 20Gi
  otlpConverter:
    enabled: true
    conversionGracePeriodSeconds: 10
OTLPProxyConfig
API Version: nvidia.com/v1alpha1 | Kind: OTLPProxyConfig | Short Name: opc
Configures OTLP proxy injection for pods. The injector webhook uses this to inject Envoy sidecars for trace mirroring.
| Field | Description | Default | Helm Value |
|---|---|---|---|
| | Enable OTLP proxy injection | | |
| | Container image for the OTLP proxy sidecar | | – |
| | Endpoint of the Nsight OTel Collector (auto-discovered if empty) | (auto-discovered) | – |
| | Resource requirements for the proxy sidecar | | – |
Example:
apiVersion: nvidia.com/v1alpha1
kind: OTLPProxyConfig
metadata:
  name: nsight-otlp-proxy-config
  namespace: my-namespace
spec:
  enabled: true
  resources:
    requests:
      memory: 50Mi
      cpu: 100m
    limits:
      memory: 200Mi
      cpu: 500m
NsightAnalysis
API Version: nvidia.com/v1alpha1 | Kind: NsightAnalysis
Configures Nsight Analysis service infrastructure for running recipes.
| Field | Description | Default | Helm Value |
|---|---|---|---|
| | Resource requirements for the analysis container | | |
| | Kubernetes service type | | |
| | Port for the REST API service | | |
| | Annotations for the analysis service | | |
| | Node selector for pod assignment | | |
| | Tolerations for pod assignment | | |
| | Affinity rules | | |
| | Topology spread constraints for pods | | |
Example:
apiVersion: nvidia.com/v1alpha1
kind: NsightAnalysis
metadata:
  name: nsight-analysis
  namespace: my-namespace
spec:
  resources:
    requests:
      cpu: 1000m
      memory: 1Gi
NsightStreamer
API Version: nvidia.com/v1alpha1 | Kind: NsightStreamer | Short Name: ns
Deploys a browser-based Nsight Systems viewer for profiling reports stored in the cluster. Nsight Operator integrates with Nsight Systems only; the upstream streamer container can host other Nsight tools, but they are out of scope for the operator.
| Field | Description | Default | Helm Value |
|---|---|---|---|
| | Nsight tool to run. Only | | |
| | Kubernetes service type | | |
| | HTTP port for web interface | | |
| | WebRTC TURN port | | |
| | Enable resizing to fit remote resolution to client window | | |
| | Maximum resolution for streaming | | |
| | Username for browser authentication (string or secret ref) | | |
| | Password for browser authentication (string or secret ref). Warning The | | |
| | Preserve Nsight Tool configuration between restarts | | |
| | Reference to NsightCloudStorageConfig for report access | – | (auto-configured) |
| | Resource limits (set | | |
| | RuntimeClass for GPU acceleration (set to | – | |
| | Additional volumes for mounting report files | | |
| | Volume mounts for the streamer container | | |
Note
SecurityContext is intentionally not configurable. Nsight Streamer runs as the
nvidia user internally and requires write access to its home directory.
Example:
apiVersion: nvidia.com/v1alpha1
kind: NsightStreamer
metadata:
  name: nsight-viewer
  namespace: my-namespace
spec:
  service:
    type: ClusterIP
    httpPort: 30080
    turnPort: 30478
  enableResize: true
  maxResolution: "1920x1080"
  # Placeholders only -- for production, reference credentials from a
  # Kubernetes Secret (see the "Example with Kubernetes Secret" below).
  webUsername:
    value: "<USERNAME>"
  webPassword:
    value: "<PASSWORD>"
  cloudStorageConfigRef:
    name: nsight-cloud-storage
Example with GPU Hardware Acceleration:
See Hardware Acceleration Prerequisites for the GPU, driver, and runtime requirements.
apiVersion: nvidia.com/v1alpha1
kind: NsightStreamer
metadata:
  name: nsight-viewer-gpu
  namespace: my-namespace
spec:
  runtimeClassName: nvidia
  resources:
    limits:
      nvidia.com/gpu: 1
  cloudStorageConfigRef:
    name: nsight-cloud-storage
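The GPU example refers to a RuntimeClass named nvidia. The fragment below is a minimal sketch of such an object using the standard Kubernetes RuntimeClass API. It assumes the container runtime on the GPU nodes already exposes a handler called nvidia; that handler name is an assumption, and many GPU-enabled clusters ship this RuntimeClass already, in which case nothing needs to be created.

```yaml
# Sketch only: a cluster-scoped RuntimeClass matching runtimeClassName: nvidia.
# The handler value is an assumption about how the node runtime is configured.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
```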
Example with Kubernetes Secret:
Store webUsername / webPassword in a Kubernetes Secret rather
than inline plaintext:
apiVersion: nvidia.com/v1alpha1
kind: NsightStreamer
metadata:
  name: nsight-viewer
  namespace: my-namespace
spec:
  webUsername:
    secretName: nsight-streamer-auth
    secretKey: username
  webPassword:
    secretName: nsight-streamer-auth
    secretKey: password
  cloudStorageConfigRef:
    name: nsight-cloud-storage
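The Secret referenced above has to exist before the streamer can read it. A minimal sketch of a matching Secret follows, using the same name and keys as the example (nsight-streamer-auth, username, password). It assumes the Secret lives in the same namespace as the NsightStreamer CR; the values are placeholders.

```yaml
# Sketch only: Secret holding the browser credentials referenced by
# webUsername.secretName / webPassword.secretName in the example above.
apiVersion: v1
kind: Secret
metadata:
  name: nsight-streamer-auth
  namespace: my-namespace   # assumed: same namespace as the NsightStreamer CR
type: Opaque
stringData:                 # stringData accepts plain text; Kubernetes stores it base64-encoded
  username: "<USERNAME>"
  password: "<PASSWORD>"
```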
NsightGateway
API Version: nvidia.com/v1alpha1 | Kind: NsightGateway | Short Name: ngw
Deploys an Envoy-based gateway that provides a unified HTTP entry point for the Coordinator and Analysis REST APIs. The operator controller reconciles this CR to create a Deployment, Service, and Envoy configuration ConfigMap. The gateway automatically discovers NsightCoordinator and NsightAnalysis CRs in the same namespace and routes traffic to them.
| Field | Description | Default | Helm Value |
|---|---|---|---|
| | Container image for the Envoy gateway | | |
| | Gateway service port | | |
| | Virtual host domains for routing | | |
| | Reference to a Kubernetes TLS secret to enable HTTPS. The secret must contain | – | |
| | Kubernetes service type (ClusterIP, NodePort, LoadBalancer) | | |
| | Annotations for the gateway service | | |
| | Enable JWT authentication | | |
| | Expected JWT issuer (iss claim) | | |
| | Expected JWT audiences (aud claim) | | |
| | Reference to a Secret containing the JWKS (key: | – | |
| | Enable API key authentication | | |
| | API key value (prefer | | |
| | Reference to a Secret containing the API key | – | |
| | Enable OAuth2 authentication | | |
| | OIDC issuer URL. Endpoints and JWKS are auto-discovered. | | |
| | OAuth2 client ID | | |
| | Reference to a Kubernetes Secret containing the OAuth2 client secret and HMAC secret. The secret must have keys named | – | Chart-managed from |
| | OAuth2 scopes to request | | |
| | URL prefix for coordinator routes | | |
| | URL prefix for analysis routes | | |
| | URL prefix for NsightTenantOperator routes (used by the NsightCloudUI) | | |
| | URL prefix for NsightStreamer routes (used by the NsightCloudUI to reach launched streamers) | | |
| | Reference to an NsightCoordinator CR (auto-discovered if not set) | (auto-discovered) | – |
| | Reference to an NsightAnalysis CR (auto-discovered if not set) | (auto-discovered) | – |
| | Resource limits and requests for the gateway container | | |
Example with JWT authentication:
apiVersion: nvidia.com/v1alpha1
kind: NsightGateway
metadata:
  name: nsight-operator-gateway
  namespace: my-namespace
spec:
  port: 8888
  service:
    type: LoadBalancer
  authentication:
    jwt:
      enabled: true
      issuer: https://example.com
      audiences:
        - nsight-cloud
      jwksSecretRef:
        name: my-jwks-secret
Example with API key authentication:
apiVersion: nvidia.com/v1alpha1
kind: NsightGateway
metadata:
  name: nsight-operator-gateway
  namespace: my-namespace
spec:
  service:
    type: ClusterIP
  authentication:
    apikey:
      enabled: true
      keySecretRef:
        name: my-apikey-secret
        key: api-key
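For the API key example, keySecretRef points at a Secret named my-apikey-secret and reads the key api-key. A minimal sketch of such a Secret is shown here. It assumes the Secret is created in the gateway's namespace, and the value is a placeholder.

```yaml
# Sketch only: Secret consumed by authentication.apikey.keySecretRef above.
apiVersion: v1
kind: Secret
metadata:
  name: my-apikey-secret
  namespace: my-namespace   # assumed: same namespace as the NsightGateway CR
type: Opaque
stringData:
  api-key: "<API-KEY>"      # placeholder; the key name matches keySecretRef.key
```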
Example with TLS and OAuth2 authentication:
apiVersion: nvidia.com/v1alpha1
kind: NsightGateway
metadata:
  name: nsight-operator-gateway
  namespace: my-namespace
spec:
  service:
    type: LoadBalancer
  tlsSecretRef:
    name: gateway-tls
  authentication:
    oauth2:
      enabled: true
      issuer: https://login.example.com
      clientId: "<client-id>"
      clientSecretRef:
        name: gateway-oauth2
      scopes:
        - openid
        - profile
        - email
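The tlsSecretRef in this example names a Secret called gateway-tls. The sketch below shows what a standard Kubernetes TLS Secret looks like, on the assumption that the gateway expects the built-in kubernetes.io/tls type with its usual tls.crt and tls.key entries. The certificate and key bodies are placeholders. The keys required in the OAuth2 Secret (gateway-oauth2) are not shown here because they are not listed in this section.

```yaml
# Sketch only: a standard Kubernetes TLS Secret for tlsSecretRef.name: gateway-tls.
# Assumes the gateway accepts the built-in kubernetes.io/tls Secret type.
apiVersion: v1
kind: Secret
metadata:
  name: gateway-tls
  namespace: my-namespace   # assumed: same namespace as the NsightGateway CR
type: kubernetes.io/tls
stringData:
  tls.crt: |
    -----BEGIN CERTIFICATE-----
    <PEM-encoded certificate>
    -----END CERTIFICATE-----
  tls.key: |
    -----BEGIN PRIVATE KEY-----
    <PEM-encoded private key>
    -----END PRIVATE KEY-----
```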
NsightTenantOperator
API Version: nvidia.com/v1alpha1 | Kind: NsightTenantOperator | Short Name: nto
Deploys the tenant-scoped FastAPI service that manages per-session Nsight Streamer launches for the Nsight Cloud UI. The operator controller reconciles this CR into a Deployment, Service, ServiceAccount, Role, and RoleBinding that allow the service to create and delete NsightStreamer resources in its namespace.
| Field | Description | Default | Helm Value |
|---|---|---|---|
| | Container image for the tenant operator service | (operator default) | |
| | Image pull policy | | |
| | References to image pull secrets | | |
| | Kubernetes service type | | |
| | Service port for the tenant operator API | | |
| | Annotations for the service | | |
| | Reference to an NsightCloudStorageConfig used for storage-backed session discovery | (auto-discovered) | – |
| | Maximum number of concurrently active streamers launched by the API. The CRD default is | | |
| | Additional environment variables to inject into launched streamer containers | | – |
| | Resource requirements for the tenant operator container | | |
| | Node selector for pod assignment | | |
| | Tolerations for pod assignment | | |
| | Affinity rules for pod assignment | | |
| | Topology spread constraints for pods | | |
| | Pod-level security context | (inherits global) | |
| | Container-level security context | (inherits global) | |
REST API
When the service is reachable through the gateway (under the prefix set by
spec.routing.tenantOperatorPrefix on NsightGateway,
default /tenant-operator/), it exposes the following endpoints:
| Method | Path | Description |
|---|---|---|
| | | Launch a streamer for a session. The request body describes the tool, credentials, and session ID. The controller creates a corresponding NsightStreamer CR subject to the |
| | | Return the streamer associated with the given session ID, including its URL and status. |
| | | Terminate and delete the streamer for the given session ID. |
| | | Liveness / readiness probe. |
Example:
apiVersion: nvidia.com/v1alpha1
kind: NsightTenantOperator
metadata:
  name: nsight-tenant-operator
  namespace: my-team-ns
spec:
  service:
    type: ClusterIP
    port: 8001
  streamerLaunch:
    maxActive: 3
  cloudStorageConfigRef:
    name: nsight-cloud-storage
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
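The operator creates the Role and RoleBinding for this service itself, so nothing has to be written by hand. For readers auditing permissions, the fragment below is a rough sketch of what a namespace-scoped Role granting create and delete on NsightStreamer resources could look like. The Role name, the plural resource name nsightstreamers, and the read verbs (get, list, watch) are assumptions for illustration, not the operator's actual manifest.

```yaml
# Sketch only: illustrates the kind of Role the operator reconciles.
# Name, resource plural, and read verbs are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: nsight-tenant-operator-streamers   # hypothetical name
  namespace: my-team-ns
rules:
  - apiGroups: ["nvidia.com"]
    resources: ["nsightstreamers"]         # assumed plural of NsightStreamer
    verbs: ["get", "list", "watch", "create", "delete"]
```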
NsightCloudUI
API Version: nvidia.com/v1alpha1 | Kind: NsightCloudUI | Short Name: nui
Deploys the Nsight Cloud UI, a static single-page web application for browsing profiling sessions, collections, and analysis jobs. The UI is served behind the NsightGateway and talks to the NsightTenantOperator to launch per-session streamers.
| Field | Description | Default | Helm Value |
|---|---|---|---|
| | Container image for the UI service | (operator default) | |
| | Image pull policy | | |
| | References to image pull secrets | | |
| | Number of UI replicas | | |
| | Kubernetes service type | | |
| | Service port | | |
| | Container target port | | |
| | Annotations for the service | | |
| | Labels for the service | | |
| | URL or gateway-relative path the UI uses to reach the tenant operator. Injected into the UI as | | – |
| | Resource requirements for the UI container | | |
| | Node selector for pod assignment | | |
| | Tolerations for pod assignment | | |
| | Affinity rules for pod assignment | | |
| | Topology spread constraints for pods | | |
| | Pod-level security context | (inherits global) | |
| | Container-level security context | (inherits global) | |
Accessing the UI
When the gateway is enabled (default), the UI is served at the gateway root on
the configured gateway port (8888 by default). The gateway’s
tenantOperatorPrefix (default /tenant-operator/) and streamerPrefix
(default /streamer/) allow the UI to reach the tenant operator and launch
per-session streamers without additional configuration.
Example:
apiVersion: nvidia.com/v1alpha1
kind: NsightCloudUI
metadata:
  name: nsight-cloud-ui
  namespace: my-team-ns
spec:
  replicas: 1
  service:
    type: ClusterIP
    port: 80
    targetPort: 8080
  tenantOperatorUrl: "/tenant-operator"
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 512Mi
NsightOperatorProfileConfig
API Version: nvidia.com/v1 | Kind: NsightOperatorProfileConfig | Short Name: nopc
Defines profiling configurations and injection rules. Namespace-scoped, allowing per-tenant customization. This CRD is managed by the Injector Webhook.
| Field | Description | Default | Helm Value |
|---|---|---|---|
| | Default profile name from nsightToolConfigs | – | |
| | List of reusable nsight tool configurations | | |
| | Unique name identifying the profile | – | – |
| | Arguments for Nsight Systems | – | |
| | Regex patterns for processes to profile | | |
| | Regex patterns for processes to exclude | | |
| | Enable coordinator mode | | |
| | Coordinator service name (format: | (auto-discovered) | – |
| | Volumes to inject into profiled containers | | |
| | Volume mounts to inject | | |
| | Environment variables for profiled process only | | |
| | Environment variables for container (visible to all processes) | | |
| | Logging output: | – | – |
| | Enable OTLP mirroring (injects Envoy proxy). The CRD itself has no default; when the default profile is generated from the parent Helm chart, this is set to | – | |
| | List of injection rules | | |
| | Unique name for this injection rule | – | – |
| | Profile to use for matched pods | – | – |
| | Enable/disable this rule | | – |
| | Match namespaces by labels | – | – |
| | Match pods by labels | – | – |
| | CEL expressions for advanced matching | | – |
Example:
apiVersion: nvidia.com/v1
kind: NsightOperatorProfileConfig
metadata:
  name: team-profile-config
  namespace: my-team-ns
spec:
  defaultNsightToolConfigRef: "default-profile"
  nsightToolConfigs:
    - name: "default-profile"
      coordinator: true
      nsightToolArgs: "--python-sampling=true --cuda-graph-trace=node"
      injectionIncludePatterns:
        - ".*python.*"
        - ".*myapp.*"
      otlpMirroringEnabled: true
  injectionRules:
    - name: "profile-labeled-pods"
      objectSelector:
        matchLabels:
          nvidia-nsight-profile: enabled
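To show what the injection rule above would select, here is a sketch of a workload Pod carrying the nvidia-nsight-profile: enabled label in the same namespace. The Pod name, image, and command are hypothetical. The command is chosen so that the process name matches the ".*python.*" include pattern. Whether additional conditions (for example, namespace labels) are needed for injection depends on the rest of the configuration and is not covered by this sketch.

```yaml
# Sketch only: a Pod whose labels match objectSelector.matchLabels above.
apiVersion: v1
kind: Pod
metadata:
  name: example-workload                            # hypothetical name
  namespace: my-team-ns
  labels:
    nvidia-nsight-profile: enabled                  # matched by the "profile-labeled-pods" rule
spec:
  containers:
    - name: app
      image: registry.example.com/my-python-app:latest   # hypothetical image
      command: ["python", "train.py"]               # process name matches ".*python.*"
```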