Appendix A: Helm Chart Parameters

Refer to the Kubernetes API reference for details on customizing values in the Helm chart.

The values shown are the default values.

Deployment parameters

Refer to the Deployment API reference for details.

Name

Description

Value

affinity

[default: {}] Affinity settings for deployment.

{}

containerSecurityContext

Sets privilege and access control settings for container (Only affects the main container, not pod-level).

{}

customCommand

Overrides command line options sent to the NIM with the array listed here.

[]

customArgs

Overrides command line arguments of the NIM container with the array listed here.

[]

envVars

Adds arbitrary environment variables to the main container using key-value pairs, for example NAME: value.

{}

extraVolumes

Adds arbitrary additional volumes to the deployment set definition.

{}

extraVolumeMounts

Specify volume mounts to the main container from extraVolumes.

{}

image.repository

NIM Image Repository.

""

image.tag

Image tag or version.

""

image.pullPolicy

Image pull policy.

""

imagePullSecrets

Specify list of secret names that are needed for the main container and any init containers.

initContainers

Specify init containers, if needed.initContainers are defined as an object with the name of the container as the key. All other elements of the initContainer definition are the value.

{}

nodeSelector

Sets node selectors for the NIM – for example nvidia.com/gpu.present: "true".

{}

podAnnotations

Sets additional annotations on the main deployment pods.

{}

podSecurityContext

Specify privilege and access control settings for pod.

podSecurityContext.runAsUser

Specify user UID for pod.

1000

podSecurityContext.runAsGroup

Specify group ID for pod.

1000

podSecurityContext.fsGroup

Specify file system owner group id.

1000

replicaCount

Specify static replica count for deployment.

1

resources

Specify resources limits and requests for the running service.

resources.limits.nvidia.com/gpu

Specify number of GPUs to present to the running service.

1

serviceAccount.create

Specifies whether a service account should be created.

false

serviceAccount.annotations

Sets annotations to be added to the service account.

{}

serviceAccount.name

Specifies the name of the service account to use. If it is not set and create is true, a name is generated using a fullname template.

""

statefulSet.enabled

Enables statefulset deployment. Enabling statefulSet allows PVC templates for scaling. If using central PVC with RWX accessMode, this isn’t needed.

false

tolerations

Specify tolerations for pod assignment. Allows the scheduler to schedule pods with matching taints.

Autoscaling parameters

Values used for creating a Horizontal Pod Autoscaler. If autoscaling is not enabled, the rest are ignored. NVIDIA recommends that you use the custom metrics API, commonly implemented with the prometheus-adapter. Standard metrics of CPU and memory are of limited use in scaling NIM.

Refer to the HorizontalPodAutoscaler API reference for details.

Name

Description

Value

autoscaling.enabled

Enables horizontal pod autoscaler.

false

autoscaling.minReplicas

Specify minimum replicas for autoscaling.

1

autoscaling.maxReplicas

Specify maximum replicas for autoscaling.

10

autoscaling.metrics

Array of metrics for autoscaling.

[]

Ingress parameters

Refer to the Ingress API reference for details.

Name

Description

Value

ingress.enabled

Enables ingress.

false

ingress.className

Specify class name for Ingress.

""

ingress.annotations

Specify additional annotations for ingress.

{}

ingress.hosts

Specify list of hosts each containing lists of paths.

ingress.hosts[0].host

Specify name of host.

chart-example.local

ingress.hosts[0].paths[0].path

Specify ingress path.

/

ingress.hosts[0].paths[0].pathType

Specify path type.

ImplementationSpecific

ingress.tls

Specify list of pairs of TLS secretName and hosts.

[]

Probe parameters

Refer to the Pod Probe API reference for details.

Name

Description

Value

livenessProbe.enabled

Enables livenessProbe

true

livenessProbe.path

livenessProbe endpoint path

/v1/health/live

livenessProbe.initialDelaySeconds

Initial delay seconds for livenessProbe

15

livenessProbe.timeoutSeconds

Timeout seconds for livenessProbe

1

livenessProbe.periodSeconds

Period seconds for livenessProbe

10

livenessProbe.successThreshold

Success threshold for livenessProbe

1

livenessProbe.failureThreshold

Failure threshold for livenessProbe

3

readinessProbe.enabled

Enables readinessProbe

true

readinessProbe.path

readinessProbe endpoint path

/v1/health/ready

readinessProbe.initialDelaySeconds

Initial delay seconds for readinessProbe

15

readinessProbe.timeoutSeconds

Timeout seconds for readinessProbe

1

readinessProbe.periodSeconds

Period seconds for readinessProbe

10

readinessProbe.successThreshold

Success threshold for readinessProbe

1

readinessProbe.failureThreshold

Failure threshold for readinessProbe

3

startupProbe.enabled

Enables startupProbe

true

startupProbe.path

startupProbe endpoint path

/v1/health/ready

startupProbe.initialDelaySeconds

Initial delay seconds for startupProbe

40

startupProbe.timeoutSeconds

Timeout seconds for startupProbe

1

startupProbe.periodSeconds

Period seconds for startupProbe

10

startupProbe.successThreshold

Success threshold for startupProbe

1

startupProbe.failureThreshold

Failure threshold for startupProbe

180

Metrics parameters

Refer to the ServiceMonitor API reference for details.

Name

Description

Value

metrics.port

For NIMs with a separate metrics port, this opens that port on the container

0

serviceMonitor

Options for serviceMonitor to use the Prometheus Operator and the primary service object.

metrics.serviceMonitor.enabled

Enables serviceMonitor creation.

false

metrics.serviceMonitor.additionalLabels

Specify additional labels for ServiceMonitor.

{}

NIM parameters

Name

Description

Value

nim.nimCache

Path to mount writeable storage or pre-filled model cache for the NIM

""

nim.modelName

Optionally specifies the name of the model in the API. This can be used in helm tests.

""

nim.ngcAPISecret

Name of pre-existing secret with a key named NGC_API_KEY that contains an API key for NGC model downloads

""

nim.ngcAPIKey

NGC API key literal to use as the API secret and image pull secret when set

""

nim.openaiPort

Specify Open AI Port, for NIM.

0

nim.httpPort

Specify HTTP Port.

8000

nim.grpcPort

Specify GRPC Port.

0

nim.labels

Specify extra labels to be added to the deployed pods.

{}

nim.jsonLogging

Whether to enable JSON lines logging. Defaults to true.

true

nim.logLevel

Log level of NIM service. Possible values of the variable are TRACE, DEBUG, INFO, WARNING, ERROR, CRITICAL.

INFO

Storage parameters

Refer to the PersistentVolumeClaim API reference and the PersistentVolume API reference for details.

Name

Description

Value

persistence

Specify settings to modify the path /model-store if model.legacyCompat is enabled else /.cache volume where the model is served from.

persistence.enabled

Enables the use of persistent volumes.

false

persistence.existingClaim

Specifies an existing persistent volume claim. If using existingClaim, run only one replica or use a ReadWriteMany storage setup.

""

persistence.storageClass

Specifies the persistent volume storage class. If set to "-", this disables dynamic provisioning. If left undefined or set to null, the cluster default storage provisioner is used.

""

persistence.accessMode

Specify accessMode. If using an NFS or similar setup, you can use ReadWriteMany.

ReadWriteOnce

persistence.stsPersistentVolumeClaimRetentionPolicy.whenDeleted

Specifies persistent volume claim retention policy when deleted. Only used with Stateful Set volume templates.

Retain

persistence.stsPersistentVolumeClaimRetentionPolicy.whenScaled

Specifies persistent volume claim retention policy when scaled. Only used with Stateful Set volume templates.

Retain

persistence.size

Specifies the size of the persistent volume claim (for example 40Gi).

50Gi

persistence.annotations

Adds annotations to the persistent volume claim.

{}

hostPath

Configures model cache on local disk on the nodes using hostPath – for special cases. You should understand the security implications before using this option.

hostPath.enabled

Enable hostPath.

false

hostPath.path

Specifies path on the node used as a hostPath volume.

/model-store

nfs

Configures the model cache to sit on shared direct-mounted NFS. NOTE: you cannot set mount options using direct NFS mount to pods without a node-intalled nfsmount.conf. An NFS-based PersistentVolumeClaim is likely better in most cases.

nfs.enabled

Enable direct pod NFS mount

false

nfs.path

Specify path on NFS server to mount

/exports

nfs.server

Specify NFS server address

nfs-server.example.com

nfs.readOnly

Set to true to mount as read-only

false

Service parameters

Refer to the Service API reference for details.

Name

Description

Value

service.type

Specifies the service type for the deployment.

ClusterIP

service.name

Overrides the default service name

""

service.openaiPort

Specifies the OpenAI API Port for the service.

0

service.httpPort

Specifies the HTTP Port for the service.

8000

service.grpcPort

Specifies the GRPC Port for the service.

0

service.metricsPort

Specifies the metrics port on the main service object. Some NIMs do not use a separate port.

0

service.annotations

Specify additional annotations to be added to service.

{}

service.labels

Specifies additional labels to be added to service.

{}