Appendix A: Helm Chart Parameters
Refer to the Kubernetes API reference for details on customizing values in the Helm chart.
The values shown are the default values.
Deployment Parameters
| Name | Description | Value |
|---|---|---|
|  | Affinity settings for the deployment. Allows you to constrain pods to nodes. |  |
|  | Specify privilege and access control settings for the container (only affects the main container). |  |
|  | Adds arbitrary environment variables to the main container, as key-value pairs. |  |
|  | Adds arbitrary additional volumes to the deployment set definition. |  |
|  | NIM-LLM image repository. |  |
|  | Image tag. |  |
|  | Image pull policy. |  |
|  | Specify secret names that are needed for the main container and any init containers. Object keys are the names of the secrets. |  |
|  | Specify labels to ensure that NeMo Inference is deployed only on certain nodes (likely best to set this to |  |
|  | Specify additional annotations for the main deployment pods. |  |
|  | Specify privilege and access control settings for the pod (only affects the main pod). |  |
|  | Specify user UID for the pod. |  |
|  | Specify group ID for the pod. |  |
|  | Specify file system owner group ID. |  |
|  | Specify replica count for the deployment. |  |
|  | Specify resource limits and requests for the running service. |  |
|  | Specify number of GPUs to present to the running service. |  |
|  | Specifies whether a service account should be created. |  |
|  | Specifies annotations to be added to the service account. |  |
|  | Specifies whether to automatically mount the service account to the container. |  |
|  | Specify name of the service account to use. If it is not set and create is true, a name is generated using a fullname template. |  |
|  | Specify tolerations for pod assignment. Allows the scheduler to schedule pods with matching taints. |  |
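
As an illustration of how the deployment parameters might be overridden, here is a minimal sketch of a values override file. The parameter names did not survive in this copy of the table, so every key below is an assumption based on common Helm chart conventions. The file name, registry path, and tag are placeholders. Check the real key names with `helm show values` for your chart version before using any of this.

```yaml
# values-override.yaml (hypothetical file name)
# All key names are assumed, not taken from this appendix.
image:
  repository: registry.example.com/nim-llm   # placeholder repository
  tag: "1.0.0"                                # placeholder tag
  pullPolicy: IfNotPresent                    # standard Kubernetes pull policy value

replicaCount: 1

resources:
  limits:
    nvidia.com/gpu: 1       # GPU resource name exposed by the NVIDIA device plugin

# Standard Kubernetes toleration so pods can land on GPU-tainted nodes.
# The taint key shown here depends on how your cluster taints its nodes.
tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
```

A file like this would typically be passed to Helm with the `-f` flag, so that only the listed values differ from the chart defaults.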
Autoscaling Parameters
Values used for autoscaling. If autoscaling is not enabled, these values are ignored. Override them on a per-model basis, based on quality-of-service metrics as well as cost metrics. Autoscaling is not recommended unless you use the custom metrics API, for example through the prometheus-adapter, because the standard CPU and memory metrics are of limited use in scaling NIM.
| Name | Description | Value |
|---|---|---|
|  | Enable horizontal pod autoscaler. |  |
|  | Specify minimum replicas for autoscaling. |  |
|  | Specify maximum replicas for autoscaling. |  |
|  | Array of metrics for autoscaling. |  |
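
The autoscaling values above could be combined along the following lines. This is a sketch under stated assumptions: the `autoscaling.*` key names are guesses, because the Name column is missing here, and `example_queue_depth` is a hypothetical custom metric that something like the prometheus-adapter would need to expose. The metric entry itself follows the Kubernetes `autoscaling/v2` metric format.

```yaml
# Key names under "autoscaling" are assumed; verify against the chart.
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 4
  metrics:
    # A per-pod custom metric in autoscaling/v2 form.
    - type: Pods
      pods:
        metric:
          name: example_queue_depth   # hypothetical metric name
        target:
          type: AverageValue
          averageValue: "10"          # scale out when the per-pod average exceeds 10
```

A per-pod metric is used in this sketch instead of CPU or memory, which matches the guidance that standard resource metrics are of limited use for scaling NIM.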
Ingress Parameters
| Name | Description | Value |
|---|---|---|
|  | Enables ingress. |  |
|  | Specify class name for ingress. |  |
|  | Specify additional annotations for ingress. |  |
|  | Specify list of hosts, each containing a list of paths. |  |
|  | Specify name of host. |  |
|  | Specify ingress path. |  |
|  | Specify path type. |  |
|  | Specify service type. It can be nemo or openai; make sure your model serves the appropriate port(s). |  |
|  | Specify list of pairs of TLS secretName and hosts. |  |
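
The nesting described above (hosts that contain paths, plus TLS pairs of secretName and hosts) might look like the following. This is only a sketch: the key names are assumed from the descriptions, and the host name, ingress class, and secret name are placeholders.

```yaml
# Key names are assumed; the host, class, and secret are placeholders.
ingress:
  enabled: true
  className: nginx                 # placeholder ingress class
  annotations: {}
  hosts:
    - host: nim.example.com
      paths:
        - path: /
          pathType: ImplementationSpecific   # Kubernetes also accepts Exact and Prefix
          serviceType: openai                # nemo or openai, per the table above
  tls:
    - secretName: nim-example-tls
      hosts:
        - nim.example.com
```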
Probe Parameters
| Name | Description | Value |
|---|---|---|
|  | Enable livenessProbe. |  |
|  | LivenessProbe http or script, but no script is currently provided. |  |
|  | LivenessProbe endpoint path. |  |
|  | Initial delay seconds for livenessProbe. |  |
|  | Timeout seconds for livenessProbe. |  |
|  | Period seconds for livenessProbe. |  |
|  | Success threshold for livenessProbe. |  |
|  | Failure threshold for livenessProbe. |  |
|  | Enable readinessProbe. |  |
|  | ReadinessProbe endpoint path. |  |
|  | Initial delay seconds for readinessProbe. |  |
|  | Timeout seconds for readinessProbe. |  |
|  | Period seconds for readinessProbe. |  |
|  | Success threshold for readinessProbe. |  |
|  | Failure threshold for readinessProbe. |  |
|  | Enable startupProbe. |  |
|  | StartupProbe endpoint path. |  |
|  | Initial delay seconds for startupProbe. |  |
|  | Timeout seconds for startupProbe. |  |
|  | Period seconds for startupProbe. |  |
|  | Success threshold for startupProbe. |  |
|  | Failure threshold for startupProbe. |  |
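
To show how the probe settings fit together, here is a hedged sketch of a startup probe and a liveness probe. The timing field names match the standard Kubernetes probe fields. The `enabled`, `method`, and `path` keys, the endpoint paths, and all numbers are assumptions chosen for illustration, not the chart defaults.

```yaml
# Endpoint paths and numbers are illustrative, not chart defaults.
startupProbe:
  enabled: true
  path: /health/ready        # assumed endpoint path
  initialDelaySeconds: 40
  timeoutSeconds: 1
  periodSeconds: 10
  successThreshold: 1        # Kubernetes requires 1 for startup probes
  failureThreshold: 180

livenessProbe:
  enabled: true
  method: http               # assumed key; the table lists http or script
  path: /health/live         # assumed endpoint path
  initialDelaySeconds: 15
  timeoutSeconds: 1
  periodSeconds: 10
  successThreshold: 1        # Kubernetes requires 1 for liveness probes
  failureThreshold: 3
```

With these numbers, the container would get roughly 40 + 180 × 10 = 1840 seconds to start before Kubernetes restarts it. A generous startup window like this is one way to accommodate long model load times without loosening the liveness check.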
Storage Parameters
| Name | Description | Value |
|---|---|---|
|  | Specify settings to modify the path |  |
|  | Enable persistent volumes. |  |
|  | Specify an existing claim. If using existingClaim, run only one replica or use a ReadWriteMany storage setup. |  |
|  | Specify persistent volume storage class. If null (the default), no storageClassName spec is set, choosing the default provisioner. |  |
|  | Specify whether the persistent volume should survive when the Helm chart is upgraded or deleted. |  |
|  | Set to true if you need the chart to create a PV for hostPath use cases. |  |
|  | Specify accessModes. If using an NFS or similar setup, you can use ReadWriteMany. |  |
|  | Specify size of claim (e.g. 8Gi). |  |
|  | Configures model cache on local disk on the nodes using hostPath, for special cases. Investigate and understand the security implications before using this option. |  |
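
The storage options above could be set as in the sketch below, which requests a chart-managed persistent volume claim. Only `existingClaim` appears by name in the descriptions. The `persistence` grouping and the other key names are assumptions, and the size is a placeholder.

```yaml
# The "persistence" grouping and most key names are assumed.
persistence:
  enabled: true
  existingClaim: ""          # leave empty to let the chart create a claim
  storageClass: null         # null means no storageClassName is set (default provisioner)
  accessMode: ReadWriteOnce  # ReadWriteMany is an option with NFS or similar storage
  size: 50Gi                 # placeholder; size it to the model you deploy
```

If you point `existingClaim` at a pre-created claim instead, the guidance in the table applies: run a single replica unless the underlying storage supports ReadWriteMany.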
Service Parameters
| Name | Description | Value |
|---|---|---|
|  | Specify service type for the deployment. |  |
|  | Override the default service name. |  |
|  | Specify HTTP port for the service. |  |
|  | Specify additional annotations to be added to the service. |  |
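
For completeness, the service settings might be overridden as follows. Every key name here is hypothetical, since the parameter names are missing from this copy, and the port number is a placeholder.

```yaml
# Hypothetical key names; confirm them against the chart's values.yaml.
service:
  type: ClusterIP      # standard Kubernetes service type
  name: ""             # empty keeps the chart's default service name
  httpPort: 8000       # placeholder port; the real key name may differ
  annotations: {}
```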