Configure Resource Quotas

It’s recommended that you create Kubernetes resource quotas for the namespace where the NVIDIA NIM Operator is deployed, as well as for any namespace where you deploy microservices managed by the NIM Operator.

The following is a list of default namespaces where you should consider adding resource quotas. Your cluster may use different namespaces.

  • nim-operator: The namespace where the NIM Operator is deployed.

  • nim-service: The namespace where NIM microservices are deployed.

  • nemo: The namespace where NeMo microservices are deployed.
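
One way to apply the same quota manifest to each of the namespaces listed above is a small shell loop. The following is a minimal sketch, not an official procedure. It assumes a manifest file named resource-quota.yaml (as in the procedure on this page) that does not set metadata.namespace, and it assumes your cluster uses the default namespace names. The apply_quota function and the KUBECTL variable are hypothetical helpers introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: apply one ResourceQuota manifest to several namespaces.
# Assumption: the manifest does not hard-code metadata.namespace,
# so the -n flag decides where each quota is created.

# Hypothetical override: set KUBECTL=echo to preview the commands
# instead of running them against a cluster.
KUBECTL="${KUBECTL:-kubectl}"

apply_quota() {
  local manifest="$1"
  shift
  local ns
  for ns in "$@"; do
    # Each remaining argument is treated as a namespace name.
    "$KUBECTL" apply -f "$manifest" -n "$ns"
  done
}
```

For example, `apply_quota resource-quota.yaml nim-operator nim-service nemo` would run one `kubectl apply` per namespace. Each quota keeps the name set in the manifest, so you may want a separate manifest per namespace if you prefer distinct names or limits.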

In addition to using resource quotas, make sure your cluster has enough available resources for the models and microservices you are deploying. Resource quotas can help manage and prioritize resource consumption on your cluster. However, if a cluster’s available resources are much smaller than required, you may still experience pod evictions even with resource quotas applied.
Refer to the Platform Support page for details on resource requirements.

Important

Resource quotas are required when using the NIM Operator on GKE clusters.

Create a Namespace Quota

The sample below configures a resource quota in the NIM Operator namespace, nim-operator. Update the namespace name to the namespace where you are applying the resource quota.

  1. Create a manifest for the resource quota, such as the following resource-quota.yaml file.

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: nim-operator-quota
    spec:
      hard:
        pods: 100
      scopeSelector:
        matchExpressions:
        - operator: In
          scopeName: PriorityClass
          values:
            - system-node-critical
            - system-cluster-critical
    

    Note

    This sample manifest uses a PriorityClass scope selector to manage resource usage, so the quota counts only pods that reference one of the listed priority classes.

    Refer to the Kubernetes resource quota documentation for more details on setting resource limits in your cluster.

  2. Apply the manifest:

    $ kubectl apply -f resource-quota.yaml -n nim-operator
    
  3. Optional: View the resource quota.

    $ kubectl describe -n nim-operator resourcequota
    

    Example Output

    Name:                   nim-operator-quota
    Namespace:              nim-operator
    Resource                Used   Hard
    --------                ----   ----
    configmaps              1      50
    limits.cpu              1      40
    limits.memory           256Mi  128Gi
    persistentvolumeclaims  0      10
    pods                    1      100
    requests.cpu            500m   20
    requests.memory         128Mi  64Gi
    secrets                 1      50
    services                1      20
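
The example output lists limits for more than pods, which the sample manifest in step 1 does not set. A quota along the following lines would report those Hard values. This is a hypothetical manifest reconstructed from the Hard column of the example output, not the manifest used to produce it, and the values are illustrative rather than recommended sizes.

```yaml
# Hypothetical manifest: values mirror the Hard column in the example output.
# Adjust every limit to match the capacity of your own cluster.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nim-operator-quota
spec:
  hard:
    # Object count limits
    configmaps: "50"
    persistentvolumeclaims: "10"
    pods: "100"
    secrets: "50"
    services: "20"
    # Aggregate compute limits across all pods in the namespace
    requests.cpu: "20"
    requests.memory: 64Gi
    limits.cpu: "40"
    limits.memory: 128Gi
```

When a quota sets compute limits such as requests.cpu or limits.memory, pods in that namespace generally need to declare the matching requests and limits, or Kubernetes may reject them at creation time.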