Managing NIM Services as a NIM Pipeline

About NIM Pipelines

As an alternative to managing NIM services individually using multiple NIMService custom resources, you can manage multiple NIM services using one NIMPipeline custom resource.

The following sample manifest deploys NIM for LLMs only.

apiVersion: apps.nvidia.com/v1alpha1
kind: NIMPipeline
metadata:
  name: pipeline-all
spec:
  services:
    - name: meta-llama3-8b-instruct
      enabled: true
      spec:
        image:
          repository: nvcr.io/nim/meta/llama3-8b-instruct
          tag: 1.0.3
          pullPolicy: IfNotPresent
          pullSecrets:
          - ngc-secret
        authSecret: ngc-api-secret
        storage:
          nimCache:
            name: meta-llama3-8b-instruct
            profile: ''
        replicas: 1
        resources:
          limits:
            nvidia.com/gpu: 1
        expose:
          service:
            type: ClusterIP
            port: 8000

Refer to the following table for information about the commonly modified fields:

Field                  Description                                                                   Default Value
spec.services.enabled  When set to true, the Operator deploys the NIM service.                       false
spec.services.name     Specifies a name for the NIM service.                                         None
spec.services.spec     Specifies a NIMService custom resource that represents the NIM microservice.  None
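The enabled field lets you stage a service in the pipeline without deploying it. As a minimal sketch (the service name is illustrative and the remainder of the service spec is elided), a disabled service keeps its definition in the manifest but is not created by the Operator:

    apiVersion: apps.nvidia.com/v1alpha1
    kind: NIMPipeline
    metadata:
      name: pipeline-all
    spec:
      services:
        - name: meta-llama3-8b-instruct
          enabled: false   # service is defined but not deployed
          spec:
            # ... NIMService spec unchanged ...

Setting enabled back to true and reapplying the manifest deploys the service again.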

Prerequisites

  • A NIM cache for each NIM microservice or a PVC that you can specify in the spec.storage.pvc field of the NIM service specification.
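If you use a PVC rather than a NIM cache, the storage stanza of the service spec references the claim instead. The following is a hedged sketch only; the claim name is illustrative, and it assumes a PersistentVolumeClaim that already holds the model files:

    storage:
      pvc:
        name: model-store-pvc   # illustrative name of an existing PersistentVolumeClaim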

Procedure

  1. Create a file, such as pipeline-all.yaml, with contents like the following example:

    apiVersion: apps.nvidia.com/v1alpha1
    kind: NIMPipeline
    metadata:
      name: pipeline-all
    spec:
      services:
        - name: meta-llama3-8b-instruct
          enabled: true
          spec:
            image:
              repository: nvcr.io/nim/meta/llama3-8b-instruct
              tag: 1.0.3
              pullPolicy: IfNotPresent
              pullSecrets:
              - ngc-secret
            authSecret: ngc-api-secret
            storage:
              nimCache:
                name: meta-llama3-8b-instruct
                profile: ''
            replicas: 1
            resources:
              limits:
                nvidia.com/gpu: 1
            expose:
              service:
                type: ClusterIP
                port: 8000
        - name: nv-embedqa-e5-v5
          enabled: true
          spec:
            image:
              repository: nvcr.io/nim/nvidia/nv-embedqa-e5-v5
              tag: 1.0.4
              pullPolicy: IfNotPresent
              pullSecrets:
              - ngc-secret
            authSecret: ngc-api-secret
            storage:
              nimCache:
                name: nv-embedqa-e5-v5
                profile: ''
            replicas: 1
            resources:
              limits:
                nvidia.com/gpu: 1
            expose:
              service:
                type: ClusterIP
                port: 8000
        - name: nv-rerank-mistral-4b-v3
          enabled: true
          spec:
            image:
              repository: nvcr.io/nim/nvidia/nv-rerankqa-mistral-4b-v3
              tag: 1.0.4
              pullPolicy: IfNotPresent
              pullSecrets:
              - ngc-secret
            authSecret: ngc-api-secret
            storage:
              nimCache:
                name: nv-rerankqa-mistral-4b-v3
                profile: ''
            replicas: 1
            resources:
              limits:
                nvidia.com/gpu: 1
            expose:
              service:
                type: ClusterIP
                port: 8000
    
  2. Apply the manifest:

    $ kubectl apply -n nim-service -f pipeline-all.yaml
    
  3. Optional: View information about the pipeline:

    $ kubectl describe nimpipelines.apps.nvidia.com -n nim-service
    

Refer to Verification to confirm that the NIM microservices are available.

Deleting NIM Pipelines

To delete a pipeline and remove the resources and objects associated with the services, perform the following steps:

  1. View the pipeline custom resources:

    $ kubectl get nimpipelines.apps.nvidia.com -A
    

    Example Output

    NAMESPACE    NAME          STATUS
    nim-service  pipeline-all  deployed
    
  2. Delete the custom resource:

    $ kubectl delete nimpipelines.apps.nvidia.com -n nim-service pipeline-all
    

Next Steps