Managing NIM Services as a NIM Pipeline
About NIM Pipelines
As an alternative to managing NIM services individually using multiple NIMService custom resources, you can manage multiple NIM services using one NIMPipeline custom resource.
The following sample manifest deploys NIM for LLMs only.
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMPipeline
metadata:
  name: pipeline-all
spec:
  services:
    - name: meta-llama3-8b-instruct
      enabled: true
      spec:
        image:
          repository: nvcr.io/nim/meta/llama3-8b-instruct
          tag: 1.0.0
          pullPolicy: IfNotPresent
          pullSecrets:
            - ngc-secret
        authSecret: ngc-api-secret
        storage:
          nimCache:
            name: meta-llama3-8b-instruct
            profile: ''
        replicas: 1
        resources:
          limits:
            nvidia.com/gpu: 1
        expose:
          service:
            type: ClusterIP
            port: 8000
Refer to the following table for information about the commonly modified fields:
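Field | Description | Default Value |
---|---|---|
spec.services[].enabled | When set to true, the Operator deploys the NIM service. | false |
spec.services[].name | Specifies a name for the NIM service. | None |
spec.services[].spec | Specifies a NIM service specification. | None |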
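The enabled field in the sample manifest can also be changed on a pipeline that is already applied, without editing the file. The following is a minimal sketch, assuming the pipeline is named pipeline-all in the nim-service namespace as in the examples on this page. The helper name enabled_patch is hypothetical and not part of the Operator; it only builds a standard JSON Patch document.

```shell
# Hypothetical helper (not part of the NIM Operator): prints a JSON Patch that
# sets the enabled field of one entry in spec.services.
# Arguments: <zero-based index of the service in spec.services> <true|false>
enabled_patch() {
  printf '[{"op":"replace","path":"/spec/services/%s/enabled","value":%s}]\n' "$1" "$2"
}
```

The output can be passed to kubectl patch, for example: kubectl patch nimpipelines.apps.nvidia.com pipeline-all -n nim-service --type=json -p "$(enabled_patch 0 false)". The index refers to the position of the service in the services list, so confirm the order in your own manifest before you patch.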
Prerequisites
A NIM cache for each NIM microservice, or a PVC that you can specify in the spec.storage.pvc field of the NIM service specification.
Procedure
Create a file, such as pipeline-all.yaml, with contents like the following example:
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMPipeline
metadata:
  name: pipeline-all
spec:
  services:
    - name: meta-llama3-8b-instruct
      enabled: true
      spec:
        image:
          repository: nvcr.io/nim/meta/llama3-8b-instruct
          tag: 1.0.0
          pullPolicy: IfNotPresent
          pullSecrets:
            - ngc-secret
        authSecret: ngc-api-secret
        storage:
          nimCache:
            name: meta-llama3-8b-instruct
            profile: ''
        replicas: 1
        resources:
          limits:
            nvidia.com/gpu: 1
        expose:
          service:
            type: ClusterIP
            port: 8000
    - name: nv-embedqa-e5-v5
      enabled: true
      spec:
        image:
          repository: nvcr.io/nim/nvidia/nv-embedqa-e5-v5
          tag: 1.0.0
          pullPolicy: IfNotPresent
          pullSecrets:
            - ngc-secret
        authSecret: ngc-api-secret
        storage:
          nimCache:
            name: nv-embedqa-e5-v5
            profile: ''
        replicas: 1
        resources:
          limits:
            nvidia.com/gpu: 1
        expose:
          service:
            type: ClusterIP
            port: 8000
    - name: nv-rerank-mistral-4b-v3
      enabled: true
      spec:
        image:
          repository: nvcr.io/nim/nvidia/nv-rerankqa-mistral-4b-v3
          tag: 1.0.0
          pullPolicy: IfNotPresent
          pullSecrets:
            - ngc-secret
        authSecret: ngc-api-secret
        storage:
          nimCache:
            name: nv-rerankqa-mistral-4b-v3
            profile: ''
        replicas: 1
        resources:
          limits:
            nvidia.com/gpu: 1
        expose:
          service:
            type: ClusterIP
            port: 8000
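The three service entries in the example differ only in the service name, the image repository, and the NIM cache name. The following is a minimal sketch, not an Operator feature, of generating the manifest from a shell function so that the shared fields are written once. The function names nim_service_entry and render_pipeline are hypothetical, and the fixed values (tag, secrets, GPU limit, port) are copied from the example above.

```shell
# Hypothetical helper (not part of the NIM Operator): prints one entry for
# spec.services, using the same field values as the example manifest.
# Arguments: <service-name> <image-repository> <nim-cache-name>
nim_service_entry() {
  cat <<EOF
    - name: $1
      enabled: true
      spec:
        image:
          repository: $2
          tag: 1.0.0
          pullPolicy: IfNotPresent
          pullSecrets:
            - ngc-secret
        authSecret: ngc-api-secret
        storage:
          nimCache:
            name: $3
            profile: ''
        replicas: 1
        resources:
          limits:
            nvidia.com/gpu: 1
        expose:
          service:
            type: ClusterIP
            port: 8000
EOF
}

# Prints a complete NIMPipeline manifest for the three services in the example.
render_pipeline() {
  cat <<EOF
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMPipeline
metadata:
  name: pipeline-all
spec:
  services:
EOF
  nim_service_entry meta-llama3-8b-instruct nvcr.io/nim/meta/llama3-8b-instruct meta-llama3-8b-instruct
  nim_service_entry nv-embedqa-e5-v5 nvcr.io/nim/nvidia/nv-embedqa-e5-v5 nv-embedqa-e5-v5
  nim_service_entry nv-rerank-mistral-4b-v3 nvcr.io/nim/nvidia/nv-rerankqa-mistral-4b-v3 nv-rerankqa-mistral-4b-v3
}
```

Running render_pipeline > pipeline-all.yaml writes a file equivalent to the example. Review the generated file before you apply it, because the helper does not validate that the referenced NIM caches or secrets exist.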
Apply the manifest:
$ kubectl apply -n nim-service -f pipeline-all.yaml
Optional: View information about the pipeline:
$ kubectl describe nimpipelines.apps.nvidia.com -n nim-service
Refer to Verification to confirm the NIM for LLMs microservice is available.
Deleting NIM Pipelines
To delete a pipeline and remove the resources and objects associated with the services, perform the following steps:
View the pipeline custom resources:
$ kubectl get nimpipelines.apps.nvidia.com -A
Example Output
NAMESPACE     NAME           STATUS
nim-service   pipeline-all   deployed
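If you check the pipeline from a script, you can read the STATUS column rather than inspecting the output by eye. The following is a minimal sketch, assuming the three-column NAMESPACE, NAME, STATUS layout shown in the example output; if your output has a different column order, adjust the field numbers. The function name pipeline_status is hypothetical.

```shell
# Hypothetical helper: reads the output of
# `kubectl get nimpipelines.apps.nvidia.com -A` on standard input and prints
# the STATUS value of the named pipeline. Prints nothing if no row matches.
# Assumes the column order NAMESPACE, NAME, STATUS from the example output.
# Arguments: <namespace> <pipeline-name>
pipeline_status() {
  # NR > 1 skips the header row; $1, $2, and $3 are the three columns.
  awk -v ns="$1" -v name="$2" 'NR > 1 && $1 == ns && $2 == name { print $3 }'
}
```

For example, kubectl get nimpipelines.apps.nvidia.com -A | pipeline_status nim-service pipeline-all prints the status of the pipeline created in the procedure.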
Delete the custom resource:
$ kubectl delete nimpipelines.apps.nvidia.com -n nim-service pipeline-all
Next Steps
Deploy applications to use the NIM services, such as the Sample RAG Application.