About Deploying and Running Inference on NIM#
The NeMo microservices platform simplifies the deployment and management of NIM microservices and proxies them through a single NeMo platform host endpoint. In this section, you learn how to deploy NIM microservices to your Kubernetes cluster and proxy them through that endpoint.
Tutorials#
The following guides provide detailed information on how to deploy NIM microservices, proxy them through a single API, and run inference.
- Deploy NIM microservices to your Kubernetes cluster.
- Proxy deployed NIM microservices and run inference on them.
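The deployment step above is driven by an API request to the NeMo platform. The following sketch shows what such a request body might look like; the host URL, endpoint path, and field names (`name`, `namespace`, `config`, `nim_deployment`) are illustrative assumptions, so check the deployment management API reference for the authoritative schema.

```python
import json

# Assumption: your NeMo platform base URL (replace with your own host).
NEMO_HOST = "http://nemo.example.com"

# Illustrative deployment request body; field names are assumptions,
# not the authoritative schema.
deployment = {
    "name": "llama-deployment",        # your choice of deployment name
    "namespace": "default",            # Kubernetes namespace for the NIM microservice
    "config": {
        "model": "meta/llama-3.1-8b-instruct",  # example NIM model identifier
        "nim_deployment": {
            "gpu": 1,                  # number of GPUs requested for the NIM pod
        },
    },
}

# Serialize the body as it would be sent in an HTTP POST to the
# deployment management microservice (request not sent here).
body = json.dumps(deployment)
print(body)
```

Sending this body (for example with `curl -X POST -d @body.json`) against the deployment management endpoint creates the NIM deployment in the target namespace; the exact endpoint path depends on your platform version.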
Task Guides#
Perform common tasks for deploying NIM microservices and running inference on them.
- Manage NIM deployments and their configurations.
- Discover models deployed as NIM microservices and run inference on them through the single API endpoint of the NIM Proxy microservice.
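The discovery-and-inference task above can be sketched as follows. NIM microservices expose an OpenAI-compatible API, so this sketch assumes the NIM Proxy forwards `GET /v1/models` for discovery and `POST /v1/chat/completions` for inference; the base URL and model name are placeholders, and the requests are constructed but not sent.

```python
import json

# Assumption: base URL of the NIM Proxy microservice (replace with yours).
NIM_PROXY_URL = "http://nim-proxy.example.com"

# Discovery: list models deployed behind the single proxy endpoint.
discover_url = f"{NIM_PROXY_URL}/v1/models"

# Inference: OpenAI-compatible chat completions request against one
# of the discovered models ("model" must match a discovered name).
infer_url = f"{NIM_PROXY_URL}/v1/chat/completions"
payload = {
    "model": "meta/llama-3.1-8b-instruct",  # example model identifier
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

print(discover_url)
print(infer_url)
print(json.dumps(payload))
```

In practice you would issue the `GET` first, pick a `model` value from the response, and then `POST` the payload to the chat completions path; both requests go to the same proxy host, which routes them to the matching NIM deployment.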