Configure RBAC#
Inject Istio#
Label the namespace to enable Istio injection.
kubectl label namespace <namespace> istio-injection=enabled --overwrite
Replace <namespace> with your target namespace.
Delete the existing pods to recreate them with Istio sidecar containers.
kubectl delete pod $(kubectl get pods -n <namespace> --no-headers | awk '{print $1}') -n <namespace>
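After the pods restart, it can help to confirm that each one received a sidecar. The following is a minimal sketch, assuming the sidecar runs as a regular container named istio-proxy (Istio's default name; deployments that use native sidecars place it among the init containers instead, so this check would not apply). The function name pods_missing_sidecar is hypothetical.

```shell
# List pods in a namespace that do NOT have an istio-proxy container.
# Assumption: the sidecar is a regular container named "istio-proxy".
pods_missing_sidecar() {
  # $1 = namespace
  kubectl get pods -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}' |
    awk -F'\t' '$2 !~ /(^| )istio-proxy( |$)/ { print $1 }'
}
```

Running pods_missing_sidecar <namespace> prints nothing when every pod in the namespace has the sidecar.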
Deploy Manifests#
The following sample manifest deploys a gateway and ingress virtual service.
Update the target namespace for the virtual service resource.
The sample manifest applies to NVIDIA NIM for LLMs. For other NVIDIA microservices, update the match and route entries for the microservice endpoints. For information about the microservice endpoints, refer to the documentation for that microservice.
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: rag-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http2
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: sample-vs
  namespace: <namespace>
spec:
  hosts:
  - "*"
  gateways:
  - istio-system/rag-gateway
  http:
  - match:
    - uri:
        prefix: /admin
    - uri:
        prefix: /resources
    - uri:
        prefix: /welcome
    - uri:
        prefix: /realms
    route:
    - destination:
        host: keycloak.default.svc.cluster.local
        port:
          number: 8080
  - match:
    - uri:
        prefix: /v1/completions
    - uri:
        prefix: /v1/chat/completions
    route:
    - destination:
        host: inferencing
        port:
          number: 8080
Apply the manifest.
kubectl apply -f istio-sample-manifest.yaml
Determine the Istio ingress gateway node port.
kubectl get svc -n istio-system | grep ingress
Example Output
istio-ingressgateway LoadBalancer 10.102.8.149 10.28.234.101 15021:32658/TCP,80:30611/TCP,443:31874/TCP,31400:30160/TCP,15443:32430/TCP 22h
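In the example output, the entry 80:30611/TCP means that service port 80 is exposed on node port 30611. The sketch below shows two ways to pull that value out in a script. The function names are hypothetical, and the second function assumes the service is named istio-ingressgateway, as in the example output.

```shell
# Extract the node port for a given service port from a PORT(S) string
# such as "15021:32658/TCP,80:30611/TCP".
node_port_for() {
  # $1 = service port, $2 = PORT(S) column value
  printf '%s\n' "$2" | tr ',' '\n' |
    awk -F'[:/]' -v p="$1" '$1 == p { print $2 }'
}

# Ask the API server directly instead of parsing the table output.
# Assumption: the service is named istio-ingressgateway.
ingress_http_node_port() {
  kubectl get svc istio-ingressgateway -n istio-system \
    -o jsonpath='{.spec.ports[?(@.port==80)].nodePort}'
}
```

For example, NODE_PORT=$(ingress_http_node_port) stores the port for use in later commands.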
List the worker IP addresses.
for node in `kubectl get nodes | awk '{print $1}' | grep -v NAME`; do echo $node ' ' | tr -d '\n'; kubectl describe node $node | grep -i 'internalIP:' | awk '{print $2}'; done
Example Output
nim-test-cluster-03-worker-nbhk9-56b4b888dd-8lpqd 10.120.199.16
nim-test-cluster-03-worker-nbhk9-56b4b888dd-hnrxr 10.120.199.23
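The same name and address pairs can be read in a single API call with a JSONPath query, which avoids running kubectl describe once per node. This is a sketch rather than a replacement for the command above; the function names are hypothetical.

```shell
# Print "<node-name> <internal-ip>" for every node in the cluster.
node_internal_ips() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
}

# Pick the internal IP of the first node listed.
first_node_ip() {
  node_internal_ips | awk 'NR == 1 { print $2 }'
}
```

For example, NODE_IP=$(first_node_ip) stores one address for use in later commands.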
The following manifest creates request authentication resources.
Update the target namespace.
Set the issuer in the manifest to one of the preceding node IP addresses and the Istio ingress gateway node port that maps to port 80 (30611 in the preceding example output).
---
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: nim-request-authentication
  namespace: <namespace>
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: inferencing
  jwtRules:
  - issuer: "http://10.176.21.249:30669/realms/nvidia-nim"
    jwksUri: "http://keycloak.default.svc.cluster.local:8080/realms/nvidia-nim/protocol/openid-connect/certs"
    forwardOriginalToken: true
    fromHeaders:
    - name: Authorization
      prefix: "Bearer"
  - issuer: "http://10.176.21.249/realms/nvidia-nim"
    jwksUri: "http://keycloak.default.svc.cluster.local:8080/realms/nvidia-nim/protocol/openid-connect/certs"
    forwardOriginalToken: true
    fromHeaders:
    - name: Authorization
      prefix: "Bearer"
---
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: nim-request-authentication-gw
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  jwtRules:
  - issuer: "http://10.176.21.249:30669/realms/nvidia-nim"
    jwksUri: "http://keycloak.default.svc.cluster.local:8080/realms/nvidia-nim/protocol/openid-connect/certs"
    forwardOriginalToken: true
    fromHeaders:
    - name: Authorization
      prefix: "Bearer"
  - issuer: "http://10.176.21.249/realms/nvidia-nim"
    jwksUri: "http://keycloak.default.svc.cluster.local:8080/realms/nvidia-nim/protocol/openid-connect/certs"
    forwardOriginalToken: true
    fromHeaders:
    - name: Authorization
      prefix: "Bearer"
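The issuer values in the sample use the placeholder address 10.176.21.249 and node port 30669. One way to substitute your own values is a small sed filter, sketched below. It assumes the manifest still contains exactly those sample values; the function name rewrite_issuer is hypothetical.

```shell
# Rewrite the sample issuer host and port in a manifest read from stdin.
# Assumption: the manifest still contains the sample values
# 10.176.21.249 and 30669.
rewrite_issuer() {
  # $1 = node IP address, $2 = ingress gateway node port for port 80
  sed -e "s|http://10\.176\.21\.249:30669/|http://$1:$2/|g" \
      -e "s|http://10\.176\.21\.249/|http://$1/|g"
}
```

For example, rewrite_issuer 10.120.199.16 30611 < requestAuthentication.yaml writes the updated manifest to standard output, where you can review it before saving it over the original file.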
Apply the manifest.
kubectl apply -f requestAuthentication.yaml
The following manifest creates an authorization policy resource.
Update the target namespace.
Update the rules that apply to the target microservices.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: nim-auth-policy
  namespace: <namespace>
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: inferencing
  rules:
  - from:
    - source:
        requestPrincipals: ["*"]
    to:
    - operation:
        methods: ["POST"]
        paths: ["/v1/completions*"]
    when:
    - key: request.auth.claims[realm_access][roles]
      values: ["completions"]
  - from:
    - source:
        requestPrincipals: ["*"]
    to:
    - operation:
        methods: ["POST"]
        paths: ["/v1/chat/completions*"]
    when:
    - key: request.auth.claims[realm_access][roles]
      values: ["chat"]
Apply the manifest.
kubectl apply -f authorizationPolicy.yaml
Create a token for Keycloak authentication. Update the node IP address and ingress gateway node port. The realm name in the URL must match the realm in the issuer that you configured in the request authentication manifest.
TOKEN=$(curl -s -X POST -d "client_id=nvidia-nim" -d "username=nim" -d "password=nvidia123" -d "grant_type=password" "http://10.217.19.114:30611/realms/nvidia-nim-llm/protocol/openid-connect/token" | jq -r .access_token)
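The authorization policy matches on the realm_access roles claim, so it can be useful to look inside the token before sending requests. The sketch below decodes the payload segment of a JWT without verifying its signature; it is for inspection only. The function name jwt_payload is hypothetical.

```shell
# Print the decoded JSON payload of a JWT (no signature verification).
jwt_payload() {
  # Take the second dot-separated segment and map base64url to base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url encoding omits.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
```

For example, jwt_payload "$TOKEN" | jq .realm_access.roles lists the roles in the token, which you can compare with the values in the authorization policy. The iss field in the same payload shows the issuer string that the request authentication resource has to match.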
Verify access to the microservice through the Istio gateway by using the Keycloak token.
curl -v -X POST http://10.217.19.114:30611/v1/completions -H "Authorization: Bearer $TOKEN" -H 'accept: application/json' -H 'Content-Type: application/json' -d '{ "model": "llama-2-13b-chat","prompt": "What is Kubernetes?","max_tokens": 16,"temperature": 1, "n": 1, "stream": false, "stop": "string", "frequency_penalty": 0.0 }'
Update the node IP address and ingress gateway port. Update the model name if it is not llama-2-13b-chat.
Generate some more data so it can be visualized in the next step on the Kiali dashboard.
for i in $(seq 1 100); do curl -X POST http://10.217.19.114:30611/v1/chat/completions -H 'accept: application/json' -H "Authorization: Bearer $TOKEN" -H 'Content-Type: application/json' -d '{"model": "llama-2-13b-chat","messages": [{"role": "system","content": "You are a helpful assistant."},{"role": "user", "content": "Hello!"}]}' -s -o /dev/null; done
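To check that the policy also rejects unauthorized callers, you can compare the HTTP status codes with and without a token. The following is a sketch under assumptions: the function name completions_status is hypothetical, the request body is a minimal guess at what the endpoint accepts, and the expected codes reflect Istio's usual behavior (403 for a request with no token that an authorization policy denies, 401 for a token that fails validation) rather than anything stated in this guide.

```shell
# Print only the HTTP status code of a POST to /v1/completions.
completions_status() {
  # $1 = base URL such as http://<node-ip>:<node-port>
  # $2 = optional bearer token
  body='{"model": "llama-2-13b-chat", "prompt": "ping", "max_tokens": 1}'
  if [ -n "${2:-}" ]; then
    curl -s -o /dev/null -w '%{http_code}' -X POST "$1/v1/completions" \
      -H "Authorization: Bearer $2" \
      -H 'Content-Type: application/json' -d "$body"
  else
    curl -s -o /dev/null -w '%{http_code}' -X POST "$1/v1/completions" \
      -H 'Content-Type: application/json' -d "$body"
  fi
}
```

For example, completions_status http://10.217.19.114:30611 "$TOKEN" should print a success code when the token carries the completions role, and the same call without the second argument should print a rejection code.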
Access the Kiali dashboard, specifying your client system IP address.
istioctl dashboard kiali --address <system-ip>
Open the dashboard in a browser at your system IP address (system-ip) and port 20001.
Conclusion#
This architecture deploys NVIDIA NeMo microservices behind an Istio service mesh with OIDC authentication through Keycloak. Requests reach the inference endpoints only through the ingress gateway, and role-based authorization policies control which token holders can call each endpoint.