Route Optimization

Services running in production require authentication to verify users and their access. This guide discusses the authentication practices used in this workflow, and in production services generally, in more depth.

Running a microservice in production means ensuring that only clients with permission to use the microservice can reach it. In production, a client, whether a user or another service, is typically issued an access (bearer) token that it passes in the request headers to the service URL. The service uses the token to verify that the caller is who they claim to be and that they have the right level of access to the service.
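As a concrete sketch, here is how a client attaches a bearer token to its request headers using Python's standard library. The URL and token below are placeholders, not values from this workflow:

```python
import urllib.request

# Placeholder values: a real token is issued by the IDP, and the URL is
# your service's endpoint.
access_token = "eyJhbGciOiJSUzI1NiJ9.example.payload"

req = urllib.request.Request(
    "https://my-service.example.com/infer",
    headers={"Authorization": f"Bearer {access_token}"},
)

# The service reads the Authorization header, validates the token, and
# only then executes the request.
print(req.get_header("Authorization"))
```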

The problem of authentication has, in general, been solved; building custom authentication for every service you create is therefore poor practice. It makes sense to delegate authentication to a third-party service while your service takes care of the business logic. We leverage pre-existing services called Identity Providers (IDPs) for authentication. Some popular examples of IDPs are Azure Active Directory, Amazon Cognito, and Keycloak. These identity providers use the current industry standards for authentication: OpenID Connect (OIDC) and the OAuth 2.0 protocol. In this workflow, Keycloak is used as the IDP. You can learn more about the OIDC authentication flow in the OpenID Connect documentation.

The OIDC provider, also called the OpenID Provider or Identity Provider (IdP), handles user authentication and consent as well as token issuance. The client or service requesting a user’s identity is called the Relying Party (RP). It can be, for example, a web application or, in this case, the request from the client notebook.
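For illustration, a Relying Party can request a token from Keycloak's standard OIDC token endpoint (whose path mirrors the certs URL used later in the Envoy config). The realm, client id, and user credentials below are placeholders, and actually sending the request requires a live Keycloak instance:

```python
import urllib.parse
import urllib.request

KEYCLOAK_URL = "https://keycloak.example.com"  # placeholder
KEYCLOAK_REALM = "myrealm"                     # placeholder

# Standard Keycloak OIDC token endpoint for a realm.
token_endpoint = (
    f"{KEYCLOAK_URL}/realms/{KEYCLOAK_REALM}/protocol/openid-connect/token"
)

# Resource Owner Password Credentials grant, one of the OAuth 2.0 flows;
# the client id and user credentials here are placeholders.
form = urllib.parse.urlencode({
    "grant_type": "password",
    "client_id": "my-client",
    "username": "alice",
    "password": "secret",
}).encode()

req = urllib.request.Request(token_endpoint, data=form, method="POST")
# with urllib.request.urlopen(req) as resp:              # needs a live Keycloak
#     access_token = json.load(resp)["access_token"]
```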

A JSON Web Token (JWT) is an open standard (RFC 7519) that defines a mechanism for securely transmitting information between two parties. A JWT is composed of three parts separated by periods: a header, a payload, and a signature. The signature is computed by Base64url-encoding the header and the payload, then signing the resulting string with a secret key using the cryptographic algorithm specified in the header. The signature is used to verify that the token has not been changed or modified.
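The three parts can be reproduced in a few lines of standard-library Python. This toy example signs with HMAC-SHA256 (HS256) and a shared secret, whereas an IDP like Keycloak typically signs with an asymmetric algorithm such as RS256; the claims are made up for illustration:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # JWTs use URL-safe Base64 without padding (RFC 7515).
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

secret = b"shared-secret"  # toy key; real IDPs use asymmetric keys

header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"sub": "alice", "iss": "https://idp.example"}).encode())

# Sign the "<header>.<payload>" string with the algorithm named in the header.
signing_input = f"{header}.{payload}".encode()
signature = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())

token = f"{header}.{payload}.{signature}"
print(token.count("."))  # 2 -- three dot-separated parts
```

Verifying the token simply recomputes the signature over the first two parts and compares; any change to the header or payload breaks the match.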


Envoy is an open-source service proxy for cloud-native applications. Envoy can be used as a reverse proxy to load balance HTTP and gRPC requests, i.e., as an application-level (L7) load balancer. Envoy has HTTP filters that can validate requests against an IDP before routing them. As mentioned earlier, this workflow uses Keycloak as its IDP. If the access token in the request is valid, the request is forwarded to the upstream application, in our case, the Triton server. The authentication sub-workflow looks similar to the one below.
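The filter's accept/reject decision can be sketched as follows. This is a simplified stand-in for what the jwt_authn filter does: it uses HS256 with a shared secret and checks a single claim, whereas the real filter verifies RS256 signatures against Keycloak's published JWKS keys:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def check_token(token: str, secret: bytes, expected_issuer: str) -> int:
    """Return the HTTP status the proxy would answer with: 200 (forward) or 401."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return 401  # malformed token
    # Recompute the signature over "<header>.<payload>".
    expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature)):
        return 401  # tampered with, or signed by someone else
    claims = json.loads(b64url_decode(payload))
    if claims.get("iss") != expected_issuer:
        return 401  # token from the wrong issuer
    return 200  # forward to the backend cluster

# Demo with a well-formed token and a tampered one.
secret = b"shared-secret"
h = b64url(json.dumps({"alg": "HS256"}).encode())
p = b64url(json.dumps({"iss": "https://idp.example"}).encode())
sig = b64url(hmac.new(secret, f"{h}.{p}".encode(), hashlib.sha256).digest())
good = f"{h}.{p}.{sig}"

print(check_token(good, secret, "https://idp.example"))        # 200
print(check_token(good + "x", secret, "https://idp.example"))  # 401
```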


Envoy, like any other proxy, requires a config file to route requests accordingly. The config file is created when you run the workflow Helm chart and is stored at /etc/envoy-config.yaml inside the Envoy container. The Envoy config for the deployment looks similar to the configuration below.


static_resources:
  listeners:
  - name: listener_backend
    address:
      socket_address:
        address: # [1]
        port_value: 8443
    filter_chains:
    - filters:
      - name:
        typed_config:
          "@type":
          codec_type: AUTO
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/" # [2]
                route:
                  cluster: backend_service
                  timeout: 0s
          http_filters:
          - name: envoy.filters.http.jwt_authn
            typed_config:
              "@type":
              providers:
                auth_keycloak:
                  issuer: ${KEYCLOAK_URL}/realms/${KEYCLOAK_REALM} # [3]
                  remote_jwks:
                    http_uri:
                      uri: ${KEYCLOAK_URL}/realms/${KEYCLOAK_REALM}/protocol/openid-connect/certs
                      cluster: keycloak
                      timeout: 5s
                    cache_duration:
                      seconds: 300
              rules:
              - match: {prefix: /}
                requires: {provider_name: auth_keycloak}
          - name: envoy.filters.http.router
            typed_config:
              "@type":
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type":
          common_tls_context:
            alpn_protocols: "h2"
            tls_certificates:
            - certificate_chain:
                filename: /mnt/certificates/tls.crt
              private_key:
                filename: /mnt/certificates/tls.key
  clusters:
  - name: backend_service
    type: STRICT_DNS
    lb_policy: round_robin
    http2_protocol_options: {}
    load_assignment:
      cluster_name: backend_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: riva-api # [4]
                port_value: 50051
  - name: keycloak
    connect_timeout: 0.25s
    type: STRICT_DNS
    http2_protocol_options: {}
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: keycloak
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: keycloak.nvidia-platform.svc # [5]
                port_value: 8443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type":
        sni: keycloak.nvidia-platform.svc
        common_tls_context:
          validation_context:
            trusted_ca:
              filename: /mnt/certificates/ca.crt

A few elements in the config are worth noting in the lines referenced above:

  • [1] Envoy has listeners; in this case, the proxy is listening for HTTPS requests on port 8443.

  • [2] Any request to Envoy at / is routed to the Triton service.

  • [3] Before forwarding a request to the Triton service, Envoy validates its JWT against the Keycloak provider: the jwt_authn filter fetches Keycloak's public signing keys (JWKS) from the certs endpoint and verifies the token's signature and issuer.

  • [4] and [5] define the cluster information for Triton and Keycloak.
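Putting the pieces above together: any client calling through this listener must attach its token in the Authorization header. Below is a minimal standard-library sketch; the host, port, token, and CA path are placeholders, and a real client for this workflow would speak gRPC to Triton over the TLS connection rather than plain HTTP:

```python
import http.client
import ssl

# Placeholders: substitute the real Envoy endpoint and a token from Keycloak.
ENVOY_HOST = "localhost"
ENVOY_PORT = 8443
ACCESS_TOKEN = "eyJhbGciOiJSUzI1NiJ9.example.signature"

headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

# Trust the CA that signed Envoy's serving certificate (ca.crt in the chart).
ctx = ssl.create_default_context()
# ctx.load_verify_locations("ca.crt")  # path depends on your deployment

conn = http.client.HTTPSConnection(ENVOY_HOST, ENVOY_PORT, context=ctx)
# conn.request("GET", "/", headers=headers)  # requires the live deployment
# resp = conn.getresponse()                  # 401 without a valid token
```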

The Envoy deployment can be viewed in depth at:

./deploy/charts/templates/next-item-wf-infer/envoyDeployment.yaml and ./deploy/charts/templates/next-item-wf-infer/envoyService.yaml

© Copyright 2022-2023, NVIDIA. Last updated on Apr 27, 2023.