Advanced Usage

Installer Flags and Commands

  • Available flags:

    -h, --help

    -v, --version

  • Available Commands:

    • Completion - Generates the autocompletion script for ./cnpctl_Linux_x86_64 for the specified shell. See each sub-command’s help for details on how to use the generated script.

      Usage:

      ./cnpctl_Linux_x86_64 completion [command]
      

      Available Commands:

      bash - Generate the autocompletion script for bash

      fish - Generate the autocompletion script for fish

      powershell - Generate the autocompletion script for powershell

      zsh - Generate the autocompletion script for zsh

      Flags:

      -h, --help - Help for completion

    • Create/Install - Creates the NVIDIA cloud-native platform.

      -d, --directory - String, if non-empty, write working files to this directory. (default ".")

      -f, --filename - String, the path to a file that contains the configuration to apply.

      -h, --help - Help for create

      --kubeconfig - String, the path to the kubeconfig file to use for CLI requests. Unless this flag is set, the installer uses the location given by the KUBECONFIG environment variable and falls back to the default $HOME/.kube/config location.

      -v, --verbose - Enables more detailed logging for debugging purposes.

    • Delete - Deletes the NVIDIA cloud-native platform.

      Usage:

      ./cnpctl_Linux_x86_64 delete [flags]
      

      Aliases:

      delete, destroy

      Flags:

      -d, --directory - String, if non-empty, write working files to this directory. (default ".")

      -h, --help - Help for delete

      --kubeconfig - String, the path to the kubeconfig file to use for CLI requests. Unless this flag is set, the installer uses the location given by the KUBECONFIG environment variable and falls back to the default $HOME/.kube/config location.

      -v, --verbose - Increase the verbosity.
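The kubeconfig lookup order described for the create and delete commands can be illustrated with a short sketch. This is not the installer's implementation: the function name `resolve_kubeconfig` is hypothetical, and the sketch assumes KUBECONFIG holds a single path.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_kubeconfig(flag_value: Optional[str] = None,
                       env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the kubeconfig path using the documented precedence.

    Hypothetical helper for illustration only:
      1. the value passed with the kubeconfig flag, if any
      2. the KUBECONFIG environment variable (assumed to be a single path)
      3. $HOME/.kube/config
    """
    env = os.environ if env is None else env
    if flag_value:
        return Path(flag_value)
    env_value = env.get("KUBECONFIG")
    if env_value:
        return Path(env_value)
    home = env.get("HOME") or str(Path.home())
    return Path(home) / ".kube" / "config"
```

Passing `env` explicitly keeps the function easy to check without modifying the real environment; calling it with no arguments reads `os.environ`.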

Configuration YAML Customization

The CNPack installer can be configured at install time with a configuration file. This file allows all components of the platform to be enabled/disabled and configured to meet different use cases.

Note

There is currently no dependency checking on the configuration file. If a component that another component requires is disabled, the installation will fail.

The configuration file is a YAML-formatted file with a structure similar to that of a Kubernetes resource. All of the configuration options are shown below, with documentation on how to use them.

  apiVersion: v1alpha1
  kind: nvidiaplatform
  spec:
      # The platform block contains general configuration that is important to all components.
      platform:
          # Required value specifying the wildcard domain to configure for ingress.
          # Quoted because a leading * would otherwise be read as a YAML alias.
          wildcardDomain: "*.my-cluster.my-domain.com"
          # Required value specifying the port to configure for ingress.
          externalPort: 443
          # Optional infrastructure provider configuration for AWS EKS.
          eks:
              # The region in which the cluster is installed.
              region: us-west-1

      # The ingress block configures the ingress controller.
      ingress:
          # Whether this component should be enabled. Default is True.
          enabled: True

      # The postgres block configures the postgres operator.
      postgres:
          # Whether this component should be enabled. Default is True.
          enabled: True

      # The certManager block configures the certificate management system.
      certManager:
          # Whether this component should be enabled. Default is True.
          enabled: True
          # Optional configuration for the AWS Private CA service integration.
          #
          # Dependencies:
          #   - EKS infrastructure provider configuration (spec.platform.eks)
          awsPCA:
              # Whether this component should be enabled. Default is True.
              enabled: True
              # The ARN required to communicate with the AWS Private CA service.
              arn: ...
              # The common name of the configured Private CA.
              commonName: my-cert.my-domain.com
              # The domain name of the configured Private CA.
              domainName: my-domain.com

      # The trustManager block configures the trust bundle management system.
      #
      # Dependencies:
      #   - cert-manager
      trustManager:
          # Whether this component should be enabled. Default is True.
          enabled: True

      # The keycloak block configures Keycloak as an OIDC provider.
      #
      # Dependencies:
      #   - cert-manager
      #   - postgres
      #   - ingress
      keycloak:
          # Whether this component should be enabled. Default is True.
          enabled: True
          # The persistent volume claim (PVC) spec options used to request database storage.
          # All Kubernetes PVC spec values are supported, but only the most typical are shown here.
          databaseStorage:
              # The access modes supported by your storage provider.
              accessModes:
                  - ReadWriteOnce
              # The volume mode supported by your storage provider.
              volumeMode: Filesystem
              # The amount of storage requested.
              resources:
                  requests:
                      storage: 10G
              # The name of your storage class.
              storageClassName: local-path
          # Optional value to override the hostname used to expose Keycloak.
          customHostname: my-host.my-cluster.my-domain.com
          # Optional value to set the initial admin password to a specified value.
          # By default, a random password will be generated.
          initialAdminPassword: My-Secret-Password-1

      # The prometheus block configures the Prometheus metrics service.
      #
      # Dependencies:
      #   - cert-manager
      prometheus:
          # Whether this component should be enabled. Default is True.
          enabled: True
          # The persistent volume claim (PVC) spec options used to request Prometheus storage.
          # All Kubernetes PVC spec values are supported, but only the most typical are shown here.
          databaseStorage:
              # The access modes supported by your storage provider.
              accessModes:
                  - ReadWriteOnce
              # The volume mode supported by your storage provider.
              volumeMode: Filesystem
              # The amount of storage requested.
              resources:
                  requests:
                      storage: 10G
              # The name of your storage class.
              storageClassName: local-path
          # Optional configuration for connecting Prometheus to an AWS Managed Prometheus instance.
          awsRemoteWrite:
              # The URL of the AWS Managed Prometheus service.
              url: https://...
              # The ARN required to communicate with the AWS Managed Prometheus service.
              arn: ...

      # The grafana block configures the Grafana dashboard service.
      #
      # Dependencies:
      #   - prometheus
      #   - cert-manager
      #   - ingress
      grafana:
          # Whether this component should be enabled. Default is True.
          enabled: True
          # Optional value to override the hostname used to expose Grafana.
          customHostname: my-host.my-cluster.my-domain.com

      # The elastic block configures the Elastic Cloud on Kubernetes operator.
      elastic:
          # Whether this component should be enabled. Default is True.
          enabled: True

      # The fluentbit block configures the fluentbit log aggregation service.
      #
      # Dependencies:
      #   - Infrastructure provider configuration (spec.platform.eks)
      fluentbit:
          # Whether this component should be enabled. Default is True.
          enabled: True
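Because the installer does not check dependencies itself, the dependency comments in the configuration above can be checked before installation with a small script. The following is a minimal sketch, not part of the installer. It takes the `spec` block as an already-parsed dictionary (loading the YAML file is left to the reader), and it rests on three assumptions that the documentation does not confirm: a top-level component counts as enabled unless it sets `enabled: False`, `spec.platform.eks` counts as configured whenever the block is present, and `awsPCA` is active only when its block is present. The names `DEPENDENCIES`, `is_active`, and `missing_dependencies` are hypothetical.

```python
from typing import Any, Dict, List

# Dependencies transcribed from the comments in the configuration file.
DEPENDENCIES: Dict[str, List[str]] = {
    "certManager.awsPCA": ["platform.eks"],
    "trustManager": ["certManager"],
    "keycloak": ["certManager", "postgres", "ingress"],
    "prometheus": ["certManager"],
    "grafana": ["prometheus", "certManager", "ingress"],
    "fluentbit": ["platform.eks"],
}


def component_enabled(spec: Dict[str, Any], name: str) -> bool:
    """Assumption: a component is enabled unless it sets enabled: False."""
    block = spec.get(name) or {}
    return bool(block.get("enabled", True))


def is_active(spec: Dict[str, Any], path: str) -> bool:
    """Decide whether a component or provider block is in effect."""
    if path == "platform.eks":
        # Assumption: the provider block is active when it is present.
        return bool((spec.get("platform") or {}).get("eks"))
    if path == "certManager.awsPCA":
        # Assumption: the optional block is active only when present.
        pca = (spec.get("certManager") or {}).get("awsPCA")
        return (component_enabled(spec, "certManager")
                and pca is not None
                and bool(pca.get("enabled", True)))
    return component_enabled(spec, path)


def missing_dependencies(spec: Dict[str, Any]) -> List[str]:
    """List every active component whose requirement is not active."""
    problems: List[str] = []
    for component, requirements in DEPENDENCIES.items():
        if not is_active(spec, component):
            continue
        for requirement in requirements:
            if not is_active(spec, requirement):
                problems.append(f"{component} requires {requirement}")
    return problems
```

For example, a spec that configures `platform.eks` but sets `postgres.enabled` to `False` while leaving Keycloak at its default would be reported as `keycloak requires postgres`.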

Ingress Controller Default Certificate Configuration

As part of the deployment of the HAProxy ingress controller, a secret named nvidia-ingress-kubernetes-ingress-default-cert is created in the nvidia-platform namespace. It contains the TLS certificate and TLS key used for the wildcard domain name. This certificate can be replaced with a certificate of the user’s choosing that is signed for the wildcard domain name *.my-cluster.my-domain.com.
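When choosing a replacement certificate, it can help to confirm that the hostnames exposed by the platform fall under the wildcard domain. The sketch below applies the usual TLS convention that a wildcard covers exactly one left-most DNS label. The function name `covered_by_wildcard` is hypothetical and the check is not part of the installer.

```python
def covered_by_wildcard(hostname: str, wildcard_domain: str) -> bool:
    """Return True if hostname is matched by a pattern such as *.example.com.

    Assumes the common single-label wildcard rule: the * stands for
    exactly one left-most label and never spans a dot.
    """
    host = hostname.lower().rstrip(".")
    pattern = wildcard_domain.lower().rstrip(".")
    if not pattern.startswith("*."):
        # No wildcard: only an exact match is covered.
        return host == pattern
    suffix = pattern[1:]  # for example ".my-cluster.my-domain.com"
    if not host.endswith(suffix):
        return False
    label = host[: -len(suffix)]
    return bool(label) and "." not in label
```

With the example values used in this section, `my-host.my-cluster.my-domain.com` is covered by `*.my-cluster.my-domain.com`, while the bare `my-cluster.my-domain.com` and a deeper name such as `a.b.my-cluster.my-domain.com` are not.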