Setup Guide#

This guide describes how to deploy Tokkio Workflow on Microsoft Azure. Tokkio Workflow can be deployed on Azure in several ways; this guide covers one method, which uses Tokkio’s opinionated deployment scripts. These scripts automate the setup of Tokkio Workflow and the infrastructure components it requires.

Prerequisites#

Azure Setup#

  • Azure account with admin access

  • Azure service principal that the automated deployment scripts use to authenticate with Azure

  • Azure storage account and container to hold the state of the automated deployment scripts, so that the created infrastructure can be modified or torn down later

  • A registered domain for hosting the Tokkio Application

  • Azure app certificate for SSL support

For instructions on how to set up these prerequisites, refer to the Azure documentation. You may also follow the instructions in the Environment Variables and Prerequisites Setup section.

Hardware#

Controller Instance#

Note

Instructions to set up some of these prerequisites, such as setting up an SSH key pair and passwordless sudo access, can be found at Common Setup Procedures.
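Before starting the installation, it can help to confirm that the command-line tools used later in this guide are available on the controller instance. The sketch below is only an illustration: the function name list_missing_tools is hypothetical, and the tool list (git, terraform, ssh) is an assumption inferred from the commands shown in this guide, not an official requirements list.

```shell
# Print each tool from the argument list that cannot be found on PATH.
# The function name and the tool list below are illustrative assumptions.
list_missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed tool list, inferred from the commands used in this guide.
missing=$(list_missing_tools git terraform ssh)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All listed tools found."
fi
```

Adjust the tool list to match the prerequisites that apply to your controller instance.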

Access#

  • Access to all the artifacts used to bring up the Tokkio Pipeline Application, for example, the Tokkio Application Helm chart on NGC. For instructions on how to generate a personal API key, refer to Generating a Personal API Key.

Infrastructure Layout#

Setting up Tokkio Workflow on Azure requires several Azure resources, including Virtual Machines, Network Security Groups, Application Gateways, and a Front Door CDN for hosting UI content. The diagram below shows the infrastructure layout that the deployment scripts create.

Azure Deployment Architecture

Installation Steps#

Note

Before proceeding with the installation steps, ensure you have reviewed the Deployment section to understand the deployment workflow and building blocks, including the Controller Instance, Config File, Environment Variables File, and Application Instance.

  1. Clone the NVIDIA/ACE.git repository and navigate to the Azure deployment scripts directory.

git clone --single-branch --branch 5.0.0-beta https://github.com/NVIDIA/ACE.git
cd ACE/workflows/tokkio/5.0.0-beta/scripts/one-click/azure
  2. Prepare a config file, either by copying the base config-template.yml or by copying one of the example config files available under the config-template-examples folder.

#list of example templates
config-template-examples/
└── tokkio-3streams
    ├── config-template.yml
    └── my-config.env

Copy a config template of your choice as the base config template for this installation.

cp config-template-examples/tokkio-3streams/config-template.yml ./my-config.yml
  3. Modify my-config.yml with your specific settings. For more details about each parameter, refer to Advanced Configuration.

vi my-config.yml
  4. Set up environment variables.

  • As with the config template, you can copy an example env file from the config-template-examples folder.

  • Modify the environment variables file with your specific settings. For more details about each environment variable, refer to Environment Variables and Prerequisites Setup.

cp config-template-examples/tokkio-3streams/my-config.env my-env-file.env
vi my-env-file.env
  5. Source the environment variables file.

source my-env-file.env
  6. Run the installation command.

./envbuild.sh install --tf-binary terraform --component all --config-file ./my-config.yml
  7. Capture the installation results printed at the end of the logs. The output looks like the sample below.

access_urls:
  ace_configurator_endpoint: "https://<ace_configurator_sub_domain>.<base_domain>"
  api_endpoint: "https://<api_sub_domain>.<base_domain>"
  grafana_endpoint: "https://<grafana_sub_domain>.<base_domain>"
  ui_endpoint: "https://<ui_sub_domain>.<base_domain>"
ssh_command:
  app:
    bastion: ssh -i /home/my-user/.ssh/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null <username>@<bastion-instance-ip-address>
    master: ssh -i /home/my-user/.ssh/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ProxyCommand="ssh -i /home/my-user/.ssh/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -W %h:%p <username>@<bastion-instance-ip-address>" <username>@<app-instance-ip-address>
  turn:
    master: ssh -i /home/my-user/.ssh/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null <username>@<turn-instance-ip-address>
  8. Verify the installation.

  • Log in to the application instance using the SSH command from the installation results (ssh_command.app.master).

  • Check pod status: kubectl get pods -n <application-namespace>

  • Wait for all pods to reach Ready status (may take up to 60 minutes)

  • Once all the pods are healthy, access the UI using the URL from the installation output (access_urls.ui_endpoint).

  • On first access, the browser should prompt for the permissions the UI needs to operate, such as microphone and speaker access. After you grant these permissions, the UI should load.
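While waiting for the pods in the verification step, it can be convenient to list only the pods that are not ready yet. The sketch below filters the output of kubectl get pods, assuming the default column layout (NAME, READY, STATUS, ...). The function name list_unready_pods is hypothetical and not part of the Tokkio scripts.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print the names of
# pods whose READY column (for example 0/1) shows fewer ready containers than
# expected. Pods in Completed status are skipped because they never become Ready.
list_unready_pods() {
  awk '{
    split($2, ready, "/")
    if ($3 != "Completed" && ready[1] != ready[2]) print $1
  }'
}

# Usage on the application instance:
#   kubectl get pods -n <application-namespace> --no-headers | list_unready_pods
```

An empty result suggests that every listed pod is either fully ready or completed.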

Verify Installation
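If you saved the installer output to a file, the endpoints shown in the sample installation results can be extracted instead of copied by hand. The sketch below assumes the output was saved to a file such as install.log (a hypothetical name), for example by appending 2>&1 | tee install.log to the install command, and that the saved lines follow the key: "value" layout of the sample output. The function name get_access_url is also hypothetical.

```shell
# Print the value of the last "key: value" line for the given key in a saved
# installer log, with surrounding double quotes removed.
get_access_url() {
  key="$1"
  log_file="$2"
  grep -- "${key}:" "$log_file" | tail -n 1 | sed 's/^[^:]*:[[:space:]]*//; s/"//g'
}

# Usage:
#   get_access_url ui_endpoint install.log
```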

Uninstallation Steps#

Uninstalling Just the Application#

  • Source the correct environment variables file

source my-env-file.env
  • Run the following command to uninstall only the application components.

./envbuild.sh uninstall --tf-binary terraform --component app --config-file ./my-config.yml
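Both the install and uninstall commands depend on the variables from the sourced environment file, so a quick check that none of them is empty can catch mistakes early. The sketch below assumes the environment file consists of lines of the form export NAME=value, which this guide does not confirm. The function name list_empty_env_vars is hypothetical.

```shell
# Print the name of every variable exported in the given env file that is
# empty or unset in the current shell.
# Assumes lines of the form: export NAME=value
list_empty_env_vars() {
  env_file="$1"
  sed -n 's/^[[:space:]]*export[[:space:]]\{1,\}\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$env_file" |
  while IFS= read -r name; do
    # Look up the current value of the variable called $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Usage, after running `source my-env-file.env`:
#   list_empty_env_vars my-env-file.env
```

An empty result means every variable declared in the file currently has a non-empty value in the shell.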

Uninstalling the Whole Setup#

  • Source the correct environment variables file

source my-env-file.env
  • Run the following uninstall command.

./envbuild.sh uninstall --tf-binary terraform --component all --config-file ./my-config.yml

Caution

This step tears down the entire Kubernetes cluster and all the Azure infrastructure that was brought up during the install step. Use it with caution.
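Because a full uninstall is destructive, some users may prefer to guard it with an explicit confirmation. The sketch below is one possible guard and is not part of the envbuild.sh script. The function name confirm_teardown is hypothetical.

```shell
# Ask for confirmation on stderr and succeed only if the answer is exactly "yes".
confirm_teardown() {
  printf '%s Type "yes" to continue: ' "$1" >&2
  IFS= read -r answer || return 1
  [ "$answer" = "yes" ]
}

# Usage:
#   confirm_teardown "This removes the cluster and all Azure infrastructure." &&
#     ./envbuild.sh uninstall --tf-binary terraform --component all --config-file ./my-config.yml
```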

Additional Considerations#

Cost#

Many of the resources in this setup may not fall within Azure’s Free tier. To understand the cost implications, consult the Azure pricing calculator and the Azure Cost Management and Billing documentation.

Security#

The security of Tokkio in production environments is the responsibility of the end users deploying it. When deploying in a production environment, have security experts review potential risks and threats, define the trust boundaries, secure the communication channels, integrate authentication (AuthN) and authorization (AuthZ) with appropriate access controls, keep the deployment, including the containers, up to date, and ensure that the containers are secure and free of vulnerabilities.

Essential Skills and Background#

Familiarity With Azure CSP#

  • Users should have a basic understanding of the Azure Cloud Solution Provider program, including its billing model and available services.

  • Familiarity with Azure Resource Manager (ARM) is crucial, as all services in CSP are based on the ARM deployment model.

  • Users should be aware of any CSP-specific limitations or restrictions that may affect the deployment of certain Azure services or offers.

  • Additionally, knowledge of CSP-specific tools and portals for managing customer subscriptions, provisioning services, and handling support is beneficial and will help in effectively deploying and managing Tokkio on Azure CSP.

Familiarity With Command-Line-Interface (CLI)#

  • Basic Commands: Users should be comfortable with basic command-line operations, such as navigating directories, executing scripts, and managing files.

  • Environment Configuration: Understanding how environment variables and the PATH work on Linux will greatly help in operating the OneClick script.

  • Scripting Basics: Basic scripting knowledge (e.g., shell scripting) is beneficial for understanding how the OneClick script operates and for troubleshooting any issues that may arise.
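As a small illustration of the environment configuration point above, the sketch below shows why the installation steps source the environment variables file rather than execute it: variables exported by a script that runs as a child process do not reach the calling shell. The variable DEMO_SETTING and the temporary file are hypothetical and exist only for this demonstration.

```shell
# Create a throwaway env file with one exported variable (hypothetical name).
unset DEMO_SETTING
env_file=$(mktemp)
printf 'export DEMO_SETTING="from-env-file"\n' > "$env_file"

# Executing the file runs it in a child process; the current shell is unchanged.
sh "$env_file"
echo "after running:  ${DEMO_SETTING:-<unset>}"   # prints: after running:  <unset>

# Sourcing the file (`.` is the portable spelling of `source`) runs it in the
# current shell, so the variable is now visible here.
. "$env_file"
echo "after sourcing: ${DEMO_SETTING:-<unset>}"   # prints: after sourcing: from-env-file

rm -f "$env_file"
```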

Familiarity With YAML#

  • YAML Syntax and Structure: YAML is often used for configuration files in cloud-native applications due to its readability and flexibility. The configuration templates used by the OneClick script are in YAML format, so users should be familiar with YAML syntax and structure.

Familiarity With Kubernetes Ecosystem#

The Tokkio pipeline is a cloud-native application that uses concepts such as containerization, Kubernetes, and Helm. Users need to be familiar with these to get the best results from the deployment scripts and the application.

  • Kubernetes Basics: Users should have a basic understanding of Kubernetes core concepts such as pods, services, and deployments.

  • kubectl: Users should be familiar with kubectl, the command-line tool used to interact with Kubernetes clusters, including how to query the status or logs of running application pods.

  • Helm: Users should understand Helm, the package manager for Kubernetes, which simplifies application deployment by managing charts (collections of pre-configured Kubernetes resource definitions). Knowing how to use Helm with overridden values will help in configuring the templates appropriately.