Setup Guide#

This document provides a comprehensive guide for deploying Tokkio Workflow on Amazon Web Services (AWS). While multiple approaches exist for implementing Tokkio Workflow in the AWS Cloud environment, this guide focuses on a specific deployment method utilizing Tokkio’s opinionated deployment scripts. These scripts automate the process of setting up Tokkio Workflow along with its required infrastructure components.

Prerequisites#

AWS Prerequisites#

  1. Create an IAM user with administrator access and obtain Access key ID and Secret access key

  2. Create an S3 bucket for storing deployment state

  3. Create a DynamoDB table for managing concurrent access to the deployment state

  4. Set up a domain and Route53 hosted zone for HTTPS support

If you need help setting these up, refer to the AWS documentation. See also the Environment Variables And Prerequisites Setup section for more information.
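Items 2 and 3 above can also be created from the AWS CLI rather than the console. The following is a minimal sketch, not the method prescribed by this guide: the function name `create_state_backend`, the bucket name, the table name, and the region are all placeholders, and the `LockID` string hash key is an assumption borrowed from the convention commonly used for S3-backed state locking. Confirm the expected table schema in the Environment Variables And Prerequisites Setup section before relying on it.

```shell
# Sketch only: create the state bucket and lock table with the AWS CLI.
# Arguments: <bucket-name> <table-name> <region> (all chosen by you).
create_state_backend() {
  bucket="$1"; table="$2"; region="$3"

  # S3 bucket for the deployment state. Regions other than us-east-1
  # require an explicit LocationConstraint.
  if [ "$region" = "us-east-1" ]; then
    aws s3api create-bucket --bucket "$bucket" --region "$region"
  else
    aws s3api create-bucket --bucket "$bucket" --region "$region" \
      --create-bucket-configuration "LocationConstraint=$region"
  fi

  # DynamoDB table for coordinating concurrent access to the state.
  # ASSUMPTION: a single string hash key named LockID.
  aws dynamodb create-table \
    --table-name "$table" \
    --region "$region" \
    --attribute-definitions AttributeName=LockID,AttributeType=S \
    --key-schema AttributeName=LockID,KeyType=HASH \
    --billing-mode PAY_PER_REQUEST
}
```

A call might look like `create_state_backend my-tokkio-state my-tokkio-lock us-west-2`, run with the credentials of the IAM user created in item 1.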

Hardware#

Controller Instance#

  • Ubuntu 22.04 Operating system

  • Generate an SSH key pair

  • Ensure passwordless sudo access
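The last two controller requirements can be checked from a shell on the controller instance. This is a hedged sketch: the helper names `ensure_ssh_key` and `check_passwordless_sudo` are made up for illustration, the RSA key type mirrors the `id_rsa` path that appears in the sample installation output, and the empty passphrase is an assumption made for unattended use that you may want to change.

```shell
# Create an SSH key pair unless one already exists at the given path.
ensure_ssh_key() {
  key_path="$1"
  if [ -f "$key_path" ]; then
    echo "key already exists: $key_path"
  else
    mkdir -p "$(dirname "$key_path")"
    # -N "" sets an empty passphrase (assumption: unattended use).
    ssh-keygen -t rsa -b 4096 -N "" -f "$key_path"
  fi
}

# Succeed only if sudo works without a password prompt.
# -n makes sudo fail instead of asking for a password.
check_passwordless_sudo() {
  if sudo -n true 2>/dev/null; then
    echo "passwordless sudo: ok"
  else
    echo "passwordless sudo: not configured" >&2
    return 1
  fi
}
```

For example, `ensure_ssh_key "$HOME/.ssh/id_rsa" && check_passwordless_sudo` reports whether the controller is ready on both counts.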

Access#

  • Access to all the artifacts used when bringing up the Tokkio Pipeline Application, for example the Tokkio Application Helm chart on NGC.

Infrastructure Layout#

Setting up Tokkio Workflow on AWS requires creating several AWS resources, such as EC2 instances, security groups, an Application Load Balancer, CloudFront for hosting UI content, and an S3 bucket. The picture below shows the overall layout of the components brought up on AWS.

Tokkio AWS Infrastructure

Installation steps#

  1. Clone the NVIDIA/ACE.git repository and navigate to the aws deployment scripts directory.

git clone https://github.com/NVIDIA/ACE.git
cd ACE/workflows/tokkio/4.1/scripts/one-click/aws
  2. Prepare a config file, either by copying the base config-template.yml or by copying one of the example config files in the config-template-examples folder.

Copy the config template of your choice to use as the base config for this installation.

cp config-template-examples/llm-ov-3d-rp-6x-streams/config-template.yml ./my-config.yml
  3. Modify my-config.yml with your specific settings.

vi my-config.yml
  4. Set up environment variables.

  • As with the config template, you can copy an example env file from the config-template-examples folder.

  • Modify the environment variables file with your specific settings.

cp config-template-examples/llm-ov-3d-rp-6x-streams/my-config.env my-env-file.env
vi my-env-file.env
  5. Source the environment variables file.

source my-env-file.env
  6. Run the installation command.

./envbuild.sh install --component all --config-file ./my-config.yml
  7. Capture the installation results printed at the end of the logs. The output will look like the sample below:

access_urls:
  api_endpoint: "https://<api_sub_domain>.<base_domain>"
  elasticsearch_endpoint: "https://elastic-<project_name>.<base_domain>"
  grafana_endpoint: "https://grafana-<project_name>.<base_domain>"
  kibana_endpoint: "https://kibana-<project_name>.<base_domain>"
  ui_endpoint: "https://<ui_sub_domain>.<base_domain>"
ssh_command:
  app:
    bastion: ssh -i /home/my-user/.ssh/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null <username>@<bastion-instance-ip-address>
    master: ssh -i /home/my-user/.ssh/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ProxyCommand="ssh -i /home/my-user/.ssh/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -W %h:%p <username>@<bastion-instance-ip-address>" <username>@<app-instance-ip-address>
  turn:
    master: ssh -i /home/my-user/.ssh/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null <username>@<turn-instance-ip-address>
  8. Verify installation.

  • Check pod status: kubectl get pods -n <application-namespace>

  • Wait for all pods to reach Ready status (may take up to 60 minutes)

  • Once all the pods are healthy, access the UI at the URL given by access_urls.ui_endpoint in the installation output

  • On first access, the browser should prompt for the permissions the UI needs to operate, such as microphone, speaker, or camera. Once you grant them, the UI should load.

Verify AWS Deployment
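The verification steps above can be scripted. The sketch below is illustrative rather than part of the deployment scripts: `wait_for_pods` and `check_ui` are hypothetical helper names, and it assumes `kubectl` is run where a kubeconfig for the cluster is available (for example on the app master node reached through the `ssh_command` output). The default timeout of `60m` mirrors the upper bound mentioned in the steps. Pods that belong to completed Jobs never report Ready, so `kubectl wait` may time out even on a healthy deployment; treat a timeout as a cue to inspect `kubectl get pods`, not as proof of failure.

```shell
# Block until every pod in the namespace reports the Ready condition.
# $1 = namespace, $2 = optional timeout (default 60m).
wait_for_pods() {
  kubectl wait --for=condition=Ready pods --all -n "$1" --timeout="${2:-60m}"
}

# Probe the UI endpoint and print only the HTTP status code.
# -f fails on HTTP errors, -sS stays quiet but still reports errors,
# -o /dev/null discards the response body.
# $1 = value of access_urls.ui_endpoint from the installation output.
check_ui() {
  curl -fsS -o /dev/null -w '%{http_code}\n' "$1"
}
```

A typical sequence is `wait_for_pods <application-namespace>` followed by `check_ui` with the UI URL, substituting your own values for the placeholders.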

Uninstallation Steps#

Uninstalling Just The Application#

  • Source the correct environment variables file

source my-env-file.env
  • Run the command below to uninstall only the application components

./envbuild.sh uninstall --component app --config-file ./my-config.yml
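  • Clear any persistent volumes left over from the previous installation. Persistent volumes are cluster-scoped, so a namespace flag has no effect and kubectl needs either a volume name or --all. List the volumes first, then delete the ones that belonged to the application

kubectl get pv
kubectl delete pv <pv-name>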

Uninstalling The Whole Setup#

  • Source the correct environment variables file

source my-env-file.env
  • Run the uninstall command below

./envbuild.sh uninstall --component all --config-file ./my-config.yml

Caution

This step removes the entire Kubernetes cluster and the AWS infrastructure that was brought up during the install step. Use it with caution.

Additional Considerations#

Cost#

Many of the resources in this setup may not fall under the AWS Free Tier. Check the AWS billing reference pages to understand the cost implications.

Security#

The security of Tokkio in production environments is the responsibility of the end users deploying it. When deploying in a production environment, please have security experts review any potential risks and threats; define the trust boundaries, secure the communication channels, integrate AuthN & AuthZ with appropriate access controls, keep the deployment including the containers up to date, and ensure the containers are secure and free of vulnerabilities.

Essential Skills And Background#

Familiarity With Amazon Web Services (AWS)#

Users should have a basic understanding of Amazon Web Services, including its core services and billing model. Below are some of the key areas users should be familiar with:

  • AWS Global Infrastructure: Understanding of AWS Regions and Availability Zones.

  • Identity and Access Management (IAM): Familiarity with IAM users, groups, roles, and policies for secure access control.

  • Core Services: Knowledge of fundamental AWS services such as EC2, S3, and DynamoDB for compute, storage, and database.

  • Networking: Understanding of VPC, subnets, security groups, and route tables for secure and efficient network design.

  • Management Tools: Familiarity with the AWS Management Console for setting up the one-time prerequisites.

  • Other Services: Route53 for DNS and load balancers for traffic distribution.

Familiarity with Command-Line Interface (CLI)#

  • Basic Commands: Users should be comfortable with basic command-line operations, such as navigating directories, executing scripts, and managing files.

  • Environment Configuration: Understanding how environment variables and PATH work on Linux will greatly help when operating the OneClick script.

  • Scripting Basics: Basic scripting knowledge (e.g., shell scripting) is beneficial for understanding how the OneClick script operates and for troubleshooting any issues that may arise.
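As a small illustration of the environment-variable and PATH points above, the sketch below checks that a set of variables is exported and that a set of commands is on PATH before the OneClick script is run. The helper names `require_vars` and `require_cmds` are hypothetical, and this guide does not list the exact variable names the env file defines, so the names you pass in are yours to choose.

```shell
# Report every named environment variable that is unset or empty.
# Returns non-zero if at least one is missing.
require_vars() {
  missing=0
  for name in "$@"; do
    # Portable indirect lookup of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Report every named command that cannot be found on PATH.
# Returns non-zero if at least one is missing.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "not on PATH: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For instance, after `source my-env-file.env` you could run `require_vars AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY && require_cmds git ssh`. Those two variable names are the standard ones read by AWS tooling; whether the env file uses exactly these names is an assumption.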

Familiarity with YAML#

  • YAML Syntax and Structure: YAML is often used for configuration files in cloud-native applications due to its readability and flexibility. The configuration templates used by the OneClick script are in YAML format, so users should be familiar with YAML syntax and structure.

Familiarity with the Kubernetes ecosystem#

The Tokkio pipeline is a cloud-native application that relies on containerization, Kubernetes, and Helm. Users need to be familiar with these to get the best results from the deployment scripts and the application.

  • Kubernetes Basics: Users should have a basic understanding of Kubernetes core concepts such as pods, services, and deployments.

  • kubectl: Familiarity with kubectl, the command-line tool used to interact with Kubernetes clusters, including querying the status or logs of running application pods.

  • Helm: Understanding of Helm, the package manager for Kubernetes that simplifies application deployment by managing charts (collections of pre-configured Kubernetes resource definitions). Knowing how to use Helm with override values helps in configuring the templates appropriately.
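The kubectl and Helm operations mentioned above come down to a handful of commands. The wrappers below are a sketch for orientation only: the function names are hypothetical, and the namespace, pod, and release names are whatever your deployment uses.

```shell
# List the pods in a namespace with their status and node placement.
# $1 = namespace
list_pods() {
  kubectl get pods -n "$1" -o wide
}

# Show the most recent log lines of one pod.
# $1 = namespace, $2 = pod name, $3 = optional line count (default 100)
pod_logs() {
  kubectl logs "$2" -n "$1" --tail="${3:-100}"
}

# List the Helm releases installed in a namespace.
# $1 = namespace
list_releases() {
  helm list -n "$1"
}

# Show the user-supplied override values of one release.
# $1 = namespace, $2 = release name
release_values() {
  helm get values "$2" -n "$1"
}
```

Comparing the output of `release_values` against the values you set in your config file is one way to confirm that an override actually reached the chart.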