AWS EKS Deployment Guide#
This guide provides complete step-by-step instructions for deploying CDS (Cosmos Dataset Search) on Amazon Elastic Kubernetes Service (EKS).
Overview#
This deployment uses a pre-configured Docker container that includes all necessary tools (AWS CLI, eksctl, kubectl, helm) to deploy CDS on AWS EKS. The following services will be deployed:
EKS cluster with GPU and CPU node groups
S3 bucket for storage
Milvus vector database
Cosmos-embed NIM for video embeddings
CDS service
React-based web UI
Total deployment time: approximately 30-40 minutes. Total pods deployed: 17
Prerequisites#
AWS Account Requirements#
IAM Permissions: Ability to create EKS, EC2, S3, IAM, VPC, and CloudFormation resources
Service Quotas: Sufficient quotas for GPU instances (g6.xlarge)
Required Credentials#
NGC API Key
Get this key from NGC
Required for NVIDIA container images
Docker Hub Credentials
Username and Personal Access Token (PAT)
Get a PAT from Docker Hub
AWS Credentials: One of the following:
Permanent: AWS_ACCESS_KEY_ID (starts with AKIA) + AWS_SECRET_ACCESS_KEY
Temporary: Above + AWS_SESSION_TOKEN
IAM Permissions: The ability to create and manage the following:
EKS clusters
EC2 instances (including g6.xlarge GPU instances)
S3 buckets
IAM roles and policies
VPC resources
CloudFormation stacks
Service Quotas: Ensure you have sufficient quotas for the following:
EC2 instances (specifically g6.xlarge for GPU nodes)
EBS volumes
Elastic IPs
VPC resources
Local Machine Requirements#
Docker Desktop or Docker Engine installed and running
10GB+ free disk space
Stable internet connection
Deployment Steps#
Step 1: Build Host Setup Container#
Navigate to your CDS repository and build the deployment container:
cd /path/to/cds
make build-host-setup
Expected output: Container builds successfully (~1 minute with caching)
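To confirm the image exists before starting the container in Step 3, you can list it by name (host-setup:latest is the tag this guide uses):
docker images host-setup:latest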
Step 2: Configure Environment Variables#
To configure environment variables, copy the template file to my-env.sh in the project root and fill in your credentials. This guide refers to my-env.sh throughout.
First, copy the template file to my-env.sh:
cp infra/blueprint/host-setup-docker/env_vars_template.sh my-env.sh
Then, edit the file to add the required values:
nano my-env.sh # or use your preferred editor
The file should look like the following:
# NGC and Docker credentials
NGC_API_KEY=<your-ngc-api-key>
DOCKER_USER=<your-dockerhub-username>
DOCKER_PAT=<your-dockerhub-personal-access-token>
# Deployment configuration
CLUSTER_NAME=<your-cluster-name> # Max 20 characters, alphanumeric and hyphens
AWS_REGION=us-east-2 # or your preferred region
S3_BUCKET_NAME=<unique-bucket-name> # Must be globally unique
# AWS Credentials
AWS_ACCESS_KEY_ID=<your-access-key>
AWS_SECRET_ACCESS_KEY=<your-secret-key>
AWS_SESSION_TOKEN= # Leave empty if using permanent credentials
# Custom S3 Credentials (For Data Ingestion)
# You can reuse the main credentials above; change these only if you use a different bucket with different credentials
CUSTOM_AWS_ACCESS_KEY_ID=<>
CUSTOM_AWS_SECRET_ACCESS_KEY=<>
CUSTOM_AWS_REGION=<>
CUSTOM_S3_BUCKET_NAME=<bucket-name>
CUSTOM_S3_FOLDER=<folder-name-in-bucket>
To verify the configuration, use the following command:
set -a && source my-env.sh && set +a
echo "Cluster: $CLUSTER_NAME"
echo "Region: $AWS_REGION"
Important
Environment variables are the backbone of the deployment. Set them carefully to ensure a successful deployment.
Sourcing the Shell Script#
To properly source the my-env.sh file and ensure all environment variables are exported, always use the following syntax:
set -a && source my-env.sh && set +a
This practice ensures that every variable in my-env.sh is exported into your shell environment and available to all subsequent commands. Omitting set -a leaves the variables set but not exported, which can cause subtle failures in commands that spawn child processes.
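As a quick illustration (a sketch, assuming a fresh shell and that my-env.sh defines CLUSTER_NAME), a plain source sets the variable but does not export it to child processes:
# Plain source: the variable is set but not exported
source my-env.sh
bash -c 'echo "cluster=$CLUSTER_NAME"' # empty: child shells cannot see it
# Auto-export every variable assigned while sourcing
set -a && source my-env.sh && set +a
bash -c 'echo "cluster=$CLUSTER_NAME"' # prints your cluster name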
Step 3: Start Deployment Container#
Run the deployment container in detached mode. The rest of this guide refers to this container as cds-deployment.
docker run -it -d \
--env-file my-env.sh \
-v ~/.kube:/root/.kube \
-v $(pwd)/infra/blueprint:/workspace/blueprint \
--name cds-deployment \
host-setup:latest \
/bin/bash
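Before moving on, you can confirm the container is running:
docker ps --filter name=cds-deployment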
Verifying AWS Credentials#
To verify the AWS credentials work, use the following command:
docker exec cds-deployment bash -c "aws sts get-caller-identity"
You should see your AWS account ID and user ARN.
Validating the Configuration#
To validate the configuration, use the following command:
docker exec cds-deployment bash -c "cd /workspace/blueprint/bringup && ./configuration.sh"
All credentials should be validated with checkmarks (✓).
Step 4: Create EKS Cluster#
Create the cluster with all node groups:
docker exec cds-deployment bash -c "cd /workspace/blueprint/bringup && ./cluster_up.sh -y"
This process will take approximately 15-20 minutes to complete. The following steps will be performed:
Create VPC and networking resources
Provision 4 node groups via CloudFormation:
2x g6.xlarge (GPU nodes)
1x r7i.4xlarge (high-memory for Milvus)
5x m6i.2xlarge (general compute)
1x c6i.2xlarge (optimized compute)
Configure high-performance GP3 storage class
Install EBS CSI driver
Install NVIDIA device plugin
Verifying the Nodes are Ready#
To verify the nodes are ready, use the following command:
docker exec cds-deployment bash -c "kubectl get nodes"
All 9 nodes should show “Ready” status.
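You can also confirm the two GPU nodes specifically; the bringup scripts label them role=cvs-gpu (the same selector used in the troubleshooting section below):
docker exec cds-deployment bash -c "kubectl get nodes -l role=cvs-gpu"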
Troubleshooting#
Here are solutions to common issues you may encounter when creating the EKS cluster:
“Cluster already exists” messages: The script will skip the creation steps and use the existing cluster.
CloudFormation errors: Check the AWS console for details.
Quota errors: Request service quota increases from AWS.
Step 5: Set up S3 Bucket and Permissions#
Create the S3 bucket and configure the IAM policy:
docker exec cds-deployment bash -c "cd /workspace/blueprint/bringup && ./s3_up.sh -y"
This process will take approximately 2-3 minutes to complete. The following steps will be performed:
Create the S3 bucket in your region.
Set up the OIDC provider for EKS.
Create the IAM policy for S3 access.
Create the Kubernetes service account with IAM role.
To verify the S3 bucket and permissions are set up correctly, use the following command:
docker exec cds-deployment bash -c "kubectl describe serviceaccount s3-access-sa"
The output should show the IAM role ARN annotation.
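If the annotation is missing, one quick sanity check (a sketch, not part of the bringup scripts) is to confirm the cluster has an OIDC issuer, which s3_up.sh relies on for the IAM role binding:
docker exec cds-deployment bash -c "aws eks describe-cluster --name \$CLUSTER_NAME --query 'cluster.identity.oidc.issuer' --output text"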
Step 6: Deploy Kubernetes Services#
Use the following command to deploy all CDS services:
docker exec cds-deployment bash -c "cd /workspace/blueprint/bringup && ./k8s_up.sh -y"
This process will take approximately 30-40 minutes to complete. The following services will be deployed:
Kubernetes secrets for image pulling
Cosmos-embed NIM (GPU-accelerated embedding service)
Milvus vector database (15 pods)
CDS API service
React web UI
Nginx ingress controller with TLS
To monitor the progress, use the following command in another terminal:
# Watch pods come up
docker exec cds-deployment bash -c "kubectl get pods -w"
# Check specific service
docker exec cds-deployment bash -c "kubectl get pods -l app.kubernetes.io/name=nvidia-nim-cosmos-embed"
Key stages:
Milvus components start (~5 min)
CDS and UI start (~2-3 min)
Cosmos-embed downloads model (~10-15 min) - this is the longest part
Ingress controller creates AWS load balancer (~2-3 min)
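During the model-download stage, you can stream the Cosmos-embed logs to watch progress (same label selector as above):
docker exec cds-deployment bash -c "kubectl logs -f -l app.kubernetes.io/name=nvidia-nim-cosmos-embed"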
Verifying All Pods are Ready#
To verify all pods are ready, use the following command:
docker exec cds-deployment bash -c "kubectl get pods"
All pods should show “Running” with “1/1” or “2/2” ready.
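If you prefer to block until everything is ready rather than polling, kubectl wait can do it in one command (the timeout below is a suggestion; size it to cover the model download):
docker exec cds-deployment bash -c "kubectl wait --for=condition=Ready pod --all --timeout=1800s"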
Troubleshooting#
Here are solutions to common issues you may encounter when deploying CDS services:
Cosmos-embed stays in ContainerCreating for longer than 5 minutes: This is expected; the pod is downloading a large model.
Milvus-querynode pod pending: Check whether it is scheduled on the r7i.4xlarge node.
Image pull errors: Verify that NGC_API_KEY in your environment variables is correct.
Step 7: Get Deployment URLs#
Get your ingress URL as follows:
docker exec cds-deployment bash -c "kubectl get ingress simple-ingress -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'"
To access your deployment, use the following URLs:
Web UI: https://<hostname>/cosmos-dataset-search
API: https://<hostname>/api
To test the API, use the following commands:
# Get hostname
HOSTNAME=$(docker exec cds-deployment bash -c "kubectl get ingress simple-ingress -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'")
# Test health endpoint
curl -k https://$HOSTNAME/api/v1/health
# Test pipelines endpoint
curl -k https://$HOSTNAME/api/v1/pipelines
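For a terse pass/fail check, you can print only the HTTP status code (expect 200 from the health endpoint):
curl -k -s -o /dev/null -w "%{http_code}\n" https://$HOSTNAME/api/v1/health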
Step 8: Install and Configure CDS CLI on Your Local Machine#
Use the following commands to install and configure the CDS CLI on your local machine:
# Navigate to CDS repository on your local machine
cd /path/to/cds
# Run the client setup script from bringup directory
cd infra/blueprint/bringup
bash client_up.sh -y
This process will take approximately 2-3 minutes to complete. The following steps will be performed:
Install CDS CLI from packaged source.
Create Python virtual environment with dependencies.
Automatically configure CLI with your deployment’s API endpoint.
Validate deployment by listing pipelines.
The following is the expected output:
Installing CDS CLI from source...
CDS CLI installed successfully!
CDS CLI version 0.6.0
Configuring CDS CLI...
Pipelines:
{
"pipelines": [
{
"id": "cosmos_video_search_milvus",
"enabled": true,
"missing": []
}
]
}
CDS blueprint running at <cluster-name>. Installation complete.
Verifying the CLI works#
To verify the CLI works, use the following commands:
# Activate the virtual environment
source /path/to/cds/.venv/bin/activate
# List pipelines
cds pipelines list
# List collections (will be empty initially)
cds collections list
If you see a duplicate section error, edit the config file as follows:
# Edit the config file if you have duplicates in [default] profile
nano ~/.config/cds/config # or use your preferred editor
# Should have only one [default] section with your API endpoint
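A quick way to spot duplicates before editing (assuming the config lives at the path above):
grep -n "^\[default\]" ~/.config/cds/config # more than one match means duplicate sections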
Data Ingestion#
After deployment, you can ingest videos into CDS. This section describes how to ingest the MSR-VTT sample dataset.
Setup: Source Environment Variables#
Before starting data ingestion, source your environment variables once:
# Navigate to CDS repository
cd /path/to/cds
# Source environment variables (sets all AWS credentials and S3 bucket info)
set -a && source my-env.sh && set +a
# Activate virtual environment
source .venv/bin/activate
This makes all variables available for the remaining steps (AWS credentials, S3 bucket, CUSTOM_* variables).
Step 1: Prepare the Dataset#
Download and prepare the MSR-VTT dataset using the provided configuration. The sample config in scripts has max_records set to 100, so 100 videos will be downloaded.
# Download and prepare videos
make prepare-dataset CONFIG=scripts/msrvtt_test_100.yaml
The scripts/msrvtt_test_100.yaml configuration contains the following:
source: hf
hf_repo: friedrichor/MSR-VTT
hf_config: test_1k
split: test
video_zip_filename: MSRVTT_Videos.zip
max_records: 100
id_field: video_id
video_field: video
text_field: caption
output_jsonl: ~/datasets/msrvtt_test_100.jsonl
copy_videos_to: ~/datasets/msrvtt/videos
Duration: 5-10 minutes (downloads ~22MB of videos)
The following steps will be performed:
Download MSRVTT_Videos.zip from HuggingFace
Extract videos to /tmp/msr_vtt_videos
Copy 100 videos to ~/datasets/msrvtt/videos/
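To spot-check the result, count the prepared files against the paths and max_records from the config:
ls ~/datasets/msrvtt/videos/ | wc -l # expect 100 videos
wc -l ~/datasets/msrvtt_test_100.jsonl # expect 100 records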
Note
The prepare-dataset script only supports HuggingFace datasets that provide videos as a downloadable ZIP file. For datasets without ZIPs, you’ll need to download videos separately or provide them locally.
Step 2: Upload Videos to S3#
This step configures an AWS profile using CUSTOM_* credentials and uploads videos to the S3 bucket.
Important
You MUST configure CUSTOM_* variables in my-env.sh:
Scenario 1: Same bucket as deployment (most common):
# In my-env.sh, set CUSTOM_* to reference main credentials:
CUSTOM_AWS_ACCESS_KEY_ID=... # Same as AWS_ACCESS_KEY_ID
CUSTOM_AWS_SECRET_ACCESS_KEY=... # Same as AWS_SECRET_ACCESS_KEY
CUSTOM_AWS_REGION=... # Same as AWS_REGION
CUSTOM_S3_BUCKET_NAME=... # Same bucket configured
CUSTOM_S3_FOLDER=msrvtt-videos # Folder where the videos will be uploaded
Scenario 2: Different dataset bucket with different credentials:
# In my-env.sh, set CUSTOM_* to different values:
CUSTOM_AWS_ACCESS_KEY_ID=AKIA... # Different AWS key
CUSTOM_AWS_SECRET_ACCESS_KEY=... # Different secret
CUSTOM_AWS_REGION=us-west-2 # Different region
CUSTOM_S3_BUCKET_NAME=my-other-bucket # Different bucket
CUSTOM_S3_FOLDER=videos # Folder in that bucket
Re-source the Environment#
After configuring CUSTOM_* variables, re-source the environment as follows:
set -a && source my-env.sh && set +a
source .venv/bin/activate
Install and Configure the AWS CLI#
Make sure you have the AWS CLI installed on your machine. If not, install it by following the AWS CLI installation guide.
To verify your installation, use the following command:
aws --version
The following steps require that your aws CLI is available and on your PATH.
Configure AWS CLI Credentials#
The profile you create below using aws configure set ... --profile ... works locally and does not touch your default AWS credentials. If you have not previously used the AWS CLI, you may need to run the following to set up your default credentials:
aws configure
Note
All aws commands that follow use the profile derived from your CUSTOM_* variables.
Configure AWS Profile for S3 Access#
Use the following commands to configure the AWS profile for S3 access:
# Create AWS profile for the S3 bucket
export PROFILE_NAME="${CUSTOM_S3_BUCKET_NAME}-profile"
aws configure set aws_access_key_id "${CUSTOM_AWS_ACCESS_KEY_ID}" --profile "${PROFILE_NAME}"
aws configure set aws_secret_access_key "${CUSTOM_AWS_SECRET_ACCESS_KEY}" --profile "${PROFILE_NAME}"
aws configure set region "${CUSTOM_AWS_REGION}" --profile "${PROFILE_NAME}"
Verify the S3 Bucket#
To verify the S3 bucket exists, use the following command:
# Check if bucket exists
aws s3 ls s3://$CUSTOM_S3_BUCKET_NAME/$CUSTOM_S3_FOLDER --profile $PROFILE_NAME
If the bucket doesn’t exist, create it and update the corresponding environment variables in my-env.sh.
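For example, a minimal bucket creation under the custom profile might look like the following (bucket names must be globally unique):
aws s3 mb s3://$CUSTOM_S3_BUCKET_NAME --region $CUSTOM_AWS_REGION --profile $PROFILE_NAME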
Upload Videos#
To upload videos to S3, use the following command:
# Upload videos to the bucket - adjust the local path if you saved them elsewhere
aws s3 cp ~/datasets/msrvtt/videos/ s3://$CUSTOM_S3_BUCKET_NAME/$CUSTOM_S3_FOLDER/ --recursive --profile $PROFILE_NAME
# Verify upload
echo "Videos uploaded. Checking count:"
aws s3 ls s3://$CUSTOM_S3_BUCKET_NAME/$CUSTOM_S3_FOLDER/ --profile $PROFILE_NAME | wc -l
Duration: 1-2 minutes
Step 3: Ingest Videos into CDS#
The ingest_custom_videos.sh script requires CUSTOM_* variables to be set. These variables specify which S3 bucket has the videos to ingest.
To run ingestion, use the following commands:
cd infra/blueprint
bash ingest_custom_videos.sh
The following steps will be performed using CUSTOM_* variables:
Create an AWS profile using CUSTOM_AWS_ACCESS_KEY_ID and CUSTOM_AWS_SECRET_ACCESS_KEY.
Verify and create a Kubernetes secret with these credentials.
Use CUSTOM_S3_BUCKET_NAME and CUSTOM_S3_FOLDER to locate videos.
Ingest from s3://CUSTOM_S3_BUCKET_NAME/CUSTOM_S3_FOLDER/.
Duration: 2-5 minutes for 5 videos (default limit)
The following is the expected output:
Status code 200: 5/5 100%
Processed 5 files successfully
Step 4: Verify and Configure CORS#
Before ingesting videos or using the web UI, you must verify and configure CORS (Cross-Origin Resource Sharing) on your S3 buckets. Without a correct CORS policy, the web application will not be able to load video assets from your S3 bucket, resulting in browser errors.
Run the provided verification script:
docker exec cds-deployment bash -c "cd /workspace/blueprint/bringup && ./verify_and_configure_cors.sh"
To apply CORS automatically, add the -y or --apply flag:
docker exec cds-deployment bash -c "cd /workspace/blueprint/bringup && ./verify_and_configure_cors.sh -y"
This script will perform the following steps:
Check your IAM permissions for GetBucketCors and PutBucketCors.
Verify the current CORS configuration on all buckets relevant to your deployment, including custom buckets if defined.
Offer to configure CORS interactively if needed, or apply recommended defaults automatically with -y.
Allow you to choose between a secure (allow only your ingress hostname) or permissive (*) CORS policy.
Note
By default, the deployment does not set a CORS policy on the S3 bucket. This is a security best practice so you can explicitly control access for your origins.
You should run this script at the following times:
After ingestion, if you forgot to configure CORS or if videos don’t load
Anytime you see CORS errors in your browser while testing or using the Web UI
The following is an example of the browser error if CORS is missing:
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource...
(Reason: CORS header 'Access-Control-Allow-Origin' missing)
Note
You can also configure CORS manually at any time. Refer to the Manually Configure CORS for S3 Videos section below for full details and examples of manual configuration using the AWS CLI.
Tip
For security reasons, only allow the origins you expect (for example, restrict to the actual AWS load balancer URL you obtained from kubectl get ingress). Do not use * for production unless absolutely necessary.
If your deployment uses a custom S3 bucket (via CUSTOM_S3_BUCKET_NAME), ensure you configure CORS for all buckets used by CDS (main and custom).
For manual CORS configuration, refer to the Manually Configure CORS for S3 Videos section below.
Step 5: Verify Ingestion#
To verify ingestion, use the following commands:
# List collections
cds collections list
# Access Web UI
echo "https://$(docker exec cds-deployment bash -c 'kubectl get ingress simple-ingress -o jsonpath="{.status.loadBalancer.ingress[0].hostname}"')/cosmos-dataset-search"
Step 6: Ingest Your Videos#
Follow these steps to ingest your own videos:
Upload your videos to S3:
aws s3 cp /path/to/your/videos/ s3://$CUSTOM_S3_BUCKET_NAME/$CUSTOM_S3_FOLDER/ --recursive --profile $PROFILE_NAME
Update the folder name in my-env.sh:
# Edit my-env.sh and change:
CUSTOM_AWS_ACCESS_KEY_ID=<>
CUSTOM_AWS_SECRET_ACCESS_KEY=<>
CUSTOM_AWS_REGION=<>
CUSTOM_S3_BUCKET_NAME=<>
CUSTOM_S3_FOLDER=my-videos
# Then re-source:
set -a && source my-env.sh && set +a
Run ingestion using the ingest_custom_videos.sh script:
cd infra/blueprint
bash ingest_custom_videos.sh
Note
The ingest_custom_videos.sh script creates a collection with proper S3 storage configuration, creates Kubernetes secrets for S3 access, and then ingests the videos. This is the recommended workflow for AWS EKS deployments.
Advanced Options and Configurations#
Managing Secrets#
CDS uses Kubernetes secrets to store sensitive credentials (like S3 access keys) that collections need to access videos in different buckets.
Create a Secret for S3 Access#
docker exec cds-deployment kubectl create secret generic my-s3-creds \
--from-literal=aws_access_key_id=... \
--from-literal=aws_secret_access_key=secret... \
--from-literal=aws_region=us-east-2
List Secrets#
docker exec cds-deployment kubectl get secrets
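To spot-check a stored value without dumping the whole secret (Kubernetes stores secret data base64-encoded; this assumes base64 is available in the container):
docker exec cds-deployment bash -c "kubectl get secret my-s3-creds -o jsonpath='{.data.aws_region}' | base64 -d"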
Use the Secret in a Collection#
When creating a collection that references videos in S3, specify the secret in the collection configuration:
cds collections create --pipeline cosmos_video_search_milvus \
--name "My Collection" \
--config-yaml <(echo "
tags:
storage-template: 's3://my-bucket/videos/{{filename}}'
storage-secrets: 'my-s3-creds'
")
For more details on using secrets with collections, refer to the CLI User Guide.
Manually Configure CORS for S3 Videos#
When your browser tries to load videos from S3, it makes cross-origin requests. S3 blocks these requests by default unless you explicitly configure CORS (Cross-Origin Resource Sharing) rules.
You will see the following error if CORS is not configured:
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource...
(Reason: CORS header 'Access-Control-Allow-Origin' missing). Status code: 200.
Required IAM Permissions#
To configure CORS, your AWS credentials must have these permissions on the S3 bucket:
{
"Effect": "Allow",
"Action": [
"s3:PutBucketCORS",
"s3:GetBucketCORS"
],
"Resource": "arn:aws:s3:::<your-bucket-name>"
}
If you encounter an “AccessDenied” error, contact your AWS administrator to obtain these permissions.
Configure CORS for Main S3 Bucket#
Configure CORS for your main deployment bucket (S3_BUCKET_NAME):
# Source your environment variables
set -a && source my-env.sh && set +a
# Get your ingress hostname
INGRESS_HOSTNAME=$(docker exec cds-deployment bash -c "kubectl get ingress simple-ingress -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'")
echo "Configuring CORS for bucket: $S3_BUCKET_NAME"
echo "Allowing origin: https://$INGRESS_HOSTNAME"
# Apply CORS configuration with your specific ingress origin (RECOMMENDED for security)
docker exec cds-deployment bash -c "aws s3api put-bucket-cors \
--bucket \$S3_BUCKET_NAME \
--region \$AWS_REGION \
--cors-configuration '{
\"CORSRules\": [{
\"AllowedOrigins\": [\"https://$INGRESS_HOSTNAME\"],
\"AllowedMethods\": [\"GET\", \"HEAD\"],
\"AllowedHeaders\": [\"*\"],
\"ExposeHeaders\": [\"ETag\", \"Content-Length\", \"Content-Type\"],
\"MaxAgeSeconds\": 3600
}]
}'"
# Verify CORS configuration
docker exec cds-deployment bash -c "aws s3api get-bucket-cors --bucket \$S3_BUCKET_NAME --region \$AWS_REGION"
Alternative: Wildcard CORS (less secure, allows any origin):
docker exec cds-deployment bash -c "aws s3api put-bucket-cors \
--bucket \$S3_BUCKET_NAME \
--region \$AWS_REGION \
--cors-configuration '{
\"CORSRules\": [{
\"AllowedOrigins\": [\"*\"],
\"AllowedMethods\": [\"GET\", \"HEAD\"],
\"AllowedHeaders\": [\"*\"],
\"ExposeHeaders\": [\"ETag\", \"Content-Length\", \"Content-Type\"],
\"MaxAgeSeconds\": 3600
}]
}'"
Configure CORS for Custom S3 Bucket#
If you’re using a different S3 bucket for video storage (via CUSTOM_S3_BUCKET_NAME), you must also configure CORS for that bucket:
# Source environment variables
set -a && source my-env.sh && set +a
# Get your ingress hostname
INGRESS_HOSTNAME=$(docker exec cds-deployment bash -c "kubectl get ingress simple-ingress -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'")
# Configure AWS CLI with custom dataset bucket credentials
export PROFILE_NAME="${CUSTOM_S3_BUCKET_NAME}-profile"
aws configure set aws_access_key_id "${CUSTOM_AWS_ACCESS_KEY_ID}" --profile "${PROFILE_NAME}"
aws configure set aws_secret_access_key "${CUSTOM_AWS_SECRET_ACCESS_KEY}" --profile "${PROFILE_NAME}"
aws configure set region "${CUSTOM_AWS_REGION}" --profile "${PROFILE_NAME}"
echo "Configuring CORS for custom bucket: $CUSTOM_S3_BUCKET_NAME"
echo "Allowing origin: https://$INGRESS_HOSTNAME"
# Apply CORS to custom bucket with specific origin (RECOMMENDED)
aws s3api put-bucket-cors \
--bucket $CUSTOM_S3_BUCKET_NAME \
--region $CUSTOM_AWS_REGION \
--profile $PROFILE_NAME \
--cors-configuration "{
\"CORSRules\": [{
\"AllowedOrigins\": [\"https://${INGRESS_HOSTNAME}\"],
\"AllowedMethods\": [\"GET\", \"HEAD\"],
\"AllowedHeaders\": [\"*\"],
\"ExposeHeaders\": [\"ETag\", \"Content-Length\", \"Content-Type\"],
\"MaxAgeSeconds\": 3600
}]
}"
# Verify
aws s3api get-bucket-cors --bucket $CUSTOM_S3_BUCKET_NAME --region $CUSTOM_AWS_REGION --profile $PROFILE_NAME
Alternative: Wildcard CORS for custom bucket (less secure):
aws s3api put-bucket-cors \
--bucket $CUSTOM_S3_BUCKET_NAME \
--region $CUSTOM_AWS_REGION \
--profile $PROFILE_NAME \
--cors-configuration '{
"CORSRules": [{
"AllowedOrigins": ["*"],
"AllowedMethods": ["GET", "HEAD"],
"AllowedHeaders": ["*"],
"ExposeHeaders": ["ETag", "Content-Length", "Content-Type"],
"MaxAgeSeconds": 3600
}]
}'
Test CORS Configuration#
After configuring CORS, test that videos load in the web UI:
Open the web UI in your browser.
Perform a search.
Click Use this with search on a video result.
The video should load and play without errors.
If you still see CORS errors, verify the following:
CORS configuration is applied: aws s3api get-bucket-cors --bucket <bucket-name> --region <region>
Your ingress hostname matches the allowed origin in the CORS rules
You’ve configured CORS on all buckets that store videos
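You can also exercise CORS from the command line by sending an Origin header; S3 returns Access-Control-Allow-Origin only when a rule matches. This is a sketch: substitute a real object key from your bucket.
curl -s -I -H "Origin: https://$INGRESS_HOSTNAME" "https://$CUSTOM_S3_BUCKET_NAME.s3.$CUSTOM_AWS_REGION.amazonaws.com/$CUSTOM_S3_FOLDER/<object-key>" | grep -i access-control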
Monitoring and Debugging#
Check Pod Status#
docker exec cds-deployment bash -c "kubectl get pods"
View Logs#
# CDS logs
docker exec cds-deployment bash -c "kubectl logs deployment/visual-search --tail=100"
# Cosmos-embed logs
docker exec cds-deployment bash -c "kubectl logs deployment/cosmos-embed-nvidia-nim-cosmos-embed --tail=100"
# Milvus query node logs
docker exec cds-deployment bash -c "kubectl logs deployment/milvus-querynode --tail=100"
Check Resources#
# Node status
docker exec cds-deployment bash -c "kubectl get nodes"
# Services and ingress
docker exec cds-deployment bash -c "kubectl get svc,ingress"
# Persistent volumes
docker exec cds-deployment bash -c "kubectl get pv,pvc"
Troubleshooting#
Pods Not Starting#
To check pod details, use the following command:
docker exec cds-deployment bash -c "kubectl describe pod <pod-name>"
The following are common causes of pods not starting:
Insufficient resources: Check node capacity.
Image pull errors: Verify NGC_API_KEY.
Volume mounting issues: Check PVC status.
Cosmos-embed Issues#
If the pod is pending, use the following command to check the details:
docker exec cds-deployment bash -c "kubectl describe pod -l app.kubernetes.io/name=nvidia-nim-cosmos-embed"
Check for the following:
GPU nodes available: kubectl get nodes -l role=cvs-gpu
GPU resources: One GPU per pod is required.
Proper scheduling: Pods should run on cvs-gpu labeled nodes.
If the container stays in ContainerCreating for a long time:
This is normal; it is downloading a large model (~20GB).
The process can take 10-15 minutes on the first download.
Check the logs for more details:
kubectl logs -l app.kubernetes.io/name=nvidia-nim-cosmos-embed
Ingress Not Accessible#
Check ingress status:
docker exec cds-deployment bash -c "kubectl get ingress simple-ingress"
Check load balancer:
docker exec cds-deployment bash -c "kubectl get svc -n ingress-nginx"
Note
AWS ALB takes 2-3 minutes to become accessible after creation.
Service Health Checks#
# Get ingress hostname
docker exec cds-deployment bash -c "kubectl get ingress simple-ingress -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'"
# Test API (replace <hostname> with actual hostname)
curl -k https://<hostname>/api/v1/health
CORS Issues with S3 Videos#
Problem: Videos don’t load in the web UI with CORS error in browser console:
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource...
(Reason: CORS header 'Access-Control-Allow-Origin' missing)
Cause: S3 bucket doesn’t have CORS configuration to allow the web UI to access videos.
Quick Solution: Run the CORS verification and configuration script:
docker exec cds-deployment bash -c "cd /workspace/blueprint/bringup && ./verify_and_configure_cors.sh" # add -y or --apply to apply CORS automatically
This script will perform the following steps:
Check your IAM permissions.
Verify current CORS configuration.
Guide you through configuring CORS interactively.
Manual Solution: Refer to the Manually Configure CORS for S3 Videos section for complete instructions on configuring CORS manually for both main and custom S3 buckets.
Cleanup#
Complete Removal#
When you’re done, use the following command to delete all resources:
docker exec cds-deployment bash -c "cd /workspace/blueprint/teardown && ./shutdown_sequence.sh -y"
Duration: 10-15 minutes
The following resources will be deleted:
All Kubernetes resources
EKS cluster and node groups
S3 bucket (optional - you’ll be prompted)
IAM roles and policies
CloudFormation stacks
Verify Cleanup#
To verify the cleanup, use the following command:
# Check cluster is gone
docker exec cds-deployment bash -c "eksctl get cluster --name \$CLUSTER_NAME"
# Should return: No clusters found
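A couple of additional spot checks (sketches; adjust to what you chose to delete):
# S3 bucket, if you opted to delete it (expect an error such as NoSuchBucket)
docker exec cds-deployment bash -c "aws s3 ls s3://\$S3_BUCKET_NAME"
# Leftover eksctl CloudFormation stacks for this cluster (expect no output)
docker exec cds-deployment bash -c "aws cloudformation list-stacks --stack-status-filter CREATE_COMPLETE --query \"StackSummaries[?contains(StackName, '\$CLUSTER_NAME')].StackName\" --output text"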
Quick Reference#
Essential Commands#
# Check all pods
docker exec cds-deployment bash -c "kubectl get pods"
# Check nodes
docker exec cds-deployment bash -c "kubectl get nodes"
# Get ingress URL
docker exec cds-deployment bash -c "kubectl get ingress simple-ingress"
# View logs
docker exec cds-deployment bash -c "kubectl logs <pod-name> --tail=100"
# Describe problematic pod
docker exec cds-deployment bash -c "kubectl describe pod <pod-name>"
# Verify and configure CORS (if videos don't load)
docker exec cds-deployment bash -c "cd /workspace/blueprint/bringup && ./verify_and_configure_cors.sh" # add -y or --apply to apply CORS automatically
Deployment Checklist#
[ ] AWS account with required permissions
[ ] NGC API key obtained
[ ] Docker Hub credentials ready
[ ] Docker installed and running
[ ] Host-setup container built (make build-host-setup)
[ ] Environment variables configured in my-env.sh
[ ] Deployment container started
[ ] AWS credentials validated
[ ] EKS cluster created (cluster_up.sh -y)
[ ] S3 bucket configured (s3_up.sh -y)
[ ] Kubernetes services deployed (k8s_up.sh -y)
[ ] All 17 pods in “Running” status
[ ] CORS verification (run ./verify_and_configure_cors.sh)
[ ] Deployment URLs obtained
[ ] CDS CLI installed on local machine
[ ] CLI configured and tested
Next Steps#
Next steps include installing the CDS CLI locally, configuring your API endpoint, and starting to ingest videos.