AI vWS Toolkit - Video Search and Summarization

Deployment Guide

  1. Set up a Linux VM in vCenter with the following configuration:

    • vCPUs: 8

    • Memory: 72 GB

    • vGPU Profile: 48Q

    • Storage: 512GB

    ai-vws-001.png


  2. Install Ubuntu and set up the necessary dependencies listed below:

    • vGPU GRID Driver v19.0+

    • Docker v27.5.1+

    • Docker Compose v2.32.4+

    • NVIDIA Container Toolkit v1.13.5+

    • open-vm-tools

    • openssh-server

    • dkms
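    Several of these dependencies are available from Ubuntu's standard repositories. A rough sketch, assuming a stock Ubuntu install (the vGPU GRID driver comes from your NVIDIA licensing portal, and Docker, Docker Compose, and the NVIDIA Container Toolkit should be installed from their vendors' own repositories to satisfy the version minimums above):

    ```shell
    # Illustrative only: packages available directly from Ubuntu's repos
    $ sudo apt update
    $ sudo apt install -y open-vm-tools openssh-server dkms
    ```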

  3. Blacklist nouveau driver

    $ sudo vim /etc/modprobe.d/blacklist.conf

    Add the following lines to the file:

    blacklist nouveau
    options nouveau modeset=0

    ai-vws-002.png

  4. Update initramfs, then reboot.

    $ sudo update-initramfs -u
    $ sudo reboot
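    Once the VM is back up, you can confirm the blacklist took effect. A small check (reads the kernel's loaded-module list; no nouveau entry means the blacklist worked):

    ```shell
    # /proc/modules lists every currently loaded kernel module.
    # If nouveau no longer appears, the blacklist took effect.
    if grep -qw nouveau /proc/modules; then
      echo "nouveau is still loaded - re-check /etc/modprobe.d/blacklist.conf"
    else
      echo "nouveau is not loaded"
    fi
    ```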


  5. Install your preferred remoting protocol (e.g., NoMachine, Horizon, VNC). The rest of this guide uses NoMachine.

  6. Download and install NVIDIA vGPU software.


    $ sudo dpkg -i nvidia-linux-grid-xxx_xxx.xx.xx_amd64.deb

    ai-vws-003.png

  7. Once the driver utility has completed installation, reboot, then run the nvidia-smi command to verify the driver has been installed correctly. Be sure to license the virtual machine.

    ai-vws-004.png


  1. Download the Video Search and Summarization repository from GitHub


    $ git clone https://github.com/NVIDIA-AI-Blueprints/video-search-and-summarization.git


  2. Create a Docker network that will be shared between VSS services and CV pipeline containers.


    $ docker network create vss-shared-network


  3. Log in to NVIDIA’s container registry using your NGC API Key

    # Log in to NVIDIA Container Registry
    $ docker login nvcr.io
    # Username: $oauthtoken
    # Password: <PASTE_NGC_API_KEY_HERE>
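    For scripted setups, the same login can be done non-interactively; a sketch, assuming NGC_API_KEY is exported in your shell (piping the key on stdin keeps it out of shell history):

    ```shell
    # Non-interactive login sketch; NGC_API_KEY is assumed to be exported.
    if command -v docker >/dev/null 2>&1; then
      msg="logging in to nvcr.io"
      printf '%s\n' "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin \
        || msg="login failed - check your NGC API key"
    else
      msg="docker not found - install Docker first"
    fi
    echo "$msg"
    ```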


  4. Change to the directory containing the Event Reviewer Docker Compose configuration.


    $ cd ~/video-search-and-summarization/deploy/docker/event_reviewer/


  5. Add your NGC_API_KEY to the .env file (replace $NGC_API_KEY with your own):


    $ echo "NGC_API_KEY=$NGC_API_KEY" >> .env


  6. Add your HF_TOKEN to the .env file (replace $HF_TOKEN with your own):


    $ echo "HF_TOKEN=$HF_TOKEN" >> .env
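    You can verify both keys landed in the file without printing the secret values. A hypothetical check (the demo- fallbacks below are placeholders so the sketch is self-contained; in practice both variables come from your environment):

    ```shell
    # Placeholders keep this sketch self-contained; use your real keys.
    NGC_API_KEY="${NGC_API_KEY:-demo-ngc-key}"
    HF_TOKEN="${HF_TOKEN:-demo-hf-token}"
    envfile="$(mktemp)"   # stands in for the real .env
    printf 'NGC_API_KEY=%s\nHF_TOKEN=%s\n' "$NGC_API_KEY" "$HF_TOKEN" >> "$envfile"
    # Print only the variable names, never the values
    cut -d= -f1 "$envfile"
    ```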


  7. For this project, we will need to increase the number of open files. Edit the compose.yaml file and add the following lines in the services/via-server/ulimits section:

    nofile:
      soft: 65535
      hard: 65535

    ai-vws-005.png
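    In context, the via-server service entry should end up looking roughly like this (surrounding keys abbreviated; only the nofile values are prescribed by this guide):

    ```yaml
    services:
      via-server:
        # ...existing service configuration...
        ulimits:
          nofile:
            soft: 65535
            hard: 65535
    ```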

  8. Launch the complete VSS Event Reviewer stack, including the Alert Bridge, VLM Pipeline, Alert Inspector UI, and Video Storage Toolkit. Depending on your Internet connection, this could take 20 to 30 minutes to complete.


    $ ALERT_REVIEW_MEDIA_BASE_DIR=/tmp/alert-media-dir docker compose up
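    If you prefer to get your terminal back, the stack can also be started detached and the logs followed separately (standard Docker Compose flags):

    ```shell
    $ ALERT_REVIEW_MEDIA_BASE_DIR=/tmp/alert-media-dir docker compose up -d
    $ docker compose logs -f
    ```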

    ai-vws-006.png

  9. When you see the following, the containers have started up successfully and you can continue to the next section:

    ai-vws-007.png


  1. In a new terminal session, navigate to the computer vision event detector configuration.


    $ cd ~/video-search-and-summarization/examples/cv-event-detector


  2. As with the previous service, we need to increase the open-file limit for this container as well. Edit the compose.yaml file and add the following lines in the services/via-server/ulimits section:

    nofile:
      soft: 65535
      hard: 65535

    ai-vws-008.png

  3. Launch the DeepStream computer vision pipeline and CV UI services.


    $ ALERT_REVIEW_MEDIA_BASE_DIR=/tmp/alert-media-dir docker compose up

    When you see the following, the containers have spun up successfully and you can continue on to the next step.

    ai-vws-009.png


  1. Open a web browser and navigate to the Event Inspector UI at http://localhost:7862

    ai-vws-010.png


  2. In the Examples section, click the video that shows a thumbnail of a street.

    ai-vws-011.png


  3. Review the available options (don't click anything else yet):

    ai-vws-012.png
    • CV Pipeline Parameters

      • Frame Skip - With the current setting, the CV model inspects every 5th frame. This speeds up processing, but if set too high, it could miss an event.

      • Object Detection Threshold - Sets how many detected objects (in this case, one person) must be found in the video before it starts recording events.

    • VSS Alert Parameters

      • Enable Yes/No verification (leave this checked) - The model will return either a Yes or No based on the condition

      • Enable Description (leave unchecked) - The model will add more details to the response. This will add more processing time

      • System Prompt for Alerts - A simple prompt telling the agent what to do

    ai-vws-013.png

    • Alert Prompts

      • Alert Prompt - Tells the agent what to alert on; in this case, if a person is detected, are they using the crosswalk?

      • Enable Alert Reasoning (leave unchecked) - This allows the system to not only detect an object but to understand and narrate the context of an event, providing actionable insights and reducing the “noise” of irrelevant alerts. This will also add processing time.

  4. Click the green Process Video button

    ai-vws-014.png

    You can monitor the progress below the button. Note that the first time you process a video, there won’t be a progress bar. It should take about a minute to complete.

    ai-vws-015.png


  5. Open another tab in the browser and navigate to the Alert Inspector at http://localhost:7860

    ai-vws-016.png


  6. We can see the agent has found two instances of a person and alerted on whether or not they used the crosswalk.

    ai-vws-017.png

    Click on the Chat icon next to the first entry. Note a video now loads in the Preview section.

    ai-vws-018.png

    The video presented shows a short clip of when a person was detected crossing the street.

    You can use the chat interface to ask a relevant question about what is happening in the video clip.

    ai-vws-019.png


  1. Go back to the Event Inspector tab and click the last video thumbnail under Examples.

    ai-vws-020.png


  2. In the VSS Alert Parameters, click the box next to Enable Description and Enable Alert Reasoning. Click the Process Video button.

    ai-vws-021.png


  3. This process will take a bit longer as it is producing more detailed descriptions for the responses and using reasoning for each alert. When the process has completed, click over to the Video Search and Summarization Agent tab.

  4. You can see we have a more detailed description in the VLM Response column with information on why the alert resulted in a true response.

    ai-vws-022.png


  5. You can click the chat icon to see a short clip of what triggered the alert and ask questions related to it.

© Copyright 2013-2026, NVIDIA Corporation. Last updated on Feb 26, 2026