AI vWS Toolkit - Converting PDFs to AI-Generated Podcasts

Deployment Guide

Note

Although this guide uses vCenter, NVIDIA AI vWS can be deployed on any NVIDIA vGPU-supported hypervisor. It’s assumed that all vWS requirements, including licensing, are already configured.

  1. Set up a Linux VM in vCenter with the following configuration:

    • vCPU - 16

    • Memory - 32 GB

    • vGPU Profile - 48Q

    ai-ws-001.png


  2. Install Ubuntu and set up the necessary dependencies listed below (an example install command follows the list):

    • open-vm-tools (reboot required after installation)

    • openssh-server

    • vim

    • net-tools

    • build-essential

    • dkms

    • fuse3

    • libfuse2

    ai-ws-002.png
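
    If you prefer to install the dependencies from a terminal, a minimal sketch using apt (package names taken from the list above; the reboot is for open-vm-tools) is:

    $ sudo apt update
    $ sudo apt install -y open-vm-tools openssh-server vim net-tools build-essential dkms fuse3 libfuse2
    $ sudo reboot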


  3. Blacklist the nouveau driver.


    $ sudo vim /etc/modprobe.d/blacklist.conf

    Add the following lines to the file:

    blacklist nouveau
    options nouveau modeset=0

    ai-ws-003.png

  4. Update initramfs, then reboot.


    $ sudo update-initramfs -u
    $ sudo reboot
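
    As an optional check (not part of the original steps), you can confirm after the reboot that the nouveau module is no longer loaded; the command should print nothing:

    $ lsmod | grep nouveau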


  5. Install your preferred remoting protocol (e.g., NoMachine, Horizon, VNC). The rest of this guide uses NoMachine as the remoting protocol.

  6. Download and install NVIDIA vGPU software.


    $ sudo chmod +x nvidia-linux-grid-xxx_xxx.xx.xx_amd64.deb
    $ sudo dpkg -i nvidia-linux-grid-xxx_xxx.xx.xx_amd64.deb

    ai-ws-004.png

  7. Once the driver installation completes, reboot, then run the nvidia-smi command to verify the driver is installed correctly (see the example commands after the screenshot below).

    ai-ws-005.png
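
    For example, the commands below show the installed driver version and, on a vGPU guest, the license status; the exact output varies by driver version:

    $ nvidia-smi
    $ nvidia-smi -q | grep -i license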


  8. At this point, the VM setup is complete. Next, install AI Workbench on the Ubuntu VM. AI Workbench can be downloaded from the NVIDIA website. The installation guide for Ubuntu can be found here; the installation guide for Windows can be found here.

    Note

    After you update AI Workbench on your local computer, you must also update any connected remote locations. For details, see Update AI Workbench on a Remote Computer.

    ai-ws-006.png

Generate an NGC API Key

  1. Visit the NVIDIA Developer Program page, click Join, and sign up for an NVIDIA account.

    ai-ws-007.png


  2. Use your NVIDIA Developer Program credentials to log into the NVIDIA NGC Catalog.

    ai-ws-008.png


  3. Click the account name at the top right. In the drop-down menu, select Setup.

    ai-ws-009.png


  4. Click “Generate API Key”, then click “+ Generate Personal Key”.

    ai-ws-010.png

    ai-ws-011.png


  5. Enter the key name and expiration. Under Services Included, make sure NGC Catalog is selected.

    ai-ws-012.png


  6. Once your personal API key is generated, save it; it is required for accessing NVIDIA NIMs during the subsequent deployment phase.

    ai-ws-013.png
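
    If you ever need to pull NIM containers from the NGC registry manually, the same key authenticates Docker against nvcr.io; the username is the literal string $oauthtoken and the password is your personal API key:

    $ docker login nvcr.io
    Username: $oauthtoken
    Password: <paste your personal API key>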


Deploy the PDF-to-Podcast Project

  1. Launch NVIDIA AI Workbench and clone the pdf-to-podcast project from GitHub. In AI Workbench, select Clone Project, then enter the repository URL to start the cloning process.

    ai-ws-014.png


  2. AI Workbench will download the repository and begin building the environment. Click the bottom status bar to monitor progress.

    ai-ws-015.png


  3. Once the build is complete, click the Environment tab on the left and scroll down to Secrets. Enter your ElevenLabs and NVIDIA API keys. The ElevenLabs API key can be generated after requesting credits, and the NVIDIA API key is the personal key generated in the previous section.

    ai-ws-016.png


  4. Once the secrets have been set up, click Start Environment at the top right. This starts the container service for this project.

    ai-ws-017.png


  5. Once the service container has started, you can use Open Frontend or JupyterLab to configure the GPU resources for this project. The default configuration points to an NVIDIA Cloud Functions (NVCF) endpoint. To run the project locally instead, click the Agent Configurations tab in the browser and follow the llama-3.1-8b-instruct NIM example highlighted below: replace the models.json contents with the local NIM configuration, then click the disk icon to save the changes.

    ai-ws-018.png


  6. The NVIDIA NIM requires write access to ~/.cache/nim before it can run properly. Use a remote session to connect to the Linux VM and grant access with the following command:


    $ sudo chmod -R a+w ~/.cache/nim


  7. In the Environment tab of the AI Workbench UI, select the “local” profile under Compose, then click Start. Docker Compose will pull the required assets and start the containers.

    ai-ws-019.png


  8. The compose step may take up to 20 minutes, depending on how the VM is configured and the speed of the internet connection. When the compose finishes and the project is ready, a line like the following appears in the compose log:


    local-nim-1 | INFO 2025-03-14 20:48:40.382 metrics.py:351] Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%.
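
    If you prefer to watch the NIM start up from a terminal on the VM, you can also follow the container logs directly. The container name below is inferred from the log prefix above and may differ slightly in your environment, so list the running containers first:

    $ docker ps --filter "name=local-nim"
    $ docker logs -f local-nim-1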


Customizations

There are a few customizations available in the blueprint.

  1. Rename your virtual podcast hosts. Their names can be modified in the Python code in the project folder /frontend/utils through JupyterLab.

    ai-ws-020.png


  2. Use different NIMs for different areas of expertise. The NIMs can be customized in the models.json file located in the project root folder using JupyterLab; the configuration must also match docker-compose.yaml. Different NIM models have varying GPU memory requirements, so ensure the model fits within your GPU’s memory capacity (see the memory-check example after the screenshots below).

    ai-ws-021.png

    ai-ws-022.png

    ai-ws-023.png
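
    A quick way to check how much GPU memory is available before switching to a larger NIM:

    $ nvidia-smi --query-gpu=name,memory.total,memory.used,memory.free --format=csv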


  3. Generated transcription and podcast audio can be found in /frontend/demo-outputs for further customization and/or debugging.

    ai-ws-024.png


Generate a Podcast

  1. To start the project, open or switch back to the frontend. Drop in the target PDF and, optionally, a context PDF for a better result. Then click the Generate Podcast button. This process may take up to 20 minutes, depending on the system configuration and the complexity of the PDF files.

    ai-ws-025.png


  2. To view the GPU consumption during the podcast generation process, you can use monitoring tools like NVTOP.

    ai-ws-026.png
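
    nvtop is available from the Ubuntu repositories; alternatively, nvidia-smi can be polled with watch:

    $ sudo apt install -y nvtop
    $ nvtop
    $ watch -n 1 nvidia-smi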


  3. The transcription and podcast audio will be available after the project finishes.

    ai-ws-027.png

