# Glossary
A list of commonly used acronyms and terms:
| Term | Definition |
|---|---|
| Audio2Face-2D | Audio2Face-2D (A2F-2D) is a generative model that converts audio input into lifelike 2D mouth-movement animations of a provided 2D portrait photo. |
| Audio2Face-3D | Audio2Face-3D (A2F-3D) is a generative AI technology that converts audio input into lifelike 3D facial animations, including emotional expressions. |
| Animation Graph | NVIDIA Animation Graph is a runtime framework within NVIDIA’s Omniverse platform for skeletal animation blending, playback, and control. |
| ASR | Automatic speech recognition (ASR), or speech-to-text, is the combination of processes and software that decode human speech and convert it to digitized text. |
| Audio2Emotion | NVIDIA Audio2Emotion is a component of NVIDIA’s Audio2Face-3D technology that uses AI to detect emotional state from voice and adjust the facial animation of a 3D character accordingly. |
| AWS | Amazon Web Services (AWS) is a comprehensive cloud computing platform offering IaaS, PaaS, and SaaS solutions on a pay-as-you-go basis. It provides a wide range of services, including compute power, database storage, and content delivery. |
| Azure | Microsoft Azure is Microsoft’s public cloud computing platform offering IaaS, PaaS, SaaS, and serverless functions. It supports various technologies and operates on a pay-as-you-go basis. |
| BareMetal | A physical server dedicated entirely to a single user or tenant, providing direct access to the server’s hardware without any virtualization layer. |
| CSP | A Cloud Service Provider (CSP) is a company that offers various components of cloud computing services, enabling businesses and individuals to access and utilize computing resources over the internet. Examples include AWS, Azure, GCP, and OCI. |
| Digital Humans | Digital Humans are AI-powered virtual representations of humans that combine computer graphics, computer vision, and artificial intelligence to create highly realistic and interactive virtual characters. |
| GCP | Google Cloud Platform (GCP) is a suite of cloud computing services providing tools for building, deploying, and managing applications. It includes services like computing power, storage, databases, and machine learning, running on Google’s global infrastructure. |
| Guardrails | Guardrails, in the context of Large Language Models (LLMs), are safety measures designed to monitor, control, and ensure the safe and responsible operation of these AI systems. |
| Helm | A package manager for Kubernetes that simplifies the process of installing and managing Kubernetes applications by using pre-configured packages called Helm charts. |
| Kubernetes | A widely used container orchestration platform designed to automate the deployment, scaling, and management of containerized applications. |
| LLM | Large Language Models (LLMs) are deep learning algorithms that can recognize, summarize, translate, predict, and generate content using very large datasets. |
| NAT | Network Address Translation (NAT) is a technique used to manage and conserve IP addresses within a network, particularly given the limited number of available IPv4 addresses. |
| NeMo | NVIDIA NeMo is an end-to-end platform for developing custom generative AI anywhere, including large language models (LLMs), vision language models (VLMs), video models, and speech AI. |
| NGC Catalog | The NGC (NVIDIA GPU Cloud) Catalog is a curated collection of GPU-optimized software, including containers, pre-trained models, Helm charts for Kubernetes deployments, and industry-specific AI toolkits with software development kits (SDKs). |
| NLP | Natural language processing (NLP) is the application of AI to process and analyze text or voice data in order to understand, interpret, categorize, and/or derive insights from the content. |
| NVIDIA ACE | NVIDIA ACE (Avatar Cloud Engine) is a comprehensive suite of technologies and tools designed to bring digital humans to life using generative AI. |
| NVIDIA Blueprint | NVIDIA Blueprints are predefined, pretrained AI workflows designed to simplify and accelerate the development of various AI applications. |
| NVIDIA GPU | An NVIDIA Graphics Processing Unit (GPU) is a specialized electronic circuit designed for graphics rendering and high-speed mathematical calculations, used in gaming, professional graphics, AI, and high-performance computing. |
| NVIDIA Maxine | NVIDIA Maxine is a suite of GPU-accelerated SDKs and NIM Microservices that enhance audio, video, and augmented reality effects for real-time communications, including features like noise reduction, video upscaling, and eye-gaze correction. |
| NVIDIA NIM | NVIDIA Inference Microservices (NIM) is a set of easy-to-use microservices designed to accelerate the deployment of generative AI models across cloud, data center, and workstation environments. NIM provides pre-optimized inference engines, such as TensorRT and TensorRT-LLM, to run AI models on NVIDIA GPUs, ensuring low-latency and high-throughput performance. |
| NVIDIA Omniverse | NVIDIA Omniverse is a platform of APIs, SDKs, and services that enables developers to integrate OpenUSD, NVIDIA RTX™ rendering technologies, and generative physical AI into existing software tools and simulation workflows for industrial and robotic use cases. |
| NVIDIA UCS | NVIDIA UCS (Unified Cloud Services) is a low-code framework designed for developing cloud-native, real-time, and multimodal AI applications. It adopts a microservices architecture, allowing developers to combine microservices into cloud-native applications or services. |
| OCI | Oracle Cloud Infrastructure (OCI) is a public cloud service offering IaaS, PaaS, and SaaS solutions, providing computing, storage, networking, and database services with security, scalability, and compliance. |
| RAG | Retrieval-Augmented Generation (RAG) is a generative AI architecture that combines Large Language Models (LLMs) with a data retrieval component to generate accurate and up-to-date responses. It retrieves relevant information from external knowledge bases and uses this data to inform the generated output, enhancing the reliability and accuracy of LLMs. |
| Riva | NVIDIA Riva is an AI speech SDK that provides a set of tools for building conversational AI applications, including speech recognition, text-to-speech, and natural language processing capabilities, optimized for NVIDIA GPUs. |
| RTSP | Real-Time Streaming Protocol (RTSP) is a network protocol that controls how media is streamed between a server and a client. |
| SDR | Stream Distribution & Routing (SDR) distributes media streams to individual pods and is responsible for routing and stream state management. |
| STUN | A STUN (Session Traversal Utilities for NAT) server is used in VoIP (Voice over Internet Protocol) and other real-time communication systems to help clients behind firewalls or NAT (Network Address Translation) devices connect with other clients. |
| TTS | Text-to-speech (TTS) is a form of speech synthesis that converts any string of text characters into spoken output. |
| TURN | Traversal Using Relays around NAT (TURN) is a network protocol and server technology designed to facilitate communication between devices behind Network Address Translation (NAT) systems or firewalls, where direct peer-to-peer connections are not possible. |
| UMIM | Unified Multimodal Interaction Management (UMIM) provides an interaction-level interface between the interaction manager (IM), a decision-making unit, and an interactive system executing the commands from the IM. |
| Voice Font | Refers to the NVIDIA Voice Font Microservice. This feature converts the speaker’s timbre from input audio to that of the reference audio, while retaining the linguistic content and prosody from the input. |
| VST | NVIDIA Video Storage Toolkit (VST), also referred to as VMS (Video Management System), manages audio and video streams and provides on-demand access to offline streams from storage. It accepts a WebRTC stream from the front-end UI application and outputs RTSP streams for further processing. |