Date: 2026-01-29

PyNvVideoCodec includes a comprehensive set of sample applications and Jupyter notebooks that demonstrate the API usage across various use cases. These samples are organized into three categories:

Basic Sample Applications [.\samples\basic]: Demonstrate fundamental encoding, decoding, and transcoding workflows suitable for getting started with PyNvVideoCodec.

Advanced Sample Applications [.\samples\advanced]: Showcase advanced features such as low-latency decoding, SEI message handling, performance measurement, and CUDA resource management.

Jupyter Notebooks [.\samples\jupyter]: Provide interactive tutorials that guide users through the API with step-by-step explanations and live code execution.

Ensure you have the following prerequisites before running the samples:

NVIDIA GPU with appropriate drivers installed

PyNvVideoCodec module is installed

Required dependencies, installed by running the following command at a shell prompt:

pip install -r samples/requirements.txt

The following table provides an overview of all sample applications and their objectives:

| Type | Application Name | Objective |
|---|---|---|
| Basic | simple_decode_sampling.py | Decode multiple video files and sample frames at regular intervals. Converts decoded frames to PyTorch tensors for use in machine learning workflows. |
| Basic | encode.py | Encode raw video frames into a compressed video file. Supports reading input from both CPU memory and GPU memory. |
| Basic | create_video_segments.py | Extract specific portions of a video based on start and end timestamps. Useful for creating training clips or trimming videos. |
| Jupyter | simple_decode_tutorial.ipynb | Step-by-step tutorial for video decoding. Learn how to access frames by index, retrieve video metadata, and reuse the decoder for multiple files. |
| Jupyter | object_detection_tutorial.ipynb | Run object detection on video frames using a pre-trained deep learning model. Uses threaded decoding to run video decoding and AI inference in parallel. |
| Advanced | decode.py | Read a video file using the low-level core decoder interface and output raw decoded frames. |
| Advanced | decode_with_cuda_control.py | Decode video with explicit control over CUDA resources. Useful when integrating with other CUDA-based libraries that require specific context or stream management. |
| Advanced | decode_from_memory_buffer.py | Decode video data directly from memory instead of reading from a file. Useful for processing video received over a network or stored in memory. |
| Advanced | decode_with_low_latency.py | Decode video with minimal delay between input and output. Useful for real-time applications like video conferencing or live streaming. |
| Advanced | decode_perf.py | Measure video decoding speed using multiple decoder instances in parallel. Helps evaluate system performance and throughput. |
| Advanced | decode_sei_msg.py | Extract metadata embedded in the video stream (SEI messages). Used to retrieve information like HDR settings, timecodes, or custom data. |
| Advanced | simple_decode_stats.py | Extract decode statistics (QP values, coding-unit types, and motion vectors) from a video stream. |
| Advanced | decode_reconfigure.py | Dynamically reconfigure the core decoder to handle resolution changes. Switch between input streams with different dimensions without recreating the decoder. |
| Advanced | encode_reconfigure.py | Change encoding settings (like bitrate) while encoding is in progress. Useful for adaptive streaming where quality adjusts based on network conditions. |
| Advanced | encode_sei_msg.py | Embed custom metadata into the encoded video stream. Used to add HDR information, timecodes, or application-specific data. |
| Advanced | encode_perf.py | Measure video encoding performance using multiple encoder instances in parallel. Helps evaluate system performance and throughput. |

The basic sample applications demonstrate fundamental video decode, encode and transcode workflows. These applications are ideal for getting started with PyNvVideoCodec.

The simple_decode_sampling.py sample demonstrates multi-file video decoding with frame sampling and tensor conversion, a common use case in machine learning pipelines. The application accepts one or more video files as input and samples the requested number of frames evenly across each video's duration. Using the SimpleDecoder API, frames are decoded directly to GPU memory in RGB format and converted to PyTorch tensors for downstream processing. The sample also showcases decoder reconfiguration, which allows a single decoder instance to process multiple video files of different resolutions efficiently.

python simple_decode_sampling.py video1.mp4 video2.mp4 -g 0 -f 16

| Parameter | Type | Description |
|---|---|---|
| video_files | string | One or more video files to process (positional arguments) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |
| -f, --frames | int | Number of frames to sample per video (default: 8) |
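The even sampling described above can be sketched in plain Python. This is only an illustration: the function name `sample_frame_indices` is hypothetical, and picking the midpoint of each interval is an assumption, since the sample's exact index selection is not described here.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Pick up to num_samples frame indices spread evenly over the video."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    # Never ask for more frames than the video contains.
    num_samples = min(num_samples, total_frames)
    # Split the video into num_samples equal intervals and take each midpoint.
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


if __name__ == "__main__":
    # A 240-frame video sampled with the default of 8 frames.
    print(sample_frame_indices(240, 8))
```

The resulting indices would then be handed to the decoder's index-based frame access.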





The encode.py sample demonstrates video encoding with support for both CPU and GPU buffer modes. The application reads raw YUV frames from an input file and encodes them using PyNvVideoCodec's encode API. In CPU buffer mode, frame data is read into host memory and copied to CUDA buffers for encoding. In GPU buffer mode, frame data is loaded directly into CUDA device buffers, reducing memory transfer overhead. The sample supports multiple codecs (H.264, HEVC, AV1) and various input formats (NV12, YUV444, P010, etc.). Encoder configuration can be customized via a JSON configuration file for fine-grained control over encoding parameters such as bitrate, GOP size, and quality presets.

python encode.py -i input.yuv -s 1920x1080 -m gpu -if NV12 -c h264

| Parameter | Type | Description |
|---|---|---|
| -i, --input | string | Path to raw input video file (YUV format) |
| -o, --output | string | Path to output encoded video file |
| -s, --size | string | Frame size in WxH format (e.g., 1920x1080) |
| -m, --mode | string | Buffer mode: cpu or gpu (default: gpu) |
| -if, --format | string | Input pixel format: NV12, YUV444, P010, ARGB, ABGR, etc. (default: NV12) |
| -c, --codec | string | Output codec: h264, hevc, av1 (default: h264) |
| -f, --frames | int | Number of frames to encode (default: all frames) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |
| -json | string | Path to JSON encoder configuration file |
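Because the input is a headerless raw file, the frame size and pixel format together determine how many bytes make up one frame. The sketch below shows that arithmetic for a few of the listed formats. The helper names are hypothetical, and the code assumes tightly packed frames with even dimensions; it is not taken from encode.py.

```python
from typing import Tuple

# Bytes per pixel, doubled so that 4:2:0 formats stay in integer arithmetic.
DOUBLE_BYTES_PER_PIXEL = {
    "NV12": 3,    # 8-bit 4:2:0: 1.5 bytes per pixel
    "YUV444": 6,  # 8-bit 4:4:4: 3 bytes per pixel
    "P010": 6,    # 10-bit 4:2:0 stored in 16-bit words: 3 bytes per pixel
    "ARGB": 8,    # packed 8-bit, four channels: 4 bytes per pixel
    "ABGR": 8,
}


def parse_size(size: str) -> Tuple[int, int]:
    """Parse a WxH string such as '1920x1080' into (width, height)."""
    width_text, _, height_text = size.lower().partition("x")
    width, height = int(width_text), int(height_text)
    if width <= 0 or height <= 0:
        raise ValueError(f"invalid frame size: {size!r}")
    return width, height


def frame_size_bytes(size: str, pixel_format: str) -> int:
    """Bytes occupied by one tightly packed raw frame."""
    width, height = parse_size(size)
    return width * height * DOUBLE_BYTES_PER_PIXEL[pixel_format.upper()] // 2


def count_frames(file_size_bytes: int, size: str, pixel_format: str) -> int:
    """Number of complete frames in a raw file of the given length."""
    return file_size_bytes // frame_size_bytes(size, pixel_format)


if __name__ == "__main__":
    print(frame_size_bytes("1920x1080", "NV12"))
```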





The create_video_segments.py sample demonstrates how to create smaller video segments from a larger video file. This approach is useful for creating small, meaningful video clips for training AI models. The application extracts segments from a video file based on timestamp ranges specified in a text file. The sample uses the SimpleDecoder to retrieve video metadata for duration validation, then uses the Transcoder API to extract each segment as a separate output file. Output files are automatically named with the timestamp range appended to the base filename. Timestamps that exceed the video duration are automatically clipped, and invalid ranges are reported with error messages.

python create_video_segments.py -i input.mp4 -s segments.txt -c transcode_config.json

| Parameter | Type | Description |
|---|---|---|
| -i, --input | string | Path to input video file |
| -s, --segments | string | Path to segments file containing timestamp ranges (default: segments.txt) |
| -c, --config | string | Path to transcoder configuration JSON file (default: transcode_config.json) |
| -o, --output | string | Output filename template (timestamp is appended automatically) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |

The segments file should contain one timestamp range per line in the format: start_time end_time

Example segments.txt:

0.0 10.5
15.0 30.0
45.5 60.0
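A minimal sketch of reading such a file, clipping end times to the video duration, and reporting invalid ranges might look like the following. The function name `parse_segments` and the error message wording are hypothetical. The duration is passed in as a plain number here, whereas the sample obtains it from the decoder's metadata.

```python
from typing import List, Tuple


def parse_segments(text: str, duration: float) -> Tuple[List[Tuple[float, float]], List[str]]:
    """Return (valid segments, error messages) for 'start_time end_time' lines."""
    segments: List[Tuple[float, float]] = []
    errors: List[str] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        try:
            if len(parts) != 2:
                raise ValueError("expected two values")
            start, end = float(parts[0]), float(parts[1])
        except ValueError as exc:
            errors.append(f"line {line_no}: {exc}")
            continue
        # Clip the end time to the video duration before validating the range.
        clipped_end = min(end, duration)
        if start < 0 or start >= clipped_end:
            errors.append(f"line {line_no}: invalid range {start} {end}")
            continue
        segments.append((start, clipped_end))
    return segments, errors


if __name__ == "__main__":
    example = "0.0 10.5\n15.0 30.0\n45.5 60.0\n"
    print(parse_segments(example, duration=50.0))
```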

The Jupyter notebooks provide interactive tutorials that guide users through the PyNvVideoCodec APIs with step-by-step explanations and live code execution. These notebooks are ideal for learning and experimentation.

To run the Jupyter notebooks, follow these steps:

1. Ensure PyNvVideoCodec is installed in your Python environment.

2. Install the required dependencies:

pip install -r samples/requirements.txt

3. Navigate to the notebooks directory and launch Jupyter:

cd samples/jupyter
jupyter notebook

4. Open the desired notebook from the Jupyter interface in your browser.

Note: Ensure that the CUDA toolkit is installed and your NVIDIA GPU drivers are up to date before running the notebooks.





The simple_decode_tutorial.ipynb notebook provides a comprehensive tutorial on using the SimpleDecoder API for GPU-accelerated video decoding. The tutorial demonstrates how to decode videos efficiently and access frames using various methods. The notebook covers multiple frame access methods: single frame access by index, seeking to specific frames, batch frame retrieval, accessing frames by specific indices, slice-based access, and time-based frame retrieval. Another key feature demonstrated is decoder reconfiguration, which allows reusing a single decoder instance to process multiple video files of different resolutions without the overhead of creating new decoder instances. This is particularly useful in deep learning pipelines where processing many short video clips efficiently is critical. The notebook also displays the decoded output frames.

Key Features Demonstrated:

Basic decoder setup and initialization with SimpleDecoder

Retrieving video stream metadata (resolution, frame count, FPS, duration)

Advanced stream information including key frame locations and GOP structure

Multiple frame access methods: index, batch, slice, and time-based

Decoder reconfiguration for processing multiple videos efficiently

Converting decoded frames to PyTorch tensors

The object_detection_tutorial.ipynb notebook demonstrates a practical deep learning use case: real-time object detection in video, with ThreadedDecoder used to decode video in parallel with inference. Unlike a normal decoder, where decode and inference are executed sequentially (causing the inference to stall while waiting for frames), the ThreadedDecoder runs decoding on a separate background thread and keeps the next batch of pre-decoded frames ready for inference. This ensures that decoded frames are always ready when the inference model needs them, eliminating pipeline stalls and maximizing GPU utilization. The tutorial downloads a sample video, sets up a ThreadedDecoder with configurable buffer size, and uses a pre-trained Faster R-CNN model with ResNet-50 backbone to detect objects in video frames. The model is trained on the COCO dataset and can identify 91 different object classes. Detection results are filtered by confidence threshold and displayed with bounding boxes and class labels overlaid on each frame. This notebook serves as a template for building efficient video analytics applications.

Key Features Demonstrated:

Understanding the difference between normal decoder and ThreadedDecoder

Setting up ThreadedDecoder with buffer size configuration

Integrating video decoding with PyTorch deep learning models

Running Faster R-CNN object detection on decoded frames

Filtering detection results by confidence threshold

Visualizing object detection results with bounding boxes and labels

Building efficient parallel decode-inference pipelines
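The idea of decoding on a background thread while the consumer works on the previous batch can be illustrated with the standard library alone. The sketch below is a generic producer/consumer pattern with a bounded queue, not the ThreadedDecoder implementation. `decode_batches` is a stand-in that yields frame numbers instead of real frames, and all names are hypothetical.

```python
import queue
import threading
from typing import Iterable, Iterator, List

_END = object()  # sentinel that marks the end of the stream


def decode_batches(num_frames: int, batch_size: int) -> Iterator[List[int]]:
    """Stand-in decoder: yields batches of frame numbers."""
    for start in range(0, num_frames, batch_size):
        yield list(range(start, min(start + batch_size, num_frames)))


def prefetching(batches: Iterable[List[int]], buffer_size: int) -> Iterator[List[int]]:
    """Yield batches while a background thread keeps up to buffer_size ready."""
    ready: queue.Queue = queue.Queue(maxsize=buffer_size)

    def producer() -> None:
        try:
            for batch in batches:
                ready.put(batch)  # blocks while the buffer is full
        finally:
            ready.put(_END)  # always signal completion to the consumer

    threading.Thread(target=producer, daemon=True).start()
    while True:
        batch = ready.get()  # blocks only if decoding has fallen behind
        if batch is _END:
            return
        yield batch


if __name__ == "__main__":
    for batch in prefetching(decode_batches(10, 4), buffer_size=2):
        print("inference on frames", batch)
```

The bounded queue plays the role of the configurable buffer size: a larger buffer absorbs more variation in decode time at the cost of holding more frames in memory.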

This section demonstrates advanced features of PyNvVideoCodec including low-latency decoding, SEI message handling, performance measurement, CUDA resource management, and decode statistics extraction. These samples are designed for users who need fine-grained control over video processing workflows.

The decode.py sample demonstrates basic video decoding using the core decoder API with a demuxer. The application creates a demuxer to extract encoded packets from a video file and sets up a core decoder for hardware-accelerated decoding.

The decode pipeline in this sample is as follows:

video file -> demuxer -> packets -> decoder -> raw YUV frames.

The sample shows how to work with both device and host memory for frame data, query hardware capabilities such as the number of NVDEC engines, and limit the number of frames decoded using the frame count parameter. Output is written as raw YUV frames to a file.

python decode.py -i input.mp4 -o output.yuv -d 1

| Parameter | Type | Description |
|---|---|---|
| -i, --input | string | Path to input video file |
| -o, --output | string | Path to output raw YUV file (default: <input_name>.yuv) |
| -d | int | Use device memory (1) or host memory (0) (default: 1) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |
| -f, --frames | int | Number of frames to decode (default: all frames) |





The decode_with_cuda_control.py sample demonstrates advanced video decoding with explicit CUDA context and stream management. This application gives full control over CUDA resources by manually initializing CUDA contexts and streams before creating the decoder. This is essential for applications that need to share CUDA resources across multiple components or require precise control over GPU memory and synchronization. The sample shows proper CUDA resource cleanup order (decoder, demuxer, stream, context), switching between device memory (zero-copy) and host memory modes, and querying hardware capabilities.

python decode_with_cuda_control.py -i input.mp4 -o output.yuv -d 1

| Parameter | Type | Description |
|---|---|---|
| -i, --input | string | Path to input video file |
| -o, --output | string | Path to output raw YUV file (default: <input_name>.yuv) |
| -d | int | Use device memory (1) or host memory (0) (default: 1) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |
| -f, --frames | int | Number of frames to decode (default: all frames) |
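The cleanup order listed above (decoder, demuxer, stream, context) is the reverse of the order in which the resources are created. One way to guarantee reverse-order teardown in Python is `contextlib.ExitStack`, sketched below with stand-in objects. The `Resource` class is hypothetical and does not touch CUDA, and the sample itself may release its resources differently.

```python
from contextlib import ExitStack
from typing import List


class Resource:
    """Stand-in for a CUDA context, stream, demuxer, or decoder."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name = name
        self.log = log

    def close(self) -> None:
        # Record the release so the teardown order can be inspected.
        self.log.append(self.name)


def run_pipeline(log: List[str]) -> List[str]:
    with ExitStack() as stack:
        # Create in dependency order; ExitStack releases in reverse (LIFO).
        for name in ("context", "stream", "demuxer", "decoder"):
            resource = Resource(name, log)
            stack.callback(resource.close)
        # The decode loop would run here while all four resources are alive.
    return log


if __name__ == "__main__":
    print(run_pipeline([]))
```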





The decode_from_memory_buffer.py sample demonstrates decoding video directly from memory. This approach is useful when video data comes from network streams, memory-mapped files, or any source where data is available in memory rather than as a file on disk. The application implements a custom VideoStreamFeeder class that reads video data into memory and feeds chunks to the demuxer through a callback mechanism. This avoids file I/O overhead during decoding and enables streaming applications. The sample shows how to create a callback-based demuxer, manage video data buffers and chunk sizes efficiently, implement proper buffer position tracking and EOF handling, and work with both device and host memory for decoded frames.

python decode_from_memory_buffer.py -i input.mp4 -o output.yuv -d 1

| Parameter | Type | Description |
|---|---|---|
| -i, --input | string | Path to input video file |
| -o, --output | string | Path to output raw YUV file (default: <input_name>.yuv) |
| -d | int | Use device memory (1) or host memory (0) (default: 1) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |
| -f, --frames | int | Number of frames to decode (default: all frames) |
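The buffer position tracking and EOF handling mentioned above can be sketched without any GPU dependency. The `ChunkFeeder` class below is a hypothetical illustration, not the sample's VideoStreamFeeder. It assumes the callback receives a writable buffer and returns the number of bytes it filled, with 0 meaning end of stream; the actual callback contract should be checked against the sample source.

```python
class ChunkFeeder:
    """Feeds an in-memory byte string to a consumer in fixed-capacity chunks."""

    def __init__(self, data: bytes) -> None:
        self._data = bytes(data)
        self._pos = 0  # read position within the in-memory video data

    def feed_chunk(self, dest: bytearray) -> int:
        """Copy the next chunk into dest; return bytes copied, or 0 at EOF."""
        remaining = len(self._data) - self._pos
        size = min(remaining, len(dest))
        if size == 0:
            return 0  # EOF: nothing left to feed
        dest[:size] = self._data[self._pos:self._pos + size]
        self._pos += size
        return size


def drain(feeder: ChunkFeeder, capacity: int) -> bytes:
    """Stand-in for the demuxer: pull chunks until the feeder reports EOF."""
    buffer = bytearray(capacity)
    received = bytearray()
    while True:
        count = feeder.feed_chunk(buffer)
        if count == 0:
            break
        received += buffer[:count]
    return bytes(received)


if __name__ == "__main__":
    print(drain(ChunkFeeder(b"0123456789"), capacity=4))
```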





The decode_with_low_latency.py sample demonstrates low-latency video decoding using different latency modes. The decoder supports three latency modes:

NATIVE (0): Has at least 1 frame latency for streams with B-frames and outputs in display order.

LOW (1): Has zero latency for All-Intra and IPPP sequences (without B-frames) while maintaining display order.

ZERO (2): Has zero latency for All-Intra/IPPP streams and outputs in decode order.

Low and zero latency modes should not be used with streams containing B-frames. For low latency modes, the sample sets the ENDOFPICTURE flag on packets to trigger immediate decode. This is useful for real-time video applications such as live streaming, video conferencing, and interactive media where minimizing decode latency is critical.

python decode_with_low_latency.py -i input.mp4 -o output.yuv -dl 1

| Parameter | Type | Description |
|---|---|---|
| -i, --input | string | Path to input video file |
| -o, --output | string | Path to output raw YUV file (default: <input_name>.yuv) |
| -d | int | Use device memory (1) or host memory (0) (default: 1) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |
| -f, --frames | int | Number of frames to decode (default: all frames) |
| -dl | int | Decode latency mode: 0=NATIVE, 1=LOW, 2=ZERO |
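The guidance on choosing a latency mode can be condensed into a small decision helper. The `LatencyMode` enum below is a hypothetical mirror of the `-dl` values rather than the library's own type, and the selection rule only restates the constraints given above.

```python
from enum import IntEnum


class LatencyMode(IntEnum):
    """Hypothetical mirror of the -dl command-line values."""
    NATIVE = 0
    LOW = 1
    ZERO = 2


def choose_latency_mode(has_b_frames: bool, need_display_order: bool = True) -> LatencyMode:
    """Pick a latency mode that is safe for the given stream."""
    if has_b_frames:
        # LOW and ZERO should not be used with streams containing B-frames.
        return LatencyMode.NATIVE
    # Without B-frames, both low-latency modes apply; they differ in output order.
    return LatencyMode.LOW if need_display_order else LatencyMode.ZERO


if __name__ == "__main__":
    print(int(choose_latency_mode(has_b_frames=False)))
```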





The decode_perf.py sample demonstrates how to measure video decoding performance. The application supports two execution modes:

Thread mode: Offers higher performance with lower overhead, shared memory access, and shared CUDA context between decoder instances. Python GIL is not a bottleneck as decoder operations are GIL-free.

Process mode: Provides complete isolation between decoder instances with independent GPU memory.

The sample reports detailed performance metrics including frames per second (FPS), total frames decoded, and wall time. This is a performance testing application that does not write output files.

python decode_perf.py -i input.mp4 -n 4 -m thread

| Parameter | Type | Description |
|---|---|---|
| -i, --input | string | Path to input video file |
| -n | int | Number of parallel instances (default: 1) |
| -m, --mode | string | Execution mode: thread or process (default: thread) |
| -d | int | Use device memory (1) or host memory (0) (default: 1) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |
| -f, --frames | int | Number of frames to decode per instance (default: all frames) |
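The shape of a thread-mode measurement can be sketched with `concurrent.futures`. In the hypothetical harness below, `fake_decode` sleeps instead of decoding; like GIL-free decoder calls, `time.sleep` releases the GIL, so the worker threads overlap. The function names and the reported dictionary keys are illustrative only.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def fake_decode(num_frames: int) -> int:
    """Stand-in for one decoder instance; sleeps instead of decoding."""
    for _ in range(num_frames):
        time.sleep(0.0005)
    return num_frames


def measure(num_instances: int, frames_per_instance: int,
            worker: Callable[[int], int] = fake_decode) -> Dict[str, float]:
    """Run the workers in parallel threads and report aggregate throughput."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_instances) as pool:
        counts = list(pool.map(worker, [frames_per_instance] * num_instances))
    wall_time = time.perf_counter() - start
    total_frames = sum(counts)
    return {
        "total_frames": total_frames,
        "wall_time_s": wall_time,
        "fps": total_frames / wall_time,
    }


if __name__ == "__main__":
    print(measure(num_instances=4, frames_per_instance=20))
```

Process mode would follow the same structure with a process pool in place of the thread pool, trading the shared context for isolation.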





The decode_sei_msg.py sample demonstrates how to extract and parse Supplemental Enhancement Information (SEI) messages from video streams during decoding. SEI messages are additional data embedded in video streams that provide supplementary information such as HDR/display metadata (color volume, light levels, transfer characteristics), timecode data for frame timing and sequence information, and custom metadata for application-specific needs. Common use cases include HDR display configuration for video playback, frame-accurate editing in content creation, timing synchronization in broadcast, and embedding application-specific metadata. The sample outputs raw binary SEI messages to one file and pickled SEI type information to another file, enabling further analysis or processing.

python decode_sei_msg.py -i input.mp4 -s sei_message.bin -st sei_type_message.bin

| Parameter | Type | Description |
|---|---|---|
| -i, --input | string | Path to input video file |
| -s | string | Output SEI message file (default: sei_message.bin) |
| -st | string | Output SEI type message file (default: sei_type_message.bin) |
| -d | int | Use device memory (1) or host memory (0) (default: 1) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |





The simple_decode_stats.py sample demonstrates how to extract and analyze decode statistics from H.264 and H.265 video streams using the SimpleDecoder API. The statistics collected include:

QP (Quantization Parameter): Analysis with average, min, and max values per frame indicating compression levels.

CU (Coding Unit) type distribution: Shows INTRA (spatial prediction), INTER (temporal prediction), SKIP (copy from reference), and PCM (uncompressed) blocks.

Motion vector statistics: For L0 and L1 references, enabling temporal complexity assessment.

Macroblock details: Provides per-block encoding decisions and parameters.

These statistics are valuable for video quality analysis, encoder behavior understanding, performance optimization, and debugging encoding/decoding issues. The output is written as formatted text to a statistics file.

python simple_decode_stats.py -i input.mp4 -p output_stats.txt -d 1

| Parameter | Type | Description |
|---|---|---|
| -i, --input | string | Path to input video file |
| -p | string | Output file for statistics (default: <input_name>_stats.txt) |
| -d | int | Use device memory (1) or host memory (0) (default: 1) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |

Note: The decode statistics feature requires NVIDIA display driver version 590 or newer.
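The per-frame QP summary and CU type distribution described above reduce to simple aggregations once the per-block values are available. The sketch below assumes those values have already been extracted into plain Python lists. The layout of the statistics returned by the decoder is not covered here, and both function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def summarize_qp(qp_values: List[int]) -> Dict[str, float]:
    """Average, minimum, and maximum QP over the blocks of one frame."""
    if not qp_values:
        raise ValueError("no QP values for this frame")
    return {
        "avg": sum(qp_values) / len(qp_values),
        "min": min(qp_values),
        "max": max(qp_values),
    }


def cu_type_distribution(cu_types: List[str]) -> Dict[str, float]:
    """Percentage of blocks per coding-unit type (INTRA, INTER, SKIP, PCM)."""
    counts = Counter(cu_types)
    total = len(cu_types)
    return {cu_type: 100 * count / total for cu_type, count in counts.items()}


if __name__ == "__main__":
    print(summarize_qp([20, 24, 28]))
    print(cu_type_distribution(["INTRA", "INTER", "SKIP", "INTER"]))
```

A frame dominated by SKIP and INTER blocks with a low average QP typically indicates static, lightly compressed content, which is the kind of assessment these statistics support.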





The decode_reconfigure.py sample demonstrates dynamic core decoder reconfiguration for handling resolution changes during playback. The application shows how to switch between input streams with different dimensions without recreating the decoder, which is useful for adaptive streaming scenarios or processing multiple video files with varying resolutions. The sample creates a decoder with maximum dimensions to accommodate both streams, decodes frames from the first stream, uses setReconfigParams() to reconfigure for the second stream's dimensions, and continues decoding the second stream.

python decode_reconfigure.py -i1 video1.mp4 -i2 video2.mp4 -o1 output1.yuv -o2 output2.yuv

| Parameter | Type | Description |
|---|---|---|
| -i1 | string | Path to first input video file |
| -i2 | string | Path to second input video file (can have different resolution) |
| -o1 | string | Path to first output raw YUV file (default: <input1_name>.yuv) |
| -o2 | string | Path to second output raw YUV file (default: <input2_name>.yuv) |
| -d | int | Use device memory (1) or host memory (0) (default: 1) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |
| -f, --frames | int | Number of frames to decode per stream (default: all frames) |





The encode_reconfigure.py sample demonstrates dynamic encoder reconfiguration for bitrate control at runtime. The application shows how to modify encoder parameters without resetting the encoder session, which is useful for adaptive bitrate streaming and dynamic quality adjustment. The sample changes the bitrate every 100 frames: at frame 0 it uses the original bitrate, at frame 100 it reduces to half the bitrate, at frame 200 it restores the original bitrate, and so on. The reconfiguration also handles VBV (Video Buffer Verifier) parameters including buffer size and initial delay. This capability is essential for live streaming applications that need to adapt to changing network conditions or for video conferencing systems that adjust quality based on bandwidth availability.

python encode_reconfigure.py -i input.yuv -s 1920x1080 -c h264

| Parameter | Type | Description |
|---|---|---|
| -i, --input | string | Path to raw input video file (YUV format) |
| -o, --output | string | Path to output encoded video file |
| -s, --size | string | Frame size in WxH format (e.g., 1920x1080) |
| -if, --format | string | Input pixel format (default: NV12) |
| -c, --codec | string | Output codec: h264, hevc, av1 |
| -f, --frames | int | Number of frames to encode (default: all frames) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |
| -json | string | Path to JSON encoder configuration file |
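The alternating bitrate schedule described above (original, half, original, and so on, switching every 100 frames) can be expressed as a small pure function. The names below are hypothetical, and the code only computes the target bitrate; applying it to the encoder is left to the reconfiguration call used by the sample.

```python
from typing import List, Tuple


def bitrate_for_frame(frame_index: int, base_bitrate: int, interval: int = 100) -> int:
    """Target bitrate in effect at the given frame."""
    phase = frame_index // interval
    # Even phases use the original bitrate, odd phases use half of it.
    return base_bitrate if phase % 2 == 0 else base_bitrate // 2


def reconfigure_points(num_frames: int, base_bitrate: int,
                       interval: int = 100) -> List[Tuple[int, int]]:
    """Frames at which the encoder is reconfigured, with the new bitrate."""
    return [(frame, bitrate_for_frame(frame, base_bitrate, interval))
            for frame in range(interval, num_frames, interval)]


if __name__ == "__main__":
    print(reconfigure_points(350, 4_000_000))
```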





The encode_sei_msg.py sample demonstrates SEI (Supplemental Enhancement Information) message insertion during encoding. The application shows how to embed custom metadata into the encoded bitstream, which can be extracted during playback or transcoding. SEI messages support various types including HDR/display metadata (color volume, light levels, transfer characteristics), timecode data for frame timing, and custom user-defined data. The sample handles codec-specific SEI types: for H.264 and HEVC it uses SEI type 5 (user data unregistered), and for AV1 it uses type 6. Common use cases include embedding HDR metadata for proper display configuration, inserting timecodes for broadcast synchronization, and adding application-specific data for content management systems.

python encode_sei_msg.py -i input.yuv -s 1920x1080 -c hevc

| Parameter | Type | Description |
|---|---|---|
| -i, --input | string | Path to raw input video file (YUV format) |
| -o, --output | string | Path to output encoded video file (auto-generated as <input>.<codec>) |
| -s, --size | string | Frame size in WxH format (e.g., 1920x1080) |
| -if, --format | string | Input pixel format (default: NV12) |
| -c, --codec | string | Output codec: h264, hevc, av1 |
| -f, --frames | int | Number of frames to encode (default: all frames) |
| -g, --gpu-id | int | GPU device ID to use (default: 0) |
| -json | string | Path to JSON encoder configuration file |
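The codec-specific SEI type selection can be captured in a lookup table, as sketched below. The second helper builds a user data unregistered payload the way the H.264 and HEVC specifications define it: a 16-byte UUID followed by the application's bytes. Both helper names are hypothetical, and whether the encoder API expects the UUID to be included in the payload passed to it is an assumption to verify against the sample source.

```python
import uuid

# SEI type per codec, as described for this sample.
SEI_TYPE_BY_CODEC = {"h264": 5, "hevc": 5, "av1": 6}


def sei_type_for_codec(codec: str) -> int:
    """Return the SEI type the sample uses for the given output codec."""
    try:
        return SEI_TYPE_BY_CODEC[codec.lower()]
    except KeyError:
        raise ValueError(f"unsupported codec: {codec!r}") from None


def user_data_unregistered_payload(app_uuid: uuid.UUID, data: bytes) -> bytes:
    """H.264/HEVC user data unregistered payload: 16-byte UUID, then user bytes."""
    return app_uuid.bytes + data


if __name__ == "__main__":
    # Fixed, made-up UUID identifying the (hypothetical) application.
    app_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
    payload = user_data_unregistered_payload(app_id, b"scene=42")
    print(sei_type_for_codec("hevc"), len(payload))
```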





The encode_perf.py sample demonstrates advanced parallel video encoding using multiple threads or processes to achieve higher encoding throughput. Similar to decode_perf.py, it supports two execution modes:

Thread mode: Runs multiple encoders in the same process with shared memory access, lower overhead, and simpler synchronization.

Process mode: Runs multiple encoders in separate processes with IPC memory sharing, providing better isolation and stability.

The sample reports detailed performance metrics including total frames encoded, combined FPS across all workers, and average FPS per instance. This is a performance testing application that does not write output files.

python encode_perf.py -m thread -i input.yuv -s 1920x1080 -n 4
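The three reported metrics relate to each other as follows: combined FPS divides the total frame count by the overall wall time, while average FPS per instance averages each worker's own rate. The sketch below computes them from per-worker results. The function name and input layout are hypothetical and are not the sample's reporting code.

```python
from typing import Dict, List, Tuple


def summarize_workers(worker_results: List[Tuple[int, float]],
                      wall_time_s: float) -> Dict[str, float]:
    """Aggregate (frames_encoded, seconds_taken) pairs from each worker."""
    if not worker_results or wall_time_s <= 0:
        raise ValueError("need at least one worker and a positive wall time")
    total_frames = sum(frames for frames, _ in worker_results)
    per_instance_fps = [frames / seconds for frames, seconds in worker_results]
    return {
        "total_frames": total_frames,
        # Throughput of the whole run, measured against overall elapsed time.
        "combined_fps": total_frames / wall_time_s,
        # Mean of each worker's individual encoding rate.
        "avg_fps_per_instance": sum(per_instance_fps) / len(per_instance_fps),
    }


if __name__ == "__main__":
    # Two workers that each encoded 300 frames, in 2 s and 3 s respectively.
    print(summarize_workers([(300, 2.0), (300, 3.0)], wall_time_s=3.0))
```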