Accelerated Decode with FFmpeg#
The NVIDIA ffmpeg package supports hardware-accelerated decoding on NVIDIA® Jetson™ devices.
On the Jetson Thor platform, FFmpeg supports hardware-accelerated decoding, encoding, and transcoding
using both nvbufsurface and cuda hardware acceleration interfaces. Hardware-accelerated
video transformation operations using VIC are supported with the nvbufsurface interface.
GPU-based transformation operations are supported with both nvbufsurface and cuda interfaces.
Install ffmpeg Binary Package in Jetson Linux Builds#
To install the ffmpeg binary package in Jetson Linux Builds, enter the following commands:
$ sudo apt update
$ sudo apt install -y ffmpeg
Get Source Files for the ffmpeg Package#
To get source files for the ffmpeg package, enter the following command:
$ apt source ffmpeg
Prerequisite for the Jetson Thor Platform#
Install the CUDA Toolkit from the latest JetPack release.
Decoding on the Jetson Thor Platform#
Using nvbufsurface Hardware Acceleration
H.264 decode:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v h264_cuvid -i input.mp4 -vf "hwdownload,format=nv12" output.yuv
H.265 decode:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v hevc_cuvid -i input.mp4 -vf "hwdownload,format=nv12" output.yuv
Decode with automatic codec detection:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-i input.mp4 -vf "hwdownload,format=nv12" output.yuv
Using CUDA Hardware Acceleration
H.264 decode:
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-c:v h264_cuvid -i input.mp4 -vf "hwdownload,format=nv12" output.yuv
H.265 decode:
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-c:v hevc_cuvid -i input.mp4 -vf "hwdownload,format=nv12" output.yuv
Decode with automatic codec detection:
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-i input.mp4 -vf "hwdownload,format=nv12" output.yuv
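The decode commands above write raw NV12 frames, so the size of output.yuv can serve as a quick sanity check. The following is a minimal sketch, not part of the NVIDIA package: the helper names nv12_frame_bytes and nv12_frame_count are hypothetical, and the arithmetic assumes even frame dimensions (one full-resolution Y plane plus a half-size interleaved UV plane).

```shell
# Bytes in one NV12 frame: width * height luma bytes plus half as many chroma bytes.
# $1 = width, $2 = height (both assumed even).
nv12_frame_bytes() {
    echo $(( $1 * $2 * 3 / 2 ))
}

# Number of whole frames in a raw NV12 file.
# $1 = file size in bytes, $2 = width, $3 = height.
nv12_frame_count() {
    echo $(( $1 / $(nv12_frame_bytes "$2" "$3") ))
}

nv12_frame_bytes 1920 1080   # prints 3110400
```

For example, for a 1920x1080 input, nv12_frame_count "$(stat -c %s output.yuv)" 1920 1080 should roughly match the number of frames in input.mp4.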
Encoding on the Jetson Thor Platform#
Using nvbufsurface Hardware Acceleration
H.264 encode:
$ ffmpeg -f rawvideo -pix_fmt nv12 -s 320x240 -i output.yuv \
-vf hwupload_nvbufsurface -c:v h264_nvenc output.mp4
H.265 encode:
$ ffmpeg -f rawvideo -pix_fmt nv12 -s 320x240 -i output.yuv \
-vf hwupload_nvbufsurface -c:v hevc_nvenc output.mp4
Using CUDA Hardware Acceleration
H.264 encode:
$ ffmpeg -f rawvideo -pix_fmt nv12 -s 320x240 -i output.yuv \
-vf hwupload_cuda -c:v h264_nvenc output.mp4
H.265 encode:
$ ffmpeg -f rawvideo -pix_fmt nv12 -s 320x240 -i output.yuv \
-vf hwupload_cuda -c:v hevc_nvenc output.mp4
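The encode commands above read a headerless NV12 file whose dimensions must match the -s 320x240 argument. If no decoded file is at hand, a placeholder input can be generated without FFmpeg. This is only a sketch: the file name placeholder_320x240.yuv and the frame count of 30 are arbitrary choices for illustration, and the all-zero samples produce uniform, solid-color frames that are useful only for checking that the pipeline runs.

```shell
# Clip parameters; width and height match -s 320x240 in the encode commands.
width=320
height=240
frames=30   # arbitrary length chosen for this sketch

# One NV12 frame is width * height * 3 / 2 bytes, so write that many zero bytes per frame.
head -c "$(( width * height * 3 / 2 * frames ))" /dev/zero > placeholder_320x240.yuv

# Report the resulting size in bytes (30 frames of 115200 bytes each).
wc -c < placeholder_320x240.yuv
```

To use the file, pass it in place of output.yuv after -i in any of the encode commands.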
Transcoding on the Jetson Thor Platform#
Using nvbufsurface Hardware Acceleration
Transcode to H.264 with explicit codec selection:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v h264_cuvid -i input.mp4 -c:a copy -c:v h264_nvenc -b:v 5M output.mp4
Transcode to H.264 with automatic codec detection:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-i input.mp4 -c:a copy -c:v h264_nvenc -b:v 5M output.mp4
Transcode to H.265 with explicit codec selection:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v h264_cuvid -i input.mp4 -c:a copy -c:v hevc_nvenc -b:v 5M output.mp4
Transcode to H.265 with automatic codec detection:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-i input.mp4 -c:a copy -c:v hevc_nvenc -b:v 5M output.mp4
Using CUDA Hardware Acceleration
Transcode to H.264 with explicit codec selection:
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-c:v h264_cuvid -i input.mp4 -c:a copy -c:v h264_nvenc -b:v 5M output.mp4
Transcode to H.264 with automatic codec detection:
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-i input.mp4 -c:a copy -c:v h264_nvenc -b:v 5M output.mp4
Transcode to H.265 with explicit codec selection:
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-c:v h264_cuvid -i input.mp4 -c:a copy -c:v hevc_nvenc -b:v 5M output.mp4
Transcode to H.265 with automatic codec detection:
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-i input.mp4 -c:a copy -c:v hevc_nvenc -b:v 5M output.mp4
Decoding and Transforming on the Jetson Thor Platform#
Transform operations can be performed using VIC (Video Image Compositor) or GPU-based compute.
Using VIC
Scale:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v h264_cuvid -i input.mp4 \
-vf "nvbufsurftransform=w=1920:h=1080,hwdownload,format=nv12" output_scale.yuv
Crop:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v h264_cuvid -i input.mp4 \
-vf "nvbufsurftransform=left=10:right=10:top=10:bottom=10,hwdownload,format=nv12" output_crop.yuv
Format conversion to RGBA:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v h264_cuvid -i input.mp4 \
-vf "nvbufsurftransform=format=rgba,hwdownload,format=rgba" output.rgb
Using GPU
To use GPU-based compute instead of VIC, add compute-hw=1 to the nvbufsurftransform
filter parameters.
Using nvbufsurface Hardware Acceleration
Scale:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v h264_cuvid -i input.mp4 \
-vf "nvbufsurftransform=w=1920:h=1080:compute-hw=1,hwdownload,format=nv12" output_scale_gpu.yuv
Crop:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v h264_cuvid -i input.mp4 \
-vf "nvbufsurftransform=left=10:right=10:top=10:bottom=10:compute-hw=1,hwdownload,format=nv12" output_crop_gpu.yuv
Format conversion to RGBA:
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v h264_cuvid -i input.mp4 \
-vf "nvbufsurftransform=format=rgba:compute-hw=1,hwdownload,format=rgba" output_gpu.rgb
Using CUDA Hardware Acceleration
Scale:
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-c:v h264_cuvid -i input.mp4 \
-vf "nvbufsurftransform=w=1920:h=1080:compute-hw=1,hwdownload,format=nv12" output_scale_gpu.yuv
Crop:
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-c:v h264_cuvid -i input.mp4 \
-vf "nvbufsurftransform=left=10:right=10:top=10:bottom=10:compute-hw=1,hwdownload,format=nv12" output_crop_gpu.yuv
Format conversion to RGBA:
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-c:v h264_cuvid -i input.mp4 \
-vf "nvbufsurftransform=format=rgba:compute-hw=1,hwdownload,format=rgba" output_gpu.rgb
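The VIC and GPU scale examples in this section differ only in the compute-hw=1 parameter, so a script that needs both can build the filter string in one place. The sketch below assumes a hypothetical helper named scale_filter; only the filter syntax itself is taken from the commands above.

```shell
# Build the nvbufsurftransform scale filter chain used in the examples above.
# $1 = width, $2 = height, $3 = "gpu" to append compute-hw=1 (anything else selects VIC).
scale_filter() {
    if [ "${3:-vic}" = "gpu" ]; then
        printf 'nvbufsurftransform=w=%s:h=%s:compute-hw=1,hwdownload,format=nv12\n' "$1" "$2"
    else
        printf 'nvbufsurftransform=w=%s:h=%s,hwdownload,format=nv12\n' "$1" "$2"
    fi
}

scale_filter 1920 1080 vic   # prints nvbufsurftransform=w=1920:h=1080,hwdownload,format=nv12
scale_filter 1920 1080 gpu   # prints nvbufsurftransform=w=1920:h=1080:compute-hw=1,hwdownload,format=nv12
```

The result can be passed to any of the scale commands as -vf "$(scale_filter 1920 1080 gpu)".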
JPEG Transcoding on the Jetson Thor Platform#
Using nvbufsurface Hardware Acceleration
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v nvjpegdec -i nvidia-logo.jpg -c:a copy -c:v nvjpegenc -b:v 5M output.jpg
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
-c:v nvjpegdec -i sample_720p.mjpeg -c:a copy -c:v nvjpegenc -b:v 5M output.mjpeg
Using CUDA Hardware Acceleration
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-c:v nvjpegdec -i nvidia-logo.jpg -c:a copy -c:v nvjpegenc -b:v 5M output.jpg
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
-c:v nvjpegdec -i sample_720p.mjpeg -c:a copy -c:v nvjpegenc -b:v 5M output.mjpeg
Decoding on the Jetson Orin Platform#
An application can use accelerated decoding to read video files in the following elementary and container formats and dump them in YUV 4:2:0 format:
H.264:
$ ffmpeg -c:v h264_nvv4l2dec -i <input file> <output file>
H.265:
$ ffmpeg -c:v hevc_nvv4l2dec -i <input file> <output file>
MPEG2:
$ ffmpeg -c:v mpeg2_nvv4l2dec -i <input file> <output file>
MPEG4:
$ ffmpeg -c:v mpeg4_nvv4l2dec -i <input file> <output file>
VP8:
$ ffmpeg -c:v vp8_nvv4l2dec -i <input file> <output file>
VP9:
$ ffmpeg -c:v vp9_nvv4l2dec -i <input file> <output file>
Note
The MPEG4 container file is not supported.
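The Orin decoder names above all follow the pattern of a codec prefix plus _nvv4l2dec, which makes it easy to select one from a script. The following is a small sketch; the function name orin_decoder and the h265 alias are illustrative assumptions, and the list of codecs is limited to those shown above.

```shell
# Map a codec name to the corresponding Orin decoder listed above.
# Prints the decoder name, or returns non-zero for a codec not in the list.
orin_decoder() {
    case "$1" in
        h264|hevc|mpeg2|mpeg4|vp8|vp9) echo "${1}_nvv4l2dec" ;;
        h265) echo "hevc_nvv4l2dec" ;;   # convenience alias, assumed for this sketch
        *) echo "unsupported codec: $1" >&2; return 1 ;;
    esac
}

orin_decoder vp9   # prints vp9_nvv4l2dec
```

A decode command can then be written as ffmpeg -c:v "$(orin_decoder hevc)" -i <input file> <output file>.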
Functional Flow of Decoding on the Jetson Orin Platform#
This section describes the functional flow of the FFmpeg decoding process on the Jetson Orin platform:
1. Call nvv4l2dec_init_decoder() to create a new V4L2 video decoder object on the device node /dev/nvhost-nvdec.
2. Call subscribe_event() to subscribe to resolution change events.
3. Call set_output_plane_format() to set the format on the output plane.
4. Call capture_thread() to start a capture thread.
5. Call set_capture_plane_format() to set the format on the capture plane.
6. Call nvv4l2dec_decode() to read buffers from ffmpeg and start the decoding process.
7. Call nvv4l2dec_decoder_get_frame() to get the hardware-accelerated decoded data and pass it to ffmpeg for dumping.
8. Call nvv4l2dec_decoder_close() to destroy the buffers and close the device.
The following table describes the key structures:
| Element | Description |
|---|---|
| | Contains information about input packets. |
| | Contains information about decoded frames. |
| | Specifies a codec type. |
| | Holds the buffer plane parameters. |
| | Holds the buffer information. |
| | Defines the decoder context. |
The following table describes the key functions:
| Element | Description |
|---|---|
| | Creates a new V4L2 Video Decoder object on the device node |
| | Subscribes to resolution change events. |
| | Sets the format on the output plane. |
| | Requests buffers on the output plane to be filled from the input bit stream. |
| | Dequeues the V4L2 buffer. |
| | Queues the V4L2 buffer. |
| | Starts the capture thread. |
| | Dequeues an event reported by the V4L2 device. |
| | Requests buffers on the capture plane. |
| | Sets the capture plane format. |