Accelerated Decode with FFmpeg

The NVIDIA ffmpeg package supports hardware-accelerated decoding on NVIDIA® Jetson™ devices.

On the Jetson Thor platform, FFmpeg supports hardware-accelerated decoding, encoding, and transcoding using both nvbufsurface and cuda hardware acceleration interfaces. Hardware-accelerated video transformation operations using VIC are supported with the nvbufsurface interface. GPU-based transformation operations are supported with both nvbufsurface and cuda interfaces.

Install ffmpeg Binary Package in Jetson Linux Builds

To install the ffmpeg binary package in Jetson Linux Builds, enter the following commands:

$ sudo apt update
$ sudo apt install -y ffmpeg
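
After installation, it can be useful to confirm that the build exposes the accelerated codecs before running the commands on this page. The following is a minimal sketch: the helper name nv_codecs is hypothetical, and the name patterns it matches are simply the codec names used elsewhere on this page.

```shell
#!/bin/sh
# nv_codecs (hypothetical helper): reads the output of `ffmpeg -decoders`
# or `ffmpeg -encoders` on stdin and prints only the codec names (second
# column) that match the accelerated codecs named on this page.
nv_codecs() {
    awk '$2 ~ /(_cuvid|_nvenc|_nvv4l2dec)$/ || $2 ~ /^nvjpeg(dec|enc)$/ { print $2 }'
}

# Usage on a Jetson device with the package installed:
#   ffmpeg -hide_banner -decoders | nv_codecs
#   ffmpeg -hide_banner -encoders | nv_codecs
#   ffmpeg -hide_banner -hwaccels
```

An empty result suggests that a different ffmpeg build is first in PATH.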

Get Source Files for the ffmpeg Package

To get source files for the ffmpeg package, enter the following command:

$ apt source ffmpeg

Prerequisite for the Jetson Thor Platform

Install the CUDA Toolkit from the latest JetPack release.

Decoding on the Jetson Thor Platform

Using nvbufsurface Hardware Acceleration

  • H.264 decode:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -c:v h264_cuvid -i input.mp4 -vf "hwdownload,format=nv12" output.yuv
    
  • H.265 decode:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -c:v hevc_cuvid -i input.mp4 -vf "hwdownload,format=nv12" output.yuv
    
  • Decode with automatic codec detection:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -i input.mp4 -vf "hwdownload,format=nv12" output.yuv
    

Using CUDA Hardware Acceleration

  • H.264 decode:

    $ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
        -c:v h264_cuvid -i input.mp4 -vf "hwdownload,format=nv12" output.yuv
    
  • H.265 decode:

    $ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
        -c:v hevc_cuvid -i input.mp4 -vf "hwdownload,format=nv12" output.yuv
    
  • Decode with automatic codec detection:

    $ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
        -i input.mp4 -vf "hwdownload,format=nv12" output.yuv
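
The decode commands above write headerless NV12 frames, so anything that reads output.yuv must already know the resolution. NV12 stores a full-resolution luma plane followed by an interleaved chroma plane at quarter resolution, which gives 1.5 bytes per pixel. The sketch below uses that arithmetic to check a dump; the function names are hypothetical and even frame dimensions are assumed.

```shell
#!/bin/sh
# nv12_frame_bytes WIDTH HEIGHT
# One Y byte per pixel plus an interleaved UV plane at quarter resolution:
# WIDTH * HEIGHT * 3 / 2 bytes per frame (even dimensions assumed).
nv12_frame_bytes() {
    echo $(( $1 * $2 * 3 / 2 ))
}

# nv12_frame_count FILE_BYTES WIDTH HEIGHT
# Number of whole frames in a raw NV12 dump of FILE_BYTES bytes.
nv12_frame_count() {
    echo $(( $1 / ($2 * $3 * 3 / 2) ))
}

# Example for a 1920x1080 stream:
#   nv12_frame_bytes 1920 1080
#   nv12_frame_count "$(wc -c < output.yuv)" 1920 1080
```

If the file size is not a whole multiple of the frame size, the assumed resolution is probably wrong.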
    

Encoding on the Jetson Thor Platform

Using nvbufsurface Hardware Acceleration

  • H.264 encode:

    $ ffmpeg -f rawvideo -pix_fmt nv12 -s 320x240 -i output.yuv \
        -vf hwupload_nvbufsurface -c:v h264_nvenc output.mp4
    
  • H.265 encode:

    $ ffmpeg -f rawvideo -pix_fmt nv12 -s 320x240 -i output.yuv \
        -vf hwupload_nvbufsurface -c:v hevc_nvenc output.mp4
    

Using CUDA Hardware Acceleration

  • H.264 encode:

    $ ffmpeg -f rawvideo -pix_fmt nv12 -s 320x240 -i output.yuv \
        -vf hwupload_cuda -c:v h264_nvenc output.mp4
    
  • H.265 encode:

    $ ffmpeg -f rawvideo -pix_fmt nv12 -s 320x240 -i output.yuv \
        -vf hwupload_cuda -c:v hevc_nvenc output.mp4
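
The encode commands above hard-code a 320x240 NV12 input. Because raw video carries no header, the -s value must match the frames actually stored in the file. The sketch below prints the equivalent command for another size, codec, and interface so that it can be inspected before it is run. The function name encode_cmd and its argument order are hypothetical; the options it emits are the ones used in the commands above.

```shell
#!/bin/sh
# encode_cmd WIDTH HEIGHT CODEC [INTERFACE]
# CODEC is h264 or hevc; INTERFACE is cuda (default) or nvbufsurface.
# Prints the ffmpeg command instead of running it (dry run).
encode_cmd() {
    w=$1; h=$2; codec=$3; iface=${4:-cuda}
    case $iface in
        cuda)         upload=hwupload_cuda ;;
        nvbufsurface) upload=hwupload_nvbufsurface ;;
        *) echo "unknown interface: $iface" >&2; return 1 ;;
    esac
    case $codec in
        h264|hevc) ;;
        *) echo "unknown codec: $codec" >&2; return 1 ;;
    esac
    echo "ffmpeg -f rawvideo -pix_fmt nv12 -s ${w}x${h} -i output.yuv -vf $upload -c:v ${codec}_nvenc output.mp4"
}

# Usage:
#   encode_cmd 1280 720 hevc nvbufsurface
```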
    

Transcoding on the Jetson Thor Platform

Using nvbufsurface Hardware Acceleration

  • Transcode to H.264 with explicit codec selection:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -c:v h264_cuvid -i input.mp4 -c:a copy -c:v h264_nvenc -b:v 5M output.mp4
    
  • Transcode to H.264 with automatic codec detection:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -i input.mp4 -c:a copy -c:v h264_nvenc -b:v 5M output.mp4
    
  • Transcode to H.265 with explicit codec selection:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -c:v h264_cuvid -i input.mp4 -c:a copy -c:v hevc_nvenc -b:v 5M output.mp4
    
  • Transcode to H.265 with automatic codec detection:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -i input.mp4 -c:a copy -c:v hevc_nvenc -b:v 5M output.mp4
    

Using CUDA Hardware Acceleration

  • Transcode to H.264 with explicit codec selection:

    $ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
        -c:v h264_cuvid -i input.mp4 -c:a copy -c:v h264_nvenc -b:v 5M output.mp4
    
  • Transcode to H.264 with automatic codec detection:

    $ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
        -i input.mp4 -c:a copy -c:v h264_nvenc -b:v 5M output.mp4
    
  • Transcode to H.265 with explicit codec selection:

    $ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
        -c:v h264_cuvid -i input.mp4 -c:a copy -c:v hevc_nvenc -b:v 5M output.mp4
    
  • Transcode to H.265 with automatic codec detection:

    $ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
        -i input.mp4 -c:a copy -c:v hevc_nvenc -b:v 5M output.mp4
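
To apply one of the transcode commands above to several files, the command can be wrapped in a loop. The following is a sketch only: the function names and the output naming scheme (input_hevc.mp4 next to input.mp4) are assumptions made for illustration, and the loop uses the CUDA interface with automatic codec detection.

```shell
#!/bin/sh
# transcode_name INPUT CODEC
# Derives the output name by replacing the extension: a/b.mp4 -> a/b_CODEC.mp4
transcode_name() {
    base=${1%.*}
    echo "${base}_$2.mp4"
}

# transcode_all CODEC FILE...
# CODEC is h264 or hevc. Stops at the first file that fails.
# -n keeps ffmpeg from overwriting an existing output file;
# </dev/null keeps ffmpeg from reading the terminal inside the loop.
transcode_all() {
    codec=$1; shift
    for f in "$@"; do
        ffmpeg -hide_banner -n -hwaccel cuda -hwaccel_output_format cuda \
            -i "$f" -c:a copy -c:v "${codec}_nvenc" -b:v 5M \
            "$(transcode_name "$f" "$codec")" </dev/null || return 1
    done
}

# Usage:
#   transcode_all hevc clips/*.mp4
```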
    

Decoding and Transforming on the Jetson Thor Platform

Transform operations can be performed using VIC (Video Image Compositor) or GPU-based compute.

Using VIC

  • Scale:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -c:v h264_cuvid -i input.mp4 \
        -vf "nvbufsurftransform=w=1920:h=1080,hwdownload,format=nv12" output_scale.yuv
    
  • Crop:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -c:v h264_cuvid -i input.mp4 \
        -vf "nvbufsurftransform=left=10:right=10:top=10:bottom=10,hwdownload,format=nv12" output_crop.yuv
    
  • Format conversion to RGBA:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -c:v h264_cuvid -i input.mp4 \
        -vf "nvbufsurftransform=format=rgba,hwdownload,format=rgba" output.rgb
    

Using GPU

To use GPU-based compute instead of VIC, add compute-hw=1 to the nvbufsurftransform filter parameters.

Using nvbufsurface Hardware Acceleration

  • Scale:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -c:v h264_cuvid -i input.mp4 \
        -vf "nvbufsurftransform=w=1920:h=1080:compute-hw=1,hwdownload,format=nv12" output_scale_gpu.yuv
    
  • Crop:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -c:v h264_cuvid -i input.mp4 \
        -vf "nvbufsurftransform=left=10:right=10:top=10:bottom=10:compute-hw=1,hwdownload,format=nv12" output_crop_gpu.yuv
    
  • Format conversion to RGBA:

    $ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
        -c:v h264_cuvid -i input.mp4 \
        -vf "nvbufsurftransform=format=rgba:compute-hw=1,hwdownload,format=rgba" output_gpu.rgb
    

Using CUDA Hardware Acceleration

  • Scale:

    $ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
        -c:v h264_cuvid -i input.mp4 \
        -vf "nvbufsurftransform=w=1920:h=1080:compute-hw=1,hwdownload,format=nv12" output_scale_gpu.yuv
    
  • Crop:

    $ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
        -c:v h264_cuvid -i input.mp4 \
        -vf "nvbufsurftransform=left=10:right=10:top=10:bottom=10:compute-hw=1,hwdownload,format=nv12" output_crop_gpu.yuv
    
  • Format conversion to RGBA:

    $ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
        -c:v h264_cuvid -i input.mp4 \
        -vf "nvbufsurftransform=format=rgba:compute-hw=1,hwdownload,format=rgba" output_gpu.rgb
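
The VIC and GPU variants above differ only in the compute-hw=1 parameter. The sketch below builds the -vf string from an engine name and a set of nvbufsurftransform options; the function name nvbuf_filter and its argument order are hypothetical, while the filter parameters are the ones used in the commands above.

```shell
#!/bin/sh
# nvbuf_filter ENGINE OPTIONS [DOWNLOAD_FORMAT]
# ENGINE is vic or gpu; OPTIONS is a colon-separated nvbufsurftransform
# option list; DOWNLOAD_FORMAT defaults to nv12.
nvbuf_filter() {
    engine=$1; opts=$2; fmt=${3:-nv12}
    case $engine in
        vic) ;;
        gpu) opts="$opts:compute-hw=1" ;;
        *) echo "engine must be vic or gpu" >&2; return 1 ;;
    esac
    echo "nvbufsurftransform=$opts,hwdownload,format=$fmt"
}

# Usage (GPU scale through the CUDA interface):
#   ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
#       -c:v h264_cuvid -i input.mp4 \
#       -vf "$(nvbuf_filter gpu w=1920:h=1080)" output_scale_gpu.yuv
```

VIC transformations are supported only with the nvbufsurface interface, so the vic engine should be paired with -hwaccel nvbufsurface.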
    

JPEG Transcoding on the Jetson Thor Platform

Using nvbufsurface Hardware Acceleration

$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
    -c:v nvjpegdec -i nvidia-logo.jpg -c:a copy -c:v nvjpegenc -b:v 5M output.jpg
$ ffmpeg -hwaccel nvbufsurface -hwaccel_output_format nvbufsurface \
    -c:v nvjpegdec -i sample_720p.mjpeg -c:a copy -c:v nvjpegenc -b:v 5M output.mjpeg

Using CUDA Hardware Acceleration

$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
    -c:v nvjpegdec -i nvidia-logo.jpg -c:a copy -c:v nvjpegenc -b:v 5M output.jpg
$ ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
    -c:v nvjpegdec -i sample_720p.mjpeg -c:a copy -c:v nvjpegenc -b:v 5M output.mjpeg

Decoding on the Jetson Orin Platform

An application can use accelerated decoding to read video files in the following elementary-stream and container formats and write the decoded frames in YUV 4:2:0 format:

  • H.264:

    $ ffmpeg -c:v h264_nvv4l2dec -i <input file> <output file>
    
  • H.265:

    $ ffmpeg -c:v hevc_nvv4l2dec -i <input file> <output file>
    
  • MPEG2:

    $ ffmpeg -c:v mpeg2_nvv4l2dec -i <input file> <output file>
    
  • MPEG4:

    $ ffmpeg -c:v mpeg4_nvv4l2dec -i <input file> <output file>
    
  • VP8:

    $ ffmpeg -c:v vp8_nvv4l2dec -i <input file> <output file>
    
  • VP9:

    $ ffmpeg -c:v vp9_nvv4l2dec -i <input file> <output file>
    

Note

The MPEG4 container file is not supported.
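
On Orin the decoder must be named explicitly, so a script that handles mixed inputs needs to map each stream's codec to the matching decoder from the list above. The following sketch does that for the codec names that ffprobe reports; the function name orin_decoder is hypothetical.

```shell
#!/bin/sh
# orin_decoder CODEC_NAME
# Maps an ffprobe codec_name to the corresponding nvv4l2dec decoder
# listed above. Returns non-zero for codecs that are not in the list.
orin_decoder() {
    case $1 in
        h264)       echo h264_nvv4l2dec ;;
        hevc)       echo hevc_nvv4l2dec ;;
        mpeg2video) echo mpeg2_nvv4l2dec ;;
        mpeg4)      echo mpeg4_nvv4l2dec ;;
        vp8)        echo vp8_nvv4l2dec ;;
        vp9)        echo vp9_nvv4l2dec ;;
        *) echo "no accelerated decoder listed for: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   codec=$(ffprobe -v error -select_streams v:0 \
#             -show_entries stream=codec_name -of csv=p=0 input.mp4)
#   ffmpeg -c:v "$(orin_decoder "$codec")" -i input.mp4 output.yuv
```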

Functional Flow of Decoding on the Jetson Orin Platform

This section describes the functional flow of the FFmpeg decoding process on the Jetson Orin platform:

  1. Call nvv4l2dec_create_decoder() to create a new V4L2 video decoder object on the device node /dev/nvhost-nvdec.

  2. Call subscribe_event() to subscribe to resolution change events.

  3. Call set_output_plane_format() to set the format on the output plane.

  4. Call capture_thread() to start a capture thread.

  5. Call set_capture_plane_format() to set the format on the capture plane.

  6. Call nvv4l2dec_decode() to read buffers from ffmpeg and start the decoding process.

  7. Call nvv4l2dec_decoder_get_frame() to get the hardware-accelerated decoded data and pass it to ffmpeg for dumping.

  8. Call nvv4l2dec_decoder_close() to destroy the buffers and close the device.

The following table describes the key structures:

Element         Description
nvPacket        Contains information about input packets.
nvFrame         Contains information about decoded frames.
nvCodingType    Specifies a codec type.
BufferPlane     Holds the buffer plane parameters.
Buffer          Holds the buffer information.
context_t       Defines the decoder context.

The following table describes the key functions:

Element                         Description
nvv4l2dec_create_decoder        Creates a new V4L2 video decoder object on the device node /dev/nvhost-nvdec.
subscribe_event                 Subscribes to resolution change events.
set_output_plane_format         Sets the format on the output plane.
req_buffers_on_output_plane     Requests buffers on the output plane to be filled from the input bit stream.
dq_buffer                       Dequeues the V4L2 buffer.
q_buffer                        Queues the V4L2 buffer.
capture_thread                  Starts the capture thread.
dq_event                        Dequeues an event reported by the V4L2 device.
req_buffers_on_capture_plane    Requests buffers on the capture plane.
set_capture_plane_format        Sets the format on the capture plane.