NVIDIA HBAO+ 2.4.

Overview

HBAO+ is an SSAO algorithm designed to achieve high efficiency on DX11 GPUs. The algorithm is based on HBAO [Bavoil and Sainz 2008], with the following differences:

  1. To minimize cache thrashing, HBAO+ does not use any randomization texture. Instead, the algorithm uses an Interleaved Rendering approach, generating the AO in multiple passes with a unique jitter value per pass [Bavoil and Jansen 2013].
  2. To avoid over-occlusion artifacts, HBAO+ uses a simpler AO approximation than HBAO, similar to “Scalable Ambient Obscurance” [McGuire et al. 2012] [Bukowski et al. 2012].
  3. To minimize flickering, HBAO+ is always rendered at full resolution, from full-resolution depths.
_images/hbao-plus-in-tom-clancys-splinter-cell-blacklist-2.jpg
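
The interleaved approach from item 1 can be sketched as a mapping from full-resolution pixels to passes. This is a minimal illustration, assuming a 4x4 interleaving pattern (16 passes); the pattern size and all names below are assumptions and are not taken from the library.

```cpp
#include <cassert>

// Hypothetical illustration: which interleaved pass shades a given
// full-resolution pixel, assuming a 4x4 pattern (16 passes in total).
constexpr int kPatternSize = 4;  // assumed, not stated by this documentation

struct InterleavedCoord {
    int pass;  // pass index (0..15); each pass uses a single jitter value
    int x;     // pixel coordinate inside that pass's quarter-resolution layer
    int y;
};

InterleavedCoord ToInterleaved(int fullResX, int fullResY) {
    assert(fullResX >= 0 && fullResY >= 0);
    InterleavedCoord c;
    // Pixels sharing the same position inside a 4x4 tile go to the same pass.
    c.pass = (fullResY % kPatternSize) * kPatternSize + (fullResX % kPatternSize);
    c.x = fullResX / kPatternSize;
    c.y = fullResY / kPatternSize;
    return c;
}
```

Under this mapping, all pixels handled by one pass share the same jitter value, so neighboring pixels within a pass follow the same sampling pattern.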

Package

doc/—this documentation page.

lib/—header file, import libraries and DLLs, for Win32, Win64 and Mac OS X.

samples/—source for sample applications demonstrating NVIDIA HBAO+.

Getting Started

  1. INITIALIZE THE LIBRARY:

    // Route the library's allocations through the global new/delete operators
    GFSDK_SSAO_CustomHeap CustomHeap;
    CustomHeap.new_ = ::operator new;
    CustomHeap.delete_ = ::operator delete;

    GFSDK_SSAO_Status status;
    GFSDK_SSAO_Context_D3D11* pAOContext;
    status = GFSDK_SSAO_CreateContext_D3D11(pD3D11Device, &pAOContext, &CustomHeap);
    assert(status == GFSDK_SSAO_OK); // HBAO+ requires feature level 11_0 or above

  2. SET INPUT DEPTHS:

    GFSDK_SSAO_InputData_D3D11 Input;
    Input.DepthData.DepthTextureType = GFSDK_SSAO_HARDWARE_DEPTHS;
    Input.DepthData.pFullResDepthTextureSRV = pDepthStencilTextureSRV;          // full-resolution depth texture
    Input.DepthData.ProjectionMatrix.Data = GFSDK_SSAO_Float4x4(pProjectionMatrix); // projection associated with the depths
    Input.DepthData.ProjectionMatrix.Layout = GFSDK_SSAO_ROW_MAJOR_ORDER;
    Input.DepthData.MetersToViewSpaceUnits = SceneScale;                        // view-space units per meter

  3. SET AO PARAMETERS:

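    GFSDK_SSAO_Parameters_D3D11 Params;
    Params.Radius = 2.f;                    // AO radius, in meters
    Params.Bias = 0.1f;                     // hides low-tessellation artifacts
    Params.PowerExponent = 2.f;             // FinalAO = pow(AO, PowerExponent)
    Params.Blur.Enable = true;              // optional cross-bilateral blur
    Params.Blur.Radius = GFSDK_SSAO_BLUR_RADIUS_8;
    Params.Blur.Sharpness = 4.f;            // higher values preserve edges more
    Params.Output.BlendMode = GFSDK_SSAO_OVERWRITE_RGB;

  4. RENDER AO: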

    status = pAOContext->RenderAO(pD3D11Context, &Input, &Params, pOutputColorRTV);
    assert(status == GFSDK_SSAO_OK);

Data Flow

Input Requirements

  • The library has entry points for DX11 and GL3.2.
  • Requires a depth texture to be provided as input, along with associated projection info.
  • Optionally, can also take as input a GBuffer normal texture associated with the input depth texture:
    • Can add normal-mapping details to the AO.
    • Can be used to fix normal reconstruction artifacts with dithered LOD dissolves.
    • But makes the integration more complex. We recommend starting with input normals disabled.
  • Optionally, can also take as input a viewport rectangle associated with the input textures:
    • Defines a sub-area of the input & output full-resolution textures to be sourced and rendered to.
    • The library re-allocates its internal render targets if the Viewport.Width or Viewport.Height changes for a given AO context.
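
The re-allocation rule above can be modeled with a small size cache. This is only a sketch of the described behavior; the struct and its members are hypothetical and do not correspond to library types.

```cpp
#include <cassert>

// Hypothetical cache mirroring the behavior described above: internal
// render targets are re-created only when the viewport size changes.
struct RenderTargetCache {
    unsigned width = 0;
    unsigned height = 0;
    unsigned reallocations = 0;

    // Returns true if the (simulated) render targets had to be re-allocated.
    bool Update(unsigned viewportWidth, unsigned viewportHeight) {
        assert(viewportWidth > 0 && viewportHeight > 0);
        if (viewportWidth == width && viewportHeight == height) {
            return false;  // same size: existing targets are reused
        }
        width = viewportWidth;
        height = viewportHeight;
        ++reallocations;
        return true;
    }
};
```

If this model is accurate, an application that changes the viewport size every frame on the same AO context would pay for a re-allocation on each of those frames.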

MSAA Support

  • If the input textures are MSAA then:
    • If Output.MSAAMode is set to PER_PIXEL_AO, the library internally uses only sample 0 and renders per-pixel AO (default & recommended path).
    • If Output.MSAAMode is set to PER_SAMPLE_AO, the library does one full pass per MSAA subsample, which is effectively super-sampling the AO (slow path).
  • In practice, we have found that the PER_PIXEL_AO strategy does not cause any objectionable artifacts, even when using HBAO+ with TXAA.

HBAO+ Pipeline

_images/pipeline_without_input_normals.png _images/pipeline_with_input_normals.png

CoarseAO & DetailAO

  • Coarse AO
    • For each pixel, 32 occlusion samples are taken in a variable-radius disk, with a minimum of 4 full-resolution pixels between the center pixel and the sample coordinates.
    • This occlusion contribution is weighted by the CoarseAO parameter, and is always computed.
  • Detail AO
    • Optionally, 4 occlusion samples can be taken to the left/right/top/bottom of the center pixel, to capture close-range occlusion details.
    • This occlusion contribution is weighted by the DetailAO parameter, and is skipped if DetailAO == 0.f.
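
One way to picture the two weighted contributions is the sketch below. The additive blend and the function name are assumptions made for illustration; this page does not state how the library combines the two terms.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical combination of the two occlusion terms. Each occlusion
// input is in [0, 1], where 0 means unoccluded. The additive blend is an
// assumption made for illustration only.
float CombineAO(float coarseOcclusion, float detailOcclusion,
                float coarseWeight, float detailWeight) {
    assert(coarseWeight >= 0.f && detailWeight >= 0.f);
    float occlusion = coarseWeight * coarseOcclusion;  // always computed
    if (detailWeight != 0.f) {
        // The detail term is skipped entirely when its weight is zero.
        occlusion += detailWeight * detailOcclusion;
    }
    // Convert occlusion to visibility: 1 = fully lit, 0 = fully occluded.
    return 1.f - std::min(1.f, std::max(0.f, occlusion));
}
```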

Parameters

AO Radius

Definition
  For a given AO receiver point P and AO Radius R, a sample point S is ignored if ||P-S|| > R.
Impact on search area
  The AO radius is a multiplier for the screen-space search radius, and the number of samples is fixed. So if the AO radius is too large, under-sampling artifacts may appear.
_images/AO_Radius_1.png _images/AO_Radius_4.png
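
The definition and its effect on the search area can be sketched as follows. The first function is the distance test as stated; the second is a hypothetical estimate of the projected radius for a perspective projection, where `projScaleY` (assumed to be 1 / tan(fovY / 2)) and the other names are illustrative and not library parameters.

```cpp
#include <cassert>
#include <cmath>

struct Float3 { float x, y, z; };

// Definition from the text: sample S is ignored if ||P - S|| > R.
bool IsSampleIgnored(const Float3& P, const Float3& S, float R) {
    const float dx = P.x - S.x, dy = P.y - S.y, dz = P.z - S.z;
    return dx * dx + dy * dy + dz * dz > R * R;  // compare squared lengths
}

// Hypothetical estimate of the screen-space search radius, in pixels, for
// a perspective projection. projScaleY is the vertical scale term of the
// projection matrix, i.e. 1 / tan(fovY / 2).
float SearchRadiusInPixels(float R, float viewDepth, float projScaleY,
                           float viewportHeight) {
    assert(viewDepth > 0.f);
    // A view-space length R at depth z covers R * projScaleY / z in NDC,
    // and NDC spans 2 units over the full viewport height.
    return R * projScaleY * 0.5f * viewportHeight / viewDepth;
}
```

Because the projected radius grows linearly with R while the sample count stays fixed, doubling R doubles the spacing between samples, which is where under-sampling artifacts come from.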

MetersToViewSpaceUnits

If you are not sure what to set this value to, you can:

  • Set the AO Radius parameter to 1.0 and
  • Increase MetersToViewSpaceUnits until the AO looks like it’s being cast up to 1 meter away

MetersToViewSpaceUnits is used internally:

  • To convert the AO radius parameter from meters to view-space units
  • To adjust the blur sharpness parameter
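
As a worked example of the first use, the conversion can be sketched as a single multiplication. The function name is hypothetical, and the exact internal computation is not shown on this page.

```cpp
#include <cassert>

// Converts an AO radius given in meters to view-space units.
// Example: if one view-space unit is one centimeter, then
// metersToViewSpaceUnits = 100 and a 2 m radius becomes 200 units.
float RadiusInViewSpaceUnits(float radiusInMeters, float metersToViewSpaceUnits) {
    assert(metersToViewSpaceUnits > 0.f);
    return radiusInMeters * metersToViewSpaceUnits;
}
```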

DetailAO

  • The DetailAO may cause over-darkening on alpha-tested vegetation (grass typically)
  • You may want to set DetailAO = 0.f in this case
    • As a bonus, this improves the performance because the DetailAO computation is fully skipped
_images/DetailAO_b1.png _images/DetailAO_b0.png

Power Exponent

  • The PowerExponent parameter controls the darkness of the final AO: FinalAO = pow(AO, PowerExponent).
  • Typical PowerExponent values are in the range [2.0, 3.0].
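
The formula above, written out as a small function (the function name is illustrative):

```cpp
#include <cassert>
#include <cmath>

// FinalAO = pow(AO, PowerExponent), with AO in [0, 1] (1 = unoccluded).
float ApplyPowerExponent(float ao, float powerExponent) {
    assert(ao >= 0.f && ao <= 1.f);
    return std::pow(ao, powerExponent);
}
```

For example, an AO value of 0.5 becomes 0.25 with an exponent of 2.0. Values of exactly 0 and 1 are unchanged, so only partially occluded pixels get darker as the exponent grows.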

AO Bias

  • The AO Bias parameter can be used to hide low-tessellation artifacts
  • Can also help reduce false-occlusion artifacts near screen borders
  • It weights the AO contributions more strongly for samples towards the normal direction
_images/AOBias_0_0.png _images/AOBias_0_3.png
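
A per-sample occlusion term with a bias can be sketched as below. The exact formula is an assumption for illustration (a cosine term offset by the bias, with a distance falloff that reaches zero at the AO radius); it is not taken from the library, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

static float Dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

static float Saturate(float v) { return std::min(1.f, std::max(0.f, v)); }

// Hypothetical per-sample occlusion with an angle bias.
// P: receiver position, N: unit normal at P, S: sample position,
// R: AO radius, bias: in [0, 1).
float SampleOcclusion(const Vec3& P, const Vec3& N, const Vec3& S,
                      float R, float bias) {
    assert(R > 0.f && bias >= 0.f && bias < 1.f);
    const Vec3 V{S.x - P.x, S.y - P.y, S.z - P.z};
    const float distSq = Dot(V, V);
    if (distSq == 0.f) return 0.f;  // sample coincides with the receiver
    const float nDotV = Dot(N, V) / std::sqrt(distSq);       // cosine to the normal
    const float falloff = Saturate(1.f - distSq / (R * R));  // zero beyond R
    // Samples at grazing angles (nDotV <= bias) contribute nothing; the
    // rescale keeps a sample along the normal at full strength.
    return Saturate(nDotV - bias) / (1.f - bias) * falloff;
}
```

In this sketch, a sample almost in the tangent plane of the receiver (typical of coarse tessellation) stops contributing once the bias exceeds its cosine, while samples toward the normal direction keep most of their weight.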

AO Blur

  • Optional cross-bilateral filter
    • To remove jittered-sampling artifacts (noise)
    • To hide under-sampling artifacts (banding) & reduce jittering
  • 4 Modes
    • No Blur
    • Blur Radius 2
    • Blur Radius 4
    • Blur Radius 8
_images/No_Blur.png _images/Blur_Radius_8.png

Blur Sharpness

The greater the sharpness parameter, the more the blur preserves edges

_images/Blur_Sharpness_0.png _images/Blur_Sharpness_8.png
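
The effect of the blur radius and sharpness can be illustrated with a 1D cross-bilateral filter. This is a generic sketch, assuming a Gaussian spatial weight and a depth weight scaled by the sharpness; the exact kernel used by the library is not documented here, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical 1D cross-bilateral blur over a row of AO values, guided by
// view-space depth. A Gaussian spatial weight is combined with a depth
// weight that shrinks as (depth difference * sharpness) grows.
std::vector<float> CrossBilateralBlur1D(const std::vector<float>& ao,
                                        const std::vector<float>& depth,
                                        int radius, float sharpness) {
    assert(ao.size() == depth.size() && radius >= 0);
    const float sigma = (radius + 1) * 0.5f;
    const float spatialFalloff = 1.f / (2.f * sigma * sigma);
    const int n = static_cast<int>(ao.size());
    std::vector<float> result(ao.size());
    for (int i = 0; i < n; ++i) {
        float sum = 0.f, weightSum = 0.f;
        for (int r = -radius; r <= radius; ++r) {
            const int j = i + r;
            if (j < 0 || j >= n) continue;  // skip taps outside the row
            const float dz = (depth[j] - depth[i]) * sharpness;
            const float w = std::exp(-r * r * spatialFalloff - dz * dz);
            sum += w * ao[j];
            weightSum += w;
        }
        result[i] = sum / weightSum;  // the center tap has weight 1, so weightSum >= 1
    }
    return result;
}
```

With a sharpness of 0 the filter degenerates to a plain Gaussian blur that smears AO across depth discontinuities; with a large sharpness, taps on the other side of a depth edge receive a weight close to zero.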

Blur Sharpness Profile

Optionally, a larger blur sharpness can be used for a user-defined foreground depth range

_images/Blur_Sharpness_Profile_Enable_false.png _images/Blur_Sharpness_Profile_Enable_true.png
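
A depth-dependent sharpness of this kind can be sketched as a scale factor that applies in the foreground and fades out with distance. The linear ramp and every name below are assumptions for illustration, not the library's parameters.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical sharpness profile: the blur sharpness is multiplied by
// foregroundScale for depths at or before foregroundDepth, and the scale
// fades linearly to 1 at backgroundDepth.
float EffectiveSharpness(float baseSharpness, float viewDepth,
                         float foregroundDepth, float backgroundDepth,
                         float foregroundScale) {
    assert(backgroundDepth > foregroundDepth);
    float t = (viewDepth - foregroundDepth) / (backgroundDepth - foregroundDepth);
    t = std::min(1.f, std::max(0.f, t));  // 0 in the foreground, 1 in the background
    const float scale = foregroundScale + (1.f - foregroundScale) * t;
    return baseSharpness * scale;
}
```

For instance, with a base sharpness of 4 and a foreground scale of 4, pixels at the foreground depth are blurred with a sharpness of 16, pixels halfway to the background depth with 10, and pixels at or beyond the background depth with the unscaled value of 4.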

Integration Time Estimates

  • Initial integration (for a rendering engineer)
    • <1 man-day with no input normals
    • 1-2 man-days with input normals
  • Initial parameter tuning
    • <1 man-hour
    • Tuning the parameters should be quick once the input data are correctly fed into the library
    • The same HBAO+ parameters may be used globally across the whole game