About the Noise Removal/Background Noise Suppression Effect

Speech recorded outside of a recording studio often contains significant background noise. The Audio Denoiser Effect removes a variety of background noises from audio recordings. (For the types of noise that this effect removes, see Types of Background Noise Removed.)

This effect retains emotive tones in speech, such as happy, sad, excited, and angry tones, which previous versions of the SDK removed as noise. Extreme emotive cases, such as loud laughing, shrieking, screaming, and crying, might not be retained. Extremely loud noise in the input (SNR < 5 dB) might result in distorted output.
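To make the SNR < 5 dB figure concrete, the following minimal sketch computes a signal-to-noise ratio in decibels from a speech segment and a noise segment. It assumes that separate speech-only and noise-only samples are available, which in practice usually has to be estimated. The helper names `mean_power` and `snr_db` are illustrative and are not part of the SDK.

```python
import math

def mean_power(samples):
    """Mean squared amplitude of a sequence of float samples."""
    return sum(s * s for s in samples) / len(samples)

def snr_db(speech, noise):
    """Signal-to-noise ratio in dB: 10 * log10(P_speech / P_noise)."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))

# Synthetic one-second example at 16 kHz. A 440 Hz tone stands in for speech,
# and a 1 kHz tone at half the amplitude stands in for noise.
rate = 16000
speech = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
noise = [0.25 * math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]

value = snr_db(speech, noise)
print(f"SNR = {value:.2f} dB")  # half the amplitude means about 6.02 dB
```

An amplitude ratio of 2 corresponds to a power ratio of 4, or roughly 6 dB, so this synthetic input sits just above the 5 dB level named above.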

Note

In this guide, the term Background Noise Suppression is used interchangeably with Denoising and Noise Removal (referred to as denoiser in the API).

To run the sample application on Windows for this effect, use the following commands:

# (One time, initial setup): Download models using models/download_models.ps1
powershell -ExecutionPolicy Bypass -File ./download_models.ps1 --gpu_architecture <gpu> --effects denoiser-16k,denoiser-48k

# Format: run_effect_demo.bat <architecture> <effect> <input_sample_rate> <output_sample_rate>

# 16k effect
run_effect_demo.bat turing denoiser 16k 16k

# 48k effect
run_effect_demo.bat ampere denoiser 48k 48k

Note

For more information, see Use the Helper Script to Run the Sample Application.

To run the sample application on Linux for this effect, use the following commands:

# (One time, initial setup): Download models using models/download_models.sh
./download_models.sh --gpu <gpu> --effects denoiser-16k,denoiser-48k

# Refer to Section 3.2 for further details
# Format: ./run_effect.sh -g <gpu> -s <sample_rate> -e denoiser

# 16k effect
./run_effect.sh -g t4 -s 16 -e denoiser

# 48k effect
./run_effect.sh -g t4 -s 48 -e denoiser

Note

For more information, see Use the Helper Script to Run the Sample Application.

This effect has the following characteristics:

  • Supported input/output audio format is 32-bit float audio with a sampling rate of 16 kHz or 48 kHz.

  • In the Linux SDK, this effect has the following maximum throughput (the number of batches supported in real time):

    Architecture    Maximum Throughput for the 16K Effect    Maximum Throughput for the 48K Effect
    T4              3240                                     1520
    A100            15760                                    6750
    A10             6410                                     2960
    L40             14440                                    5400
    H100            18500                                    8030
    B100            25030                                    12160
    RTX PRO 6000    23640                                    10640
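Because the effect accepts only 32-bit float audio at 16 kHz or 48 kHz, audio captured as 16-bit PCM has to be converted first. The sketch below shows one way to do that conversion in memory with the Python standard library. It assumes that float samples are normalized to the range [-1.0, 1.0); confirm the expected range against the SDK documentation. The function name `pcm16_to_float32` is hypothetical.

```python
from array import array

# Sampling rates listed in the characteristics above.
SUPPORTED_RATES_HZ = (16000, 48000)

def pcm16_to_float32(pcm_samples, sample_rate_hz):
    """Convert signed 16-bit PCM samples to a 32-bit float buffer.

    Assumption: output is normalized to [-1.0, 1.0) by dividing by 32768.
    """
    if sample_rate_hz not in SUPPORTED_RATES_HZ:
        raise ValueError(f"unsupported sample rate: {sample_rate_hz} Hz")
    # Typecode "f" stores each element as a C float (4 bytes on common platforms).
    return array("f", (s / 32768.0 for s in pcm_samples))

pcm = array("h", [0, 16384, -32768, 32767])
floats = pcm16_to_float32(pcm, 16000)
print(floats.itemsize, list(floats))
```

This sketch does not resample. Audio recorded at another rate, such as 44.1 kHz, would need to be resampled to 16 kHz or 48 kHz before conversion.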

Note

This effect might miss some noises in the first 1–2 seconds of input audio. Low-volume noises during speech might also be missed.

BNR 2.0 Experimental Model

BNR 2.0 is a newer version of the Noise Removal effect that improves the speech recognition accuracy of downstream ASR systems and is intended for pipelines that clean audio before passing it to other subsystems. It supports all the noise profiles supported by the stable denoiser effect.

The BNR 2.0 preview is currently enabled through the effect_version parameter. For details on how to enable this mode, see Set the Parameters of an Audio Effect.

To run the sample application on Windows for this effect, change the effect_version variable in run_effect_demo.bat from 1 to 2 and run the following commands:

# (One time, initial setup): Download models using models/download_models.ps1
powershell -ExecutionPolicy Bypass -File ./download_models.ps1 --gpu_architecture <gpu> --effects denoiser-16k,denoiser-48k

# Format: run_effect_demo.bat <architecture> <effect> <input_sample_rate> <output_sample_rate>

# 16k effect
run_effect_demo.bat turing denoiser 16k 16k

# 48k effect
run_effect_demo.bat ampere denoiser 48k 48k

Note

For more information, see Use the Helper Script to Run the Sample Application. This effect cannot be chained with other effects using the Chaining API.

Note

With this release, Voice Activity Detection is enabled by default for this effect.

To run the sample application on Linux for this effect, use the following commands:

# (One time, initial setup): Download models using models/download_models.sh
./download_models.sh --gpu <gpu> --effects denoiser-16k,denoiser-48k

# Refer to Section 3.2 for further details
# Format: ./run_effect.sh -g <gpu> -s <sample_rate> -e denoiser -m 2

# 16k effect
./run_effect.sh -g t4 -s 16 -e denoiser -m 2

# 48k effect
./run_effect.sh -g t4 -s 48 -e denoiser -m 2

Note

For more information, see Use the Helper Script to Run the Sample Application.
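When the helper script is driven from automation, it can help to build its arguments in one place. The following sketch assembles the argument list for run_effect.sh from the Format lines shown in this guide. It assumes that `-m 2` selects BNR 2.0 and that omitting `-m` selects the stable effect, as the commands above suggest. The function `build_run_effect_args` is hypothetical and is not provided by the SDK.

```python
import shlex

def build_run_effect_args(gpu, sample_rate_khz, effect_version=1):
    """Return the argv list for run_effect.sh, following the Format lines above."""
    if sample_rate_khz not in (16, 48):
        raise ValueError("sample rate must be 16 or 48 (kHz)")
    if effect_version not in (1, 2):
        raise ValueError("effect_version must be 1 (stable) or 2 (BNR 2.0)")
    args = ["./run_effect.sh", "-g", gpu, "-s", str(sample_rate_khz), "-e", "denoiser"]
    if effect_version == 2:
        # Assumption: -m 2 maps to effect_version 2, as in the BNR 2.0 commands.
        args += ["-m", "2"]
    return args

stable_args = build_run_effect_args("t4", 16)
bnr2_args = build_run_effect_args("t4", 48, effect_version=2)
print(shlex.join(stable_args))
print(shlex.join(bnr2_args))
```

The returned list can be passed to `subprocess.run(args, check=True)` from the directory that contains the script. Passing a list instead of a single string avoids shell quoting problems.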

This effect has the following characteristics:

  • Supported input/output audio format is 32-bit float audio with a sampling rate of 16 kHz or 48 kHz.

  • In the Linux SDK, this effect has the following maximum throughput (the number of batches supported in real time):

    Architecture    Maximum Throughput for the 16K Effect    Maximum Throughput for the 48K Effect
    T4              1350                                     410
    A100            7880                                     3250
    A10             3160                                     1540
    L40             7220                                     3290
    H100            9670                                     4160
    B100            12160                                    3940
    RTX PRO 6000    12160                                    6080
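The throughput figures can be used for rough capacity planning. The sketch below estimates how many GPUs a deployment might need for a target number of concurrent streams. It assumes that one unit of throughput corresponds to one real-time stream, and it applies a `headroom` factor, an illustrative safety margin that does not come from the SDK. The values are copied from the BNR 2.0 table above, and the function name `gpus_needed` is hypothetical.

```python
import math

# BNR 2.0 maximum throughput on Linux, from the table above: (16K effect, 48K effect).
BNR2_MAX_THROUGHPUT = {
    "T4": (1350, 410),
    "A100": (7880, 3250),
    "A10": (3160, 1540),
    "L40": (7220, 3290),
    "H100": (9670, 4160),
    "B100": (12160, 3940),
    "RTX PRO 6000": (12160, 6080),
}

def gpus_needed(streams, gpu, sample_rate_khz, headroom=0.8):
    """Estimate the GPU count for `streams` concurrent real-time streams.

    Assumption: each throughput unit equals one stream; `headroom` is the
    fraction of the maximum throughput that the deployment is willing to use.
    """
    if sample_rate_khz not in (16, 48):
        raise ValueError("sample rate must be 16 or 48 (kHz)")
    per_gpu = BNR2_MAX_THROUGHPUT[gpu][0 if sample_rate_khz == 16 else 1]
    usable = math.floor(per_gpu * headroom)
    return math.ceil(streams / usable)

t4_count = gpus_needed(5000, "T4", 16)
a100_count = gpus_needed(5000, "A100", 48)
print(t4_count, a100_count)
```

Treat the result as a starting estimate only. Measured throughput depends on the deployment, so benchmark on the target hardware before sizing a production system.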