About the Noise Removal/Background Noise Suppression Effect

Speech recorded outside of a recording studio often contains considerable background noise. The Audio Denoiser Effect removes a variety of background noises from audio recordings. (For the types of noise that this effect removes, see Types of Background Noise Removed.)

This effect retains emotive tones in speech, such as happy, sad, excited, and angry tones, which previous versions of the SDK removed as noise. Extreme emotive cases, such as loud laughing, shrieking, screaming, and crying, might not be retained. Extremely loud noise in the input (SNR < 5 dB) might result in distorted output.
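For reference, SNR in decibels is ten times the base-10 logarithm of the ratio of signal power to noise power, so the 5 dB threshold corresponds to speech that carries only about 3.2 times the power of the noise. The following is a minimal sketch of that arithmetic. The snr_db helper is hypothetical and is not part of the SDK.

```shell
# Hypothetical helper (not part of the SDK): SNR in dB from two power values.
snr_db() {
    # awk's log() is the natural logarithm, so divide by log(10) for base 10.
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.1f\n", 10 * log(s / n) / log(10) }'
}

snr_db 10 1       # power ratio of 10, prints 10.0
snr_db 3.1623 1   # power ratio near the 5 dB threshold, prints 5.0
```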

Note

In this guide, the term Background Noise Suppression is used interchangeably with Denoising and Noise Removal (referred to as denoiser in the API).

To run the sample application on Windows for this effect, use the following commands:

# (One time, initial setup): Download models using models/download_models.ps1
powershell -ExecutionPolicy Bypass -File ./download_models.ps1 --gpu_architecture <gpu> --effects denoiser-16k,denoiser-48k

# Format: run_effect_demo.bat <architecture> <effect> <input_sample_rate> <output_sample_rate>

# 16k effect
run_effect_demo.bat turing denoiser 16k 16k

# 48k effect
run_effect_demo.bat ampere denoiser 48k 48k

Note

For more information, see Use the Helper Script to Run the Sample Application.

To run the sample application on Linux for this effect, use the following commands:

# (One time, initial setup): Download models using models/download_models.sh
./download_models.sh --gpu <gpu> --effects denoiser-16k,denoiser-48k

# Refer to Section 3.2 for further details
# Format: ./run_effect.sh -g <gpu> -s <sample_rate> -e denoiser

# 16k effect
./run_effect.sh -g t4 -s 16 -e denoiser

# 48k effect
./run_effect.sh -g t4 -s 48 -e denoiser

Note

For more information, see Use the Helper Script to Run the Sample Application.

This effect has the following characteristics:

Supported input/output audio format is 32-bit float audio with a sampling rate of 16 kHz or 48 kHz.
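Before running the effect, it might help to confirm that an input file matches this format. The following is a minimal sketch that assumes the audio is stored as a WAV file with a canonical header (fmt chunk starting at byte 12) and that the host is little-endian. The wav_uint and check_wav_format helpers are hypothetical and are not part of the SDK.

```shell
# Hypothetical helpers (not part of the SDK).
# Read an unsigned integer of <size> bytes at <offset> from <file>.
# od uses host byte order, so this assumes a little-endian machine.
wav_uint() {
    od -An -t "u$3" -j "$2" -N "$3" "$1" | tr -d '[:space:]'
}

# Check that a WAV file is 32-bit float audio sampled at 16 kHz or 48 kHz.
check_wav_format() {
    format_tag=$(wav_uint "$1" 20 2)    # 3 = IEEE float
    sample_rate=$(wav_uint "$1" 24 4)   # samples per second
    bits=$(wav_uint "$1" 34 2)          # bits per sample

    if [ "$format_tag" != "3" ] || [ "$bits" != "32" ]; then
        echo "unsupported: expected 32-bit float samples"
        return 1
    fi
    case "$sample_rate" in
        16000|48000) echo "ok: 32-bit float, $sample_rate Hz" ;;
        *) echo "unsupported: sample rate $sample_rate Hz"; return 1 ;;
    esac
}
```

For example, check_wav_format input.wav prints a line that starts with ok when the file can be passed to the effect without conversion.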

In the Linux SDK, this effect has the following maximum throughput (the number of batches supported in real time):

Architecture	Maximum Throughput for the 16K Effect	Maximum Throughput for the 48K Effect
T4	3240	1520
A100	15760	6750
A10	6410	2960
L40	14440	5400
H100	18500	8030
B100	25030	12160
RTX PRO 6000	23640	10640
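
The throughput figures can be used for rough capacity planning. The following is a minimal sketch that assumes each unit of maximum throughput corresponds to one stream processed in real time, which is an interpretation of the table and not a statement from the SDK. The gpus_needed helper is hypothetical.

```shell
# Hypothetical helper (not part of the SDK): GPUs needed for a target load.
# Assumes each unit of "maximum throughput" is one real-time stream.
gpus_needed() {
    streams="$1"    # concurrent real-time streams to serve
    per_gpu="$2"    # maximum throughput of one GPU, taken from the table
    # Integer ceiling division.
    echo $(( (streams + per_gpu - 1) / per_gpu ))
}

gpus_needed 10000 3240   # T4, 16k effect: prints 4
gpus_needed 10000 6750   # A100, 48k effect: prints 2
```

These are upper bounds measured at maximum throughput, so a real deployment might need additional headroom.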

Note

This effect might miss some noises in the first 1–2 seconds of input audio. Low-volume noises during speech might also be missed.

BNR 2.0 Experimental Model

BNR 2.0 is a newer version of the Noise Removal effect that improves the speech recognition accuracy of ASR systems and integrates seamlessly into pipelines where audio must be cleaned before it is passed to other subsystems. It supports all the noise profiles supported by the stable denoiser effect.

The BNR 2.0 preview is currently enabled through the effect_version parameter. For details on how to enable this mode, see Set the Parameters of an Audio Effect.

To run the sample application on Windows for this effect, change the effect_version variable in run_effect_demo.bat from 1 to 2 and run the following commands:

# (One time, initial setup): Download models using models/download_models.ps1
powershell -ExecutionPolicy Bypass -File ./download_models.ps1 --gpu_architecture <gpu> --effects denoiser-16k,denoiser-48k

# Format: run_effect_demo.bat <architecture> <effect> <input_sample_rate> <output_sample_rate>

# 16k effect
run_effect_demo.bat turing denoiser 16k 16k

# 48k effect
run_effect_demo.bat ampere denoiser 48k 48k

Note

For more information, see Use the Helper Script to Run the Sample Application. This effect cannot be chained with other effects using the Chaining API.

Note

With this release, Voice Activity Detection is enabled by default for this effect.

To run the sample application on Linux for this effect, use the following commands:

# (One time, initial setup): Download models using models/download_models.sh
./download_models.sh --gpu <gpu> --effects denoiser-16k,denoiser-48k

# Refer to Section 3.2 for further details
# Format: ./run_effect.sh -g <gpu> -s <sample_rate> -e denoiser -m 2

# 16k effect
./run_effect.sh -g t4 -s 16 -e denoiser -m 2

# 48k effect
./run_effect.sh -g t4 -s 48 -e denoiser -m 2

Note

For more information, see Use the Helper Script to Run the Sample Application.
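
The Linux commands for the stable denoiser and for BNR 2.0 differ only in the -m 2 flag. The following is a minimal sketch of a wrapper that prints the matching command line without running it. The denoiser_cmd function is hypothetical, it uses only the flags shown in this guide, and it assumes that omitting -m selects the stable model.

```shell
# Hypothetical wrapper (not part of the SDK): print the run_effect.sh command
# for the denoiser. Assumes that omitting -m selects the stable model.
denoiser_cmd() {
    gpu="$1"
    rate="$2"
    version="${3:-1}"   # 1 = stable denoiser, 2 = BNR 2.0
    case "$rate" in
        16|48) ;;
        *) echo "sample rate must be 16 or 48" >&2; return 1 ;;
    esac
    case "$version" in
        1) echo "./run_effect.sh -g $gpu -s $rate -e denoiser" ;;
        2) echo "./run_effect.sh -g $gpu -s $rate -e denoiser -m 2" ;;
        *) echo "effect version must be 1 or 2" >&2; return 1 ;;
    esac
}

denoiser_cmd t4 16      # stable denoiser, 16k
denoiser_cmd t4 48 2    # BNR 2.0, 48k
```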

This effect has the following characteristics:

Supported input/output audio format is 32-bit float audio with a sampling rate of 16 kHz or 48 kHz.

In the Linux SDK, this effect has the following maximum throughput (the number of batches supported in real time):

Architecture	Maximum Throughput for the 16K Effect	Maximum Throughput for the 48K Effect
T4	1350	410
A100	7880	3250
A10	3160	1540
L40	7220	3290
H100	9670	4160
B100	12160	3940
RTX PRO 6000	12160	6080
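
To gauge the cost of switching to BNR 2.0, these figures can be compared with the stable denoiser throughput listed earlier in this guide. The following is a minimal sketch of that comparison. The throughput_pct helper is hypothetical, and the input values are copied from the two tables.

```shell
# Hypothetical helper (not part of the SDK): BNR 2.0 throughput expressed
# as a percentage of the stable denoiser throughput.
throughput_pct() {
    awk -v v2="$1" -v v1="$2" 'BEGIN { printf "%.1f\n", 100 * v2 / v1 }'
}

throughput_pct 1350 3240   # T4, 16k effect: prints 41.7
throughput_pct 4160 8030   # H100, 48k effect: prints 51.8
```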