About the Noise Removal/Background Noise Suppression Effect
Speech recorded outside a recording studio often contains substantial background noise. The Audio Denoiser Effect removes a variety of background noises from audio recordings. (For the types of noise that this effect removes, see Types of Background Noise Removed.)
This effect retains emotive tones in speech, such as happy, sad, excited, and angry tones, which previous versions of the SDK removed as noise. Extreme emotive cases, such as loud laughing, shrieking, screaming, and crying, might not be retained. Extremely loud noise in the input (SNR < 5 dB) might result in distorted output.
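To make the SNR < 5 dB threshold concrete, the following minimal sketch computes a signal-to-noise ratio in decibels from mean power. It is not part of the SDK. The function names are hypothetical, and the sketch assumes you have separate estimates of the speech and the noise (for example, a noise-only segment of the recording), which is often not the case in practice.

```python
import math

# Threshold quoted in the text above: input below this SNR might be distorted.
LOW_SNR_THRESHOLD_DB = 5.0


def snr_db(signal, noise):
    """Return the signal-to-noise ratio in dB from two sample sequences.

    Uses mean power: SNR_dB = 10 * log10(P_signal / P_noise).
    """
    if not signal or not noise:
        raise ValueError("signal and noise must be non-empty")
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        return math.inf   # no noise at all
    if p_signal == 0:
        return -math.inf  # no signal at all
    return 10.0 * math.log10(p_signal / p_noise)


def may_distort(signal, noise):
    """Hypothetical helper: True if the input falls below the 5 dB threshold."""
    return snr_db(signal, noise) < LOW_SNR_THRESHOLD_DB
```

A check like this could be used to flag recordings for manual review before they are passed to the effect.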
Note
In this guide, the term Background Noise Suppression is used interchangeably with Denoising and Noise Removal (referred to as denoiser in the API).
To run the sample application on Windows for this effect, use the following commands:
# (One time, initial setup): Download models using models/download_models.ps1
powershell -ExecutionPolicy Bypass -File ./download_models.ps1 --gpu_architecture <gpu> --effects denoiser-16k,denoiser-48k
# Format: run_effect_demo.bat <architecture> <effect> <input_sample_rate> <output_sample_rate>
# 16k effect
run_effect_demo.bat turing denoiser 16k 16k
# 48k effect
run_effect_demo.bat ampere denoiser 48k 48k
Note
For more information, see Use the Helper Script to Run the Sample Application.
To run the sample application on Linux for this effect, use the following commands:
# (One time, initial setup): Download models using models/download_models.sh
./download_models.sh --gpu <gpu> --effects denoiser-16k,denoiser-48k
# Refer to Section 3.2 for further details
# Format: ./run_effect.sh -g <gpu> -s <sample_rate> -e denoiser
# 16k effect
./run_effect.sh -g t4 -s 16 -e denoiser
# 48k effect
./run_effect.sh -g t4 -s 48 -e denoiser
Note
For more information, see Use the Helper Script to Run the Sample Application.
This effect has the following characteristics:
Supported input/output audio format is 32-bit float audio with a sampling rate of 16 kHz or 48 kHz.
In the Linux SDK, this effect has the following maximum throughput (the number of batches supported in real time):
| Architecture | Maximum Throughput for the 16K Effect | Maximum Throughput for the 48K Effect |
| --- | --- | --- |
| T4 | 3240 | 1520 |
| A100 | 15760 | 6750 |
| A10 | 6410 | 2960 |
| L40 | 14440 | 5400 |
| H100 | 18500 | 8030 |
| B100 | 25030 | 12160 |
| RTX PRO 6000 | 23640 | 10640 |
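The throughput figures above can feed a rough capacity estimate. The sketch below is illustrative only. It assumes that one batch entry corresponds to one concurrent audio stream, and the `headroom` parameter is a hypothetical safety margin, not an SDK setting.

```python
import math

# Maximum real-time throughput from the table above (Linux SDK, stable denoiser).
# Keys of the inner dicts are the effect sample rates in kHz.
MAX_THROUGHPUT = {
    "T4": {16: 3240, 48: 1520},
    "A100": {16: 15760, 48: 6750},
    "A10": {16: 6410, 48: 2960},
    "L40": {16: 14440, 48: 5400},
    "H100": {16: 18500, 48: 8030},
    "B100": {16: 25030, 48: 12160},
    "RTX PRO 6000": {16: 23640, 48: 10640},
}


def gpus_needed(streams, gpu, sample_rate_khz, headroom=0.8):
    """Estimate how many GPUs are needed to serve `streams` in real time.

    Assumption: one batch entry equals one concurrent stream. `headroom` is
    the fraction of the published maximum that you are willing to use.
    """
    if streams < 0:
        raise ValueError("streams must be non-negative")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    usable = math.floor(MAX_THROUGHPUT[gpu][sample_rate_khz] * headroom)
    return math.ceil(streams / usable)


if __name__ == "__main__":
    print(gpus_needed(5000, "T4", 16))
```

Actual capacity depends on the rest of the pipeline, so treat the result as a starting point for load testing.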
Note
This effect might miss some noises in the first 1–2 seconds of input audio. Low-volume noises during speech might also be missed.
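The characteristics above call for 32-bit float audio at 16 kHz or 48 kHz, whereas many recordings are stored as 16-bit PCM. The following sketch shows one common way to convert such data using only the Python standard library. The function name is hypothetical, and scaling into the range [-1.0, 1.0) is an assumption about the expected float range that the text above does not confirm.

```python
import sys
from array import array

# Sample rates listed in the characteristics above.
SUPPORTED_SAMPLE_RATES_HZ = (16000, 48000)


def pcm16_to_float32(pcm_bytes, sample_rate_hz):
    """Convert little-endian 16-bit PCM bytes to an array of 32-bit floats.

    Assumption: float samples are expected in [-1.0, 1.0).
    """
    if sample_rate_hz not in SUPPORTED_SAMPLE_RATES_HZ:
        raise ValueError(f"unsupported sample rate: {sample_rate_hz} Hz")
    samples = array("h")          # signed 16-bit integers
    samples.frombytes(pcm_bytes)  # raises ValueError on a partial sample
    if sys.byteorder == "big":
        samples.byteswap()        # WAV PCM data is little-endian
    return array("f", (s / 32768.0 for s in samples))
```

Audio at any other sampling rate would need to be resampled first, which this sketch does not attempt.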
BNR 2.0 Experimental Model
BNR 2.0 is a newer version of the Noise Removal effect that improves the speech recognition accuracy of ASR systems and integrates into pipelines where audio must be cleaned before it is passed to other subsystems. It supports all the noise profiles supported by the stable denoiser effect.
The BNR 2.0 preview is currently enabled through the effect_version parameter. For details on how to enable this mode, see Set the Parameters of an Audio Effect.
To run the sample application on Windows for this effect, change the effect_version variable in run_effect_demo.bat from 1 to 2 and run the following commands:
# (One time, initial setup): Download models using models/download_models.ps1
powershell -ExecutionPolicy Bypass -File ./download_models.ps1 --gpu_architecture <gpu> --effects denoiser-16k,denoiser-48k
# Format: run_effect_demo.bat <architecture> <effect> <input_sample_rate> <output_sample_rate>
# 16k effect
run_effect_demo.bat turing denoiser 16k 16k
# 48k effect
run_effect_demo.bat ampere denoiser 48k 48k
Note
For more information, see Use the Helper Script to Run the Sample Application. This effect cannot be chained with other effects using the Chaining API.
Note
With this release, Voice Activity Detection is enabled by default for this effect.
To run the sample application on Linux for this effect, use the following commands:
# (One time, initial setup): Download models using models/download_models.sh
./download_models.sh --gpu <gpu> --effects denoiser-16k,denoiser-48k
# Refer to Section 3.2 for further details
# Format: ./run_effect.sh -g <gpu> -s <sample_rate> -e denoiser -m 2
# 16k effect
./run_effect.sh -g t4 -s 16 -e denoiser -m 2
# 48k effect
./run_effect.sh -g t4 -s 48 -e denoiser -m 2
Note
For more information, see Use the Helper Script to Run the Sample Application.
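If you launch the Linux helper script from another program, it can help to build the argument list in one place. The sketch below uses only the flags shown in the commands above (-g, -s, -e, and -m). It assumes that -m selects the effect version and that omitting it runs the stable model, as the stable commands in this guide suggest. The function name is hypothetical.

```python
import shlex


def build_run_effect_cmd(gpu, sample_rate_khz, effect="denoiser", effect_version=1):
    """Return the argument list for run_effect.sh as shown in this guide.

    Assumption: "-m 2" selects BNR 2.0 and omitting "-m" runs the stable model.
    """
    if sample_rate_khz not in (16, 48):
        raise ValueError("sample rate must be 16 or 48 (kHz)")
    cmd = ["./run_effect.sh", "-g", gpu, "-s", str(sample_rate_khz), "-e", effect]
    if effect_version != 1:
        cmd += ["-m", str(effect_version)]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to subprocess.run
    # from the directory that contains run_effect.sh to execute it.
    print(shlex.join(build_run_effect_cmd("t4", 48, effect_version=2)))
```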
This effect has the following characteristics:
Supported input/output audio format is 32-bit float audio with a sampling rate of 16 kHz or 48 kHz.
In the Linux SDK, this effect has the following maximum throughput (the number of batches supported in real time):
| Architecture | Maximum Throughput for the 16K Effect | Maximum Throughput for the 48K Effect |
| --- | --- | --- |
| T4 | 1350 | 410 |
| A100 | 7880 | 3250 |
| A10 | 3160 | 1540 |
| L40 | 7220 | 3290 |
| H100 | 9670 | 4160 |
| B100 | 12160 | 3940 |
| RTX PRO 6000 | 12160 | 6080 |
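To see what BNR 2.0 costs in throughput relative to the stable denoiser, the two sets of published figures can be compared directly. The sketch below only divides the numbers listed in this guide. The dictionary layout and function name are illustrative.

```python
# Maximum throughput of the stable denoiser (Linux SDK), keyed by sample rate in kHz.
STABLE = {
    "T4": {16: 3240, 48: 1520},
    "A100": {16: 15760, 48: 6750},
    "A10": {16: 6410, 48: 2960},
    "L40": {16: 14440, 48: 5400},
    "H100": {16: 18500, 48: 8030},
    "B100": {16: 25030, 48: 12160},
    "RTX PRO 6000": {16: 23640, 48: 10640},
}

# Maximum throughput of BNR 2.0 (Linux SDK), keyed by sample rate in kHz.
BNR2 = {
    "T4": {16: 1350, 48: 410},
    "A100": {16: 7880, 48: 3250},
    "A10": {16: 3160, 48: 1540},
    "L40": {16: 7220, 48: 3290},
    "H100": {16: 9670, 48: 4160},
    "B100": {16: 12160, 48: 3940},
    "RTX PRO 6000": {16: 12160, 48: 6080},
}


def relative_throughput(gpu, sample_rate_khz):
    """Return BNR 2.0 throughput as a fraction of stable denoiser throughput."""
    return BNR2[gpu][sample_rate_khz] / STABLE[gpu][sample_rate_khz]


if __name__ == "__main__":
    for gpu in STABLE:
        r16 = relative_throughput(gpu, 16)
        r48 = relative_throughput(gpu, 48)
        print(f"{gpu}: 16k {r16:.0%}, 48k {r48:.0%}")
```

On most of the listed GPUs the ratio is near one half, so a deployment sized for the stable model would need roughly twice the capacity for BNR 2.0.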