Use the Audio Effects SDK in Applications

The Audio Effects API is a C API but can also be used with applications that are built using C++.

This page describes the typical workflow for using an effect in applications.

This flow is a simplified version of the sample programs (effects_demo and, for Linux, effects_delayed_streams_demo). The same flow is also used for chained effects, with a few differences in API calls.

Create an effect handle for the effect:

NvAFX_Handle handle;

// Create single effect
NvAFX_Status status = NvAFX_CreateEffect(NVAFX_EFFECT_DENOISER, &handle);

// Create chained effect
NvAFX_CreateChainedEffect(NVAFX_CHAINED_EFFECT_SUPERRES_8k_TO_16k_DENOISER_16k, &handle);
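The snippets on this page mostly ignore the returned NvAFX_Status. A minimal sketch of a checking helper follows. `check_status` is a hypothetical name, not part of the SDK, and it assumes that the status value converts to an integer with zero meaning success; confirm this against NVAFX_STATUS_SUCCESS in your SDK header.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of the SDK.
// Assumption: the status converts to an int and 0 means success.
template <typename Status>
void check_status(Status status, const char* what) {
    const int code = static_cast<int>(status);
    if (code != 0) {
        // Report which call failed and the raw status code.
        throw std::runtime_error(std::string(what) + " failed with status " +
                                 std::to_string(code));
    }
}

// Possible usage with the call shown above:
//   check_status(NvAFX_CreateEffect(NVAFX_EFFECT_DENOISER, &handle),
//                "NvAFX_CreateEffect");
```

Throwing keeps the happy path short; an application that cannot use exceptions could return the code instead.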

Set the required parameters:

Windows

// Set model name

// Single effect (can also use SetStringList with size 1)
NvAFX_SetString(handle, NVAFX_PARAM_MODEL_PATH, "denoiser_48k.trtpkg");

// Voice Font
NvAFX_SetString(handle, NVAFX_PARAM_REFERENCE_MODEL_PATH, "voice_font_reference.trtpkg");

// Chained effect
NvAFX_SetStringList(handle, NVAFX_PARAM_MODEL_PATH, model_files, num_model_files);

// Set input and output sample rates
NvAFX_SetU32(handle, NVAFX_PARAM_INPUT_SAMPLE_RATE, 48000);
NvAFX_SetU32(handle, NVAFX_PARAM_OUTPUT_SAMPLE_RATE, 48000);

Linux

// Set model name

// Single effect (can also use SetStringList with size 1)
NvAFX_SetString(handle, NVAFX_PARAM_MODEL_PATH, "denoiser_48k.trtpkg");

// Voice Font
// The API takes an array of C strings, so store const char* rather than std::string
std::vector<const char*> models = {"voice_font_reference_16k.trtpkg", "voice_font_low_latency_input_16k.trtpkg"};
NvAFX_SetStringList(handle, NVAFX_PARAM_MODEL_PATH, models.data(), static_cast<unsigned int>(models.size()));

// Chained effect
NvAFX_SetStringList(handle, NVAFX_PARAM_MODEL_PATH, model_files, num_model_files);

// Set input sample rate and number of streams
NvAFX_SetU32(handle, NVAFX_PARAM_INPUT_SAMPLE_RATE, 48000);
NvAFX_SetU32(handle, NVAFX_PARAM_NUM_STREAMS, 20);
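The chained-effect calls above pass `model_files` and `num_model_files` without showing how they are built. When the paths are held as `std::string` values (for example, read from a configuration file), one way to obtain the pointer array is sketched below. `to_c_str_list` is a hypothetical helper, not an SDK function.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of the SDK: builds the array of C-string
// pointers that a C API taking (const char**, unsigned) expects.
// The returned pointers are valid only while `paths` is alive and unmodified.
std::vector<const char*> to_c_str_list(const std::vector<std::string>& paths) {
    std::vector<const char*> out;
    out.reserve(paths.size());
    for (const std::string& p : paths) {
        out.push_back(p.c_str());
    }
    return out;
}

// Possible usage with the call shown above:
//   std::vector<const char*> model_files = to_c_str_list(paths);
//   NvAFX_SetStringList(handle, NVAFX_PARAM_MODEL_PATH, model_files.data(),
//                       static_cast<unsigned int>(model_files.size()));
```

Casting `std::vector<std::string>::data()` to `const char**` does not work, because a `std::string` object is not a character pointer.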

Set the optional parameters by using the NvAFX_SetU32/NvAFX_SetFloat functions:
- Intensity ratio.
- Use default GPU.
- VAD enable/disable.
- CUDA graph enable/disable (Windows only).
- Delayed streams enable/disable (Linux only).
- Samples per input frame. A list of supported input sample rates can be queried by using NvAFX_GetU32List. (Refer to Get the Parameters of an Audio Effect.)
- Effect version (by using NvAFX_SetU32), if using the experimental Denoiser effect.
- GPU on which the model will be loaded. For more information, refer to Use Multiple GPUs.

Load the model:
```
NvAFX_Load(handle);
```

For the Voice Font effect, after a successful load, set the reference wav by using NvAFX_SetFloatList/NvAFX_SetStreamFloatList:

Windows

// Set the reference file

NvAFX_SetFloatList(handle, NVAFX_PARAM_REFERENCE_AUDIO, &reference_wav);

Linux

// Set the reference audio for the selected streams.
// Stream IDs start at zero (the first stream is 0).
// Calling this function again overwrites the previously set reference.
// Here, num_streams is the number of selected streams (2 in this example).
std::vector<unsigned int> stream_idx = {1, 2};
// getRefForStreams stands for application code that returns one reference buffer per selected stream
std::vector<const float*> reference_frame_ptrs = getRefForStreams(stream_idx);
NvAFX_SetStreamFloatList(handle, NVAFX_PARAM_REFERENCE_AUDIO, stream_idx.data(), reference_frame_ptrs.data(), num_streams);

After a successful load, query the input/output sample rate, channels, and samples per frame for the effect:

// Sample rate
NvAFX_GetU32(handle, NVAFX_PARAM_INPUT_SAMPLE_RATE, &input_sample_rate);
NvAFX_GetU32(handle, NVAFX_PARAM_OUTPUT_SAMPLE_RATE, &output_sample_rate);

// Channels
NvAFX_GetU32(handle, NVAFX_PARAM_NUM_INPUT_CHANNELS, &num_input_channels);
NvAFX_GetU32(handle, NVAFX_PARAM_NUM_OUTPUT_CHANNELS, &num_output_channels);

// Samples per frame
NvAFX_GetU32(handle, NVAFX_PARAM_NUM_SAMPLES_PER_INPUT_FRAME, &num_samples_per_input_frame);
NvAFX_GetU32(handle, NVAFX_PARAM_NUM_SAMPLES_PER_OUTPUT_FRAME, &num_samples_per_output_frame);

// Voice Font only
NvAFX_GetU32(handle, NVAFX_PARAM_REFERENCE_NUM_SAMPLES_PER_INPUT_FRAME, &num_ref_samples_per_frame);
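The queried channel count and samples per frame determine how large the processing buffers must be. The following is a minimal sketch of one way to allocate them, assuming the effect is given one pointer per channel. `ChannelBuffers` and `frame_duration_ms` are hypothetical names, not SDK types or functions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container, not part of the SDK: owns one contiguous block of
// samples and exposes one pointer per channel.
struct ChannelBuffers {
    std::vector<float> storage;    // num_channels * samples_per_frame floats
    std::vector<float*> channels;  // channels[c] points at channel c's frame

    ChannelBuffers(unsigned num_channels, unsigned samples_per_frame)
        : storage(static_cast<std::size_t>(num_channels) * samples_per_frame, 0.0f),
          channels(num_channels) {
        for (unsigned c = 0; c < num_channels; ++c) {
            channels[c] = storage.data() + static_cast<std::size_t>(c) * samples_per_frame;
        }
    }

    // Copying would leave `channels` pointing into the source object's storage.
    ChannelBuffers(const ChannelBuffers&) = delete;
    ChannelBuffers& operator=(const ChannelBuffers&) = delete;
};

// Duration of one frame in milliseconds for a given sample rate.
double frame_duration_ms(unsigned samples_per_frame, unsigned sample_rate_hz) {
    return 1000.0 * samples_per_frame / sample_rate_hz;
}
```

For example, a frame of 480 samples at 48000 Hz lasts 10 ms; these numbers are illustrative, so use the values returned by NvAFX_GetU32. Check the SDK header for the exact pointer types that the processing call expects.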

For each input frame of num_samples_per_input_frame samples, process the audio by using NvAFX_Run:

NvAFX_Run(handle, input, output, num_samples_per_input_frame, num_input_channels);
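Because processing works on fixed-size frames, a longer recording has to be split before it is passed to the effect. The sketch below shows one way to do this for a mono signal. `for_each_frame` is a hypothetical helper, the `process` callback stands in for the NvAFX_Run call, and zero-padding the final partial frame is an assumption rather than documented SDK behavior.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper, not part of the SDK: splits a mono signal into frames
// of `samples_per_frame` samples, zero-pads the last partial frame, and hands
// each frame to `process`. Returns the number of frames processed.
std::size_t for_each_frame(
    const std::vector<float>& signal,
    std::size_t samples_per_frame,
    const std::function<void(const float*, std::size_t)>& process) {
    if (samples_per_frame == 0) {
        return 0;
    }
    std::vector<float> frame(samples_per_frame);
    std::size_t frames = 0;
    for (std::size_t offset = 0; offset < signal.size(); offset += samples_per_frame) {
        // Number of real samples available for this frame.
        const std::size_t n = std::min(samples_per_frame, signal.size() - offset);
        std::copy(signal.data() + offset, signal.data() + offset + n, frame.data());
        // Fill the remainder of a short final frame with silence.
        std::fill(frame.data() + n, frame.data() + samples_per_frame, 0.0f);
        process(frame.data(), samples_per_frame);
        ++frames;
    }
    return frames;
}
```

In an application, the callback would wrap the NvAFX_Run call shown above and append the output frame to the result buffer.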

If audio processing is interrupted (for example, when a batch is reused for a different audio source), reset the internal effect states by calling NvAFX_Reset:

Windows

NvAFX_Reset(handle);

Linux

NvAFX_Reset(handle, states_array, input_wav_list.size());

On Linux only: during batching, use NVAFX_PARAM_ACTIVE_STREAMS to temporarily pause a stream, for example when its data is not ready but data for other streams is available for processing.

After audio processing is complete, to free resources, use NvAFX_DestroyEffect. For more information, see Run an Audio Effect on Delayed Audio Streams on Linux.
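In C++ applications, the cleanup call can be tied to scope so that the effect is destroyed on every exit path, including early returns and exceptions. The following is a minimal sketch; `ScopedEffect` and `make_scoped_effect` are hypothetical names, not part of the SDK, and the guard ignores whatever the destroy function returns.

```cpp
// Hypothetical RAII guard, not part of the SDK: calls `destroy(handle)`
// exactly once when the owning guard goes out of scope.
template <typename Handle, typename DestroyFn>
class ScopedEffect {
public:
    ScopedEffect(Handle handle, DestroyFn destroy)
        : handle_(handle), destroy_(destroy), owns_(true) {}

    ScopedEffect(const ScopedEffect&) = delete;
    ScopedEffect& operator=(const ScopedEffect&) = delete;

    // Moving transfers ownership so the handle is not destroyed twice.
    ScopedEffect(ScopedEffect&& other) noexcept
        : handle_(other.handle_), destroy_(other.destroy_), owns_(other.owns_) {
        other.owns_ = false;
    }

    ~ScopedEffect() {
        if (owns_) {
            destroy_(handle_);
        }
    }

    Handle get() const { return handle_; }

private:
    Handle handle_;
    DestroyFn destroy_;
    bool owns_;
};

template <typename Handle, typename DestroyFn>
ScopedEffect<Handle, DestroyFn> make_scoped_effect(Handle handle, DestroyFn destroy) {
    return ScopedEffect<Handle, DestroyFn>(handle, destroy);
}

// Possible usage, after the effect has been created:
//   auto guard = make_scoped_effect(handle, NvAFX_DestroyEffect);
```

The guard is templated on the handle and destroy function so that the sketch compiles without the SDK header; a real application could fix both types to the SDK's own.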