Use the Audio Effects SDK in Applications

The Audio Effects API is a C API but can also be used with applications that are built using C++.

This page describes the typical workflow for using an effect in applications.

This flow is a simplified version of the sample programs (effects_demo and, for Linux, effects_delayed_streams_demo). The same flow is also used for chained effects, with a few differences in API calls.
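
The snippets in the steps below omit error handling. Step 1 shows NvAFX_CreateEffect returning an NvAFX_Status; assuming the other NvAFX_* calls report failures the same way, a small helper can turn a non-success status into an exception. The following is a minimal sketch, not SDK code: check_status and the DemoStatus stand-in enum are hypothetical names introduced here so that the example compiles without the SDK headers.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for the SDK status type (hypothetical, for illustration only).
// With the real SDK, pass NvAFX_Status values instead.
enum DemoStatus { DEMO_STATUS_SUCCESS = 0, DEMO_STATUS_FAILED = 1 };

// Throws std::runtime_error naming the failed call when status != success.
template <typename Status>
void check_status(Status status, Status success, const char* call_name) {
    assert(call_name != nullptr);
    if (status != success) {
        throw std::runtime_error(std::string(call_name) + " failed with status " +
                                 std::to_string(static_cast<int>(status)));
    }
}
```

With the SDK header included, a call site might look like check_status(NvAFX_Load(handle), NVAFX_STATUS_SUCCESS, "NvAFX_Load"), assuming NVAFX_STATUS_SUCCESS is the success value defined by your SDK version.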

  1. Create an effect handle for the effect:

    NvAFX_Handle handle;
    
    // Create single effect
    NvAFX_Status status = NvAFX_CreateEffect(NVAFX_EFFECT_DENOISER, &handle);
    
    // Create chained effect
    NvAFX_CreateChainedEffect(NVAFX_CHAINED_EFFECT_SUPERRES_8k_TO_16k_DENOISER_16k, &handle);
    
  2. Set the required parameters:

    // Set model name
    
    // Single effect (can also use SetStringList with size 1)
    NvAFX_SetString(handle, NVAFX_PARAM_MODEL_PATH, "denoiser_48k.trtpkg");
    
    // Voice Font
    NvAFX_SetString(handle, NVAFX_PARAM_REFERENCE_MODEL_PATH, "voice_font_reference.trtpkg");
    
    // Chained effect
    NvAFX_SetStringList(handle, NVAFX_PARAM_MODEL_PATH, model_files, num_model_files);
    
    // Set input and output sample rates
    NvAFX_SetU32(handle, NVAFX_PARAM_INPUT_SAMPLE_RATE, 48000);
    NvAFX_SetU32(handle, NVAFX_PARAM_OUTPUT_SAMPLE_RATE, 48000);
    
    // Set model name
    
    // Single effect (can also use SetStringList with size 1)
    NvAFX_SetString(handle, NVAFX_PARAM_MODEL_PATH, "denoiser_48k.trtpkg");
    
    // Voice Font
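    std::vector<std::string> models = {"voice_font_reference_16k.trtpkg", "voice_font_low_latency_input_16k.trtpkg"};
    // The list is passed as an array of C strings, so collect the c_str() pointers first
    std::vector<const char*> model_ptrs;
    for (const auto& model : models) model_ptrs.push_back(model.c_str());
    NvAFX_SetStringList(handle, NVAFX_PARAM_MODEL_PATH, model_ptrs.data(), model_ptrs.size());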
    
    // Chained effect
    NvAFX_SetStringList(handle, NVAFX_PARAM_MODEL_PATH, model_files, num_model_files);
    
    // Set input sample rate and number of streams
    NvAFX_SetU32(handle, NVAFX_PARAM_INPUT_SAMPLE_RATE, 48000);
    NvAFX_SetU32(handle, NVAFX_PARAM_NUM_STREAMS, 20);
    
  3. Set the optional parameters by using the NvAFX_SetU32/NvAFX_SetFloat functions:

    • Intensity ratio.

    • Use default GPU.

    • VAD enable/disable.

    • CUDA graph enable/disable (Windows only).

    • Delayed streams enable/disable (Linux only).

    • Samples per input frame. A list of supported input sample rates can be queried by using NvAFX_GetU32List. (Refer to Get the Parameters of an Audio Effect.)

    • Effect version (by using NvAFX_SetU32), if using the experimental Denoiser effect.

    • GPU on which the model will be loaded. For more information, refer to Use Multiple GPUs.

  4. Load the model:

    NvAFX_Load(handle);
    
  5. For the Voice Font effect, after a successful load, set the reference wav by using NvAFX_SetFloatList/NvAFX_SetStreamFloatList:

    // Set the reference file
    NvAFX_SetFloatList(handle, NVAFX_PARAM_REFERENCE_AUDIO, &reference_wav);
    
    // Set the reference of selected streams
    // ID starts at zero (first stream is zero)
    // If this function is called multiple times, it overwrites the previously set reference
    std::vector<unsigned int> stream_idx = {1, 2};
    std::vector<const float*> reference_frame_ptrs = getRefForStreams(stream_idx);
    NvAFX_SetStreamFloatList(handle, NVAFX_PARAM_REFERENCE_AUDIO, stream_idx.data(), reference_frame_ptrs.data(), stream_idx.size());
    
  6. After a successful load, query the input/output sample rate, channels, and samples per frame for the effect:

    // Sample rate
    NvAFX_GetU32(handle, NVAFX_PARAM_INPUT_SAMPLE_RATE, &input_sample_rate);
    NvAFX_GetU32(handle, NVAFX_PARAM_OUTPUT_SAMPLE_RATE, &output_sample_rate);
    
    // Channels
    NvAFX_GetU32(handle, NVAFX_PARAM_NUM_INPUT_CHANNELS, &num_input_channels);
    NvAFX_GetU32(handle, NVAFX_PARAM_NUM_OUTPUT_CHANNELS, &num_output_channels);
    
    // Samples per frame
    NvAFX_GetU32(handle, NVAFX_PARAM_NUM_SAMPLES_PER_INPUT_FRAME, &num_samples_per_input_frame);
    NvAFX_GetU32(handle, NVAFX_PARAM_NUM_SAMPLES_PER_OUTPUT_FRAME, &num_samples_per_output_frame);
    
    // Voice Font only
    NvAFX_GetU32(handle, NVAFX_PARAM_REFERENCE_NUM_SAMPLES_PER_INPUT_FRAME, &num_ref_samples_per_frame);
    
  7. For each input frame, process the audio by using NvAFX_Run:

    NvAFX_Run(handle, input, output, num_samples_per_input_frame, num_input_channels);
    
  8. If a disconnection occurs in audio processing (for example, a batch that was reused for a different audio source), reset the internal effect states by calling NvAFX_Reset:

    NvAFX_Reset(handle);
    
    NvAFX_Reset(handle, states_array, input_wav_list.size());
    
  9. On Linux only: During batching, to temporarily pause streams (for example, if data is not ready for that stream but is available for processing for other streams) use NVAFX_PARAM_ACTIVE_STREAMS as required.
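
Steps 6 and 7 above imply that audio is passed to NvAFX_Run one fixed-size frame at a time, using the samples-per-frame value queried after the load. The following sketch shows one way to split a mono buffer into such frames. It is an illustration under stated assumptions rather than SDK code: for_each_frame and FrameCallback are hypothetical names, the callback stands in for the NvAFX_Run call, and zero-padding the last partial frame is a choice made here, not a documented SDK requirement.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Called once per frame with a pointer to exactly frame_size samples.
using FrameCallback = std::function<void(const float* frame, std::size_t frame_size)>;

// Splits mono audio into fixed-size frames and zero-pads the final partial frame.
// Returns the number of frames handed to the callback.
std::size_t for_each_frame(const std::vector<float>& samples,
                           std::size_t frame_size,
                           const FrameCallback& on_frame) {
    assert(frame_size > 0);
    std::vector<float> frame(frame_size);
    std::size_t frames = 0;
    for (std::size_t offset = 0; offset < samples.size(); offset += frame_size) {
        const std::size_t count = std::min(frame_size, samples.size() - offset);
        // Copy the available samples, then clear the remainder of the frame.
        std::copy(samples.data() + offset, samples.data() + offset + count, frame.data());
        std::fill(frame.data() + count, frame.data() + frame_size, 0.0f);
        on_frame(frame.data(), frame_size);
        ++frames;
    }
    return frames;
}
```

In an application, frame_size would be the num_samples_per_input_frame value from step 6, and the callback body would wrap the frame pointer in the input array that NvAFX_Run expects.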

After audio processing is complete, call NvAFX_DestroyEffect to free resources. For more information, see Run an Audio Effect on Delayed Audio Streams on Linux.
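
In C++ applications, one way to make sure the cleanup call runs on every exit path, including early returns and exceptions, is a scope guard. The sketch below is a generic C++ idiom and not part of the SDK: ScopeExit, make_scope_exit, and process_with_cleanup are hypothetical names, and the counter incremented in the guard stands in for a call to NvAFX_DestroyEffect(handle).

```cpp
#include <utility>

// Runs the stored callable when the guard goes out of scope.
template <typename Cleanup>
class ScopeExit {
public:
    explicit ScopeExit(Cleanup cleanup) : cleanup_(std::move(cleanup)), active_(true) {}
    ScopeExit(ScopeExit&& other)
        : cleanup_(std::move(other.cleanup_)), active_(other.active_) {
        other.active_ = false;  // the moved-from guard must not run the cleanup
    }
    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;
    ~ScopeExit() { if (active_) cleanup_(); }

private:
    Cleanup cleanup_;
    bool active_;
};

template <typename Cleanup>
ScopeExit<Cleanup> make_scope_exit(Cleanup cleanup) {
    return ScopeExit<Cleanup>(std::move(cleanup));
}

// Stand-in for a processing routine that can exit early. Incrementing
// `destroyed` plays the role of NvAFX_DestroyEffect(handle) in real code.
bool process_with_cleanup(bool fail_early, int& destroyed) {
    auto guard = make_scope_exit([&destroyed] { ++destroyed; });
    if (fail_early) {
        return false;  // cleanup still runs when guard is destroyed
    }
    return true;       // cleanup runs here as well
}
```

The guard should be created right after the effect handle is created successfully, so that the cleanup is registered before any later step can fail.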