nvidia.dali.fn.nonsilent_region

nvidia.dali.fn.nonsilent_region(*inputs, **kwargs)

Performs leading and trailing silence detection in an audio buffer.

The operator returns the beginning and length of the non-silent region by comparing the short-term power, calculated over a sliding window of window_length samples, with a silence cut-off threshold. The signal is considered silent when short_term_power_db is less than cutoff_db, where:

short_term_power_db = 10 * log10( short_term_power / reference_power )

Unless specified otherwise, reference_power is the maximum power of the signal.
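
The thresholding rule above can be illustrated with a minimal NumPy sketch. This is not the operator's implementation: the function name nonsilent_region_sketch, the use of None to mean "no reference power given", and the alignment of the window relative to the reported boundaries are assumptions made for illustration, so the real operator may report slightly different indices.

```python
import numpy as np

def nonsilent_region_sketch(signal, cutoff_db=-60.0, window_length=2048,
                            reference_power=None):
    """Hypothetical reference sketch; returns (begin, length)."""
    x = np.asarray(signal, dtype=np.float64)
    if x.size == 0:
        return 0, 0
    win = max(1, min(window_length, x.size))

    # Short-term power: mean of squared samples in each window [i, i + win).
    csum = np.concatenate(([0.0], np.cumsum(x * x)))
    power = (csum[win:] - csum[:-win]) / win

    # Assumption: None stands for "not provided", so use the maximum power.
    ref = power.max() if reference_power is None else reference_power
    if ref <= 0:
        return 0, 0  # all-zero signal: everything is silent

    # short_term_power_db < cutoff_db  <=>  power < ref * 10^(cutoff_db / 10)
    threshold = ref * 10.0 ** (cutoff_db / 10.0)
    loud = np.flatnonzero(power >= threshold)
    if loud.size == 0:
        return 0, 0

    begin = int(loud[0])
    end = int(loud[-1]) + win  # one past the last sample of the last loud window
    return begin, end - begin

# Example: silence, a constant-amplitude burst, then silence again.
sig = np.zeros(4000)
sig[1000:3000] = 0.5
begin, length = nonsilent_region_sketch(sig, window_length=100)
print(begin, length)
```

Because the comparison is made per window, the detected region in this sketch can extend up to one window length beyond the true boundaries of the burst.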

Inputs and outputs:

  • Input 0 - 1D audio buffer.

  • Output 0 - Index of the first sample in the nonsilent region.

  • Output 1 - Length of nonsilent region.

Note

If Outputs[1] == 0, the value in Outputs[0] is undefined.
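
Since the start index is undefined when the length is zero, code that consumes the two outputs should check the length first. The following sketch shows one way to do this on a NumPy array; trim_silence is a hypothetical helper, not part of DALI.

```python
import numpy as np

def trim_silence(audio, begin, length):
    """Hypothetical helper: cut the non-silent region out of a 1D buffer."""
    if length == 0:
        # begin is undefined in this case, so do not use it.
        return audio[:0]
    return audio[begin:begin + length]

audio = np.arange(10)
print(trim_silence(audio, 3, 4))
print(trim_silence(audio, 7, 0))
```

Inside a pipeline, the same two outputs would typically be passed to a slicing operator such as fn.slice as the start and extent of the region along the sample axis.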

Supported backends
  • ‘cpu’

Parameters

input (TensorList) – Input to the operator.

Keyword Arguments
  • bytes_per_sample_hint (int or list of int, optional, default = [0]) –

    Output size hint, in bytes per sample.

    If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.

  • cutoff_db (float, optional, default = -60.0) – The threshold, in dB, below which the signal is considered silent.

  • preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.

  • reference_power (float, optional, default = 0.0) –

    The reference power that is used to convert the signal to dB.

    If a value is not provided, the maximum power of the signal will be used as the reference.

  • reset_interval (int, optional, default = 8192) –

    The number of samples after which the moving average is recalculated from scratch to avoid loss of precision.

    If reset_interval == -1, or the input type allows exact calculation, the average will not be reset. The default value is suitable for most use cases.

  • seed (int, optional, default = -1) –

    Random seed.

    If not provided, it will be populated based on the global seed of the pipeline.

  • window_length (int, optional, default = 2048) – Size of the sliding window used to calculate the short-term power of the signal.
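
To illustrate how window_length and reset_interval interact, the sketch below maintains a sliding mean of squared samples incrementally in single precision and recomputes it from scratch every reset_interval positions. It assumes the moving average is updated by adding the newest sample and subtracting the oldest; the source does not describe the operator's internals, and moving_mean_square is a hypothetical name.

```python
import numpy as np

def moving_mean_square(signal, window_length=2048, reset_interval=8192):
    """Hypothetical sketch of a sliding mean of squares with periodic reset."""
    x = np.asarray(signal, dtype=np.float32)
    n = x.size - window_length + 1
    if n <= 0:
        return np.empty(0, dtype=np.float32)

    out = np.empty(n, dtype=np.float32)
    acc = np.float32(0)
    for i in range(n):
        if i == 0 or (reset_interval > 0 and i % reset_interval == 0):
            # Recompute the window sum from scratch to discard rounding drift.
            acc = np.sum(np.square(x[i:i + window_length]), dtype=np.float32)
        else:
            # Incremental update: add the newest sample, drop the oldest.
            acc = acc + x[i + window_length - 1] ** 2 - x[i - 1] ** 2
        out[i] = acc / np.float32(window_length)
    return out

rng = np.random.default_rng(0)
power = moving_mean_square(rng.standard_normal(500),
                           window_length=16, reset_interval=64)
print(power.shape)
```

With reset_interval=-1 the sketch never recomputes the sum, so rounding errors from the incremental updates can accumulate over long floating-point inputs.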