nvidia.dali.fn.nonsilent_region
nvidia.dali.fn.nonsilent_region(*inputs, **kwargs)
Performs leading and trailing silence detection in an audio buffer.
The operator returns the beginning and length of the non-silent region by comparing the short-term power calculated for window_length of the signal with a silence cut-off threshold. The signal is considered to be silent when the short_term_power_db is less than the cutoff_db, where:
short_term_power_db = 10 * log10( short_term_power / reference_power )
Unless specified otherwise, reference_power is the maximum power of the signal.
Inputs and outputs:
Input 0 - 1D audio buffer.
Output 0 - Index of the first sample in the nonsilent region.
Output 1 - Length of nonsilent region.
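The thresholding rule described above can be sketched in NumPy. This is only an illustration, not DALI's implementation: the function name nonsilent_region_sketch is hypothetical, and the window alignment (window i covers samples i to i + window_length - 1) and the (0, 0) result for an all-silent input are assumptions made for the sketch.

```python
import numpy as np


def nonsilent_region_sketch(signal, cutoff_db=-60.0, window_length=2048,
                            reference_power=None):
    """Return (begin, length) of the non-silent region of a 1D signal.

    Hypothetical helper; window alignment is an assumption of this sketch.
    """
    x = np.asarray(signal, dtype=np.float64)
    if x.size == 0:
        return 0, 0
    win = min(window_length, x.size)

    # Short-term power: mean of squared samples over each sliding window,
    # computed from a cumulative sum (one value per window start).
    csum = np.concatenate(([0.0], np.cumsum(x * x)))
    power = (csum[win:] - csum[:-win]) / win

    # Unless given, the reference is the maximum short-term power.
    ref = power.max() if reference_power is None else reference_power
    if ref <= 0.0:
        return 0, 0  # all-zero signal; begin is only a placeholder here

    # 10 * log10(p / ref) >= cutoff_db  <=>  p >= ref * 10 ** (cutoff_db / 10)
    # Comparing in the linear domain avoids evaluating log10(0).
    loud = np.flatnonzero(power >= ref * 10.0 ** (cutoff_db / 10.0))
    if loud.size == 0:
        return 0, 0

    begin = int(loud[0])
    end = int(loud[-1]) + win  # exclusive end of the last non-silent window
    return begin, end - begin


# Example: 200 non-zero samples surrounded by silence.
audio = np.zeros(1000)
audio[400:600] = 1.0
begin, length = nonsilent_region_sketch(audio, window_length=10)
trimmed = audio[begin:begin + length]
```

Under these assumptions, the two returned values play the roles of Output 0 and Output 1, so the non-silent part of the buffer is the slice signal[begin:begin + length].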
Note
If Outputs[1] == 0, the value in Outputs[0] is undefined.
- Supported backends
‘cpu’
- Parameters
input (TensorList) – Input to the operator.
- Keyword Arguments
bytes_per_sample_hint (int or list of int, optional, default = [0]) –
Output size hint, in bytes per sample.
If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.
cutoff_db (float, optional, default = -60.0) – The threshold, in dB, below which the signal is considered silent.
preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.
reference_power (float, optional, default = 0.0) –
The reference power that is used to convert the signal to dB.
If a value is not provided, the maximum power of the signal will be used as the reference.
reset_interval (int, optional, default = 8192) –
The number of samples after which the moving mean average is recalculated to avoid loss of precision.
If reset_interval == -1, or the input type allows exact calculation, the average will not be reset. The default value can be used for most use cases.
seed (int, optional, default = -1) –
Random seed.
If not provided, it will be populated based on the global seed of the pipeline.
window_length (int, optional, default = 2048) – Size of the sliding window used to calculate the short-term power of the signal.
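To illustrate why reset_interval exists, here is a minimal sketch of a sliding mean of squares that keeps a running float32 sum and periodically recomputes it from scratch. The function name moving_mean_square is hypothetical and the exact point at which DALI resets its accumulator is not stated in this page, so the reset condition below is an assumption.

```python
import numpy as np


def moving_mean_square(signal, window_length=2048, reset_interval=8192):
    """Sliding mean of squared samples with a periodically reset running sum.

    Hypothetical helper; the reset schedule is an assumption of this sketch.
    """
    x = np.asarray(signal, dtype=np.float32)
    n_out = x.size - window_length + 1
    if n_out <= 0:
        return np.empty(0, dtype=np.float32)

    sq = x * x
    out = np.empty(n_out, dtype=np.float32)
    running = np.float32(0.0)
    for i in range(n_out):
        if i == 0 or (reset_interval > 0 and i % reset_interval == 0):
            # Recompute the window sum directly, which discards any rounding
            # error accumulated by the incremental updates.
            running = sq[i:i + window_length].sum(dtype=np.float32)
        else:
            # Incremental update: add the sample entering the window and
            # subtract the one leaving it. Rounding error builds up here.
            running = running + sq[i + window_length - 1] - sq[i - 1]
        out[i] = running / np.float32(window_length)
    return out


# With reset_interval=-1 the running sum is never recomputed after the start.
rng = np.random.default_rng(0)
samples = rng.standard_normal(500).astype(np.float32)
with_reset = moving_mean_square(samples, window_length=16, reset_interval=64)
without_reset = moving_mean_square(samples, window_length=16, reset_interval=-1)
```

For integer input an incremental sum of this kind can be kept exactly, which matches the statement that the average is not reset when the input type allows exact calculation.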