nvidia.dali.fn.spectrogram

nvidia.dali.fn.spectrogram(*inputs, **kwargs)

Produces a spectrogram from a 1D signal (for example, audio).

The input is expected to be a single-channel signal of type float32, with shape (nsamples,), (nsamples, 1), or (1, nsamples).

Supported backends
  • ‘cpu’

  • ‘gpu’

Parameters

input (TensorList) – Input to the operator.

Keyword Arguments
  • bytes_per_sample_hint (int or list of int, optional, default = [0]) –

    Output size hint, in bytes per sample.

    If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.

  • center_windows (bool, optional, default = True) –

    Indicates whether extracted windows should be padded so that the window function is centered at multiples of window_step.

    If set to False, the signal will not be padded, that is, only windows within the input range will be extracted.

  • layout (layout str, optional, default = ‘ft’) – Output layout: “ft” (frequency-major) or “tf” (time-major).

  • nfft (int, optional) –

    Size of the FFT.

    The number of bins that are created in the output is nfft // 2 + 1.

    Note

    The output represents only the non-negative-frequency half of the spectrum.

  • power (int, optional, default = 2) –

    Exponent of the magnitude of the spectrum.

    Supported values:

    • 1 - amplitude,

    • 2 - power (faster to compute).

  • preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.

  • reflect_padding (bool, optional, default = True) –

    Indicates the padding policy when sampling outside the bounds of the signal.

    If set to True, the signal is mirrored with respect to the boundary, otherwise the signal is padded with zeros.

    Note

    When center_windows is set to False, this option is ignored.

  • seed (int, optional, default = -1) –

    Random seed.

    If not provided, it will be populated based on the global seed of the pipeline.

  • window_fn (float or list of float, optional, default = []) –

    Samples of the window function by which each extracted window is multiplied when calculating the STFT.

    If a value is provided, it should be a list of floating point numbers of size window_length. If a value is not provided, a Hann window will be used.

  • window_length (int, optional, default = 512) – Window size in number of samples.

  • window_step (int, optional, default = 256) – Step between the STFT windows, in number of samples.
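
To make the interaction between the keyword arguments concrete, the following is a minimal pure-Python sketch of a spectrogram computed the way the descriptions above suggest. It is not the operator's implementation and does not call DALI. The names reference_spectrogram, hann_window, and sample_at are hypothetical helpers introduced here for illustration. Several details are assumptions rather than documented behavior: the window-count formulas, the mirroring convention (the edge sample is not repeated), and the use of a periodic Hann window as the default. Unlike the operator, the sketch requires nfft to be passed explicitly and to be at least window_length.

```python
import cmath
import math


def hann_window(length):
    """Periodic Hann window (assumed form; the operator's default may be sampled differently)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def sample_at(signal, idx, reflect_padding):
    """Read signal[idx]; out-of-range indices are mirrored or replaced with zero."""
    n = len(signal)
    if 0 <= idx < n:
        return signal[idx]
    if not reflect_padding:
        return 0.0
    if n == 1:
        return signal[0]
    period = 2 * (n - 1)  # assumed: mirror about the boundary sample without repeating it
    idx %= period
    return signal[period - idx] if idx >= n else signal[idx]


def reference_spectrogram(signal, nfft, window_length=512, window_step=256,
                          window_fn=None, power=2, center_windows=True,
                          reflect_padding=True, layout="ft"):
    """Hypothetical reference: windowed DFT magnitudes raised to `power`."""
    if nfft < window_length:
        raise ValueError("this sketch requires nfft >= window_length")
    window = list(window_fn) if window_fn else hann_window(window_length)
    if len(window) != window_length:
        raise ValueError("window_fn must contain window_length samples")

    if center_windows:
        # Assumed: window centers sit at multiples of window_step.
        num_windows = len(signal) // window_step + 1
        offset = -(window_length // 2)
    else:
        # Only windows that fit entirely inside the signal; padding is never used.
        num_windows = max(0, (len(signal) - window_length) // window_step + 1)
        offset = 0

    bins = nfft // 2 + 1  # non-negative frequencies only
    frames = []
    for w in range(num_windows):
        start = w * window_step + offset
        frame = [sample_at(signal, start + n, reflect_padding) * window[n]
                 for n in range(window_length)]
        row = []
        for k in range(bins):
            # Naive DFT; samples beyond window_length are implicitly zero.
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / nfft)
                      for n, x in enumerate(frame))
            energy = acc.real ** 2 + acc.imag ** 2
            # power=2 skips the square root, which is why it is cheaper.
            row.append(energy if power == 2 else math.sqrt(energy))
        frames.append(row)

    if layout == "tf":
        return frames                             # [window][bin]
    return [list(col) for col in zip(*frames)]    # "ft": [bin][window]


# A tone that falls exactly on bin 2 of an 8-point FFT.
tone = [math.cos(2.0 * math.pi * 2 * n / 8) for n in range(33)]
spec = reference_spectrogram(tone, nfft=8, window_length=8, window_step=4)
print(len(spec), len(spec[0]))  # 5 9
```

With the default "ft" layout the result is indexed as spec[bin][window]; passing layout="tf" swaps the two axes. The naive DFT is quadratic in nfft and is only meant for checking small cases by hand, not for processing real audio.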