nvidia.dali.experimental.dynamic.mfcc

nvidia.dali.experimental.dynamic.mfcc(input, /, *, batch_size=None, device=None, axis=None, dct_type=None, lifter=None, n_mfcc=None, normalize=None)

Computes Mel Frequency Cepstral Coefficients (MFCC) from a mel spectrogram.

Supported backends
  • ‘cpu’

  • ‘gpu’

Parameters:

input (Tensor/Batch) – Input to the operator: the mel spectrogram from which the MFCCs are computed.

Keyword Arguments:
  • axis (int, optional, default = 0) –

    Axis over which the transform will be applied.

    If no value is provided, the outermost dimension (axis 0) is used.

  • dct_type (int, optional, default = 2) –

    Discrete Cosine Transform type.

    The supported types are 1, 2, 3, and 4. The formulas used to calculate the DCT are equivalent to those described in https://en.wikipedia.org/wiki/Discrete_cosine_transform (the type numbers correspond to those listed in https://en.wikipedia.org/wiki/Discrete_cosine_transform#Formal_definition).

  • lifter (float, optional, default = 0.0) –

    Cepstral filtering coefficient, which is also known as the liftering coefficient.

    If the lifter coefficient is greater than 0, the MFCCs will be scaled based on the following formula:

    MFCC[i] = MFCC[i] * (1 + sin(pi * (i + 1) / lifter)) * (lifter / 2)
    

  • n_mfcc (int, optional, default = 20) – Number of MFCC coefficients.

  • normalize (bool, optional, default = False) –

    If set to True, the DCT uses an orthonormal basis.

    Note

    Normalization is not supported when dct_type=1.
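
The following is a minimal pure-Python sketch of what the dct_type, n_mfcc, normalize, and lifter arguments describe, applied to a single frame. It does not call DALI and has not been checked against the operator's output. The helper names dct2 and apply_lifter and the sample values are hypothetical. The DCT follows the unnormalized type-II definition from the Wikipedia page linked above, and the liftering step transcribes the formula exactly as written in the lifter entry.

```python
import math


def dct2(x, n_coeffs, normalize=False):
    """Type-II DCT of a 1-D sequence, keeping the first n_coeffs outputs.

    Unnormalized definition: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k).
    With normalize=True the outputs are rescaled to an orthonormal basis.
    """
    n = len(x)
    out = []
    for k in range(n_coeffs):
        acc = sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
        if normalize:
            # Orthonormal scaling: sqrt(1/N) for k == 0, sqrt(2/N) otherwise.
            acc *= math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(acc)
    return out


def apply_lifter(coeffs, lifter):
    """Scale coefficients using the formula as written in the lifter entry."""
    if lifter <= 0:
        # A lifter of 0 (the default) leaves the coefficients unchanged.
        return list(coeffs)
    return [
        c * (1 + math.sin(math.pi * (i + 1) / lifter)) * (lifter / 2)
        for i, c in enumerate(coeffs)
    ]


# One frame of a hypothetical 4-band mel spectrogram (assumed values).
mel_frame = [1.0, 1.0, 1.0, 1.0]

coeffs = dct2(mel_frame, n_coeffs=3)                      # like n_mfcc=3, dct_type=2
coeffs_ortho = dct2(mel_frame, n_coeffs=3, normalize=True)
lifted = apply_lifter(coeffs, lifter=2.0)
```

A constant frame puts all of its energy in coefficient 0, which makes the effect of normalize and lifter easy to inspect by printing coeffs, coeffs_ortho, and lifted. The sketch covers only type II; the other DCT types and the batch and axis handling of the operator are out of scope.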