nemo_automodel.components.utils.compile_utils
Module Contents
Classes
Functions
Data
API
Configuration for torch.compile.
Convert to dictionary.
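As a rough illustration of what a compile configuration object with a dictionary conversion might look like, here is a minimal sketch. The class name `SketchCompileConfig` and every field on it are assumptions chosen to mirror common `torch.compile` arguments; they are not taken from the actual `CompileConfig` definition.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchCompileConfig:
    """Hypothetical stand-in for CompileConfig; all field names are assumptions."""

    enabled: bool = False
    mode: str = "default"
    fullgraph: bool = False
    dynamic: bool = False
    backend: Optional[str] = None
    options: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        """Return a plain dictionary copy of the configuration."""
        return asdict(self)


cfg = SketchCompileConfig(enabled=True, mode="max-autotune")
cfg_dict = cfg.to_dict()
```

A dictionary form like this is convenient for logging the settings or writing them back to a YAML or JSON file.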
Apply the Flash Attention + torch.compile compatibility fix.
This enables scalar output capture in torch._dynamo and patches prepare_fa2_from_position_ids, the function whose max_length computation causes problems under torch.compile. Note: This function is focused solely on Flash Attention compatibility. For dynamo configuration (cache size, etc.), use configure_torch_dynamo() separately.
Build a compile config from configuration.
Parameters:
Configuration dictionary for compilation.
Returns: CompileConfig
CompileConfig instance.
Compile the model with Flash Attention compatibility.
Parameters:
The model to compile.
Compile configuration.
Returns: nn.Module
The compiled model.
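The general shape of a config-driven compile step can be sketched as follows. This is not the module's implementation: `SketchCompileConfig`, `compile_model_sketch`, and the field names are hypothetical, and the Flash Attention fix described above is deliberately left out. The only real API used is `torch.compile` with its `mode`, `fullgraph`, `dynamic`, and `backend` keyword arguments.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SketchCompileConfig:
    """Hypothetical config; field names are assumptions, not the real CompileConfig."""

    enabled: bool = False
    mode: str = "default"
    fullgraph: bool = False
    dynamic: bool = False
    backend: Optional[str] = None


def compile_model_sketch(model: Any, config: SketchCompileConfig) -> Any:
    """Return the model unchanged when compilation is disabled, else torch.compile it."""
    if not config.enabled:
        return model

    # Imported lazily so the disabled path does not require PyTorch.
    import torch

    kwargs = {
        "mode": config.mode,
        "fullgraph": config.fullgraph,
        "dynamic": config.dynamic,
    }
    if config.backend is not None:
        # Only override the backend when one was requested explicitly.
        kwargs["backend"] = config.backend
    return torch.compile(model, **kwargs)


placeholder_model = object()
result = compile_model_sketch(placeholder_model, SketchCompileConfig(enabled=False))
```

Returning the input untouched when compilation is disabled keeps the call site identical whether or not compilation is switched on in the configuration.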
Configure torch._dynamo settings for compilation.
Parameters:
Cache size limit for dynamo compilation.
Whether to capture scalar outputs for Flash Attention compatibility.
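The two settings described above correspond to attributes on `torch._dynamo.config`. A minimal sketch of setting them directly is shown below; the function name `configure_dynamo_sketch` and the default value of 256 are assumptions made for illustration, not the defaults of `configure_torch_dynamo()`.

```python
def configure_dynamo_sketch(
    cache_size_limit: int = 256,  # assumed default, for illustration only
    capture_scalar_outputs: bool = True,
) -> bool:
    """Set two torch._dynamo config flags; return False if torch._dynamo is unavailable."""
    try:
        import torch._dynamo as dynamo
    except ImportError:
        return False

    # Upper bound on cached compiled variants before dynamo stops recompiling.
    dynamo.config.cache_size_limit = cache_size_limit
    # Lets dynamo trace through ops that produce Python scalars (e.g. Tensor.item()).
    dynamo.config.capture_scalar_outputs = capture_scalar_outputs
    return True


applied = configure_dynamo_sketch(cache_size_limit=64)
```

These are process-wide settings, so they are typically applied once, before the first call to `torch.compile`.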
Create a CompileConfig from a dictionary.
Parameters:
Dictionary containing compile configuration.
Returns: CompileConfig
CompileConfig instance.
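A dictionary-to-config conversion of this kind might be sketched as below. The names `SketchCompileConfig` and `sketch_config_from_dict`, the field set, and the choice to reject unknown keys are all assumptions; the real function may instead ignore extra keys or apply different defaults.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class SketchCompileConfig:
    """Hypothetical config; field names are assumptions, not the real CompileConfig."""

    enabled: bool = False
    mode: str = "default"
    fullgraph: bool = False
    dynamic: bool = False
    backend: Optional[str] = None


def sketch_config_from_dict(config_dict: Dict[str, Any]) -> SketchCompileConfig:
    """Build a config from a dict, using dataclass defaults for missing keys."""
    known = {f.name for f in fields(SketchCompileConfig)}
    unknown = set(config_dict) - known
    if unknown:
        # Assumed policy: fail loudly on misspelled or unsupported options.
        raise ValueError(f"Unknown compile options: {sorted(unknown)}")
    return SketchCompileConfig(**config_dict)


cfg = sketch_config_from_dict({"enabled": True, "fullgraph": True})
```

Failing on unknown keys surfaces typos in a YAML file early, at the cost of being stricter than a loader that silently drops them.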
Enable torch._dynamo to capture scalar outputs for better Flash Attention + torch.compile compatibility.
Apply a targeted patch to the prepare_fa2_from_position_ids function for torch.compile compatibility.
The patch addresses this function's max_length computation, which is the part that needs the fix.