core.quantization.quant_config#

Provide base functionality for quantization purposes.

Usage is driven by a user-provided YAML file, for example:

    configs:
      nvfp4: $payload1
      mxfp8: $payload2

    matchers:
      fc1:
        config: "nvfp4"
        type: "glob"
        pattern: "*fc1*"
        enabled: True
      fc2:
        config: "nvfp4"
        type: "glob"
        pattern: "*fc2*"
        enabled: True
      default:
        config: "mxfp8"
        type: "glob"
        pattern: "*"
        enabled: True

The user-passed configuration is split into two distinct pieces:

  • A set of quantization configs, each describing how a given operator will be quantized. Note: these configs are consumed by the operators themselves; each operator being instantiated is responsible for parsing its configuration if it supports configurable quantization.

  • An ordered collection of matchers that determines which quantization config (if any) is applied to a given operator. Matchers are tested in order, and the first one that matches the context determines the key used to look up the configs dict; later matchers are consulted only when the earlier ones do not match. Each matcher defines a type, or style of matching: “glob” is bash-style, and new match types can be added by inheriting from the abstract Matcher class.
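
The first-match-wins rule described above can be sketched in a few lines. This is a minimal illustration, not the module's implementation: `MATCHERS` and `resolve_config_key` are hypothetical names, the module paths are made up, and the patterns assume explicit `*` wildcards.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple

# Hypothetical ordered matcher list: (glob pattern, config key).
# Order matters: the first pattern that matches wins.
MATCHERS: List[Tuple[str, str]] = [
    ("*fc1*", "nvfp4"),
    ("*fc2*", "nvfp4"),
    ("*", "mxfp8"),  # catch-all, so it must come last
]


def resolve_config_key(
    module_path: str, matchers: List[Tuple[str, str]] = MATCHERS
) -> Optional[str]:
    """Return the config key of the first matching pattern, or None."""
    for pattern, config_key in matchers:
        if fnmatchcase(module_path, pattern):
            return config_key
    return None  # no matcher matched: no quantization config applies
```

Placing the catch-all `*` entry first would shadow every later matcher, which is why the ordering of the collection is significant.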

The goal is to make it possible to define arbitrarily complicated recipes in as friendly a way as possible.

Module Contents#

Classes#

MatchContext

Layer context that can be matched to a quantization config.

QuantizationConfig

Wrapper around configuration dictionary for layer’s numerics.

Matcher

Matcher interface to select layers.

GlobMatcher

Pattern-based matcher using fnmatch to compare the module path against a pattern. fnmatch supplies glob-style matching similar to that used in bash.

RecipeConfig

Hold recipe information (matcher_fn -> configs).

Data#

API#

core.quantization.quant_config.logger#

‘getLogger(…)’

class core.quantization.quant_config.MatchContext#

Layer context that can be matched to a quantization config.

module_path: str#

None

layer_number: Optional[int]#

None

class core.quantization.quant_config.QuantizationConfig(
config: dict,
match_input: core.quantization.quant_config.MatchContext,
config_key: str,
)#

Wrapper around configuration dictionary for layer’s numerics.

Initialization

Initialize the quantization config.

The configuration dictionary is copied to defend against modules that mutate the configuration corrupting the configuration of other modules.
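
Why the copy matters can be shown with a small stand-in. The sketch below is an assumption-laden illustration: `ConfigWrapper` is a hypothetical class, the keys `format` and `block_size` are invented, and it uses `copy.deepcopy`, whereas the source only states that the dictionary is copied.

```python
import copy


class ConfigWrapper:
    """Hypothetical stand-in for a wrapper that owns a private config copy."""

    def __init__(self, config: dict, config_key: str):
        # Deep copy so nested dicts are not shared with the caller
        # (assumption: the real class may copy differently).
        self.config = copy.deepcopy(config)
        self.config_key = config_key


# One shared source dict handed to two modules (keys are invented).
shared = {"format": "nvfp4", "options": {"block_size": 16}}
a = ConfigWrapper(shared, "nvfp4")
b = ConfigWrapper(shared, "nvfp4")

# A module mutating its own copy affects neither the other module
# nor the original dictionary.
a.config["options"]["block_size"] = 32
```

Without the copy, the mutation made through `a` would also be visible through `b` and through `shared`.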

__repr__() → str#

class core.quantization.quant_config.Matcher#

Bases: abc.ABC

Matcher interface to select layers.

abstractmethod match(
context: core.quantization.quant_config.MatchContext,
) → Optional[str]#

Match a layer based on its qualified name.

If it does not match, return None. If it matches, return the configuration key to select for the layer.
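
A new match style follows the contract above: return a configuration key on a match, otherwise None. The sketch below uses local stand-ins for `MatchContext` and `Matcher` so that it runs on its own. `LayerRangeMatcher` is a hypothetical example, and how a new type would be registered for use from YAML is not covered here.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class MatchContext:
    """Stand-in mirroring the documented fields."""

    module_path: str
    layer_number: Optional[int] = None


class Matcher(ABC):
    """Stand-in for the documented abstract interface."""

    @abstractmethod
    def match(self, context: MatchContext) -> Optional[str]:
        ...


class LayerRangeMatcher(Matcher):
    """Hypothetical matcher: selects layers numbered in [first, last]."""

    def __init__(self, first: int, last: int, config_key: str):
        self.first = first
        self.last = last
        self.config_key = config_key

    def match(self, context: MatchContext) -> Optional[str]:
        # Contexts without a layer number (e.g. embeddings) never match.
        if context.layer_number is None:
            return None
        if self.first <= context.layer_number <= self.last:
            return self.config_key
        return None
```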

class core.quantization.quant_config.GlobMatcher(pattern: str, config_key: str)#

Bases: core.quantization.quant_config.Matcher

Pattern based matcher using fnmatch to compare the module path against a pattern. fnmatch supplies glob-style matching similar to that used in bash, allowing for matches like:

    pattern="*fc2*" - match anything which includes "fc2" anywhere in the string.
    pattern="*fc2" - match anything which includes "fc2" at the end of the string.
    pattern="*layers.10*" - match anything with "layers.10" (layer #) in the string.
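
The pattern semantics can be checked directly with the standard-library fnmatch module. The module path below is hypothetical, and `fnmatchcase` is used to avoid platform-dependent case normalization; which fnmatch function GlobMatcher itself calls is not stated here.

```python
from fnmatch import fnmatchcase

path = "decoder.layers.10.mlp.fc2"  # hypothetical module path

contains_fc2 = fnmatchcase(path, "*fc2*")       # substring anywhere
ends_with_fc2 = fnmatchcase(path, "*fc2")       # suffix only
bare_fc2 = fnmatchcase(path, "fc2")             # must equal the whole string
in_layer_10 = fnmatchcase(path, "*layers.10*")  # layer 10

# Caveat: "*layers.10*" also matches layers 100, 101, and so on.
# A trailing dot in the pattern narrows the match to layer 10 only.
layer_100_loose = fnmatchcase("decoder.layers.100.mlp.fc2", "*layers.10*")
layer_100_strict = fnmatchcase("decoder.layers.100.mlp.fc2", "*layers.10.*")
```

A pattern without wildcards matches only when it equals the entire module path, so substring-style matching needs a `*` on both sides.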

Initialization

match(
context: core.quantization.quant_config.MatchContext,
) → Optional[str]#

Pattern based match.

__repr__() → str#

class core.quantization.quant_config.RecipeConfig(
matchers: List[core.quantization.quant_config.Matcher],
config_dict: Dict[str, Dict],
)#

Hold recipe information (matcher_fn -> configs).

Initialization

static _build_matchers(
matchers_dict: Dict | None,
) → List[core.quantization.quant_config.Matcher]#

static from_yaml_file(
recipe_yaml_path: str,
) → core.quantization.quant_config.RecipeConfig#

Load a recipe from a YAML configuration file.

static from_config_dict(
config: Dict,
) → core.quantization.quant_config.RecipeConfig#

Load a recipe from a configuration dict.
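
As a rough sketch, the dict form presumably mirrors the YAML layout from the module description, with a `configs` mapping and a `matchers` mapping. The code below is not the library's parser: `build_glob_matchers` is a hypothetical helper, the payload dicts are invented, and skipping entries with `enabled: False` is an assumption about what that flag means.

```python
from typing import Dict, List, Optional, Tuple

# Dict form of a recipe. The payloads are invented placeholders.
recipe: Dict[str, Dict] = {
    "configs": {
        "nvfp4": {"example_option": 1},
        "mxfp8": {"example_option": 2},
    },
    "matchers": {
        "fc1": {"config": "nvfp4", "type": "glob", "pattern": "*fc1*", "enabled": True},
        "fc2": {"config": "nvfp4", "type": "glob", "pattern": "*fc2*", "enabled": False},
        "default": {"config": "mxfp8", "type": "glob", "pattern": "*", "enabled": True},
    },
}


def build_glob_matchers(matchers_dict: Optional[Dict]) -> List[Tuple[str, str]]:
    """Return (pattern, config_key) pairs in declaration order.

    Assumptions: disabled entries are skipped, and only the "glob"
    type is handled in this sketch.
    """
    built = []
    for name, spec in (matchers_dict or {}).items():
        if not spec.get("enabled", True):
            continue
        if spec["type"] != "glob":
            raise ValueError(f"unsupported matcher type for {name!r}: {spec['type']}")
        built.append((spec["pattern"], spec["config"]))
    return built


matchers = build_glob_matchers(recipe["matchers"])
```

Python dicts preserve insertion order, so the declaration order of the matchers carries over to the built list.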

match_to_config_key(
operator_context: core.quantization.quant_config.MatchContext,
) → str | None#

Given an operator’s context, return the configuration key selected by the first matcher that matches, or the sentinel None if no matcher matches.

match(
operator_context: core.quantization.quant_config.MatchContext,
) → core.quantization.quant_config.QuantizationConfig | None#

Given an operator’s context, return the QuantizationConfig selected by the first matcher that matches, or the sentinel None if no matcher matches.

__repr__() → str#