core.quantization.quant_config#
Provide base functionality for quantization purposes.
Usage comes from a user-provided YAML file, for example:
    configs:
      nvfp4:
        $payload1
      mxfp8:
        $payload2

    matchers:
      fc1:
        config: "nvfp4"
        type: "glob"
        pattern: "*fc1*"
        enabled: True
      fc2:
        config: "nvfp4"
        type: "glob"
        pattern: "*fc2*"
        enabled: True
      default:
        config: "mxfp8"
        type: "glob"
        pattern: "*"
        enabled: True
The user-passed configuration is split into 2 distinct pieces:
1. A set of quantization configs describing how a given operator will be quantized. Note: these are consumed by the operators themselves; each operator being instantiated is responsible for parsing its configuration if it supports configurable quantization.
2. An ordered collection of matchers that determines which quantization config (if any) is applied to a given operator. Matchers are tested in order, and the first one that matches the context supplies the key into the configs dict; if a matcher does not match, the next one in the list is tried. Each matcher defines a type, or style of matching: "glob" is bash-style, and new match types can be added by inheriting from the abstract Matcher class.
The idea is to make it possible to define arbitrarily complicated recipes in as friendly a way as possible.
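The first-match-wins ordering described above can be sketched with the standard library alone. This is a minimal illustration, not the module's implementation: `ORDERED_MATCHERS`, `select_config_key`, and the module paths are hypothetical names chosen for the example, and the patterns are only loosely modelled on the example recipe.

```python
from fnmatch import fnmatch
from typing import List, Optional, Tuple

# Hypothetical stand-in for the ordered matcher collection:
# (glob pattern, config key) pairs, tested from top to bottom.
ORDERED_MATCHERS: List[Tuple[str, str]] = [
    ("*fc1*", "nvfp4"),
    ("*fc2*", "nvfp4"),
    ("*", "mxfp8"),  # catch-all default, so it has to come last
]


def select_config_key(
    module_path: str,
    matchers: List[Tuple[str, str]] = ORDERED_MATCHERS,
) -> Optional[str]:
    """Return the config key of the first pattern that matches, else None."""
    for pattern, config_key in matchers:
        if fnmatch(module_path, pattern):
            return config_key
    return None


# Hypothetical module paths, used only to exercise the lookup.
fc1_key = select_config_key("decoder.layers.0.mlp.linear_fc1")
other_key = select_config_key("decoder.layers.0.self_attention.linear_qkv")
no_key = select_config_key("decoder.final_layernorm", matchers=[("*fc1*", "nvfp4")])
```

Because the catch-all `*` pattern matches every path, placing it anywhere but last would shadow the more specific matchers that follow it.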
Module Contents#
Classes#
MatchContext | Layer context that can be matched to a quantization config.
QuantizationConfig | Wrapper around configuration dictionary for layer’s numerics.
Matcher | Matcher interface to select layers.
GlobMatcher | Pattern-based matcher using fnmatch to compare the module path against a pattern. fnmatch supplies glob-style matching similar to that used in bash.
RecipeConfig | Hold recipe information (matcher_fn -> Configs).
Data#
API#
- core.quantization.quant_config.logger#
‘getLogger(…)’
- class core.quantization.quant_config.MatchContext#
Layer context that can be matched to a quantization config.
- module_path: str#
None
- layer_number: Optional[int]#
None
- class core.quantization.quant_config.QuantizationConfig(config: dict, match_input: core.quantization.quant_config.MatchContext, config_key: str)#
Wrapper around configuration dictionary for layer’s numerics.
Initialization
Initialize the quantization config.
The configuration dictionary is copied to defend against modules that mutate the configuration corrupting the configuration of other modules.
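The effect of that defensive copy can be illustrated with a small stand-in. This is a sketch under assumptions: `ConfigWrapper` is a hypothetical class, not the real `QuantizationConfig`, the dictionary keys are invented for the example, and the source does not say whether the real copy is shallow or deep (a deep copy is used here so nested values are isolated too).

```python
import copy


class ConfigWrapper:
    """Hypothetical stand-in for QuantizationConfig; not the real class."""

    def __init__(self, config: dict, config_key: str):
        # Assumption: deep copy, so nested dictionaries are not shared either.
        self.config = copy.deepcopy(config)
        self.config_key = config_key


# Invented payload; the real config contents are defined by each operator.
shared = {"block_size": 16, "scaling": {"mode": "dynamic"}}
first = ConfigWrapper(shared, "nvfp4")
second = ConfigWrapper(shared, "nvfp4")

# A module that mutates its own configuration...
first.config["scaling"]["mode"] = "static"

# ...leaves the other wrapper and the original dictionary untouched.
second_mode = second.config["scaling"]["mode"]
shared_mode = shared["scaling"]["mode"]
```

Without the copy, all three names would refer to the same nested dictionary and the mutation would leak into every module that received the config.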
- __repr__() → str#
- class core.quantization.quant_config.Matcher#
Bases: abc.ABC
Matcher interface to select layers.
- abstractmethod match( ) → Optional[str]#
Match a layer based on its qualified name.
If it does not match, return None. If it matches, return the configuration key to select for the layer.
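A new match type following this contract might look like the sketch below. Everything here is a self-contained stand-in: the `MatchContext` and `Matcher` definitions only mirror what this page documents, the parameter name `context` is an assumption (the signature is elided above), and `RegexMatcher` is a hypothetical subclass that does not exist in the module.

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class MatchContext:
    """Stand-in mirroring the two documented fields."""
    module_path: str
    layer_number: Optional[int] = None


class Matcher(ABC):
    """Stand-in for the abstract interface; `context` is an assumed name."""

    @abstractmethod
    def match(self, context: MatchContext) -> Optional[str]:
        ...


class RegexMatcher(Matcher):
    """Hypothetical matcher that selects layers by regular expression."""

    def __init__(self, pattern: str, config_key: str):
        self.regex = re.compile(pattern)
        self.config_key = config_key

    def match(self, context: MatchContext) -> Optional[str]:
        # Return the config key on a match, None otherwise.
        if self.regex.search(context.module_path):
            return self.config_key
        return None


matcher = RegexMatcher(r"layers\.(1|2)\.mlp", "nvfp4")
hit = matcher.match(MatchContext("decoder.layers.2.mlp.linear_fc1", layer_number=2))
miss = matcher.match(MatchContext("decoder.layers.12.mlp.linear_fc1", layer_number=12))
```

Returning `None` rather than raising lets the caller fall through to the next matcher in the ordered collection.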
- class core.quantization.quant_config.GlobMatcher(pattern: str, config_key: str)#
Bases: core.quantization.quant_config.Matcher
Pattern-based matcher using fnmatch to compare the module path against a pattern. fnmatch supplies glob-style matching similar to that used in bash, allowing for matches like:
- match_str="*fc2*": match anything that includes "fc2" anywhere in the string.
- match_str="*fc2": match anything that has "fc2" at the end of the string.
- match_str="*layers.10*": match anything with "layers.10" (the layer number) in the string.
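The glob semantics can be checked directly with Python's `fnmatch` module. The module path below is a hypothetical example; the point of the sketch is that a pattern must cover the whole string, so "anywhere in the string" requires a wildcard on both sides.

```python
from fnmatch import fnmatch

# Hypothetical module path, used only to exercise the patterns.
path = "decoder.layers.10.mlp.linear_fc2"

anywhere = fnmatch(path, "*fc2*")      # "fc2" anywhere in the string
suffix = fnmatch(path, "*fc2")         # "fc2" at the end of the string
bare = fnmatch(path, "fc2")            # no wildcards: the whole string must be "fc2"
layer = fnmatch(path, "*layers.10*")   # a specific layer number

# Caveat: a shorter layer number is a prefix of a longer one, so
# "*layers.1*" also matches layer 10. A trailing dot avoids that.
prefix_clash = fnmatch(path, "*layers.1*")
strict = fnmatch(path, "*layers.1.*")
```

Here `anywhere`, `suffix`, `layer`, and `prefix_clash` evaluate to `True`, while `bare` and `strict` evaluate to `False`.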
Initialization
- match( ) → Optional[str]#
Pattern based match.
- __repr__() → str#
- class core.quantization.quant_config.RecipeConfig(matchers: List[core.quantization.quant_config.Matcher], config_dict: Dict[str, Dict])#
Holds recipe information (matchers -> configs).
Initialization
- static _build_matchers(matchers_dict: Dict | None)#
- static from_yaml_file(recipe_yaml_path: str)#
Loads recipe from yaml configuration.
- static from_config_dict(config: Dict)#
Loads recipe from dict configuration.
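As a rough picture of what a dict-based recipe involves, the sketch below parses a dictionary shaped like the YAML example and resolves one lookup. It is not the module's code: `build_glob_matchers` and `lookup` are hypothetical helpers, the `hypothetical_option` payloads are invented, and two behaviours are assumptions rather than documented facts: that matcher order follows dictionary order, and that an entry with `enabled: False` is skipped.

```python
from fnmatch import fnmatch
from typing import Dict, List, Optional, Tuple

# Dictionary shaped like the YAML example; the payloads are invented.
recipe = {
    "configs": {
        "nvfp4": {"hypothetical_option": 1},
        "mxfp8": {"hypothetical_option": 2},
    },
    "matchers": {
        "fc1": {"config": "nvfp4", "type": "glob", "pattern": "*fc1*", "enabled": True},
        "fc2": {"config": "nvfp4", "type": "glob", "pattern": "*fc2*", "enabled": False},
        "default": {"config": "mxfp8", "type": "glob", "pattern": "*", "enabled": True},
    },
}


def build_glob_matchers(matchers_dict: Optional[Dict]) -> List[Tuple[str, str]]:
    """Turn matcher entries into ordered (pattern, config key) pairs."""
    matchers = []
    for name, spec in (matchers_dict or {}).items():
        if not spec.get("enabled", True):
            continue  # assumption: disabled entries are skipped
        if spec["type"] != "glob":
            raise ValueError(f"unsupported matcher type for {name!r}: {spec['type']!r}")
        matchers.append((spec["pattern"], spec["config"]))
    return matchers


def lookup(module_path: str, matchers: List[Tuple[str, str]], configs: Dict) -> Optional[Dict]:
    """Return the config selected by the first matching pattern, else None."""
    for pattern, key in matchers:
        if fnmatch(module_path, pattern):
            return configs[key]
    return None


matchers = build_glob_matchers(recipe["matchers"])
fc2_config = lookup("decoder.layers.0.mlp.linear_fc2", matchers, recipe["configs"])
```

Under these assumptions the disabled `fc2` entry is dropped, so the `fc2` layer falls through to the `default` matcher and receives the `mxfp8` config.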
- match_to_config_key(operator_context: core.quantization.quant_config.MatchContext)#
Given an operator’s context, return the matching configuration key, or the sentinel None if no matcher matched.
- match(operator_context: core.quantization.quant_config.MatchContext)#
Given an operator’s context, return the matching QuantizationConfig, or the sentinel None if no matcher matched.
- __repr__() → str#