tensorflow_quantization.BaseQuantizeWrapper

class tensorflow_quantization.BaseQuantizeWrapper(*args, **kwargs)[source]

Base wrapper class from which all layer wrappers inherit.

__init__(layer: keras.engine.base_layer.Layer, **kwargs)[source]

Create a quantize emulate wrapper for a keras layer. This wrapper provides options to quantize inputs and weights of the layer.

Parameters
  • layer (tf.keras.layers.Layer) -- The keras layer to be quantized.

  • **kwargs -- Additional keyword arguments to be passed to the keras layer.
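Conceptually, a quantize wrapper stores the layer it wraps and intercepts calls to insert quantize/dequantize steps around it. A minimal TensorFlow-free sketch of that pattern (FakeQuantWrapper and its fixed quantization step are illustrative names, not toolkit API):

```python
class FakeQuantWrapper:
    """Minimal sketch of the wrapper pattern: store the wrapped layer
    and intercept calls to simulate quantization around it."""

    def __init__(self, layer, quantize_inputs=True):
        self.layer = layer              # the wrapped "layer" (any callable)
        self.quantize_inputs = quantize_inputs

    def _fake_quant(self, x, step=0.5):
        # Snap to a fixed grid to emulate quantize -> dequantize (QDQ).
        return round(x / step) * step

    def __call__(self, x):
        if self.quantize_inputs:
            x = self._fake_quant(x)
        return self.layer(x)            # delegate to the wrapped layer


def double(v):
    return v * 2


wrapped = FakeQuantWrapper(double)
print(wrapped(1.3))  # input snaps to 1.5, then doubled -> 3.0
```

The real wrapper follows the same shape: `__init__` stores the layer, `build` creates the quantization state, and `call` quantizes before delegating.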

build(input_shape)[source]

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters

input_shape -- Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

abstract call(inputs, training=None)[source]

This is where the layer's logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.

Parameters
  • inputs -- Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

    • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.

    • NumPy array or Python scalar values in inputs get cast as tensors.

    • Keras mask metadata is only collected from inputs.

    • Layers are built (build(input_shape) method) using shape info from inputs only.

    • input_spec compatibility is only checked against inputs.

    • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.

    • The SavedModel input specification is generated using inputs only.

    • Integration with various ecosystem packages like TFMOT, TFLite, TF.js, etc is only supported for inputs and not for tensors in positional and keyword arguments.

  • *args -- Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.

  • **kwargs -- Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:

    • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.

    • mask: Boolean input mask. If the layer's call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).

Returns

A tensor or list/tuple of tensors.

compute_output_shape(input_shape)[source]

Computes the output shape of the layer.

This method will cause the layer's state to be built, if that has not happened before. This requires that the layer will later be used with inputs that match the input shape provided here.

Parameters

input_shape -- Shape tuple (tuple of integers) or list of shape tuples (one per input tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.

Returns

An output shape tuple.
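For intuition about the shape contract, a Dense-like compute_output_shape can be sketched in plain Python, with None standing in for a free dimension (the function name and behavior are illustrative, not the toolkit's implementation):

```python
def dense_output_shape(input_shape, units):
    """Sketch of compute_output_shape for a Dense-like layer:
    the last dimension becomes `units`; free (None) dims pass through."""
    if len(input_shape) < 2:
        raise ValueError("input_shape must have at least 2 dimensions")
    return tuple(input_shape[:-1]) + (units,)


print(dense_output_shape((None, 32), units=10))  # (None, 10)
```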

classmethod from_config(config)[source]

Creates a layer from its config.

This method is the reverse of get_config, capable of instantiating the same layer from the config dictionary. It does not handle layer connectivity (handled by Network), nor weights (handled by set_weights).

Parameters

config -- A Python dictionary, typically the output of get_config.

Returns

A layer instance.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.
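The round trip that get_config() and from_config() form can be sketched without TensorFlow; ScaleLayer and its factor argument are illustrative names, not toolkit API:

```python
class ScaleLayer:
    """Sketch of the config round trip: get_config() captures the
    constructor arguments, from_config() rebuilds an equivalent layer."""

    def __init__(self, factor=1.0):
        self.factor = factor

    def get_config(self):
        return {"factor": self.factor}

    @classmethod
    def from_config(cls, config):
        # The reverse of get_config: reinstantiate from the dictionary.
        return cls(**config)


layer = ScaleLayer(factor=2.5)
clone = ScaleLayer.from_config(layer.get_config())
print(clone.factor)  # 2.5
```

As the docs note, the clone carries the configuration but not the trained weights or graph connectivity.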

property losses

List of losses added using the add_loss() API.

Variable regularization tensors are created when this property is accessed, so it is eager safe: accessing losses under a tf.GradientTape will propagate gradients back to the corresponding variables.

Examples:

>>> class MyLayer(tf.keras.layers.Layer):
...   def call(self, inputs):
...     self.add_loss(tf.abs(tf.reduce_mean(inputs)))
...     return inputs
>>> l = MyLayer()
>>> l(np.ones((10, 1)))
>>> l.losses
[1.0]
>>> inputs = tf.keras.Input(shape=(10,))
>>> x = tf.keras.layers.Dense(10)(inputs)
>>> outputs = tf.keras.layers.Dense(1)(x)
>>> model = tf.keras.Model(inputs, outputs)
>>> # Activity regularization.
>>> len(model.losses)
0
>>> model.add_loss(tf.abs(tf.reduce_mean(x)))
>>> len(model.losses)
1
>>> inputs = tf.keras.Input(shape=(10,))
>>> d = tf.keras.layers.Dense(10, kernel_initializer='ones')
>>> x = d(inputs)
>>> outputs = tf.keras.layers.Dense(1)(x)
>>> model = tf.keras.Model(inputs, outputs)
>>> # Weight regularization.
>>> model.add_loss(lambda: tf.reduce_mean(d.kernel))
>>> model.losses
[<tf.Tensor: shape=(), dtype=float32, numpy=1.0>]

Returns

A list of tensors.

property non_trainable_weights

List of all non-trainable weights tracked by this layer.

Non-trainable weights are not updated during training. They are expected to be updated manually in call().

Returns

A list of non-trainable variables.

property trainable_weights

List of all trainable weights tracked by this layer.

Trainable weights are updated via gradient descent during training.

Returns

A list of trainable variables.
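The split between the two weight properties matters for the quantization wrappers below: the min/max calibration variables are registered with trainable=False, so they surface under non_trainable_weights, while the original kernel stays trainable. A plain-Python sketch of that partition (the Var class and variable names are illustrative):

```python
class Var:
    """Stand-in for a tracked variable with a trainable flag."""

    def __init__(self, name, trainable):
        self.name = name
        self.trainable = trainable


# A quantized layer typically tracks both kinds of variables.
weights = [
    Var("kernel", trainable=True),
    Var("bias", trainable=True),
    Var("kernel_min", trainable=False),  # calibration bound, not trained by SGD
    Var("kernel_max", trainable=False),  # calibration bound, not trained by SGD
]

trainable_weights = [w.name for w in weights if w.trainable]
non_trainable_weights = [w.name for w in weights if not w.trainable]
print(trainable_weights)      # ['kernel', 'bias']
print(non_trainable_weights)  # ['kernel_min', 'kernel_max']
```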

Example

Conv2DTranspose is a weighted layer that performs a transposed convolution, i.e. a transformation going in the opposite direction of a normal convolution.

Note

Conv2DTranspose is a Keras class, so the new wrapper class is named Conv2DTransposeQuantizeWrapper, following the toolkit's naming convention.
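For intuition about the shape transformation this layer performs, the output length of a transposed convolution per spatial axis follows the standard formula: (i - 1) * s + k for 'valid' padding and i * s for 'same', ignoring output_padding and dilation. A small sketch:

```python
def conv_transpose_output_length(length, kernel_size, stride, padding="valid"):
    """Standard output-length formula for a transposed convolution
    (per spatial axis, no output_padding or dilation)."""
    if padding == "valid":
        return (length - 1) * stride + kernel_size
    if padding == "same":
        return length * stride
    raise ValueError("padding must be 'valid' or 'same'")


print(conv_transpose_output_length(7, kernel_size=3, stride=2))  # 15
```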

import tensorflow as tf
from tensorflow.python.util import tf_inspect
from tensorflow_quantization.quantize_wrapper_base import BaseQuantizeWrapper

class Conv2DTransposeQuantizeWrapper(BaseQuantizeWrapper):
    def __init__(self, layer, kernel_type="kernel", **kwargs):
        """
        Create a quantize emulate wrapper for a keras layer.
        This wrapper provides options to quantize inputs, outputs and weights of a quantizable layer.
        Args:
        layer: The keras layer to be quantized.
        kernel_type: Options=['kernel' for Conv2D/Dense, 'depthwise_kernel' for DepthwiseConv2D]
        **kwargs: Additional keyword arguments to be passed to the keras layer.
        """
        self.kernel_type = kernel_type
        self.channel_axis = kwargs.get("axis", -1)
        super(Conv2DTransposeQuantizeWrapper, self).__init__(layer, **kwargs)

    def build(self, input_shape):
        super(Conv2DTransposeQuantizeWrapper, self).build(input_shape)

        self._weight_vars = []
        self.input_vars = {}
        self.output_vars = {}
        self.channel_axis = -1
        if self.kernel_type == "depthwise_kernel":
            self.channel_axis = 2
        # quantize weights only applicable for weighted ops.
        # By default weights is per channel quantization
        if self.quantize_weights:
            # get kernel weights dims.
            kernel_weights = getattr(self.layer, self.kernel_type)
            min_weight = self.layer.add_weight(
                kernel_weights.name.split(":")[0] + "_min",
                shape=(kernel_weights.shape[self.channel_axis]),
                initializer=tf.keras.initializers.Constant(-6.0),
                trainable=False,
            )
            max_weight = self.layer.add_weight(
                kernel_weights.name.split(":")[0] + "_max",
                shape=(kernel_weights.shape[self.channel_axis]),
                initializer=tf.keras.initializers.Constant(6.0),
                trainable=False,
            )
            quantizer_vars = {"min_var": min_weight, "max_var": max_weight}
            self._weight_vars.append((kernel_weights, quantizer_vars))
            # Needed to ensure unquantized weights get trained as part of the wrapper.
            self._trainable_weights.append(kernel_weights)

        # By default input is per tensor quantization
        if self.quantize_inputs:
            input_min_weight = self.layer.add_weight(
                self.layer.name + "_ip_min",
                shape=None,
                initializer=tf.keras.initializers.Constant(-6.0),
                trainable=False,
            )
            input_max_weight = self.layer.add_weight(
                self.layer.name + "_ip_max",
                shape=None,
                initializer=tf.keras.initializers.Constant(6.0),
                trainable=False,
            )
            self.input_vars["min_var"] = input_min_weight
            self.input_vars["max_var"] = input_max_weight

    def call(self, inputs, training=None):
        if training is None:
            training = tf.keras.backend.learning_phase()

        # Quantize all weights, and replace them in the underlying layer.
        if self.quantize_weights:
            quantized_weights = []
            quantized_weight = self._last_value_quantizer(
                self._weight_vars[0][0],
                training,
                self._weight_vars[0][1],
                per_channel=True,
                channel_axis=self.channel_axis
            )
            quantized_weights.append(quantized_weight)
            # Replace the original weights with QDQ weights
            setattr(self.layer, self.kernel_type, quantized_weights[0])

        # Quantize inputs to the conv layer
        if self.quantize_inputs:
            quantized_inputs = self._last_value_quantizer(
                inputs,
                training,
                self.input_vars,
                per_channel=False)
        else:
            quantized_inputs = inputs

        args = tf_inspect.getfullargspec(self.layer.call).args
        if "training" in args:
            outputs = self.layer.call(quantized_inputs, training=training)
        else:
            outputs = self.layer.call(quantized_inputs)

        return outputs
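The _last_value_quantizer used above performs fake quantization (quantize followed by dequantize) against the stored min/max bounds. A TensorFlow-free sketch of the underlying per-tensor QDQ arithmetic, assuming 8-bit uniform quantization; this is a simplification that omits the range nudging real implementations apply so that zero is exactly representable:

```python
def fake_quantize(x, min_val=-6.0, max_val=6.0, num_bits=8):
    """Simplified per-tensor quantize -> dequantize (QDQ):
    clamp to [min_val, max_val], snap to one of 2**num_bits levels,
    and map back to the float range."""
    levels = 2 ** num_bits - 1
    scale = (max_val - min_val) / levels
    clamped = min(max(x, min_val), max_val)
    q = round((clamped - min_val) / scale)  # integer level in 0..levels
    return q * scale + min_val              # dequantized value


print(fake_quantize(7.3))  # out-of-range input clamps to the upper bound, 6.0
```

This also shows why the min/max variables above are initialized to ±6.0: until calibration or training adjusts them, activations are assumed to lie in that range.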