Constant¶
Generates an output tensor containing constant values specified by the weights attribute. This operator has no inputs.
Attributes¶
shape
: Shape of the output tensor.
weights
: Weights of type T. The number of weights must match the number of elements in the output tensor.
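The element-count constraint can be checked before building the layer; a minimal NumPy sketch (the variable names are illustrative, not part of the API):

```python
import numpy as np

shape = (1, 3, 3, 3)
weights = np.arange(27, dtype=np.float32)

# The weight count must equal the product of the output dimensions.
assert weights.size == np.prod(shape)
```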
Outputs¶
output: Tensor of type T.
Data Types¶
T: int32, int64, float16, float32, int8, bool, bfloat16, float8
Shape Information¶
output is a tensor whose shape is given by the shape attribute.
DLA Support¶
DLA supports this operator only when it is connected as the second input of a PReLU operator.
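For context, PReLU computes elementwise `x if x >= 0 else slope * x`, with the per-channel slopes typically supplied by a Constant layer as its second input. A minimal NumPy sketch of that semantics (illustrative only, not DLA or TensorRT code):

```python
import numpy as np

# Slopes that a Constant layer would supply as PReLU's second input.
slopes = np.full((1, 3, 1, 1), 0.25, dtype=np.float32)

# One value per channel, shape (1, 3, 1, 1).
x = np.array([[[[-2.0]], [[4.0]], [[-8.0]]]], dtype=np.float32)

# PReLU semantics: identity for non-negative inputs, slope-scaled otherwise.
y = np.where(x >= 0, x, slopes * x)
```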
Examples¶
Constant
input_shape = [1, 3, 3, 3]
w = np.arange(0.0, 27.0, dtype=np.dtype("f4"))
layer = network.add_constant(shape=input_shape, weights=trt.Weights(w))
network.mark_output(layer.get_output(0))
outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = w.reshape(input_shape)
Constant Int4 Weights
def pack_int4(array: np.ndarray):
    result = []
    array = array.flatten()
    for low, high in zip(array[::2], array[1::2]):
        low = np.rint(np.clip(low, -8, 7)).astype(np.int8)
        high = np.rint(np.clip(high, -8, 7)).astype(np.int8)
        result.append(high << 4 | low & 0x0F)
    return np.asarray(result, dtype=np.int8)

w = np.array([[ 0,  1,  2,  3,  4,  5,  6,  7],
              [-1, -2, -3, -4, -5, -6, -7, -8],
              [ 7,  6,  5,  4,  3,  2,  1,  0],
              [-7, -6, -5, -4, -3, -2, -1,  0]], dtype=np.int8)
w_packed = pack_int4(w)
weights = network.add_constant(shape=w.shape, weights=trt.Weights(trt.int4, w_packed.ctypes.data, w.size))
# Quantized weights must be followed by a DQ node
scale = network.add_constant(shape=(), weights=np.ones(shape=(1), dtype=np.float32))
dequantize = network.add_dequantize(weights.get_output(0), scale.get_output(0), trt.float32)
dequantize.precision = trt.int4
network.mark_output(dequantize.get_output(0))
outputs[dequantize.get_output(0).name] = dequantize.get_output(0).shape
expected[dequantize.get_output(0).name] = w
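To sanity-check the packing, a sketch of the inverse operation can recover both signed nibbles from each packed byte via arithmetic shifts (`unpack_int4` is not a TensorRT helper; it is defined here only for illustration):

```python
import numpy as np

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    """Split each int8 byte back into two signed 4-bit values."""
    packed = packed.flatten()
    # Shift left then arithmetic-shift right to sign-extend the low nibble.
    low = (packed << 4).astype(np.int8) >> 4
    # Arithmetic right shift sign-extends the high nibble directly.
    high = packed >> 4
    # Interleave back into (low0, high0, low1, high1, ...) order.
    return np.stack([low, high], axis=-1).flatten()

# Packing the pairs (0, 1) and (-1, -2) yields bytes 0x10 and 0xEF (-17).
packed = np.array([16, -17], dtype=np.int8)
unpacked = unpack_int4(packed)  # recovers [0, 1, -1, -2]
```

Applied to `w_packed` from the example above, this reproduces `w.flatten()`.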
C++ API¶
For more information about the C++ IConstantLayer operator, refer to the C++ IConstantLayer documentation.
Python API¶
For more information about the Python IConstantLayer operator, refer to the Python IConstantLayer documentation.