MatrixMultiply¶
Computes a matrix product between two input tensors to produce an output tensor. When applicable, broadcasting is used (refer to Shape Information for more information).
Attributes¶
op0: How to treat the first input tensor:
NONE: Default behavior.
TRANSPOSE: Transpose the tensor. Only the last two dimensions are transposed.
VECTOR: Treat the tensor as a collection of vectors.
op1: How to treat the second input tensor; it has the same options as op0.
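The examples below only use NONE and TRANSPOSE. The following is a minimal sketch of how a VECTOR operand might be wired up, assuming the vector dimension of the VECTOR operand is contracted with the other operand and dropped from the output shape; the shapes and values in the comments follow from that assumption and are not taken from the examples below.
input_shape = [2, 3]      # a batch of two length-3 vectors
input_shape2 = [2, 3, 1]  # a batch of two 3x1 matrices
in1 = network.add_input("input1", dtype=trt.float32, shape=input_shape)
in2 = network.add_input("input2", dtype=trt.float32, shape=input_shape2)
# Treat the first operand as a collection of vectors; use the second operand as-is.
layer = network.add_matrix_multiply(in1, trt.MatrixOperation.VECTOR, in2, trt.MatrixOperation.NONE)
network.mark_output(layer.get_output(0))
# With input1 = [[-3, -2, -1], [10, -25, 0]] and
# input2 = [[[0], [1], [2]], [[3], [4], [5]]],
# the output under this assumption is [[-4], [-70]] with shape [2, 1].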
Outputs¶
C: tensor of type T
Data Types¶
T: float16, float32, bfloat16, float8
Shape Information¶
A is a tensor with a shape of \([a_0,...,a_n], n \geq 2\)
B is a tensor with a shape of \([b_0,...,b_n], n \geq 2\)
C is a tensor with a shape of \([c_0,...,c_n], n \geq 2\)
For each input dimension (except the last two), the lengths must match, or one of them must be equal to 1. In the latter case, the tensor is broadcasted along that axis.
The output has the same rank as the inputs.
For each output dimension (except the last two), its length is equal to the lengths of the corresponding input dimensions if they match, otherwise it is equal to the length that is not 1.
The last two dimensions follow the standard matrix multiplication rules. For example, when op0 and op1 are set to NONE, the inner dimensions must match (\(a_n = b_{n-1}\)), and \(c_{n-1} = a_{n-1}\), \(c_n = b_n\).
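As a cross-check, NumPy's matmul follows the same batch-broadcast and matrix-multiply shape rules, so the derivation can be sketched with a small helper; matmul_output_shape is illustrative only and not part of the TensorRT API.
import numpy as np

def matmul_output_shape(shape_a, shape_b):
    # Hypothetical helper: the batch dimensions broadcast element-wise
    # (equal lengths, or one of them is 1); the last two dimensions
    # follow the standard matrix-multiply rule.
    batch = np.broadcast_shapes(tuple(shape_a[:-2]), tuple(shape_b[:-2]))
    assert shape_a[-1] == shape_b[-2], "inner dimensions must match"
    return (*batch, shape_a[-2], shape_b[-1])

print(matmul_output_shape([1, 2, 2, 3], [1, 2, 3, 1]))  # (1, 2, 2, 1)
print(matmul_output_shape([1, 2, 2, 3], [1, 1, 3, 1]))  # (1, 2, 2, 1), B is broadcast along axis 1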
Examples¶
MatrixMultiply
input_shape = [1, 2, 2, 3]
input_shape2 = [1, 2, 3, 1]
in1 = network.add_input("input1", dtype=trt.float32, shape=input_shape)
in2 = network.add_input("input2", dtype=trt.float32, shape=input_shape2)
layer = network.add_matrix_multiply(in1, trt.MatrixOperation.NONE, in2, trt.MatrixOperation.NONE)
network.mark_output(layer.get_output(0))
inputs[in1.name] = np.array(
[
[
[
[-3.0, -2.0, -1.0],
[10.0, -25.0, 0.0],
],
[
[-5.0, -4.0, -3.0],
[1.0, -2.0, 2.0],
],
]
]
)
inputs[in2.name] = np.array(
[
[
[[0.0], [1.0], [2.0]],
[[3.0], [4.0], [5.0]],
]
]
)
outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
    [
        [
            [[-4.0], [-25.0]],
            [[-46.0], [5.0]],
        ]
    ]
)
MatrixMultiply With Broadcast
input_shape = [1, 2, 2, 3]
input_shape2 = [1, 1, 3, 1]
in1 = network.add_input("input1", dtype=trt.float32, shape=input_shape)
in2 = network.add_input("input2", dtype=trt.float32, shape=input_shape2)
layer = network.add_matrix_multiply(in1, trt.MatrixOperation.NONE, in2, trt.MatrixOperation.NONE)
network.mark_output(layer.get_output(0))
inputs[in1.name] = np.array(
[
[
[
[-3.0, -2.0, -1.0],
[10.0, -25.0, 0.0],
],
[
[-5.0, -4.0, -3.0],
[1.0, -2.0, 2.0],
],
]
]
)
inputs[in2.name] = np.array([[[[0.0], [1.0], [2.0]]]])
outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
[
[
[[-4.0], [-25.0]],
[[-10.0], [2.0]],
]
]
)
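The same broadcast example can be reproduced with NumPy, whose matmul applies the identical batch-broadcast rule; this snippet is only a cross-check of the values above.
import numpy as np

a = np.array([[[[-3.0, -2.0, -1.0],
                [10.0, -25.0, 0.0]],
               [[-5.0, -4.0, -3.0],
                [1.0, -2.0, 2.0]]]])       # shape (1, 2, 2, 3)
b = np.array([[[[0.0], [1.0], [2.0]]]])    # shape (1, 1, 3, 1), broadcast along axis 1
print(np.matmul(a, b))                     # [[[[-4.], [-25.]], [[-10.], [2.]]]]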
MatrixMultiply With Transpose
input_shape = [1, 2, 3, 2]
input_shape2 = [1, 2, 3, 1]
in1 = network.add_input("input1", dtype=trt.float32, shape=input_shape)
in2 = network.add_input("input2", dtype=trt.float32, shape=input_shape2)
layer = network.add_matrix_multiply(in1, trt.MatrixOperation.TRANSPOSE, in2, trt.MatrixOperation.NONE)
network.mark_output(layer.get_output(0))
inputs[in1.name] = np.array(
[
[
[
[-3.0, 10.0],
[-2.0, -25.0],
[-1.0, 0.0],
],
[
[-5.0, 1.0],
[-4.0, -2.0],
[-3.0, 2.0],
],
]
]
)
inputs[in2.name] = np.array(
[
[
[[0.0], [1.0], [2.0]],
[[3.0], [4.0], [5.0]],
]
]
)
outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
    [
        [
            [[-4.0], [-25.0]],
            [[-46.0], [5.0]],
        ]
    ]
)
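Equivalently in NumPy, the TRANSPOSE operation corresponds to swapping only the last two axes of the first operand before multiplying; this cross-check reproduces the expected values above.
import numpy as np

a = np.array([[[[-3.0, 10.0], [-2.0, -25.0], [-1.0, 0.0]],
               [[-5.0, 1.0], [-4.0, -2.0], [-3.0, 2.0]]]])  # shape (1, 2, 3, 2)
b = np.array([[[[0.0], [1.0], [2.0]],
               [[3.0], [4.0], [5.0]]]])                     # shape (1, 2, 3, 1)
print(np.matmul(a.swapaxes(-1, -2), b))                     # [[[[-4.], [-25.]], [[-46.], [5.]]]]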
C++ API¶
For more information about the C++ IMatrixMultiplyLayer operator, refer to the C++ IMatrixMultiplyLayer documentation.
Python API¶
For more information about the Python IMatrixMultiplyLayer operator, refer to the Python IMatrixMultiplyLayer documentation.