MatrixMultiply

Computes the matrix product of two input tensors and produces an output tensor. The inputs are broadcast where applicable (see Shape Information).

Attributes

op0: How to treat the first input tensor:

  • NONE Default behavior.

  • TRANSPOSE Transpose the tensor. Only the last two dimensions are transposed.

  • VECTOR Treat the tensor as a collection of vectors.

op1: How to treat the second input tensor. It has the same options as op0.
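The effect of op0 and op1 can be sketched in NumPy. This is a reference model only, not the TensorRT implementation: `apply_op` and `matmul_reference` are hypothetical helper names, the ops are passed as plain strings rather than `trt.MatrixOperation` values, and VECTOR is deliberately left out of the sketch.

```python
import numpy as np

def apply_op(x, op):
    """Return x as the matrix operand implied by op ("NONE" or "TRANSPOSE")."""
    if op == "NONE":
        return x
    if op == "TRANSPOSE":
        # Swap only the last two axes; leading (batch) axes are untouched.
        return np.swapaxes(x, -1, -2)
    raise ValueError(f"op not modeled in this sketch: {op}")

def matmul_reference(a, op0, b, op1):
    """NumPy stand-in for the layer: apply each op, then multiply."""
    return np.matmul(apply_op(a, op0), apply_op(b, op1))

a = np.arange(12, dtype=np.float32).reshape(1, 2, 3, 2)
b = np.ones((1, 2, 3, 1), dtype=np.float32)
c = matmul_reference(a, "TRANSPOSE", b, "NONE")
print(c.shape)  # (1, 2, 2, 1)
```

With op0 set to TRANSPOSE, the [3, 2] matrices in `a` are read as [2, 3] matrices, so they can be multiplied by the [3, 1] matrices in `b`.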

Inputs

A: tensor of type T

B: tensor of type T

Outputs

C: tensor of type T

Data Types

T: float16, float32, bfloat16, float8

Shape Information

A is a tensor with a shape of \([a_0,...,a_n], n \geq 2\)

B is a tensor with a shape of \([b_0,...,b_n], n \geq 2\)

C is a tensor with a shape of \([c_0,...,c_n], n \geq 2\)

For each input dimension (except the last two), the lengths must match, or one of them must be equal to 1. In the latter case, the tensor is broadcast along that axis.

The output has the same rank as the inputs. For each output dimension (except the last two), its length is equal to the length of the corresponding input dimensions if they match; otherwise it is equal to the length that is not 1. The last two dimensions follow the standard matrix multiplication rules. For example, when op0 and op1 are set to NONE, \(c_{n-1} = a_{n-1}, c_n = b_n\).
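The shape rules above can be written out as a small pure-Python function. This is a minimal sketch for illustration: `matmul_output_shape` is a hypothetical helper, not part of the TensorRT API, and it models only the NONE and TRANSPOSE options.

```python
def matmul_output_shape(a_shape, b_shape, op0="NONE", op1="NONE"):
    """Predict the output shape for ops NONE/TRANSPOSE (VECTOR is not modeled)."""
    if len(a_shape) != len(b_shape) or len(a_shape) < 2:
        raise ValueError("inputs must have the same rank, and at least rank 2")

    def matrix_dims(shape, op):
        # Rows and columns of the trailing matrix after applying the op.
        rows, cols = shape[-2], shape[-1]
        if op == "NONE":
            return rows, cols
        if op == "TRANSPOSE":
            return cols, rows
        raise ValueError(f"op not modeled in this sketch: {op}")

    a_rows, a_cols = matrix_dims(a_shape, op0)
    b_rows, b_cols = matrix_dims(b_shape, op1)
    if a_cols != b_rows:
        raise ValueError("inner matrix dimensions do not match")

    # Leading dimensions: lengths must match, or one of them must be 1.
    batch = []
    for a_len, b_len in zip(a_shape[:-2], b_shape[:-2]):
        if a_len == b_len:
            batch.append(a_len)
        elif a_len == 1:
            batch.append(b_len)
        elif b_len == 1:
            batch.append(a_len)
        else:
            raise ValueError(f"cannot broadcast {a_len} against {b_len}")
    return batch + [a_rows, b_cols]

print(matmul_output_shape([1, 2, 2, 3], [1, 1, 3, 1]))  # [1, 2, 2, 1]
print(matmul_output_shape([1, 2, 3, 2], [1, 2, 3, 1], op0="TRANSPOSE"))  # [1, 2, 2, 1]
```

The first call broadcasts the second input along axis 1; the second call swaps the last two dimensions of the first input before applying the matrix rule.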

Examples

MatrixMultiply
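import numpy as np
import tensorrt as trt

# network, inputs, outputs, and expected are assumed to be provided by the surrounding test harness.
input_shape = [1, 2, 2, 3]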
input_shape2 = [1, 2, 3, 1]
in1 = network.add_input("input1", dtype=trt.float32, shape=input_shape)
in2 = network.add_input("input2", dtype=trt.float32, shape=input_shape2)
layer = network.add_matrix_multiply(in1, trt.MatrixOperation.NONE, in2, trt.MatrixOperation.NONE)
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [
                [-3.0, -2.0, -1.0],
                [10.0, -25.0, 0.0],
            ],
            [
                [-5.0, -4.0, -3.0],
                [1.0, -2.0, 2.0],
            ],
        ]
    ]
)

inputs[in2.name] = np.array(
    [
        [
            [[0.0], [1.0], [2.0]],
            [[3.0], [4.0], [5.0]],
        ]
    ]
)

outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
    [
        [
            [[-4.0], [-25.0]],
            [[-46.0], [5.0]],
        ]
    ]
)
MatrixMultiply With Broadcast
input_shape = [1, 2, 2, 3]
input_shape2 = [1, 1, 3, 1]
in1 = network.add_input("input1", dtype=trt.float32, shape=input_shape)
in2 = network.add_input("input2", dtype=trt.float32, shape=input_shape2)
layer = network.add_matrix_multiply(in1, trt.MatrixOperation.NONE, in2, trt.MatrixOperation.NONE)
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [
                [-3.0, -2.0, -1.0],
                [10.0, -25.0, 0.0],
            ],
            [
                [-5.0, -4.0, -3.0],
                [1.0, -2.0, 2.0],
            ],
        ]
    ]
)

inputs[in2.name] = np.array([[[[0.0], [1.0], [2.0]]]])

outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
    [
        [
            [[-4.0], [-25.0]],
            [[-10.0], [2.0]],
        ]
    ]
)
MatrixMultiply With Transpose
input_shape = [1, 2, 3, 2]
input_shape2 = [1, 2, 3, 1]
in1 = network.add_input("input1", dtype=trt.float32, shape=input_shape)
in2 = network.add_input("input2", dtype=trt.float32, shape=input_shape2)
layer = network.add_matrix_multiply(in1, trt.MatrixOperation.TRANSPOSE, in2, trt.MatrixOperation.NONE)
network.mark_output(layer.get_output(0))

inputs[in1.name] = np.array(
    [
        [
            [
                [-3.0, 10.0],
                [-2.0, -25.0],
                [-1.0, 0.0],
            ],
            [
                [-5.0, 1.0],
                [-4.0, -2.0],
                [-3.0, 2.0],
            ],
        ]
    ]
)

inputs[in2.name] = np.array(
    [
        [
            [[0.0], [1.0], [2.0]],
            [[3.0], [4.0], [5.0]],
        ]
    ]
)

outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array(
    [
        [
            [[-4.0], [-25.0]],
            [[-46.0], [5.0]],
        ]
    ]
)
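
The expected values in the three examples above can be cross-checked without building a network. The sketch below uses `np.matmul` as a stand-in for the layer, on the assumption that NumPy's batched matrix product and broadcasting follow the rules in Shape Information; the variable names are illustrative only.

```python
import numpy as np

# First input from the plain and broadcast examples, shape [1, 2, 2, 3].
a = np.array(
    [[[[-3.0, -2.0, -1.0], [10.0, -25.0, 0.0]],
      [[-5.0, -4.0, -3.0], [1.0, -2.0, 2.0]]]],
    dtype=np.float32,
)
# Second input from the plain and transpose examples, shape [1, 2, 3, 1].
b = np.array(
    [[[[0.0], [1.0], [2.0]],
      [[3.0], [4.0], [5.0]]]],
    dtype=np.float32,
)

# Plain product: both ops NONE.
plain = np.matmul(a, b)

# Broadcast: the second input has length 1 on axis 1, shape [1, 1, 3, 1].
broadcast = np.matmul(a, b[:, :1])

# Transpose: the first input is stored as [1, 2, 3, 2] and transposed by the op.
a_stored = np.swapaxes(a, -1, -2)
transposed = np.matmul(np.swapaxes(a_stored, -1, -2), b)

print(plain.shape)            # (1, 2, 2, 1)
print(plain.reshape(-1))      # values -4, -25, -46, 5
print(broadcast.reshape(-1))  # values -4, -25, -10, 2
```

The transpose case yields the same values as the plain case because the stored [3, 2] matrices are the transposes of the [2, 3] matrices used in the first example.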

C++ API

For more information about the C++ IMatrixMultiplyLayer operator, refer to the C++ IMatrixMultiplyLayer documentation.

Python API

For more information about the Python IMatrixMultiplyLayer operator, refer to the Python IMatrixMultiplyLayer documentation.