tritonclient.utils


Functions

deserialize_bf16_tensor(encoded_tensor)

Deserializes an encoded bf16 tensor into a numpy array of dtype float32

deserialize_bytes_tensor(encoded_tensor)

Deserializes an encoded bytes tensor into a numpy array of dtype object (Python bytes objects)

np_to_triton_dtype(np_dtype)

raise_error(msg)

Raise error with the provided message

serialize_bf16_tensor(input_tensor)

Serializes a bfloat16 tensor into a flat numpy array of bytes.

serialize_byte_tensor(input_tensor)

Serializes a bytes tensor into a flat numpy array of length-prepended bytes.

serialized_byte_size(tensor_value)

Get the underlying number of bytes for a numpy ndarray.

triton_to_np_dtype(dtype)

Exceptions

InferenceServerException(msg[, status, ...])

Exception indicating non-Success status.

exception tritonclient.utils.InferenceServerException(msg, status=None, debug_details=None)

Exception indicating non-Success status.

Parameters:
  • msg (str) – A brief description of the error

  • status (str) – The error code

  • debug_details (str) – Additional details about the error

debug_details()

Get detailed information about the exception for debugging purposes.

Returns:

The exception details.

Return type:

str

message()

Get the exception message.

Returns:

The message associated with this exception, or None if no message.

Return type:

str

status()

Get the status of the exception.

Returns:

The status (error code) of the exception.

Return type:

str
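A handler usually reads all three accessors when it reports a failure. The following is a minimal sketch rather than library code: `describe_error` and `StandInError` are hypothetical names, and `StandInError` only mimics the `message()`, `status()`, and `debug_details()` accessors documented above so that the snippet runs without a Triton client installed. In real code the `except` clause would presumably catch `tritonclient.utils.InferenceServerException` instead.

```python
class StandInError(Exception):
    """Hypothetical stand-in exposing the three documented accessors."""

    def __init__(self, msg, status=None, debug_details=None):
        super().__init__(msg)
        self._msg = msg
        self._status = status
        self._debug_details = debug_details

    def message(self):
        return self._msg

    def status(self):
        return self._status

    def debug_details(self):
        return self._debug_details


def describe_error(exc):
    """Build a one-line report from the documented accessors."""
    parts = []
    if exc.status() is not None:
        parts.append(f"[{exc.status()}]")
    parts.append(exc.message() or "<no message>")
    if exc.debug_details() is not None:
        parts.append(f"(details: {exc.debug_details()})")
    return " ".join(parts)


try:
    # In real code this would be a client call that can fail.
    raise StandInError("model not found", status="StatusCode.NOT_FOUND")
except StandInError as exc:
    report = describe_error(exc)
    print(report)
```

Because `status` and `debug_details` default to `None`, the helper checks each one before including it in the report.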

tritonclient.utils.deserialize_bf16_tensor(encoded_tensor)

Deserializes an encoded bf16 tensor into a numpy array of dtype float32.

Parameters:

encoded_tensor (bytes) – The encoded bytes tensor where each element is 2 bytes (size of bfloat16)

Returns:

string_tensor – The 1-D numpy array of type float32 containing the deserialized bytes in row-major form.

Return type:

np.array
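The 2-byte element layout can be illustrated with the standard library alone. This is a sketch under a stated assumption, not the library's implementation: it assumes each element holds the upper 16 bits of a little-endian IEEE 754 float32, so padding two zero bytes in front of it restores a float32. `decode_bf16` is a hypothetical helper name, and it returns a plain list instead of a numpy array.

```python
import struct


def decode_bf16(encoded: bytes) -> list:
    """Decode 2-byte bfloat16 elements into Python floats.

    Assumption: each element is the upper half of a little-endian float32.
    """
    if len(encoded) % 2 != 0:
        raise ValueError("encoded length must be a multiple of 2")
    values = []
    for offset in range(0, len(encoded), 2):
        # Restore the dropped low-order 16 bits as zeros.
        padded = b"\x00\x00" + encoded[offset:offset + 2]
        values.append(struct.unpack("<f", padded)[0])
    return values


# 1.0 is 0x3F800000 and -2.0 is 0xC0000000 as float32;
# their upper halves in little-endian byte order are 80 3F and 00 C0.
decoded = decode_bf16(b"\x80\x3f\x00\xc0")
print(decoded)
```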

tritonclient.utils.deserialize_bytes_tensor(encoded_tensor)

Deserializes an encoded bytes tensor into a numpy array of dtype object (Python bytes objects).

Parameters:

encoded_tensor (bytes) – The encoded bytes tensor where each element has its length in the first 4 bytes, followed by the content

Returns:

string_tensor – The 1-D numpy array of type object containing the deserialized bytes in row-major form.

Return type:

np.array
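The length-prefixed layout described above can be walked with a simple offset loop. The sketch below uses only the standard library and is not the library's implementation. It assumes the 4-byte length is an unsigned little-endian integer, which the text above does not state. `decode_length_prefixed` is a hypothetical helper name, and it returns a plain list instead of a numpy object array.

```python
import struct


def decode_length_prefixed(encoded: bytes) -> list:
    """Split a buffer of length-prefixed elements into a list of bytes.

    Assumption: each length is an unsigned little-endian 32-bit integer.
    """
    items = []
    offset = 0
    while offset < len(encoded):
        (length,) = struct.unpack_from("<I", encoded, offset)
        offset += 4
        items.append(encoded[offset:offset + length])
        offset += length
    return items


# Three elements: b"hi", an empty element, and b"a" followed by two zero bytes.
buffer = b"\x02\x00\x00\x00hi" + b"\x00\x00\x00\x00" + b"\x03\x00\x00\x00a\x00\x00"
elements = decode_length_prefixed(buffer)
print(elements)
```

Because every element carries an explicit length, empty elements and trailing zero bytes are recovered exactly.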

tritonclient.utils.np_to_triton_dtype(np_dtype)

tritonclient.utils.raise_error(msg)

Raise error with the provided message

tritonclient.utils.serialize_bf16_tensor(input_tensor)

Serializes a bfloat16 tensor into a flat numpy array of bytes. The input numpy array should have dtype np.float32.

Parameters:

input_tensor (np.array) – The bfloat16 tensor to serialize.

Returns:

serialized_bf16_tensor – The 1-D numpy array of type uint8 containing the serialized bytes in row-major form.

Return type:

np.array

Raises:

InferenceServerException – If unable to serialize the given tensor.
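Serializing float32 values as bfloat16 loses precision, which a standard-library sketch can make concrete. It is not the library's implementation: it assumes the conversion truncates, keeping the upper 16 bits of the little-endian float32 encoding without rounding. `encode_bf16` is a hypothetical helper name, and it returns `bytes` instead of a uint8 numpy array.

```python
import struct


def encode_bf16(values) -> bytes:
    """Truncate float32 values to bfloat16 and concatenate them.

    Assumption: truncation keeps the upper 16 bits of the little-endian
    float32 encoding and drops the lower 16 mantissa bits (no rounding).
    """
    out = bytearray()
    for value in values:
        out += struct.pack("<f", value)[2:4]
    return bytes(out)


encoded = encode_bf16([1.0, -2.0, 3.14159])
print(encoded.hex())

# Pad the last element back to float32 to see the precision that survives.
restored = struct.unpack("<f", b"\x00\x00" + encoded[4:6])[0]
print(restored)
```

Values such as 1.0 and -2.0 round-trip exactly, while 3.14159 comes back as a nearby value with fewer mantissa bits.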

tritonclient.utils.serialize_byte_tensor(input_tensor)

Serializes a bytes tensor into a flat numpy array of length-prepended bytes. The numpy array should use dtype np.object_. Avoid np.bytes_, because numpy strips trailing zero bytes from each byte sequence.

Parameters:

input_tensor (np.array) – The bytes tensor to serialize.

Returns:

serialized_bytes_tensor – The 1-D numpy array of type uint8 containing the serialized bytes in row-major form.

Return type:

np.array

Raises:

InferenceServerException – If unable to serialize the given tensor.
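The length-prepended output format can be sketched with the standard library. This is an illustration under assumptions, not the library's implementation: it assumes the prefix is an unsigned little-endian 32-bit length, and it handles only `bytes` elements. `encode_length_prefixed` is a hypothetical helper name, and it returns `bytes` instead of a uint8 numpy array.

```python
import struct


def encode_length_prefixed(items) -> bytes:
    """Concatenate bytes elements, each preceded by its 4-byte length.

    Assumption: each length is an unsigned little-endian 32-bit integer.
    """
    out = bytearray()
    for item in items:
        out += struct.pack("<I", len(item))
        out += item
    return bytes(out)


# The second element ends in two zero bytes that must be preserved.
payload = encode_length_prefixed([b"hi", b"a\x00\x00"])
print(payload)
```

The explicit length keeps the trailing zero bytes of the second element, which is the information a fixed-width bytes dtype would lose.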

tritonclient.utils.serialized_byte_size(tensor_value)

Get the underlying number of bytes for a numpy ndarray.

Parameters:

tensor_value (numpy.ndarray) – Numpy array to calculate the number of bytes for.

Returns:

Number of bytes present in this tensor

Return type:

int

tritonclient.utils.triton_to_np_dtype(dtype)

Modules

tritonclient.utils.cuda_shared_memory

tritonclient.utils.shared_memory