tritonclient.utils

Functions

`deserialize_bf16_tensor`(encoded_tensor)	Deserializes an encoded bf16 tensor into a numpy array of dtype float32.
`deserialize_bytes_tensor`(encoded_tensor)	Deserializes an encoded bytes tensor into a numpy array of dtype object.
`np_to_triton_dtype`(np_dtype)	Converts a numpy dtype to the corresponding Triton datatype name.
`raise_error`(msg)	Raise error with the provided message
`serialize_bf16_tensor`(input_tensor)	Serializes a bfloat16 tensor into a flat numpy array of bytes.
`serialize_byte_tensor`(input_tensor)	Serializes a bytes tensor into a flat numpy array of length-prepended bytes.
`serialized_byte_size`(tensor_value)	Get the underlying number of bytes for a numpy ndarray.
`triton_to_np_dtype`(dtype)	Converts a Triton datatype name to the corresponding numpy dtype.

Exceptions

InferenceServerException(msg[, status, ...])

Exception indicating non-Success status.

exception tritonclient.utils.InferenceServerException(msg, status=None, debug_details=None)

Exception indicating non-Success status.

Parameters:

msg (str) – A brief description of the error
status (str) – The error code
debug_details (str) – Additional details on the error

debug_details()

Get detailed information about the exception for debugging purposes.

Returns:: The exception details
Return type:: str

message()

Get the exception message.

Returns:: The message associated with this exception, or None if no message.
Return type:: str

status()

Get the status of the exception.

Returns:: The status of the exception
Return type:: str
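
A minimal sketch of reading the three accessors while handling an error. The `except ImportError` branch defines a stand-in class written only for this example, so that the snippet runs without the package installed; it mirrors the constructor and methods documented above and is not the library's implementation. The helper `describe_error` and the message and status values are hypothetical.

```python
try:
    from tritonclient.utils import InferenceServerException
except ImportError:
    # Stand-in for illustration only: mirrors the documented constructor
    # and accessors, not the library's actual implementation.
    class InferenceServerException(Exception):
        def __init__(self, msg, status=None, debug_details=None):
            super().__init__(msg)
            self._msg = msg
            self._status = status
            self._debug_details = debug_details

        def message(self):
            return self._msg

        def status(self):
            return self._status

        def debug_details(self):
            return self._debug_details


def describe_error(exc):
    """Collect the documented accessors into a dict, e.g. for logging."""
    return {
        "message": exc.message(),
        "status": exc.status(),
        "debug_details": exc.debug_details(),
    }


try:
    # Hypothetical failure; a real client call would raise this itself.
    raise InferenceServerException("model not ready", status="UNAVAILABLE")
except InferenceServerException as exc:
    info = describe_error(exc)
    print(info)
```

Because `status` and `debug_details` default to None, code that logs them should be prepared for missing values.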

tritonclient.utils.deserialize_bf16_tensor(encoded_tensor)

Deserializes an encoded bf16 tensor into a numpy array of dtype float32.

Parameters:: encoded_tensor (bytes) – The encoded bytes tensor where each element is 2 bytes (size of bfloat16)
Returns:: string_tensor – The 1-D numpy array of type float32 containing the deserialized bytes in row-major form.
Return type:: np.array

tritonclient.utils.deserialize_bytes_tensor(encoded_tensor)

Deserializes an encoded bytes tensor into a numpy array of dtype object.

Parameters:: encoded_tensor (bytes) – The encoded bytes tensor where each element has its length in first 4 bytes followed by the content
Returns:: string_tensor – The 1-D numpy array of type object containing the deserialized bytes in row-major form.
Return type:: np.array
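
The layout described above (a 4-byte length followed by the content, repeated per element) can be sketched in plain Python. This is an illustration of the format, not the library's code: the function name `decode_length_prefixed` is hypothetical, and the little-endian unsigned 32-bit prefix is an assumption of this sketch.

```python
import struct

import numpy as np


def decode_length_prefixed(encoded):
    """Split a buffer of [4-byte length][content] records into a 1-D object array."""
    items = []
    offset = 0
    while offset < len(encoded):
        # Assumption: the length prefix is an unsigned little-endian 32-bit integer.
        (length,) = struct.unpack_from("<I", encoded, offset)
        offset += 4
        items.append(encoded[offset:offset + length])
        offset += length
    return np.array(items, dtype=np.object_)


# Two elements: b"hello" (5 bytes) and b"hi" (2 bytes).
buf = struct.pack("<I", 5) + b"hello" + struct.pack("<I", 2) + b"hi"
decoded = decode_length_prefixed(buf)
print(decoded)
```

Since the result is 1-D and in row-major order, reshaping it to the tensor's known shape recovers the original layout.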

tritonclient.utils.np_to_triton_dtype(np_dtype)

Converts a numpy dtype to the corresponding Triton datatype name.

tritonclient.utils.raise_error(msg)

Raise error with the provided message

tritonclient.utils.serialize_bf16_tensor(input_tensor)

Serializes a bfloat16 tensor into a flat numpy array of bytes. The input numpy array should use dtype np.float32.

Parameters:: input_tensor (np.array) – The bfloat16 tensor to serialize.
Returns:: serialized_bf16_tensor – The 1-D numpy array of type uint8 containing the serialized bytes in row-major form.
Return type:: np.array
Raises:: InferenceServerException – If unable to serialize the given tensor.
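
A bfloat16 value occupies 2 bytes and shares its sign and exponent bits with float32, which is why float32 arrays are used on both sides. The following sketch shows one way to convert between the two with plain NumPy. The function names are hypothetical, and both the truncation (rather than rounding) of the low 16 bits and the little-endian byte order are assumptions of this sketch, not a statement about the library's implementation.

```python
import numpy as np


def float32_to_bf16_bytes(x):
    """Keep the upper 16 bits of each float32 and return them as a flat uint8 array."""
    bits = np.ascontiguousarray(x, dtype="<f4").reshape(-1).view("<u4")
    # Assumption: plain truncation (no rounding) of the low 16 mantissa bits.
    upper = (bits >> 16).astype("<u2")
    return np.frombuffer(upper.tobytes(), dtype=np.uint8)


def bf16_bytes_to_float32(raw):
    """Inverse of the sketch above: pad each 2-byte element back to a float32."""
    upper = np.frombuffer(bytes(raw), dtype="<u2").astype("<u4")
    return (upper << 16).view("<f4")


# Every value here fits in bfloat16's 8 significant bits, so the round trip is exact.
values = np.array([[1.0, -2.0], [0.5, 3.140625]], dtype=np.float32)
encoded = float32_to_bf16_bytes(values)
restored = bf16_bytes_to_float32(encoded).reshape(values.shape)
print(encoded.size, restored)
```

Values that need more than 8 significant bits lose precision in the conversion, so a round trip is not exact in general.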

tritonclient.utils.serialize_byte_tensor(input_tensor)

Serializes a bytes tensor into a flat numpy array of length-prepended bytes. The numpy array should use dtype np.object_. With np.bytes_, numpy strips trailing zero bytes from each byte sequence, so that dtype should be avoided.

Parameters:: input_tensor (np.array) – The bytes tensor to serialize.
Returns:: serialized_bytes_tensor – The 1-D numpy array of type uint8 containing the serialized bytes in row-major form.
Return type:: np.array
Raises:: InferenceServerException – If unable to serialize the given tensor.
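
The sketch below illustrates both points from the description: why an object array is preferred over np.bytes_, and what a length-prepended layout looks like. The function name `encode_length_prefixed` is hypothetical and the little-endian unsigned 32-bit prefix is an assumption; this is not the library's code.

```python
import struct

import numpy as np


def encode_length_prefixed(tensor):
    """Flatten in row-major order and emit [4-byte length][content] per element."""
    chunks = []
    for item in tensor.flatten(order="C"):
        data = bytes(item)  # elements are expected to be bytes objects
        # Assumption: little-endian unsigned 32-bit length prefix.
        chunks.append(struct.pack("<I", len(data)) + data)
    return np.frombuffer(b"".join(chunks), dtype=np.uint8)


as_object = np.array([b"ab\x00\x00", b"c"], dtype=np.object_)
as_bytes = np.array([b"ab\x00\x00", b"c"], dtype=np.bytes_)

print(as_object[0])  # trailing zero bytes are kept
print(as_bytes[0])   # trailing zero bytes are stripped by NumPy

serialized = encode_length_prefixed(as_object)
print(serialized.size)
```

With the object array the first element keeps all 4 bytes, so the serialized buffer holds 4 + 4 bytes for it and 4 + 1 bytes for the second element.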

tritonclient.utils.serialized_byte_size(tensor_value)

Get the underlying number of bytes for a numpy ndarray.

Parameters:: tensor_value (numpy.ndarray) – Numpy array to calculate the number of bytes for.
Returns:: Number of bytes present in this tensor
Return type:: int

tritonclient.utils.triton_to_np_dtype(dtype)

Converts a Triton datatype name to the corresponding numpy dtype.
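
As their names suggest, `np_to_triton_dtype` and `triton_to_np_dtype` translate between numpy dtypes and Triton datatype names such as "FP32". The sketch below shows the idea with a small lookup table. The table is an illustrative subset, and `NP_TO_TRITON`, `TRITON_TO_NP` and `lookup_triton_name` are hypothetical names written for this example; check the installed library for the full mapping and its behavior on unsupported types.

```python
import numpy as np

# Illustrative subset of the numpy <-> Triton datatype correspondence.
# The table and helper names are hypothetical, written for this example.
NP_TO_TRITON = {
    np.bool_: "BOOL",
    np.int32: "INT32",
    np.int64: "INT64",
    np.float16: "FP16",
    np.float32: "FP32",
    np.float64: "FP64",
    np.object_: "BYTES",
}
TRITON_TO_NP = {name: np_type for np_type, name in NP_TO_TRITON.items()}


def lookup_triton_name(np_dtype):
    """Return the Triton name for a numpy dtype, or None if it is not in the table."""
    return NP_TO_TRITON.get(np.dtype(np_dtype).type)


print(lookup_triton_name(np.float32))
print(TRITON_TO_NP["BYTES"])
```

Normalizing the argument through `np.dtype(...).type` lets the lookup accept a dtype instance, a scalar type, or a dtype string alike.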

Modules

`cuda_shared_memory`
`shared_memory`