morpheus.stages.inference.triton_inference_stage.ShmInputWrapper
- class ShmInputWrapper(client, model_name, config)
Bases: morpheus.stages.inference.triton_inference_stage.InputWrapper
This class wraps a CUDA shared memory object that is shared between this process and a Triton server instance. Since the Triton server normally accepts only numpy arrays as inputs, this class passes references to input memory on the device directly to the server, without staging the data on the host, which eliminates serialization and network overhead.
- Parameters
- client : tritonclient.InferenceServerClient
Triton inference server client instance.
- model_name : str
Name of the model. Specifies which model can handle the inference requests that are sent to Triton inference server.
- config : typing.Dict[str, TritonInOut]
Model input and output configuration. Keys represent the input/output names. Values will be a TritonInOut object.
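As a rough illustration of the shape of the config argument, the sketch below builds a dictionary keyed by input/output name. FakeInOut is a hypothetical stand-in for TritonInOut; its fields (name, bytes, datatype, shape, offset) and the tensor names are assumptions made for this example, so check the TritonInOut definition for the real attributes.

```python
import dataclasses
import typing


@dataclasses.dataclass
class FakeInOut:
    """Hypothetical stand-in for TritonInOut; the field names are assumptions."""
    name: str
    bytes: int  # total size of the tensor in bytes
    datatype: str  # Triton-style datatype string
    shape: typing.List[int]
    offset: int = 0


# Keys are the input/output names; values describe each tensor.
config: typing.Dict[str, FakeInOut] = {
    "input_ids": FakeInOut("input_ids", bytes=8 * 128 * 4, datatype="INT32", shape=[8, 128]),
    "attention_mask": FakeInOut("attention_mask", bytes=8 * 128 * 4, datatype="INT32", shape=[8, 128]),
}

for key, spec in config.items():
    print(key, spec.datatype, spec.shape, spec.bytes)
```

A mapping like this would be passed as the third constructor argument, alongside a real client and model name.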
Methods
- build_input(name, data, force_convert_inputs): This helper function builds a Triton InferInput object that can be directly used by tritonclient.async_infer.
- get_bytes(name): Get the bytes needed for a particular input/output.
- get_offset(name): Get the offset needed for a particular input/output.
- get_ptr(name): Returns the cupy.cuda.MemoryPointer object to the internal ShmWrapper for the specified input/output name.
- build_input(name, data, force_convert_inputs)
This helper function builds a Triton InferInput object that can be directly used by tritonclient.async_infer. Utilizes the config option passed in the constructor to determine the shape/size/type.
- Parameters
- name : str
Inference input name.
- data : cupy.ndarray
Inference input data.
- force_convert_inputs : bool
Whether or not to convert the inputs to the type specified by Triton. This will happen automatically if no data would be lost in the conversion (e.g., float -> double). Set this to True to convert the input even if data would be lost (e.g., double -> float).
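The conversion rule for force_convert_inputs can be sketched with NumPy on the host. This is not the stage's implementation: convert_input is a hypothetical helper, np.can_cast with "safe" casting stands in for whatever lossless check the stage performs, and raising TypeError on a lossy mismatch is an assumption about behavior the source does not specify.

```python
import numpy as np


def convert_input(data: np.ndarray, expected_dtype, force_convert_inputs: bool) -> np.ndarray:
    """Hypothetical helper: cast `data` to `expected_dtype` under the rule described above."""
    expected_dtype = np.dtype(expected_dtype)
    if data.dtype == expected_dtype:
        return data  # already the expected type, nothing to do
    # "safe" casting means no information can be lost (e.g. float32 -> float64).
    lossless = np.can_cast(data.dtype, expected_dtype, casting="safe")
    if lossless or force_convert_inputs:
        return data.astype(expected_dtype)
    # Assumption: a lossy mismatch without the flag is treated as an error.
    raise TypeError(f"cannot convert {data.dtype} to {expected_dtype} without losing data")


# float -> double is lossless, so it is converted even without the flag.
widened = convert_input(np.ones(4, dtype=np.float32), np.float64, force_convert_inputs=False)

# double -> float may lose data, so it is converted only when forced.
narrowed = convert_input(np.ones(4, dtype=np.float64), np.float32, force_convert_inputs=True)

try:
    convert_input(np.ones(4, dtype=np.float64), np.float32, force_convert_inputs=False)
    rejected = False
except TypeError:
    rejected = True
```

In the real stage the data is a cupy.ndarray on the device; NumPy is used here only so the sketch runs without a GPU.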
- get_bytes(name)
Get the bytes needed for a particular input/output.
- Parameters
- name : str
Configuration name.
- Returns
- bytes
Configuration as bytes.
- get_offset(name)
Get the offset needed for a particular input/output.
- Parameters
- name : str
Configuration input/output name.
- Returns
- int
Configuration offset.
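To make the relationship between a byte count and an offset more concrete, here is a minimal sketch of one way several inputs could be packed back to back into a single shared region. The packing rule, the pack_layout helper, and the tensor names are assumptions for illustration; the source does not describe how ShmInputWrapper actually assigns offsets.

```python
import typing


def pack_layout(sizes: typing.Dict[str, int]) -> typing.Dict[str, typing.Tuple[int, int]]:
    """Hypothetical helper: map each name to (offset, bytes), packed in insertion order."""
    layout = {}
    offset = 0
    for name, nbytes in sizes.items():
        layout[name] = (offset, nbytes)
        offset += nbytes  # the next tensor starts where this one ends
    return layout


layout = pack_layout({"input_ids": 4096, "attention_mask": 4096, "labels": 32})

# Total size the shared region would need under this packing.
total_bytes = sum(nbytes for _, nbytes in layout.values())

for name, (offset, nbytes) in layout.items():
    print(f"{name}: offset={offset} bytes={nbytes}")
```

Under this assumed layout, each name's offset and byte count together identify its slice of the shared region.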
- get_ptr(name)
Returns the cupy.cuda.MemoryPointer object to the internal ShmWrapper for the specified input/output name.
- Parameters
- name : str
Input/output name.
- Returns
- cupy.cuda.MemoryPointer
Returns the shared memory pointer for this input/output.