morpheus.stages.inference.triton_inference_stage
Classes
InputWrapper
    This class is a wrapper around a CUDA shared memory object shared between this process and a Triton server instance.

ResourcePool
    This class provides a bounded pool of resources.

ShmInputWrapper
    This class is a wrapper around a CUDA shared memory object shared between this process and a Triton server instance.

TritonInOut
    Data class for model input and output configuration.

TritonInferenceAE
    This class extends TritonInference to handle inference requests for AutoEncoder (AE) models.

TritonInferenceFIL
    This class extends TritonInference to handle scenario-specific inference requests for FIL models, such as building the response.

TritonInferenceNLP
    This class extends TritonInference to handle scenario-specific inference requests for NLP models, such as building the response.

TritonInferenceStage
    Perform inference with Triton Inference Server.
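
For orientation, the following is a minimal sketch of how TritonInferenceStage is typically wired into a Morpheus LinearPipeline. The model name, server URL, file paths, and feature length are placeholders chosen for illustration; the sketch assumes a Triton Inference Server is already running at the given address and serving the named model.

    from morpheus.config import Config, PipelineModes
    from morpheus.pipeline import LinearPipeline
    from morpheus.stages.input.file_source_stage import FileSourceStage
    from morpheus.stages.preprocess.deserialize_stage import DeserializeStage
    from morpheus.stages.preprocess.preprocess_nlp_stage import PreprocessNLPStage
    from morpheus.stages.inference.triton_inference_stage import TritonInferenceStage
    from morpheus.stages.postprocess.serialize_stage import SerializeStage
    from morpheus.stages.output.write_to_file_stage import WriteToFileStage

    # Pipeline-wide configuration; feature_length must match the model's
    # expected sequence length (256 is only an illustrative value).
    config = Config()
    config.mode = PipelineModes.NLP
    config.feature_length = 256

    pipeline = LinearPipeline(config)

    # Read raw messages, deserialize them, and tokenize them for the NLP model.
    # The input file and vocabulary hash file paths are placeholders.
    pipeline.set_source(FileSourceStage(config, filename="input.jsonlines"))
    pipeline.add_stage(DeserializeStage(config))
    pipeline.add_stage(PreprocessNLPStage(config, vocab_hash_file="bert-base-uncased-hash.txt"))

    # Send the pre-processed tensors to Triton and attach the model output to
    # the messages flowing through the pipeline. The model name and server URL
    # are placeholders for a model already loaded on the Triton server.
    pipeline.add_stage(
        TritonInferenceStage(
            config,
            model_name="sid-minibert-onnx",
            server_url="localhost:8001",
            force_convert_inputs=True,
        )
    )

    # Serialize the results and write them back out.
    pipeline.add_stage(SerializeStage(config))
    pipeline.add_stage(WriteToFileStage(config, filename="output.jsonlines"))

    pipeline.run()

Setting force_convert_inputs=True lets the stage convert input tensor types to match the model's declared inputs when the dtypes produced by the pre-processing stage differ from what the Triton model expects.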