InferenceClient APIs

class cim_application_sdk.clara_application.ClaraApplication(name)

This is the inference application class. It encapsulates interactions with the TensorRT Inference Server (TRTIS) and provides APIs to perform data processing and inference.

logger

The logger object.

context_manager

The object that holds the runtime settings.

work_handle

The object that gets work via a specific messaging mode.

finish(request_id, error_code)

Sets a request as completed with either a success or failure status.

Parameters
  • request_id – Id of the request being processed.

  • error_code – Enum CimMsgStatus.

Returns

Void

Raises
  • ValueError – If an argument is not valid.

  • Exception – An error occurred when reporting progress.
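A call to finish can be sketched as follows. The helper `complete` and its default status strings are illustrative, not part of the SDK; the strings mirror the CimMsgStatus values listed later on this page.

```python
# Hypothetical wrapper around finish(); "complete" is not an SDK name, and the
# status strings mirror the CimMsgStatus enum values documented on this page.
def complete(app, request_id, ok,
             success_code="COMPLETE_SUCCESS", failure_code="INFERENCE_FAILED"):
    """Mark a request finished with a success or failure status code."""
    code = success_code if ok else failure_code
    app.finish(request_id, code)
    return code
```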

get_config(model, version=None)

Gets the model config from the inference server.

Parameters
  • model – A string as the name of the model.

  • version – Version number of the model, None for the latest.

Returns

inference_server.api.model_config_pb2

Raises
  • ModelNotAvailableEx – If the model or its config is not available on the inference server.

  • InferenceServerException – If the inference server status cannot be retrieved.
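For example, the returned config can be used to read a model's batch limit. This is a sketch: `max_batch_size` is a standard field of the TRTIS ModelConfig protobuf, but the helper name and the model name "organ_seg" are placeholders.

```python
# Hypothetical sketch: read the batch limit from a model's config. The
# max_batch_size field comes from the TRTIS ModelConfig protobuf; the helper
# and the model name "organ_seg" are placeholders.
def model_batch_limit(app, model="organ_seg"):
    cfg = app.get_config(model)  # version=None -> latest version
    return cfg.max_batch_size
```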

get_work(process_request, timeout=0)

Gets a work request to perform inference.

Parameters
  • process_request – A callable that takes a single argument of type WorkRequest.

  • timeout – Integer number of seconds before timing out; 0 waits indefinitely.

Returns

Void

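A typical process_request callback loads the request input, runs inference, writes the result, and marks the request finished. In the sketch below, only the SDK method names and signatures come from this page; the JSON file format, the tensor names "input"/"output", and the helper name are placeholder assumptions.

```python
import json

# Hypothetical process_request callback for get_work(). Only the SDK method
# names and signatures come from this page; the JSON I/O format and the tensor
# names "input"/"output" are placeholder assumptions.
def make_process_request(app, model_name, success_code="COMPLETE_SUCCESS"):
    def process_request(req):
        # req is a WorkRequest: request_id, request_file_path, result_file_path.
        with open(req.request_file_path) as f:
            inp = {"input": json.load(f)}
        out = {"output": None}
        result = app.perform_inference(inp, out, model_name, batch_size=1)
        with open(req.result_file_path, "w") as f:
            json.dump(result, f)
        app.finish(req.request_id, success_code)
    return process_request

# In a real application, roughly:
#   app = ClaraApplication("my-app")
#   app.get_work(make_process_request(app, "organ_seg"), timeout=0)
```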
perform_inference(inp_dict, out_dict, model, version=None, batch_size=1)

Performs inference for the specified model and version.

Parameters
  • inp_dict – Input dictionary as required by the model.

  • out_dict – Output dictionary as required by the model.

  • model – A string as the name of the model.

  • version – Version number of the model. None for the latest.

  • batch_size – Integer value as the batch size. Default is 1.

Returns

Output protobuf containing inference results

Raises
  • ValueError – If an argument is not valid.

  • ModelNameNotProvidedEx – If model name is not provided nor found in environment variables.

  • CimException – If unable to create handle to the inference server.

  • Exception – An error occurred performing inference on the server.
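When the input is larger than one batch, perform_inference can be called once per chunk. This is a sketch under assumptions: the tensor names "input"/"output", the model name, and the helper itself are invented; only the perform_inference signature comes from this page.

```python
# Hypothetical wrapper: run perform_inference over a list of samples one batch
# at a time. The tensor names and the default model name are placeholders.
def infer_batches(app, samples, model="organ_seg", batch_size=2):
    results = []
    for i in range(0, len(samples), batch_size):
        batch = samples[i:i + batch_size]
        inp = {"input": batch}
        out = {"output": None}
        results.append(
            app.perform_inference(inp, out, model, batch_size=len(batch)))
    return results
```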

report_progress(request_id, request_status, percent, error_code)

Reports request processing progress as a percentage.

Parameters
  • request_id – Id of the request being processed.

  • request_status – Enum RequestStatus.

  • percent – Int, between 0 and 100 for percentage of completion.

  • error_code – Enum CimMsgStatus; typically EXECUTING while the request is IN_PROGRESS.

Returns

Void

Raises
  • ValueError – If an argument is not valid.

  • Exception – An error occurred when reporting progress.
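Progress is usually reported at intervals while a request is being processed. The helper below is illustrative (the name, the `steps` callables, and the even spacing are assumptions); only the report_progress signature comes from this page.

```python
# Hypothetical helper: report evenly spaced progress while running a sequence
# of processing steps. The helper name and "steps" are illustrative.
def run_with_progress(app, request_id, request_status, error_code, steps):
    reported = []
    total = len(steps)
    for i, step in enumerate(steps, start=1):
        step()
        percent = int(100 * i / total)  # 0..100, as report_progress expects
        app.report_progress(request_id, request_status, percent, error_code)
        reported.append(percent)
    return reported
```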

class cim_application_sdk.work_request.WorkRequest(req_msg, input_file_base_path, output_file_base_path)

Class encapsulating work request details.

request_id

Id of the request.

request_file_path

Path to request input file(s).

result_file_path

Path to save output results.

request_priority

Priority of request.

study_type

Study type object.

class cim_application_sdk.processing_interface.PreProcess

Class that performs pre-processing of the input data.

logger

Logging object.

__call__(input_path, study_type=None)

Function to process/transform the input data.

Parameters
  • input_path – A string as the path to the input data.

  • study_type – The studyType object, if known.

Returns

Transformed data that will be used in inference.

__init__()

Initializes the class with a logger object.
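A pre-processing transform can be sketched as below. The class mirrors the PreProcess call signature but is illustrative: a real application would subclass cim_application_sdk.processing_interface.PreProcess, and the JSON input format and min-max scaling are assumptions.

```python
import json

# Illustrative transform mirroring the PreProcess interface; a real
# application would subclass cim_application_sdk.processing_interface.PreProcess.
# The JSON input format and min-max scaling are assumptions.
class ScaleToUnitRange:
    def __call__(self, input_path, study_type=None):
        with open(input_path) as f:
            values = json.load(f)
        lo, hi = min(values), max(values)
        if hi == lo:
            return [0.0 for _ in values]
        return [(v - lo) / (hi - lo) for v in values]
```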

class cim_application_sdk.processing_interface.PostProcess

Class that performs post-processing of the output data.

logger

Logging object.

__call__(result_data, output_path, study_type=None)

Function to transform and save the result data from inference.

Parameters
  • result_data – Data resulting from inference.

  • output_path – A string as the path to save the transformed data.

  • study_type – The studyType object, if known.

Returns

Void

__init__()

Initializes the class with a logger object.
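A post-processing writer can be sketched the same way. The class mirrors the PostProcess call signature but is illustrative: a real application would subclass cim_application_sdk.processing_interface.PostProcess, and saving the result as JSON is an assumption.

```python
import json

# Illustrative writer mirroring the PostProcess interface; a real application
# would subclass cim_application_sdk.processing_interface.PostProcess.
# Serializing the inference result as JSON is an assumption.
class SaveAsJson:
    def __call__(self, result_data, output_path, study_type=None):
        with open(output_path, "w") as f:
            json.dump(list(result_data), f)
```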

class cim_application_sdk.cim_response_status.CimMsgStatus

Status of inference request processing.

ATTIS_NOT_REACHABLE = 'ATTIS_NOT_REACHABLE'
COMPLETE_SUCCESS = 'COMPLETE_SUCCESS'
EXECUTING = 'EXECUTING'
INFERENCE_FAILED = 'INFERENCE_FAILED'
INPUT_FILES_MISSING = 'INPUT_FILES_MISSING'
INSUFFICIENT_STORAGE = 'INSUFFICIENT_STORAGE'
INTERNAL_ERROR = 'INTERNAL_ERROR'
INVALID_APP_REFERENCE = 'INVALID_APP_REFERENCE'
INVALID_MESSAGE = 'INVALID_MESSAGE'
INVALID_MODEL_REFERENCE = 'INVALID_MODEL_REFERENCE'
INVALID_REQUEST = 'INVALID_REQUEST'
INVALID_STUDY_TYPE = 'INVALID_STUDY_TYPE'
POST_PROCESSING_ERROR = 'POSTPROCESSING_ERROR'
PRE_PROCESSING_ERROR = 'PREPROCESSING_ERROR'
RECEIVED = 'RECEIVED'
SERVICE_UNAVAILABLE = 'SERVICE_UNAVAILABLE'
TIMEOUT = 'TIMEOUT'
UNKNOWN_ERROR_RETRYING = 'UNKNOWN_ERROR_RETRYING'
UNPROCESSABLE_ENTITY = 'UNPROCESSABLE_ENTITY'
class cim_application_sdk.cim_response_status.RequestStatus

Processing status of an inference request: IN_PROGRESS | COMPLETED.

COMPLETED = 'COMPLETED'
IN_PROGRESS = 'IN_PROGRESS'
exception cim_application_sdk.exceptions.CimException

An ambiguous exception occurred in the Clara Inference SDK.

exception cim_application_sdk.exceptions.InferenceServerNotProvidedEx

The TensorRT Inference Server URL is not provided.

exception cim_application_sdk.exceptions.InvalidRequestMessageEx

The request message is not valid and cannot be parsed by the Clara Inference SDK.

exception cim_application_sdk.exceptions.InvalidStartupArgsEx

Invalid startup arguments were passed to the Clara Inference SDK.

exception cim_application_sdk.exceptions.ModelNameNotProvidedEx

The model name is not provided.

exception cim_application_sdk.exceptions.ModelNotAvailableEx

The model or model status is not available on the inference server.

© Copyright 2018-2019, NVIDIA Corporation. All rights reserved. Last updated on Feb 1, 2023.