Inference Resource#
This resource corresponds to the model inference endpoints provided by the NIM Proxy. Use the models sub-resource to list the models available for inference, and use the completions or chat.completions resources to run inference. A short sketch of both flows follows.
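The sketch below assumes a client constructed with NeMoMicroservices(base_url=...) and that chat completions are reached at client.inference.chat.completions with OpenAI-compatible arguments; the base URL and model name are placeholders, not values from this page.

```python
from nemo_microservices import NeMoMicroservices

# Hypothetical base URL; point this at your NeMo platform / NIM Proxy.
client = NeMoMicroservices(base_url="http://nemo.test")

# List the models available for inference.
models = client.inference.models.list()
print(models)

# Run chat inference. The model name is a placeholder; the message
# payload follows the OpenAI-compatible chat schema.
completion = client.inference.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion)
```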
Sync Inference Resource#
- class nemo_microservices.lib.custom_resources.inference.InferenceResource(client: NeMoMicroservices)
Bases: SyncAPIResource
- property models: ModelsResource
- property with_raw_response: InferenceResourceWithRawResponse
This property can be used as a prefix for any HTTP method call to return the raw response object instead of the parsed content.
For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#accessing-raw-response-data-e-g-headers
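For example, assuming the raw-response wrapper exposes the same sub-resources (the usual pattern in this SDK), you can prefix a models call to inspect HTTP headers and then parse the body on demand:

```python
# Sketch only: returns the raw HTTP response instead of the parsed content.
response = client.inference.with_raw_response.models.list()
print(response.headers.get("content-type"))

models = response.parse()  # parse into the regular return type when needed
print(models)
```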
- property with_streaming_response: InferenceResourceWithStreamingResponse
An alternative to .with_raw_response that doesn’t eagerly read the response body.
For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#with_streaming_response
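A sketch of the streaming variant, which should be used as a context manager so the connection is released when you are done; it assumes the streamed response exposes iter_lines(), as in other SDKs generated with this tooling:

```python
with client.inference.with_streaming_response.models.list() as response:
    print(response.headers)             # headers are available immediately
    for line in response.iter_lines():  # body is read lazily, line by line
        print(line)
```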
- create_from_dict(data: dict[str, object]) → object
Async Inference Resource#
- class nemo_microservices.lib.custom_resources.inference.AsyncInferenceResource(client: AsyncNeMoMicroservices)
Bases: AsyncAPIResource
- property models: AsyncModelsResource
- property with_raw_response: AsyncInferenceResourceWithRawResponse
This property can be used as a prefix for any HTTP method call to return the raw response object instead of the parsed content.
For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#accessing-raw-response-data-e-g-headers
- property with_streaming_response: AsyncInferenceResourceWithStreamingResponse
An alternative to .with_raw_response that doesn’t eagerly read the response body.
For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#with_streaming_response
- create_from_dict(data: dict[str, object]) → object
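The async resource mirrors the sync API with awaitable calls. A minimal sketch, reusing the placeholder base URL and model name from the sync example above:

```python
import asyncio

from nemo_microservices import AsyncNeMoMicroservices

async def main() -> None:
    # Hypothetical base URL, as in the sync example.
    client = AsyncNeMoMicroservices(base_url="http://nemo.test")

    models = await client.inference.models.list()
    print(models)

    completion = await client.inference.chat.completions.create(
        model="meta/llama-3.1-8b-instruct",  # placeholder model name
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(completion)

asyncio.run(main())
```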