nemo_microservices.resources.v2.inference.deployments.models

Module Contents

Classes

API

class nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResource(
client: nemo_microservices._client.AsyncNeMoMicroservices,
)

Bases: nemo_microservices._resource.AsyncAPIResource

Initialization

async list(
deployment_name: str,
*,
namespace: str,
extra_headers: nemo_microservices._types.Headers | None = None,
extra_query: nemo_microservices._types.Query | None = None,
extra_body: nemo_microservices._types.Body | None = None,
timeout: float | httpx.Timeout | None | nemo_microservices._types.NotGiven = not_given,
) -> nemo_microservices.types.v2.inference.deployments.model_list_response.ModelListResponse

Get the latest ModelDeployment’s model entities from Entity Store.

This endpoint keeps the API contract that NIMs currently expect from Entity Store when pulling LoRAs, while allowing AuthZ boundaries to be enforced.

TODO: Implement model entity retrieval based on deployment config.

Args:

extra_headers: Send extra headers

extra_query: Add additional query parameters to the request

extra_body: Add additional JSON properties to the request

timeout: Override the client-level default timeout for this request, in seconds
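As an illustration of the asynchronous signature above, the following sketch lists model entities for several deployments concurrently. The helper name `list_models_for_deployments` is hypothetical, and `models` is assumed to be an already constructed `AsyncModelsResource`; how that resource is reached from the client is not shown in this reference.

```python
import asyncio
from typing import Any, Dict, Sequence


async def list_models_for_deployments(
    models: Any,
    deployment_names: Sequence[str],
    namespace: str,
    timeout_s: float = 30.0,
) -> Dict[str, Any]:
    """Fetch model entities for several deployments concurrently.

    `models` is assumed to be an AsyncModelsResource instance (hypothetical wiring).
    """
    # deployment_name is positional; namespace and timeout are keyword-only.
    calls = [
        models.list(name, namespace=namespace, timeout=timeout_s)
        for name in deployment_names
    ]
    # gather preserves the order of the awaitables it is given.
    results = await asyncio.gather(*calls)
    return dict(zip(deployment_names, results))
```

Each value in the returned dictionary is whatever `list` returns for that deployment, which per the signature above is a `ModelListResponse`.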

property with_raw_response: nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResourceWithRawResponse

This property can be used as a prefix for any HTTP method call to return the raw response object instead of the parsed content.

For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#accessing-raw-response-data-e-g-headers

property with_streaming_response: nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResourceWithStreamingResponse

An alternative to .with_raw_response that doesn’t eagerly read the response body.

For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#with_streaming_response

class nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResourceWithRawResponse(
models: nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResource,
)

Initialization

class nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResourceWithStreamingResponse(
models: nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResource,
)

Initialization

class nemo_microservices.resources.v2.inference.deployments.models.ModelsResource(client: nemo_microservices._client.NeMoMicroservices)

Bases: nemo_microservices._resource.SyncAPIResource

Initialization

list(
deployment_name: str,
*,
namespace: str,
extra_headers: nemo_microservices._types.Headers | None = None,
extra_query: nemo_microservices._types.Query | None = None,
extra_body: nemo_microservices._types.Body | None = None,
timeout: float | httpx.Timeout | None | nemo_microservices._types.NotGiven = not_given,
) -> nemo_microservices.types.v2.inference.deployments.model_list_response.ModelListResponse

Get the latest ModelDeployment’s model entities from Entity Store.

This endpoint keeps the API contract that NIMs currently expect from Entity Store when pulling LoRAs, while allowing AuthZ boundaries to be enforced.

TODO: Implement model entity retrieval based on deployment config.

Args:

extra_headers: Send extra headers

extra_query: Add additional query parameters to the request

extra_body: Add additional JSON properties to the request

timeout: Override the client-level default timeout for this request, in seconds
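A minimal sketch of a synchronous call that uses the per-request options documented above. The helper name `list_deployment_models` and the `X-Request-Id` header are hypothetical and chosen only for illustration; `models` is assumed to be an already constructed `ModelsResource`.

```python
from typing import Any, Optional


def list_deployment_models(
    models: Any,
    deployment_name: str,
    namespace: str,
    *,
    request_id: Optional[str] = None,
    timeout_s: float = 10.0,
) -> Any:
    """List model entities for one deployment with a per-request timeout.

    `models` is assumed to be a ModelsResource instance (hypothetical wiring).
    """
    # Hypothetical header, shown only to demonstrate extra_headers.
    headers = {"X-Request-Id": request_id} if request_id else None
    return models.list(
        deployment_name,
        namespace=namespace,
        extra_headers=headers,
        timeout=timeout_s,  # overrides the client-level default, in seconds
    )
```

Passing `extra_headers=None` matches the documented default, so the header is only sent when a request ID is supplied.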

property with_raw_response: nemo_microservices.resources.v2.inference.deployments.models.ModelsResourceWithRawResponse

This property can be used as a prefix for any HTTP method call to return the raw response object instead of the parsed content.

For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#accessing-raw-response-data-e-g-headers
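The prefix usage described above might look like the sketch below. It assumes the raw response object exposes a `headers` mapping and a `parse()` method that returns the parsed content; neither is stated in this reference, so check the linked SDK page before relying on them. The helper name `list_with_headers` is hypothetical.

```python
from typing import Any, Dict, Tuple


def list_with_headers(
    models: Any, deployment_name: str, namespace: str
) -> Tuple[Dict[str, str], Any]:
    """Return (response headers, parsed content) for a list call.

    `models` is assumed to be a ModelsResource instance.
    """
    # Prefixing the call with with_raw_response yields the raw response object.
    response = models.with_raw_response.list(deployment_name, namespace=namespace)
    # Assumption: the raw response has .headers and .parse().
    return dict(response.headers), response.parse()
```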

property with_streaming_response: nemo_microservices.resources.v2.inference.deployments.models.ModelsResourceWithStreamingResponse

An alternative to .with_raw_response that doesn’t eagerly read the response body.

For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#with_streaming_response
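A sketch of reading the body lazily through the streaming variant. It assumes the call is used as a context manager and that the response object offers an `iter_lines()` iterator; both are assumptions based on common SDK conventions rather than anything stated in this reference. The helper name `first_lines` is hypothetical.

```python
from itertools import islice
from typing import Any, List


def first_lines(
    models: Any, deployment_name: str, namespace: str, max_lines: int = 5
) -> List[str]:
    """Read at most `max_lines` lines of the response body without loading all of it.

    `models` is assumed to be a ModelsResource instance.
    """
    # Assumption: the streaming call is a context manager that closes the
    # connection on exit, and the response provides iter_lines().
    with models.with_streaming_response.list(
        deployment_name, namespace=namespace
    ) as response:
        return list(islice(response.iter_lines(), max_lines))
```

Because the body is not read eagerly, the lines must be consumed inside the `with` block, before the response is closed.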

class nemo_microservices.resources.v2.inference.deployments.models.ModelsResourceWithRawResponse(
models: nemo_microservices.resources.v2.inference.deployments.models.ModelsResource,
)

Initialization

class nemo_microservices.resources.v2.inference.deployments.models.ModelsResourceWithStreamingResponse(
models: nemo_microservices.resources.v2.inference.deployments.models.ModelsResource,
)

Initialization