nemo_microservices.resources.v2.inference.deployments.models
Module Contents
Classes
API
- class nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResource(client: nemo_microservices._client.AsyncNeMoMicroservices)
Bases: nemo_microservices._resource.AsyncAPIResource
Initialization
- async list(
- deployment_name: str,
- *,
- namespace: str,
- extra_headers: nemo_microservices._types.Headers | None = None,
- extra_query: nemo_microservices._types.Query | None = None,
- extra_body: nemo_microservices._types.Body | None = None,
- timeout: float | httpx.Timeout | None | nemo_microservices._types.NotGiven = not_given,
- )
Get the latest ModelDeployment's model entities from Entity Store.
This provides the API contract that NIMs currently expect from Entity Store for pulling LoRAs, while allowing AuthZ boundaries to be enforced.
TODO: Implement model entity retrieval based on deployment config.
Args:
extra_headers: Send extra headers
extra_query: Add additional query parameters to the request
extra_body: Add additional JSON properties to the request
timeout: Override the client-level default timeout for this request, in seconds
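As a rough illustration of the call shape documented above, the following sketch uses a hypothetical stand-in class, `StubAsyncModelsResource`, instead of the real SDK client, so it runs without a server. Only the parameter names come from the signature. The returned dictionary, the helper `list_for_namespaces`, and the names `my-deployment`, `default`, and `team-a` are assumptions made for illustration.

```python
import asyncio


class StubAsyncModelsResource:
    """Hypothetical stand-in that mirrors only the documented list() signature."""

    async def list(self, deployment_name, *, namespace, extra_headers=None,
                   extra_query=None, extra_body=None, timeout=None):
        # A real resource would issue an HTTP request here. The stub yields
        # control once and echoes its arguments so the call can be checked offline.
        await asyncio.sleep(0)
        return {
            "deployment_name": deployment_name,
            "namespace": namespace,
            "timeout": timeout,
        }


async def list_for_namespaces(models, deployment_name, namespaces):
    # One list() call per namespace, awaited together. gather() returns
    # results in the same order as the awaitables it was given.
    calls = [
        models.list(deployment_name, namespace=ns, timeout=10.0)
        for ns in namespaces
    ]
    return await asyncio.gather(*calls)


results = asyncio.run(
    list_for_namespaces(StubAsyncModelsResource(), "my-deployment", ["default", "team-a"])
)
for item in results:
    print(item["namespace"], item["deployment_name"], item["timeout"])
```

With the real SDK, the same `await models.list(deployment_name, namespace=...)` pattern would presumably apply to an `AsyncModelsResource` obtained from an `AsyncNeMoMicroservices` client; the attribute path to that resource is not stated on this page.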
- property with_raw_response: nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResourceWithRawResponse
This property can be used as a prefix for any HTTP method call to return the raw response object instead of the parsed content.
For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#accessing-raw-response-data-e-g-headers
- property with_streaming_response: nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResourceWithStreamingResponse
An alternative to .with_raw_response that doesn’t eagerly read the response body.
For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#with_streaming_response
- class nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResourceWithRawResponse( )
Initialization
- class nemo_microservices.resources.v2.inference.deployments.models.AsyncModelsResourceWithStreamingResponse( )
Initialization
- class nemo_microservices.resources.v2.inference.deployments.models.ModelsResource(client: nemo_microservices._client.NeMoMicroservices)
Bases: nemo_microservices._resource.SyncAPIResource
Initialization
- list(
- deployment_name: str,
- *,
- namespace: str,
- extra_headers: nemo_microservices._types.Headers | None = None,
- extra_query: nemo_microservices._types.Query | None = None,
- extra_body: nemo_microservices._types.Body | None = None,
- timeout: float | httpx.Timeout | None | nemo_microservices._types.NotGiven = not_given,
- )
Get the latest ModelDeployment's model entities from Entity Store.
This provides the API contract that NIMs currently expect from Entity Store for pulling LoRAs, while allowing AuthZ boundaries to be enforced.
TODO: Implement model entity retrieval based on deployment config.
Args:
extra_headers: Send extra headers
extra_query: Add additional query parameters to the request
extra_body: Add additional JSON properties to the request
timeout: Override the client-level default timeout for this request, in seconds
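The signature above makes `namespace` keyword-only and defaults `timeout` to a `not_given` sentinel, not to `None`. The sketch below shows one plausible reading of that design, using hypothetical stand-ins: the `NotGiven` class here is a simplified imitation, not the SDK's `nemo_microservices._types.NotGiven`, and `CLIENT_DEFAULT_TIMEOUT`, `StubModelsResource`, and the returned dictionary are assumptions for illustration. It also assumes that an explicit `None` is passed through as-is (in httpx, a `None` timeout disables the timeout).

```python
class NotGiven:
    """Hypothetical sentinel: marks an omitted argument, distinct from an explicit None."""

    def __repr__(self):
        return "NOT_GIVEN"


not_given = NotGiven()
CLIENT_DEFAULT_TIMEOUT = 60.0  # assumed client-level default, for illustration only


class StubModelsResource:
    """Hypothetical stand-in that mirrors part of the documented list() signature."""

    def list(self, deployment_name, *, namespace, timeout=not_given):
        # Fall back to the client-level default only when the caller passed
        # nothing at all; an explicit value (including None) is kept as-is.
        if isinstance(timeout, NotGiven):
            effective = CLIENT_DEFAULT_TIMEOUT
        else:
            effective = timeout
        return {
            "deployment_name": deployment_name,
            "namespace": namespace,
            "timeout": effective,
        }


models = StubModelsResource()
default_call = models.list("my-deployment", namespace="default")
override_call = models.list("my-deployment", namespace="default", timeout=5.0)
disabled_call = models.list("my-deployment", namespace="default", timeout=None)

# Because of the bare `*` in the signature, namespace cannot be passed positionally.
try:
    models.list("my-deployment", "default")
except TypeError:
    positional_namespace_rejected = True
else:
    positional_namespace_rejected = False

print(default_call["timeout"], override_call["timeout"], disabled_call["timeout"])
print(positional_namespace_rejected)
```

A sentinel default lets a method tell "use the client default" apart from "the caller explicitly asked for `None`", which a plain `timeout=None` default could not express.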
- property with_raw_response: nemo_microservices.resources.v2.inference.deployments.models.ModelsResourceWithRawResponse
This property can be used as a prefix for any HTTP method call to return the raw response object instead of the parsed content.
For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#accessing-raw-response-data-e-g-headers
- property with_streaming_response: nemo_microservices.resources.v2.inference.deployments.models.ModelsResourceWithStreamingResponse
An alternative to .with_raw_response that doesn’t eagerly read the response body.
For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#with_streaming_response
- class nemo_microservices.resources.v2.inference.deployments.models.ModelsResourceWithRawResponse( )
Initialization
- class nemo_microservices.resources.v2.inference.deployments.models.ModelsResourceWithStreamingResponse( )
Initialization