Client APIs

The following reference provides detailed documentation for the synchronous and asynchronous clients of the NeMo Microservices Python SDK.

Synchronous Client

class nemo_microservices.NeMoMicroservices

Constructs a new synchronous NeMoMicroservices client instance. The following code snippet shows how to create a client instance.

```
from nemo_microservices import NeMoMicroservices

# Create a synchronous client that targets the platform and inference endpoints
client = NeMoMicroservices(
    base_url="http://nemo.test",
    inference_base_url="http://nim.test",
)
```

Parameters:

base_url (Optional[str]) – Sets the base URL of the NeMo microservices API endpoints. Your cluster administrator must configure this URL by following the instructions in the ingress setup guide. If you omit this parameter, the client reads the NEMO_MICROSERVICES_BASE_URL environment variable; if that variable is not set, the client falls back to http://nemo.test/.
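The lookup order described for base_url can be sketched as follows. This is a minimal illustration that uses only the standard library, not the SDK's internal code; `resolve_base_url` is a hypothetical helper name introduced for this example, and the assumption that an explicit argument takes precedence over the environment variable is inferred from the description.

```python
import os
from typing import Optional

# Fallback value named in the parameter description.
DEFAULT_BASE_URL = "http://nemo.test/"


def resolve_base_url(base_url: Optional[str] = None) -> str:
    """Hypothetical helper that mirrors the documented lookup order."""
    if base_url is not None:
        # An explicitly passed value is used as-is (assumed precedence).
        return base_url
    # Otherwise read the environment variable, then fall back to the default.
    return os.environ.get("NEMO_MICROSERVICES_BASE_URL", DEFAULT_BASE_URL)
```

In practice this means you can export NEMO_MICROSERVICES_BASE_URL once per environment and construct the client without arguments.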
inference_base_url (Optional[str]) –
Sets the base URL of a microservice for inference. You can specify one of the following API endpoints:
- The NeMo NIM Proxy microservice endpoint. This is the recommended endpoint because this microservice serves as a proxy for multiple NIM microservices.
- Individual NIM microservice endpoints you deployed to your Kubernetes cluster. If you want to use only one specific NIM microservice, use this option.
- The endpoints from build.nvidia.com.
timeout (Optional[float | Timeout]) –
Sets the HTTP request timeout for all API calls made by the client. The timeout is passed to the parent class (SyncAPIClient/AsyncAPIClient) during client construction. Individual API methods also accept a timeout parameter that overrides the client-level timeout for that request only.

Accepted values:
- A float (seconds)
- A Timeout object (imported from httpx)
- None (no timeout)
- NotGiven (use default)
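The difference between None and NotGiven is easy to miss, so here is a rough sketch of how the accepted values could be told apart. It is not the SDK's implementation: `_NotGiven`, `NOT_GIVEN`, `effective_timeout`, and the 60-second default are all stand-ins invented for this example, and the httpx Timeout case is left out to keep the code free of third-party imports.

```python
from typing import Optional, Union


class _NotGiven:
    """Stand-in sentinel meaning "the argument was not supplied"."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()

# Placeholder value for illustration; the SDK's real default is not stated here.
ASSUMED_DEFAULT_SECONDS = 60.0


def effective_timeout(
    timeout: Union[float, None, _NotGiven] = NOT_GIVEN,
) -> Optional[float]:
    """Return the timeout in seconds, or None when timeouts are disabled."""
    if isinstance(timeout, _NotGiven):
        # NotGiven: nothing was passed, so use the default.
        return ASSUMED_DEFAULT_SECONDS
    if timeout is None:
        # None: the caller explicitly asked for no timeout.
        return None
    # A number: interpreted as seconds.
    return float(timeout)
```

The sentinel exists because None already carries a meaning (no timeout), so it cannot also signal "use the default".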
max_retries (Optional[int]) –
Sets the maximum number of automatic retries for failed HTTP requests. When an HTTP request fails with certain status codes, the client automatically retries the request up to the specified number of times.

Usage Examples:
```
# Custom retry count
client = NeMoMicroservices(max_retries=5)  # 5 retries

# No retries
client = NeMoMicroservices(max_retries=0)

# Override for specific requests
client.with_options(max_retries=3).chat.completions.create(...)
```
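To make the meaning of max_retries concrete, the following sketch shows a retry loop in which max_retries=N allows up to N + 1 attempts in total. It is an illustration under stated assumptions rather than the client's actual logic: `send_with_retries` and `RETRYABLE_STATUS` are hypothetical names, the set of retryable status codes is assumed, and the delay that real clients usually insert between attempts is omitted.

```python
from typing import Callable, Tuple

# Assumed set of retryable status codes, for illustration only.
RETRYABLE_STATUS = {408, 429, 500, 502, 503, 504}


def send_with_retries(send: Callable[[], int], max_retries: int) -> Tuple[int, int]:
    """Call `send` until it returns a non-retryable status or retries run out.

    Returns a (final_status, attempts_made) pair.
    """
    attempts = 0
    while True:
        status = send()
        attempts += 1
        # Stop on a non-retryable status, or once the initial attempt
        # plus max_retries retries have all been used.
        if status not in RETRYABLE_STATUS or attempts > max_retries:
            return status, attempts
```

With max_retries=0 the request is sent exactly once, which matches the "No retries" example above.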

property chat: ChatResource

property completions: CompletionsResource

property models: ModelsResource

property customization: CustomizationResource

property evaluation: EvaluationResource

property datasets: DatasetsResource

property embeddings: EmbeddingsResource

property namespaces: NamespacesResource

property projects: ProjectsResource

property deployment: DeploymentResource

property guardrail: GuardrailResource

property inference: InferenceResource

Asynchronous Client

class nemo_microservices.AsyncNeMoMicroservices

Constructs a new asynchronous NeMoMicroservices client instance. The following code snippet shows how to create a client instance.

```
import asyncio
from nemo_microservices import AsyncNeMoMicroservices

# Create an asynchronous client that targets the platform and inference endpoints
client = AsyncNeMoMicroservices(
    base_url="http://nemo.test",
    inference_base_url="http://nim.test",
)

# Sample API call: client methods are coroutines and must be awaited
async def main() -> None:
    page = await client.namespaces.list()
    print(page.data)

asyncio.run(main())
```
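A common reason to choose the asynchronous client is to issue several requests concurrently. The sketch below shows the pattern with asyncio.gather. To stay runnable without a deployed cluster, it uses `fake_list`, a hypothetical stand-in coroutine, where a real program would await an SDK call such as client.namespaces.list().

```python
import asyncio
from typing import List


async def fake_list(resource: str) -> str:
    """Stand-in for an awaitable SDK call; not part of the SDK."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"{resource}: ok"


async def fetch_all() -> List[str]:
    # gather() runs the coroutines concurrently and returns
    # their results in the order the coroutines were passed in.
    return await asyncio.gather(fake_list("namespaces"), fake_list("projects"))


results = asyncio.run(fetch_all())
print(results)
```

Because both coroutines wait at the same time, the total latency is close to that of the slowest call rather than the sum of both.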

Parameters:

base_url (Optional[str]) – Sets the base URL of the NeMo microservices API endpoints. Your cluster administrator must configure this URL by following the instructions in the ingress setup guide. If you omit this parameter, the client reads the NEMO_MICROSERVICES_BASE_URL environment variable; if that variable is not set, the client falls back to http://nemo.test/.
inference_base_url (Optional[str]) –
Sets the base URL of a microservice for inference. You can specify one of the following API endpoints:
- The NeMo NIM Proxy microservice endpoint. This is the recommended endpoint because this microservice serves as a proxy for multiple NIM microservices.
- Individual NIM microservice endpoints you deployed to your Kubernetes cluster. If you want to use only one specific NIM microservice, use this option.
- The endpoints from build.nvidia.com.
timeout (Optional[float | Timeout]) –
Sets the HTTP request timeout for all API calls made by the client. The timeout is passed to the parent class (SyncAPIClient/AsyncAPIClient) during client construction. Individual API methods also accept a timeout parameter that overrides the client-level timeout for that request only.

Accepted values:
- A float (seconds)
- A Timeout object (imported from httpx)
- None (no timeout)
- NotGiven (use default)

max_retries (Optional[int]) –

Sets the maximum number of automatic retries for failed HTTP requests. When an HTTP request fails with certain status codes, the client automatically retries the request up to the specified number of times.

Usage Examples:
```
# Custom retry count
client = AsyncNeMoMicroservices(max_retries=5)  # 5 retries

# No retries
client = AsyncNeMoMicroservices(max_retries=0)

# Override for specific requests (call from inside an async function)
await client.with_options(max_retries=3).chat.completions.create(...)
```