Inference Embeddings Resource

This resource corresponds to the v1/embeddings endpoint of the NIM Proxy microservice, which generates embeddings from text input.

Sync Embeddings Resource

class nemo_microservices.resources.EmbeddingsResource(client: NeMoMicroservices)

Bases: SyncAPIResource

property with_raw_response: EmbeddingsResourceWithRawResponse

This property can be used as a prefix for any HTTP method call to return the raw response object instead of the parsed content.

For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#accessing-raw-response-data-e-g-headers
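As a minimal sketch of the prefix usage described above, the helper below calls create through with_raw_response and returns both the headers and the parsed result. The function name create_with_headers is hypothetical, and the headers attribute and parse() method on the raw response are assumptions based on the convention documented at the linked page; check them against your SDK version.

```python
from typing import Any, Dict, Tuple


def create_with_headers(
    embeddings: Any, text: str, model: str
) -> Tuple[Dict[str, str], Any]:
    """Return (response headers, parsed response) for one embeddings request.

    `embeddings` is expected to be an EmbeddingsResource or a compatible object.
    """
    # Prefixing the call with `with_raw_response` yields the raw response
    # object instead of the parsed content.
    raw = embeddings.with_raw_response.create(input=text, model=model)

    # Assumption: the raw response exposes `headers` (a mapping) and `parse()`.
    headers = dict(raw.headers)
    parsed = raw.parse()
    return headers, parsed
```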

property with_streaming_response: EmbeddingsResourceWithStreamingResponse

An alternative to .with_raw_response that doesn’t eagerly read the response body.

For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#with_streaming_response
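The sketch below illustrates the non-eager behavior: the body is consumed chunk by chunk inside a with block rather than being read up front. The function name stream_body_size is hypothetical, and both the context-manager usage and the iter_bytes() method are assumptions drawn from the convention described at the linked page, not something this reference confirms.

```python
from typing import Any


def stream_body_size(embeddings: Any, text: str, model: str) -> int:
    """Count the bytes of the response body without loading it all at once.

    `embeddings` is expected to be an EmbeddingsResource or a compatible object.
    """
    total = 0
    # Assumption: the streaming variant is used as a context manager so the
    # underlying connection is released when the block exits.
    with embeddings.with_streaming_response.create(input=text, model=model) as response:
        # Assumption: `iter_bytes()` yields the body in chunks.
        for chunk in response.iter_bytes():
            total += len(chunk)
    return total
```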

create(
*,
input: str | List[str],
model: str,
dimensions: int | NotGiven = NOT_GIVEN,
encoding_format: str | NotGiven = NOT_GIVEN,
input_type: str | NotGiven = NOT_GIVEN,
truncate: str | NotGiven = NOT_GIVEN,
user: str | NotGiven = NOT_GIVEN,
extra_headers: Headers | None = None,
extra_query: Query | None = None,
extra_body: Body | None = None,
timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> CreateEmbeddingResponse

Creates embeddings for the provided input.

Parameters:
  • input – Input text to embed, given as a single string or a list of strings.

  • model – The model to use. Must be one of the available models.

  • dimensions – The dimensionality of the embedding vector.

  • encoding_format – The encoding format of the input.

  • input_type – The type of the input.

  • truncate – Truncate the input text.

  • user – Not supported. A unique identifier representing your end user.

  • extra_headers – Send extra headers

  • extra_query – Add additional query parameters to the request

  • extra_body – Add additional JSON properties to the request

  • timeout – Override the client-level default timeout for this request, in seconds
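To make the parameter list concrete, here is a minimal sketch of a helper that embeds a batch of strings and overrides the timeout for that one request. The helper name embed_texts and the timeout_s argument are hypothetical. The response layout (a data list whose items carry an embedding vector) is an assumption, since CreateEmbeddingResponse is not described in this section.

```python
from typing import Any, List, Sequence


def embed_texts(
    embeddings: Any,
    texts: Sequence[str],
    model: str,
    timeout_s: float = 30.0,
) -> List[List[float]]:
    """Embed a batch of strings and return one vector per input.

    `embeddings` is expected to be an EmbeddingsResource or a compatible object.
    """
    response = embeddings.create(
        input=list(texts),  # the signature also accepts a single str
        model=model,
        timeout=timeout_s,  # per-request override of the client default, in seconds
    )
    # Assumption: the response exposes `data`, a list of items that each carry
    # an `embedding` vector. Verify against CreateEmbeddingResponse.
    return [list(item.embedding) for item in response.data]
```

Passing the resource object in, rather than a whole client, keeps the helper independent of how the client exposes the resource; that attribute path is not stated in this section.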

create_from_dict(data: dict[str, object]) -> object

Async Embeddings Resource

class nemo_microservices.resources.AsyncEmbeddingsResource(client: AsyncNeMoMicroservices)

Bases: AsyncAPIResource

property with_raw_response: AsyncEmbeddingsResourceWithRawResponse

This property can be used as a prefix for any HTTP method call to return the raw response object instead of the parsed content.

For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#accessing-raw-response-data-e-g-headers

property with_streaming_response: AsyncEmbeddingsResourceWithStreamingResponse

An alternative to .with_raw_response that doesn’t eagerly read the response body.

For more information, see https://docs.nvidia.com/nemo/microservices/latest/pysdk/index.html#with_streaming_response

async create(
*,
input: str | List[str],
model: str,
dimensions: int | NotGiven = NOT_GIVEN,
encoding_format: str | NotGiven = NOT_GIVEN,
input_type: str | NotGiven = NOT_GIVEN,
truncate: str | NotGiven = NOT_GIVEN,
user: str | NotGiven = NOT_GIVEN,
extra_headers: Headers | None = None,
extra_query: Query | None = None,
extra_body: Body | None = None,
timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> CreateEmbeddingResponse

Creates embeddings for the provided input.

Parameters:
  • input – Input text to embed, given as a single string or a list of strings.

  • model – The model to use. Must be one of the available models.

  • dimensions – The dimensionality of the embedding vector.

  • encoding_format – The encoding format of the input.

  • input_type – The type of the input.

  • truncate – Truncate the input text.

  • user – Not supported. A unique identifier representing your end user.

  • extra_headers – Send extra headers

  • extra_query – Add additional query parameters to the request

  • extra_body – Add additional JSON properties to the request

  • timeout – Override the client-level default timeout for this request, in seconds
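Because create is a coroutine on the async resource, several requests can be awaited concurrently. The sketch below shows one way to do that with asyncio.gather. The helper name embed_batches is hypothetical, and the response layout (data items with an embedding attribute) is an assumption, as CreateEmbeddingResponse is not described in this section.

```python
import asyncio
from typing import Any, List, Sequence


async def embed_batches(
    embeddings: Any,
    batches: Sequence[Sequence[str]],
    model: str,
) -> List[List[List[float]]]:
    """Embed several batches concurrently; results keep the order of `batches`.

    `embeddings` is expected to be an AsyncEmbeddingsResource or a compatible object.
    """
    # Each create() call returns a coroutine; gather runs them concurrently
    # and returns the results in the same order as the inputs.
    responses = await asyncio.gather(
        *(embeddings.create(input=list(batch), model=model) for batch in batches)
    )
    # Assumption: each response exposes `data`, a list of items that each
    # carry an `embedding` vector.
    return [[list(item.embedding) for item in r.data] for r in responses]
```

From synchronous code, a helper like this would typically be driven with asyncio.run(embed_batches(...)).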

create_from_dict(
data: dict[str, object],
) -> object