API Reference (gRPC) for NVIDIA NeMo Retriever Embedding NIM
This documentation contains the gRPC reference for NVIDIA NeMo Retriever Embedding NIM.
gRPC Models
The gRPC model names differ from the NIM model IDs shown in the Support Matrix. The following table maps each model ID to its gRPC model name.
| Model ID | gRPC Model Name |
|---|---|
| nvidia/llama-nemotron-embed-1b-v2 | nvidia_llama_nemotron_embed_1b_v2 |
| nvidia/nv-embedqa-e5-v5 | nvidia_nv_embedqa_e5_v5 |
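The mapping in the table can be sketched as a small helper. The rule used here (every `/` and `-` becomes `_`) is an assumption inferred from the two rows above, not a documented guarantee, and `grpc_model_name` is a hypothetical name chosen for illustration.

```python
def grpc_model_name(model_id: str) -> str:
    """Derive a gRPC model name from a NIM model ID.

    Assumption: the rule is inferred from the two table rows
    (every "/" and "-" becomes "_"); it is not a documented guarantee.
    """
    return model_id.replace("/", "_").replace("-", "_")


# Known mappings copied from the table, used to check the inferred rule.
KNOWN_MAPPINGS = {
    "nvidia/llama-nemotron-embed-1b-v2": "nvidia_llama_nemotron_embed_1b_v2",
    "nvidia/nv-embedqa-e5-v5": "nvidia_nv_embedqa_e5_v5",
}

for model_id, expected in KNOWN_MAPPINGS.items():
    assert grpc_model_name(model_id) == expected
```

For models that are not listed in the table, prefer the name published in the documentation over the inferred rule.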
Request Inputs
| Input | Shape | Data Type | Description | Required |
|---|---|---|---|---|
| | [batch_size, 1] | BYTES | A list of UTF-8 encoded strings to embed. For details on how to encode multimodal data as a string, refer to How to Specify Modality. | Yes |
| | [batch_size, 1] | BYTES | A list of UTF-8 modality strings, one for each of the text input elements. If you don’t specify modality, the modality is inferred. For supported modalities, refer to How to Specify Modality. | No |
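The `[batch_size, 1]` BYTES layout described above can be illustrated with plain Python. This is a minimal sketch: `to_bytes_column` and `build_input_columns` are hypothetical helper names, the modality value `"text"` is a placeholder assumption (see How to Specify Modality for the supported values), and the input tensor names are deliberately left out because they are not shown here.

```python
from typing import List, Optional, Tuple


def to_bytes_column(values: List[str]) -> List[List[bytes]]:
    """Encode strings as UTF-8 and arrange them as [batch_size, 1]."""
    return [[value.encode("utf-8")] for value in values]


def build_input_columns(
    texts: List[str], modalities: Optional[List[str]] = None
) -> Tuple[List[List[bytes]], Optional[List[List[bytes]]]]:
    """Build the required text column and the optional modality column."""
    text_column = to_bytes_column(texts)
    if modalities is None:
        # Modality is optional; when omitted, the service infers it.
        return text_column, None
    if len(modalities) != len(texts):
        raise ValueError("expected one modality string per input element")
    return text_column, to_bytes_column(modalities)


texts = ["What is a GPU?", "CUDA is a parallel computing platform."]
# "text" is a placeholder modality value used only for illustration.
text_column, modality_column = build_input_columns(texts, ["text", "text"])
```

A gRPC client library would typically wrap these nested lists in its own tensor type before sending the request; that step depends on the client and is not shown.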
Request Parameters
| Parameter | Data Type | Description | Valid Values | Default | Required |
|---|---|---|---|---|---|
| | String | The context of the embedding. | | | Yes |
| | String | How to handle text that exceeds the maximum token length. | | | Yes |
| | Integer | The desired dimensionality of the output embeddings. Must be supported by the model. | - | The model’s default dimension. | No |
| | String | The output type of the embeddings. See How to Specify Embedding Type for how the output type is handled. | | | No |
| | String | Directory path where NVCF (NVIDIA Cloud Functions) asset files are stored. See API Reference (OpenAI) for more details. | - | | No |
Response
| Output | Shape | Data Type | Description |
|---|---|---|---|
| | [batch_size] | INT32 | The number of tokens in each input text. |
| | [batch_size, embedding_dimension] | Configurable using the | The resulting embedding vectors. |
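As a sanity check on a decoded response, the shapes in the table can be verified with a short helper. This is a sketch under assumptions: `check_response_shapes` is a hypothetical name, and the outputs are assumed to have already been converted to plain Python lists by the client.

```python
from typing import Sequence


def check_response_shapes(
    token_counts: Sequence[int],
    embeddings: Sequence[Sequence[float]],
) -> int:
    """Check the response shapes and return embedding_dimension.

    token_counts is expected to have shape [batch_size];
    embeddings is expected to have shape [batch_size, embedding_dimension].
    """
    batch_size = len(token_counts)
    if len(embeddings) != batch_size:
        raise ValueError("embeddings and token counts disagree on batch_size")
    dimensions = {len(row) for row in embeddings}
    if len(dimensions) > 1:
        raise ValueError("embedding rows have inconsistent dimensions")
    # An empty batch has no rows, so report a dimension of 0.
    return dimensions.pop() if dimensions else 0


# Toy values for illustration only; real embeddings are much wider.
dim = check_response_shapes([5, 9], [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
```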