API Reference (gRPC) for NVIDIA NeMo Retriever Reranking NIM
This documentation contains the gRPC reference for NVIDIA NeMo Retriever Reranking NIM.
gRPC Models
The gRPC model names differ from the NIM model IDs that appear in the Support Matrix. The following table maps each model ID to its gRPC model name.
| Model ID | gRPC Model Name |
|---|---|
| nvidia/llama-nemotron-rerank-1b-v2 | nvidia_llama_nemotron_rerank_1b_v2 |
| nvidia/llama-nemotron-rerank-500m-v2 | nvidia_llama_nemotron_rerank_500m_v2 |
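In both rows of the table, the gRPC model name looks like the model ID with slashes and hyphens replaced by underscores. The following is a minimal sketch of that pattern. The helper `to_grpc_model_name` is hypothetical and the rule is an assumption inferred from these two rows, so the table remains the source of truth for any other model.

```python
def to_grpc_model_name(model_id: str) -> str:
    """Derive a gRPC model name from a NIM model ID (hypothetical helper).

    Assumption: the pattern visible in the table holds, that is, "/" and "-"
    both become "_". This is inferred from two rows, not a documented rule.
    """
    return model_id.replace("/", "_").replace("-", "_")


# The two mappings listed in the table, used here as a sanity check.
KNOWN_MODELS = {
    "nvidia/llama-nemotron-rerank-1b-v2": "nvidia_llama_nemotron_rerank_1b_v2",
    "nvidia/llama-nemotron-rerank-500m-v2": "nvidia_llama_nemotron_rerank_500m_v2",
}

for model_id, grpc_name in KNOWN_MODELS.items():
    assert to_grpc_model_name(model_id) == grpc_name
```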
Request Inputs
| Input | Shape | Data Type | Description |
|---|---|---|---|
| | [batch_size, 1] | BYTES | The queries for reranking, encoded as UTF-8. |
| | [batch_size, 1] | BYTES | The passages for reranking, encoded as UTF-8. |
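To illustrate the `[batch_size, 1]` BYTES layout, the sketch below encodes text as UTF-8 and nests it so that each row holds exactly one element. The helper `build_rerank_inputs` is hypothetical, and repeating the query once per passage so that both inputs share the same `batch_size` is an assumption, not something this reference states. The tensor names and the client call are omitted because they are not given here.

```python
from typing import List, Tuple


def build_rerank_inputs(
    query: str, passages: List[str]
) -> Tuple[List[List[bytes]], List[List[bytes]]]:
    """Return (queries, passages) as nested lists shaped [batch_size, 1].

    Assumption: the query is repeated once per passage so that both
    inputs have the same batch_size.
    """
    # Each row is a one-element list, which gives the trailing dimension of 1.
    query_rows = [[query.encode("utf-8")] for _ in passages]
    passage_rows = [[p.encode("utf-8")] for p in passages]
    return query_rows, passage_rows


queries, docs = build_rerank_inputs(
    "What is a reranker?",
    ["A reranker scores passages against a query.", "Café menus vary by season."],
)
assert len(queries) == len(docs) == 2          # batch_size
assert all(len(row) == 1 for row in queries)   # second dimension is 1
```

With a gRPC client library, nested lists like these would typically be converted to the client's own tensor type before sending.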
Request Parameters
| Parameter | Data Type | Description | Valid Values | Default | Required |
|---|---|---|---|---|---|
| | String | How to handle text that exceeds the maximum token length. | | | No |
Response
| Output | Shape | Data Type | Description |
|---|---|---|---|
| | [batch_size, 1] | INT32 | The index of the passages in descending order. |
| | [batch_size, 1] | FP32 | The logit of the passages. |
| | [batch_size] | INT32 | The number of tokens in each input text. |
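As a rough illustration of how the index and logit outputs might be combined on the client side, the sketch below reorders the original passages by the returned indices and maps each logit to a score between 0 and 1 with a sigmoid. The helper `rank_passages` is hypothetical. It assumes that the logits are reported in the original request order and that each index refers to a position in that order. The reference does not state either point, so verify both against real responses. The sigmoid is a common convention for turning a logit into a score and is not prescribed here.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def rank_passages(
    passages: List[str], index: List[List[int]], logits: List[List[float]]
) -> List[Tuple[str, float]]:
    """Return (passage, score) pairs in the order given by `index`.

    Assumptions: `index` and `logits` are shaped [batch_size, 1], and
    logits[i] belongs to passages[i] in the original request order.
    """
    ranked = []
    for (i,) in index:  # each row holds a single passage position
        ranked.append((passages[i], sigmoid(logits[i][0])))
    return ranked


# Made-up values for illustration only, not real model output.
example = rank_passages(
    ["first", "second", "third"],
    index=[[2], [0], [1]],
    logits=[[0.0], [-1.0], [3.0]],
)
assert [p for p, _ in example] == ["third", "first", "second"]
```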