API Reference (gRPC) for NVIDIA NeMo Retriever Reranking NIM

This documentation contains the gRPC reference for NVIDIA NeMo Retriever Reranking NIM.

gRPC Models

The gRPC model names differ from the NIM model IDs that appear in the Support Matrix. The following table maps each model ID to its gRPC model name.

| Model ID | gRPC Model Name |
|---|---|
| nvidia/llama-nemotron-rerank-1b-v2 | nvidia_llama_nemotron_rerank_1b_v2 |
| nvidia/llama-nemotron-rerank-500m-v2 | nvidia_llama_nemotron_rerank_500m_v2 |
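
A client that already works with NIM model IDs may want to resolve the gRPC name in one place. The following is a minimal sketch of such a lookup; the helper name `grpc_model_name` is hypothetical and not part of the NIM. In both listed rows the gRPC name happens to be the model ID with `/` and `-` replaced by `_`, but the source does not state this as a rule, so the sketch uses an explicit table rather than string substitution.

```python
# Mapping copied from the model table in this section.
GRPC_MODEL_NAMES = {
    "nvidia/llama-nemotron-rerank-1b-v2": "nvidia_llama_nemotron_rerank_1b_v2",
    "nvidia/llama-nemotron-rerank-500m-v2": "nvidia_llama_nemotron_rerank_500m_v2",
}


def grpc_model_name(model_id: str) -> str:
    """Return the gRPC model name for a NIM model ID (hypothetical helper)."""
    try:
        return GRPC_MODEL_NAMES[model_id]
    except KeyError:
        # Fail loudly instead of guessing a name for an unlisted model.
        raise ValueError(f"No gRPC model name listed for {model_id!r}") from None
```

For example, `grpc_model_name("nvidia/llama-nemotron-rerank-1b-v2")` returns `"nvidia_llama_nemotron_rerank_1b_v2"`, and an unlisted ID raises `ValueError`.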

Request Inputs

| Input | Shape | Data Type | Description |
|---|---|---|---|
| query | [batch_size, 1] | BYTES | The queries for reranking, encoded as UTF-8. |
| passage | [batch_size, 1] | BYTES | The passages for reranking, encoded as UTF-8. |
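
Because both inputs share the shape `[batch_size, 1]`, a single query reranked against several passages has to be laid out row by row. The sketch below shows one way to do that in plain Python. It assumes that row `i` of `query` is paired with row `i` of `passage`, so the query is repeated once per passage; that pairing is an inference from the shapes, not something this page states. The helper name `build_rerank_inputs` is hypothetical.

```python
def build_rerank_inputs(query: str, passages: list[str]) -> dict[str, list[list[bytes]]]:
    """Lay out one query and N passages as two [N, 1] grids of UTF-8 bytes.

    Assumption: the service pairs row i of "query" with row i of "passage",
    so the query is repeated for every passage.
    """
    if not passages:
        raise ValueError("at least one passage is required")
    return {
        "query": [[query.encode("utf-8")] for _ in passages],
        "passage": [[p.encode("utf-8")] for p in passages],
    }


inputs = build_rerank_inputs(
    "What is a GPU?",
    ["A GPU is a graphics processing unit.", "Bread is baked in an oven."],
)
# Both inputs now have batch_size == 2 and an inner dimension of 1.
```

A gRPC client library would then wrap each grid in its own tensor type with data type `BYTES` before sending the request.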

Request Parameters

| Parameter | Data Type | Description | Valid Values | Default | Required |
|---|---|---|---|---|---|
| truncate | String | How to handle text that exceeds the maximum token length. | "END", "NONE" | "NONE" | No |
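
Since `truncate` accepts only two values, it can be worth validating it on the client before a request is sent. The following sketch builds a request-parameter dictionary from the table above. The helper name `build_request_parameters` is hypothetical, and omitting the key when no value is given relies on the documented default of `"NONE"` being applied by the service.

```python
from typing import Optional

# Valid values copied from the request-parameter table.
VALID_TRUNCATE = ("END", "NONE")


def build_request_parameters(truncate: Optional[str] = None) -> dict[str, str]:
    """Return request parameters, validating the optional truncate value."""
    if truncate is None:
        # The parameter is not required; leave it out and rely on the
        # documented default ("NONE").
        return {}
    if truncate not in VALID_TRUNCATE:
        raise ValueError(
            f"truncate must be one of {VALID_TRUNCATE}, got {truncate!r}"
        )
    return {"truncate": truncate}
```

For example, `build_request_parameters("END")` returns `{"truncate": "END"}`, while a lowercase `"end"` is rejected because the table lists only the uppercase forms.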

Response

| Output | Shape | Data Type | Description |
|---|---|---|---|
| index | [batch_size, 1] | INT32 | The index of the passages in descending order. |
| logit | [batch_size, 1] | FP32 | The logit of the passages. |
| token_count | [batch_size] | INT32 | The number of tokens in each input text. |
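
To turn the `index` and `logit` outputs back into a ranked list of passages, a client can walk the two outputs together. The sketch below makes two assumptions that this page does not confirm: that "descending order" means descending relevance score, and that row `k` of `logit` is the score of the passage referenced by row `k` of `index`. If the service instead returns logits in the original passage order, the lookup would need to be `logit[i]` rather than `logit[k]`. The helper name `rank_passages` is hypothetical.

```python
def rank_passages(
    passages: list[str],
    index: list[list[int]],
    logit: list[list[float]],
) -> list[tuple[str, float]]:
    """Pair each returned index with its passage text and logit.

    Assumption: index and logit are aligned row by row, and both are
    [batch_size, 1] grids as described in the response table.
    """
    if len(index) != len(logit):
        raise ValueError("index and logit must have the same batch size")
    ranked = []
    for (i,), (score,) in zip(index, logit):
        # Each row holds exactly one value because the inner dimension is 1.
        ranked.append((passages[i], score))
    return ranked


passages = ["first passage", "second passage", "third passage"]
ranked = rank_passages(
    passages,
    index=[[2], [0], [1]],          # illustrative values, not real output
    logit=[[3.5], [0.25], [-1.0]],  # illustrative values, not real output
)
```

With these illustrative values, `ranked` lists "third passage" first and "second passage" last.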