API Reference (gRPC) for NVIDIA NeMo Retriever Reranking NIM

This documentation contains the gRPC reference for NVIDIA NeMo Retriever Reranking NIM.

gRPC Models

The gRPC model names differ from the NIM model IDs listed in the Support Matrix. The following table maps each model ID to its gRPC model name.

Model ID	gRPC Model Name
nvidia/llama-nemotron-rerank-1b-v2	nvidia_llama_nemotron_rerank_1b_v2
nvidia/llama-nemotron-rerank-500m-v2	nvidia_llama_nemotron_rerank_500m_v2
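
Both rows of the table are consistent with a simple rule: replace each `/` and `-` in the model ID with `_`. The following Python sketch encodes that observation. The helper name `grpc_model_name` is hypothetical, and the rule is inferred from the two listed models rather than documented, so treat the table as the authority for any model not shown here.

```python
def grpc_model_name(model_id: str) -> str:
    """Derive a gRPC model name from a NIM model ID.

    Assumption: the rule (replace "/" and "-" with "_") is inferred from
    the two rows of the mapping table, not from a documented guarantee.
    """
    return model_id.replace("/", "_").replace("-", "_")


# The two mappings listed in the table above.
KNOWN_MODELS = {
    "nvidia/llama-nemotron-rerank-1b-v2": "nvidia_llama_nemotron_rerank_1b_v2",
    "nvidia/llama-nemotron-rerank-500m-v2": "nvidia_llama_nemotron_rerank_500m_v2",
}

# Check the inferred rule against every documented mapping.
for model_id, expected in KNOWN_MODELS.items():
    assert grpc_model_name(model_id) == expected
```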

Input	Shape	Data Type	Description
`query`	[batch_size, 1]	BYTES	The queries for reranking, encoded as UTF-8.
`passage`	[batch_size, 1]	BYTES	The passages for reranking, encoded as UTF-8.

Parameter	Data Type	Description	Valid Values	Default	Required
`truncate`	String	How to handle text that exceeds the maximum token length.	`"END"`, `"NONE"`	`"NONE"`	No
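
The inputs and the `truncate` parameter above could be assembled into a request roughly as follows. This is a sketch under several assumptions that this page does not confirm: that the endpoint speaks the Triton Inference Server gRPC protocol (so the `tritonclient` and `numpy` packages apply, with a `tritonclient` release whose `infer` accepts `parameters`), that it listens on `localhost:8001`, and that one query is repeated so that row i of `query` pairs with row i of `passage`. The helper names `to_bytes_column`, `build_parameters`, and `rerank` are hypothetical.

```python
VALID_TRUNCATE = ("END", "NONE")  # the values documented for `truncate`


def to_bytes_column(texts):
    """Encode strings as UTF-8 and nest them as rows of shape [batch_size, 1]."""
    return [[text.encode("utf-8")] for text in texts]


def build_parameters(truncate="NONE"):
    """Validate `truncate` and return the request parameters."""
    if truncate not in VALID_TRUNCATE:
        raise ValueError(f"truncate must be one of {VALID_TRUNCATE}, got {truncate!r}")
    return {"truncate": truncate}


def rerank(query, passages, model_name, url="localhost:8001", truncate="NONE"):
    """Send one query with N passages (assumes a Triton-compatible endpoint)."""
    # Imported lazily so the helpers above work without these packages.
    import numpy as np
    import tritonclient.grpc as grpcclient

    # Assumption: the query is repeated so row i pairs with passage i.
    columns = {
        "query": to_bytes_column([query] * len(passages)),
        "passage": to_bytes_column(passages),
    }
    inputs = []
    for name, rows in columns.items():
        data = np.array(rows, dtype=np.object_)  # shape (batch_size, 1)
        tensor = grpcclient.InferInput(name, list(data.shape), "BYTES")
        tensor.set_data_from_numpy(data)
        inputs.append(tensor)

    names = ("index", "logit", "token_count")
    outputs = [grpcclient.InferRequestedOutput(n) for n in names]
    client = grpcclient.InferenceServerClient(url=url)
    result = client.infer(
        model_name=model_name,
        inputs=inputs,
        outputs=outputs,
        parameters=build_parameters(truncate),
    )
    return {n: result.as_numpy(n) for n in names}
```

Calling `rerank("what is a GPU?", ["passage one", "passage two"], "nvidia_llama_nemotron_rerank_1b_v2")` against a running service would return the three output tensors as NumPy arrays, provided the assumptions above hold.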

Output	Shape	Data Type	Description
`index`	[batch_size, 1]	INT32	The indices of the passages, in descending order of relevance.
`logit`	[batch_size, 1]	FP32	The relevance logit for each passage.
`token_count`	[batch_size]	INT32	The number of tokens in each input text.
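
As an illustration of consuming these outputs, the sketch below pairs each returned index with its logit and converts the logit to a score in (0, 1) with a sigmoid. It assumes that row i of `logit` belongs to the passage whose position is given by row i of `index`; this page does not state that pairing, so verify it against real responses. The function name `rank_passages` and the sample values are made up for the example.

```python
import math


def rank_passages(passages, index, logit):
    """Pair each returned index with its logit and a sigmoid score.

    Assumption: row i of `logit` is the score of the passage whose
    position is given by row i of `index`.
    """
    ranked = []
    for (i,), (z,) in zip(index, logit):
        score = 1.0 / (1.0 + math.exp(-z))  # sigmoid maps a logit to (0, 1)
        ranked.append((passages[i], z, score))
    return ranked


# Made-up values shaped like the outputs above: [batch_size, 1].
passages = ["about cats", "about GPUs", "about tea"]
index = [[1], [2], [0]]
logit = [[2.0], [0.0], [-3.0]]

for text, z, score in rank_passages(passages, index, logit):
    print(f"{score:.3f}  {z:+.1f}  {text}")
```

The sigmoid is only a convenience for reading scores. Ranking depends solely on the order of the logits, which the sigmoid preserves.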