Text Reranking (Latest)

Reference

You can download the complete API spec

Warning

Every model has a maximum token length. The models section lists the maximum token lengths of the supported models. For ways to handle sequences longer than the maximum token length, see the truncate field in the Reference.

Use the examples in this section to help you get started with using the API.

The complete API spec can be found at OpenAPI Spec.

List Models

cURL Request

Use the following command to list the available models.

curl "http://${HOSTNAME}:${SERVICE_PORT}/v1/models" \
  -H 'Accept: application/json'

Response

{
  "object": "list",
  "data": [
    {
      "id": "nvidia/nv-rerankqa-mistral-4b-v3"
    }
  ]
}
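
The same call can be made from a script. The following is a minimal Python sketch using only the standard library, not an official client. The function names (`models_url`, `model_ids`, `list_models`) are hypothetical, and it assumes the response always has the shape shown above.

```python
import json
import urllib.request


def models_url(hostname: str, port: str) -> str:
    """Build the List Models URL from the same values the cURL example uses."""
    return f"http://{hostname}:{port}/v1/models"


def model_ids(payload: dict) -> list[str]:
    """Extract the model identifiers from a List Models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(hostname: str, port: str, timeout: float = 10.0) -> list[str]:
    """Call GET /v1/models and return the model ids (needs a running service)."""
    request = urllib.request.Request(
        models_url(hostname, port), headers={"Accept": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return model_ids(json.load(response))


# Parse the sample response from this section without contacting a server.
sample = json.loads(
    '{"object": "list", "data": [{"id": "nvidia/nv-rerankqa-mistral-4b-v3"}]}'
)
print(model_ids(sample))
```

Against a running service, `list_models(os.environ["HOSTNAME"], os.environ["SERVICE_PORT"])` (after `import os`) would perform the actual request.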

Generate Rankings

cURL Request

Use the following command to rank a set of passages by relevance to a query.

curl -X "POST" \
  "http://${HOSTNAME}:${SERVICE_PORT}/v1/ranking" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "nvidia/nv-rerankqa-mistral-4b-v3",
    "query": {"text": "which way should i go?"},
    "passages": [
      {"text": "two roads diverged in a yellow wood, and sorry i could not travel both and be one traveler, long i stood and looked down one as far as i could to where it bent in the undergrowth;"},
      {"text": "then took the other, as just as fair, and having perhaps the better claim because it was grassy and wanted wear, though as for that the passing there had worn them really about the same,"},
      {"text": "and both that morning equally lay in leaves no step had trodden black. oh, i marked the first for another day! yet knowing how way leads on to way i doubted if i should ever come back."},
      {"text": "i shall be telling this with a sigh somewhere ages and ages hense: two roads diverged in a wood, and i, i took the one less traveled by, and that has made all the difference."}
    ],
    "truncate": "END"
  }'
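
When the query and passages come from application data, it can be easier to assemble the request body in code than to hand-write the JSON. The sketch below mirrors the fields of the cURL request (`model`, `query`, `passages`, `truncate`). The helper names `build_ranking_payload` and `ranking_request` are hypothetical, and `"END"` is used as the default only because it is the value shown in the example.

```python
import json
import urllib.request


def build_ranking_payload(model, query, passages, truncate="END"):
    """Wrap plain strings in the {"text": ...} objects the request uses."""
    return {
        "model": model,
        "query": {"text": query},
        "passages": [{"text": passage} for passage in passages],
        "truncate": truncate,
    }


def ranking_request(hostname, port, payload):
    """Prepare (but do not send) a POST request to /v1/ranking."""
    return urllib.request.Request(
        f"http://{hostname}:{port}/v1/ranking",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_ranking_payload(
    "nvidia/nv-rerankqa-mistral-4b-v3",
    "which way should i go?",
    ["two roads diverged in a yellow wood", "then took the other, as just as fair"],
)
request = ranking_request("localhost", "8000", payload)  # placeholder host and port
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` sends it; that step needs a running service and is left out here.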

Response

{
  "rankings": [
    {
      "index": 0,
      "logit": 0.7646484375
    },
    {
      "index": 3,
      "logit": -1.1044921875
    },
    {
      "index": 2,
      "logit": -2.71875
    },
    {
      "index": 1,
      "logit": -5.09765625
    }
  ]
}
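
The response contains indices and scores, not the passage text, so a client has to map each entry back to its passage. The sketch below assumes that `index` is the zero-based position of the passage in the request's `passages` array and that a higher `logit` means a more relevant passage. Both assumptions are consistent with the sample above but are not stated in this section. The function name `ranked_passages` and the shortened passage strings are illustrative only.

```python
import json


def ranked_passages(passages, response):
    """Pair each ranking entry with its passage text, highest logit first."""
    ordered = sorted(
        response["rankings"], key=lambda entry: entry["logit"], reverse=True
    )
    return [
        (entry["index"], entry["logit"], passages[entry["index"]])
        for entry in ordered
    ]


# Shortened stand-ins for the four passages sent in the request.
passages = [
    "two roads diverged in a yellow wood ...",
    "then took the other, as just as fair ...",
    "and both that morning equally lay ...",
    "i shall be telling this with a sigh ...",
]

# The sample response from this section.
response = json.loads(
    '{"rankings": ['
    '{"index": 0, "logit": 0.7646484375},'
    '{"index": 3, "logit": -1.1044921875},'
    '{"index": 2, "logit": -2.71875},'
    '{"index": 1, "logit": -5.09765625}]}'
)

for index, logit, text in ranked_passages(passages, response):
    print(f"{logit:9.4f}  [{index}]  {text}")
```

Sorting explicitly means the client does not depend on the service returning the entries in any particular order.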

Health Check

cURL Request

Use the following command to query the health endpoints.

curl "http://${HOSTNAME}:${SERVICE_PORT}/v1/health/ready" \
  -H 'Accept: application/json'

curl "http://${HOSTNAME}:${SERVICE_PORT}/v1/health/live" \
  -H 'Accept: application/json'

Response

{ "ready": true }

{ "live": true }
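
A common use of the ready endpoint is to wait for the service to come up before sending ranking requests. The following is a rough sketch of such a polling loop, assuming the `{ "ready": true }` response shape shown above. The function names, the retry count, and the delay are hypothetical choices, not values from this documentation.

```python
import json
import time
import urllib.error
import urllib.request


def is_ready(body: dict) -> bool:
    """Return True only when the body reports "ready": true."""
    return body.get("ready") is True


def fetch_json(url, timeout=5.0):
    """GET a URL and decode the JSON body (needs a running service)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def wait_until_ready(url, fetch=fetch_json, attempts=30, delay=2.0):
    """Poll the ready endpoint until it reports ready or attempts run out."""
    for attempt in range(attempts):
        try:
            if is_ready(fetch(url)):
                return True
        except (urllib.error.URLError, OSError, ValueError):
            # Connection refused, HTTP error status, or a non-JSON body:
            # treat all of these as "not ready yet" and retry.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A call such as `wait_until_ready("http://localhost:8000/v1/health/ready")` would block until the service reports ready or the attempts are exhausted. The host and port here are placeholders. The `fetch` parameter exists so the loop can be exercised with a stub instead of a live service.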

© Copyright 2024, NVIDIA Corporation. Last updated on Aug 6, 2024.