API Reference for NVIDIA NIM for Table Extraction#

This documentation contains the API reference for NVIDIA NIM for Table Extraction.

OpenAPI Specification#

You can download the complete API spec. The spec is subject to change while the NIM is in Early Access (EA). EA participants are encouraged to send feedback to NVIDIA before the General Availability (GA) release.

API Examples#

Extract Text Data from Image#

The v1/infer endpoint accepts multiple images in one request and returns, for each image, a list of text detections, each with a bounding box and a confidence score.

The only supported value for the type field is image_url.

Each image must be base64-encoded and represented in the following JSON format. The supported image formats are png and jpeg.

{
  "type": "image_url",
  "url": "data:image/<IMAGE_FORMAT>;base64,<BASE64_ENCODED_IMAGE>"
}
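
As an illustration of the format above, the following Python sketch wraps raw image bytes in one such entry. The helper name make_image_entry and the jpg alias are assumptions made for this example; they are not part of the NIM API.

```python
import base64

# Formats the section lists as supported. Accepting "jpg" as an alias
# for "jpeg" is an assumption made for convenience in this sketch.
SUPPORTED_FORMATS = {"png": "png", "jpeg": "jpeg", "jpg": "jpeg"}


def make_image_entry(image_bytes: bytes, image_format: str) -> dict:
    """Wrap raw image bytes in a {"type", "url"} entry (hypothetical helper)."""
    fmt = SUPPORTED_FORMATS.get(image_format.lower())
    if fmt is None:
        raise ValueError(f"unsupported image format: {image_format!r}")
    # Base64-encode the bytes and embed them in a data URL.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"type": "image_url", "url": f"data:image/{fmt};base64,{encoded}"}
```

In practice the bytes would come from reading the image file in binary mode, for example with open(path, "rb").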

An inference request contains an input entry. Its value is an array of dictionaries, each with the fields type and url. For example, a JSON payload with three images looks like the following:

{
  "input": [
    {
      "type": "image_url",
      "url": "data:image/png;base64,<BASE64_ENCODED_IMAGE>"
    },
    {
      "type": "image_url",
      "url": "data:image/png;base64,<BASE64_ENCODED_IMAGE>"
    },
    {
      "type": "image_url",
      "url": "data:image/png;base64,<BASE64_ENCODED_IMAGE>"
    }
  ]
}

cURL Request

HOSTNAME="localhost"
SERVICE_PORT=8000
curl -X "POST" \
  "http://${HOSTNAME}:${SERVICE_PORT}/v1/infer" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
        "input": [
          {
            "type": "image_url",
            "url": "data:image/png;base64,<BASE64_ENCODED_IMAGE>"
          },
          {
            "type": "image_url",
            "url": "data:image/png;base64,<BASE64_ENCODED_IMAGE>"
          }
        ]
      }'
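
The same request can be sketched in Python with only the standard library. The function names build_infer_request and send are hypothetical, and the host, port, and headers simply mirror the cURL example above; treat this as one possible client, not an official one.

```python
import json
import urllib.request


def build_infer_request(entries, hostname="localhost", port=8000):
    """Build (but do not send) a POST request for the v1/infer endpoint."""
    # The payload has a single "input" key holding the list of image entries.
    body = json.dumps({"input": entries}).encode("utf-8")
    return urllib.request.Request(
        f"http://{hostname}:{port}/v1/infer",
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect before anything reaches the service.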

Response

The PaddleOCR NIM output provides, for each text detection, a confidence score and a bounding box whose point coordinates are floats in the range [0, 1].

{
    "data": [
        {
            "index": 0,
            "text_detections": [
                {
                    "text_prediction": {
                        "text": "detected text 1 from image 1",
                        "confidence": 0.9917036890983582
                    },
                    "bounding_box": {
                        "points": [
                            {
                                "x": 0.07395833333333333,
                                "y": 0.09304677623261694
                            },
                            {
                                "x": 0.6541666666666667,
                                "y": 0.09034976822587443
                            },
                            {
                                "x": 0.6541666666666667,
                                "y": 0.13080488832701223
                            },
                            {
                                "x": 0.07395833333333333,
                                "y": 0.13350189633375473
                            }
                        ]
                    }
                },
                {
                    "text_prediction": {
                        "text": "detected text 2 from image 1",
                        "confidence": 0.9888473749160767
                    },
                    "bounding_box": {
                        "points": [
                            {
                                "x": 0.8302083333333333,
                                "y": 0.10788032026970079
                            },
                            {
                                "x": 0.934375,
                                "y": 0.10788032026970079
                            },
                            {
                                "x": 0.934375,
                                "y": 0.12541087231352718
                            },
                            {
                                "x": 0.8302083333333333,
                                "y": 0.12541087231352718
                            }
                        ]
                    }
                }
            ]
        },
        {
            "index": 1,
            "text_detections": [
                {
                    "text_prediction": {
                        "text": "detected text 1 from image 2",
                        "confidence": 0.9917036890983582
                    },
                    "bounding_box": {
                        "points": [
                            {
                                "x": 0.07395833333333333,
                                "y": 0.09304677623261694
                            },
                            {
                                "x": 0.6541666666666667,
                                "y": 0.09034976822587443
                            },
                            {
                                "x": 0.6541666666666667,
                                "y": 0.13080488832701223
                            },
                            {
                                "x": 0.07395833333333333,
                                "y": 0.13350189633375473
                            }
                        ]
                    }
                },
                {
                    "text_prediction": {
                        "text": "detected text 2 from image 2",
                        "confidence": 0.9888473749160767
                    },
                    "bounding_box": {
                        "points": [
                            {
                                "x": 0.8302083333333333,
                                "y": 0.10788032026970079
                            },
                            {
                                "x": 0.934375,
                                "y": 0.10788032026970079
                            },
                            {
                                "x": 0.934375,
                                "y": 0.12541087231352718
                            },
                            {
                                "x": 0.8302083333333333,
                                "y": 0.12541087231352718
                            }
                        ]
                    }
                }
            ]
        }
    ]
}
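
A response shaped like the one above can be flattened and its bounding boxes scaled to pixels with a short helper. This sketch rests on two assumptions that the section does not state outright: that index is the position of the image in the request's input array, and that x and y are fractions of the image width and height. The name detections_to_pixels is hypothetical.

```python
def detections_to_pixels(response: dict, image_sizes: list) -> list:
    """Flatten a response into (index, text, confidence, pixel_points) tuples.

    image_sizes[i] is the (width, height) in pixels of the i-th request image.
    """
    results = []
    for item in response["data"]:
        # Assumption: "index" identifies the image by its position in "input".
        width, height = image_sizes[item["index"]]
        for detection in item["text_detections"]:
            prediction = detection["text_prediction"]
            # Assumption: x and y are fractions of width and height.
            points = [
                (round(p["x"] * width), round(p["y"] * height))
                for p in detection["bounding_box"]["points"]
            ]
            results.append(
                (item["index"], prediction["text"], prediction["confidence"], points)
            )
    return results
```

A caller could then filter the tuples by confidence before drawing or cropping the boxes.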

Health Check#

cURL Request

Use the following commands to query the ready and live health endpoints.

HOSTNAME="localhost"
SERVICE_PORT=8000
curl "http://${HOSTNAME}:${SERVICE_PORT}/v1/health/ready" \
-H 'Accept: application/json'

HOSTNAME="localhost"
SERVICE_PORT=8000
curl "http://${HOSTNAME}:${SERVICE_PORT}/v1/health/live" \
-H 'Accept: application/json'

Response

The responses to the ready and live requests, in that order, are:

{
  "ready": true
}

{
  "live": true
}
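
A client that starts alongside the service may want to wait for readiness before sending inference requests. The following sketch polls v1/health/ready until the body contains "ready": true. The names fetch_json and wait_until_ready, the retry count, and the delay are all assumptions chosen for illustration.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_json(url, timeout=5.0):
    """GET a URL and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def wait_until_ready(base_url="http://localhost:8000", attempts=30, delay=2.0,
                     fetch=fetch_json):
    """Poll v1/health/ready until it reports ready or attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(f"{base_url}/v1/health/ready").get("ready") is True:
                return True
        except (urllib.error.URLError, OSError, ValueError):
            pass  # service unreachable, returned an HTTP error, or body was not JSON
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before the next attempt
    return False
```

The fetch parameter exists so the polling logic can be exercised without a running service.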

OpenAPI Reference for Table Extraction NIM#

The following is the OpenAPI reference for NVIDIA NIM for Table Extraction.