nv_ingest.extraction_workflows.image package

Submodules

nv_ingest.extraction_workflows.image.image_handlers module

nv_ingest.extraction_workflows.image.image_handlers.convert_svg_to_bitmap(image_stream: BytesIO) → ndarray

Converts an SVG image from a bytestream to a bitmap format.

Parameters:

image_stream (io.BytesIO) – A bytestream of the SVG file.

Returns:

Preprocessed image as a numpy array in bitmap format.

Return type:

np.ndarray
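Preparing the input for this function can be sketched as follows. This is a minimal, standard-library-only sketch: `make_svg_stream` is a hypothetical helper that is not part of nv_ingest, and the final call is left as a comment because it requires the library to be installed.

```python
import io
import xml.etree.ElementTree as ET


def make_svg_stream(width: int, height: int) -> io.BytesIO:
    """Build a minimal in-memory SVG: one filled rectangle on a fixed canvas."""
    svg = (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<rect width="{width}" height="{height}" fill="white"/>'
        "</svg>"
    )
    return io.BytesIO(svg.encode("utf-8"))


image_stream = make_svg_stream(64, 32)

# Sanity-check that the payload is well-formed XML with an <svg> root.
root = ET.parse(image_stream).getroot()
is_svg = root.tag == "{http://www.w3.org/2000/svg}svg"

# Parsing consumed the stream, so rewind before handing it to a consumer.
image_stream.seek(0)

# With nv_ingest installed, the stream would then be passed on like this:
# bitmap = convert_svg_to_bitmap(image_stream)
```

Rewinding the stream is a defensive habit here; whether the function itself seeks to the start is not stated in this reference.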

nv_ingest.extraction_workflows.image.image_handlers.extract_page_element_images(
annotation_dict: Dict[str, List[List[float]]],
original_image: ndarray,
page_idx: int,
page_elements: List[Tuple[int, CroppedImageWithContent]],
) → None

Handle the extraction of tables and charts from the inference results and run additional model inference.

Parameters:
  • annotation_dict (dict of {str : list of list of float}) – A dictionary containing detected objects and their bounding boxes. Keys should include “table” and “chart”, and each key’s value should be a list of bounding boxes, with each bounding box represented as a list of floats.

  • original_image (np.ndarray) – The original image from which objects were detected, expected to be in RGB format with shape (H, W, 3).

  • page_idx (int) – The index of the current page being processed.

  • page_elements (list of tuple of (int, CroppedImageWithContent)) – A list to which extracted tables and charts will be appended. Each item in the list is a tuple where the first element is the page index, and the second is an instance of CroppedImageWithContent representing a cropped image and associated metadata.

Return type:

None

Notes

This function iterates over detected objects labeled as “table” or “chart”. For each object, it crops the original image according to the bounding box coordinates, then creates an instance of CroppedImageWithContent containing the cropped image and metadata, and appends it to page_elements.
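The loop described above can be sketched without the library. This is an illustration under stated assumptions, not the nv_ingest implementation: each bounding box is assumed to be `[x_min, y_min, x_max, y_max, confidence]` with corners normalized to the range 0 to 1 (inferred from the five-element boxes in the example, not confirmed by this reference), a nested list stands in for the NumPy array, and `CroppedRegion` is a hypothetical stand-in for CroppedImageWithContent.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CroppedRegion:
    """Hypothetical stand-in for CroppedImageWithContent."""
    label: str
    pixels: List[List[int]]
    bbox: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def crop_page_elements(
    annotation_dict: Dict[str, List[List[float]]],
    image: List[List[int]],
    page_idx: int,
    page_elements: List[Tuple[int, CroppedRegion]],
) -> None:
    """Crop every "table" and "chart" box and append it to page_elements."""
    height, width = len(image), len(image[0])
    for label in ("table", "chart"):
        for box in annotation_dict.get(label, []):
            # Assumed layout: normalized corners followed by a confidence score.
            x_min, y_min, x_max, y_max = box[:4]
            left, top = int(x_min * width), int(y_min * height)
            right, bottom = int(x_max * width), int(y_max * height)
            # Slice rows first, then columns (NumPy: image[top:bottom, left:right]).
            pixels = [row[left:right] for row in image[top:bottom]]
            page_elements.append(
                (page_idx, CroppedRegion(label, pixels, (left, top, right, bottom)))
            )


# A 10x10 single-channel "image" whose pixel value encodes its position.
image = [[10 * y + x for x in range(10)] for y in range(10)]
annotations = {
    "table": [[0.0, 0.0, 0.5, 0.5, 0.8]],
    "chart": [[0.5, 0.5, 1.0, 1.0, 0.9]],
}
page_elements: List[Tuple[int, CroppedRegion]] = []
crop_page_elements(annotations, image, 0, page_elements)
```

As in the documented function, the results accumulate in the caller's list and nothing is returned, so one list can collect elements across several pages.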

Examples

>>> import numpy as np
>>> from nv_ingest.extraction_workflows.image.image_handlers import extract_page_element_images
>>> annotation_dict = {"table": [[0.1, 0.1, 0.5, 0.5, 0.8]], "chart": [[0.6, 0.6, 0.9, 0.9, 0.9]]}
>>> original_image = np.random.rand(1536, 1536, 3)
>>> page_elements = []
>>> extract_page_element_images(annotation_dict, original_image, 0, page_elements)
>>> len(page_elements)
2

nv_ingest.extraction_workflows.image.image_handlers.extract_page_elements_from_images(
images: List[ndarray],
config: ImageConfigSchema,
trace_info: List | None = None,
) → List[Tuple[int, object]]

Detect and extract tables/charts from a list of NumPy images using YOLOX.

Parameters:
  • images (List[np.ndarray]) – List of images in NumPy array format.

  • config (ImageConfigSchema) – Configuration object containing YOLOX endpoints, auth token, etc.

  • trace_info (Optional[List], optional) – Optional tracing data for debugging/performance profiling.

Returns:

A list of (image_index, CroppedImageWithContent) tuples, one for each table or chart extracted from the input images.

Return type:

List[Tuple[int, object]]
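Because the result is a flat list of tuples, callers may want to regroup it per input image. The following is a small sketch of that step; `group_by_image` is a hypothetical helper that is not part of nv_ingest, and plain strings stand in for CroppedImageWithContent instances.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_image(results: List[Tuple[int, object]]) -> Dict[int, List[object]]:
    """Group (image_index, element) tuples by the image they came from."""
    grouped: Dict[int, List[object]] = defaultdict(list)
    for image_index, element in results:
        grouped[image_index].append(element)
    return dict(grouped)


# Strings stand in for CroppedImageWithContent instances.
results = [(0, "table-a"), (1, "chart-b"), (0, "chart-c")]
per_image = group_by_image(results)
```

Images with no detected tables or charts do not appear as keys in the grouped result, so `per_image.get(index, [])` is the safer lookup.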

nv_ingest.extraction_workflows.image.image_handlers.image_data_extractor(
image_stream,
document_type: str,
extract_text: bool,
extract_images: bool,
extract_tables: bool,
extract_charts: bool,
trace_info: dict | None = None,
**kwargs,
)

Helper function to extract text, images, tables, and charts from an image bytestream.

Parameters:
  • image_stream (io.BytesIO) – A bytestream for the image file.

  • document_type (str) – Specifies the type of the image document (‘png’, ‘jpeg’, ‘jpg’, ‘svg’, ‘tiff’, ‘bmp’).

  • extract_text (bool) – Specifies whether to extract text.

  • extract_images (bool) – Specifies whether to extract images.

  • extract_tables (bool) – Specifies whether to extract tables.

  • extract_charts (bool) – Specifies whether to extract charts.

  • trace_info (dict, optional) – Tracing information for logging or debugging purposes.

  • **kwargs – Additional extraction parameters.

Returns:

A list of extracted data items.

Return type:

list
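A call following the documented signature might look like the sketch below. To keep it runnable without nv_ingest, the extractor is passed in as a parameter and exercised with a stub: `run_extraction` and `fake_extractor` are hypothetical names, and the up-front check against the documented type list is an assumption of this sketch, not documented behavior of the library.

```python
import io
from typing import Callable, Optional

# Document types listed in the parameter description above.
SUPPORTED_TYPES = {"png", "jpeg", "jpg", "svg", "tiff", "bmp"}


def run_extraction(
    extractor: Callable[..., list],
    data: bytes,
    document_type: str,
    trace_info: Optional[dict] = None,
) -> list:
    """Wrap raw bytes in a BytesIO and call an extractor with explicit flags."""
    if document_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported document_type: {document_type!r}")
    return extractor(
        io.BytesIO(data),
        document_type,
        extract_text=True,
        extract_images=True,
        extract_tables=False,
        extract_charts=False,
        trace_info=trace_info,
    )


def fake_extractor(image_stream, document_type, extract_text, extract_images,
                   extract_tables, extract_charts, trace_info=None, **kwargs):
    """Stub with the documented parameter list; it only echoes its inputs."""
    return [(document_type, len(image_stream.read()), extract_text, extract_tables)]


items = run_extraction(fake_extractor, b"\x89PNG", "png")
```

In real use, `image_data_extractor` would be passed in place of `fake_extractor`. Passing the four boolean flags by keyword keeps the call readable, since they are easy to transpose when given positionally.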

nv_ingest.extraction_workflows.image.image_handlers.load_and_preprocess_image(
image_stream: BytesIO,
) → ndarray

Loads and preprocesses a JPEG, JPG, or PNG image from a bytestream.

Parameters:

image_stream (io.BytesIO) – A bytestream of the image file.

Returns:

Preprocessed image as a numpy array.

Return type:

np.ndarray
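Since this loader is documented for JPEG and PNG input only, a caller may want to check the stream before passing it on. The sketch below does so by peeking at the file signature; `sniff_format` is a hypothetical helper that is not part of nv_ingest, and the magic numbers are the standard PNG and JPEG signatures.

```python
import io
from typing import Optional

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # 8-byte PNG file signature
JPEG_MAGIC = b"\xff\xd8\xff"      # JPEG start-of-image marker plus next marker byte


def sniff_format(image_stream: io.BytesIO) -> Optional[str]:
    """Return "png", "jpeg", or None by peeking at the leading bytes."""
    start = image_stream.tell()
    header = image_stream.read(len(PNG_MAGIC))
    image_stream.seek(start)  # leave the stream where we found it
    if header.startswith(PNG_MAGIC):
        return "png"
    if header.startswith(JPEG_MAGIC):
        return "jpeg"
    return None


png_stream = io.BytesIO(PNG_MAGIC + b"rest")
jpeg_stream = io.BytesIO(JPEG_MAGIC + b"\xe0rest")
png_kind = sniff_format(png_stream)
jpeg_kind = sniff_format(jpeg_stream)
```

Restoring the stream position matters because the same BytesIO object is then handed to the loader, which needs to read the file from its beginning.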

Module contents

nv_ingest.extraction_workflows.image.image(
image_stream,
document_type: str,
extract_text: bool,
extract_images: bool,
extract_tables: bool,
extract_charts: bool,
trace_info: dict | None = None,
**kwargs,
)

Helper function to extract text, images, tables, and charts from an image bytestream.

Parameters:
  • image_stream (io.BytesIO) – A bytestream for the image file.

  • document_type (str) – Specifies the type of the image document (‘png’, ‘jpeg’, ‘jpg’, ‘svg’, ‘tiff’, ‘bmp’).

  • extract_text (bool) – Specifies whether to extract text.

  • extract_images (bool) – Specifies whether to extract images.

  • extract_tables (bool) – Specifies whether to extract tables.

  • extract_charts (bool) – Specifies whether to extract charts.

  • trace_info (dict, optional) – Tracing information for logging or debugging purposes.

  • **kwargs – Additional extraction parameters.

Returns:

A list of extracted data items.

Return type:

list