nv_ingest.extraction_workflows.pptx package#

Submodules#

nv_ingest.extraction_workflows.pptx.pptx_helper module#

nv_ingest.extraction_workflows.pptx.pptx_helper.escape_text(text)[source]#
nv_ingest.extraction_workflows.pptx.pptx_helper.format_text(
    text: str,
    bold: bool = False,
    italic: bool = False,
    underline: bool = False,
) → str[source]#
nv_ingest.extraction_workflows.pptx.pptx_helper.get_bbox(
    presentation_object: Presentation | None = None,
    shape_object: Slide | None = None,
    text_depth: TextTypeEnum | None = None,
)[source]#
nv_ingest.extraction_workflows.pptx.pptx_helper.is_accent(font)[source]#
nv_ingest.extraction_workflows.pptx.pptx_helper.is_list_block(shape)[source]#
nv_ingest.extraction_workflows.pptx.pptx_helper.is_strong(font)[source]#
nv_ingest.extraction_workflows.pptx.pptx_helper.is_subtitle(shape)[source]#
nv_ingest.extraction_workflows.pptx.pptx_helper.is_title(shape)[source]#
nv_ingest.extraction_workflows.pptx.pptx_helper.is_underlined(font)[source]#
nv_ingest.extraction_workflows.pptx.pptx_helper.process_subtitle(shape)[source]#
nv_ingest.extraction_workflows.pptx.pptx_helper.process_title(shape)[source]#
nv_ingest.extraction_workflows.pptx.pptx_helper.python_pptx(
    pptx_stream,
    extract_text: bool,
    extract_images: bool,
    extract_tables: bool,
    extract_charts: bool,
    **kwargs,
)[source]#

Helper function that uses python-pptx to extract text from a PPTX bytestream, deferring the classification of images as tables or charts when requested.
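As a rough illustration of how the documented parameters fit together, the sketch below packs raw PPTX bytes and the four extraction flags into a keyword dictionary matching the signature above. The names `build_extraction_args` and `run_extraction` are hypothetical and not part of nv_ingest. Wrapping the bytes in `io.BytesIO` is an assumption about what `pptx_stream` accepts, and the keys expected in `**kwargs` are not described in this listing, so a real call may need additional arguments.

```python
import io


def build_extraction_args(pptx_bytes, *, text=True, images=False,
                          tables=False, charts=False, **extra):
    """Hypothetical helper: map raw PPTX bytes and flags onto the documented parameters."""
    return {
        # Assumption: python_pptx accepts a seekable in-memory binary stream.
        "pptx_stream": io.BytesIO(pptx_bytes),
        "extract_text": text,
        "extract_images": images,
        "extract_tables": tables,
        "extract_charts": charts,
        # Forwarded as **kwargs; which keys are required is not documented here.
        **extra,
    }


def run_extraction(pptx_bytes, **options):
    """Hypothetical wrapper: call python_pptx with the arguments built above."""
    # Deferred import so this sketch can be loaded without nv_ingest installed.
    from nv_ingest.extraction_workflows.pptx import python_pptx

    return python_pptx(**build_extraction_args(pptx_bytes, **options))


# Build the arguments only; no extraction is run here.
args = build_extraction_args(b"PK\x03\x04", tables=True)
print(sorted(args))
```

Keeping argument construction separate from the call makes it possible to inspect the flags before handing the stream to the extractor.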

nv_ingest.extraction_workflows.pptx.pptx_helper.ungroup_shapes(shapes)[source]#

Module contents#

nv_ingest.extraction_workflows.pptx.python_pptx(
    pptx_stream,
    extract_text: bool,
    extract_images: bool,
    extract_tables: bool,
    extract_charts: bool,
    **kwargs,
)[source]#

Helper function that uses python-pptx to extract text from a PPTX bytestream, deferring the classification of images as tables or charts when requested.