pii.custom_batch_analyzer_engine

Module Contents

Classes

CustomBatchAnalyzerEngine

Batch analysis of documents (tables, lists, dicts).

Data

API

class pii.custom_batch_analyzer_engine.CustomBatchAnalyzerEngine(
analyzer_engine: presidio_analyzer.AnalyzerEngine | None = None,
)

Bases: presidio_analyzer.BatchAnalyzerEngine

Batch analysis of documents (tables, lists, dicts).

Wrapper class that runs the Presidio AnalyzerEngine on multiple values: lists or iterators of strings, or dictionaries.

Parameters:
  • analyzer_engine – AnalyzerEngine instance used to analyze the values in those collections.


analyze_batch(
texts: collections.abc.Iterable[str],
language: str,
entities: list[str] | None = None,
correlation_id: str | None = None,
score_threshold: float | None = None,
return_decision_process: bool | None = False,
ad_hoc_recognizers: list[presidio_analyzer.EntityRecognizer] | None = None,
context: list[str] | None = None,
allow_list: list[str] | None = None,
nlp_artifacts_batch: collections.abc.Iterable[presidio_analyzer.nlp_engine.NlpArtifacts] | None = None,
) -> list[list[presidio_analyzer.RecognizerResult]]

analyze_dict(
input_dict: dict[str, Any | collections.abc.Iterable[Any]],
language: str,
keys_to_skip: list[str] | None = None,
**kwargs,
) -> collections.abc.Iterator[presidio_analyzer.DictAnalyzerResult]

Analyze a dictionary whose keys are strings and whose values are either single values or iterables of values.

Non-string values are returned as is.

Parameters:
  • input_dict – The input dictionary for analysis

  • language – Input language

  • keys_to_skip – Keys to ignore during analysis

  • kwargs – Additional keyword arguments passed to the AnalyzerEngine.analyze method, such as ad_hoc_recognizers, context, or return_decision_process. See AnalyzerEngine.analyze for the full list.
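The contract described above (skipped keys, pass-through of non-string values, analysis of strings and iterables of strings) can be illustrated with a small pure-Python sketch. It does not call Presidio: fake_analyze and sketch_analyze_dict are hypothetical stand-ins, and the (key, value, results) tuple is only an assumed simplification of DictAnalyzerResult.

```python
from __future__ import annotations

import re
from collections.abc import Iterator
from typing import Any

# Hypothetical stand-in for AnalyzerEngine.analyze: flags e-mail-like substrings.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def fake_analyze(text: str) -> list[tuple[str, int, int]]:
    """Return (entity_type, start, end) for each e-mail-like match."""
    return [("EMAIL_ADDRESS", m.start(), m.end()) for m in EMAIL.finditer(text)]


def sketch_analyze_dict(
    input_dict: dict[str, Any],
    keys_to_skip: list[str] | None = None,
) -> Iterator[tuple[str, Any, Any]]:
    """Yield (key, value, results), mimicking the documented behaviour."""
    skip = set(keys_to_skip or [])
    for key, value in input_dict.items():
        if key in skip:
            # Skipped keys are passed through without analysis.
            yield key, value, []
        elif isinstance(value, str):
            yield key, value, fake_analyze(value)
        elif isinstance(value, (list, tuple)):
            # One result list per element; non-string elements get no results.
            yield key, value, [
                fake_analyze(v) if isinstance(v, str) else [] for v in value
            ]
        else:
            # Non-string scalars are returned as is, with no findings.
            yield key, value, []


record = {"id": 7, "email": "a@b.com", "notes": ["x", "c@d.org"], "raw": "e@f.io"}
for key, value, results in sketch_analyze_dict(record, keys_to_skip=["raw"]):
    print(key, value, results)
```

Because the result is a lazy iterator, nothing is analyzed until the caller iterates over it; wrap the call in list(...) if all results are needed at once.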

analyze_iterator(
texts: collections.abc.Iterable[str | bool | float | int],
language: str,
batch_size: int = 32,
**kwargs,
) -> list[list[presidio_analyzer.RecognizerResult]]

Analyze an iterable of strings.

Parameters:
  • texts – An iterable of strings (or bool, float, or int values) to be analyzed.

  • language – Input language

  • batch_size – Number of texts to process per batch (default: 32).

  • kwargs – Additional parameters for the AnalyzerEngine.analyze method.
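As a rough illustration of what batch_size controls, the sketch below groups an iterable into chunks of at most batch_size items. It is not the library's implementation: to_batches is a hypothetical helper, and converting bool, float, and int values to strings is an assumption suggested by the type of texts rather than something stated above.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator
from itertools import islice


def to_batches(
    texts: Iterable[str | bool | float | int],
    batch_size: int = 32,
) -> Iterator[list[str]]:
    """Yield lists of at most batch_size items, each item coerced to str."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(texts)
    # islice consumes up to batch_size items per pass; an empty list ends the loop.
    while batch := list(islice(iterator, batch_size)):
        # Assumption: non-string primitives are stringified before analysis.
        yield [str(item) for item in batch]


for batch in to_batches(["a", 1, 2.5, True, "b"], batch_size=2):
    print(batch)
```

A larger batch_size means fewer processing passes at the cost of more memory per pass; the last batch may be shorter than the others.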

pii.custom_batch_analyzer_engine.logger

‘getLogger(…)’