Class FilterDetectionsStage

Base Type

  • public mrc::pymrc::PythonNode< std::shared_ptr< MultiMessage >, std::shared_ptr< MultiMessage > >

class FilterDetectionsStage : public mrc::pymrc::PythonNode<std::shared_ptr<MultiMessage>, std::shared_ptr<MultiMessage>>

FilterDetectionsStage filters rows from a dataframe based on values in a tensor or dataframe column, using a specified criterion. Rows in the meta dataframe are excluded if their associated value in the data source indicated by field_name is less than or equal to threshold.
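The exclusion rule above can be illustrated with a minimal, self-contained sketch. It does not use the Morpheus API: `keep_mask` is a hypothetical helper, and the values are assumed to be a plain vector of floats rather than a tensor or dataframe column. Its only purpose is to show the boundary behavior, where a value exactly equal to the threshold is excluded.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Morpheus): returns one flag per row,
// true when the row would be kept by the filter.
std::vector<bool> keep_mask(const std::vector<float>& values, float threshold)
{
    std::vector<bool> mask(values.size());
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        // Rows whose value is less than or equal to the threshold are excluded,
        // so a row is kept only when its value is strictly greater.
        mask[i] = values[i] > threshold;
    }
    return mask;
}
```

With a threshold of 0.5, a row with value 0.5 is dropped and only rows with strictly greater values remain.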

This stage can operate in two different modes, set by the copy argument. When copy is true (the default), rows that meet the filter criteria are copied into a new dataframe. When it is false, sliced views are used instead.

Use copy=true when the number of matching records is expected to be high and the matches are expected to fall in non-adjacent rows. In this mode, the stage generates only one output message for each incoming message, regardless of the size of the input and the number of matching records. However, this comes at the cost of allocating additional memory and performing the copy. Note: In most other stages, emitted messages contain a reference to the original MessageMeta emitted into the pipeline by the source stage. In copy mode this is not the case, which could cause the original MessageMeta to be deallocated after this stage.
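As a rough illustration of copy mode, the sketch below gathers every matching row into one newly allocated container, which corresponds to the single output message per input. The `Row` struct and `filter_copy` function are hypothetical stand-ins, not Morpheus types; the real stage operates on a MultiMessage and its dataframe.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one dataframe row; not a Morpheus type.
struct Row
{
    std::string id;
    float prob;
};

// Copy-mode sketch: matching rows are copied into one new container,
// no matter how scattered they are in the input.
std::vector<Row> filter_copy(const std::vector<Row>& rows, float threshold)
{
    std::vector<Row> out;
    for (const Row& row : rows)
    {
        if (row.prob > threshold)
        {
            out.push_back(row);  // allocation and copy are the cost of this mode
        }
    }
    return out;
}
```

Because the result owns its own copies, it no longer refers to the input container, which mirrors the note above about the original MessageMeta.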

Use copy=false when the number of matching records is expected to be very low or the matches are likely to be contained in adjacent rows. In this mode, slices of contiguous blocks of rows are emitted in multiple output messages. Performing a slice is relatively low-cost; however, for each incoming message the number of emitted messages could be high (in the worst case, as high as half the number of records in the incoming message). Depending on the downstream stages, this can cause performance issues, especially if those stages need to acquire the Python GIL.
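The slicing behavior can be sketched as finding each contiguous run of matching rows and reporting it as a half-open range. This is an assumption-laden illustration, not the stage's implementation: `matching_slices` is a hypothetical function working on a plain vector of values, and each returned range stands in for one emitted message.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Slice-mode sketch: returns half-open [start, stop) ranges, one per
// contiguous block of rows whose value is strictly above the threshold.
std::vector<std::pair<std::size_t, std::size_t>> matching_slices(const std::vector<float>& values,
                                                                 float threshold)
{
    std::vector<std::pair<std::size_t, std::size_t>> slices;
    std::size_t i = 0;
    while (i < values.size())
    {
        if (values[i] <= threshold)
        {
            ++i;  // excluded row, keep scanning
            continue;
        }
        const std::size_t start = i;
        while (i < values.size() && values[i] > threshold)
        {
            ++i;  // extend the current block of matching rows
        }
        slices.emplace_back(start, i);
    }
    return slices;
}
```

An input that alternates between matching and non-matching rows produces one single-row slice per match. Six such rows yield three slices, which is the worst case of half the number of records described above.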

Public Types

using base_t = mrc::pymrc::PythonNode<std::shared_ptr<MultiMessage>, std::shared_ptr<MultiMessage>>

Public Functions

FilterDetectionsStage(float threshold, bool copy, FilterSource filter_source, std::string field_name = "probs")

Construct a new Filter Detections Stage object.

Parameters
  • threshold – Threshold to classify; rows whose value is less than or equal to it are excluded.

  • copy – Whether or not to perform a copy (default=true).

  • filter_source – Indicates whether the values used for filtering exist in an output tensor (FilterSource::TENSOR) or in a column of a Dataframe (FilterSource::DATAFRAME).

  • field_name – Name of the tensor or Dataframe column to filter on (default="probs").

© Copyright 2023, NVIDIA. Last updated on Apr 11, 2023.