Defined in File filter_detection.hpp
public mrc::pymrc::PythonNode< std::shared_ptr< MultiMessage >, std::shared_ptr< MultiMessage > >
class FilterDetectionsStage : public mrc::pymrc::PythonNode<std::shared_ptr<MultiMessage>, std::shared_ptr<MultiMessage>>
FilterDetectionsStage is used to filter rows from a dataframe based on values in a tensor or dataframe column, using a specified criterion. Rows in the meta dataframe are excluded if their associated value in the data source indicated by field_name is less than or equal to threshold.
This stage can operate in two different modes, set by the copy argument. When copy is true (default), rows that meet the filter criteria are copied into a new dataframe. When copy is false, sliced views are used instead.
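The exclusion rule can be illustrated with a minimal sketch. It is not the stage's implementation: make_keep_mask is a hypothetical helper, and a plain std::vector<float> stands in for the tensor or dataframe column named by field_name.

```cpp
#include <vector>

// Hypothetical helper, not part of the Morpheus API: builds a keep-mask for
// the rows of one message. A row is kept only when its value is strictly
// greater than the threshold; values less than or equal to the threshold
// are excluded, matching the rule described above.
std::vector<bool> make_keep_mask(const std::vector<float>& values, float threshold)
{
    std::vector<bool> mask;
    mask.reserve(values.size());
    for (float v : values)
    {
        mask.push_back(v > threshold);
    }
    return mask;
}
```

Note that a value exactly equal to the threshold is excluded, since the comparison is strict.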
copy=true should be used when the number of matching records is expected to be high and the matches are expected to fall in non-adjacent rows. In this mode, the stage generates exactly one output message for each incoming message, regardless of the size of the input and the number of matching records. However, this comes at the cost of allocating additional memory and performing the copy. Note: In most other stages, emitted messages contain a reference to the original MessageMeta emitted into the pipeline by the source stage. In copy mode this is not the case, which could cause the original MessageMeta to be deallocated after this stage.
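As a rough sketch of copy mode, assuming a simplified row type: Row and copy_matching_rows are hypothetical names used only for illustration and do not appear in the library. The point is that the output owns its own storage and keeps no reference to the input.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one dataframe row.
struct Row
{
    std::string name;
    float prob;
};

// Copies every row whose value exceeds the threshold into a new container.
// One input always yields exactly one output, however many rows match, and
// the output shares no storage with the input.
std::vector<Row> copy_matching_rows(const std::vector<Row>& rows, float threshold)
{
    std::vector<Row> out;
    for (const Row& r : rows)
    {
        if (r.prob > threshold)
        {
            out.push_back(r);  // allocation and copy: the cost of this mode
        }
    }
    return out;
}
```

Because the result is independent of the input, nothing downstream keeps the original container alive, which mirrors the MessageMeta note above.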
copy=false should be used when the number of matching records is expected to be very low or the matches are likely to be contained in adjacent rows. In this mode, slices of contiguous blocks of rows are emitted in multiple output messages. Performing a slice is relatively low-cost; however, for each incoming message the number of emitted messages could be high (in the worst case, as high as half the number of records in the incoming message). Depending on the downstream stages, this can cause performance issues, especially if those stages need to acquire the Python GIL.
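The slicing behavior and its worst case can be sketched as follows. This is an illustration under simplifying assumptions, not the stage's code: find_slices is a hypothetical helper, and each returned half-open range [start, stop) stands for one emitted output message.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns the half-open [start, stop) ranges of
// contiguous rows whose value is strictly greater than the threshold.
// In slice mode, each range would correspond to one emitted message.
std::vector<std::pair<std::size_t, std::size_t>> find_slices(const std::vector<float>& values,
                                                             float threshold)
{
    std::vector<std::pair<std::size_t, std::size_t>> slices;
    std::size_t i = 0;
    while (i < values.size())
    {
        if (values[i] > threshold)
        {
            const std::size_t start = i;
            // Extend the run while consecutive rows keep matching.
            while (i < values.size() && values[i] > threshold)
            {
                ++i;
            }
            slices.emplace_back(start, i);
        }
        else
        {
            ++i;
        }
    }
    return slices;
}
```

With alternating matching and non-matching rows, every matching row forms its own slice, so an input of six records produces three ranges. This is the half-the-records worst case described above.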
- using base_t = mrc::pymrc::PythonNode<std::shared_ptr<MultiMessage>, std::shared_ptr<MultiMessage>>
FilterDetectionsStage(float threshold, bool copy, FilterSource filter_source, std::string field_name = "probs")
Construct a new Filter Detections Stage object.
threshold – Threshold to classify; rows whose value is less than or equal to it are excluded
copy – Whether or not to perform a copy; default=true
filter_source – Indicates whether the values used for filtering exist in an output tensor (FilterSource::TENSOR) or in a column of a Dataframe
field_name – Name of the tensor or Dataframe column to filter on; default="probs"