morpheus.utils.column_info

Functions

column_listjoin(df, col_name)

Returns the array series df[col_name] as a flattened string series.

create_increment_col(df, column_name[, ...])

Creates a new integer column that counts unique occurrences of values in column_name. Rows are grouped per day using the timestamp values in timestamp_column, then by groupby_column; the returned values increment starting at 1.

process_dataframe(df_in, input_schema)

Applies column transformations as defined by input_schema.

Classes

BoolColumn(name, dtype, input_name[, ...])

Subclass of RenameColumn, adds the ability to map a set of custom values to boolean values.

ColumnInfo(name, dtype)

Defines a single column and type-cast.

CustomColumn(name, dtype, process_column_fn)

Subclass of ColumnInfo, defines a column to be computed by a user-defined function process_column_fn.

DataFrameInputSchema([json_columns, ...])

Defines the schema specifying the columns to be included in the output DataFrame.

DateTimeColumn(name, dtype, input_name)

Subclass of RenameColumn, specific to casting UTC-localized datetime values.

IncrementColumn(name, dtype, input_name, ...)

Subclass of DateTimeColumn, counts the unique occurrences of a value in groupby_column over a specific time period, based on the dates in the input_name field.

RenameColumn(name, dtype, input_name)

Subclass of ColumnInfo, adds the ability to also perform a rename.

StringCatColumn(name, dtype, input_columns, sep)

Subclass of ColumnInfo, concatenates values from multiple columns into a new string column separated by sep.

StringJoinColumn(name, dtype, input_name, sep)

Subclass of RenameColumn, converts incoming list values to a string by joining them with sep.
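To make the class descriptions above more concrete, here is a minimal pandas-only sketch of the kind of transformation each column type describes. It does not import or call Morpheus; the input column names, the value mapping, and the separators are all assumptions chosen for illustration, and the library's own behavior may differ in detail.

```python
import pandas as pd

# Hypothetical raw input; none of these column names come from Morpheus.
raw = pd.DataFrame({
    "user": ["alice", "bob"],
    "status": ["success", "failure"],
    "city": ["Berlin", "Austin"],
    "country": ["DE", "US"],
    "roles": [["admin", "dev"], ["ops"]],
    "time": ["2023-04-10T08:00:00Z", "2023-04-11T09:30:00Z"],
})

out = pd.DataFrame(index=raw.index)

# RenameColumn-style: take an input column, give it a new name, cast it.
out["username"] = raw["user"].astype("string")

# BoolColumn-style: map a set of custom values to booleans.
out["ok"] = raw["status"].map({"success": True, "failure": False}).astype(bool)

# DateTimeColumn-style: parse values as UTC-localized datetimes.
out["timestamp"] = pd.to_datetime(raw["time"], utc=True)

# StringCatColumn-style: concatenate several columns with a separator.
out["location"] = raw["city"].str.cat(raw["country"], sep=", ")

# StringJoinColumn-style: join each list value into a single string.
out["roles"] = raw["roles"].apply(", ".join)

print(out)
```

Each output column is derived independently from the input frame, which is the property that lets a schema list them declaratively.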

column_listjoin(df, col_name)

Returns the array series df[col_name] as a flattened string series.
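As a rough illustration of flattening an array series into a string series, the following pandas-only sketch joins each list into one string. The helper name listjoin_sketch and its sep parameter are hypothetical; the documented function takes only df and col_name, and the separator it actually uses is not stated here.

```python
import pandas as pd


def listjoin_sketch(df: pd.DataFrame, col_name: str, sep: str = ",") -> pd.Series:
    """Join each list in df[col_name] into one string (illustration only)."""
    # `sep` is an assumption; the documented signature has no separator argument.
    joined = df[col_name].apply(lambda items: sep.join(str(i) for i in items))
    return joined.astype("string")


df = pd.DataFrame({"tags": [["a", "b"], ["c"], []]})
joined = listjoin_sketch(df, "tags")
print(joined.tolist())
```

An empty list becomes an empty string in this sketch; how the library treats empty or missing lists is not covered by the description above.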

create_increment_col(df, column_name, groupby_column='username', timestamp_column='timestamp')

Creates a new integer column that counts unique occurrences of values in column_name. Rows are grouped per day using the timestamp values in timestamp_column, then by groupby_column; the returned values increment starting at 1.
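One plausible reading of this description can be sketched in plain pandas: within each (day, group) bucket, every distinct value of column_name receives a number starting at 1, in order of first appearance. The helper name increment_col_sketch and the sample data are hypothetical, and whether the library numbers repeated values the same way is an assumption. Missing values are not handled here.

```python
import pandas as pd


def increment_col_sketch(df, column_name, groupby_column="username",
                         timestamp_column="timestamp"):
    """Number distinct values of column_name from 1 within each (day, group)."""
    # Parse timestamps as UTC and truncate them to the calendar day.
    day = pd.to_datetime(df[timestamp_column], utc=True).dt.floor("D")
    # factorize labels distinct values 0, 1, 2, ... in order of appearance;
    # adding 1 makes the numbering start at 1.
    codes = df.groupby([day, df[groupby_column]])[column_name].transform(
        lambda s: pd.factorize(s)[0] + 1
    )
    return codes.astype(int)


df = pd.DataFrame({
    "timestamp": [
        "2023-04-10T08:00:00Z",
        "2023-04-10T09:00:00Z",
        "2023-04-10T10:00:00Z",
        "2023-04-10T11:00:00Z",
        "2023-04-11T08:00:00Z",
    ],
    "username": ["alice", "alice", "alice", "bob", "alice"],
    "location": ["US", "DE", "US", "US", "FR"],
})
df["locincrement"] = increment_col_sketch(df, "location")
print(df[["username", "location", "locincrement"]])
```

In this sample, alice's second location on April 10 gets the value 2, her return to the first location reuses 1, and the numbering restarts for bob and for the next day.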

process_dataframe(df_in, input_schema)

Applies column transformations as defined by input_schema.
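The general idea of schema-driven processing can be sketched without Morpheus: a list of column definitions is applied to an input frame, and only the listed columns appear in the output. ColumnSpec and apply_schema_sketch below are hypothetical stand-ins, not the library's classes or function, and the real input_schema likely carries more options than this toy version.

```python
from dataclasses import dataclass
from typing import Callable, List

import pandas as pd


@dataclass
class ColumnSpec:
    """Hypothetical column definition: output name, dtype, and how to compute it."""
    name: str
    dtype: str
    compute: Callable[[pd.DataFrame], pd.Series]


def apply_schema_sketch(df_in: pd.DataFrame, specs: List[ColumnSpec]) -> pd.DataFrame:
    """Build an output frame that contains only the columns listed in specs."""
    out = pd.DataFrame(index=df_in.index)
    for spec in specs:
        # Compute the column from the input frame, then cast to the target dtype.
        out[spec.name] = spec.compute(df_in).astype(spec.dtype)
    return out


df_in = pd.DataFrame({"user": ["alice", "bob"], "bytes": ["10", "25"], "unused": [0, 1]})
specs = [
    ColumnSpec("username", "string", lambda df: df["user"]),       # rename + cast
    ColumnSpec("bytes", "int64", lambda df: df["bytes"]),          # cast only
    ColumnSpec("bytes_doubled", "int64",
               lambda df: df["bytes"].astype("int64") * 2),        # custom function
]
result = apply_schema_sketch(df_in, specs)
print(result)
```

Keeping the transformations in a list of definitions separates what the output should look like from the loop that produces it, which is the design the schema classes above suggest.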

© Copyright 2023, NVIDIA. Last updated on Apr 11, 2023.