NVIDIA Morpheus (25.02.01)

morpheus.utils.column_info

Functions

`column_listjoin`(df, col_name)	Returns the array series `df[col_name]` as a flattened string series.
`create_increment_col`(df, column_name[, ...])	Creates a new integer column that counts unique occurrences of values in `column_name`. Rows are grouped per `period` (per day by default) using the timestamps in `timestamp_column`, and then by `groupby_column`. The counts increment from `1`.
`process_dataframe`(df_in, input_schema)	Processes a dataframe according to the given schema.

Classes

`BoolColumn`(name, dtype, input_name[, ...])	Subclass of `RenameColumn`, adds the ability to map a set of custom values to boolean values.
`ColumnInfo`(name, dtype)	Defines a single column and type-cast.
`CustomColumn`(name, dtype, process_column_fn)	Subclass of `ColumnInfo`, defines a column to be computed by a user-defined function `process_column_fn`.
`DataFrameInputSchema`([json_columns, ...])	Defines the schema specifying the columns to be included in the output `DataFrame`.
`DateTimeColumn`(name, dtype, input_name)	Subclass of `RenameColumn`, specific to casting UTC localized datetime values.
`DistinctIncrementColumn`(name, dtype, input_name)	Subclass of `RenameColumn`, counts the unique occurrences of a value in `groupby_column` over a specific time window `period` based on dates in the `timestamp_column` field.
`IncrementColumn`(name, dtype, input_name, ...)	Subclass of `DateTimeColumn`, counts the unique occurrences of a value in `groupby_column` over a specific time window `period` based on dates in the `input_name` field.
`PreparedDFInfo`(df, columns_to_preserve)	Represents the result of preparing a DataFrame along with the available columns to be preserved.
`RenameColumn`(name, dtype, input_name)	Subclass of `ColumnInfo`, adds the ability to also perform a rename.
`StringCatColumn`(name, dtype, input_columns, sep)	Subclass of `ColumnInfo`, concatenates values from multiple columns into a new string column separated by `sep`.
`StringJoinColumn`(name, dtype, input_name, sep)	Subclass of `RenameColumn`, converts incoming `list` values to string by joining by `sep`.

column_listjoin(df, col_name)

Returns the array series df[col_name] as a flattened string series.

Parameters

df (pandas.DataFrame): The dataframe from which to get the column.
col_name (str): The column to transform.

Returns

pandas.Series: A series with the arrays in the column flattened to strings.
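The flattening described above can be illustrated with a minimal sketch in plain pandas. This is not the Morpheus implementation: the helper name `listjoin_sketch`, the sample data, and the comma separator are all assumptions made for illustration.

```python
import pandas as pd


def listjoin_sketch(df: pd.DataFrame, col_name: str, sep: str = ",") -> pd.Series:
    """Join each list in df[col_name] into a single string.

    Hypothetical helper; the separator is an assumption, not taken
    from the Morpheus documentation.
    """
    return df[col_name].apply(lambda items: sep.join(items)).astype("string")


# Each row holds a list of strings (made-up sample data).
df = pd.DataFrame({"groups": [["admin", "dev"], ["ops"], []]})
flat = listjoin_sketch(df, "groups")
print(flat.tolist())  # ['admin,dev', 'ops', '']
```

An empty list becomes an empty string in this sketch. How the library handles missing columns or null entries is not stated here.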

create_increment_col(df, column_name, groupby_column='username', timestamp_column='timestamp', period='D')

Creates a new integer column that counts unique occurrences of values in column_name. Rows are grouped per period (per day by default) using the timestamps in timestamp_column, and then by groupby_column. The counts increment from 1.

Parameters

df (pandas.DataFrame): The input dataframe.
column_name (str): Name of the column in which unique occurrences are counted.
groupby_column (str, default "username"): The column to group by.
timestamp_column (str, default "timestamp"): The column containing timestamp values.
period (str, default "D"): The period to group by.

Returns

pandas.Series: The new column with incrementing values.
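One way to picture the counting behavior is the sketch below, written in plain pandas. It is an approximation under stated assumptions, not the library's code. It assumes timezone-naive timestamps and no missing values, and the helper name `increment_col_sketch` and the sample data are hypothetical.

```python
import pandas as pd


def increment_col_sketch(df, column_name, groupby_column="username",
                         timestamp_column="timestamp", period="D"):
    """Running count of distinct values per (period, group).

    Hypothetical helper; assumes no NaN values in any of the columns.
    """
    # Bucket each row into its period (per day when period="D").
    per_period = df[timestamp_column].dt.to_period(period)
    keys = [per_period, df[groupby_column]]
    # Number each distinct value by order of first appearance, starting at 1.
    first_seen = df.groupby(keys)[column_name].transform(
        lambda s: pd.factorize(s)[0] + 1
    )
    # The running maximum gives "distinct values seen so far" in the group.
    return first_seen.groupby(keys).cummax().astype("int64")


df = pd.DataFrame({
    "username": ["alice", "alice", "alice", "bob", "alice"],
    "timestamp": pd.to_datetime([
        "2024-01-01 08:00", "2024-01-01 09:00", "2024-01-01 10:00",
        "2024-01-01 11:00", "2024-01-02 08:00",
    ]),
    "app": ["mail", "chat", "mail", "mail", "chat"],
})
counts = increment_col_sketch(df, "app")
print(counts.tolist())  # [1, 2, 2, 1, 1]
```

In the sample, alice's third event on the first day repeats "mail", so her count stays at 2. The count restarts at 1 for bob and again for alice on the following day.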

process_dataframe(df_in, input_schema)

Processes a dataframe according to the given schema.

Parameters

df_in (pandas.DataFrame or cudf.DataFrame): The input dataframe to process.
input_schema (object): The schema used to process the dataframe.

Returns

pandas.DataFrame: The processed dataframe.
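To make the idea of schema-driven processing concrete, here is a small stand-in written in plain pandas. It does not call Morpheus. `SketchColumn` and `apply_schema_sketch` are hypothetical names that only mimic the rename-and-cast behavior described for `RenameColumn`, and the sample data is made up.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class SketchColumn:
    """Hypothetical stand-in for a schema column; not a Morpheus class."""
    name: str        # column name in the output frame
    dtype: str       # dtype to cast the values to
    input_name: str  # column name in the input frame


def apply_schema_sketch(df_in: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Build an output frame containing only the columns the schema lists."""
    out = pd.DataFrame(index=df_in.index)
    for col in columns:
        # Rename (input_name -> name) and type-cast in one step.
        out[col.name] = df_in[col.input_name].astype(col.dtype)
    return out


events = pd.DataFrame({
    "userName": ["alice", "bob"],
    "bytes": ["10", "25"],
    "ignored": [0, 1],
})
schema = [
    SketchColumn("username", "string", "userName"),
    SketchColumn("bytes_sent", "int64", "bytes"),
]
processed = apply_schema_sketch(events, schema)
print(list(processed.columns))  # ['username', 'bytes_sent']
```

Columns not named in the schema (here, "ignored") are dropped in this sketch. In the library, the schema decides which columns are included in the output `DataFrame`.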
