morpheus.io.data_manager.DataManager

class DataManager(storage_type='in_memory', file_format='parquet')

Bases: object

DataManager class to manage the storage and retrieval of files using either in-memory or filesystem storage.
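As a minimal sketch of choosing between the two storage modes, the helper below builds a DataManager from the constructor arguments shown in the signature above. The names `make_manager` and `use_disk` are hypothetical and are not part of Morpheus; the example assumes `'filesystem'` is the storage type string for filesystem storage, as the `store` parameter description suggests.

```python
def make_manager(use_disk: bool = False, file_format: str = "parquet"):
    """Build a DataManager with in-memory or filesystem storage.

    `make_manager` and `use_disk` are hypothetical names for this sketch.
    """
    # Imported lazily so the helper can be defined without Morpheus installed.
    from morpheus.io.data_manager import DataManager

    # Assumption: 'filesystem' is the alternative to the default 'in_memory'.
    storage_type = "filesystem" if use_disk else "in_memory"
    return DataManager(storage_type=storage_type, file_format=file_format)
```

Calling `make_manager()` should give an in-memory manager, and `make_manager(use_disk=True)` one backed by the filesystem; the `storage_type` property can be read afterwards to confirm which mode is in use.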

Attributes
manifest

Retrieve a mapping of UUIDs to their filenames or labels.

num_rows

Get the number of rows in a source given its source ID.

records

storage_type

Get the storage type used by the DataManager instance.

Methods

get_record(source_id) – Get a DataRecord instance given a source ID.
load(source_id) – Load a cuDF DataFrame given a source ID.
remove(source_id) – Remove a source using its source ID.
store(data_source[, copy_from_source, ...]) – Store a DataFrame or file path as a source and return the source ID.

get_record(source_id)

Get a DataRecord instance given a source ID.

Parameters

source_id (uuid.UUID) – UUID of the source to be retrieved.

Returns

DataRecord instance.

Return type

morpheus.io.data_record.DataRecord

load(source_id)

Load a cuDF DataFrame given a source ID.

Parameters

source_id (uuid.UUID) – UUID of the source to be loaded.

Returns

Loaded cuDF DataFrame.

Return type

cudf.DataFrame

property manifest: dict

Retrieve a mapping of UUIDs to their filenames or labels.

Returns

A dictionary containing UUID to filename/label mappings.
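Because the manifest maps UUIDs to filenames or labels, it can be searched to recover a source ID from a known label. The sketch below works on any dictionary of that shape; `ids_for_label` is a hypothetical helper, not a Morpheus function, and it assumes the manifest values are plain strings.

```python
import uuid
from typing import Dict, List


def ids_for_label(manifest: Dict[uuid.UUID, str], label: str) -> List[uuid.UUID]:
    """Return every source ID whose manifest entry equals `label`."""
    # A list is returned because nothing here guarantees labels are unique.
    return [source_id for source_id, name in manifest.items() if name == label]
```

With a manager instance, this would be called as `ids_for_label(manager.manifest, "my_label")`.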

property num_rows: int

Get the number of rows in a source given its source ID.

remove(source_id)

Remove a source using its source ID.

Parameters

source_id (uuid.UUID) – UUID of the source to be removed.

property storage_type: str

Get the storage type used by the DataManager instance.

Returns

Storage type as a string.

store(data_source, copy_from_source=False, data_label=None)

Store a DataFrame or file path as a source and return the source ID.

Parameters
  • data_source (Union[cudf.DataFrame, pandas.DataFrame, str]) – DataFrame or file path to store as a source.

  • copy_from_source (bool) – Whether to copy the data on disk when the input is a file path and the storage type is 'filesystem'.

  • data_label (Optional[str]) – Optional label for the stored data.

Returns

UUID of the stored source.

Return type

uuid.UUID
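The methods documented on this page combine into a simple store, load, remove cycle. The following is a sketch under stated assumptions rather than a definitive usage pattern: `round_trip` is a hypothetical helper, and `manager` is assumed to be any object exposing `store`, `load`, and `remove` with the signatures listed above.

```python
def round_trip(manager, data, label=None):
    """Store `data`, load it back, then remove the source.

    `round_trip` is a hypothetical helper, not part of Morpheus. `manager`
    is expected to behave like the DataManager documented on this page.
    """
    # store() returns the UUID that identifies the new source.
    source_id = manager.store(data, data_label=label)
    try:
        # load() returns the stored data (a cudf.DataFrame per the docs).
        loaded = manager.load(source_id)
    finally:
        # Remove the source even if loading fails, so nothing is left behind.
        manager.remove(source_id)
    return source_id, loaded
```

With Morpheus installed, this could be called as `round_trip(DataManager(storage_type='in_memory'), df, label='example')`, where `df` is a cuDF or pandas DataFrame.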

© Copyright 2024, NVIDIA. Last updated on Apr 25, 2024.