morpheus.io.serializers

Functions

df_to_csv(df[, include_header, ...])

Serializes a DataFrame into CSV and returns the serialized output as a list of lines.

df_to_json(df[, strip_newlines, ...])

Serializes a DataFrame into JSON and returns the serialized output as a list of lines.

df_to_parquet(df[, strip_newlines])

Serializes a DataFrame into Parquet and returns the serialized output as a list of lines.

df_to_stream_csv(df, stream[, ...])

Serializes a DataFrame to CSV and writes it to the provided stream object.

df_to_stream_json(df, stream[, ...])

Serializes a DataFrame to JSON and writes it to the provided stream object.

df_to_stream_parquet(df, stream)

Serializes a DataFrame to Parquet format and writes it to the provided stream object.

write_df_to_file(df, file_name[, file_type])

Writes the provided DataFrame to the specified file using the specified format.

df_to_csv(df, include_header=False, strip_newlines=False, include_index_col=True)

Serializes a DataFrame into CSV and returns the serialized output as a list of lines.

Parameters
df: cudf.DataFrame

Input DataFrame to serialize.

include_header: bool, optional

Whether or not to include the header, by default False.

strip_newlines: bool, optional

Whether or not to strip the newline characters from each string, by default False.

include_index_col: bool, optional

Write out the index as a column, by default True.

Returns
typing.List[str]

List of strings for each line.
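
As a minimal usage sketch, the returned list can be joined back into a single CSV string. The helper names `join_lines` and `csv_text` are hypothetical and not part of Morpheus. The reference does not state whether each returned line keeps its trailing newline when `strip_newlines` is False, so the helper tolerates both cases. The Morpheus import is deferred so the joining helper can be loaded without a Morpheus installation.

```python
from typing import List


def join_lines(lines: List[str]) -> str:
    """Join serialized lines into one string, ending every line with exactly one newline."""
    return "".join(line.rstrip("\n") + "\n" for line in lines)


def csv_text(df) -> str:
    """Serialize a cudf.DataFrame to a single CSV string with a header row."""
    # Deferred import: requires a working Morpheus installation.
    from morpheus.io.serializers import df_to_csv

    lines = df_to_csv(df, include_header=True, include_index_col=False)
    return join_lines(lines)
```

Passing `include_index_col=False` here is only a choice made for the example; leave it at the default if the index should appear as a column.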

df_to_json(df, strip_newlines=False, include_index_col=True)

Serializes a DataFrame into JSON and returns the serialized output as a list of lines.

Parameters
df: cudf.DataFrame

Input DataFrame to serialize.

strip_newlines: bool, optional

Whether or not to strip the newline characters from each string, by default False.

include_index_col: bool, optional

Write out the index as a column, by default True. Note: this value is currently ignored because of a known issue in pandas: https://github.com/pandas-dev/pandas/issues/37600

Returns
typing.List[str]

List of strings for each line.
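
A short sketch of consuming the result, assuming each returned line holds one JSON-encoded record (a JSON Lines layout). The reference only says the output is split into lines, so treat that layout as an assumption. `parse_json_lines` is a hypothetical helper, not part of Morpheus.

```python
import json
from typing import Any, Dict, List


def parse_json_lines(lines: List[str]) -> List[Dict[str, Any]]:
    """Decode one JSON object per non-blank line."""
    records = []
    for line in lines:
        text = line.strip()
        if text:  # skip blank lines
            records.append(json.loads(text))
    return records


# Intended use (requires Morpheus and a cudf.DataFrame named df):
#   from morpheus.io.serializers import df_to_json
#   records = parse_json_lines(df_to_json(df, strip_newlines=True))
```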

df_to_parquet(df, strip_newlines=False)

Serializes a DataFrame into Parquet and returns the serialized output as a list of lines.

Parameters
df: cudf.DataFrame

Input DataFrame to serialize.

strip_newlines: bool, optional

Whether or not to strip the newline characters from each string, by default False.

Returns
typing.List[str]

List of strings for each line.

df_to_stream_csv(df, stream, include_header=False, include_index_col=True)

Serializes a DataFrame to CSV and writes it to the provided stream object.

Parameters
df: cudf.DataFrame

Input DataFrame to serialize.

stream: IOBase

The stream to which the serialized DataFrame is written.

include_header: bool, optional

Whether or not to include the header, by default False.

include_index_col: bool, optional

Write out the index as a column, by default True.
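
One way to try a stream serializer without touching the filesystem is to hand it an in-memory buffer. The sketch below assumes a text stream (`io.StringIO`) is acceptable for CSV output, which the reference does not confirm. `capture_text` is a hypothetical helper that accepts any callable with a `(df, stream, **kwargs)` signature.

```python
import io
from typing import Any, Callable


def capture_text(write_fn: Callable[..., Any], df: Any, **kwargs: Any) -> str:
    """Run a (df, stream, **kwargs) serializer against an in-memory text buffer."""
    buffer = io.StringIO()  # io.StringIO is an io.IOBase subclass
    write_fn(df, buffer, **kwargs)
    return buffer.getvalue()


# Intended use (requires Morpheus and a cudf.DataFrame named df):
#   from morpheus.io.serializers import df_to_stream_csv
#   text = capture_text(df_to_stream_csv, df, include_header=True)
```

Taking the serializer as an argument keeps the helper usable with `df_to_stream_json` as well, since it shares the `df, stream` leading parameters.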

df_to_stream_json(df, stream, include_index_col=True)

Serializes a DataFrame to JSON and writes it to the provided stream object.

Parameters
df: cudf.DataFrame

Input DataFrame to serialize.

stream: IOBase

The stream to which the serialized DataFrame is written.

include_index_col: bool, optional

Write out the index as a column, by default True.

df_to_stream_parquet(df, stream)

Serializes a DataFrame to Parquet format and writes it to the provided stream object.

Parameters
df: cudf.DataFrame

Input DataFrame to serialize.

stream: IOBase

The stream to which the serialized DataFrame is written.
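
Because Parquet is a binary format, a binary buffer such as `io.BytesIO` is the likely choice of stream, although the reference does not say so explicitly. The sketch below makes that assumption and adds a quick sanity check based on the `PAR1` marker that begins and ends an unencrypted Parquet file. `looks_like_parquet` and `parquet_bytes` are hypothetical helpers, not part of Morpheus.

```python
import io

PARQUET_MAGIC = b"PAR1"  # 4-byte marker at both ends of an unencrypted Parquet file


def looks_like_parquet(data: bytes) -> bool:
    """Return True if data starts and ends with the Parquet magic bytes."""
    return (
        len(data) >= 2 * len(PARQUET_MAGIC)
        and data.startswith(PARQUET_MAGIC)
        and data.endswith(PARQUET_MAGIC)
    )


def parquet_bytes(df) -> bytes:
    """Serialize df into an in-memory binary buffer and return the bytes."""
    # Deferred import: requires a working Morpheus installation.
    from morpheus.io.serializers import df_to_stream_parquet

    buffer = io.BytesIO()  # assumption: a binary stream is expected
    df_to_stream_parquet(df, buffer)
    return buffer.getvalue()
```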

write_df_to_file(df, file_name, file_type=<FileTypes.Auto: 0>, **kwargs)

Writes the provided DataFrame to the specified file using the specified format.

Parameters
df: typing.Union[pd.DataFrame, cudf.DataFrame]

The DataFrame to serialize.

file_name: str

The location to store the DataFrame.

file_type: FileTypes, optional

The type of serialization to use. By default this is FileTypes.Auto, which determines the type from the filename extension.
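
To illustrate what extension-based detection means, here is a minimal stand-alone sketch. It is not the Morpheus implementation: the mapping, the format names, and the `guess_format` helper are all hypothetical, and the set of extensions that `FileTypes.Auto` actually recognizes may differ.

```python
import os

# Illustrative mapping only; the extensions Morpheus actually recognizes may differ.
EXTENSION_TO_FORMAT = {
    ".csv": "csv",
    ".json": "json",
    ".parquet": "parquet",
}


def guess_format(file_name: str) -> str:
    """Return a format name based on the (case-insensitive) file extension."""
    _, ext = os.path.splitext(file_name)
    try:
        return EXTENSION_TO_FORMAT[ext.lower()]
    except KeyError:
        raise ValueError(f"Cannot infer format from extension {ext!r}") from None
```

With Morpheus installed, a call such as `write_df_to_file(df, "results.csv")` would rely on the library's own detection rather than on this helper.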

© Copyright 2023, NVIDIA. Last updated on Apr 11, 2023.