morpheus.io.serializers

DataFrame serializers.

Functions

df_to_csv(df[, include_header, ...]) Serializes a DataFrame into CSV and returns the serialized output separated by lines.
df_to_json(df[, strip_newlines, ...]) Serializes a DataFrame into JSON and returns the serialized output separated by lines.
df_to_parquet(df[, strip_newlines]) Serializes a DataFrame into Parquet and returns the serialized output separated by lines.
df_to_stream_csv(df, stream[, ...]) Serializes a DataFrame into CSV and writes it to the provided stream object.
df_to_stream_json(df, stream[, ...]) Serializes a DataFrame into JSON and writes it to the provided stream object.
df_to_stream_parquet(df, stream) Serializes a DataFrame into Parquet format and writes it to the provided stream object.
write_df_to_file(df, file_name[, file_type]) Writes the provided DataFrame into the file specified using the specified format.
df_to_csv(df, include_header=False, strip_newlines=False, include_index_col=True)

Serializes a DataFrame into CSV and returns the serialized output separated by lines.

Parameters
df

Input DataFrame to serialize.

include_header

Whether or not to include the header, by default False.

strip_newlines

Whether or not to strip the newline characters from each string, by default False.

include_index_col: bool, optional

Write out the index as a column, by default True.

Returns
typing.List[str]

List of strings for each line.
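
The shape of the return value can be illustrated with plain pandas. The following is a minimal sketch, not the Morpheus implementation: `csv_lines` is a hypothetical helper that mirrors the documented parameters, and it assumes that `strip_newlines` only controls whether each returned string keeps its line ending.

```python
import pandas as pd


def csv_lines(df, include_header=False, strip_newlines=False, include_index_col=True):
    """Hypothetical stand-in mirroring the documented parameters of df_to_csv."""
    # pandas returns the whole CSV document as one string when no path is given.
    text = df.to_csv(header=include_header, index=include_index_col)
    # One string per row; keep each line ending unless asked to strip it (assumption).
    return text.splitlines(keepends=not strip_newlines)


df = pd.DataFrame({"user": ["alice", "bob"], "score": [1, 2]})

# Defaults: no header row, index written as the first column.
rows = csv_lines(df, strip_newlines=True)

# Header included, index column dropped.
rows_with_header = csv_lines(
    df, include_header=True, strip_newlines=True, include_index_col=False
)
```

With the documented defaults, the index appears as the first field of every row and no header row is emitted, which is worth keeping in mind when the output is read back by another tool.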

df_to_json(df, strip_newlines=False, include_index_col=True)

Serializes a DataFrame into JSON and returns the serialized output separated by lines.

Parameters
df

Input DataFrame to serialize.

strip_newlines

Whether or not to strip the newline characters from each string, by default False.

include_index_col: bool, optional

Write out the index as a column, by default True. Note: this value is currently ignored because of a known issue in pandas: https://github.com/pandas-dev/pandas/issues/37600

Returns
typing.List[str]

List of strings for each line.

df_to_parquet(df, strip_newlines=False)

Serializes a DataFrame into Parquet and returns the serialized output separated by lines.

Parameters
df

Input DataFrame to serialize.

strip_newlines

Whether or not to strip the newline characters from each string, by default False.

Returns
typing.List[str]

List of strings for each line.

df_to_stream_csv(df, stream, include_header=False, include_index_col=True)

Serializes a DataFrame into CSV and writes it to the provided stream object.

Parameters
df

Input DataFrame to serialize.

stream

The stream to which the serialized DataFrame will be written.

include_header

Whether or not to include the header, by default False.

include_index_col: bool, optional

Write out the index as a column, by default True.

df_to_stream_json(df, stream, include_index_col=True, lines=True)

Serializes a DataFrame into JSON and writes it to the provided stream object.

Parameters
df

Input DataFrame to serialize.

stream

The stream to which the serialized DataFrame will be written.

include_index_col: bool, optional

Write out the index as a column, by default True.

lines

Write out the JSON in lines format, by default True.
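
To show what "lines format" means when writing to a stream, here is a minimal sketch that uses pandas directly instead of `df_to_stream_json`. The use of `orient="records"` is an assumption about the record layout, not something this page states. Each row becomes one JSON object on its own line in an in-memory text stream.

```python
import io
import json

import pandas as pd

df = pd.DataFrame({"user": ["alice", "bob"], "score": [1, 2]})

# Any writable text stream works here; StringIO keeps the example in memory.
stream = io.StringIO()

# Assumption: one JSON object per row (orient="records"), one row per line.
df.to_json(stream, orient="records", lines=True)

# Read the stream back and parse each non-empty line independently.
records = [json.loads(line) for line in stream.getvalue().splitlines() if line]
```

Because every line is a complete JSON document, a consumer can process the stream incrementally without loading the whole payload.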

df_to_stream_parquet(df, stream)

Serializes a DataFrame into Parquet format and writes it to the provided stream object.

Parameters
df

Input DataFrame to serialize.

stream

The stream to which the serialized DataFrame will be written.

write_df_to_file(df, file_name, file_type=<FileTypes.Auto: 0>, **kwargs)

Writes the provided DataFrame into the file specified using the specified format.

Parameters
df

The DataFrame to serialize.

file_name

The location to store the DataFrame.

file_type

The type of serialization to use. By default this is FileTypes.Auto, which determines the type from the filename extension.

**kwargs

Additional arguments forwarded to the underlying serialization function, which is one of write_df_to_file_cpp, df_to_stream_csv, or df_to_stream_json.
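
The automatic type detection described for file_type can be sketched as a lookup on the filename extension. Everything in this example is hypothetical: `FileKind` stands in for `FileTypes`, and the extension table and `resolve_kind` helper are assumptions chosen for illustration, since this page does not list which extensions Morpheus recognizes.

```python
import enum
import os


class FileKind(enum.Enum):
    """Hypothetical stand-in for FileTypes; only Auto = 0 is taken from this page."""
    AUTO = 0
    JSON = 1
    CSV = 2
    PARQUET = 3


# Assumed extension table, for illustration only.
_EXTENSION_MAP = {
    ".json": FileKind.JSON,
    ".csv": FileKind.CSV,
    ".parquet": FileKind.PARQUET,
}


def resolve_kind(file_name, kind=FileKind.AUTO):
    """Return the explicit kind, or infer it from the extension when kind is AUTO."""
    if kind is not FileKind.AUTO:
        return kind
    # splitext returns only the final suffix, e.g. ".csv" for "out/data.csv".
    ext = os.path.splitext(file_name)[1].lower()
    if ext not in _EXTENSION_MAP:
        raise ValueError(f"Cannot infer file type from extension {ext!r}")
    return _EXTENSION_MAP[ext]
```

Passing an explicit type skips inference entirely, which is useful when a file name carries an unconventional extension.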

© Copyright 2024, NVIDIA. Last updated on Apr 11, 2024.