morpheus.io.serializers
DataFrame serializers.
Functions
df_to_csv(df[, include_header, ...]) | Serializes a DataFrame into CSV and returns the serialized output separated by lines.
df_to_json(df[, strip_newlines, ...]) | Serializes a DataFrame into JSON and returns the serialized output separated by lines.
df_to_parquet(df[, strip_newlines]) | Serializes a DataFrame into Parquet and returns the serialized output separated by lines.
df_to_stream_csv(df, stream[, ...]) | Serializes a DataFrame into CSV into the provided stream object.
df_to_stream_json(df, stream[, ...]) | Serializes a DataFrame into JSON into the provided stream object.
df_to_stream_parquet(df, stream) | Serializes a DataFrame into Parquet format into the provided stream object.
write_df_to_file(df, file_name[, file_type]) | Writes the provided DataFrame into the file specified using the specified format.
- df_to_csv(df, include_header=False, strip_newlines=False, include_index_col=True)
Serializes a DataFrame into CSV and returns the serialized output separated by lines.
- Parameters
- df: DataFrameType
Input DataFrame to serialize.
- include_header: bool, optional
Whether or not to include the header, by default False.
- strip_newlines: bool, optional
Whether or not to strip the newline characters from each string, by default False.
- include_index_col: bool, optional
Write out the index as a column, by default True.
- Returns
- typing.List[str]
List of strings for each line
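To make the return shape concrete, the sketch below mimics the documented behaviour using only the standard library. It is not the Morpheus implementation: `rows_to_csv_lines` is a hypothetical helper, a list of dicts stands in for the DataFrame, and the index column is not modelled.

```python
import csv
import io


def rows_to_csv_lines(rows, include_header=False, strip_newlines=False):
    """Hypothetical stand-in for df_to_csv: returns one string per CSV line."""
    if not rows:
        return []
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    if include_header:
        writer.writeheader()
    writer.writerows(rows)
    # Keep the line terminators so that strip_newlines has a visible effect.
    lines = buffer.getvalue().splitlines(keepends=True)
    if strip_newlines:
        lines = [line.rstrip("\n") for line in lines]
    return lines


rows = [{"src": "10.0.0.1", "bytes": 512}, {"src": "10.0.0.2", "bytes": 2048}]
print(rows_to_csv_lines(rows, include_header=True, strip_newlines=True))
# ['src,bytes', '10.0.0.1,512', '10.0.0.2,2048']
```

With both flags left at their defaults, the same call would return only the two data lines, each still ending in a newline character.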
- df_to_json(df, strip_newlines=False, include_index_col=True)
Serializes a DataFrame into JSON and returns the serialized output separated by lines.
- Parameters
- df: DataFrameType
Input DataFrame to serialize.
- strip_newlines: bool, optional
Whether or not to strip the newline characters from each string, by default False.
- include_index_col: bool, optional
Write out the index as a column, by default True. Note: This value is currently being ignored due to a known issue in Pandas: https://github.com/pandas-dev/pandas/issues/37600
- Returns
- typing.List[str]
List of strings for each line.
- df_to_parquet(df, strip_newlines=False)
Serializes a DataFrame into Parquet and returns the serialized output separated by lines.
- Parameters
- df: DataFrameType
Input DataFrame to serialize.
- strip_newlines: bool, optional
Whether or not to strip the newline characters from each string, by default False.
- Returns
- typing.List[str]
List of strings for each line.
- df_to_stream_csv(df, stream, include_header=False, include_index_col=True)
Serializes a DataFrame into CSV into the provided stream object.
- Parameters
- df: DataFrameType
Input DataFrame to serialize.
- stream: IOBase
The stream where the serialized DataFrame will be written to.
- include_header: bool, optional
Whether or not to include the header, by default False.
- include_index_col: bool, optional
Write out the index as a column, by default True.
- df_to_stream_json(df, stream, include_index_col=True, lines=True)
Serializes a DataFrame into JSON into the provided stream object.
- Parameters
- df: DataFrameType
Input DataFrame to serialize.
- stream: IOBase
The stream where the serialized DataFrame will be written to.
- include_index_col: bool, optional
Write out the index as a column, by default True.
- lines: bool, optional
Write out the JSON in lines format, by default True.
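The difference between the two `lines` modes can be illustrated with a standard-library sketch. This is an approximation under stated assumptions, not the Morpheus code: `records_to_stream_json` is a hypothetical helper, plain dict records stand in for the DataFrame rows, and the non-lines output is assumed to be a single array of records.

```python
import io
import json


def records_to_stream_json(records, stream, lines=True):
    """Hypothetical stand-in for df_to_stream_json using plain dict records."""
    if lines:
        # JSON Lines: one self-contained JSON object per line.
        for record in records:
            stream.write(json.dumps(record) + "\n")
    else:
        # Assumed layout for lines=False: one JSON array holding every record.
        json.dump(records, stream)


records = [{"user": "alice", "score": 0.91}, {"user": "bob", "score": 0.17}]
stream = io.StringIO()
records_to_stream_json(records, stream)
print(stream.getvalue(), end="")
```

Because each line is a complete JSON document in lines mode, a consumer can parse the stream one line at a time without loading the whole payload.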
- df_to_stream_parquet(df, stream)
Serializes a DataFrame into Parquet format into the provided stream object.
- Parameters
- df: DataFrameType
Input DataFrame to serialize.
- stream: IOBase
The stream where the serialized DataFrame will be written to.
- write_df_to_file(df, file_name, file_type=<FileTypes.Auto: 0>, **kwargs)
Writes the provided DataFrame into the file specified using the specified format.
- Parameters
- df: DataFrameType
The DataFrame to serialize.
- file_name: str
The location to store the DataFrame.
- file_type: FileTypes, optional
The type of serialization to use. By default this is FileTypes.Auto, which determines the type from the filename extension.
- **kwargs: dict
Additional arguments forwarded to the underlying serialization function, which is one of write_df_to_file_cpp, df_to_stream_csv, or df_to_stream_json.
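Extension-based detection of the kind described for FileTypes.Auto might look roughly like the sketch below. Everything here is hypothetical: `SketchFileType`, its members other than the `Auto = 0` value shown in the signature, the extension table, and `resolve_file_type` are illustrative names and do not reflect how Morpheus actually maps extensions.

```python
import enum
import os


class SketchFileType(enum.Enum):
    """Hypothetical mirror of FileTypes; only AUTO = 0 is taken from the signature."""
    AUTO = 0
    JSON = 1
    CSV = 2
    PARQUET = 3


# Assumed extension table for this sketch only.
_EXTENSIONS = {
    ".csv": SketchFileType.CSV,
    ".json": SketchFileType.JSON,
    ".parquet": SketchFileType.PARQUET,
}


def resolve_file_type(file_name, file_type=SketchFileType.AUTO):
    """Return the explicit type, or infer one from the filename extension."""
    if file_type is not SketchFileType.AUTO:
        return file_type
    extension = os.path.splitext(file_name)[1].lower()
    if extension not in _EXTENSIONS:
        raise ValueError(f"Cannot infer a file type from extension {extension!r}")
    return _EXTENSIONS[extension]


print(resolve_file_type("out/results.csv"))
print(resolve_file_type("results.dat", SketchFileType.JSON))
```

An explicit `file_type` always wins in this sketch, so a file with an unusual extension can still be written by naming the format directly.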