morpheus.io.serializers
DataFrame serializers.
Functions
| df_to_csv | Serializes a DataFrame into CSV and returns the serialized output separated by lines. |
| df_to_json | Serializes a DataFrame into JSON and returns the serialized output separated by lines. |
| df_to_parquet | Serializes a DataFrame into Parquet and returns the serialized output separated by lines. |
| df_to_stream_csv | Serializes a DataFrame into CSV into the provided stream object. |
| df_to_stream_json | Serializes a DataFrame into JSON into the provided stream object. |
| df_to_stream_parquet | Serializes a DataFrame into Parquet format into the provided stream object. |
| write_df_to_file | Writes the provided DataFrame to the specified file using the specified format. |
- df_to_csv(df, include_header=False, strip_newlines=False, include_index_col=True)
Serializes a DataFrame into CSV and returns the serialized output separated by lines.
- Parameters:
- df: DataFrameType
Input DataFrame to serialize.
- include_header: bool, optional
Whether or not to include the header, by default False.
- strip_newlines: bool, optional
Whether or not to strip the newline characters from each string, by default False.
- include_index_col: bool, optional
Write out the index as a column, by default True.
- Returns:
- typing.List[str]
List of strings for each line.
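The return shape described above (one string per CSV line) can be illustrated with plain pandas. This is a minimal sketch under stated assumptions, not the Morpheus implementation: `csv_lines_sketch` is a hypothetical helper, the input is assumed to be a pandas DataFrame, and field values are assumed to contain no embedded newlines.

```python
import io

import pandas as pd


def csv_lines_sketch(df, include_header=False, strip_newlines=False, include_index_col=True):
    """Hypothetical stand-in mimicking the documented return shape of df_to_csv."""
    buffer = io.StringIO()
    # pandas writes the CSV text into the in-memory buffer.
    df.to_csv(buffer, header=include_header, index=include_index_col)
    # One string per CSV line; line endings are kept unless strip_newlines is set.
    return buffer.getvalue().splitlines(keepends=not strip_newlines)


frame = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# Header included, newline characters stripped from each line.
stripped = csv_lines_sketch(frame, include_header=True, strip_newlines=True)

# Documented defaults: no header, index column written, newlines kept.
raw = csv_lines_sketch(frame)
```

With the defaults, each returned string still ends with its line terminator, which is convenient when the lines are later written straight to a file or socket.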
- df_to_json(df, strip_newlines=False, include_index_col=True)
Serializes a DataFrame into JSON and returns the serialized output separated by lines.
- Parameters:
- df: DataFrameType
Input DataFrame to serialize.
- strip_newlines: bool, optional
Whether or not to strip the newline characters from each string, by default False.
- include_index_col: bool, optional
Write out the index as a column, by default True. Note: this value is currently ignored because of a known pandas issue: pandas-dev/pandas#37600
- Returns:
- typing.List[str]
List of strings for each line.
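As a rough illustration of line-separated JSON output, the sketch below uses pandas directly. It is not the Morpheus implementation: `json_lines_sketch` is a hypothetical helper, and the choice of `orient="records"` is an assumption made for the example. In this orientation pandas does not emit the index, so the sketch has no `include_index_col` parameter.

```python
import json

import pandas as pd


def json_lines_sketch(df, strip_newlines=False):
    """Hypothetical stand-in mimicking the documented return shape of df_to_json."""
    # lines=True emits one JSON object per row (JSON Lines) and requires orient="records".
    text = df.to_json(orient="records", lines=True)
    # One string per row; line endings are kept unless strip_newlines is set.
    return text.splitlines(keepends=not strip_newlines)


frame = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
json_lines = json_lines_sketch(frame, strip_newlines=True)

# Each returned string is a complete JSON document and can be parsed on its own.
records = [json.loads(line) for line in json_lines]
```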
- df_to_parquet(df, strip_newlines=False, include_index_col=True)
Serializes a DataFrame into Parquet and returns the serialized output separated by lines.
- Parameters:
- df: DataFrameType
Input DataFrame to serialize.
- strip_newlines: bool, default False
Whether or not to strip the newline characters from each string.
- include_index_col: bool, default True
Write out the index as a column.
- Returns:
- typing.List[str]
List of strings for each line.
- df_to_stream_csv(df, stream, include_header=False, include_index_col=True)
Serializes a DataFrame into CSV into the provided stream object.
- Parameters:
- df: DataFrameType
Input DataFrame to serialize.
- stream: IOBase
The stream to which the serialized DataFrame is written.
- include_header: bool, optional
Whether or not to include the header, by default False.
- include_index_col: bool, optional
Write out the index as a column, by default True.
- df_to_stream_json(df, stream, include_index_col=True, lines=True)
Serializes a DataFrame into JSON into the provided stream object.
- Parameters:
- df: DataFrameType
Input DataFrame to serialize.
- stream: IOBase
The stream to which the serialized DataFrame is written.
- include_index_col: bool, optional
Write out the index as a column, by default True.
- lines: bool, optional
Write out the JSON in lines format, by default True.
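The effect of a `lines` flag when writing into a stream can be sketched with pandas and an in-memory text buffer. This is an illustration only: `stream_json_sketch` is a hypothetical helper, `orient="records"` is an assumption, and whether the Morpheus function expects a text or binary stream is not stated here.

```python
import io
import json

import pandas as pd


def stream_json_sketch(df, stream, lines=True):
    """Hypothetical stand-in: write df as JSON into an already-open text stream."""
    # lines=True writes one object per row; lines=False writes a single JSON array.
    df.to_json(stream, orient="records", lines=lines)
    return stream


frame = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

lines_text = stream_json_sketch(frame, io.StringIO(), lines=True).getvalue()
array_text = stream_json_sketch(frame, io.StringIO(), lines=False).getvalue()

# The lines format is parsed row by row; the array format is parsed in one call.
from_lines = [json.loads(line) for line in lines_text.splitlines()]
from_array = json.loads(array_text)
```

Both forms carry the same records. The lines format is easier to append to and to process incrementally, while the array form is a single valid JSON document.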
- df_to_stream_parquet(df, stream, include_index_col=True)
Serializes a DataFrame into Parquet format into the provided stream object.
- Parameters:
- df: DataFrameType
Input DataFrame to serialize.
- stream: IOBase
The stream to which the serialized DataFrame is written.
- include_index_col: bool, default True
Write out the index as a column.
- write_df_to_file(df, file_name, file_type=<FileTypes.Auto: 0>, **kwargs)
Writes the provided DataFrame to the specified file using the specified format.
- Parameters:
- df: DataFrameType
The DataFrame to serialize.
- file_name: str
The location to store the DataFrame.
- file_type: FileTypes, optional
The type of serialization to use. By default this is FileTypes.Auto, which determines the type from the filename extension.
- **kwargs: dict
Additional arguments forwarded to the underlying serialization function, which is one of write_df_to_file_cpp, df_to_stream_csv, or df_to_stream_json.
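The idea of resolving an automatic file type from the filename extension and forwarding keyword arguments to a format-specific writer can be sketched as follows. Everything named here is hypothetical: `SketchFileType`, the extension table, `resolve_file_type`, and `write_df_sketch` are illustrative stand-ins built on pandas, not the Morpheus `FileTypes` enum or its writers.

```python
import enum
import os
import tempfile

import pandas as pd


class SketchFileType(enum.Enum):
    """Hypothetical enum for this sketch only."""
    AUTO = "auto"
    CSV = "csv"
    JSON = "json"


# Assumed extension mapping, chosen for illustration.
_EXTENSION_TO_TYPE = {".csv": SketchFileType.CSV, ".json": SketchFileType.JSON}


def resolve_file_type(file_name, file_type=SketchFileType.AUTO):
    """Return the explicit type, or infer it from the extension when AUTO."""
    if file_type is not SketchFileType.AUTO:
        return file_type
    extension = os.path.splitext(file_name)[1].lower()
    if extension not in _EXTENSION_TO_TYPE:
        raise ValueError(f"Cannot infer file type from extension: {extension!r}")
    return _EXTENSION_TO_TYPE[extension]


def write_df_sketch(df, file_name, file_type=SketchFileType.AUTO, **kwargs):
    """Write df to file_name, forwarding kwargs to the pandas writer that is chosen."""
    resolved = resolve_file_type(file_name, file_type)
    if resolved is SketchFileType.CSV:
        df.to_csv(file_name, **kwargs)
    else:
        df.to_json(file_name, orient="records", lines=True, **kwargs)
    return resolved


frame = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.csv")
    # index=False is forwarded through **kwargs to DataFrame.to_csv.
    used_type = write_df_sketch(frame, path, index=False)
    with open(path) as handle:
        written = handle.read()
```

Because the keyword arguments pass through unchanged, they must be valid for whichever writer ends up being selected. An argument accepted by the CSV writer may be rejected by the JSON writer.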