morpheus.io.serializers
Functions
| df_to_csv | Serializes a DataFrame into CSV and returns the serialized output separated by lines. |
| df_to_json | Serializes a DataFrame into JSON and returns the serialized output separated by lines. |
| df_to_parquet | Serializes a DataFrame into Parquet and returns the serialized output separated by lines. |
| df_to_stream_csv | Serializes a DataFrame into CSV into the provided stream object. |
| df_to_stream_json | Serializes a DataFrame into JSON into the provided stream object. |
| df_to_stream_parquet | Serializes a DataFrame into Parquet format into the provided stream object. |
| write_df_to_file | Writes the provided DataFrame into the file specified using the specified format. |
- df_to_csv(df, include_header=False, strip_newlines=False, include_index_col=True)
Serializes a DataFrame into CSV and returns the serialized output separated by lines.
- Parameters
- df : cudf.DataFrame
Input DataFrame to serialize.
- include_header : bool, optional
Whether to include the header, by default False.
- strip_newlines : bool, optional
Whether to strip the newline characters from each string, by default False.
- include_index_col : bool, optional
Whether to write out the index as a column, by default True.
- Returns
- typing.List[str]
List of strings, one for each line.
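The effect of the parameters above can be illustrated without a GPU. The following is a minimal standard-library sketch, not the Morpheus implementation: `rows_to_csv_lines` is a hypothetical helper that takes plain column names and row tuples in place of a `cudf.DataFrame`, uses the row position as a stand-in for the index, and mirrors the documented defaults (no header, newlines kept, index written as the first column).

```python
import csv
import io
from typing import List, Sequence


def rows_to_csv_lines(columns: Sequence[str],
                      rows: Sequence[Sequence[object]],
                      include_header: bool = False,
                      strip_newlines: bool = False,
                      include_index_col: bool = True) -> List[str]:
    """Hypothetical stand-in for df_to_csv that works on plain Python rows."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")

    if include_header:
        # Assumption: an unnamed index gets an empty header cell.
        writer.writerow(([""] if include_index_col else []) + list(columns))

    for index, row in enumerate(rows):
        # The row position stands in for the DataFrame index.
        writer.writerow(([index] if include_index_col else []) + list(row))

    # Rewind the buffer and split it into one string per line.
    buf.seek(0)
    lines = buf.readlines()

    if strip_newlines:
        lines = [line.rstrip("\n") for line in lines]
    return lines


lines = rows_to_csv_lines(["name", "score"], [("a", 1), ("b", 2)],
                          include_header=True, strip_newlines=True)
print(lines)
```

With the defaults, each returned string keeps its trailing newline, which is convenient when the lines are later concatenated into a single payload.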
- df_to_json(df, strip_newlines=False, include_index_col=True)
Serializes a DataFrame into JSON and returns the serialized output separated by lines.
- Parameters
- df : cudf.DataFrame
Input DataFrame to serialize.
- strip_newlines : bool, optional
Whether to strip the newline characters from each string, by default False.
- include_index_col : bool, optional
Whether to write out the index as a column, by default True. Note: this value is currently ignored due to a known issue in Pandas: https://github.com/pandas-dev/pandas/issues/37600
- Returns
- typing.List[str]
List of strings, one for each line.
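As a rough illustration of "serialized output separated by lines", the sketch below emits one JSON object per record using only the standard library. The one-record-per-line layout is an assumption, since this page does not state the JSON orientation, and `records_to_json_lines` is a hypothetical helper that operates on a list of dicts rather than a `cudf.DataFrame`.

```python
import json
from typing import Dict, List


def records_to_json_lines(records: List[Dict[str, object]],
                          strip_newlines: bool = False) -> List[str]:
    """Hypothetical stand-in for df_to_json that works on plain dicts."""
    # One JSON document per record, each terminated by a newline.
    lines = [json.dumps(record) + "\n" for record in records]
    if strip_newlines:
        lines = [line.rstrip("\n") for line in lines]
    return lines


json_lines = records_to_json_lines([{"name": "a", "score": 1}],
                                   strip_newlines=True)
print(json_lines)
```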
- df_to_parquet(df, strip_newlines=False)
Serializes a DataFrame into Parquet and returns the serialized output separated by lines.
- Parameters
- df : cudf.DataFrame
Input DataFrame to serialize.
- strip_newlines : bool, optional
Whether to strip the newline characters from each string, by default False.
- Returns
- typing.List[str]
List of strings, one for each line.
- df_to_stream_csv(df, stream, include_header=False, include_index_col=True)
Serializes a DataFrame into CSV into the provided stream object.
- Parameters
- df : cudf.DataFrame
Input DataFrame to serialize.
- stream : IOBase
The stream to which the serialized DataFrame will be written.
- include_header : bool, optional
Whether to include the header, by default False.
- include_index_col : bool, optional
Whether to write out the index as a column, by default True.
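The `stream` argument is described only as an `IOBase`. A minimal sketch of that contract, assuming a text-mode stream and using a hypothetical `write_rows_csv` helper in place of the Morpheus function, shows that the caller owns the stream and can pass an in-memory buffer or `sys.stdout` interchangeably:

```python
import csv
import io
import sys
from typing import IO, Sequence


def write_rows_csv(columns: Sequence[str],
                   rows: Sequence[Sequence[object]],
                   stream: IO[str],
                   include_header: bool = False) -> None:
    """Hypothetical stand-in: write rows as CSV to a caller-supplied stream."""
    writer = csv.writer(stream, lineterminator="\n")
    if include_header:
        writer.writerow(columns)
    writer.writerows(rows)
    # The stream is neither rewound nor closed; that is left to the caller.


# Capture the output in memory.
buffer = io.StringIO()
write_rows_csv(["name", "score"], [("a", 1)], buffer, include_header=True)
captured = buffer.getvalue()

# The same call works against any other text stream.
write_rows_csv(["name", "score"], [("a", 1)], sys.stdout)
```

Leaving the stream open lets the caller append further output or rewind the buffer to read it back.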
- df_to_stream_json(df, stream, include_index_col=True)
Serializes a DataFrame into JSON into the provided stream object.
- Parameters
- df : cudf.DataFrame
Input DataFrame to serialize.
- stream : IOBase
The stream to which the serialized DataFrame will be written.
- include_index_col : bool, optional
Whether to write out the index as a column, by default True.
- df_to_stream_parquet(df, stream)
Serializes a DataFrame into Parquet format into the provided stream object.
- Parameters
- df : cudf.DataFrame
Input DataFrame to serialize.
- stream : IOBase
The stream to which the serialized DataFrame will be written.
- write_df_to_file(df, file_name, file_type=<FileTypes.Auto: 0>, **kwargs)
Writes the provided DataFrame into the file specified using the specified format.
- Parameters
- df : typing.Union[pd.DataFrame, cudf.DataFrame]
The DataFrame to serialize.
- file_name : str
The location to store the DataFrame.
- file_type : FileTypes, optional
The type of serialization to use. By default this is FileTypes.Auto, which will determine the type from the filename extension.
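How `FileTypes.Auto` maps an extension to a format is not spelled out on this page. The sketch below shows one plausible resolution scheme. Only `FileTypes.Auto = 0` comes from the signature above; the other enum members, the extension table, and the `resolve_file_type` helper are assumptions made for illustration.

```python
import enum
import os


class FileTypes(enum.Enum):
    """Local stand-in enum; only Auto = 0 is taken from the documented signature."""
    Auto = 0
    JSON = 1      # assumed member
    CSV = 2       # assumed member
    PARQUET = 3   # assumed member


# Assumed extension table, for illustration only.
_EXTENSIONS = {
    ".json": FileTypes.JSON,
    ".jsonlines": FileTypes.JSON,
    ".csv": FileTypes.CSV,
    ".parquet": FileTypes.PARQUET,
}


def resolve_file_type(file_name: str,
                      file_type: FileTypes = FileTypes.Auto) -> FileTypes:
    """Return the explicit type, or infer one from the filename extension."""
    if file_type is not FileTypes.Auto:
        # An explicit choice always wins over the extension.
        return file_type

    ext = os.path.splitext(file_name)[1].lower()
    try:
        return _EXTENSIONS[ext]
    except KeyError:
        raise ValueError(
            f"Cannot infer file type from extension {ext!r}") from None


print(resolve_file_type("out/results.csv"))
```

Raising on an unknown extension, rather than silently picking a default, keeps a mistyped filename from producing a file in an unexpected format.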