nemo_curator.utils.writer_utils

Module Contents

Classes

| Name | Description |
| --- | --- |
| JsonEncoderCustom | Custom JSON encoder that handles types that are not JSON serializable. |

Functions

| Name | Description |
| --- | --- |
| write_bytes | Write bytes to local path. |
| write_csv | Write csv to local path. |
| write_json | Write json to local path. |
| write_parquet | Write parquet to local path. |

API

class nemo_curator.utils.writer_utils.JsonEncoderCustom()

Bases: JSONEncoder

Custom JSON encoder that handles types that are not JSON serializable.

Example:

json.dumps(data, cls=JsonEncoderCustom)

nemo_curator.utils.writer_utils.JsonEncoderCustom.default(
obj: object
) -> str | object

Encode an object for JSON serialization.

Parameters:

obj
object

Object to encode.

Returns: str | object

Encoded object.
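
The general pattern behind a custom encoder can be sketched with the standard library alone. The class below, `SketchEncoder`, is a hypothetical stand-in and not the implementation of `JsonEncoderCustom`; the types it handles (`uuid.UUID` and path objects) are assumptions chosen for illustration. It mirrors the documented return type: a string for the types it recognizes, and whatever the base class produces otherwise.

```python
from __future__ import annotations

import json
import uuid
from pathlib import PurePath, PurePosixPath


class SketchEncoder(json.JSONEncoder):
    """Hypothetical stand-in for a custom encoder; not the library's code."""

    def default(self, obj: object) -> str | object:
        # Assumption: UUIDs and paths are the non-serializable types of interest.
        if isinstance(obj, (uuid.UUID, PurePath)):
            return str(obj)
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(obj)


record = {"id": uuid.UUID(int=1), "path": PurePosixPath("clips/a.mp4")}
encoded = json.dumps(record, cls=SketchEncoder)
```

`json.dumps` calls `default` only for objects it cannot serialize on its own, so built-in types such as `str`, `int`, `list`, and `dict` never reach the method.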

nemo_curator.utils.writer_utils.write_bytes(
buffer: bytes,
dest: pathlib.Path,
desc: str,
source_video: str,
verbose: bool,
backup_and_overwrite: bool = False,
overwrite: bool = False
) -> None

Write bytes to local path.

Parameters:

buffer
bytes

Bytes to write.

dest
pathlib.Path

Destination path to write to.

desc
str

Description of the write.

source_video
str

Source video.

verbose
bool

Whether to emit verbose output.

backup_and_overwrite
bool, defaults to False

Whether to back up the existing file before overwriting it.

overwrite
bool, defaults to False

Whether to overwrite an existing file.
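
How the two flags could interact is easiest to see in a small stand-alone sketch. `write_bytes_sketch` below is hypothetical and is not the library function: the `.bak` backup name, the skip-if-exists behavior when neither flag is set, and the boolean return value are all assumptions made for illustration. The `desc`, `source_video`, and `verbose` parameters are left out because the reference does not say how they are used.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def write_bytes_sketch(
    buffer: bytes,
    dest: Path,
    backup_and_overwrite: bool = False,
    overwrite: bool = False,
) -> bool:
    """Hypothetical stand-in; returns True if the bytes were written."""
    if dest.exists():
        if backup_and_overwrite:
            # Assumption: the backup is a sibling file with a ".bak" suffix.
            dest.replace(dest.with_name(dest.name + ".bak"))
        elif not overwrite:
            # Assumption: without either flag, an existing file is left alone.
            return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(buffer)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "clips" / "a.bin"
    first = write_bytes_sketch(b"v1", target)  # new file, written
    skipped = write_bytes_sketch(b"v2", target)  # exists, no flags set
    backed_up = write_bytes_sketch(b"v3", target, backup_and_overwrite=True)
    final_bytes = target.read_bytes()
    backup_bytes = target.with_name("a.bin.bak").read_bytes()
```

In this sketch the second call leaves the file untouched, and the third call moves the old contents aside before writing the new ones.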

nemo_curator.utils.writer_utils.write_csv(
dest: pathlib.Path,
desc: str,
source_video: str,
data: list[list[str]],
verbose: bool,
backup_and_overwrite: bool = False
) -> None

Write csv to local path.

Parameters:

dest
pathlib.Path

Destination path to write to.

desc
str

Description of the write.

source_video
str

Source video.

data
list[list[str]]

Data to write.

verbose
bool

Whether to emit verbose output.

backup_and_overwrite
bool, defaults to False

Whether to back up the existing file before overwriting it.
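
The `data` argument is typed as `list[list[str]]`, which reads naturally as one inner list per row. The sketch below shows that shape being written and read back with the standard `csv` module; whether `write_csv` uses this module internally is an assumption, and the file name and row contents are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path

# One inner list per row; the first row acts as a header here.
rows = [["clip_id", "caption"], ["0001", "a dog, running"]]

with tempfile.TemporaryDirectory() as tmp:
    dest = Path(tmp) / "captions.csv"
    # newline="" lets the csv module control line endings itself.
    with dest.open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    # Read the file back to confirm the rows survive the round trip.
    with dest.open(newline="", encoding="utf-8") as f:
        round_trip = list(csv.reader(f))
```

The caption containing a comma is quoted on write and unquoted on read, so the row structure is preserved.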

nemo_curator.utils.writer_utils.write_json(
data: dict[str, typing.Any],
dest: pathlib.Path,
desc: str,
source_video: str,
verbose: bool,
backup_and_overwrite: bool = False,
overwrite: bool = False
) -> None

Write json to local path.

Parameters:

data
dict[str, Any]

Data to write.

dest
pathlib.Path

Destination path to write to.

desc
str

Description of the write.

source_video
str

Source video.

verbose
bool

Whether to emit verbose output.

backup_and_overwrite
bool, defaults to False

Whether to back up the existing file before overwriting it.

overwrite
bool, defaults to False

Whether to overwrite an existing file.
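
Because `data` is typed as `dict[str, Any]`, its values may include objects that the `json` module cannot serialize by default. The sketch below writes such a dictionary to a file using a small encoder; `IsoEncoder` is hypothetical, and pairing `write_json` with a custom encoder is an assumption that the reference does not confirm.

```python
from __future__ import annotations

import json
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Any


class IsoEncoder(json.JSONEncoder):
    """Hypothetical encoder used only in this sketch."""

    def default(self, obj: object) -> str | object:
        # Assumption: timestamps are stored as ISO 8601 strings.
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)


data: dict[str, Any] = {"clip": "0001", "created": datetime(2024, 1, 2, 3, 4, 5)}

with tempfile.TemporaryDirectory() as tmp:
    dest = Path(tmp) / "meta.json"
    dest.write_text(json.dumps(data, cls=IsoEncoder, indent=2), encoding="utf-8")
    # Reading back yields plain JSON types: the datetime is now a string.
    loaded = json.loads(dest.read_text(encoding="utf-8"))
```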

nemo_curator.utils.writer_utils.write_parquet(
data: list[dict[str, str]],
dest: pathlib.Path,
desc: str,
source_video: str,
verbose: bool,
backup_and_overwrite: bool = False,
overwrite: bool = False
) -> None

Write parquet to local path.

Parameters:

data
list[dict[str, str]]

Data to write.

dest
pathlib.Path

Destination path to write to.

desc
str

Description of the write.

source_video
str

Source video.

verbose
bool

Whether to emit verbose output.

backup_and_overwrite
bool, defaults to False

Whether to back up the existing file before overwriting it.

overwrite
bool, defaults to False

Whether to overwrite an existing file.