(Latest Version)

File Batcher Module

This module loads the input files, removes files that are older than the chosen window of time, and then groups the remaining files by period that fall inside the window.

Parameter

Type

Description

Example Value

Default Value

batching_options

dictionary

Options for batching

See below

-

cache_dir

string

Cache directory

“./file_batcher_cache”

None

file_type

string

File type

“JSON”

"JSON"

filter_nulls

boolean

Whether to filter null values

false

false

schema

dictionary

Data schema

See below

[Required]

timestamp_column_name

string

Name of the timestamp column

“timestamp”

"timestamp"

Key

Type

Description

Example Value

Default Value

end_time

datetime/string

Endtime of the time window

“2023-03-14T23:59:59”

None

iso_date_regex_pattern

string

Regex pattern for ISO date matching

“\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}”

<iso_date_regex_pattern>

parser_kwargs

dictionary

Additional arguments for the parser

{}

{}

period

string

Time period for grouping files

“1d”

"D"

sampling_rate_s

integer

Sampling rate in seconds

60

60

start_time

datetime/string

Start time of the time window

“2023-03-01T00:00:00”

None

Key

Type

Description

Example Value

Default Value

encoding

string

Encoding

“latin1”

"latin1"

schema_str

string

Schema string

“string”

[Required]

Copy
Copied!
            

{ "batching_options": { "end_time": "2023-03-14T23:59:59", "iso_date_regex_pattern": "\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}", "parser_kwargs": {}, "period": "1d", "sampling_rate_s": 60, "start_time": "2023-03-01T00:00:00" }, "cache_dir": "./file_batcher_cache", "file_type": "JSON", "filter_nulls": false, "schema": { "schema_str": "string", "encoding": "latin1" }, "timestamp_column_name": "timestamp" }

© Copyright 2023, NVIDIA. Last updated on Apr 11, 2023.