---
layout: overview
slug: nemo-curator/nemo_curator/utils/split_large_files
title: nemo_curator.utils.split_large_files
---
## Module Contents
### Functions
| Name | Description |
| ------------------------------------------------------------------------------------------------ | ----------- |
| [`_split_table`](#nemo_curator-utils-split_large_files-_split_table) | - |
| [`_write_table_to_file`](#nemo_curator-utils-split_large_files-_write_table_to_file) | - |
| [`main`](#nemo_curator-utils-split_large_files-main) | - |
| [`parse_args`](#nemo_curator-utils-split_large_files-parse_args) | - |
| [`split_parquet_file_by_size`](#nemo_curator-utils-split_large_files-split_parquet_file_by_size) | - |
### API
```python
nemo_curator.utils.split_large_files._split_table(
    table: pyarrow.Table,
    target_size: int
) -> list[pyarrow.Table]
```
```python
nemo_curator.utils.split_large_files._write_table_to_file(
    table: pyarrow.Table,
    outdir: str,
    output_prefix: str,
    ext: str,
    file_idx: int
) -> int
```
```python
nemo_curator.utils.split_large_files.main(
    args: argparse.ArgumentParser | None = None
) -> None
```
```python
nemo_curator.utils.split_large_files.parse_args(
    args: argparse.ArgumentParser | None = None
) -> argparse.Namespace
```
```python
nemo_curator.utils.split_large_files.split_parquet_file_by_size(
    input_file: str,
    outdir: str,
    target_size_mb: int
) -> None
```