nemo_rl.data.datasets.response_datasets.refcoco
Module Contents
Classes
RefCOCODataset | Simple wrapper around the RefCOCO dataset.
Functions
download_and_unzip | Downloads a zip file from a given URL to a target directory and unzips it into a specified subdirectory within the target directory, showing download progress.
format_refcoco_dataset | Format the RefCOCO dataset from Hugging Face.
API
- nemo_rl.data.datasets.response_datasets.refcoco.download_and_unzip(url: str, target_directory: str, subdir_name: str = '.')
Downloads a zip file from a given URL to a target directory and unzips it into a specified subdirectory within the target directory, showing download progress.
- Parameters:
url (str) – The URL of the zip file to download.
target_directory (str) – The directory where the zip file will be downloaded and unzipped.
subdir_name (str) – The name of the subdirectory within target_directory where the contents of the zip file will be unzipped. Defaults to “.” (the target directory itself), per the signature.
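The behavior described above (download, show progress, extract into a subdirectory) can be sketched with the standard library alone. This is a minimal illustration, not the NeMo RL implementation: the name `download_and_unzip_sketch`, the chunk size, the progress format, and the returned path are all assumptions made for the example.

```python
import os
import urllib.request
import zipfile
from urllib.parse import urlparse


def download_and_unzip_sketch(url: str, target_directory: str, subdir_name: str = ".") -> str:
    """Hypothetical stand-in: download a zip to target_directory, extract into subdir_name."""
    os.makedirs(target_directory, exist_ok=True)
    # Derive a local file name from the URL path (assumption: the URL ends in a file name).
    file_name = os.path.basename(urlparse(url).path) or "download.zip"
    zip_path = os.path.join(target_directory, file_name)

    # Stream the response to disk in chunks and print a simple progress line.
    with urllib.request.urlopen(url) as response, open(zip_path, "wb") as out:
        total = int(response.headers.get("Content-Length") or 0)
        done = 0
        while True:
            chunk = response.read(1 << 16)
            if not chunk:
                break
            out.write(chunk)
            done += len(chunk)
            if total:
                print(f"\rdownloaded {done}/{total} bytes", end="")
        print()

    # Extract into <target_directory>/<subdir_name>; '.' means the target directory itself.
    extract_dir = os.path.join(target_directory, subdir_name)
    os.makedirs(extract_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(extract_dir)
    return extract_dir
```

With `subdir_name="train"`, the archive contents would land in `<target_directory>/train`, while the zip file itself stays in `target_directory`.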
- nemo_rl.data.datasets.response_datasets.refcoco.format_refcoco_dataset(example: dict[str, Any], width: int = 256, height: int = 256, caption_type: str = 'random')
Format the RefCOCO dataset from Hugging Face.
This should be replaced with our own curated RefCOCO/+/g dataset soon.
- Parameters:
example – The example to format.
width – The width of the resized image.
height – The height of the resized image.
caption_type – The type of caption to use.
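The input schema of `example` is not documented here, so the following is only a rough sketch of what formatting with a target `width`, `height`, and `caption_type` could involve. Everything about the example layout is an assumption: the field names `captions`, `bbox`, and `image_size`, the `'first'` caption type, and the output keys are hypothetical and are not taken from NeMo RL.

```python
import random
from typing import Any


def format_example_sketch(
    example: dict[str, Any],
    width: int = 256,
    height: int = 256,
    caption_type: str = "random",
) -> dict[str, Any]:
    """Hypothetical formatter: pick one caption and rescale the box to the resized image."""
    # ASSUMPTION: 'captions' is a non-empty list of strings.
    captions = example["captions"]
    if caption_type == "random":
        caption = random.choice(captions)
    elif caption_type == "first":  # hypothetical alternative, for illustration only
        caption = captions[0]
    else:
        raise ValueError(f"unsupported caption_type: {caption_type!r}")

    # ASSUMPTION: 'image_size' is (width, height) in pixels and 'bbox' is [x1, y1, x2, y2].
    orig_w, orig_h = example["image_size"]
    scale_x, scale_y = width / orig_w, height / orig_h
    x1, y1, x2, y2 = example["bbox"]

    return {
        "caption": caption,
        "bbox": [x1 * scale_x, y1 * scale_y, x2 * scale_x, y2 * scale_y],
        "image_size": (width, height),
    }
```

Rescaling the box by the same factors as the image keeps the annotation aligned with the resized pixels, which is the reason a formatter of this kind would take `width` and `height` at all.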
- class nemo_rl.data.datasets.response_datasets.refcoco.RefCOCODataset(split: str = 'train', download_dir: str = './coco_images', **kwargs)
Bases: nemo_rl.data.datasets.raw_dataset.RawDataset
Simple wrapper around the RefCOCO dataset.
- Parameters:
split – Split name for the dataset. Defaults to “train”.
download_dir – Directory to download the dataset to. Defaults to “./coco_images”.
Initialization
- format_data(data: dict[str, Any]) → dict[str, Any]