filesets.resources
Extended FilesResource classes with FilesetFileSystem support.
These classes extend the SDK’s generated FilesResource classes to add high-level file operations (upload, download, list, delete) and fsspec filesystem access.
Module Contents
Classes
AsyncFilesResource | Extended AsyncFilesResource with high-level file operations.
AsyncReadable | Protocol for async file-like objects (e.g., anyio.open_file(), aiofiles).
FilesResource | Extended FilesResource with high-level file operations.
ListFilesResponse | Response from listing files in a fileset.
Readable | Protocol for file-like objects.
Data
API
- filesets.resources.AsyncContent
None
- class filesets.resources.AsyncFilesResource(client: nemo_platform._client.AsyncNeMoPlatform)
Bases: nemo_platform.resources.files.AsyncFilesResource
Extended AsyncFilesResource with high-level file operations.
Provides convenient methods for uploading, downloading, and listing files. For fsspec filesystem access, use sdk.files.fsspec.
Initialization
- async delete(*, remote_path: str, fileset: str | None = None, workspace: str | None = None)
Delete a file from a fileset (async).
- Parameters:
remote_path – Path of the file to delete. Can be a full path (e.g., “workspace/fileset#data/file.txt”) if fileset is not provided, or a relative path (e.g., “data/file.txt”) if fileset is provided.
fileset – Fileset name. If not provided, inferred from remote_path.
workspace – Workspace name. If not provided, inferred from remote_path or uses the SDK’s default workspace.
Examples
# Delete a file with explicit fileset
>>> await sdk.files.delete(
...     fileset="my-fileset",
...     remote_path="data/old-file.txt"
... )
# Delete using full path
>>> await sdk.files.delete(remote_path="my-fileset#data/old-file.txt")
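The full-path form of remote_path combines an optional workspace, a fileset name, and a path inside the fileset, separated by "/" and "#". The sketch below shows one way such a string could be split. `RemotePath` and `split_remote_path` are hypothetical names used only for illustration; they are not part of the SDK, and the SDK's own parsing rules may differ (for example, in how it treats a string with no "#").

```python
from typing import NamedTuple, Optional


class RemotePath(NamedTuple):
    """Hypothetical container for the parts of a remote path."""
    workspace: Optional[str]
    fileset: Optional[str]
    path: str


def split_remote_path(remote_path: str) -> RemotePath:
    """Split "workspace/fileset#path" or "fileset#path" into its parts.

    Assumption: a string without "#" is a path relative to a fileset
    that the caller names separately.
    """
    if "#" not in remote_path:
        return RemotePath(None, None, remote_path)
    # Everything before the first "#" names the fileset (and maybe a workspace).
    location, _, path = remote_path.partition("#")
    if "/" in location:
        workspace, _, fileset = location.partition("/")
        return RemotePath(workspace, fileset, path)
    return RemotePath(None, location, path)


print(split_remote_path("default/my-fileset#data/old-file.txt"))
print(split_remote_path("my-fileset#data/old-file.txt"))
print(split_remote_path("data/old-file.txt"))
```

When the workspace part comes back as None, the reference above says the SDK falls back to its default workspace.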
- async download(*, remote_path: str | list[str] = '', local_path: str, fileset: str | None = None, workspace: str | None = None, callback: fsspec.callbacks.Callback | None = None, max_workers: int | None = None)
Download files from a fileset to a local path (async).
- Parameters:
remote_path – Path(s) within the fileset to download. Defaults to “” (root of fileset). Can be:
- A single path (str): a full path (e.g., “workspace/fileset#data/”), a relative path (e.g., “data/”), or a glob pattern (e.g., “*.json”).
- A list of paths (list[str]): multiple specific file paths to download. When using a list, fileset and workspace must be provided explicitly.
local_path – Local destination path (directory).
fileset – Fileset name. If not provided, inferred from remote_path (str only).
workspace – Workspace name. If not provided, inferred from remote_path or uses the SDK’s default workspace.
callback – Optional progress callback (e.g., RichProgressCallback).
max_workers – Maximum number of concurrent file transfers.
Examples
# Explicit fileset/workspace
>>> await sdk.files.download(
...     fileset="my-fileset",
...     workspace="default",
...     remote_path="data/",
...     local_path="./downloads/"
... )
# Inferred from path
>>> await sdk.files.download(
...     remote_path="default/my-fileset#data/",
...     local_path="./downloads/"
... )
# Download files matching a glob pattern
>>> await sdk.files.download(
...     fileset="my-fileset",
...     remote_path="*.json",
...     local_path="./downloads/"
... )
# Download files matching a pattern in a subdirectory
>>> await sdk.files.download(
...     fileset="my-fileset",
...     remote_path="data/*.jsonl",
...     local_path="./downloads/"
... )
# Download a list of specific files
>>> await sdk.files.download(
...     fileset="my-fileset",
...     remote_path=["config.json", "tokenizer.json", "vocab.txt"],
...     local_path="./downloads/"
... )
- async download_content(*, remote_path: str, fileset: str | None = None, workspace: str | None = None)
Download a file’s content from a fileset (async).
- Parameters:
remote_path – Path of the file within the fileset.
fileset – Fileset name. If not provided, inferred from remote_path.
workspace – Workspace name. If not provided, uses SDK default.
- Returns:
The file content.
- Return type:
bytes
Examples
# Load JSON
>>> import json
>>> content = await sdk.files.download_content(
...     remote_path="config.json",
...     fileset="my-fileset",
... )
>>> data = json.loads(content)
# Get text content
>>> text = (await sdk.files.download_content(
...     remote_path="readme.txt",
...     fileset="my-fileset",
... )).decode("utf-8")
- property fsspec: nemo_platform.filesets.filesystem.filesystem.FilesetFileSystem
Get a FilesetFileSystem instance pre-configured with this SDK client.
This provides fsspec filesystem access. For high-level file operations, use sdk.files instead.
- async list(*, remote_path: str = '', fileset: str | None = None, workspace: str | None = None, include_cache_status: bool = False)
List all files in a fileset path (recursive, async), with optional glob pattern support.
- Parameters:
remote_path – Path within the fileset to list. Can be a full path (e.g., “workspace/fileset#data/” or “fileset#data/”) if fileset is not provided, or a relative path (e.g., “data/”) if fileset is provided. Supports glob patterns (*, ?, []) for filtering files. Defaults to “” (root of fileset).
fileset – Fileset name. If not provided, inferred from remote_path.
workspace – Workspace name. If not provided, inferred from remote_path or uses the SDK’s default workspace.
include_cache_status – Check and return cache status for each file. When False (default), external storage files return None for cache_status.
- Returns:
ListFilesResponse with data (list of FilesetFile) and cache_status property.
Examples
# List all files in a fileset
>>> response = await sdk.files.list(fileset="my-fileset")
>>> for f in response.data:
...     print(f"{f.path}: {f.size} bytes")
# List files in a subdirectory
>>> await sdk.files.list(
...     fileset="my-fileset",
...     remote_path="data/"
... )
# List files matching a glob pattern
>>> await sdk.files.list(
...     fileset="my-fileset",
...     remote_path="*.json"
... )
# List files matching a pattern in a subdirectory
>>> await sdk.files.list(
...     fileset="my-fileset",
...     remote_path="data/*.jsonl"
... )
# Inferred from path
>>> await sdk.files.list(remote_path="my-fileset#data/")
# Check cache status for external storage
>>> response = await sdk.files.list(fileset="my-fileset", include_cache_status=True)
>>> print(f"Cache status: {response.cache_status}")
>>> for f in response.data:
...     print(f"{f.path}: {f.cache_status}")
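The glob characters accepted by remote_path (*, ?, []) follow the familiar shell conventions. As a rough local illustration, the sketch below filters a made-up list of paths with Python's fnmatch module. This is a stand-in under stated assumptions, not the SDK's matcher: in particular, fnmatch lets `*` match across "/" separators, which the service may or may not do. The `filter_paths` helper and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase

# Made-up file paths standing in for the contents of a fileset.
paths = [
    "config.json",
    "data/train.jsonl",
    "data/eval.jsonl",
    "data/notes.txt",
]


def filter_paths(all_paths: list[str], pattern: str) -> list[str]:
    """Return the paths that match a shell-style glob pattern."""
    return [p for p in all_paths if fnmatchcase(p, pattern)]


print(filter_paths(paths, "*.json"))        # only the .json file
print(filter_paths(paths, "data/*.jsonl"))  # both .jsonl files under data/
print(filter_paths(paths, "data/[a-n]*"))   # names under data/ starting with a..n
```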
- async upload(*, local_path: str, remote_path: str = '', fileset: str | None = None, workspace: str | None = None, callback: fsspec.callbacks.Callback | None = None, max_workers: int | None = None, fileset_auto_create: bool = False)
Upload files from a local path to a fileset (async).
- Parameters:
local_path – Local source path (file or directory).
remote_path – Path within the fileset to upload to. Can be a full path (e.g., “workspace/fileset#data/” or “fileset#data/”) if fileset is not provided, or a relative path (e.g., “data/”) if fileset is provided. Defaults to “” (root of fileset).
fileset – Fileset name. If not provided, inferred from remote_path.
workspace – Workspace name. If not provided, inferred from remote_path or uses the SDK’s default workspace.
callback – Optional progress callback (e.g., RichProgressCallback).
max_workers – Maximum number of concurrent file transfers.
fileset_auto_create – If True, create the fileset if it doesn’t exist. When no fileset is specified (neither as param nor in remote_path), a unique name is generated (e.g., “fileset-a1b2c3d4”).
- Returns:
The fileset that was uploaded to. Check fileset.name to see the generated name when using fileset_auto_create without specifying a fileset.
- Return type:
Fileset
Examples
# Explicit fileset/workspace
>>> await sdk.files.upload(
...     fileset="my-fileset",
...     workspace="default",
...     local_path="./data/",
...     remote_path="uploads/"
... )
# Inferred from path
>>> await sdk.files.upload(
...     local_path="./file.txt",
...     remote_path="default/my-fileset#file.txt"
... )
# Auto-create fileset with specified name
>>> fileset = await sdk.files.upload(
...     local_path="./data/",
...     fileset="new-fileset",
...     fileset_auto_create=True
... )
>>> print(f"Uploaded to: {fileset.name}")
# Auto-create fileset with generated name
>>> fileset = await sdk.files.upload(
...     local_path="./data/",
...     fileset_auto_create=True
... )
>>> print(f"Uploaded to: {fileset.name}")  # e.g., "fileset-a1b2c3d4"
- async upload_content(*, content: filesets.resources.AsyncContent, remote_path: str, fileset: str | None = None, workspace: str | None = None, fileset_auto_create: bool = False)
Upload in-memory data to a fileset (async).
- Parameters:
content – Content to upload. Can be:
- bytes: Raw byte content
- str: Text content (will be UTF-8 encoded)
- AsyncReadable: Async file-like object (e.g., anyio.open_file(), aiofiles)
- AsyncIterator[bytes]: Async iterator yielding byte chunks (streamed)
remote_path – Destination path within the fileset.
fileset – Fileset name. If not provided, inferred from remote_path.
workspace – Workspace name. If not provided, uses SDK default.
fileset_auto_create – If True, create the fileset if it doesn’t exist. When no fileset is specified (neither as param nor in remote_path), a unique name is generated (e.g., “fileset-a1b2c3d4”).
- Returns:
The fileset that was uploaded to. Check fileset.name to see the generated name when using fileset_auto_create without specifying a fileset.
- Return type:
Fileset
Examples
# Upload bytes
>>> await sdk.files.upload_content(
...     content=b"Hello, World!",
...     remote_path="message.txt",
...     fileset="my-fileset",
... )
# Upload string (auto UTF-8 encoded)
>>> await sdk.files.upload_content(
...     content='{"key": "value"}',
...     remote_path="config.json",
...     fileset="my-fileset",
... )
# Upload from async file (anyio/aiofiles)
>>> async with await anyio.open_file("data.bin", "rb") as f:
...     await sdk.files.upload_content(
...         content=f,
...         remote_path="data.bin",
...         fileset="my-fileset",
...     )
# Auto-create fileset with specified name
>>> fileset = await sdk.files.upload_content(
...     content=b"content",
...     remote_path="file.txt",
...     fileset="new-fileset",
...     fileset_auto_create=True,
... )
>>> print(f"Uploaded to: {fileset.name}")
# Auto-create fileset with generated name
>>> fileset = await sdk.files.upload_content(
...     content=b"content",
...     remote_path="file.txt",
...     fileset_auto_create=True,
... )
>>> print(f"Uploaded to: {fileset.name}")  # e.g., "fileset-a1b2c3d4"
- class filesets.resources.AsyncReadable
Bases: typing.Protocol
Protocol for async file-like objects (e.g., anyio.open_file(), aiofiles).
- async read(size: int = -1) -> bytes
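Because AsyncReadable is a typing.Protocol, any object with a matching async read method should satisfy it without inheriting from it. The sketch below redefines an equivalent protocol locally so that it runs without the SDK, and implements it with a hypothetical in-memory class, `InMemoryAsyncReader`. Only the read signature is taken from the reference above; the `drain` helper is likewise illustrative.

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class AsyncReadable(Protocol):
    """Local stand-in mirroring the documented protocol."""

    async def read(self, size: int = -1) -> bytes: ...


class InMemoryAsyncReader:
    """Hypothetical async reader over a bytes buffer."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload
        self._pos = 0

    async def read(self, size: int = -1) -> bytes:
        # A negative size means "read everything that is left".
        end = len(self._payload) if size < 0 else self._pos + size
        chunk = self._payload[self._pos:end]
        self._pos += len(chunk)
        return chunk


async def drain(reader: AsyncReadable, chunk_size: int = 4) -> bytes:
    """Read until the reader returns an empty chunk."""
    parts = []
    while chunk := await reader.read(chunk_size):
        parts.append(chunk)
    return b"".join(parts)


reader = InMemoryAsyncReader(b"Hello, World!")
# runtime_checkable only verifies that a `read` attribute exists.
print(isinstance(reader, AsyncReadable))
print(asyncio.run(drain(reader)))
```

An object of this shape is the kind of value the AsyncReadable option of upload_content describes, alongside the anyio and aiofiles file objects it names.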
- class filesets.resources.FilesResource(client: nemo_platform._client.NeMoPlatform)
Bases: nemo_platform.resources.files.FilesResource
Extended FilesResource with high-level file operations.
Provides convenient methods for uploading, downloading, and listing files. For fsspec filesystem access, use sdk.files.fsspec.
Initialization
- delete(*, remote_path: str, fileset: str | None = None, workspace: str | None = None)
Delete a file from a fileset.
- Parameters:
remote_path – Path of the file to delete. Can be a full path (e.g., “workspace/fileset#data/file.txt”) if fileset is not provided, or a relative path (e.g., “data/file.txt”) if fileset is provided.
fileset – Fileset name. If not provided, inferred from remote_path.
workspace – Workspace name. If not provided, inferred from remote_path or uses the SDK’s default workspace.
Examples
# Delete a file with explicit fileset
>>> sdk.files.delete(
...     fileset="my-fileset",
...     remote_path="data/old-file.txt"
... )
# Delete using full path
>>> sdk.files.delete(remote_path="my-fileset#data/old-file.txt")
- download(*, remote_path: str | list[str] = '', local_path: str, fileset: str | None = None, workspace: str | None = None, callback: fsspec.callbacks.Callback | None = None, max_workers: int | None = None)
Download files from a fileset to a local path.
- Parameters:
remote_path – Path(s) within the fileset to download. Defaults to “” (root of fileset). Can be:
- A single path (str): a full path (e.g., “workspace/fileset#data/”), a relative path (e.g., “data/”), or a glob pattern (e.g., “*.json”).
- A list of paths (list[str]): multiple specific file paths to download. When using a list, fileset and workspace must be provided explicitly.
local_path – Local destination path (directory).
fileset – Fileset name. If not provided, inferred from remote_path (str only).
workspace – Workspace name. If not provided, inferred from remote_path or uses the SDK’s default workspace.
callback – Optional progress callback (e.g., RichProgressCallback).
max_workers – Maximum number of concurrent file transfers.
Examples
# Explicit fileset/workspace
>>> sdk.files.download(
...     fileset="my-fileset",
...     workspace="default",
...     remote_path="data/",
...     local_path="./downloads/"
... )
# Inferred from path (with workspace)
>>> sdk.files.download(
...     remote_path="default/my-fileset#data/",
...     local_path="./downloads/"
... )
# Inferred from path (workspace from SDK default)
>>> sdk.files.download(
...     remote_path="my-fileset#data/",
...     local_path="./downloads/"
... )
# Download files matching a glob pattern
>>> sdk.files.download(
...     fileset="my-fileset",
...     remote_path="*.json",
...     local_path="./downloads/"
... )
# Download files matching a pattern in a subdirectory
>>> sdk.files.download(
...     fileset="my-fileset",
...     remote_path="data/*.jsonl",
...     local_path="./downloads/"
... )
# Download a list of specific files
>>> sdk.files.download(
...     fileset="my-fileset",
...     remote_path=["config.json", "tokenizer.json", "vocab.txt"],
...     local_path="./downloads/"
... )
# With progress callback
>>> from nemo_platform.filesets import RichProgressCallback
>>> with RichProgressCallback(description="Downloading") as cb:
...     sdk.files.download(
...         remote_path="my-fileset#",
...         local_path="./",
...         callback=cb
...     )
- download_content(*, remote_path: str, fileset: str | None = None, workspace: str | None = None)
Download a file’s content from a fileset.
- Parameters:
remote_path – Path of the file within the fileset.
fileset – Fileset name. If not provided, inferred from remote_path.
workspace – Workspace name. If not provided, uses SDK default.
- Returns:
The file content.
- Return type:
bytes
Examples
# Load JSON (most common use case)
>>> import json
>>> data = json.loads(sdk.files.download_content(
...     remote_path="config.json",
...     fileset="my-fileset",
... ))
# Get text content
>>> text = sdk.files.download_content(
...     remote_path="readme.txt",
...     fileset="my-fileset",
... ).decode("utf-8")
# Get binary content
>>> content = sdk.files.download_content(
...     remote_path="model.bin",
...     fileset="my-fileset",
... )
- property fsspec: nemo_platform.filesets.filesystem.filesystem.FilesetFileSystem
Access the underlying fsspec filesystem.
- list(*, remote_path: str = '', fileset: str | None = None, workspace: str | None = None, include_cache_status: bool = False)
List all files in a fileset path (recursive), with optional glob pattern support.
- Parameters:
remote_path – Path within the fileset to list. Can be a full path (e.g., “workspace/fileset#data/” or “fileset#data/”) if fileset is not provided, or a relative path (e.g., “data/”) if fileset is provided. Supports glob patterns (*, ?, []) for filtering files. Defaults to “” (root of fileset).
fileset – Fileset name. If not provided, inferred from remote_path.
workspace – Workspace name. If not provided, inferred from remote_path or uses the SDK’s default workspace.
include_cache_status – Check and return cache status for each file. When False (default), external storage files return None for cache_status.
- Returns:
ListFilesResponse with data (list of FilesetFile) and cache_status property.
Examples
# List all files in a fileset
>>> response = sdk.files.list(fileset="my-fileset")
>>> for f in response.data:
...     print(f"{f.path}: {f.size} bytes")
# List files in a subdirectory
>>> sdk.files.list(
...     fileset="my-fileset",
...     remote_path="data/"
... )
# List files matching a glob pattern
>>> sdk.files.list(
...     fileset="my-fileset",
...     remote_path="*.json"
... )
# List files matching a pattern in a subdirectory
>>> sdk.files.list(
...     fileset="my-fileset",
...     remote_path="data/*.jsonl"
... )
# Inferred from path
>>> sdk.files.list(remote_path="my-fileset#data/")
# Check cache status for external storage
>>> response = sdk.files.list(fileset="my-fileset", include_cache_status=True)
>>> print(f"Cache status: {response.cache_status}")
>>> for f in response.data:
...     print(f"{f.path}: {f.cache_status}")
- upload(*, local_path: str, remote_path: str = '', fileset: str | None = None, workspace: str | None = None, callback: fsspec.callbacks.Callback | None = None, max_workers: int | None = None, fileset_auto_create: bool = False)
Upload files from a local path to a fileset.
- Parameters:
local_path – Local source path (file or directory).
remote_path – Path within the fileset to upload to. Can be a full path (e.g., “workspace/fileset#data/” or “fileset#data/”) if fileset is not provided, or a relative path (e.g., “data/”) if fileset is provided. Defaults to “” (root of fileset).
fileset – Fileset name. If not provided, inferred from remote_path.
workspace – Workspace name. If not provided, inferred from remote_path or uses the SDK’s default workspace.
callback – Optional progress callback (e.g., RichProgressCallback).
max_workers – Maximum number of concurrent file transfers.
fileset_auto_create – If True, create the fileset if it doesn’t exist. When no fileset is specified (neither as param nor in remote_path), a unique name is generated (e.g., “fileset-a1b2c3d4”).
- Returns:
The fileset that was uploaded to. Check fileset.name to see the generated name when using fileset_auto_create without specifying a fileset.
- Return type:
Fileset
Examples
# Explicit fileset/workspace
>>> sdk.files.upload(
...     fileset="my-fileset",
...     workspace="default",
...     local_path="./data/",
...     remote_path="uploads/"
... )
# Inferred from path
>>> sdk.files.upload(
...     local_path="./file.txt",
...     remote_path="default/my-fileset#file.txt"
... )
# With workspace from SDK default
>>> sdk.files.upload(
...     local_path="./file.txt",
...     remote_path="my-fileset#file.txt"
... )
# Auto-create fileset with specified name
>>> fileset = sdk.files.upload(
...     local_path="./data/",
...     fileset="new-fileset",
...     fileset_auto_create=True
... )
>>> print(f"Uploaded to: {fileset.name}")
# Auto-create fileset with generated name
>>> fileset = sdk.files.upload(
...     local_path="./data/",
...     fileset_auto_create=True
... )
>>> print(f"Uploaded to: {fileset.name}")  # e.g., "fileset-a1b2c3d4"
- upload_content(*, content: filesets.resources.SyncContent, remote_path: str, fileset: str | None = None, workspace: str | None = None, fileset_auto_create: bool = False)
Upload in-memory content to a fileset.
- Parameters:
content – Content to upload. Can be:
- bytes: Raw byte content
- str: Text content (will be UTF-8 encoded)
- BinaryIO: File-like object (e.g., BytesIO, open file)
- Iterator[bytes]: Generator or iterator yielding byte chunks
remote_path – Destination path within the fileset.
fileset – Fileset name. If not provided, inferred from remote_path.
workspace – Workspace name. If not provided, uses SDK default.
fileset_auto_create – If True, create the fileset if it doesn’t exist. When no fileset is specified (neither as param nor in remote_path), a unique name is generated (e.g., “fileset-a1b2c3d4”).
- Returns:
The fileset that was uploaded to. Check fileset.name to see the generated name when using fileset_auto_create without specifying a fileset.
- Return type:
Fileset
Examples
# Upload bytes
>>> sdk.files.upload_content(
...     content=b"Hello, World!",
...     remote_path="message.txt",
...     fileset="my-fileset",
... )
# Upload string (auto UTF-8 encoded)
>>> sdk.files.upload_content(
...     content='{"key": "value"}',
...     remote_path="config.json",
...     fileset="my-fileset",
... )
# Upload from BytesIO
>>> from io import BytesIO
>>> sdk.files.upload_content(
...     content=BytesIO(b"content"),
...     remote_path="data.bin",
...     fileset="my-fileset",
... )
# Auto-create fileset with specified name
>>> fileset = sdk.files.upload_content(
...     content=b"content",
...     remote_path="file.txt",
...     fileset="new-fileset",
...     fileset_auto_create=True,
... )
>>> print(f"Uploaded to: {fileset.name}")
# Auto-create fileset with generated name
>>> fileset = sdk.files.upload_content(
...     content=b"content",
...     remote_path="file.txt",
...     fileset_auto_create=True,
... )
>>> print(f"Uploaded to: {fileset.name}")  # e.g., "fileset-a1b2c3d4"
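For the Iterator[bytes] form of content, a generator that yields fixed-size chunks is a common pattern for data too large to hold in memory at once. The sketch below shows such a generator and exercises it on a small temporary file. `iter_chunks` and the chunk sizes are illustrative choices, not SDK features, and the commented-out call assumes an `sdk` client configured as in the examples above.

```python
import os
import tempfile
from collections.abc import Iterator


def iter_chunks(path: str, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield the file at `path` as successive byte chunks."""
    with open(path, "rb") as f:
        # read() returns b"" at end of file, which ends the loop.
        while chunk := f.read(chunk_size):
            yield chunk


# Demonstrate on a small temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    tmp_path = tmp.name

chunks = list(iter_chunks(tmp_path, chunk_size=4))
print(chunks)
os.remove(tmp_path)

# Assumed usage with a configured client (not executed here):
# sdk.files.upload_content(
#     content=iter_chunks("model.bin"),
#     remote_path="model.bin",
#     fileset="my-fileset",
# )
```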
- class filesets.resources.ListFilesResponse
Response from listing files in a fileset.
- data
List of files in the fileset.
- Properties:
cache_status – Aggregate cache status of all files:
- “caching” if any file is actively being cached
- “not_cached” if any file is not cached (and none are caching)
- “cached” if all files are fully cached
- “not_cacheable” if all files cannot be cached
- None if no cache information is available
- property cache_status: nemo_platform.types.files.CacheStatus | None
Get aggregate cache status of all files.
Returns the most relevant status based on priority:
- “caching” if any file is actively being cached
- “not_cached” if any file is not cached (and none are caching)
- “cached” if all files are fully cached
- “not_cacheable” if all files cannot be cached
- None if no cache information is available
- data: list[nemo_platform.types.files.FilesetFile]
None
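The priority rules for the aggregate cache_status can be read as a small decision procedure. The sketch below is one possible reading of those rules, not the SDK's implementation: `aggregate_cache_status` is a hypothetical helper, it assumes per-file statuses are plain strings or None, it assumes files without cache information are ignored, and it returns None for a mix of "cached" and "not_cacheable", a case the documented rules do not cover.

```python
from typing import Iterable, Optional


def aggregate_cache_status(statuses: Iterable[Optional[str]]) -> Optional[str]:
    """Combine per-file cache statuses using the documented priority."""
    # Assumption: files without cache information do not contribute.
    known = [s for s in statuses if s is not None]
    if not known:
        return None
    if "caching" in known:
        return "caching"
    if "not_cached" in known:
        return "not_cached"
    if all(s == "cached" for s in known):
        return "cached"
    if all(s == "not_cacheable" for s in known):
        return "not_cacheable"
    # Assumption: a mix of "cached" and "not_cacheable" is not covered
    # by the documented rules, so this sketch reports no status.
    return None


print(aggregate_cache_status(["cached", "caching", "not_cached"]))  # caching
print(aggregate_cache_status(["cached", "not_cached"]))             # not_cached
print(aggregate_cache_status(["cached", "cached"]))                 # cached
print(aggregate_cache_status([None, None]))                         # None
```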
- class filesets.resources.Readable
Bases: typing.Protocol
Protocol for file-like objects.
- read(size: int = -1) -> bytes
- filesets.resources.SyncContent
None