nemo_curator.core.client

Module Contents

Classes

Name        Description
RayClient   This class is used to set up the Ray cluster and configure metrics integration.

API

class nemo_curator.core.client.RayClient(
ray_port: int = DEFAULT_RAY_PORT,
ray_dashboard_port: int = DEFAULT_RAY_DASHBOARD_PORT,
ray_client_server_port: int = DEFAULT_RAY_CLIENT_SERVER_PORT,
ray_temp_dir: str = DEFAULT_RAY_TEMP_DIR,
include_dashboard: bool = True,
ray_metrics_port: int = DEFAULT_RAY_METRICS_PORT,
ray_dashboard_host: str = DEFAULT_RAY_DASHBOARD_HOST,
num_gpus: int | None = None,
num_cpus: int | None = None,
object_store_memory: int | None = None,
enable_object_spilling: bool = False,
ray_stdouterr_capture_file: str | None = None,
metrics_dir: str | None = None
)
Dataclass

This class is used to set up the Ray cluster and configure metrics integration.

If a specified port is already in use, the client finds the next available port and uses that instead.
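The port fallback described above can be sketched with the standard library. This is a minimal illustration under stated assumptions, not the code RayClient actually runs: the helper names `is_port_free` and `next_available_port`, the loopback host, and the `max_tries` limit are all hypothetical.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding failed, typically because the port is already in use.
            return False
        return True


def next_available_port(start_port: int, host: str = "127.0.0.1", max_tries: int = 100) -> int:
    """Scan upward from start_port and return the first port that can be bound.

    Hypothetical helper: the real search strategy is not documented here.
    """
    last_port = min(start_port + max_tries, 65536)
    for port in range(start_port, last_port):
        if is_port_free(port, host):
            return port
    raise RuntimeError(f"no free port found in range {start_port}-{last_port - 1}")
```

A check like this is inherently racy: another process can claim the port between the probe and the moment the real service binds to it, so callers should still handle a bind failure.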

Parameters:

ray_port
int. Defaults to DEFAULT_RAY_PORT

The port number of the Ray GCS.

ray_dashboard_port
int. Defaults to DEFAULT_RAY_DASHBOARD_PORT

The port number of the Ray dashboard.

ray_temp_dir
str. Defaults to DEFAULT_RAY_TEMP_DIR

The temporary directory to use for Ray.

include_dashboard
bool. Defaults to True

Whether to include dashboard integration. If true, adds Ray metrics service discovery.

ray_metrics_port
int. Defaults to DEFAULT_RAY_METRICS_PORT

The port number of the Ray metrics.

ray_dashboard_host
str. Defaults to DEFAULT_RAY_DASHBOARD_HOST

The host of the Ray dashboard.

num_gpus
int | None. Defaults to None

The number of GPUs to use.

num_cpus
int | None. Defaults to None

The number of CPUs to use.

object_store_memory
int | None. Defaults to None

The amount of memory to use for the object store.

enable_object_spilling
bool. Defaults to False

Whether to enable object spilling.

ray_stdouterr_capture_file
str | None. Defaults to None

The file to capture stdout/stderr to.

metrics_dir
str | None. Defaults to None

The directory for Prometheus/Grafana metrics data. If None, uses the per-user default.

enable_object_spilling: bool = False
include_dashboard: bool = True
metrics_dir: str | None = None
num_cpus: int | None = None
num_gpus: int | None = None
object_store_memory: int | None = None
ray_client_server_port: int = DEFAULT_RAY_CLIENT_SERVER_PORT
ray_dashboard_host: str = DEFAULT_RAY_DASHBOARD_HOST
ray_dashboard_port: int = DEFAULT_RAY_DASHBOARD_PORT
ray_metrics_port: int = DEFAULT_RAY_METRICS_PORT
ray_port: int = DEFAULT_RAY_PORT
ray_process: Popen | None = field(init=False, default=None)
ray_stdouterr_capture_file: str | None = None
ray_temp_dir: str = DEFAULT_RAY_TEMP_DIR
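Unlike the other attributes, ray_process is declared with field(init=False, default=None), so it is not a constructor argument. The sketch below shows how that dataclass pattern behaves in general. `MiniClient` is a hypothetical stand-in, not the real RayClient; its port value and the validation in `__post_init__` are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from subprocess import Popen
from typing import Optional


@dataclass
class MiniClient:
    """Hypothetical stand-in that mirrors the shape of the attribute list."""

    # Illustrative value only; not necessarily equal to DEFAULT_RAY_PORT.
    port: int = 6379
    # init=False keeps this field out of the generated __init__;
    # it stays None until some later step stores a process handle in it.
    process: Optional[Popen] = field(init=False, default=None)

    def __post_init__(self) -> None:
        # Runs right after the generated __init__, which makes it a natural
        # place for validation or derived setup (assumed behavior here).
        if not 0 < self.port < 65536:
            raise ValueError(f"port out of range: {self.port}")
```

Passing `process=...` to the constructor raises TypeError, because fields declared with init=False are not accepted as arguments.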
nemo_curator.core.client.RayClient.__enter__()
nemo_curator.core.client.RayClient.__exit__(
exc = ()
)
nemo_curator.core.client.RayClient.__post_init__() -> None
nemo_curator.core.client.RayClient.start() -> None

Start the Ray cluster if not already started, optionally capturing stdout/stderr to a file.

nemo_curator.core.client.RayClient.stop() -> None
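A common way to pair start() with stop() is a try/finally block, so the cluster is shut down even if the work in between fails. The sketch below uses only the documented constructor parameters and the start() and stop() methods. The helper `run_with_client` and the function `demo` are hypothetical names introduced for illustration, and `demo` assumes nemo_curator is installed. The presence of __enter__ and __exit__ suggests the class can also be used in a with statement, but their exact behavior is not described on this page.

```python
from typing import Any, Callable, TypeVar

T = TypeVar("T")


def run_with_client(client: Any, work: Callable[[], T]) -> T:
    """Start the client, run work(), and always stop the client afterwards.

    Hypothetical helper: it only relies on the client exposing start() and stop().
    """
    client.start()
    try:
        return work()
    finally:
        # Reached on success and on failure, so the cluster is not left running.
        client.stop()


def demo() -> None:
    """Example call against the real class; requires nemo_curator to be installed."""
    # Imported lazily so this module can be loaded without the package.
    from nemo_curator.core.client import RayClient

    client = RayClient(num_cpus=4, include_dashboard=False)
    run_with_client(client, lambda: print("Ray cluster is running"))
```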