nemo_deploy.deploy_ray

Module Contents

Classes

DeployRay

A class for managing Ray deployment and serving of models.

Data

LOGGER

API

nemo_deploy.deploy_ray.LOGGER = 'getLogger(...)'
class nemo_deploy.deploy_ray.DeployRay(
address: str = 'auto',
num_cpus: int = 1,
num_gpus: int = 1,
include_dashboard: bool = False,
ignore_reinit_error: bool = True,
runtime_env: dict = None,
)

A class for managing Ray deployment and serving of models.

This class provides functionality to initialize Ray, start Ray Serve, deploy models, and manage the lifecycle of the Ray cluster.
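The lifecycle described above (start Ray Serve, deploy, shut down) can be sketched as a small helper. This is an illustrative sketch, not code from the library: `serve_model` and its `wait` argument are hypothetical names, and `deployer` is assumed to be any object exposing the `start`, `run`, and `stop` methods documented on this page, such as a `DeployRay` instance.

```python
def serve_model(deployer, app, model_name, host="0.0.0.0", port=None, wait=None):
    """Start Ray Serve, deploy `app`, optionally block in `wait`, then always stop.

    `deployer` is assumed to follow the DeployRay interface documented here.
    `wait` is a hypothetical hook, e.g. a function that blocks until Ctrl+C.
    """
    try:
        deployer.start(host=host, port=port)
        deployer.run(app, model_name)
        if wait is not None:
            wait()
    finally:
        # stop() is documented to log shutdown errors as warnings,
        # so calling it unconditionally is a reasonable cleanup step.
        deployer.stop()
```

Placing `stop()` in a `finally` block means the cluster is released even if deployment fails partway through.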

Attributes:
  • address (str) – The address of the Ray cluster to connect to.

  • num_cpus (int) – Number of CPUs to allocate for the Ray cluster.

  • num_gpus (int) – Number of GPUs to allocate for the Ray cluster.

  • include_dashboard (bool) – Whether to include the Ray dashboard.

  • ignore_reinit_error (bool) – Whether to ignore errors when reinitializing Ray.

  • runtime_env (dict) – Runtime environment configuration for Ray.

Initialization

Initialize the DeployRay instance and set up the Ray cluster.

Parameters:
  • address (str, optional) – Address of the Ray cluster. Defaults to “auto”.

  • num_cpus (int, optional) – Number of CPUs to allocate. Defaults to 1.

  • num_gpus (int, optional) – Number of GPUs to allocate. Defaults to 1.

  • include_dashboard (bool, optional) – Whether to include the dashboard. Defaults to False.

  • ignore_reinit_error (bool, optional) – Whether to ignore reinit errors. Defaults to True.

  • runtime_env (dict, optional) – Runtime environment configuration. Defaults to None.

Raises:

Exception – If Ray is not installed.
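As an illustration of the constructor parameters listed above, the following sketch collects them into a dictionary and passes them to `DeployRay`. The specific values, the `deploy_kwargs` name, and the `make_deployer` helper are assumptions for the example. `env_vars` is a standard Ray `runtime_env` field, and the environment variable shown is only a placeholder.

```python
# Keyword arguments mirroring the documented DeployRay signature.
# The values are illustrative, not recommendations.
deploy_kwargs = {
    "address": "auto",
    "num_cpus": 4,
    "num_gpus": 1,
    "include_dashboard": False,
    "ignore_reinit_error": True,
    # Example runtime_env: environment variables passed to Ray workers.
    "runtime_env": {"env_vars": {"TOKENIZERS_PARALLELISM": "false"}},
}


def make_deployer(**overrides):
    """Build a DeployRay instance (hypothetical helper).

    Requires nemo_deploy and Ray to be installed; the import is deferred
    so that this module can be loaded without them.
    """
    from nemo_deploy.deploy_ray import DeployRay

    return DeployRay(**{**deploy_kwargs, **overrides})
```

Calling `make_deployer(num_gpus=2)` would override a single setting while keeping the rest.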

start(host: str = '0.0.0.0', port: int = None)

Start Ray Serve with the specified host and port.

Parameters:
  • host (str, optional) – Host address to bind to. Defaults to “0.0.0.0”.

  • port (int, optional) – Port number to use. If None, an available port will be found.
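This page does not say how an available port is found when `port` is None. One common approach, sketched here with the standard library only, is to bind a socket to port 0 and let the operating system choose. The `find_available_port` and `resolve_port` helpers are hypothetical and may differ from what `DeployRay` does internally.

```python
import socket


def find_available_port(host: str = "0.0.0.0") -> int:
    """Ask the OS for a currently free TCP port on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS pick any free port
        return sock.getsockname()[1]


def resolve_port(port=None, host: str = "0.0.0.0") -> int:
    """Return `port` unchanged, or pick a free one when it is None."""
    return find_available_port(host) if port is None else port
```

The port is released as soon as the probing socket closes, so another process could claim it before the server binds. Pass an explicit port when that matters.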

run(app: ray.serve.Application, model_name: str)

Deploy and start serving a model using Ray Serve.

Parameters:
  • app (Application) – The Ray Serve application to deploy.

  • model_name (str) – Name to give to the deployed model.
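For context on the `app` argument: in Ray Serve, an `Application` is typically produced by calling `.bind()` on a class decorated with `@serve.deployment`. The sketch below builds a trivial echo application. The `Echo` class and `build_echo_app` helper are hypothetical and stand in for a real model deployment.

```python
def build_echo_app():
    """Return a Ray Serve Application wrapping a trivial echo deployment.

    Ray is imported lazily so this module can be loaded without it.
    """
    from ray import serve

    @serve.deployment
    class Echo:
        async def __call__(self, request):
            # `request` is the incoming HTTP request; echo back its JSON body.
            payload = await request.json()
            return {"echo": payload}

    return Echo.bind()


# Hypothetical usage with an already constructed and started DeployRay instance:
# deployer.run(build_echo_app(), model_name="echo")
```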

stop()

Stop the Ray Serve deployment and shut down the Ray cluster.

This method attempts to gracefully shut down both Ray Serve and the Ray cluster. If any errors occur during shutdown, they are logged as warnings.
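The "log errors as warnings" behavior can be illustrated with a generic pattern: run each shutdown step in its own `try` block so that one failure does not prevent the next step. This is a sketch of the pattern only. `shutdown_all` and the logger name are hypothetical, and the steps are plain callables rather than real Ray calls.

```python
import logging

LOGGER = logging.getLogger("deploy_ray_sketch")


def shutdown_all(steps):
    """Run each (name, callable) shutdown step in order.

    Failures are logged as warnings instead of being raised.
    Returns the names of the steps that failed.
    """
    failed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            LOGGER.warning("Error during %s shutdown: %s", name, exc)
            failed.append(name)
    return failed
```

With real Ray installed, the steps could be `("Ray Serve", serve.shutdown)` followed by `("Ray", ray.shutdown)`, both of which are existing Ray functions.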