# nemo_curator.utils.ray_utils

Cluster-wide Ray helpers shared across backends and inference-server code.

## Module Contents

### Functions

| Name                                                                       | Description                                                                               |
| -------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- |
| [`get_head_node_id`](#nemo_curator-utils-ray_utils-get_head_node_id)       | Return the cluster head node ID, lazily computed and cached.                              |
| [`is_head_node`](#nemo_curator-utils-ray_utils-is_head_node)               | Check if a Ray node dict represents the cluster head.                                     |
| [`run_on_each_node`](#nemo_curator-utils-ray_utils-run_on_each_node)       | Submit `remote_fn(*args)` once per alive Ray node and return results in submission order. |
| [`submit_on_each_node`](#nemo_curator-utils-ray_utils-submit_on_each_node) | Submit `remote_fn(*args)` once per alive Ray node and return the ObjectRefs.              |

### Data

[`_HEAD_NODE_ID_CACHE`](#nemo_curator-utils-ray_utils-_HEAD_NODE_ID_CACHE)

### API

```python
nemo_curator.utils.ray_utils.get_head_node_id() -> str | None
```

Return the cluster head node ID, lazily computed and cached.

Returns `None` if no head node is present in the cluster.
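
The lazy-compute-and-cache behavior can be pictured with a small sketch. It is not the library's implementation: `cached_head_node_id`, `_head_node_id_cache`, the injected `list_nodes` and `is_head` callables, and the `IsHead` key in the toy node dicts are all hypothetical names chosen so the pattern can run without a cluster.

```python
from typing import Any, Callable, Optional

# Hypothetical stand-in for a module-level cache; None means "not resolved yet".
_head_node_id_cache: Optional[str] = None


def cached_head_node_id(
    list_nodes: Callable[[], list[dict[str, Any]]],
    is_head: Callable[[dict[str, Any]], bool],
) -> Optional[str]:
    """Return the head node's ID, listing nodes only on a cache miss."""
    global _head_node_id_cache
    if _head_node_id_cache is None:
        for node in list_nodes():
            if is_head(node):
                _head_node_id_cache = node["NodeID"]
                break
    # Still None here if no node matched; this sketch would then retry on the next call.
    return _head_node_id_cache


# Toy cluster listing; "IsHead" is a made-up key used only by this demo.
calls = []


def fake_list_nodes() -> list[dict[str, Any]]:
    calls.append(1)  # record each cluster query
    return [
        {"NodeID": "aaa", "IsHead": False},
        {"NodeID": "bbb", "IsHead": True},
    ]


def toy_is_head(node: dict[str, Any]) -> bool:
    return node["IsHead"]


first = cached_head_node_id(fake_list_nodes, toy_is_head)
second = cached_head_node_id(fake_list_nodes, toy_is_head)  # served from the cache
```

In this sketch the second call never touches `fake_list_nodes`, which is the point of caching a value that does not change for the lifetime of a cluster.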

```python
nemo_curator.utils.ray_utils.is_head_node(
    node: dict[str, typing.Any]
) -> bool
```

Check if a Ray node dict represents the cluster head.
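
For orientation, here is one plausible way such a check could work on the dicts returned by `ray.nodes()`. It assumes that Ray tags the head node with a `node:__internal_head__` entry under `Resources`; whether `is_head_node` relies on that marker is an assumption, and `looks_like_head` is a hypothetical name.

```python
from typing import Any

# Assumption: Ray advertises this resource only on the head node.
HEAD_RESOURCE = "node:__internal_head__"


def looks_like_head(node: dict[str, Any]) -> bool:
    """Return True if the node dict carries the head-node resource marker."""
    return HEAD_RESOURCE in node.get("Resources", {})


# Hand-built dicts shaped like entries of `ray.nodes()` (trimmed to the keys used here).
head = {"NodeID": "n1", "Alive": True, "Resources": {"CPU": 8.0, HEAD_RESOURCE: 1.0}}
worker = {"NodeID": "n2", "Alive": True, "Resources": {"CPU": 8.0, "GPU": 1.0}}

head_flag = looks_like_head(head)
worker_flag = looks_like_head(worker)
```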

```python
nemo_curator.utils.ray_utils.run_on_each_node(
    remote_fn: ray.remote_function.RemoteFunction,
    args = (),
    ignore_head_node: bool = False,
    num_cpus: float = 0,
    num_gpus: float = 0
) -> list[typing.Any]
```

Submit `remote_fn(*args)` once per alive Ray node and return results in submission order.

Convenience wrapper that submits via [`submit_on_each_node`](#nemo_curator-utils-ray_utils-submit_on_each_node) and awaits the
refs with a single `ray.get`. When several fan-outs should run in parallel,
call `submit_on_each_node` directly for each one and `ray.get` the combined
ref list once.
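
The difference between awaiting per fan-out and awaiting once can be sketched as follows. `gather_fanouts` is a hypothetical helper, not part of this module; it takes the submit and await functions as parameters so the logic can be exercised without a cluster. In real use one would presumably pass `submit_on_each_node` and `ray.get`.

```python
from typing import Any, Callable, Sequence


def gather_fanouts(
    submit: Callable[[Any], list[Any]],
    wait: Callable[[list[Any]], list[Any]],
    remote_fns: Sequence[Any],
) -> list[list[Any]]:
    """Submit every fan-out first, then await all refs with a single call."""
    # Submitting does not block, so all fan-outs are in flight before we wait.
    ref_groups = [submit(fn) for fn in remote_fns]
    flat_refs = [ref for group in ref_groups for ref in group]
    flat_results = wait(flat_refs)  # the only blocking call

    # Slice the flat result list back into one list per fan-out.
    results, start = [], 0
    for group in ref_groups:
        results.append(flat_results[start:start + len(group)])
        start += len(group)
    return results


# Cluster-free stand-ins: a "ref" is just a (function, node name) pair.
fake_nodes = ["node-a", "node-b"]


def fake_submit(fn: Any) -> list[Any]:
    return [(fn, node) for node in fake_nodes]


def fake_wait(refs: list[Any]) -> list[Any]:
    return [fn(node) for fn, node in refs]


results = gather_fanouts(fake_submit, fake_wait, [str.upper, len])
```

Calling `run_on_each_node` twice in a row would instead block on the first fan-out before the second is even submitted.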

```python
nemo_curator.utils.ray_utils.submit_on_each_node(
    remote_fn: ray.remote_function.RemoteFunction,
    args = (),
    ignore_head_node: bool = False,
    num_cpus: float = 0,
    num_gpus: float = 0
) -> list[typing.Any]
```

Submit `remote_fn(*args)` once per alive Ray node and return the ObjectRefs.

Each invocation is pinned to its node via `NodeAffinitySchedulingStrategy(soft=False)`,
so the function runs on (and only on) the targeted node. Dead nodes are skipped; the
head node is also skipped when `ignore_head_node` is `True`. The caller is responsible
for awaiting the returned refs (typically via `ray.get`). Use this function to batch
several fan-outs into a single await so that they run in parallel.
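
The node selection and pinning described above might look roughly like the sketch below. It is an illustration under assumptions, not the module's source: `select_target_node_ids` and `submit_pinned` are hypothetical names, and the `IsHead` key in the demo data is made up. The Ray calls used (`ray.nodes()`, `RemoteFunction.options`, `NodeAffinitySchedulingStrategy`) are public Ray APIs.

```python
from typing import Any, Callable


def select_target_node_ids(
    nodes: list[dict[str, Any]],
    is_head: Callable[[dict[str, Any]], bool],
    ignore_head_node: bool = False,
) -> list[str]:
    """Return IDs of alive nodes, optionally leaving out the head node."""
    targets = []
    for node in nodes:
        if not node.get("Alive", False):
            continue  # dead nodes are skipped
        if ignore_head_node and is_head(node):
            continue
        targets.append(node["NodeID"])
    return targets


def submit_pinned(remote_fn, args=(), ignore_head_node=False, num_cpus=0, num_gpus=0):
    """Submit remote_fn once per selected node and return the ObjectRefs."""
    # Imported lazily so the selection helper above works without Ray installed.
    import ray
    from ray.util.scheduling_strategies import NodeAffinitySchedulingStrategy
    from nemo_curator.utils.ray_utils import is_head_node

    refs = []
    for node_id in select_target_node_ids(ray.nodes(), is_head_node, ignore_head_node):
        # soft=False makes the affinity a hard constraint: no fallback to other nodes.
        strategy = NodeAffinitySchedulingStrategy(node_id=node_id, soft=False)
        refs.append(
            remote_fn.options(
                num_cpus=num_cpus, num_gpus=num_gpus, scheduling_strategy=strategy
            ).remote(*args)
        )
    return refs


# Cluster-free check of the selection step; "IsHead" is a demo-only key.
sample_nodes = [
    {"NodeID": "head-1", "Alive": True, "IsHead": True},
    {"NodeID": "worker-1", "Alive": True, "IsHead": False},
    {"NodeID": "worker-2", "Alive": False, "IsHead": False},
]


def toy_is_head(node: dict[str, Any]) -> bool:
    return node["IsHead"]


all_alive = select_target_node_ids(sample_nodes, toy_is_head)
workers_only = select_target_node_ids(sample_nodes, toy_is_head, ignore_head_node=True)
```

With the default `num_cpus=0` and `num_gpus=0`, a task requests no resources, so in this sketch placement depends only on the node affinity.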

```python
nemo_curator.utils.ray_utils._HEAD_NODE_ID_CACHE: str | None = None
```