
# nemo_curator.utils.gpu_utils

## Module Contents

### Functions

| Name                                                                                           | Description                                                           |
| ---------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| [`ensure_cudnn_loaded`](#nemo_curator-utils-gpu_utils-ensure_cudnn_loaded)                     | Discover and pre-load cuDNN from the `nvidia-cudnn-cu12` pip package. |
| [`get_gpu_count`](#nemo_curator-utils-gpu_utils-get_gpu_count)                                 | Get number of available CUDA GPUs as a power of 2.                    |
| [`get_max_model_len_from_config`](#nemo_curator-utils-gpu_utils-get_max_model_len_from_config) | Try to get max model length from HuggingFace AutoConfig.              |

### Data

[`_cudnn_loaded`](#nemo_curator-utils-gpu_utils-_cudnn_loaded)

### API

```python
nemo_curator.utils.gpu_utils.ensure_cudnn_loaded() -> bool
```

Discover and pre-load cuDNN from the `nvidia-cudnn-cu12` pip package.

ONNX Runtime relies on the system dynamic linker to locate
`libcudnn*.so` files, but pip-installed packages place them inside
the virtual-environment `site-packages` tree which is **not** on the
default library search path.

Call this function early — before any `import onnxruntime` — to make
those libraries visible to the linker.

This function is **idempotent**: repeated calls are cheap no-ops after
the first successful load.

**Returns:** `bool`

`True` if cuDNN was successfully loaded (or was already loaded),
`False` otherwise.

```python
nemo_curator.utils.gpu_utils.get_gpu_count() -> int
```

Get number of available CUDA GPUs as a power of 2.

Many models require tensor parallelism to use power-of-2 GPU counts.
This returns the largest power of 2 that is less than or equal to the number of available GPUs.

**Returns:** `int`

Power of 2 GPU count, minimum 1.

**Raises:**

* `RuntimeError`: If no CUDA GPUs are detected.
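The rounding rule above can be sketched in a few lines. The helper name `largest_power_of_two_at_most` is hypothetical, and it takes the detected GPU count as an argument instead of querying CUDA, so it shows only the arithmetic, not how `get_gpu_count` discovers devices.

```python
def largest_power_of_two_at_most(available: int) -> int:
    """Round a GPU count down to the nearest power of 2."""
    if available < 1:
        # Mirrors the documented behavior of raising when no GPU is found.
        raise RuntimeError("No CUDA GPUs detected")
    # bit_length() - 1 is the index of the highest set bit, so the
    # shift yields the largest power of 2 that does not exceed `available`.
    return 1 << (available.bit_length() - 1)
```

For example, under this rule a machine with 6 GPUs yields 4, and a machine with 8 GPUs yields 8.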

```python
nemo_curator.utils.gpu_utils.get_max_model_len_from_config(
    model: str,
    cache_dir: str | None = None
) -> int | None
```

Try to get max model length from HuggingFace AutoConfig.

**Parameters:**

* `model`: Model identifier (e.g., `"microsoft/phi-4"`).
* `cache_dir`: Optional cache directory for model config.

**Returns:** `int | None`

Max model length if found, None otherwise.
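The "found or `None`" contract can be sketched as a lookup over a config object. This page does not say which config attributes the function inspects, so the attribute names below (`max_position_embeddings`, `n_positions`) are an assumption based on names that commonly appear on Hugging Face configs, and `max_len_from_config_obj` is a hypothetical helper, not part of `gpu_utils`.

```python
from types import SimpleNamespace
from typing import Optional

# Assumed candidate attribute names; the real function may check others.
CANDIDATE_ATTRS = ("max_position_embeddings", "n_positions")


def max_len_from_config_obj(config) -> Optional[int]:
    """Return the first positive integer length found on `config`, else None."""
    for name in CANDIDATE_ATTRS:
        value = getattr(config, name, None)
        if isinstance(value, int) and value > 0:
            return value
    return None


# Stand-in objects so the sketch runs without downloading a model config.
with_length = SimpleNamespace(max_position_embeddings=16384)
without_length = SimpleNamespace(hidden_size=4096)
```

With a real model, the config object would typically come from `transformers.AutoConfig.from_pretrained(model, cache_dir=cache_dir)`. Because the return type is `int | None`, callers should handle the `None` case, for instance by falling back to a default length.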

```python
nemo_curator.utils.gpu_utils._cudnn_loaded: bool = False
```