nemo_rl.models.generation.sglang.sglang_copied_utils#

Standalone utility functions copied from the SGLang project.

This module contains utility functions that were originally part of the SGLang repository (https://github.com/sgl-project/sglang). They have been copied here to avoid requiring sglang as a runtime dependency for weight refitting functionality.

IMPORTANT: This module should NOT contain any imports from the sglang package. All functions are standalone and self-contained.

Each function includes a permalink to its original source in the SGLang repository. These functions were copied from sglang version 0.5.2.

Module Contents#

Classes#

MultiprocessingSerializer

Serialize/deserialize Python objects using ForkingPickler for IPC.

Functions#

monkey_patch_torch_reductions

Monkey patch torch multiprocessing reductions to use GPU UUIDs.

_reduce_tensor_modified

Modified reduce_tensor that stores GPU UUID instead of device index.

_rebuild_cuda_tensor_modified

Modified rebuild_cuda_tensor that accepts GPU UUID or device index.

_device_to_uuid

Convert a device index to its UUID string.

_device_from_maybe_uuid

Convert a device UUID string or index to a device index.

_modify_tuple

Create a new tuple with one element modified by a function.

Data#

_REDUCE_TENSOR_ARG_DEVICE_INDEX

API#

class nemo_rl.models.generation.sglang.sglang_copied_utils.MultiprocessingSerializer#

Serialize/deserialize Python objects using ForkingPickler for IPC.

This class enables serialization of objects (including CUDA tensors with IPC handles) for transfer between processes via HTTP or other mechanisms.

Original source (sglang v0.5.2): https://github.com/sgl-project/sglang/blob/v0.5.2/python/sglang/srt/utils.py#L589-L623

static serialize(obj, output_str: bool = False)#

Serialize a Python object using ForkingPickler.

Parameters:
  • obj – The object to serialize.

  • output_str (bool) – If True, return a base64-encoded string instead of raw bytes.

Returns:

The serialized object.

Return type:

bytes or str

static deserialize(data)#

Deserialize a previously serialized object.

Parameters:

data (bytes or str) – The serialized data, optionally base64-encoded.

Returns:

The deserialized Python object.
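The round trip described above can be sketched with the standard library alone. This is a minimal illustration, not the module's actual code: the free functions `serialize` and `deserialize` are stand-ins for the static methods, and the use of the stdlib `base64` module is an assumption. A plain dictionary is used here so the sketch runs without a GPU.

```python
import base64
import io
from multiprocessing.reduction import ForkingPickler


def serialize(obj, output_str: bool = False):
    """Pickle obj with ForkingPickler; optionally return a base64 string."""
    buf = io.BytesIO()
    ForkingPickler(buf).dump(obj)
    payload = buf.getvalue()
    if output_str:
        # Text form is convenient for JSON bodies sent over HTTP.
        return base64.b64encode(payload).decode("utf-8")
    return payload


def deserialize(data):
    """Inverse of serialize: accepts raw bytes or a base64-encoded string."""
    if isinstance(data, str):
        data = base64.b64decode(data, validate=True)
    return ForkingPickler.loads(data)


payload = {"step": 3, "names": ["w1", "w2"]}
as_bytes = serialize(payload)
as_text = serialize(payload, output_str=True)
restored_from_bytes = deserialize(as_bytes)
restored_from_text = deserialize(as_text)
```

Both restored objects compare equal to `payload`. As with any pickle-based format, data should only be deserialized when it comes from a trusted process.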

nemo_rl.models.generation.sglang.sglang_copied_utils.monkey_patch_torch_reductions()#

Monkey patch torch multiprocessing reductions to use GPU UUIDs.

This patch modifies PyTorch’s CUDA tensor IPC mechanism to use GPU UUIDs instead of device indices. This enables proper weight transfer between processes that may have different CUDA_VISIBLE_DEVICES configurations.

The patch is idempotent: calling it multiple times is safe.

This is a workaround until PyTorch pull request https://github.com/pytorch/pytorch/pull/149248 is merged and released.

Original source (sglang v0.5.2): https://github.com/sgl-project/sglang/blob/v0.5.2/python/sglang/srt/patch_torch.py#L20-L33
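One common way to make a monkey patch idempotent is to save the original function under a private attribute and return early if that attribute already exists. The sketch below shows the pattern on a hypothetical stand-in namespace rather than on `torch.multiprocessing.reductions`. The names `reductions`, `patch_reductions`, and `_reduce_tensor_original` are illustrative assumptions, not a statement of how this module is implemented.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the module being patched (not real torch).
reductions = SimpleNamespace(reduce_tensor=lambda tensor: ("original", tensor))


def reduce_tensor_modified(tensor):
    # Delegate to the saved original, then post-process its result.
    _tag, value = reductions._reduce_tensor_original(tensor)
    return ("patched", value)


def patch_reductions():
    # Guard: a saved original means the patch was already applied.
    if hasattr(reductions, "_reduce_tensor_original"):
        return
    reductions._reduce_tensor_original = reductions.reduce_tensor
    reductions.reduce_tensor = reduce_tensor_modified


patch_reductions()
patch_reductions()  # second call is a no-op thanks to the guard
result = reductions.reduce_tensor("t")
```

Without the guard, the second call would save the already-patched function as the "original", and the wrapper would then call itself recursively.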

nemo_rl.models.generation.sglang.sglang_copied_utils._REDUCE_TENSOR_ARG_DEVICE_INDEX#

6

nemo_rl.models.generation.sglang.sglang_copied_utils._reduce_tensor_modified(*args, **kwargs)#

Modified reduce_tensor that stores GPU UUID instead of device index.

Original source (sglang v0.5.2): https://github.com/sgl-project/sglang/blob/v0.5.2/python/sglang/srt/patch_torch.py#L39-L43

nemo_rl.models.generation.sglang.sglang_copied_utils._rebuild_cuda_tensor_modified(*args)#

Modified rebuild_cuda_tensor that accepts GPU UUID or device index.

Original source (sglang v0.5.2): https://github.com/sgl-project/sglang/blob/v0.5.2/python/sglang/srt/patch_torch.py#L46-L48

nemo_rl.models.generation.sglang.sglang_copied_utils._device_to_uuid(device: int) → str#

Convert a device index to its UUID string.

Original source (sglang v0.5.2): https://github.com/sgl-project/sglang/blob/v0.5.2/python/sglang/srt/patch_torch.py#L51-L52

nemo_rl.models.generation.sglang.sglang_copied_utils._device_from_maybe_uuid(device_maybe_uuid: Union[int, str]) → int#

Convert a device UUID string or index to a device index.

Parameters:

device_maybe_uuid – Either an integer device index or a UUID string.

Returns:

The integer device index.

Raises:

Exception – If the UUID doesn’t match any available device.

Original source (sglang v0.5.2): https://github.com/sgl-project/sglang/blob/v0.5.2/python/sglang/srt/patch_torch.py#L55-L65
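To illustrate why resolving by UUID matters, the sketch below replaces the CUDA device query with an explicit list of UUID strings, so it runs without a GPU. The function `device_from_maybe_uuid`, its `visible_uuids` parameter, and the UUID values are hypothetical. The real function takes a single argument and presumably queries the visible CUDA devices itself.

```python
from typing import Sequence, Union


def device_from_maybe_uuid(
    device_maybe_uuid: Union[int, str], visible_uuids: Sequence[str]
) -> int:
    """Resolve a device index from an index or a UUID string."""
    if isinstance(device_maybe_uuid, int):
        # Already an index: pass it through unchanged.
        return device_maybe_uuid
    if isinstance(device_maybe_uuid, str):
        for index, uuid in enumerate(visible_uuids):
            if uuid == device_maybe_uuid:
                return index
        raise Exception("Invalid device_uuid=" + device_maybe_uuid)
    raise Exception(f"Unknown type: {device_maybe_uuid!r}")


# Two processes see the same two GPUs in a different order (made-up UUIDs).
sender_uuids = ["GPU-aaaa", "GPU-bbbb"]
receiver_uuids = ["GPU-bbbb", "GPU-aaaa"]

sent = sender_uuids[0]  # the sender's device 0
receiver_index = device_from_maybe_uuid(sent, receiver_uuids)
```

Here the sender's device 0 is the receiver's device 1. Sending the raw index 0 would have pointed the receiver at the wrong GPU.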

nemo_rl.models.generation.sglang.sglang_copied_utils._modify_tuple(t, index: int, modifier: Callable)#

Create a new tuple with one element modified by a function.

Original source (sglang v0.5.2): https://github.com/sgl-project/sglang/blob/v0.5.2/python/sglang/srt/patch_torch.py#L68-L69
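A minimal sketch of this kind of helper, assuming it rebuilds the tuple by unpacking the slices on either side of the target index. The argument tuple and the UUID-producing lambda are made up for illustration. Index 6 mirrors the `_REDUCE_TENSOR_ARG_DEVICE_INDEX` value documented above.

```python
from typing import Callable

ARG_DEVICE_INDEX = 6  # mirrors the documented _REDUCE_TENSOR_ARG_DEVICE_INDEX


def modify_tuple(t: tuple, index: int, modifier: Callable) -> tuple:
    """Return a copy of t with t[index] replaced by modifier(t[index])."""
    return (*t[:index], modifier(t[index]), *t[index + 1:])


# Hypothetical argument tuple; position 6 holds a device index.
args = ("a", "b", "c", "d", "e", "f", 0, "h")
patched = modify_tuple(args, ARG_DEVICE_INDEX, lambda idx: f"uuid-for-{idx}")
```

Because tuples are immutable, `args` is left untouched and `patched` is a new tuple that differs only at position 6.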