Important

NeMo 2.0 is an experimental feature and is currently released only in the dev container: nvcr.io/nvidia/nemo:dev. Please refer to the Migration Guide for information on getting started.

Migrate Nsys Profiling from NeMo 1.0 to NeMo 2.0

In NeMo 2.0, Nsys profiling is configured with a dedicated callback instead of a section in the YAML configuration file. This guide shows how to migrate your Nsys profiling setup.

NeMo 1.0 (Previous Release)

In NeMo 1.0, Nsys profiling was configured in the YAML configuration file.

model:
    nsys_profile:
        enabled: False
        start_step: 10  # Global batch to start profiling
        end_step: 10  # Global batch to end profiling
        ranks: [0]  # Global rank IDs to profile
        gen_shape: False  # Generate model and kernel details including input shapes

NeMo 2.0 (New Release)

In NeMo 2.0, Nsys profiling is configured using the NsysCallback class. Here’s how to set it up:

from nemo import lightning as nl
from nemo.lightning.pytorch.callbacks import NsysCallback

trainer = nl.Trainer(
    ...
    callbacks=[NsysCallback(
        enabled=False,
        start_step=10,
        end_step=10,
        ranks=[0],
        gen_shape=False
    )]
)

Migration Steps

  1. Remove the nsys_profile section from your YAML config file.

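  2. Add the following imports to your Python script (the nl alias is used when creating the Trainer in the next step):

    from nemo import lightning as nl
    from nemo.lightning.pytorch.callbacks import NsysCallback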
    
  3. When creating your Trainer, add NsysCallback to the callbacks list:

    trainer = nl.Trainer(
        ...
        callbacks=[NsysCallback(
            enabled=False,
            start_step=10,
            end_step=10,
            ranks=[0],
            gen_shape=False
        )]
    )
    
  4. Set the NsysCallback arguments (enabled, start_step, end_step, ranks, gen_shape) to the values from the nsys_profile section of your previous YAML configuration.
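
If you still have the old configuration on hand, the last step can be partly automated. The following is a minimal sketch, not part of NeMo: the helper name nsys_kwargs_from_legacy_config is hypothetical, the old configuration is assumed to be already loaded into a plain Python dict, and NsysCallback is assumed to accept the same parameter names shown in this guide.

```python
from typing import Any, Dict, Mapping

# Keys of the NeMo 1.0 `model.nsys_profile` YAML section shown in this guide.
NSYS_KEYS = ("enabled", "start_step", "end_step", "ranks", "gen_shape")


def nsys_kwargs_from_legacy_config(config: Mapping[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: pull NsysCallback keyword arguments out of a
    NeMo 1.0-style config that has already been parsed into a dict."""
    section = config.get("model", {}).get("nsys_profile", {})
    # Fail loudly on keys this guide does not describe instead of dropping them.
    unknown = set(section) - set(NSYS_KEYS)
    if unknown:
        raise KeyError(f"Unrecognized nsys_profile keys: {sorted(unknown)}")
    # Keep only the keys that were actually set in the old config.
    return {key: section[key] for key in NSYS_KEYS if key in section}


# Example input mirroring the NeMo 1.0 YAML shown earlier.
legacy_config = {
    "model": {
        "nsys_profile": {
            "enabled": False,
            "start_step": 10,
            "end_step": 10,
            "ranks": [0],
            "gen_shape": False,
        }
    }
}

nsys_kwargs = nsys_kwargs_from_legacy_config(legacy_config)
print(nsys_kwargs)

# With NeMo 2.0 installed, the result could then be passed to the callback
# (assumes the parameter names match those used in this guide):
# from nemo.lightning.pytorch.callbacks import NsysCallback
# callback = NsysCallback(**nsys_kwargs)
```

Keys that were omitted from the old YAML are left out of the result, so the callback's own defaults apply for them.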