Modulus Launch Logging

Launch Logger

class modulus.launch.logging.launch.LaunchLogger(name_space, *args, **kwargs)[source]

Bases: object

Modulus Launch logger

An abstracted logger class that handles several fundamental logging functions. Initialize the class once, then use it through a context manager, which automatically computes epoch metrics. This is the standard logger for Modulus examples.

Parameters

Example

>>> from modulus.launch.logging import LaunchLogger
>>> LaunchLogger.initialize()
>>> epochs = 3
>>> for i in range(epochs):
...   with LaunchLogger("Train", epoch=i) as log:
...     # Log 3 mini-batches manually
...     log.log_minibatch({"loss": 1.0})
...     log.log_minibatch({"loss": 2.0})
...     log.log_minibatch({"loss": 3.0})

static initialize(use_wandb: bool = False, use_mlflow: bool = False)[source]

Initialize logging singleton

Parameters

log_epoch(losses: Dict[str, float])[source]

Logs metrics for a single epoch

Parameters

log_minibatch(losses: Dict[str, float])[source]

Logs metrics for a single mini-batch

This function should be called every mini-batch iteration. It accumulates loss values over a datapipe. At the end of an epoch, the average of the losses from each mini-batch is calculated.
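The accumulate-then-average behaviour described above can be illustrated with a small standalone sketch. It is not the LaunchLogger implementation: the class name `EpochAverager` and its `epoch_average` method are hypothetical, and the sketch assumes every mini-batch reports the same loss names.

```python
from collections import defaultdict
from typing import Dict


class EpochAverager:
    """Hypothetical stand-in for the accumulation described above."""

    def __init__(self) -> None:
        self._totals: Dict[str, float] = defaultdict(float)
        self._count = 0

    def log_minibatch(self, losses: Dict[str, float]) -> None:
        # Add each loss to its running total and count the mini-batch.
        for name, value in losses.items():
            self._totals[name] += float(value)
        self._count += 1

    def epoch_average(self) -> Dict[str, float]:
        # Mean of each loss over the mini-batches seen so far.
        if self._count == 0:
            return {}
        return {name: total / self._count for name, total in self._totals.items()}


averager = EpochAverager()
for loss in (1.0, 2.0, 3.0):
    averager.log_minibatch({"loss": loss})
epoch_metrics = averager.epoch_average()
print(epoch_metrics)  # {'loss': 2.0}
```

With the three mini-batch losses from the class example (1.0, 2.0, 3.0), the epoch average comes out as 2.0.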

Parameters

classmethod toggle_mlflow(value: bool)[source]

Toggle MLFlow logging

Parameters

classmethod toggle_wandb(value: bool)[source]

Toggle WandB logging

Parameters

Console Logger

class modulus.launch.logging.console.PythonLogger(name: str = 'launch')[source]

Bases: object

Simple console logger for DL training. This is a WIP.

error(message: str)[source]

file_logging(file_name: str = 'launch.log')[source]

info(message: str)[source]

log(message: str)[source]

success(message: str)[source]

warning(message: str)[source]

class modulus.launch.logging.console.RankZeroLoggingWrapper(obj, dist)[source]
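This page gives no description for RankZeroLoggingWrapper. Going only by its name and its `(obj, dist)` signature, one plausible reading is a wrapper that forwards method calls to `obj` only on the process whose rank is 0. The sketch below illustrates that pattern under this assumption; the names `RankZeroOnly` and `ListLogger`, and the use of a `rank` attribute on `dist`, are hypothetical and not taken from the library.

```python
from types import SimpleNamespace


class RankZeroOnly:
    """Hypothetical wrapper: forwards method calls only when dist.rank == 0."""

    def __init__(self, obj, dist):
        self.obj = obj
        self.dist = dist

    def __getattr__(self, name):
        # Only reached for attributes not defined on the wrapper itself.
        attr = getattr(self.obj, name)
        if not callable(attr):
            return attr

        def rank_zero_call(*args, **kwargs):
            # Run the wrapped method on rank 0, do nothing elsewhere.
            if self.dist.rank == 0:
                return attr(*args, **kwargs)
            return None

        return rank_zero_call


class ListLogger:
    """Toy logger that stores messages so the effect is easy to inspect."""

    def __init__(self):
        self.messages = []

    def info(self, message):
        self.messages.append(message)


rank0_logger = ListLogger()
rank1_logger = ListLogger()
RankZeroOnly(rank0_logger, SimpleNamespace(rank=0)).info("epoch done")
RankZeroOnly(rank1_logger, SimpleNamespace(rank=1)).info("epoch done")
```

Only the logger wrapped with rank 0 records the message, which is the usual way to avoid duplicated console output in multi-process training.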

MLflow Logger

modulus.launch.logging.mlflow.check_mlflow_logged_in(client: MlflowClient)[source]

modulus.launch.logging.mlflow.initialize_mlflow(experiment_name: str, experiment_desc: Optional[str] = None, run_name: Optional[str] = None, run_desc: Optional[str] = None, user_name: Optional[str] = None, mode: Literal['offline', 'online', 'ngc'] = 'offline', tracking_location: Optional[str] = None, artifact_location: Optional[str] = None) → Tuple[MlflowClient, Run][source]

Initializes MLFlow logging client and run.

Parameters

Note

For NGC mode, one needs to mount an NGC workspace / folder system with a metric folder at /mlflow/mlflow_metrics/ and an artifact folder at /mlflow/mlflow_artifacts/.

Note

This will set up Modulus Launch logger for MLFlow logging. Only one MLFlow logging client is supported with the Modulus Launch logger.
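A minimal usage sketch based on the signatures documented on this page. The experiment and run names are placeholder values, `setup_mlflow_logging` is a hypothetical helper, and calling `LaunchLogger.initialize(use_mlflow=True)` after `initialize_mlflow` is an assumption about the intended order rather than something this page states.

```python
# Keyword arguments taken from the initialize_mlflow signature documented above.
# The experiment and run names are placeholder values.
MLFLOW_KWARGS = {
    "experiment_name": "example_experiment",
    "run_name": "example_run",
    "mode": "offline",
}


def setup_mlflow_logging():
    """Hypothetical helper: start MLflow logging and enable it in LaunchLogger."""
    # Imported lazily so this file can be loaded without Modulus Launch installed.
    from modulus.launch.logging import LaunchLogger
    from modulus.launch.logging.mlflow import initialize_mlflow

    # Per the documented return type, this yields the MLflow client and run.
    client, run = initialize_mlflow(**MLFLOW_KWARGS)
    # Assumed order: enable MLflow in the launch logger once the client exists.
    LaunchLogger.initialize(use_mlflow=True)
    return client, run
```

Calling `setup_mlflow_logging()` once at the start of a training script should be enough, since only one MLflow client is supported.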

Returns
Return type

Weights and Biases Logger

Weights and Biases Routines and Utilities

modulus.launch.logging.wandb.alert(title, text, duration=300, level=0, is_master=True)[source]

modulus.launch.logging.wandb.initialize_wandb(project: str, entity: str, name: str = 'train', group: Optional[str] = None, sync_tensorboard: bool = False, save_code: bool = False, resume: Optional[str] = None, config=None, mode: Literal['offline', 'online', 'disabled'] = 'offline', results_dir: Optional[str] = None)[source]

Function to initialize wandb client with the weights and biases server.

Parameters

modulus.launch.logging.wandb.is_wandb_initialized()[source]

Logging utils

modulus.launch.logging.utils.create_ddp_group_tag(group_name: Optional[str] = None) → str[source]

Creates a common group tag for logging

For some reason this does not work with multi-node runs. There seems to be a bug in PyTorch when a distributed util is used before DDP.
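This page does not say how the tag is built. As an illustration only, a common way to give all processes of one run the same tag is to append a timestamp to the group name, generate it on one process, and share it with the others. The single-process sketch below shows just the formatting step; the function name `make_group_tag`, the `"DDP_Group"` default, and the timestamp format are all assumptions.

```python
from datetime import datetime
from typing import Optional


def make_group_tag(group_name: Optional[str] = None, now: Optional[datetime] = None) -> str:
    """Hypothetical single-process sketch of a timestamped group tag."""
    if group_name is None:
        group_name = "DDP_Group"  # assumed default, not taken from the library
    if now is None:
        now = datetime.now()
    # Month-day-year-hour-minute-second, joined to the group name.
    return f"{group_name}_{now.strftime('%m-%d-%y-%H-%M-%S')}"


tag = make_group_tag("train", datetime(2024, 1, 2, 3, 4, 5))
print(tag)  # train_01-02-24-03-04-05
```

In a distributed run, the timestamp would need to come from a single rank and be communicated to the rest, otherwise clock differences could produce different tags per process.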

Parameters
Returns
Return type