bridge.training.utils.log_utils#

Module Contents#

Functions#

warning_filter

Logging filter to exclude WARNING level messages.

module_filter

Logging filter to exclude messages from specific modules.

add_filter_to_all_loggers

Add a filter to the root logger and all existing loggers.

setup_logging

Set up logging level and filters for the application.

append_to_progress_log

Append a formatted string to the progress log file (rank 0 only).

barrier_and_log

Perform a distributed barrier and then log a message on rank 0.

log_single_rank

If torch distributed is initialized, log only on the specified rank.

Data#

API#

bridge.training.utils.log_utils.logger#

‘getLogger(…)’

bridge.training.utils.log_utils.warning_filter(record: logging.LogRecord) → bool#

Logging filter to exclude WARNING level messages.

Parameters:

record – The logging record to check.

Returns:

False if the record level is WARNING, True otherwise.
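
For reference, a minimal usage sketch: Python's logging module accepts plain callables as filters, so warning_filter can be attached to a logger (or handler) directly.

```python
import logging

from bridge.training.utils.log_utils import warning_filter

logger = logging.getLogger("example")
logger.addFilter(warning_filter)  # callables are valid logging filters

logger.warning("suppressed: WARNING records are filtered out")
logger.error("emitted: non-WARNING records pass through")
```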

bridge.training.utils.log_utils.module_filter(
record: logging.LogRecord,
modules_to_filter: list[str],
) → bool#

Logging filter to exclude messages from specific modules.

Parameters:
  • record – The logging record to check.

  • modules_to_filter – A list of module name prefixes to filter out.

Returns:

False if the record’s logger name starts with any of the specified module prefixes, True otherwise.
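
Because module_filter takes the prefix list as a second argument, it cannot be attached to a logger as-is; one approach (a sketch, with a hypothetical module prefix) is to bind the list with functools.partial:

```python
import functools
import logging

from bridge.training.utils.log_utils import module_filter

# Bind the prefix list so the result has the one-argument signature that
# logging expects; "urllib3" is an illustrative prefix, not a requirement.
quiet_filter = functools.partial(module_filter, modules_to_filter=["urllib3"])
logging.getLogger().addFilter(quiet_filter)
```

Note that a filter attached to the root logger only applies to records logged directly on it; use add_filter_to_all_loggers (below) to cover existing loggers as well.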

bridge.training.utils.log_utils.add_filter_to_all_loggers(
filter: Union[logging.Filter, Callable[[logging.LogRecord], bool]],
) → None#

Add a filter to the root logger and all existing loggers.

Parameters:

filter – A logging filter instance or callable to add.
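
A short sketch combining this helper with warning_filter from above:

```python
from bridge.training.utils.log_utils import add_filter_to_all_loggers, warning_filter

# Attach the WARNING-suppressing filter to the root logger and to every
# logger that exists at this point, including third-party ones.
add_filter_to_all_loggers(warning_filter)
```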

bridge.training.utils.log_utils.setup_logging(
logging_level: int = logging.INFO,
filter_warning: bool = True,
modules_to_filter: Optional[list[str]] = None,
set_level_for_all_loggers: bool = False,
) → None#

Set up logging level and filters for the application.

Configures the logging level based on arguments, environment variables, or defaults. Optionally adds filters to suppress warnings or messages from specific modules.

Logging Level Precedence:

  1. Env var MEGATRON_BRIDGE_LOGGING_LEVEL

  2. logging_level argument

  3. Default: logging.INFO

Parameters:
  • logging_level – The desired logging level (e.g., logging.INFO, logging.DEBUG).

  • filter_warning – If True, adds a filter to suppress WARNING level messages.

  • modules_to_filter – An optional list of module name prefixes to filter out.

  • set_level_for_all_loggers – If True, sets the logging level for all existing loggers. If False (default), only sets the level for the root logger and loggers starting with ‘nemo’.
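
A typical call might look like the sketch below; the filtered module prefix is hypothetical.

```python
import logging

from bridge.training.utils.log_utils import setup_logging

# If MEGATRON_BRIDGE_LOGGING_LEVEL is set in the environment, it overrides
# the logging_level argument (see the precedence list above).
setup_logging(
    logging_level=logging.DEBUG,    # used when the env var is unset
    filter_warning=True,            # suppress WARNING records
    modules_to_filter=["urllib3"],  # hypothetical noisy module prefix
    set_level_for_all_loggers=False,
)
```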

bridge.training.utils.log_utils.append_to_progress_log(
save_dir: str,
string: str,
barrier: bool = True,
) → None#

Append a formatted string to the progress log file (rank 0 only).

Includes timestamp, job ID, and number of GPUs in the log entry.

Parameters:
  • save_dir – The directory where the ‘progress.txt’ file is located.

  • string – The message string to append.

  • barrier – If True, performs a distributed barrier before writing (rank 0 only).
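
A usage sketch; the save directory is illustrative. The line written to progress.txt carries the timestamp, job ID, and GPU count described above in addition to the message.

```python
from bridge.training.utils.log_utils import append_to_progress_log

# Rank 0 appends a formatted entry to <save_dir>/progress.txt; with
# barrier=True, ranks synchronize before the write happens.
append_to_progress_log(
    save_dir="/path/to/checkpoints",  # illustrative path
    string="Starting training",
    barrier=True,
)
```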

bridge.training.utils.log_utils.barrier_and_log(string: str) → None#

Perform a distributed barrier and then log a message on rank 0.

Parameters:

string – The message string to log.
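
For example, to mark a synchronization point in training (a sketch assuming torch distributed is initialized):

```python
from bridge.training.utils.log_utils import barrier_and_log

# Every rank blocks on the barrier; the message is then logged on rank 0.
barrier_and_log("validation loop finished")
```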

bridge.training.utils.log_utils.log_single_rank(
logger: logging.Logger,
*args: Any,
rank: int = 0,
**kwargs: Any,
)#

If torch distributed is initialized, log only on the specified rank; otherwise, log on all ranks.

Parameters:
  • logger (logging.Logger) – The logger to write the logs to.

  • args (Tuple[Any]) – All logging.Logger.log positional arguments

  • rank (int, optional) – The rank to write on. Defaults to 0.

  • kwargs (Dict[str, Any]) – All logging.Logger.log keyword arguments
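
Since the positional arguments are forwarded to logging.Logger.log, the level comes first, then the message. A minimal sketch:

```python
import logging

from bridge.training.utils.log_utils import log_single_rank

logger = logging.getLogger(__name__)

# Logs only on rank 0 when torch distributed is initialized; the
# arguments after the logger are passed through to logging.Logger.log.
log_single_rank(logger, logging.INFO, "checkpoint loaded")

# Target a different rank explicitly:
log_single_rank(logger, logging.INFO, "rank-specific message", rank=1)
```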