nemo_curator.metrics.utils

Module Contents

Functions

| Name | Description |
| --- | --- |
| `_get_all_discovery_paths` | Extract all file paths from all file_sd_configs entries in the Prometheus config. |
| `_is_process_running_from_pidfile` | Check if a process is running by reading its PID from a file and verifying it’s alive. |
| `_resolve_metrics_dir` | Resolve the metrics directory, defaulting to DEFAULT_NEMO_CURATOR_METRICS_PATH. |
| `_write_ray_default_dashboards` | Generate and write Ray’s default Grafana dashboards to the dashboards directory. |
| `add_ray_prometheus_metrics_service_discovery` | Add the Ray Prometheus metrics service discovery to the Prometheus config. |
| `download_and_extract_prometheus` | Download the Prometheus tarball and extract it to the metrics directory. |
| `download_grafana` | Download the Grafana tarball and extract it to the metrics directory. |
| `get_prometheus_port` | Get the port number that Prometheus is running on by reading the port file. |
| `is_grafana_running` | Check if Grafana is currently running for this metrics instance. |
| `is_prometheus_running` | Check if Prometheus is currently running for this metrics instance. |
| `launch_grafana` | Launch the Grafana server. |
| `remove_ray_prometheus_metrics_service_discovery` | Remove the Ray Prometheus metrics service discovery from the Prometheus config. |
| `run_prometheus` | Run the Prometheus server. |
| `write_grafana_configs` | Write the Grafana configs to the Grafana directory. |

API

nemo_curator.metrics.utils._get_all_discovery_paths(
prometheus_config: dict
) -> list[str]

Extract all file paths from all file_sd_configs entries in the Prometheus config.
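The extraction described here can be illustrated with a minimal sketch. It assumes the config is the dict form of a standard Prometheus YAML file, where each `scrape_configs` entry may carry a `file_sd_configs` list whose items hold a `files` list. The name `collect_discovery_paths` is hypothetical, and this is not the library's implementation.

```python
def collect_discovery_paths(prometheus_config: dict) -> list:
    """Gather every path listed under any file_sd_configs entry (illustrative only)."""
    paths = []
    # Assumed layout: scrape_configs -> file_sd_configs -> files
    for scrape in prometheus_config.get("scrape_configs", []):
        for sd_entry in scrape.get("file_sd_configs", []):
            paths.extend(sd_entry.get("files", []))
    return paths


example_config = {
    "scrape_configs": [
        {"job_name": "ray", "file_sd_configs": [{"files": ["/tmp/a.json", "/tmp/b.json"]}]},
        {"job_name": "static_only", "static_configs": [{"targets": ["localhost:9090"]}]},
    ]
}
discovered = collect_discovery_paths(example_config)
```

Jobs without `file_sd_configs` contribute nothing, so the result preserves only file-based discovery paths in config order.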

nemo_curator.metrics.utils._is_process_running_from_pidfile(
pid_file_path: str
) -> bool

Check if a process is running by reading its PID from a file and verifying it’s alive.
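A PID-file liveness check of this kind is commonly written with signal 0, which tests for a process without delivering a signal. The sketch below assumes a POSIX system and a PID file containing a single integer; the name `pid_from_file_is_alive` is hypothetical, and the actual function may handle edge cases differently.

```python
import os


def pid_from_file_is_alive(pid_file_path: str) -> bool:
    """Return True if the PID stored in the file refers to a live process (POSIX only)."""
    try:
        with open(pid_file_path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed PID file: treat as not running.
        return False
    if pid <= 0:
        # Non-positive PIDs address process groups in os.kill; never probe them.
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

A stale PID file can still point at an unrelated process that reused the PID, so this check is a heuristic rather than a guarantee.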

nemo_curator.metrics.utils._resolve_metrics_dir(
metrics_dir: str | None
) -> str

Resolve the metrics directory, defaulting to DEFAULT_NEMO_CURATOR_METRICS_PATH.
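The defaulting behavior can be sketched as follows. The value assigned to `DEFAULT_METRICS_PATH` here is a placeholder chosen for illustration, not the real value of DEFAULT_NEMO_CURATOR_METRICS_PATH, and `resolve_dir` is a hypothetical name.

```python
import os
import tempfile
from typing import Optional

# Placeholder default for illustration only; the library defines its own constant.
DEFAULT_METRICS_PATH = os.path.join(tempfile.gettempdir(), "example_metrics")


def resolve_dir(metrics_dir: Optional[str]) -> str:
    """Return the caller's directory, or the default when none is given."""
    return metrics_dir if metrics_dir is not None else DEFAULT_METRICS_PATH
```

This is why the public functions in this module can accept `metrics_dir=None`: each call resolves to the same default location, so separate calls agree on where files live.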

nemo_curator.metrics.utils._write_ray_default_dashboards(
dashboards_path: str
) -> None

Generate and write Ray’s default Grafana dashboards to the dashboards directory.

nemo_curator.metrics.utils.add_ray_prometheus_metrics_service_discovery(
ray_temp_dir: str,
metrics_dir: str | None = None
) -> None

Add the Ray Prometheus metrics service discovery to the Prometheus config.

nemo_curator.metrics.utils.download_and_extract_prometheus(
metrics_dir: str | None = None,
os_type = None,
architecture = None,
prometheus_version = None
) -> str

Download the Prometheus tarball and extract it to the metrics directory.

nemo_curator.metrics.utils.download_grafana(
metrics_dir: str | None = None
) -> str

Download the Grafana tarball and extract it to the metrics directory.

nemo_curator.metrics.utils.get_prometheus_port(
metrics_dir: str | None = None
) -> int

Get the port number that Prometheus is running on by reading the port file.
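Reading a port from a small text file could look like the sketch below. The file name `prometheus.port` and the function name `read_port_file` are assumptions made for illustration; the source does not state what the port file is called or where it sits inside the metrics directory.

```python
import os

PORT_FILE_NAME = "prometheus.port"  # hypothetical file name, not confirmed by the source


def read_port_file(metrics_dir: str) -> int:
    """Read and validate a TCP port stored as text in the metrics directory."""
    port_file = os.path.join(metrics_dir, PORT_FILE_NAME)
    with open(port_file) as f:
        port = int(f.read().strip())
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port
```

In this sketch a missing file raises `FileNotFoundError`; whether the real function raises or falls back to a default is not documented here.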

nemo_curator.metrics.utils.is_grafana_running(
metrics_dir: str | None = None
) -> bool

Check if Grafana is currently running for this metrics instance.

nemo_curator.metrics.utils.is_prometheus_running(
metrics_dir: str | None = None
) -> bool

Check if Prometheus is currently running for this metrics instance.

nemo_curator.metrics.utils.launch_grafana(
grafana_dir: str,
grafana_ini_path: str,
grafana_web_port: int,
metrics_dir: str | None = None
) -> None

Launch the Grafana server.

nemo_curator.metrics.utils.remove_ray_prometheus_metrics_service_discovery(
ray_temp_dir: str,
metrics_dir: str | None = None
) -> None

Remove the Ray Prometheus metrics service discovery from the Prometheus config.
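Adding and removing a discovery entry are inverse edits on the same config structure, which the following sketch illustrates on an in-memory dict. It assumes the discovery file is the `prom_metrics_service_discovery.json` that Ray writes into its temp directory and that the entry lives under a scrape job named `ray`; both are assumptions, as are the function names. The real functions also read and write the config on disk, which is omitted here.

```python
import os

SD_FILE_NAME = "prom_metrics_service_discovery.json"  # assumed: file Ray writes in its temp dir


def add_discovery_path(config: dict, ray_temp_dir: str, job_name: str = "ray") -> None:
    """Register the discovery file under the given job, without duplicating it."""
    path = os.path.join(ray_temp_dir, SD_FILE_NAME)
    for scrape in config.setdefault("scrape_configs", []):
        if scrape.get("job_name") == job_name:
            sd = scrape.setdefault("file_sd_configs", [])
            if not any(path in entry.get("files", []) for entry in sd):
                sd.append({"files": [path]})
            return
    # No matching job yet: create one that holds only this discovery file.
    config["scrape_configs"].append(
        {"job_name": job_name, "file_sd_configs": [{"files": [path]}]}
    )


def remove_discovery_path(config: dict, ray_temp_dir: str) -> None:
    """Drop the discovery file from every job, pruning entries left empty."""
    path = os.path.join(ray_temp_dir, SD_FILE_NAME)
    for scrape in config.get("scrape_configs", []):
        if "file_sd_configs" not in scrape:
            continue
        kept = []
        for entry in scrape["file_sd_configs"]:
            files = [f for f in entry.get("files", []) if f != path]
            if files:
                kept.append({**entry, "files": files})
        scrape["file_sd_configs"] = kept
```

Keying both operations on `ray_temp_dir` lets several Ray clusters share one Prometheus config: each cluster adds and later removes only its own discovery path.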

nemo_curator.metrics.utils.run_prometheus(
prometheus_dir: str,
prometheus_web_port: int,
metrics_dir: str | None = None
) -> None

Run the Prometheus server.

nemo_curator.metrics.utils.write_grafana_configs(
grafana_web_port: int,
prometheus_web_port: int,
metrics_dir: str | None = None
) -> str

Write the Grafana configs to the Grafana directory.