Copyright © 2026, NVIDIA Corporation.

nemo_curator.metrics.utils


Module Contents

Functions

| Name | Description |
| --- | --- |
| _get_all_discovery_paths | Extract all file paths from all file_sd_configs entries in the Prometheus config. |
| _is_process_running_from_pidfile | Check if a process is running by reading its PID from a file and verifying it’s alive. |
| _resolve_metrics_dir | Resolve the metrics directory, defaulting to DEFAULT_NEMO_CURATOR_METRICS_PATH. |
| _write_ray_default_dashboards | Generate and write Ray’s default Grafana dashboards to the dashboards directory. |
| add_ray_prometheus_metrics_service_discovery | Add the Ray Prometheus metrics service discovery to the Prometheus config. |
| download_and_extract_prometheus | Download the Prometheus tarball and extract it to the metrics directory. |
| download_grafana | Download the Grafana tarball and extract it to the metrics directory. |
| get_prometheus_port | Get the port number that Prometheus is running on by reading the port file. |
| is_grafana_running | Check if Grafana is currently running for this metrics instance. |
| is_prometheus_running | Check if Prometheus is currently running for this metrics instance. |
| launch_grafana | Launch the Grafana server. |
| remove_ray_prometheus_metrics_service_discovery | Remove the Ray Prometheus metrics service discovery from the Prometheus config. |
| run_prometheus | Run the Prometheus server. |
| write_grafana_configs | Write the Grafana configs to the Grafana directory. |

API

nemo_curator.metrics.utils._get_all_discovery_paths(
prometheus_config: dict
) -> list[str]

Extract all file paths from all file_sd_configs entries in the Prometheus config.
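
The extraction described above can be illustrated with a minimal sketch. It assumes the config dict mirrors the standard Prometheus YAML layout, where each entry under `scrape_configs` may carry a `file_sd_configs` list whose items have a `files` list. The name `collect_discovery_paths` and the sample config are hypothetical and are not taken from the library's source.

```python
def collect_discovery_paths(prometheus_config: dict) -> list[str]:
    """Gather every path listed under `files` in any `file_sd_configs` entry.

    Hypothetical stand-in for illustration; not the library's implementation.
    """
    paths: list[str] = []
    for scrape_config in prometheus_config.get("scrape_configs", []):
        # Scrape jobs without file-based discovery simply contribute nothing.
        for sd_config in scrape_config.get("file_sd_configs", []):
            paths.extend(sd_config.get("files", []))
    return paths


# Sample config (assumed shape): one job with file-based discovery, one without.
config = {
    "scrape_configs": [
        {
            "job_name": "ray",
            "file_sd_configs": [
                {"files": ["/tmp/ray/prom_metrics_service_discovery.json"]}
            ],
        },
        {"job_name": "static", "static_configs": [{"targets": ["localhost:9090"]}]},
    ]
}
discovery_paths = collect_discovery_paths(config)
print(discovery_paths)
```

Using `.get(..., [])` at each level keeps the sketch tolerant of configs that omit a section, which seemed a reasonable default for a helper of this kind.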

nemo_curator.metrics.utils._is_process_running_from_pidfile(
pid_file_path: str
) -> bool

Check if a process is running by reading its PID from a file and verifying it’s alive.
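
A common way to implement this kind of check on POSIX systems is to send signal 0 to the PID, which tests for existence without affecting the process. The sketch below assumes a POSIX platform and a PID file containing a single integer; `is_running_from_pidfile` is a hypothetical name, and the library's actual logic may differ.

```python
import os
import tempfile


def is_running_from_pidfile(pid_file_path: str) -> bool:
    """Return True if the PID stored in the file belongs to a live process.

    Hypothetical sketch for POSIX systems; not the library's implementation.
    """
    try:
        with open(pid_file_path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed PID file: treat as not running.
        return False
    if pid <= 0:
        # Non-positive PIDs address process groups, not a single process.
        return False
    try:
        os.kill(pid, 0)  # Signal 0 performs the existence check only.
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


# Demonstration: the current process is certainly alive.
with tempfile.TemporaryDirectory() as tmp:
    pid_file = os.path.join(tmp, "demo.pid")
    with open(pid_file, "w") as f:
        f.write(str(os.getpid()))
    own_process_alive = is_running_from_pidfile(pid_file)
    missing_file_alive = is_running_from_pidfile(os.path.join(tmp, "absent.pid"))
```

A stale PID file can point at an unrelated process that reused the PID, so a check like this is best read as "probably running" rather than a guarantee.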

nemo_curator.metrics.utils._resolve_metrics_dir(
metrics_dir: str | None
) -> str

Resolve the metrics directory, defaulting to DEFAULT_NEMO_CURATOR_METRICS_PATH.

nemo_curator.metrics.utils._write_ray_default_dashboards(
dashboards_path: str
) -> None

Generate and write Ray’s default Grafana dashboards to the dashboards directory.

nemo_curator.metrics.utils.add_ray_prometheus_metrics_service_discovery(
ray_temp_dir: str,
metrics_dir: str | None = None
) -> None

Add the Ray Prometheus metrics service discovery to the Prometheus config.
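
As a rough illustration of what adding service discovery can involve, the sketch below works on an in-memory config dict rather than on the config file under the metrics directory. It assumes Ray's service discovery file is named `prom_metrics_service_discovery.json` inside the Ray temp directory, and that the entry is attached to the first scrape job. The function name `add_ray_discovery` and the `"ray"` job name are hypothetical.

```python
import os


def add_ray_discovery(prometheus_config: dict, ray_temp_dir: str) -> dict:
    """Register Ray's service discovery file in the config if it is missing.

    Hypothetical sketch operating on a dict; not the library's implementation.
    """
    sd_file = os.path.join(ray_temp_dir, "prom_metrics_service_discovery.json")
    scrape_configs = prometheus_config.setdefault("scrape_configs", [])
    if not scrape_configs:
        # Assumed default job when the config has no scrape jobs yet.
        scrape_configs.append({"job_name": "ray", "file_sd_configs": []})
    sd_configs = scrape_configs[0].setdefault("file_sd_configs", [])
    already_present = any(sd_file in sd.get("files", []) for sd in sd_configs)
    if not already_present:
        sd_configs.append({"files": [sd_file]})
    return prometheus_config


config = add_ray_discovery({}, "/tmp/ray")
config = add_ray_discovery(config, "/tmp/ray")  # Second call changes nothing.
print(config)
```

Making the operation idempotent means several Ray clusters, or repeated start-ups of the same one, do not pile up duplicate entries.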

nemo_curator.metrics.utils.download_and_extract_prometheus(
metrics_dir: str | None = None,
os_type = None,
architecture = None,
prometheus_version = None
) -> str

Download the Prometheus tarball and extract it to the metrics directory.

nemo_curator.metrics.utils.download_grafana(
metrics_dir: str | None = None
) -> str

Download the Grafana tarball and extract it to the metrics directory.

nemo_curator.metrics.utils.get_prometheus_port(
metrics_dir: str | None = None
) -> int

Get the port number that Prometheus is running on by reading the port file.
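
Reading a port back from a file might look like the sketch below. The file name `prometheus.port`, the helper name `read_port`, and the range validation are all assumptions made for illustration; the source does not specify the port file's name or how errors are handled.

```python
import os
import tempfile

PORT_FILE_NAME = "prometheus.port"  # Hypothetical file name, not from the library.


def read_port(metrics_dir: str) -> int:
    """Read a TCP port number from a text file inside the metrics directory.

    Hypothetical sketch; not the library's implementation.
    """
    with open(os.path.join(metrics_dir, PORT_FILE_NAME)) as f:
        port = int(f.read().strip())
    if not 0 < port < 65536:
        raise ValueError(f"Port out of range: {port}")
    return port


# Demonstration with a throwaway metrics directory.
with tempfile.TemporaryDirectory() as metrics_dir:
    with open(os.path.join(metrics_dir, PORT_FILE_NAME), "w") as f:
        f.write("9090\n")
    port = read_port(metrics_dir)
```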

nemo_curator.metrics.utils.is_grafana_running(
metrics_dir: str | None = None
) -> bool

Check if Grafana is currently running for this metrics instance.

nemo_curator.metrics.utils.is_prometheus_running(
metrics_dir: str | None = None
) -> bool

Check if Prometheus is currently running for this metrics instance.

nemo_curator.metrics.utils.launch_grafana(
grafana_dir: str,
grafana_ini_path: str,
grafana_web_port: int,
metrics_dir: str | None = None
) -> None

Launch the Grafana server.

nemo_curator.metrics.utils.remove_ray_prometheus_metrics_service_discovery(
ray_temp_dir: str,
metrics_dir: str | None = None
) -> None

Remove the Ray Prometheus metrics service discovery from the Prometheus config.
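
Removal can be sketched as filtering one path out of every `file_sd_configs` entry. As before, this works on an in-memory dict, assumes Ray's discovery file is named `prom_metrics_service_discovery.json` inside the Ray temp directory, and uses a hypothetical function name, `remove_ray_discovery`.

```python
import os


def remove_ray_discovery(prometheus_config: dict, ray_temp_dir: str) -> dict:
    """Drop Ray's service discovery file from every `file_sd_configs` entry.

    Hypothetical sketch operating on a dict; not the library's implementation.
    """
    sd_file = os.path.join(ray_temp_dir, "prom_metrics_service_discovery.json")
    for scrape_config in prometheus_config.get("scrape_configs", []):
        if "file_sd_configs" not in scrape_config:
            continue
        kept = []
        for sd in scrape_config["file_sd_configs"]:
            files = [path for path in sd.get("files", []) if path != sd_file]
            if files:
                # Keep entries that still reference other discovery files.
                kept.append({**sd, "files": files})
        scrape_config["file_sd_configs"] = kept
    return prometheus_config


ray_file = os.path.join("/tmp/ray", "prom_metrics_service_discovery.json")
config = {
    "scrape_configs": [
        {
            "job_name": "ray",
            "file_sd_configs": [
                {"files": [ray_file]},
                {"files": ["/etc/prometheus/targets.json"]},
            ],
        }
    ]
}
config = remove_ray_discovery(config, "/tmp/ray")
print(config)
```

Entries left with no files are dropped entirely, so the config does not accumulate empty discovery blocks.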

nemo_curator.metrics.utils.run_prometheus(
prometheus_dir: str,
prometheus_web_port: int,
metrics_dir: str | None = None
) -> None

Run the Prometheus server.

nemo_curator.metrics.utils.write_grafana_configs(
grafana_web_port: int,
prometheus_web_port: int,
metrics_dir: str | None = None
) -> str

Write the Grafana configs to the Grafana directory.