quickstart.cluster#

High-level quickstart cluster management API.

Module Contents#

Classes#

QuickstartCluster

High-level API for managing a quickstart cluster.

API#

exception quickstart.cluster.PreflightError(results: list[quickstart.preflight.PreflightResult])#

Bases: Exception

Raised when pre-flight checks fail.

Initialization

Initialize preflight error.

Parameters:

results – List of preflight check results.

class quickstart.cluster.QuickstartCluster(
config: quickstart.config.QuickstartConfig | None = None,
platform_config: quickstart.platform_config.PlatformConfig | None = None,
platform_config_path: pathlib.Path | None = None,
)#

High-level API for managing a quickstart cluster.

This is the main SDK interface for programmatic quickstart management.

Example

from nemo_platform_ext.quickstart import QuickstartCluster

cluster = QuickstartCluster()
cluster.start()
print(cluster.status())
cluster.stop()

The cluster runs a single nmp-api container with:
  • Docker socket mounted for job execution (DOOD pattern)

  • Persistent data volume for storage

  • Configurable platform settings

Initialization

Initialize a quickstart cluster.

Parameters:
  • config – Quickstart configuration. Loaded from default path if not provided.

  • platform_config – Platform configuration. Uses default if not provided.

  • platform_config_path – Path to a platform config YAML file. Takes precedence over platform_config when both are provided.

cluster_info() dict#

Get detailed cluster information.

Returns:

Dictionary with cluster configuration and state.

destroy() None#

Stop the cluster and remove all data.

is_running() bool#

Check if the cluster is running.

Returns:

True if the cluster container is running.

logs(
follow: bool = False,
tail: int | None = 100,
) collections.abc.Iterator[str]#

Stream cluster logs.

Parameters:
  • follow – Keep following log output.

  • tail – Number of lines to show from the end, or None for all logs.

Yields:

Log lines as strings.
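Because logs() yields plain strings, the result can be consumed with ordinary iterator tools. The following is a minimal sketch, assuming only that each yielded item is one log line. The helper name first_matching_lines and the sample lines are hypothetical, and a list stands in for a real cluster.logs(...) call.

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def first_matching_lines(lines: Iterable[str], needle: str, limit: int) -> Iterator[str]:
    """Lazily yield at most `limit` lines that contain `needle`."""
    matches = (line.rstrip("\n") for line in lines if needle in line)
    return islice(matches, limit)


# Stand-in for cluster.logs(follow=False, tail=100); the contents are made up.
sample_lines = [
    "INFO starting api\n",
    "ERROR failed to bind port\n",
    "INFO retrying\n",
    "ERROR giving up\n",
]

errors = list(first_matching_lines(sample_lines, "ERROR", limit=1))
print(errors)
```

With follow=True the iterator may never finish on its own, so bounding consumption in this way is one option for stopping early.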

preflight() list[quickstart.preflight.PreflightResult]#

Run pre-flight checks and return results.

Returns:

List of PreflightResult objects.

start(skip_preflight: bool = False, pull: bool = True) None#

Start the quickstart cluster.

Parameters:
  • skip_preflight – Skip pre-flight checks (not recommended).

  • pull – Pull the container image before starting.

Raises:
  • PreflightError – If pre-flight checks fail.

  • docker.errors.APIError – If Docker operations fail.
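One way to handle the documented PreflightError is sketched below. So that the example runs without Docker or the SDK installed, it defines stand-ins: the local PreflightError class mirrors the documented constructor and is assumed (not confirmed by this page) to expose the results as a results attribute, FailingCluster is hypothetical, and plain strings take the place of PreflightResult objects.

```python
class PreflightError(Exception):
    """Stand-in for quickstart.cluster.PreflightError (assumed to keep `results`)."""

    def __init__(self, results: list) -> None:
        super().__init__(f"{len(results)} pre-flight check(s) failed")
        self.results = results


class FailingCluster:
    """Hypothetical stand-in whose start() fails unless pre-flight is skipped."""

    def start(self, skip_preflight: bool = False, pull: bool = True) -> None:
        if not skip_preflight:
            raise PreflightError(["docker daemon not reachable"])


def try_start(cluster) -> list:
    """Start the cluster; return the failed check results, or [] on success."""
    try:
        cluster.start(pull=False)
    except PreflightError as exc:
        return list(exc.results)
    return []


failures = try_start(FailingCluster())
print(failures)
```

The sketch deliberately lets docker.errors.APIError propagate, since a Docker failure usually needs different handling from a failed pre-flight check.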

status() dict#

Get cluster status.

Returns:

Dictionary with status information, including:
  • running: bool

  • status: str

  • health: str

  • url: str (if running)
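The documented keys can be turned into a one-line summary, for instance. This is a sketch that relies only on the key names listed above. The helper name summarize_status is hypothetical, and the sample values (including the URL) are illustrative, since this page does not list the actual strings used for status and health.

```python
def summarize_status(status: dict) -> str:
    """Build a one-line summary from a status()-style dictionary."""
    state = "running" if status.get("running") else "stopped"
    parts = [
        state,
        f"status={status.get('status', 'unknown')}",
        f"health={status.get('health', 'unknown')}",
    ]
    # "url" is documented as present only while the cluster is running.
    if "url" in status:
        parts.append(f"url={status['url']}")
    return " ".join(parts)


# Illustrative values only; real "status" and "health" strings may differ.
up = {"running": True, "status": "running", "health": "healthy", "url": "http://localhost:8080"}
down = {"running": False, "status": "exited", "health": "unknown"}
print(summarize_status(up))
print(summarize_status(down))
```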

stop() None#

Stop the quickstart cluster.
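Callers who want stop() to run even when their own code raises could wrap the lifecycle in a context manager. The sketch below is not part of the SDK: running_cluster is a hypothetical helper built on the documented start(), wait_for_healthy(), and stop() methods, and RecordingCluster is a stand-in that records calls so the example runs without Docker.

```python
from contextlib import contextmanager


@contextmanager
def running_cluster(cluster, timeout: int = 300):
    """Start `cluster`, yield it, and always call stop() on exit."""
    cluster.start()
    try:
        if not cluster.wait_for_healthy(timeout=timeout):
            raise TimeoutError(f"cluster not healthy after {timeout}s")
        yield cluster
    finally:
        cluster.stop()


class RecordingCluster:
    """Hypothetical stand-in that records calls instead of using Docker."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def start(self, skip_preflight: bool = False, pull: bool = True) -> None:
        self.calls.append("start")

    def wait_for_healthy(self, timeout: int = 300, interval: int = 5) -> bool:
        self.calls.append("wait_for_healthy")
        return True

    def stop(self) -> None:
        self.calls.append("stop")


fake = RecordingCluster()
with running_cluster(fake) as cluster:
    cluster.calls.append("work")
print(fake.calls)
```

If start() itself raises, the sketch does not call stop(), on the assumption that nothing was started in that case.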

wait_for_healthy(timeout: int = 300, interval: int = 5) bool#

Wait for the cluster to become healthy.

Parameters:
  • timeout – Maximum time to wait in seconds.

  • interval – Time between health checks in seconds.

Returns:

True if cluster became healthy within timeout, False otherwise.
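The timeout and interval semantics resemble a generic polling loop. The following sketch illustrates that pattern only and is not the SDK's implementation. The names poll_until and fake_probe are hypothetical, and the probe stands in for whatever health check the cluster performs.

```python
import time
from collections.abc import Callable


def poll_until(check: Callable[[], bool], timeout: float, interval: float) -> bool:
    """Call `check` every `interval` seconds until it returns True or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))


# Hypothetical health probe that succeeds on the third attempt.
attempts = {"count": 0}


def fake_probe() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


became_healthy = poll_until(fake_probe, timeout=1.0, interval=0.01)
print(became_healthy)
```

Using a monotonic clock keeps the deadline unaffected by system clock adjustments during the wait.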