Data Designer NMP SDK Resources

The data_designer.config module provides a consistent, context-agnostic experience for building Data Designer configs. Once you are ready to execute that config on the NMP Data Designer service, you use objects from the nemo_platform SDK. This page explains the NMP-specific objects used to interact with the Data Designer service.

DataDesignerResource

The DataDesignerResource is the entry-point SDK object for working with Data Designer on NMP. It is analogous to the library’s data_designer.interface.DataDesigner object.

A DataDesignerResource is accessed directly from a NeMoPlatform instance:

import os
from nemo_platform import NeMoPlatform


sdk = NeMoPlatform(
    base_url=os.environ.get("NMP_BASE_URL", "http://localhost:8080"),
    workspace="default",
)
data_designer = sdk.data_designer  # this object is a DataDesignerResource

The DataDesignerResource is primarily used to make preview requests (preview) and create jobs (create), but it also exposes the following helper methods:

Method	Description
`get_default_model_providers()`	Returns a list of model providers registered with the Models and Inference Gateway services that can be used in your Data Designer config.
`get_job_resource(job_name: str)`	Returns a `DataDesignerJobResource` for interacting with a job (see below).
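
As a minimal sketch of the two helper methods in the table, the function below takes an existing DataDesignerResource (for example, sdk.data_designer) and returns the available model providers together with a job resource for a named job. The function name inspect_data_designer is hypothetical; only get_default_model_providers() and get_job_resource(job_name) come from the table above. This page does not specify the shape of each provider entry, so the code treats the entries as opaque values.

```python
from typing import Any, List, Tuple


def inspect_data_designer(data_designer: Any, job_name: str) -> Tuple[List[Any], Any]:
    """Return (model providers, job resource) for an existing job.

    `data_designer` is expected to be a DataDesignerResource,
    e.g. `sdk.data_designer`. This helper is illustrative, not part of the SDK.
    """
    # Providers registered with the Models and Inference Gateway services.
    providers = list(data_designer.get_default_model_providers())
    # Handle for a job that already exists; the caller supplies its name.
    job = data_designer.get_job_resource(job_name)
    return providers, job
```

With the sdk object created earlier, a call might look like inspect_data_designer(sdk.data_designer, "my-job"), where "my-job" is a placeholder for the name of a job you have already created.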

DataDesignerJobResource

The DataDesignerJobResource provides several helper methods for working with a job. It is returned by the DataDesignerResource.create method when you create a job; you can also call DataDesignerResource.get_job_resource to get an instance of this object for an existing job.

Some of the most useful methods are described below.

Method	Description
`wait_until_done()`	Polls the job service until the job reaches a terminal state. Prints job logs along the way.
`get_logs()`	Returns logs from the job as a list of dicts. Handles pagination automatically.
`download_artifacts()`	Downloads the job results as a tar archive. Returns a `DataDesignerJobResults` object (see below).
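
The three methods above are typically used in sequence: wait for the job, inspect its logs, then download the results. The sketch below shows that order, assuming each method can be called without arguments as listed in the table. The function name collect_job_outputs is hypothetical. The sketch does not check whether the job succeeded, because this page does not describe how the terminal state is reported.

```python
from typing import Any, Dict, List, Tuple


def collect_job_outputs(job: Any) -> Tuple[List[Dict[str, Any]], Any]:
    """Wait for a job, then return (logs, results).

    `job` is expected to be a DataDesignerJobResource. This helper is
    illustrative, not part of the SDK.
    """
    # Block until the job reaches a terminal state; logs are printed while polling.
    job.wait_until_done()
    # Fetch the log entries as a list of dicts; the SDK handles pagination.
    logs = job.get_logs()
    # Download the results archive; returns a DataDesignerJobResults object.
    results = job.download_artifacts()
    return logs, results
```

Passing the job resource as a parameter keeps the helper usable both with a job returned by create and with one obtained through get_job_resource.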

DataDesignerJobResults

The DataDesignerJobResults object simplifies loading downloaded job results into memory.

Method	Description
`load_analysis()`	Returns a `DatasetProfilerResults` object (from the library) with an analysis of the dataset.
`load_dataset()`	Returns the output dataset as a pandas DataFrame.
`load_processor_dataset(processor_name: str)`	Returns the named processor dataset as a pandas DataFrame.
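
As a rough illustration of the loading methods above, the following sketch gathers the analysis, the main dataset, and any requested processor datasets from a DataDesignerJobResults object. The function name load_job_outputs and the processor_names parameter are hypothetical; the processor names themselves would come from your own Data Designer config.

```python
from typing import Any, Dict, Iterable, Tuple


def load_job_outputs(
    results: Any, processor_names: Iterable[str] = ()
) -> Tuple[Any, Any, Dict[str, Any]]:
    """Return (analysis, dataset, processor datasets keyed by name).

    `results` is expected to be a DataDesignerJobResults object. This helper
    is illustrative, not part of the SDK.
    """
    # DatasetProfilerResults object describing the generated dataset.
    analysis = results.load_analysis()
    # Main output dataset (a pandas DataFrame per the table above).
    dataset = results.load_dataset()
    # One DataFrame per requested processor, keyed by processor name.
    processor_datasets = {
        name: results.load_processor_dataset(name) for name in processor_names
    }
    return analysis, dataset, processor_datasets
```

Calling the helper without processor_names loads only the analysis and the main dataset, which is enough when the config defines no processors.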