Nscale Topology Provider

The nscale topology provider reads topology data from the Nscale Radar API and converts it into Topograph’s canonical three-tier topology graph.

The provider uses two Nscale APIs:

  • Radar API: returns each instance’s network path via GET /v1/topology
  • Instance API: returns instance metadata via GET /v2/instances?organizationID=<org>&regionID=<region>

The Radar response supplies each instance's provider instance ID, switch path, and optional block ID. The Instance API response maps provider instance IDs to hostnames through metadata.id and metadata.name. The Slurm engine relies on this mapping when Topograph discovers Slurm nodes automatically.

When to Use This Provider

Use this provider for Nscale environments where Radar is the topology source. It is most commonly used with the Slurm engine to generate topology.conf from the current Slurm node list.

If the request payload supplies explicit nodes, Topograph uses those instance-ID-to-node-name mappings directly. If nodes is omitted and the Slurm engine is used, Topograph runs scontrol show nodes -o, requests the instance catalog for the configured region from the Nscale Instance API, and keeps the entries whose metadata.name matches a Slurm node name.
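
The auto-discovery matching described above can be sketched in a few lines of Python. This is an illustrative sketch, not Topograph's implementation: the function names `slurm_node_names` and `match_instances` and the sample data are hypothetical, and it relies only on the fact that `scontrol show nodes -o` prints one node per line with a `NodeName=` field.

```python
import re


def slurm_node_names(scontrol_output: str) -> set[str]:
    """Extract node names from `scontrol show nodes -o` output (one node per line)."""
    names = set()
    for line in scontrol_output.splitlines():
        match = re.search(r"\bNodeName=(\S+)", line)
        if match:
            names.add(match.group(1))
    return names


def match_instances(catalog: dict[str, str], node_names: set[str]) -> dict[str, str]:
    """Keep catalog entries (instance ID -> name) whose name is a known Slurm node."""
    return {iid: name for iid, name in catalog.items() if name in node_names}


# Hypothetical sample data for illustration only.
sample_output = (
    "NodeName=node001 Arch=x86_64 State=IDLE\n"
    "NodeName=node002 Arch=x86_64 State=IDLE\n"
)
catalog = {"i-aaa": "node001", "i-bbb": "node002", "i-ccc": "login01"}
print(match_instances(catalog, slurm_node_names(sample_output)))
```

In this sketch the `login01` entry is dropped because no Slurm node carries that name, which mirrors the filtering step described above.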

Prerequisites

  • A Radar API endpoint reachable from the Topograph host
  • An Instance API endpoint reachable from the Topograph host
  • An Nscale organization ID
  • An API token with permission to read topology and instance metadata
  • The Nscale region ID for the cluster
  • For Slurm auto-discovery, scontrol must be available to the Topograph process

Credentials

| Field | Required | Description |
| --- | --- | --- |
| org | Yes | Nscale organization ID |
| token | Yes | Bearer token used for Radar and Instance API requests |
| region | Required for Slurm auto-discovery | Nscale region ID used for Instance API lookup and Slurm region assignment |

Store credentials in a YAML file:

org: <ORGANIZATION_ID>
token: <API_TOKEN>
region: <REGION_ID>

Reference that file from the Topograph config:

credentialsPath: /etc/topograph/nscale-credentials.yaml

Credentials can also be supplied directly in the topology request payload under provider.creds.
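
As a rough illustration of the field requirements listed above, the following Python sketch reports which credential fields are missing. The function name `missing_cred_fields` and the `slurm_auto_discovery` flag are hypothetical; Topograph's own validation may behave differently.

```python
def missing_cred_fields(creds: dict, slurm_auto_discovery: bool = False) -> list[str]:
    """Return the required credential fields that are absent or empty."""
    required = ["org", "token"]
    if slurm_auto_discovery:
        # region is only required when Topograph has to look up Slurm nodes itself.
        required.append("region")
    return [field for field in required if not creds.get(field)]


# Hypothetical values for illustration only.
creds = {"org": "my-org", "token": "my-token"}
print(missing_cred_fields(creds))
print(missing_cred_fields(creds, slurm_auto_discovery=True))
```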

Parameters

| Field | Required | Description |
| --- | --- | --- |
| radarApiUrl | Yes | Base URL for the Radar API, for example https://radar.example.com |
| instanceApiUrl | Yes | Base URL for the Instance API, for example https://api.example.com |
| trimTiers | No | Number of highest topology tiers to trim from output. Defaults to 0 |

The top-level Topograph pageSize setting controls pagination for the Radar topology request.

Configuration

Example Topograph config for Slurm:

http:
  port: 49021
  ssl: false

provider: nscale
engine: slurm

requestAggregationDelay: 15s
credentialsPath: /etc/topograph/nscale-credentials.yaml

providerParams:
  radarApiUrl: https://radar.example.com
  instanceApiUrl: https://api.example.com

engineParams:
  plugin: topology/tree
  topologyConfigPath: /etc/slurm/topology.conf

Example request payload:

{
  "provider": {
    "name": "nscale",
    "creds": {
      "org": "<ORGANIZATION_ID>",
      "token": "<API_TOKEN>",
      "region": "<REGION_ID>"
    },
    "params": {
      "radarApiUrl": "https://radar.example.com",
      "instanceApiUrl": "https://api.example.com"
    }
  },
  "engine": {
    "name": "slurm",
    "params": {
      "plugin": "topology/tree"
    }
  }
}

If you already have the instance-ID-to-hostname mapping, you can include it explicitly under nodes:

{
  "provider": {
    "name": "nscale",
    "creds": {
      "org": "<ORGANIZATION_ID>",
      "token": "<API_TOKEN>",
      "region": "<REGION_ID>"
    },
    "params": {
      "radarApiUrl": "https://radar.example.com",
      "instanceApiUrl": "https://api.example.com"
    }
  },
  "engine": {
    "name": "slurm"
  },
  "nodes": [
    {
      "region": "<REGION_ID>",
      "instances": {
        "<INSTANCE_ID_1>": "node001",
        "<INSTANCE_ID_2>": "node002"
      }
    }
  ]
}

How It Works

For each region in the compute instance list, the provider fetches topology pages from Radar:

GET <radarApiUrl>/v1/topology?limit=<pageSize>&offset=<offset>
Authorization: Bearer <token>
X-Organization: <org>
X-Region: <region>
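
The limit/offset paging in the request above could be driven by a loop like the following Python sketch. It makes two assumptions that the source does not confirm: each page is returned as a list of instance records, and a page shorter than the limit marks the end of the data. The names `radar_headers`, `fetch_all`, and `fetch_page` are hypothetical, and the page fetcher is injected so the loop can run without a live Radar endpoint.

```python
from typing import Callable


def radar_headers(token: str, org: str, region: str) -> dict[str, str]:
    """Build the request headers shown in the Radar request above."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Organization": org,
        "X-Region": region,
    }


def fetch_all(fetch_page: Callable[[int, int], list[dict]], page_size: int) -> list[dict]:
    """Collect every record by advancing offset until a short page is returned.

    Assumption: a page with fewer than page_size records is the last one.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    records: list[dict] = []
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        records.extend(page)
        if len(page) < page_size:
            return records
        offset += page_size


# Stand-in for the HTTP call: serves slices of an in-memory list.
fake_data = [{"instance_id": f"i-{n}"} for n in range(5)]
all_records = fetch_all(lambda limit, offset: fake_data[offset:offset + limit], page_size=2)
print(len(all_records))
```

With a real client, `fetch_page` would issue the GET request with `limit` and `offset` query parameters and the headers from `radar_headers`.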

Each returned instance is translated as follows:

| Radar field | Topograph field |
| --- | --- |
| instance_id | Instance ID |
| network_node_path[0] | Core tier |
| network_node_path[1] | Spine tier |
| network_node_path[2] | Leaf tier |
| block_id | Accelerator / NVLink domain |
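
The field mapping above can be expressed as a small Python function. This is only a sketch: the Radar field names come from the table, while the function name `translate`, the output keys, and the sample record are hypothetical and do not reflect Topograph's internal types.

```python
def translate(record: dict) -> dict:
    """Map one Radar instance record onto core/spine/leaf tiers and a domain."""
    path = record.get("network_node_path") or []
    # Pair tier names with path entries in order; missing entries stay None.
    tiers = dict(zip(("core", "spine", "leaf"), path))
    return {
        "instance_id": record["instance_id"],
        "core": tiers.get("core"),
        "spine": tiers.get("spine"),
        "leaf": tiers.get("leaf"),
        # block_id is optional in the Radar response.
        "accelerator_domain": record.get("block_id"),
    }


# Hypothetical record for illustration only.
record = {
    "instance_id": "i-aaa",
    "network_node_path": ["core-1", "spine-3", "leaf-7"],
    "block_id": "block-2",
}
print(translate(record))
```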

For Slurm auto-discovery, the provider also fetches instance metadata:

GET <instanceApiUrl>/v2/instances?organizationID=<org>&regionID=<region>
Authorization: Bearer <token>

It builds the same map produced by:

curl -s -H "Authorization: Bearer $TOKEN" \
  "$INSTANCE_API_URL/v2/instances?organizationID=$ORG&regionID=$REGION" \
  | jq -r '.[] | "\(.metadata.id)\t\(.metadata.name)"'
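
For readers who prefer code to a jq filter, the same map can be built in Python. This sketch assumes, as the jq expression implies, that the response body is a JSON array of objects that each carry metadata.id and metadata.name. The function name `instance_map` and the sample body are hypothetical.

```python
import json


def instance_map(response_body: str) -> dict[str, str]:
    """Build an instance ID -> hostname map from an Instance API response body."""
    return {
        item["metadata"]["id"]: item["metadata"]["name"]
        for item in json.loads(response_body)
    }


# Hypothetical response body for illustration only.
body = json.dumps([
    {"metadata": {"id": "i-aaa", "name": "node001"}},
    {"metadata": {"id": "i-bbb", "name": "node002"}},
])
for instance_id, hostname in instance_map(body).items():
    print(f"{instance_id}\t{hostname}")
```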

Verifying the Output

First verify that the Instance API returns the hostnames Slurm knows:

curl -s -H "Authorization: Bearer $TOKEN" \
  "$INSTANCE_API_URL/v2/instances?organizationID=$ORG&regionID=$REGION" \
  | jq -r '.[] | "\(.metadata.id)\t\(.metadata.name)"'

Then trigger topology generation:

id=$(curl -s -X POST -H "Content-Type: application/json" -d @payload.json http://localhost:49021/v1/generate)
curl -s "http://localhost:49021/v1/topology?uid=$id"

For the Slurm engine, verify that the generated topology.conf contains the expected switch hierarchy or block topology for the Nscale instances.