UFM Bright Cluster Integration Plugin REST APIs

NVIDIA UFM Enterprise REST API Guide v6.14.1

The following authentication types are supported:

  • basic (/ufmRest)

  • client (/ufmRestV2)

  • token (/ufmRestV3)
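As a rough illustration, the mapping between authentication type and URL prefix listed above can be captured in a small helper. The function name `ufm_url`, the `https` scheme, and the host `ufm.example.com` are assumptions made for this sketch and are not part of the UFM API.

```python
# Sketch: pick the REST prefix that matches the authentication type.
# The three prefixes come from the list above; the rest is illustrative.
AUTH_PREFIXES = {
    "basic": "ufmRest",
    "client": "ufmRestV2",
    "token": "ufmRestV3",
}


def ufm_url(host: str, auth_type: str, path: str) -> str:
    """Return an absolute URL for `path` under the prefix of `auth_type`."""
    try:
        prefix = AUTH_PREFIXES[auth_type]
    except KeyError:
        raise ValueError(f"unsupported authentication type: {auth_type!r}") from None
    # Assumption: the UFM server is reached over HTTPS on its default port.
    return f"https://{host}/{prefix}/{path.lstrip('/')}"


if __name__ == "__main__":
    # Hypothetical host name, used only to show the resulting URL shape.
    print(ufm_url("ufm.example.com", "basic", "plugin/bright/conf"))
```

The endpoint URLs in the rest of this section are written with the basic-authentication prefix (`ufmRest`).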

  • Description: Gets the current Bright configurations

  • URL: GET ufmRest/plugin/bright/conf

  • Request Data: N/A

  • Response:

    {
      "bright-config": {
        "certificate": "-----BEGIN CERTIFICATE-----\nXXXXXXX\n-----END CERTIFICATE-----\n",
        "certificate_key": "-----BEGIN PRIVATE KEY-----\nXXXXXXX\n-----END PRIVATE KEY-----\n",
        "data_retention_period": "30d",
        "enabled": true,
        "host": "10.209.36.79",
        "port": 8081,
        "status": {
          "err_message": "",
          "status": "Healthy"
        },
        "timezone": "Europe/Amsterdam"
      },
      "logs-config": {
        "log_file_backup_count": 5,
        "log_file_max_size": 10485760,
        "logs_file_name": "/log/bright_plugin.log",
        "logs_level": "INFO"
      }
    }

  • Description: Updates the current Bright configurations

  • URL: PUT ufmRest/plugin/bright/conf

  • Request Data:

    {
      "bright-config": {
        "certificate": "-----BEGIN CERTIFICATE-----\nXXXXXXX\n-----END CERTIFICATE-----\n",
        "certificate_key": "-----BEGIN PRIVATE KEY-----\nXXXXXXX\n-----END PRIVATE KEY-----\n",
        "data_retention_period": "30d",
        "enabled": true,
        "host": "10.209.36.79",
        "port": 8081,
        "status": {
          "err_message": "",
          "status": "Healthy"
        },
        "timezone": "Europe/Amsterdam"
      },
      "logs-config": {
        "log_file_backup_count": 5,
        "log_file_max_size": 10485760,
        "logs_file_name": "/log/bright_plugin.log",
        "logs_level": "INFO"
      }
    }

  • Response: string “Set configurations has been done successfully”

  • Status Codes:

    • 200 – OK.

    • 400 – Bad request (bad or missing parameters).
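A minimal sketch of how a client might issue the PUT request above with basic authentication, using only the Python standard library. The helper names (`build_put_conf_request`, `send`, `describe_status`), the `https` scheme, and the host and credentials in the usage example are assumptions for illustration; only the URL path, the JSON body shape, and the two status codes come from this page.

```python
import base64
import json
import urllib.error
import urllib.request


def build_put_conf_request(host, username, password, config):
    """Build (but do not send) a PUT request for ufmRest/plugin/bright/conf."""
    url = f"https://{host}/ufmRest/plugin/bright/conf"  # assumption: HTTPS
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(config).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/json",
        },
    )


def send(request):
    """Send the request and return (status_code, body_text)."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode()
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; the status code is still available.
        return err.code, err.read().decode()


def describe_status(code):
    """Map the documented status codes to a short message."""
    if code == 200:
        return "configuration updated"
    if code == 400:
        return "bad request (bad or missing parameters)"
    return f"unexpected status {code}"


if __name__ == "__main__":
    # Hypothetical host and credentials; nothing is sent here.
    req = build_put_conf_request(
        "ufm.example.com", "admin", "secret",
        {"bright-config": {"enabled": True, "host": "10.209.36.79", "port": 8081}},
    )
    print(req.get_method(), req.full_url)
```

Whether the endpoint accepts a partial body such as the one in the usage example is not stated on this page, so a real client may need to send the full structure shown in the request data.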

Configuration parameter details:

Parameter              Description
---------------------  -----------
Host                   Hostname or IP of the BCM server
Port                   Port of the BCM server, normally 8081
Certificate            BCM client certificate content, which can be found on the BCM server machine under .cm/XXX.pem
Certificate key        BCM client certificate key, which can be found on the BCM server machine under .cm/XXX.key
Data retention period  UFM erases the data gathered in the database after the configured retention period (30 days by default)
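The parameters above map onto the keys of the "bright-config" object. The sketch below assembles that object from already-loaded certificate and key text and applies a few sanity checks. The function name `build_bright_config` and the specific checks are assumptions for illustration; the page does not state which validations the plugin itself performs.

```python
def build_bright_config(host, certificate, certificate_key,
                        port=8081, data_retention_period="30d"):
    """Assemble a "bright-config" object from the documented parameters."""
    if not host:
        raise ValueError("host must be a non-empty hostname or IP")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    # Assumption: both values are PEM text, as in the examples on this page.
    if "BEGIN CERTIFICATE" not in certificate:
        raise ValueError("certificate does not look like PEM content")
    if "PRIVATE KEY" not in certificate_key:
        raise ValueError("certificate_key does not look like a PEM private key")
    return {
        "bright-config": {
            "host": host,
            "port": port,
            "certificate": certificate,
            "certificate_key": certificate_key,
            "data_retention_period": data_retention_period,
        }
    }


if __name__ == "__main__":
    # Placeholder PEM bodies, mirroring the XXXXXXX placeholders above.
    cert = "-----BEGIN CERTIFICATE-----\nXXXXXXX\n-----END CERTIFICATE-----\n"
    key = "-----BEGIN PRIVATE KEY-----\nXXXXXXX\n-----END PRIVATE KEY-----\n"
    print(build_bright_config("10.209.36.79", cert, key)["bright-config"]["port"])
```

In practice the two PEM strings would be read from the certificate and key files on the BCM server before calling the helper.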

  • Description: Gets the cached nodes from the Bright Cluster Manager

  • URL: GET ufmRest/plugin/bright/data/nodes

  • Request Data: N/A

  • Response:

    [ "node001", "swx-tor01" ]

  • Description: Gets the cached jobs from the Bright Cluster Manager nodes

  • URL: GET ufmRest/plugin/bright/data/jobs[?nodes=<node1,node2,…>&from=<timestamp1>&to=<timestamp2>&tz=<requested_client_timezone>]

  • Request Data: N/A

  • Response:

    [
      {
        "account": "root",
        "arguments": "",
        "arrayID": "",
        "baseType": "Job",
        "cgroup": "",
        "childType": "SlurmJob",
        "commandLineInterpreter": "",
        "comment": "",
        "debug": false,
        "dependencies": [],
        "endtime": "2023-04-13T14:08:59",
        "environmentVariables": [],
        "executable": "",
        "exitCode": 0,
        "inqueue": "",
        "jobID": "166",
        "jobname": "interactive",
        "mailList": "",
        "mailNotify": false,
        "mailOptions": "",
        "maxWallClock": "UNLIMITED",
        "memoryUse": 0,
        "minMemPerNode": 0,
        "modified": false,
        "modules": [],
        "nodes": [
          "node001"
        ],
        "numberOfNodes": 1,
        "numberOfProcesses": 8,
        "oldLocalUniqueKey": 0,
        "parallelEnvironment": "",
        "parentID": "",
        "pendingReasons": [
          "NonZeroExitCode"
        ],
        "placement": "",
        "priority": "4294901759",
        "project": "",
        "refJobQueueUniqueKey": 77309411329,
        "refWlmClusterUniqueKey": 163208757249,
        "requestedCPUCores": 0,
        "requestedCPUs": 8,
        "requestedGPUs": 0,
        "requestedMemory": 0,
        "requestedSlots": 0,
        "resourceList": [],
        "revision": "",
        "runWallClock": 3,
        "rundirectory": "/root",
        "scriptFile": "",
        "starttime": "2023-04-13T14:08:56",
        "status": "FAILED",
        "stderrfile": "",
        "stdinfile": "",
        "stdoutfile": "",
        "submittime": "2023-04-13T14:08:56",
        "taskID": "",
        "toBeRemoved": false,
        "uniqueKey": 70368744177830,
        "userdefined": [],
        "usergroup": "root",
        "username": "root"
      }
    ]
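To show how a client might consume this response, the sketch below picks out failed jobs and computes each job's run time from its "starttime" and "endtime" fields. The helper names are hypothetical, the sample keeps only a few fields from the response above, and it assumes timestamps always use the ISO-like format shown there. "FAILED" is the only status value that appears on this page, so other values are not handled specially.

```python
from datetime import datetime

# Trimmed copy of the job object from the response above.
SAMPLE_JOBS = [
    {
        "jobID": "166",
        "jobname": "interactive",
        "nodes": ["node001"],
        "starttime": "2023-04-13T14:08:56",
        "endtime": "2023-04-13T14:08:59",
        "status": "FAILED",
        "exitCode": 0,
    }
]


def job_duration_seconds(job):
    """Return endtime - starttime in seconds for one job object."""
    start = datetime.fromisoformat(job["starttime"])
    end = datetime.fromisoformat(job["endtime"])
    return (end - start).total_seconds()


def failed_jobs(jobs):
    """Return the jobs whose "status" field equals "FAILED"."""
    return [job for job in jobs if job.get("status") == "FAILED"]


if __name__ == "__main__":
    for job in failed_jobs(SAMPLE_JOBS):
        print(job["jobID"], job["nodes"], job_duration_seconds(job))
```

For the sample job the computed duration is 3 seconds, which agrees with the "runWallClock" value in the response.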

The following optional filters are available (as indicated in the URL):

  • To filter jobs by node name, use the parameter "nodes" followed by a comma-separated list of nodes (e.g. nodes=node1,node2).

  • To filter jobs by their creation timestamp, specify a start time ("from") and an end time ("to") as integer timestamps in milliseconds.
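The filters can be combined into a query string as in the following sketch. The helper names `to_millis` and `jobs_url` and the base URL in the usage example are assumptions; the parameter names "nodes", "from", "to", and "tz" come from the URL above. The "tz" value is passed through unquoted and unvalidated here, since its exact expected format is not described on this page.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_millis(moment):
    """Convert a timezone-aware datetime to integer milliseconds since epoch."""
    return int(moment.timestamp() * 1000)


def jobs_url(base, nodes=None, start=None, end=None, tz=None):
    """Build the jobs URL with optional node, time-range, and tz filters."""
    params = {}
    if nodes:
        params["nodes"] = ",".join(nodes)
    # The time filter is described as a start and end pair, so both are needed.
    if (start is None) != (end is None):
        raise ValueError("start and end must be given together")
    if start is not None:
        params["from"] = to_millis(start)
        params["to"] = to_millis(end)
    if tz:
        params["tz"] = tz
    # Keep commas and slashes readable in the node list and timezone name.
    query = urlencode(params, safe=",/")
    url = f"{base.rstrip('/')}/plugin/bright/data/jobs"
    return f"{url}?{query}" if query else url


if __name__ == "__main__":
    # Hypothetical base URL; the time range brackets the sample job above.
    print(jobs_url(
        "https://ufm.example.com/ufmRest",
        nodes=["node001", "swx-tor01"],
        start=datetime(2023, 4, 13, 14, 0, tzinfo=timezone.utc),
        end=datetime(2023, 4, 13, 15, 0, tzinfo=timezone.utc),
        tz="Europe/Amsterdam",
    ))
```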

© Copyright 2023, NVIDIA. Last updated on Oct 24, 2023.