Triton Management Service Control

Deployment Guide (1.1.0)

Triton Management Service Control (tmsctl) is a command line utility for interacting with Triton Management Service (TMS). It provides commands for interacting with TMS, creating and managing leases, and managing service configuration.

Download tmsctl from NGC

This document includes a reference of all commands, and an explanation of some configuration options that can be used to control the behavior of tmsctl.

Common Options

Many commands share a few common options. These are documented here.

(--target|-t):<target>

Specify the instances of TMS on which to perform the operation.

Valid targets are URLs beginning with http:// or https://, or the name of a named instance (see the target command).

Examples:

  • --target https://www.example.com:30345: connects to TMS at the specified URL.

  • --target my_tms: connects to a TMS instance named my_tms previously specified using tmsctl target add.

Unless a default target is specified using tmsctl target commands, all commands require the --target option.
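As a rough illustration of the rule above, the following Python sketch appends --target to a command only when one is supplied, and otherwise leaves tmsctl to fall back to its default target. The helper names (is_url_target, with_target) are hypothetical and not part of tmsctl. The space-separated option form follows the examples above.

```python
import shlex
from typing import List, Optional


def is_url_target(target: str) -> bool:
    """Return True when the target is a URL rather than a named instance."""
    return target.startswith(("http://", "https://"))


def with_target(args: List[str], target: Optional[str] = None) -> List[str]:
    """Append --target when one is given; otherwise rely on the default target."""
    if target is None:
        return list(args)
    if not target:
        raise ValueError("target must be a URL or the name of a named instance")
    return [*args, "--target", target]


# Build (but do not run) a command line for a URL target.
cmd = with_target(["tmsctl", "lease", "list"], "https://www.example.com:30345")
print(shlex.join(cmd))
```

The sketch only assembles the argument list; running it is left to the caller, for example through subprocess.run.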

(--porcelain|-z)

Formats output in an easy-to-parse format for scripts; avoids fancy formatting for human readers.

Porcelain output does not attempt to colorize output or insert unnecessary whitespace to improve readability.

This output is not guaranteed to be stable between releases.


Lease

The tmsctl lease commands allow you to perform operations on leases, such as creating, renewing, and releasing them.

Lease Create

tmsctl lease create [(--target|-t):<target>] [(--porcelain|-z)] (--model|-m):<model> [{Duration Options}] [{Automatic Renewal Options}] [{Autoscaling Options}] [{Triton Options}]

tmsctl lease create [(--target|-t):<target>] [(--porcelain|-z)] (--model|-m):<model> (--triton-pool|-p):<name> --quota:<quota> [{Duration Options}] [{Automatic Renewal Options}]

Connects to <target> and creates a lease for Triton Inference Server to serve one or more <model>.

Provides a <lease-id> for the newly created lease when successful.

An error code is returned when no default <target> exists and (--target|-t) has not been specified.

To learn about how to package models, see the model repository documentation.

Options

--model|-m:<model>

<model> is a comma-separated list of <name>=<value> pairs describing a model.

This option can be included multiple times, once for each unique model required. All models in a lease are loaded and provided by a single Triton Inference Server. If the set of models is too large or requires too many resources, Triton may fail to load them. In the event of a failure, an error is returned and the lease is made invalid.

The allowed <name>=<value> pairs are:

  • name (required): the name of the model. Must match the name expected by Triton.

  • uri (required): the URI from which to get the model.

  • count (optional, default=0): the number of instances of the model to load, or 0 to use the model’s default count.
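The <name>=<value> pairs above can be assembled programmatically. The following Python sketch is one possible way to do so; model_spec and model_args are hypothetical helper names, and the model names and URIs in the usage lines are placeholders rather than real repositories.

```python
from typing import List


def model_spec(name: str, uri: str, count: int = 0) -> str:
    """Build the comma-separated <model> value for one --model option."""
    if not name or not uri:
        raise ValueError("name and uri are both required")
    if count < 0:
        raise ValueError("count must be 0 (model default) or a positive integer")
    pairs = [f"name={name}", f"uri={uri}"]
    if count:
        # count=0 is the documented default, so it is omitted.
        pairs.append(f"count={count}")
    return ",".join(pairs)


def model_args(specs: List[str]) -> List[str]:
    """Repeat --model once per model, as the option description allows."""
    args: List[str] = []
    for spec in specs:
        args += ["--model", spec]
    return args


# Placeholder models; the URIs are illustrative only.
specs = [
    model_spec("model_a", "model://example/model_a"),
    model_spec("model_b", "model://example/model_b", count=2),
]
print(["tmsctl", "lease", "create", *model_args(specs)])
```

The sketch does not escape commas inside values, so it assumes that names and URIs contain none.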

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.

Automatic Renewal Options

--auto-renew

Makes a new lease eligible for automatic renewal. Auto-renewal rules are determined by the server.

--auto-renew-activity-window

The time window during which a lease must be used, before expiration, for it to be automatically renewed. Durations must be specified in (<hours>h)(<minutes>m)(<seconds>s) format (for example, 1h30m15s, 1h30m, 1h, 1m30s).

Autoscaling Options

Options related to automatically scaling the number of instances of a lease.

Note

Autoscaling and using pooled Triton instances are mutually exclusive. If any of these options are used at the same time as pool options, an error is reported.

--enable-autoscaling

Enable autoscaling for this lease. This is automatically turned on if any of the other options related to autoscaling are set.

--autoscaling-max-replicas

Set the maximum number of replicas when autoscaling. Valid values are any positive integer. Implies --enable-autoscaling when provided.

--autoscaling-min-replicas

Set the minimum number of replicas when autoscaling. Valid values are any non-negative integer. Implies --enable-autoscaling when provided.

--autoscaling-metric-cpu-utilization

Set the state of autoscaling based on CPU utilization. Valid values are enable, disable, and server-default (default).

--autoscaling-metric-cpu-utilization-threshold

Set the threshold for autoscaling based on CPU utilization. The value must be a number between 0 (exclusive) and 100 (inclusive).

--autoscaling-metric-gpu-utilization

Set the state of autoscaling based on GPU utilization. Valid values are enable, disable, and server-default (default).

--autoscaling-metric-gpu-utilization-threshold

Set the threshold for autoscaling based on GPU utilization. The value must be a number between 0 (exclusive) and 100 (inclusive).

--autoscaling-metric-queue-time

Set the state of autoscaling based on queue time. Valid values are enable, disable, and server-default (default).

--autoscaling-metric-queue-time-threshold

Set the threshold for autoscaling based on queue time. Durations must be specified in (<hours>h)(<minutes>m)(<seconds>s) format (for example, 1h30m15s, 1h30m, 1h, 1m30s).
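The value ranges documented for the autoscaling options can be checked before a command is issued. The Python sketch below is a minimal illustration, not part of tmsctl: autoscaling_args is a hypothetical helper, passing each value as a separate argument is an assumption, and the check that the minimum does not exceed the maximum is an added assumption not stated in this reference.

```python
from typing import List, Optional


def autoscaling_args(
    min_replicas: Optional[int] = None,
    max_replicas: Optional[int] = None,
    cpu_threshold: Optional[float] = None,
    gpu_threshold: Optional[float] = None,
) -> List[str]:
    """Build autoscaling options for tmsctl lease create after validating them."""
    args: List[str] = []
    if min_replicas is not None:
        if min_replicas < 0:
            raise ValueError("minimum replicas must be a non-negative integer")
        args += ["--autoscaling-min-replicas", str(min_replicas)]
    if max_replicas is not None:
        if max_replicas < 1:
            raise ValueError("maximum replicas must be a positive integer")
        args += ["--autoscaling-max-replicas", str(max_replicas)]
    # Assumption: a minimum above the maximum is treated as a mistake.
    if None not in (min_replicas, max_replicas) and min_replicas > max_replicas:
        raise ValueError("minimum replicas exceeds maximum replicas")
    for metric, threshold in (("cpu", cpu_threshold), ("gpu", gpu_threshold)):
        if threshold is None:
            continue
        # Documented range: 0 (exclusive) to 100 (inclusive).
        if not 0 < threshold <= 100:
            raise ValueError(f"{metric} threshold must be in the range (0, 100]")
        flag = f"--autoscaling-metric-{metric}-utilization"
        args += [flag, "enable", f"{flag}-threshold", str(threshold)]
    return args


print(autoscaling_args(min_replicas=1, max_replicas=4, cpu_threshold=80))
```

Because the replica options imply --enable-autoscaling, the sketch does not add that flag explicitly.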

Duration Options

--duration

The initial duration of the lease. Durations must be specified in (<hours>h)(<minutes>m)(<seconds>s) format (for example, 1h30m15s, 1h30m, 1h, 1m30s).

--renewal-duration

The duration for which the lease renews when renewed. Durations must be specified in (<hours>h)(<minutes>m)(<seconds>s) format (for example, 1h30m15s, 1h30m, 1h, 1m30s).
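The duration format used by these options can be parsed and produced with a few lines of Python. This is a sketch under stated assumptions: parse_duration and format_duration are hypothetical helpers, and whether tmsctl itself accepts values such as 90m (minutes above 59) is not stated here, although the sketch does accept them.

```python
import re

# Optional hours, minutes, and seconds, in that order.
_DURATION = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")


def parse_duration(text: str) -> int:
    """Convert a duration such as 1h30m15s into a number of seconds."""
    match = _DURATION.fullmatch(text)
    if not text or match is None:
        raise ValueError(f"invalid duration: {text!r}")
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def format_duration(total_seconds: int) -> str:
    """Convert a positive number of seconds into the h/m/s format."""
    if total_seconds <= 0:
        raise ValueError("duration must be positive")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if seconds:
        parts.append(f"{seconds}s")
    return "".join(parts)


print(parse_duration("1h30m15s"))
print(format_duration(5400))
```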

Triton Options

Options related to how the Triton instance is created.

Note

Specifying Triton options and using pooled Triton instances are mutually exclusive. If any of these options are used at the same time as pool options, an error is reported.

--triton-image|-i

Specifies the Triton container image for deployment of the lease. <triton-image> must be in the allowed list of Triton container images that are managed by the TMS administrator.

--triton-resources

Specifies the hardware resources for allocation to the Triton server for this lease.

Expected format: cpu=<count>,gpu=<count>,repository-size=<memory>,system-memory=<memory>,shared-memory=<memory>, where <count> is expected to be a positive integer, and <memory> is expected to be a positive number followed by Ki, Mi, or Gi to indicate the amount of memory.

When not provided and a pool is not specified, server-configured defaults are used.
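The expected format of the --triton-resources value can be built and validated as in the Python sketch below. The helper name triton_resources is hypothetical. The sketch requires all five keys because the expected format lists them all; whether tmsctl accepts a subset is not stated in this reference.

```python
import re

# A positive number followed by one of the documented suffixes.
_MEMORY = re.compile(r"(\d+(?:\.\d+)?)(Ki|Mi|Gi)")


def _check_memory(label: str, value: str) -> str:
    match = _MEMORY.fullmatch(value)
    if match is None or float(match.group(1)) <= 0:
        raise ValueError(f"{label} must be a positive number followed by Ki, Mi, or Gi")
    return value


def triton_resources(cpu: int, gpu: int, repository_size: str,
                     system_memory: str, shared_memory: str) -> str:
    """Build the value for --triton-resources in the documented key order."""
    for label, count in (("cpu", cpu), ("gpu", gpu)):
        if not isinstance(count, int) or count < 1:
            raise ValueError(f"{label} must be a positive integer")
    return ",".join([
        f"cpu={cpu}",
        f"gpu={gpu}",
        f"repository-size={_check_memory('repository-size', repository_size)}",
        f"system-memory={_check_memory('system-memory', system_memory)}",
        f"shared-memory={_check_memory('shared-memory', shared_memory)}",
    ])


print(triton_resources(2, 1, "2Gi", "4Gi", "256Mi"))
```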

Triton Pool Options

Options related to the creation of Triton pools.

Note

Specifying pool options is mutually exclusive with autoscaling options and Triton options. If any of these options are used at the same time as those, an error is reported.

--triton-pool|-p

Specifies the Triton Pool, by name, that the lease should be deployed into. Must be specified along with the --quota option.

--quota

Specifies the amount of available quota the lease consumes from a single instance of Triton in the target pool. Must be specified along with (--triton-pool|-p):<name>. Must be greater than zero.
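The mutual-exclusion rules between pool, autoscaling, and Triton options can be checked before calling tmsctl lease create. The following Python sketch is only an illustration of those rules: lease_create_conflicts is a hypothetical helper, and it inspects option names only, so an option value that itself begins with a hyphen would be misread.

```python
from typing import List

POOL_FLAGS = {"--triton-pool", "-p"}
QUOTA_FLAG = "--quota"
TRITON_FLAGS = {"--triton-image", "-i", "--triton-resources"}


def lease_create_conflicts(args: List[str]) -> List[str]:
    """Return a list of rule violations found in lease create arguments."""
    # Keep only option names; "--flag:value" is reduced to "--flag".
    flags = {arg.split(":", 1)[0] for arg in args if arg.startswith("-")}
    uses_pool = bool(flags & POOL_FLAGS)
    uses_quota = QUOTA_FLAG in flags
    uses_autoscaling = any(
        flag == "--enable-autoscaling" or flag.startswith("--autoscaling-")
        for flag in flags
    )
    uses_triton = bool(flags & TRITON_FLAGS)

    problems: List[str] = []
    if uses_pool != uses_quota:
        problems.append("--triton-pool and --quota must be specified together")
    if (uses_pool or uses_quota) and uses_autoscaling:
        problems.append("pool options cannot be combined with autoscaling options")
    if (uses_pool or uses_quota) and uses_triton:
        problems.append("pool options cannot be combined with Triton options")
    return problems


# Placeholder model and pool names, for illustration only.
pooled = ["--model", "name=model_a,uri=model://example/model_a",
          "--triton-pool", "example-pool", "--quota", "2"]
print(lease_create_conflicts(pooled))
print(lease_create_conflicts(pooled + ["--enable-autoscaling"]))
```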


Lease List

tmsctl lease list [(--target|-t):<target>] [(--porcelain|-z)]

tmsctl leases [(--target|-t):<target>] [(--porcelain|-z)]

Connects to <target> and lists all active and pending leases.

By default, a summary of each lease is listed. Adding the --verbose flag increases the amount of output.

When no default <target> exists and (--target|-t) has not been specified, an error occurs.

Options

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.


Lease Release

tmsctl lease release <lease-id> [(--target|-t):<target>] [(--porcelain|-z)]

Connects to <target> and releases lease <lease-id>.

When no default <target> exists and (--target|-t) has not been specified, an error occurs.

Options

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.

<lease-id>

Unique identifier of a lease.


Lease Renew

tmsctl lease renew <lease-id> [(--target|-t):<target>] [(--porcelain|-z)]

Connects to <target> and renews lease <lease-id>.

When no default <target> exists and (--target|-t) has not been specified, an error occurs.

Options

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.

<lease-id>

Unique identifier of a lease.


Lease Status

tmsctl lease status <lease-id> [(--target|-t):<target>] [(--porcelain|-z)]

Connects to <target> to get the current status of a lease.

When no default <target> exists and (--target|-t) has not been specified, an error occurs.

Options

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.

<lease-id>

Unique identifier of a lease.


Lease Name

Provides functionality for managing names associated with leases.

Lease Name Create

tmsctl lease name create (--name:)<lease-name> (--lease:)<lease-id> [(--target|-t):<target>] [(--porcelain|-z)]

Creates a new <lease-name> for an existing lease <lease-id>.

When no default <target> exists and (--target|-t) has not been specified, an error occurs.

Options

--lease

The unique identifier of a lease to which the name should refer. Can be specified without the --lease flag, if the name is specified first.

--name

The name of a lease to create. Can be specified without the --name flag, if it is the first positional argument.

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.

<lease-id>

The unique identifier of a lease to which the name should refer.

<lease-name>

The name of a lease to create.


Lease Name Delete

tmsctl lease name delete (--name:)<lease-name> ((--lease:)<lease-id>|--force) [(--target|-t):<target>] [(--porcelain|-z)]

Deletes an existing <lease-name> for a specified lease <lease-id>.

The lease <lease-id> associated with the <lease-name> does not have to be specified if the --force flag is provided.

When no default <target> exists and (--target|-t) has not been specified, an error occurs.

Options

--force

Delete the name regardless of what lease it currently refers to.

--lease

The unique identifier of a lease to which the name should refer. If this does not match what the name actually refers to, an error occurs. Can be specified without the --lease flag if the name is specified first.

--name

The name of a lease to delete. Can be specified without the --name flag if it is the first positional argument.

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.

<lease-id>

The unique identifier of a lease to which the name should refer. If this does not match what the name actually refers to, an error occurs. Can be specified without the --lease flag if the name is specified first.

<lease-name>

The name of a lease to delete.
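The two ways of deleting a lease name (by naming the lease it refers to, or with --force) can be sketched in Python as follows. The helper lease_name_delete_args is hypothetical. It uses the positional form described above and, following the synopsis, passes either a lease ID or --force but not both.

```python
from typing import List, Optional


def lease_name_delete_args(lease_name: str,
                           lease_id: Optional[str] = None,
                           force: bool = False) -> List[str]:
    """Build a tmsctl lease name delete command using positional arguments."""
    if not lease_name:
        raise ValueError("a lease name is required")
    if (lease_id is None) == (not force):
        # Either both were given or neither was.
        raise ValueError("specify exactly one of lease_id or force")
    args = ["tmsctl", "lease", "name", "delete", lease_name]
    if force:
        args.append("--force")
    else:
        # The lease ID follows the name as the second positional argument.
        args.append(lease_id)
    return args


# Placeholder name and lease ID, for illustration only.
print(lease_name_delete_args("example-name", lease_id="example-lease-id"))
print(lease_name_delete_args("example-name", force=True))
```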


Lease Name List

tmsctl lease name list [(--target|-t):<target>] [(--porcelain|-z)]

Connects to <target> and lists all lease names associated with existing leases.

When no default <target> exists and (--target|-t) has not been specified, an error occurs.

Options

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.


Lease Name Move

tmsctl lease name move (--name:)<lease-name> ((--source-lease:)<source-lease-id>|--force) (--target-lease:)<target-lease-id> [(--target|-t):<target>] [(--porcelain|-z)]

Moves a <lease-name> from one lease <source-lease-id> to another <target-lease-id>.

The lease <source-lease-id> associated with the <lease-name> does not have to be specified if the --force flag is provided.

When no default <target> exists and (--target|-t) has not been specified, an error occurs.

Options

--force

Move the name regardless of what lease it currently refers to.

--name

The name of a lease to move. Can be specified without the --name flag if it is the first positional argument.

--source-lease

The unique identifier of the lease to which the name should currently refer. If this does not match what the name actually refers to, an error occurs. Can be specified without the --source-lease flag if the name is specified first.

--target-lease

The unique identifier of the new lease to which the name should refer. Can be specified without the --target-lease flag if the name and source lease are specified first.

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.


Pool

Commands for managing Triton pools.

Pool Create

tmsctl pool create <name> (--instance-quota|-q):<quota> [--disable-backend-uniqueness] [(--instances|-c):<count>] [--triton-resources:<resources>] [--triton-container:<image-name>] [(--target|-t):<target>][(--porcelain|-z)]

An error code is returned when no default <target> exists and (--target|-t) has not been specified.

Options

--disable-backend-uniqueness

Disables Triton backend uniqueness enforcement.

By default, Triton pools segregate Triton instances by the Triton backends that are used by the loaded models. Disabling this segregation enables leases with models that have differing Triton backend requirements to be collocated. The mixing of Triton backends can cause runtime out-of-memory errors.

--instance-quota|-q

Specifies the maximum allocatable quota per Triton instance in the pool as an integer. This value limits the number of leases that can be assigned to a single Triton instance in the pool. Leases deployed into the pool must specify the amount of quota that they consume and can only be placed on Triton instances with sufficient remaining quota. Specifying a quota larger than is physically available can lead to resource exhaustion errors and server crashes. When the provided value is outside the configured server limits, the pool creation request fails. Value is required and must be greater than zero.

--instances|-c

Specifies the minimum and maximum number of Triton instances allowed to exist in the pool. Expected format: <minimum>,<maximum> where <minimum> and/or <maximum> can be replaced with * to use the configured server default value. When not provided the configured server defaults are used.
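The <minimum>,<maximum> format, including the * placeholder for server defaults, can be produced with a small helper. The Python sketch below is illustrative only: instances_arg is a hypothetical name, and the checks that values are non-negative and that the minimum does not exceed the maximum are assumptions not stated in this reference.

```python
from typing import Optional


def instances_arg(minimum: Optional[int] = None,
                  maximum: Optional[int] = None) -> str:
    """Build the value for --instances; None means 'use the server default'."""
    for label, value in (("minimum", minimum), ("maximum", maximum)):
        # Assumption: negative instance counts are never meaningful.
        if value is not None and value < 0:
            raise ValueError(f"{label} must not be negative")
    # Assumption: a minimum above the maximum is treated as a mistake.
    if minimum is not None and maximum is not None and minimum > maximum:
        raise ValueError("minimum exceeds maximum")
    low = "*" if minimum is None else str(minimum)
    high = "*" if maximum is None else str(maximum)
    return f"{low},{high}"


print(instances_arg(1, 4))
print(instances_arg(maximum=4))
```

When passing a value that contains * through a shell, quote it so that the shell does not expand it as a glob pattern.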

--triton-container

Specifies the Triton container image for use by all Triton instances in the pool. <image-name> must be in the allowed list of Triton container images managed by the TMS administrator.

--triton-resources

Specifies the hardware resources to allocate to each Triton server in the pool.

Expected format: cpu=<count>,gpu=<count>,repository-size=<memory>,system-memory=<memory>,shared-memory=<memory>, where <count> is expected to be a positive integer, and <memory> is expected to be a positive number followed by Ki, Mi, or Gi to indicate the amount of memory.

When not provided, configured server defaults are used.

<name>

Unique name of the pool. This name is used to reference the pool when creating leases that make use of the pool. Pool names can contain only alphanumeric, hyphen, and underscore characters. Pool names are case insensitive and must not conflict with any existing, active pool. Value is required. Maximum allowed size is 512 characters and minimum size is 8 characters.
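The naming rules for a new pool can be checked locally before a pool create request is sent. The Python sketch below follows the limits stated above (8 to 512 characters) and assumes that "alphanumeric" means ASCII letters and digits. Both helper names are hypothetical, and the server remains the authority on whether a name is accepted.

```python
import re
from typing import Iterable

# ASCII letters, digits, hyphen, and underscore; 8 to 512 characters.
_POOL_NAME = re.compile(r"[A-Za-z0-9_-]{8,512}")


def is_valid_pool_name(name: str) -> bool:
    """Check the character set and length rules for a new pool name."""
    return _POOL_NAME.fullmatch(name) is not None


def conflicts_with_existing(name: str, existing: Iterable[str]) -> bool:
    """Pool names are case insensitive, so compare them case-folded."""
    return name.casefold() in {other.casefold() for other in existing}


print(is_valid_pool_name("example-pool"))
print(conflicts_with_existing("Example-Pool", ["example-pool"]))
```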

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.


Pool Delete

tmsctl pool delete (<triton-pool-name>|<triton-pool-id>) [(--target|-t):<target>] [(--porcelain|-z)]

An error code is returned when no default <target> exists and (--target|-t) has not been specified.

Options

<triton-pool-id>

Unique identifier of the pool. Represented as a 32-character UUID.

<triton-pool-name>

Unique name of the pool.

Pool names can contain only alphanumeric, hyphen, and underscore characters. Pool names are case insensitive. Maximum allowed size is 1024 bytes.

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.


Pool List

tmsctl pool list [(--verbose|-v)] [(--target|-t):<target>] [(--porcelain|-z)]

An error code is returned when no default <target> exists and (--target|-t) has not been specified.

Options

--verbose|-v

Whether to produce more verbose details about the pools.

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.


Pool Status

tmsctl pool status (<triton-pool-name>|<triton-pool-id>) [(--verbose|-v)] [(--target|-t):<target>] [(--porcelain|-z)]

An error code is returned when no default <target> exists and (--target|-t) has not been specified.

Options

--verbose|-v

Whether to produce more verbose details about the pools.

<triton-pool-id>

Unique identifier of the pool. Represented as a 32-character UUID.

<triton-pool-name>

Unique name of the pool. Pool names can contain only alphanumeric, hyphen, and underscore characters. Pool names are case insensitive. Maximum allowed size is 1024 bytes.

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.


Allowlist

Allowlist Add

tmsctl allowlist add <image> [(--target|-t):<target>] [(--porcelain|-z)]

Connects to <target> and adds a Triton container image to the Triton allowlist.

When no default <target> exists and (--target|-t) has not been specified, an error occurs.

Options

<image>

Container image to add to the list of allowed Triton container images.

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.


Allowlist List

tmsctl allowlist list [(--target|-t):<target>] [(--porcelain|-z)]

Connects to <target> and lists the Triton container images new leases are allowed to be created with.

When no default <target> exists and (--target|-t) has not been specified, an error occurs.

Options

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.


Allowlist Remove

tmsctl allowlist rm <image> [(--target|-t):<target>] [(--porcelain|-z)]

Connects to <target> and removes a Triton container image from the Triton allowlist.

When no default <target> exists and (--target|-t) has not been specified, an error occurs.

Options

<image>

Container image to remove from the list of allowed Triton container images.

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.

--target|-t

Determines the Triton Management Service to connect to. See Common Options.

Target

Target Add

tmsctl target add [--force] [--set-default] <name> <url>

Adds a new target named <name> to the set of configured Triton Management Services.

<url> is required to be prefixed with “http://” or “https://”.

When a target named <name> already exists in the list of configured Triton Management Services, an error occurs unless --force is specified.

Options

--force

Replaces an existing configured Triton Management Service that has the same <name> with the specified <name> and <url>.

--set-default

Sets the new target as the default target for future commands that require a connection to Triton Management Service.


Target List

tmsctl target list [(--porcelain|-z)]

tmsctl targets [(--porcelain|-z)]

Lists all configured Triton Management Server targets.

Options

--porcelain|-z

Formats output in an easy-to-parse format for scripts. See Common Options.


Target Remove

tmsctl target rm [--force] <name>

Removes the target named <name> from the set of configured Triton Management Services.

If the target is the default, it is not removed unless the --force flag is used.

Options

--force

Removes the target named <name> regardless of whether it has been set as the default.


Target Set

tmsctl target set <name>

tmsctl target <target>

Sets <target> as the default target for future commands that require a connection to Triton Management Service.

<target> must have already been added to the list of possible targets.

See tmsctl target add for additional details.

Configuration

On startup, tmsctl reads its configuration from the .tmsctlconfig file in the home directory, if that file exists. This file contains information about any named targets (see the target command) and options that are used to configure the output of tmsctl.

The format of .tmsctlconfig is not guaranteed to be stable and is subject to change. For now, it is a JSON file.

Configuring Output Colors

By default, tmsctl outputs everything in the default colors of the console. This can be changed by adding an entry named "console" at the top level of the configuration file and setting its "enable-colors" property to "true". The following example configuration tells tmsctl to enable colors with its default color scheme:


{ "console": { "enable-colors": "true" } }

If the default tmsctl color scheme does not work well with your preferred terminal settings, you can customize the set of colors that tmsctl uses. When colors are enabled, tmsctl reads the additional properties from the "console" object to control text color:

  • "color": used for most output

  • "emphasis": used for lines that add emphasis (for example, lease IDs in tmsctl lease create)

  • "error": used for errors

  • "header": used for header lines

  • "understated": used for output that can often be ignored

  • "warning": used for warnings

Some options can have "-back" added to control the background color of the corresponding entry. The options that support "-back" are "emphasis", "error", "warning", and "understated".

Allowed values for colors are those listed by the .NET ConsoleColor class (https://learn.microsoft.com/en-us/dotnet/api/system.consolecolor). Colors must be provided in lower case, with words separated by a hyphen (-). For example, to use ConsoleColor.DarkGreen, specify "dark-green" in the configuration file.
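The naming rule for colors lends itself to a small conversion helper. The Python sketch below turns a ConsoleColor member name into the lower-case, hyphen-separated form and prints a "console" object that uses the properties listed above. The helper name console_color_name and the particular color choices are illustrative assumptions. Because the format of .tmsctlconfig is not guaranteed to be stable, the output is meant to be reviewed and merged by hand rather than written over an existing file.

```python
import json
import re


def console_color_name(dotnet_name: str) -> str:
    """Convert a ConsoleColor member name such as DarkGreen to dark-green."""
    # Insert a hyphen before every capital letter except the first character.
    return re.sub(r"(?<!^)(?=[A-Z])", "-", dotnet_name).lower()


# Illustrative color choices; any ConsoleColor member name could be used.
config = {
    "console": {
        "enable-colors": "true",
        "color": console_color_name("Gray"),
        "emphasis": console_color_name("DarkGreen"),
        "error": console_color_name("White"),
        "error-back": console_color_name("DarkRed"),
        "warning": console_color_name("DarkYellow"),
    }
}

print(json.dumps(config, indent=2))
```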

© Copyright 2023, NVIDIA. Last updated on Dec 11, 2023.