Model repository settings for the Triton Inference Server.
Definition at line 453 of file infer_trtis_server.h.
Public Member Functions

bool initFrom (const ic::TritonModelRepo &repo, const std::vector<int> &devIds)
    Populate the RepoSettings instance with the values from the TritonModelRepo protobuf message.
bool operator== (const RepoSettings &other) const
    Comparison operators.
bool operator!= (const RepoSettings &other) const

Data Fields

std::set<std::string> roots
    Set of model repository directories.
uint32_t logLevel = 0
    Level of the Triton log output.
bool tfAllowSoftPlacement = true
    Flag to enable/disable soft placement of TF operators.
float tfGpuMemoryFraction = 0
    TensorFlow GPU memory fraction per process.
bool strictModelConfig = true
    Flag to enable/disable Triton strict model configuration.
double minComputeCapacity = TRITON_DEFAULT_MINIMUM_COMPUTE_CAPABILITY
    The minimum supported compute capability for the Triton server.
uint64_t pinnedMemBytes = TRITON_DEFAULT_PINNED_MEMORY_BYTES
    Pre-allocated pinned memory on the host for the Triton runtime.
std::string backendDirectory {TRITON_DEFAULT_BACKEND_DIR}
    The path to the Triton backends directory.
int32_t controlMode = (int32_t)TRITONSERVER_MODEL_CONTROL_EXPLICIT
    Triton model control mode.
std::map<uint32_t, uint64_t> cudaDevMemMap
    Map of device IDs to the size of the CUDA memory pool to be allocated on each device.
std::vector<BackendConfig> backendConfigs
    Array of backend configuration settings.
std::string debugStr
    Debug string of the TritonModelRepo protobuf message.
bool nvdsinferserver::triton::RepoSettings::initFrom (const ic::TritonModelRepo &repo, const std::vector<int> &devIds)  [inline]

Populate the RepoSettings instance with the values from the TritonModelRepo protobuf message.

Parameters:
    [in] repo    The model repository configuration proto message.
    [in] devIds  Not used.

Definition at line 522 of file infer_trtis_server.h.
References operator==().
bool nvdsinferserver::triton::RepoSettings::operator== (const RepoSettings &other) const

Comparison operators.

Check whether the two repository settings are the same or different. Differing control modes are reported as a warning. cudaDevMemMap is not checked.

Referenced by operator!=().
std::vector<BackendConfig> nvdsinferserver::triton::RepoSettings::backendConfigs

Array of backend configuration settings.
Definition at line 498 of file infer_trtis_server.h.
std::string nvdsinferserver::triton::RepoSettings::backendDirectory {TRITON_DEFAULT_BACKEND_DIR}
The path to the Triton backends directory.
Definition at line 485 of file infer_trtis_server.h.
int32_t nvdsinferserver::triton::RepoSettings::controlMode = (int32_t)TRITONSERVER_MODEL_CONTROL_EXPLICIT
Triton model control mode.
Definition at line 489 of file infer_trtis_server.h.
std::map<uint32_t, uint64_t> nvdsinferserver::triton::RepoSettings::cudaDevMemMap

Map of device IDs to the size of the CUDA memory pool to be allocated on each device.
Definition at line 494 of file infer_trtis_server.h.
std::string nvdsinferserver::triton::RepoSettings::debugStr
Debug string of the TritonModelRepo protobuf message.
Definition at line 503 of file infer_trtis_server.h.
uint32_t nvdsinferserver::triton::RepoSettings::logLevel = 0
Level of the Triton log output.
Definition at line 461 of file infer_trtis_server.h.
double nvdsinferserver::triton::RepoSettings::minComputeCapacity = TRITON_DEFAULT_MINIMUM_COMPUTE_CAPABILITY

The minimum supported compute capability for the Triton server.
Definition at line 477 of file infer_trtis_server.h.
uint64_t nvdsinferserver::triton::RepoSettings::pinnedMemBytes = TRITON_DEFAULT_PINNED_MEMORY_BYTES

Pre-allocated pinned memory on the host for the Triton runtime.
Definition at line 481 of file infer_trtis_server.h.
std::set<std::string> nvdsinferserver::triton::RepoSettings::roots
Set of model repository directories.
Definition at line 457 of file infer_trtis_server.h.
bool nvdsinferserver::triton::RepoSettings::strictModelConfig = true
Flag to enable/disable Triton strict model configuration.
Definition at line 473 of file infer_trtis_server.h.
bool nvdsinferserver::triton::RepoSettings::tfAllowSoftPlacement = true
Flag to enable/disable soft placement of TF operators.
Definition at line 465 of file infer_trtis_server.h.
float nvdsinferserver::triton::RepoSettings::tfGpuMemoryFraction = 0
TensorFlow GPU memory fraction per process.
Definition at line 469 of file infer_trtis_server.h.