Depending on the size of your containers and models, deployment usually takes up to 30 minutes, though durations of up to 2 hours are permitted. If you believe your function should have deployed already, or if it has entered an error state, review the logs to understand what happened.

See below for information on how to view and use logs to troubleshoot issues with your functions.

There may not be enough capacity available to fulfill your deployment. Try reducing the number of instances you are requesting or changing the GPU/instance type used by your function.

Please review the logs and update your container or model as required.

This error typically occurs when the inference container expects a model file in a specified location, but the file is not present. Ensure that the path to your model files is correct and that the necessary files, such as config.json, are available at that location.

This message means that the system did not find an environment variable named MODELS during the worker pod’s initialization. If your setup requires this environment variable, ensure it’s defined in your function’s configuration.

Previously, models were compressed, and the system needed to extract them before use. This is no longer necessary, but the log entry indicating model extraction persists from this legacy setup. Your model files can now be in a standard file structure without compression.

Please note that when uploading containers to NGC, the maximum size allowed per layer is 10GB. Additionally, the maximum size for uploading a model is capped at 5TB.

The config.json file should be located under the /config/models/$model-name directory in your environment. Ensure the file is in the correct path as specified in your function’s configuration.

The “copying model” message indicates that the system is making the model files available for the inference container. This process does not affect

What errors do we need to handle in our client?

  • Invalid credentials: The client ID and secret were rejected when fetching a token.

  • Invalid token: The token may be expired, or rejected for another, unknown reason.

  • Invalid JSON input: The JSON provided is not formatted correctly.

  • HTTP standard errors: Generic HTTP errors that occur during communication.

  • cuOpt lib errors: Errors, such as 419, returned when the JSON input is malformed.

  • Service lib errors: When the service solver fails to find a solution. This could be a generic error, a segmentation fault, or simply an infeasible problem.

  • S3 errors: Issues with uploading to or pulling data from S3.
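The error categories above could be organized client-side as an exception hierarchy, so each can be caught and handled separately. This is a sketch only; all class names are illustrative and not part of any NVIDIA client library:

```python
class ClientError(Exception):
    """Base class for errors the client must handle (hypothetical)."""

class InvalidCredentialsError(ClientError):
    """Client ID and secret were rejected when fetching a token."""

class InvalidTokenError(ClientError):
    """Token is expired or otherwise unusable."""

class InvalidJsonError(ClientError):
    """Request payload is not well-formed JSON."""

class HttpError(ClientError):
    """Generic HTTP error during communication."""
    def __init__(self, status: int, message: str = ""):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status

class SolverError(ClientError):
    """Service solver failed: generic error, crash, or infeasible problem."""

class S3Error(ClientError):
    """Upload to or download from S3 failed."""
```

Catching `ClientError` then handles every category at once, while more specific handlers (e.g. retrying on `InvalidTokenError`) can be layered on top.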

See Terminology section.

© Copyright 2023-2024, NVIDIA. Last updated on Feb 16, 2024.