Function Management

Developer Guide (Latest)

Each time you create a function, a function ID is generated along with an initial function version ID. You can create additional versions of this function by specifying different models, containers, or Helm charts, along with any additional configuration. Here is a sample API call:


curl -X 'POST' \
  '$FUNCTION_ID/versions' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer $API_KEY' \
  -d '{
        "name": "echo_function",
        "inferenceUrl": "/echo",
        "containerImage": "$ORG_NAME/echo:latest",
        "apiBodyFormat": "CUSTOM"
      }'

Multiple function versions allow a different deployment configuration for each version while remaining accessible through a single function endpoint. Multiple function versions can also be used to support A/B testing.
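Because every version stays invocable behind the same function endpoint, a simple A/B split can be implemented client-side by weighting requests across version IDs. The sketch below is illustrative only; the version IDs and the 90/10 weighting are assumptions, not values from the API.

```python
import random

# Hypothetical version IDs, as returned when creating function versions.
VERSION_A = "11111111-aaaa-4444-bbbb-222222222222"
VERSION_B = "33333333-cccc-4444-dddd-444444444444"


def pick_version(weight_a: float = 0.9) -> str:
    """Route roughly weight_a of traffic to version A, the rest to version B."""
    return VERSION_A if random.random() < weight_a else VERSION_B
```

The chosen version ID would then be supplied when invoking the function, so each version's metrics can be compared independently.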


Function versioning should only be used if the APIs of the various versions are compatible with each other. Different APIs should be created as new functions.

Cloud Functions supports a set of public functions available for all Cloud Functions users to invoke. Functions "shared" with another NGC organization for invocation are also supported. These are listed as public or shared, respectively, in the functions list API response.
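For illustration, entries from a functions list response could be filtered by that visibility marker as below. The response shape and the `visibility` field name here are assumptions for the sketch, not the documented schema.

```python
# Hypothetical functions-list entries; the real response schema may differ.
functions = [
    {"name": "echo_function", "visibility": "private"},
    {"name": "community_ocr", "visibility": "public"},
    {"name": "partner_asr", "visibility": "shared"},
]

# Keep only functions that other users or organizations can invoke.
public_or_shared = [
    f["name"] for f in functions if f["visibility"] in ("public", "shared")
]
```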

For debugging deployment failures see Deployment Failures.

For debugging invocation failures see Statuses and Errors.

See below for instructions on adding logging to your inference container and viewing metrics.

This section gives an overview of the metrics and logs available within the Cloud Functions UI. Note that for full observability of production workloads, it is recommended to emit logs, metrics, and analytics to third-party monitoring tools from within your container.

Emit and View Inference Container Logs

View inference container logs in the Cloud Functions UI via the “Logs” tab in the function details page. To get here, click any function version from the “Functions” list and click “View Details” on the side panel to the right.


Logs are currently available with up to 48 hours of history, and can be viewed as expanded rows for scanning or as a "window" view for ease of copying and pasting.


Note that as a prerequisite, your inference container must be instrumented to emit logs. This is highly recommended.

How to Add Logs to Your Inference Container

Here is an example of adding NVCF-compatible logs. The logging helper function below, along with other helper functions, can be imported from the Helper Functions repository.


import logging
import sys


def get_logger() -> logging.Logger:
    """
    Gets a Logger that logs in a format compatible with NVCF.

    :return: logging.Logger
    """
    sys.stdout.reconfigure(encoding="utf-8")
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s[%(levelname)s] [INFERENCE]%(message)s",
        handlers=[logging.StreamHandler(sys.stdout)],
    )
    return logging.getLogger(__name__)


class MyServer:
    def __init__(self):
        self.logger = get_logger()

    def _infer_fn(self, request):
        self.logger.info("Got a request!")
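As a quick sanity check on the log line shape, you can format a record by hand with the same format string. This snippet is purely illustrative and not part of the helper repository.

```python
import logging

# The same format string used by the logging helper above.
formatter = logging.Formatter("%(asctime)s[%(levelname)s] [INFERENCE]%(message)s")

record = logging.LogRecord(
    name="test", level=logging.INFO, pathname="test", lineno=0,
    msg="Got a request!", args=None, exc_info=None,
)
line = formatter.format(record)
# Produces a timestamp followed by "[INFO] [INFERENCE]Got a request!"
print(line)
```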

View Function Metrics

NVCF exposes the following metrics by default.

  • Instance counts (current, min and max)

  • Invocation activity and queue depth

  • Total invocation count, success rate and failure count

  • Average inference time
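For example, the success rate shown alongside the invocation counts can be derived directly from them. The numbers below are made up for the sketch.

```python
# Made-up invocation counts for illustration.
total_invocations = 1_000
failure_count = 25

success_count = total_invocations - failure_count
success_rate = success_count / total_invocations * 100
print(f"{success_rate:.1f}% success rate")
```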

Metrics are viewable upon clicking any function from the “Functions” list page. The function overview page will display aggregated values across all function versions.


When clicking into a function version’s details page, you will then see metrics for this specific function version.



There may be up to a 5-minute delay in metric ingestion. Timeseries queries within the page are aggregated on 5-minute intervals, with a step set to show 500 data points. All stat queries are based on the total selected time period and are reduced to show either the latest total value or a mean value.
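Given those two constraints (5-minute aggregation, at most 500 points), the effective step for a selected time window can be estimated as below. This is a sketch of the arithmetic only; the actual query logic is internal to the UI.

```python
import math


def estimated_step_seconds(window_seconds: int, max_points: int = 500,
                           min_step: int = 300) -> int:
    """Smallest step (in seconds) that keeps the window at or under
    max_points, never finer than the 5-minute (300 s) aggregation interval."""
    return max(min_step, math.ceil(window_seconds / max_points))


# A 1-hour window stays at the 5-minute floor; a 7-day window
# (604800 s / 500 points) widens the step to 1210 s.
print(estimated_step_seconds(3600), estimated_step_seconds(7 * 24 * 3600))
```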

© Copyright 2024, NVIDIA Corporation. Last updated on Jun 7, 2024.