Debuggability Guide#
Overview#
NVIDIA Cloud Functions provides comprehensive debuggability features through two main approaches:
Real-Time Logs
Access near real-time logs for faster debugging
Available through both NGC UI and CLI
No long-term storage; logs are ephemeral and exist only during the workload lifecycle
Significantly reduced latency compared to traditional logging solutions
Remote Command Execution
Execute commands on function containers for debugging purposes
Support for common Linux commands in NGC CLI
Secure, controlled access to container environments
Real-Time Logs#
Real-time logs allow you to view function logs with minimal latency, providing immediate feedback during development and troubleshooting.
Key Benefits#
Immediate Feedback: View logs in near real-time, reducing debugging cycles
Reduced Latency: Significantly faster than historical log solutions
Multiple Access Methods: Available through both NGC UI and CLI interfaces
Getting Started#
Real-time logs are accessible for deployed NVIDIA Cloud Functions.
Access Logs via NGC UI
Navigate to your function in the NGC UI
Click the 3-dots button next to an active function
Select “View Logs” or “View Version Logs”
In the Logs page, you’ll see two tabs:
History Logs: For historical log analysis across different function version instances
Live Tail Logs: For near real-time log streaming
Using Live Tail Logs in NGC UI
Select the “Live Tail Logs” tab
Choose the Cluster name and Instance ID
Click “Start Session” to begin viewing live logs
Use “Pause Session” to temporarily halt the log stream
Use “Resume Session” to continue viewing logs
Click “Stop Session” to end the streaming session
Filter logs using the search box for quick identification of specific events
Note
Real-time logs are available only while a function instance is running and a real-time logging session is active. Once an instance terminates or restarts, these logs are no longer accessible. For historical log analysis, use the History Logs tab.
Live Tail logs are not stored and cannot be replayed after a session ends or after the 50,000-line buffer is exceeded.
Currently, live tail logs are only supported for functions deployed to GFN and DGXC cloud environments (note that not all GFN and DGXC environments may be supported).
Remote Command Execution#
Remote command execution allows you to run commands directly in your function’s container environment for advanced debugging. What you can do depends on your own container image: for example, if the container is distroless, you may not be able to access the target function container’s file system. Commands run from the root directory of the target container by default.
Key Benefits#
Interactive Debugging: Execute commands for troubleshooting without redeploying
Container Inspection: Examine file systems, processes, and environment variables
Secure Access: Commands are executed in a controlled, secure environment
Distroless Support: Debug containers with minimal operating system components
Getting Started#
View Available Instances
Navigate to your function in the NGC UI or use the CLI
Use the CLI to list instances:
    ngc cf fn instance ls <function-id>:<version-id>
      --org <org-id>        # NGC organization ID
      --team <team-name>    # Team name in the org
Execute Commands via NGC CLI
    ngc cf fn instance exec <function-id>:<version-id>
      --org <org-id>                        # NGC organization ID
      --team <team-name>                    # Team name in the org
      --instance-id <instance-id>           # Instance ID
      --pod-name <pod-name>                 # Pod name
      --container-name <container-name>     # Container name
      --command "<linux-command>"           # Linux command to execute
For example:
    ngc cf fn instance exec my-function:v1 \
      --org my-organization \
      --team my-team \
      --instance-id instance-1 \
      --pod-name pod-1234 \
      --container-name main \
      --command "ls -la"
NGC CLI Requirements and Examples#
CLI Version Requirements#
The debuggability features are only available in NGC CLI versions 3.131.5 and newer.
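A script that depends on these features can check the installed CLI version before calling them. The following is a minimal sketch, assuming `sort -V` (version sort, from GNU coreutils) is available; the helper name `version_ge` is hypothetical, and how you obtain the installed version string from the CLI is left to you.

```shell
# Minimum NGC CLI version stated in this guide.
MIN_NGC_VERSION="3.131.5"

# version_ge A B: succeed when version A is greater than or equal to version B.
# Hypothetical helper; relies on `sort -V` to order dotted version strings.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_ge "$installed" "$MIN_NGC_VERSION" || echo "Please upgrade the NGC CLI" >&2`, where `installed` holds the version string you extracted.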
Detailed CLI Examples#
List Function Instances, Containers, and Pods
    ngc cf fn instance ls <function-id>:<version-id>
      --org <org-id>        # NGC organization ID
      --team <team-name>    # Team name in the org
For example:
    ngc cf fn instance ls my-function:v1 \
      --org my-organization \
      --team my-team
Execute Commands on Target Containers
    ngc cf fn instance exec <function-id>:<version-id>
      --org <org-id>                        # NGC organization ID
      --team <team-name>                    # Team name in the org
      --instance-id <instance-id>           # Instance ID
      --pod-name <pod-name>                 # Pod name
      --container-name <container-name>     # Container name
      --command "<linux-command>"           # Linux command to execute
For example:
    ngc cf fn instance exec my-function:v1 \
      --org my-organization \
      --team my-team \
      --instance-id instance-1 \
      --pod-name pod-1234 \
      --container-name main \
      --command "ls -la"
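Because commands start in the root directory of the target container, it can help to prefix a command with `cd`. The sketch below builds such a command string from two pieces that both appear in the supported commands list (`cd` and `&&`). The helper name `in_dir` and the path `/app` are hypothetical, and the sketch assumes each invocation starts fresh from the root directory rather than remembering an earlier `cd`.

```shell
# in_dir DIR COMMAND: print a command string that changes into DIR
# before running COMMAND. Hypothetical helper; DIR is inserted as-is,
# so paths containing spaces would need quoting by the caller.
in_dir() {
  printf 'cd %s && %s\n' "$1" "$2"
}
```

The result can then be passed to the exec call, for example `--command "$(in_dir /app 'ls -la')"`.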
Attach Log Output from a Specific Pod Container
    ngc cf fn instance logs <function-id>:<version-id>
      --org <org-id>                        # NGC organization ID
      --team <team-name>                    # Team name in the org
      --instance-id <instance-id>           # Instance ID
      --pod-name <pod-name>                 # Pod name
      --container-name <container-name>     # Container name
For example:
    ngc cf fn instance logs my-function:v1 \
      --org my-organization \
      --team my-team \
      --instance-id instance-1 \
      --pod-name pod-1234 \
      --container-name main
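The NGC UI offers a search box for filtering live logs; on the command line a similar effect can be approximated by piping the log output through `grep`. This is a sketch only: it assumes the `logs` subcommand writes log lines to standard output, and the helper name `filter_logs` is hypothetical. The `--line-buffered` option is supported by GNU and BSD `grep` but not by every minimal implementation.

```shell
# filter_logs PATTERN: read log lines on stdin and print those that match
# PATTERN, ignoring case. --line-buffered flushes each matching line
# immediately, which matters when the input is a live stream.
filter_logs() {
  grep --line-buffered -i -e "$1"
}
```

A pipeline such as `ngc cf fn instance logs ... | filter_logs error` would then show only matching lines as they arrive.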
Attach Log Output from an Entire Instance
    ngc cf fn instance logs <function-id>:<version-id>
      --org <org-id>                # NGC organization ID
      --team <team-name>            # Team name in the org
      --instance-id <instance-id>   # Instance ID
For example:
    ngc cf fn instance logs my-function:v1 \
      --org my-organization \
      --team my-team \
      --instance-id instance-1
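Since live tail logs are not stored, you may want to keep a local copy while a session runs. The sketch below wraps the instance-level `logs` command shown above and appends its output to a file with `tee`. It assumes the command writes logs to standard output; the helper name `capture_logs` and the `NGC_BIN` override variable are hypothetical.

```shell
# capture_logs FUNCTION:VERSION ORG TEAM INSTANCE_ID OUTFILE
# Streams instance logs to the terminal and appends a copy to OUTFILE.
# NGC_BIN is a hypothetical override for the CLI executable (default: ngc).
capture_logs() {
  "${NGC_BIN:-ngc}" cf fn instance logs "$1" \
    --org "$2" \
    --team "$3" \
    --instance-id "$4" | tee -a "$5"
}
```

For example, `capture_logs my-function:v1 my-organization my-team instance-1 instance-1.log` keeps the stream visible while also writing it to `instance-1.log`.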
Supported Commands#
The following commands are supported for remote execution:
| Command/Method | Description |
|---|---|
| cat | Display file contents |
| ls | List directory contents |
| cd | Change directory |
| pwd | Print working directory |
| man | Display manual pages |
| sort | Sort lines of text files |
| df | Report file system disk space usage |
| du | Estimate file space usage |
| grep | Search for patterns in files |
| find | Search for files |
| head | Display beginning of files |
| more | Page through text |
| less | Page through text with more features |
| tail | Display end of files |
| wc | Print newline, word, and byte counts |
| cut | Remove sections from lines |
| echo | Display a line of text |
| printf | Format and print data |
| Print data | |
| ps | Report process status |
| base64 | Base64 encode/decode |
| Pipe (\|) | Pipe output |
| Input redirect (<) | Redirect input |
| Command separator (;) | Separate commands |
| Command chaining (&&) | Chain commands |
Note
The command execution environment is isolated and has no impact on the function’s running state. Command execution is logged for security and audit purposes.
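Before sending a command, a script can check locally that every command in it appears in the list above. The following is a rough sketch, not the service's own validation, whose exact rules are not described here and may differ. The names `SUPPORTED` and `is_supported_pipeline` are hypothetical, and the split on `|`, `;`, and `&&` is naive: it does not understand quoting, so a pattern such as `"a|b"` would be split incorrectly.

```shell
# Command names taken from the supported commands table in this guide.
SUPPORTED="cat ls cd pwd man sort df du grep find head more less tail wc cut echo printf ps base64"

# is_supported_pipeline COMMAND: succeed when the first word of every
# segment (split on |, ; and &&) is in SUPPORTED. The body runs in a
# subshell, so `exit` ends only the check and not the calling shell.
is_supported_pipeline() (
  printf '%s\n' "$1" | sed 's/&&/;/g' | tr '|;' '\n\n' |
  while read -r first rest; do
    [ -z "$first" ] && continue
    case " $SUPPORTED " in
      *" $first "*) ;;      # known command, keep checking
      *) exit 1 ;;          # unknown command, reject
    esac
  done
)
```

For example, `is_supported_pipeline "ps | grep python" || echo "command not in supported list" >&2`.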
Security#
NVCF ensures secure debugging capabilities:
Authentication and authorization for all debugging actions
Container isolation prevents unauthorized access
Limited command set to prevent system modifications
Access control based on NGC permissions
All debugging actions are logged and auditable
Troubleshooting#
Common Error Codes#
| Error Code | Description | Possible Resolution |
|---|---|---|
| 400 (BadRequestException) | Function is inactive or invalid parameters provided | Ensure function is active and parameters are correct |
| 401 (NotAuthorizedException) | Invalid authentication token | Check that your NGC API key or SSA token is valid |
| 403 (ForbiddenException) | Insufficient permissions or function does not exist | Verify that your token has the appropriate scopes and the function exists |
| 404 (NotFoundException) | Selected pod/container/instance does not exist | Verify that the specified resources exist and are correctly named |
| 429 (TooManyRequestsException) | Rate limit exceeded | Reduce the frequency of requests and try again later |
| 500 (UpstreamException) | Internal service error | Contact support if the issue persists |
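In automation, the table above can be turned into a small lookup so that a failure message carries its suggested next step. This is a sketch under assumptions: the helper name `suggest_resolution` is hypothetical, and how your tooling obtains the numeric status code is not covered by this guide.

```shell
# suggest_resolution CODE: print the resolution this guide lists for an
# HTTP status code. Returns non-zero for codes not in the table.
suggest_resolution() {
  case "$1" in
    400) echo "Ensure function is active and parameters are correct" ;;
    401) echo "Check that your NGC API key or SSA token is valid" ;;
    403) echo "Verify that your token has the appropriate scopes and the function exists" ;;
    404) echo "Verify that the specified resources exist and are correctly named" ;;
    429) echo "Reduce the frequency of requests and try again later" ;;
    500) echo "Contact support if the issue persists" ;;
    *)   echo "Unknown status code: $1"; return 1 ;;
  esac
}
```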
Required Permissions#
To use the debuggability features, ensure your NGC API key has the correct permissions:
When generating an NGC API key from the NGC console, select the “Cloud Function” permission
This permission grants the necessary access to use both Live Tail Logs and Command Execution features
Limitations#
Real-time logs are ephemeral with no long-term storage
Historical logs are still available through the standard logging system
Command execution is limited to a predefined set of commands
Debugging sessions have a maximum duration of 2 hours
Output size is limited to 2MB per command
Live tail logs are only supported for functions deployed to GFN and DGXC cloud environments
Live tail logs view maintains a maximum of 50,000 lines in the console buffer
Real-time logs cannot be searched in aggregate across functions (for example, searching for a string across all functions in an organization)
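Given the 2MB output limit per command, it can be useful to cap how much a command returns. The sketch below appends a `head` filter to a command string, using only the pipe and `head` entries from the supported commands list. The helper name `limit_lines` and the log path are hypothetical, and it is an assumption that the remote environment accepts the `-n` option.

```shell
# limit_lines COMMAND MAX_LINES: print COMMAND with a head filter appended
# so that at most MAX_LINES lines of output are returned.
limit_lines() {
  printf '%s | head -n %s\n' "$1" "$2"
}
```

For example, `--command "$(limit_lines 'cat /var/log/app.log' 500)"` would request only the first 500 lines of the file.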
Appendix A: Terminology#
| Term | Definition |
|---|---|
| NGC | NVIDIA GPU Cloud, which provides a way for users to set up and manage access to NVIDIA cloud services |
| NVCF | NVIDIA Cloud Functions |
| Ephemeral Container | A temporary container created within a pod for debugging purposes |
| Real-time Logs | Logs streamed with minimal latency during function execution |
| DGXC | DGX Cloud service |
| History Logs | Logs stored for longer-term analysis with search capabilities |
| Live Tail Logs | Near real-time streaming logs with minimal latency |
| Distroless Container | A container image with minimal operating system components |