Reference
This page provides links to additional resources for learning about CUDA Graphs and related topics. These materials complement this guide and offer deeper technical details, alternative perspectives, and comprehensive API documentation.
Official Documentation
CUDA
CUDA Programming Guide - CUDA Graphs: The authoritative source on CUDA Graphs, covering all features, constraints, and low-level APIs
CUDA C++ Programming Guide - CUDA Graphs: Legacy documentation (no longer updated as of CUDA 13.0), useful for older CUDA versions
CUDA Runtime API - Graph Management: Complete API reference for CUDA graph functions
Getting Started with CUDA Graphs (NVIDIA Blog): Introductory blog post with examples and use cases
PyTorch
PyTorch CUDA Semantics - CUDA Graphs: PyTorch’s official CUDA graph documentation with API details and usage patterns
torch.cuda.CUDAGraph API: Low-level CUDAGraph class reference
torch.cuda.graph(): Context manager for stream capture
torch.cuda.make_graphed_callables(): High-level API for graphing callables
PyTorch Reproducibility: Guide for deterministic behavior, relevant for graph validation
Frameworks
Megatron-LM: NVIDIA’s framework for training large language models with built-in CUDA graph support
Megatron Core Documentation: Official Megatron Core developer guide
CUDAGraph Trees: PyTorch compiler’s automatic CUDA graph system
NCCL User Guide - CUDA Graphs: Using NCCL collectives with CUDA graphs
Technical Blogs and Articles
Accelerating PyTorch with CUDA Graphs: PyTorch blog post on CUDA graph integration
Dynamic Control Flow in CUDA Graphs with Conditional Nodes: NVIDIA blog post on conditional nodes (IF, WHILE, SWITCH) for dynamic control flow in CUDA graphs
CUDA 10 Features Revealed: Original announcement of the CUDA Graphs feature
Tools and Profiling
NVIDIA Nsight Systems: Profiling tool for identifying performance bottlenecks and synchronization points
NVIDIA Nsight Compute: Detailed kernel-level profiling
Research Papers
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism: Paper introducing Megatron’s parallelism strategies
Community Resources
PyTorch Discuss Forums: Community Q&A; search for “CUDA graphs” to find relevant discussions
PyTorch GitHub Issues: Bug reports and feature requests; search for CUDA graph issues
CUDA Zone: NVIDIA’s CUDA developer portal with tutorials and resources