Contents
- 1. Introduction
- 2. Programming Model
- 3. PTX Machine Model
- 4. Syntax
- 5. State Spaces, Types, and Variables
- 5.1. State Spaces
- 5.2. Types
- 5.3. Texture Sampler and Surface Types
- 5.4. Variables
- 5.5. Tensors
- 6. Instruction Operands
- 7. Abstracting the ABI
- 8. Memory Consistency Model
- 9. Instruction Set
- 9.1. Format and Semantics of Instruction Descriptions
- 9.2. PTX Instructions
- 9.3. Predicated Execution
- 9.4. Type Information for Instructions and Operands
- 9.5. Divergence of Threads in Control Constructs
- 9.6. Semantics
- 9.7. Instructions
- 9.7.1. Integer Arithmetic Instructions
- 9.7.1.1. Integer Arithmetic Instructions: add
- 9.7.1.2. Integer Arithmetic Instructions: sub
- 9.7.1.3. Integer Arithmetic Instructions: mul
- 9.7.1.4. Integer Arithmetic Instructions: mad
- 9.7.1.5. Integer Arithmetic Instructions: mul24
- 9.7.1.6. Integer Arithmetic Instructions: mad24
- 9.7.1.7. Integer Arithmetic Instructions: sad
- 9.7.1.8. Integer Arithmetic Instructions: div
- 9.7.1.9. Integer Arithmetic Instructions: rem
- 9.7.1.10. Integer Arithmetic Instructions: abs
- 9.7.1.11. Integer Arithmetic Instructions: neg
- 9.7.1.12. Integer Arithmetic Instructions: min
- 9.7.1.13. Integer Arithmetic Instructions: max
- 9.7.1.14. Integer Arithmetic Instructions: popc
- 9.7.1.15. Integer Arithmetic Instructions: clz
- 9.7.1.16. Integer Arithmetic Instructions: bfind
- 9.7.1.17. Integer Arithmetic Instructions: fns
- 9.7.1.18. Integer Arithmetic Instructions: brev
- 9.7.1.19. Integer Arithmetic Instructions: bfe
- 9.7.1.20. Integer Arithmetic Instructions: bfi
- 9.7.1.21. Integer Arithmetic Instructions: szext
- 9.7.1.22. Integer Arithmetic Instructions: bmsk
- 9.7.1.23. Integer Arithmetic Instructions: dp4a
- 9.7.1.24. Integer Arithmetic Instructions: dp2a
- 9.7.2. Extended-Precision Integer Arithmetic Instructions
- 9.7.2.1. Extended-Precision Arithmetic Instructions: add.cc
- 9.7.2.2. Extended-Precision Arithmetic Instructions: addc
- 9.7.2.3. Extended-Precision Arithmetic Instructions: sub.cc
- 9.7.2.4. Extended-Precision Arithmetic Instructions: subc
- 9.7.2.5. Extended-Precision Arithmetic Instructions: mad.cc
- 9.7.2.6. Extended-Precision Arithmetic Instructions: madc
- 9.7.3. Floating-Point Instructions
- 9.7.3.1. Floating Point Instructions: testp
- 9.7.3.2. Floating Point Instructions: copysign
- 9.7.3.3. Floating Point Instructions: add
- 9.7.3.4. Floating Point Instructions: sub
- 9.7.3.5. Floating Point Instructions: mul
- 9.7.3.6. Floating Point Instructions: fma
- 9.7.3.7. Floating Point Instructions: mad
- 9.7.3.8. Floating Point Instructions: div
- 9.7.3.9. Floating Point Instructions: abs
- 9.7.3.10. Floating Point Instructions: neg
- 9.7.3.11. Floating Point Instructions: min
- 9.7.3.12. Floating Point Instructions: max
- 9.7.3.13. Floating Point Instructions: rcp
- 9.7.3.14. Floating Point Instructions: rcp.approx.ftz.f64
- 9.7.3.15. Floating Point Instructions: sqrt
- 9.7.3.16. Floating Point Instructions: rsqrt
- 9.7.3.17. Floating Point Instructions: rsqrt.approx.ftz.f64
- 9.7.3.18. Floating Point Instructions: sin
- 9.7.3.19. Floating Point Instructions: cos
- 9.7.3.20. Floating Point Instructions: lg2
- 9.7.3.21. Floating Point Instructions: ex2
- 9.7.3.22. Floating Point Instructions: tanh
- 9.7.4. Half Precision Floating-Point Instructions
- 9.7.4.1. Half Precision Floating Point Instructions: add
- 9.7.4.2. Half Precision Floating Point Instructions: sub
- 9.7.4.3. Half Precision Floating Point Instructions: mul
- 9.7.4.4. Half Precision Floating Point Instructions: fma
- 9.7.4.5. Half Precision Floating Point Instructions: neg
- 9.7.4.6. Half Precision Floating Point Instructions: abs
- 9.7.4.7. Half Precision Floating Point Instructions: min
- 9.7.4.8. Half Precision Floating Point Instructions: max
- 9.7.4.9. Half Precision Floating Point Instructions: tanh
- 9.7.4.10. Half Precision Floating Point Instructions: ex2
- 9.7.5. Mixed Precision Floating-Point Instructions
- 9.7.6. Comparison and Selection Instructions
- 9.7.7. Half Precision Comparison Instructions
- 9.7.8. Logic and Shift Instructions
- 9.7.8.1. Logic and Shift Instructions: and
- 9.7.8.2. Logic and Shift Instructions: or
- 9.7.8.3. Logic and Shift Instructions: xor
- 9.7.8.4. Logic and Shift Instructions: not
- 9.7.8.5. Logic and Shift Instructions: cnot
- 9.7.8.6. Logic and Shift Instructions: lop3
- 9.7.8.7. Logic and Shift Instructions: shf
- 9.7.8.8. Logic and Shift Instructions: shl
- 9.7.8.9. Logic and Shift Instructions: shr
- 9.7.9. Data Movement and Conversion Instructions
- 9.7.9.1. Cache Operators
- 9.7.9.2. Cache Eviction Priority Hints
- 9.7.9.3. Data Movement and Conversion Instructions: mov
- 9.7.9.4. Data Movement and Conversion Instructions: mov
- 9.7.9.5. Data Movement and Conversion Instructions: shfl (deprecated)
- 9.7.9.6. Data Movement and Conversion Instructions: shfl.sync
- 9.7.9.7. Data Movement and Conversion Instructions: prmt
- 9.7.9.8. Data Movement and Conversion Instructions: ld
- 9.7.9.9. Data Movement and Conversion Instructions: ld.global.nc
- 9.7.9.10. Data Movement and Conversion Instructions: ldu
- 9.7.9.11. Data Movement and Conversion Instructions: st
- 9.7.9.12. Data Movement and Conversion Instructions: st.async
- 9.7.9.13. Data Movement and Conversion Instructions: st.bulk
- 9.7.9.14. Data Movement and Conversion Instructions: multimem.ld_reduce, multimem.st, multimem.red
- 9.7.9.15. Data Movement and Conversion Instructions: prefetch, prefetchu
- 9.7.9.16. Data Movement and Conversion Instructions: applypriority
- 9.7.9.17. Data Movement and Conversion Instructions: discard
- 9.7.9.18. Data Movement and Conversion Instructions: createpolicy
- 9.7.9.19. Data Movement and Conversion Instructions: isspacep
- 9.7.9.20. Data Movement and Conversion Instructions: cvta
- 9.7.9.21. Data Movement and Conversion Instructions: cvt
- 9.7.9.22. Data Movement and Conversion Instructions: cvt.pack
- 9.7.9.23. Data Movement and Conversion Instructions: mapa
- 9.7.9.24. Data Movement and Conversion Instructions: getctarank
- 9.7.9.25. Data Movement and Conversion Instructions: Asynchronous copy
- 9.7.9.25.1. Completion Mechanisms for Asynchronous Copy Operations
- 9.7.9.25.2. Async Proxy
- 9.7.9.25.3. Data Movement and Conversion Instructions: Non-bulk copy
- 9.7.9.25.4. Data Movement and Conversion Instructions: Bulk copy
- 9.7.9.25.5. Data Movement and Conversion Instructions: Tensor copy
- 9.7.9.25.5.1. Restriction on Tensor Copy instructions
- 9.7.9.25.5.2. Data Movement and Conversion Instructions: cp.async.bulk.tensor
- 9.7.9.25.5.3. Data Movement and Conversion Instructions: cp.reduce.async.bulk.tensor
- 9.7.9.25.5.4. Data Movement and Conversion Instructions: cp.async.bulk.prefetch.tensor
- 9.7.9.25.6. Data Movement and Conversion Instructions: Bulk and Tensor copy completion instructions
- 9.7.9.26. Data Movement and Conversion Instructions: tensormap.replace
- 9.7.10. Texture Instructions
- 9.7.11. Surface Instructions
- 9.7.12. Control Flow Instructions
- 9.7.13. Parallel Synchronization and Communication Instructions
- 9.7.13.1. Parallel Synchronization and Communication Instructions: bar, barrier
- 9.7.13.2. Parallel Synchronization and Communication Instructions: bar.warp.sync
- 9.7.13.3. Parallel Synchronization and Communication Instructions: barrier.cluster
- 9.7.13.4. Parallel Synchronization and Communication Instructions: membar/fence
- 9.7.13.5. Parallel Synchronization and Communication Instructions: atom
- 9.7.13.6. Parallel Synchronization and Communication Instructions: red
- 9.7.13.7. Parallel Synchronization and Communication Instructions: red.async
- 9.7.13.8. Parallel Synchronization and Communication Instructions: vote (deprecated)
- 9.7.13.9. Parallel Synchronization and Communication Instructions: vote.sync
- 9.7.13.10. Parallel Synchronization and Communication Instructions: match.sync
- 9.7.13.11. Parallel Synchronization and Communication Instructions: activemask
- 9.7.13.12. Parallel Synchronization and Communication Instructions: redux.sync
- 9.7.13.13. Parallel Synchronization and Communication Instructions: griddepcontrol
- 9.7.13.14. Parallel Synchronization and Communication Instructions: elect.sync
- 9.7.13.15. Parallel Synchronization and Communication Instructions: mbarrier
- 9.7.13.15.1. Size and alignment of mbarrier object
- 9.7.13.15.2. Contents of the mbarrier object
- 9.7.13.15.3. Lifecycle of the mbarrier object
- 9.7.13.15.4. Phase of the mbarrier object
- 9.7.13.15.5. Tracking asynchronous operations by the mbarrier object
- 9.7.13.15.6. Phase Completion of the mbarrier object
- 9.7.13.15.7. Arrive-on operation on mbarrier object
- 9.7.13.15.8. mbarrier support with shared memory
- 9.7.13.15.9. Parallel Synchronization and Communication Instructions: mbarrier.init
- 9.7.13.15.10. Parallel Synchronization and Communication Instructions: mbarrier.inval
- 9.7.13.15.11. Parallel Synchronization and Communication Instructions: mbarrier.expect_tx
- 9.7.13.15.12. Parallel Synchronization and Communication Instructions: mbarrier.complete_tx
- 9.7.13.15.13. Parallel Synchronization and Communication Instructions: mbarrier.arrive
- 9.7.13.15.14. Parallel Synchronization and Communication Instructions: mbarrier.arrive_drop
- 9.7.13.15.15. Parallel Synchronization and Communication Instructions: cp.async.mbarrier.arrive
- 9.7.13.15.16. Parallel Synchronization and Communication Instructions: mbarrier.test_wait/mbarrier.try_wait
- 9.7.13.15.17. Parallel Synchronization and Communication Instructions: mbarrier.pending_count
- 9.7.13.16. Parallel Synchronization and Communication Instructions: tensormap.cp_fenceproxy
- 9.7.13.17. Parallel Synchronization and Communication Instructions: clusterlaunchcontrol.try_cancel
- 9.7.13.18. Parallel Synchronization and Communication Instructions: clusterlaunchcontrol.query_cancel
- 9.7.14. Warp Level Matrix Multiply-Accumulate Instructions
- 9.7.14.1. Matrix Shape
- 9.7.14.2. Matrix Data-types
- 9.7.14.3. Block Scaling
- 9.7.14.4. Matrix multiply-accumulate operation using wmma instructions
- 9.7.14.5. Matrix multiply-accumulate operation using mma instruction
- 9.7.14.5.1. Matrix Fragments for mma.m8n8k4 with .f16 floating point type
- 9.7.14.5.2. Matrix Fragments for mma.m8n8k4 with .f64 floating point type
- 9.7.14.5.3. Matrix Fragments for mma.m8n8k16
- 9.7.14.5.4. Matrix Fragments for mma.m8n8k32
- 9.7.14.5.5. Matrix Fragments for mma.m8n8k128
- 9.7.14.5.6. Matrix Fragments for mma.m16n8k4
- 9.7.14.5.7. Matrix Fragments for mma.m16n8k8
- 9.7.14.5.8. Matrix Fragments for mma.m16n8k16 with floating point type
- 9.7.14.5.9. Matrix Fragments for mma.m16n8k16 with integer type
- 9.7.14.5.10. Matrix Fragments for mma.m16n8k32
- 9.7.14.5.11. Matrix Fragments for mma.m16n8k64
- 9.7.14.5.12. Matrix Fragments for mma.m16n8k128
- 9.7.14.5.13. Matrix Fragments for mma.m16n8k256
- 9.7.14.5.14. Multiply-and-Accumulate Instruction: mma
- 9.7.14.5.15. Warp-level matrix load instruction: ldmatrix
- 9.7.14.5.16. Warp-level matrix store instruction: stmatrix
- 9.7.14.5.17. Warp-level matrix transpose instruction: movmatrix
- 9.7.14.6. Matrix multiply-accumulate operation using mma.sp instruction with sparse matrix A
- 9.7.14.6.1. Sparse matrix storage
- 9.7.14.6.2. Matrix fragments for multiply-accumulate operation with sparse matrix A
- 9.7.14.6.2.1. Matrix Fragments for sparse mma.m16n8k16 with .f16 and .bf16 types
- 9.7.14.6.2.2. Matrix Fragments for sparse mma.m16n8k32 with .f16 and .bf16 types
- 9.7.14.6.2.3. Matrix Fragments for sparse mma.m16n8k16 with .tf32 floating point type
- 9.7.14.6.2.4. Matrix Fragments for sparse mma.m16n8k8 with .tf32 floating point type
- 9.7.14.6.2.5. Matrix Fragments for sparse mma.m16n8k32 with .u8/.s8 integer type
- 9.7.14.6.2.6. Matrix Fragments for sparse mma.m16n8k64 with .u8/.s8/.e4m3/.e5m2 type
- 9.7.14.6.2.7. Matrix Fragments for sparse mma.m16n8k64 with .u4/.s4 integer type
- 9.7.14.6.2.8. Matrix Fragments for sparse mma.m16n8k128 with .u4/.s4 integer type
- 9.7.14.6.3. Multiply-and-Accumulate Instruction: mma.sp/mma.sp::ordered_metadata
- 9.7.15. Asynchronous Warpgroup Level Matrix Multiply-Accumulate Instructions
- 9.7.15.1. Warpgroup
- 9.7.15.2. Matrix Shape
- 9.7.15.3. Matrix Data-types
- 9.7.15.4. Async Proxy
- 9.7.15.5. Asynchronous Warpgroup Level Matrix Multiply-Accumulate Operation using wgmma.mma_async instruction
- 9.7.15.6. Asynchronous Warpgroup Level Multiply-and-Accumulate Operation using wgmma.mma_async.sp instruction
- 9.7.15.7. Asynchronous wgmma Proxy Operations
- 9.7.16. TensorCore 5th Generation Family Instructions
- 9.7.16.1. Tensor Memory
- 9.7.16.2. Matrix and Data Movement Shape
- 9.7.16.3. Major-ness supported by Strides
- 9.7.16.4. Matrix Descriptors
- 9.7.16.5. Issue Granularity
- 9.7.16.6. Memory Consistency Model for 5th generation of TensorCore operations
- 9.7.16.7. Tensor Memory Allocation and Management Instructions
- 9.7.16.8. Tensor Memory and Register Load/Store Instructions
- 9.7.16.9. Tensor Memory Data Movement Instructions
- 9.7.16.10. TensorCore 5th Generation Matrix Multiply and accumulate Operations
- 9.7.16.10.1. Transpose and Negate operations
- 9.7.16.10.2. Matrix Layout Organization
- 9.7.16.10.3. Valid Combinations of Type-Size, Major-ness and Swizzling
- 9.7.16.10.4. Packing formats of elements in Tensor and Shared memory
- 9.7.16.10.4.1. Packing format for matrix D in Tensor Memory
- 9.7.16.10.4.2. Packing format for matrix A and B
- 9.7.16.10.4.3. Packing format used for matrix A by .kind::mxf8f6f4 in Tensor Memory
- 9.7.16.10.4.4. Packing format used for matrix A and B by .kind::mxf8f6f4 in Shared Memory
- 9.7.16.10.4.5. Packing format used for matrix A by .kind::mxf4 and .kind::mxf4nvf4 in Tensor Memory
- 9.7.16.10.4.6. Packing format used for matrix A and B by .kind::mxf4 and .kind::mxf4nvf4 in Shared Memory
- 9.7.16.10.5. Data Path Layout Organization
- 9.7.16.10.5.1. Layout A (M = 256)
- 9.7.16.10.5.2. Layout B (M = 128 + cta-group::2 + Dense A matrix)
- 9.7.16.10.5.3. Layout C (M = 128 + cta-group::2 + Sparse A matrix)
- 9.7.16.10.5.4. Layout D (M = 128 + cta-group::1)
- 9.7.16.10.5.5. Layout E (M = 64 + .ws mode)
- 9.7.16.10.5.6. Layout F (M = 64 + non .ws mode)
- 9.7.16.10.5.7. Layout G (M = 32)
- 9.7.16.10.6. Shared Memory Layout and Swizzling
- 9.7.16.10.7. Block Scaling
- 9.7.16.10.7.1. Valid combinations of scale_vectorsize with types and MMA-Kind
- 9.7.16.10.7.2. Scale Factor A ID
- 9.7.16.10.7.2.1. Layout of the Scale Factor A Matrix for scale_vec::1X/block32 with K=32/K=64
- 9.7.16.10.7.2.2. Layout of the Scale Factor A Matrix for scale_vec::2X/block32 with K=64/K=128
- 9.7.16.10.7.2.3. Layout of the Scale Factor A Matrix for scale_vec::4X/block16 with K=64/K=128
- 9.7.16.10.7.2.4. Layout of the Scale Factor A Matrix for block32 with K=96 (Semantically equivalent to scale_vec::3X)
- 9.7.16.10.7.2.5. Layout of the Scale Factor A Matrix for block16 with K=96 (Semantically equivalent to scale_vec::6X)
- 9.7.16.10.7.3. Scale Factor B ID
- 9.7.16.10.7.3.1. Layout of the Scale Factor B Matrix for scale_vec::1X/block32 with K=32/K=64
- 9.7.16.10.7.3.2. Layout of the Scale Factor B Matrix for scale_vec::2X/block32 with K=64/K=128
- 9.7.16.10.7.3.3. Layout of the Scale Factor B Matrix for scale_vec::4X/block16 with K=64/K=128
- 9.7.16.10.7.3.4. Layout of the Scale Factor B Matrix for block32 with K=96 (Semantically equivalent to scale_vec::3X)
- 9.7.16.10.7.3.5. Layout of the Scale Factor B Matrix for block16 with K=96 (Semantically equivalent to scale_vec::6X)
- 9.7.16.10.8. Sparse Matrices
- 9.7.16.10.8.1. Sparse tcgen05.mma.sp with .kind::tf32
- 9.7.16.10.8.2. Sparse tcgen05.mma.sp with .kind::f16, .kind::f8f6f4, .kind::mxf8f6f4, .kind::i8
- 9.7.16.10.8.3. Sparse tcgen05.mma.sp with .kind::mxf4 and .kind::mxf4nvf4
- 9.7.16.10.8.4. Sparsity selector
- 9.7.16.10.8.4.1. Layout of the Sparsity Metadata Matrix for M = 64 for .kind::f16
- 9.7.16.10.8.4.2. Layout of the Sparsity Metadata Matrix for M = 128 / M = 256 for .kind::f16
- 9.7.16.10.8.4.3. Layout of the Sparsity Metadata Matrix for M = 64 for .kind::tf32
- 9.7.16.10.8.4.4. Layout of the Sparsity Metadata Matrix for M = 128 / M = 256 for .kind::tf32
- 9.7.16.10.8.4.5. Layout of the Sparsity Metadata Matrix for M = 64 for .kind::f8f6f4, .kind::mxf8f6f4, .kind::i8, .kind::mxf4, .kind::mxf4nvf4
- 9.7.16.10.8.4.6. Layout of the Sparsity Metadata Matrix for M = 128 / M = 256 for .kind::f8f6f4, .kind::mxf8f6f4, .kind::i8, .kind::mxf4, .kind::mxf4nvf4
- 9.7.16.10.8.5. Alignment restriction
- 9.7.16.10.9. TensorCore 5th Generation of MMA Instructions
- 9.7.16.11. TensorCore 5th Generation Specialized Synchronization Operations
- 9.7.16.12. TensorCore 5th Generation Async Synchronization Operations
- 9.7.17. Stack Manipulation Instructions
- 9.7.18. Video Instructions
- 9.7.19. Miscellaneous Instructions
- 10. Special Registers
- 10.1. Special Registers: %tid
- 10.2. Special Registers: %ntid
- 10.3. Special Registers: %laneid
- 10.4. Special Registers: %warpid
- 10.5. Special Registers: %nwarpid
- 10.6. Special Registers: %ctaid
- 10.7. Special Registers: %nctaid
- 10.8. Special Registers: %smid
- 10.9. Special Registers: %nsmid
- 10.10. Special Registers: %gridid
- 10.11. Special Registers: %is_explicit_cluster
- 10.12. Special Registers: %clusterid
- 10.13. Special Registers: %nclusterid
- 10.14. Special Registers: %cluster_ctaid
- 10.15. Special Registers: %cluster_nctaid
- 10.16. Special Registers: %cluster_ctarank
- 10.17. Special Registers: %cluster_nctarank
- 10.18. Special Registers: %lanemask_eq
- 10.19. Special Registers: %lanemask_le
- 10.20. Special Registers: %lanemask_lt
- 10.21. Special Registers: %lanemask_ge
- 10.22. Special Registers: %lanemask_gt
- 10.23. Special Registers: %clock, %clock_hi
- 10.24. Special Registers: %clock64
- 10.25. Special Registers: %pm0…%pm7
- 10.26. Special Registers: %pm0_64…%pm7_64
- 10.27. Special Registers: %envreg<32>
- 10.28. Special Registers: %globaltimer, %globaltimer_lo, %globaltimer_hi
- 10.29. Special Registers: %reserved_smem_offset_begin, %reserved_smem_offset_end, %reserved_smem_offset_cap, %reserved_smem_offset_<2>
- 10.30. Special Registers: %total_smem_size
- 10.31. Special Registers: %aggr_smem_size
- 10.32. Special Registers: %dynamic_smem_size
- 10.33. Special Registers: %current_graph_exec
- 11. Directives
- 11.1. PTX Module Directives
- 11.2. Specifying Kernel Entry Points and Functions
- 11.3. Control Flow Directives
- 11.4. Performance-Tuning Directives
- 11.4.1. Performance-Tuning Directives: .maxnreg
- 11.4.2. Performance-Tuning Directives: .maxntid
- 11.4.3. Performance-Tuning Directives: .reqntid
- 11.4.4. Performance-Tuning Directives: .minnctapersm
- 11.4.5. Performance-Tuning Directives: .maxnctapersm (deprecated)
- 11.4.6. Performance-Tuning Directives: .noreturn
- 11.4.7. Performance-Tuning Directives: .pragma
- 11.4.8. Performance-Tuning Directives: .abi_preserve
- 11.4.9. Performance-Tuning Directives: .abi_preserve_control
- 11.5. Debugging Directives
- 11.6. Linking Directives
- 11.7. Cluster Dimension Directives
- 11.8. Miscellaneous Directives
- 12. Descriptions of .pragma Strings
- 13. Release Notes
- 13.1. Changes in PTX ISA Version 9.0
- 13.2. Changes in PTX ISA Version 8.8
- 13.3. Changes in PTX ISA Version 8.7
- 13.4. Changes in PTX ISA Version 8.6
- 13.5. Changes in PTX ISA Version 8.5
- 13.6. Changes in PTX ISA Version 8.4
- 13.7. Changes in PTX ISA Version 8.3
- 13.8. Changes in PTX ISA Version 8.2
- 13.9. Changes in PTX ISA Version 8.1
- 13.10. Changes in PTX ISA Version 8.0
- 13.11. Changes in PTX ISA Version 7.8
- 13.12. Changes in PTX ISA Version 7.7
- 13.13. Changes in PTX ISA Version 7.6
- 13.14. Changes in PTX ISA Version 7.5
- 13.15. Changes in PTX ISA Version 7.4
- 13.16. Changes in PTX ISA Version 7.3
- 13.17. Changes in PTX ISA Version 7.2
- 13.18. Changes in PTX ISA Version 7.1
- 13.19. Changes in PTX ISA Version 7.0
- 13.20. Changes in PTX ISA Version 6.5
- 13.21. Changes in PTX ISA Version 6.4
- 13.22. Changes in PTX ISA Version 6.3
- 13.23. Changes in PTX ISA Version 6.2
- 13.24. Changes in PTX ISA Version 6.1
- 13.25. Changes in PTX ISA Version 6.0
- 13.26. Changes in PTX ISA Version 5.0
- 13.27. Changes in PTX ISA Version 4.3
- 13.28. Changes in PTX ISA Version 4.2
- 13.29. Changes in PTX ISA Version 4.1
- 13.30. Changes in PTX ISA Version 4.0
- 13.31. Changes in PTX ISA Version 3.2
- 13.32. Changes in PTX ISA Version 3.1
- 13.33. Changes in PTX ISA Version 3.0
- 13.34. Changes in PTX ISA Version 2.3
- 13.35. Changes in PTX ISA Version 2.2
- 13.36. Changes in PTX ISA Version 2.1
- 13.37. Changes in PTX ISA Version 2.0
- 14. Notices