Transformer Engine

1.13.0

C/C++ API

The C/C++ API lets you call the custom kernels defined in the libtransformer_engine.so library directly from C/C++, without going through Python.
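As a quick sanity check before writing code against the API, it can help to confirm that the shared library and a given entry point are actually visible to the dynamic loader. The following is a minimal sketch, not part of Transformer Engine: `probe_symbol` is a hypothetical helper name, and it relies only on the POSIX `dlopen`/`dlsym` interface. In normal use you would presumably include the headers listed below and link against the library instead; this sketch only probes for availability.

```c
#include <assert.h>  /* assert */
#include <dlfcn.h>   /* dlopen, dlsym, dlclose (POSIX) */
#include <stddef.h>  /* NULL */

/* Hypothetical helper (not part of Transformer Engine).
 * Returns  1 if `symbol` resolves inside the shared library `lib_name`,
 *          0 if the library loads but does not export `symbol`,
 *         -1 if the library itself cannot be loaded. */
int probe_symbol(const char *lib_name, const char *symbol) {
    assert(lib_name != NULL && symbol != NULL);

    /* RTLD_LAZY defers function binding; RTLD_LOCAL keeps the library's
     * symbols out of the global namespace of this process. */
    void *handle = dlopen(lib_name, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL) {
        return -1;
    }

    /* For a function symbol, a NULL result means it was not found. */
    void *addr = dlsym(handle, symbol);
    int found = (addr != NULL) ? 1 : 0;

    dlclose(handle);
    return found;
}
```

For example, `probe_symbol("libtransformer_engine.so", "nvte_fp8_quantize")` would be expected to return 1 on a machine where the library is installed and on the loader's search path, and -1 where it is not. On glibc versions older than 2.34 the program needs to be linked with `-ldl`.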

Headers

activation.h
cast.h
- nvte_fp8_quantize()
- nvte_fp8_dequantize()
gemm.h
fused_attn.h
layer_norm.h
rmsnorm.h
softmax.h
transformer_engine.h
transpose.h

© Copyright 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.