Transport layer security (TLS) is a cryptographic protocol designed to provide communications security over a computer network. The protocol is widely used in applications such as email, instant messaging, and voice over IP (VoIP), but its use in securing HTTPS remains the most publicly visible.

The TLS protocol aims primarily to provide security, including privacy (confidentiality), integrity, and authenticity, through the use of cryptography (such as certificates) between two or more communicating computer applications. It runs in the application layer and is itself composed of two layers: the TLS record and the TLS handshake protocols.

TLS works over TCP and consists of 3 phases:

Handshake – establishment of a connection

Application – sending and receiving encrypted packets

Termination – connection termination

In the handshake phase, the client and server decide on which cipher suites they will use, and exchange keys and certificates according to the following flow:

Client hello: provides the server, at a minimum, with the following:

A key exchange algorithm, to determine how symmetric keys are exchanged

An authentication or digital signature algorithm, which dictates how server authentication and client authentication (if required) are implemented

A bulk encryption cipher, which is used to encrypt the data

A hash/MAC (message authentication code) function, which determines how data integrity checks are carried out

The version of the protocol it understands

The cipher suites it is capable of working with

A unique random number, which is important to guard against replay attacks

Server hello:

Selects a cipher suite

Generates its own random number

Assigns a session ID to the TLS connection

Sends enough information to complete a key exchange; most often, this means sending a certificate including an RSA public key

Client:

Responsible for completing the key exchange using the information the server provided

At this point, the connection is secured: both sides have agreed on an encryption algorithm, a MAC algorithm, and their respective keys.
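The negotiation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `split_suite` and `select_suite` are hypothetical helpers, the parser handles TLS 1.2-style suite names of the form `TLS_<KX>[_<AUTH>]_WITH_<CIPHER>_<HASH>` only, and the "first client preference wins" policy is one possible choice, not something the protocol mandates.

```python
from typing import NamedTuple, Optional, Sequence


class SuiteParts(NamedTuple):
    key_exchange: str
    authentication: str
    bulk_cipher: str
    hash_or_mac: str


def split_suite(name: str) -> SuiteParts:
    """Split a TLS 1.2-style cipher suite name into its four components."""
    left, right = name.split("_WITH_")
    kx_auth = left.split("_")[1:]  # drop the leading "TLS"
    key_exchange = kx_auth[0]
    # Names such as TLS_RSA_WITH_... use RSA for both roles.
    authentication = kx_auth[1] if len(kx_auth) > 1 else kx_auth[0]
    bulk_cipher, _, hash_or_mac = right.rpartition("_")
    return SuiteParts(key_exchange, authentication, bulk_cipher, hash_or_mac)


def select_suite(client_offer: Sequence[str],
                 server_supported: Sequence[str]) -> Optional[str]:
    """Return the first offered suite the server supports (assumed policy)."""
    for suite in client_offer:
        if suite in server_supported:
            return suite
    return None


# Client hello: the suites the client can work with, in preference order.
offer = [
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
]
# Server hello: the server selects one suite from the offer.
chosen = select_suite(offer, [
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
])
parts = split_suite(chosen)
print(chosen, parts)
```

The random numbers, session ID, and certificate exchange are left out on purpose; the sketch only shows how one suite name bundles the four algorithm choices.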

The Linux kernel provides TLS offload infrastructure. kTLS (kernel TLS) offloads TLS handling from user space to kernel space.

kTLS has 3 modes of operation:

SW – all operations are handled in the kernel (i.e., handshake, encryption, decryption)

HW-offload (the focus of this guide) – handshake and error handling are performed in software. Packets are encrypted/decrypted in hardware. In this case, there is an additional offload from the kernel to the hardware.

HW-record – all operations are handled by the hardware (driver and firmware) including the handshake. It also handles its own TCP session. This option is currently not supported.

Note: Rx (receiving) and Tx (sending) can each use a different mode. For example, Rx can be handled in SW mode while Tx uses HW-offload mode (i.e., the hardware only encrypts and does not decrypt).
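One way to observe that Rx and Tx are tracked separately is the kernel's TLS statistics file. The sketch below parses text in the format of `/proc/net/tls_stat` and groups the current-session counters by direction. The counter names are taken from the Linux kernel TLS documentation as best understood here, the sample values are made up for illustration, and `parse_tls_stat` and `current_modes` are hypothetical helper names.

```python
from pathlib import Path
from typing import Dict

TLS_STAT_PATH = Path("/proc/net/tls_stat")  # present only on kTLS-capable kernels


def parse_tls_stat(text: str) -> Dict[str, int]:
    """Parse 'Name <whitespace> value' lines into a counter dictionary."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def current_modes(counters: Dict[str, int]) -> Dict[str, Dict[str, int]]:
    """Group currently open sessions by direction and by SW vs. device offload."""
    return {
        "tx": {"sw": counters.get("TlsCurrTxSw", 0),
               "hw_offload": counters.get("TlsCurrTxDevice", 0)},
        "rx": {"sw": counters.get("TlsCurrRxSw", 0),
               "hw_offload": counters.get("TlsCurrRxDevice", 0)},
    }


# Made-up sample: three sessions that encrypt in hardware but decrypt in software.
SAMPLE = """\
TlsCurrTxSw        0
TlsCurrRxSw        3
TlsCurrTxDevice    3
TlsCurrRxDevice    0
"""

text = TLS_STAT_PATH.read_text() if TLS_STAT_PATH.exists() else SAMPLE
print(current_modes(parse_tls_stat(text)))
```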

In general, TLS HW-offload performs best and provides the most value on longer-lived sessions with relatively large packets. Scaling in terms of concurrent connections and connections per second is use-case dependent (e.g., the share of open concurrent connections that are actually active is material).

It is necessary to learn the following terms before proceeding:

The transport interface send (TIS) object is responsible for performing all transport-related operations on the transmit side. Messages from send queues (SQs) are segmented and transmitted by the TIS, including all required transport handling. For example, in the case of a large send offload, the TIS is responsible for the segmentation. The NVIDIA® ConnectX® hardware uses a TIS object to save and access the TLS crypto information and state of an offloaded Tx kTLS connection.

The transport interface receive (TIR) object is responsible for performing all transport-related operations on the receive side. TIR performs the packet processing and reassembly and is also responsible for demultiplexing the packets into different receive queues (RQs).

Both TIS and TIR hold the data encryption key (DEK).

Note: The following flow does not cover resync or error handling.

The kernel on the current host establishes a TLS connection with the remote host (server or client) by handling the TLS handshake.

It then initializes the following state for each connection, for both Rx and Tx:

Crypto secrets (e.g., key, IV, salt)

Crypto processing state

Record metadata (e.g., record sequence number, offset)

Expected TCP sequence number
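As a rough illustration of how the crypto secrets and record sequence number listed above are handed to the kernel, the sketch below packs them in the layout of `struct tls12_crypto_info_aes_gcm_128` from `linux/tls.h`. The numeric constants are copied from the Linux UAPI headers as understood here and should be checked against the headers on the target system. `pack_aes_gcm_128` and `enable_ktls_tx` are hypothetical helper names, and the snippet does not select between SW and HW-offload mode.

```python
import socket
import struct

# Values assumed from the Linux UAPI headers (linux/tcp.h, linux/tls.h).
TCP_ULP = 31
SOL_TLS = 282
TLS_TX = 1
TLS_RX = 2
TLS_1_2_VERSION = 0x0303
TLS_CIPHER_AES_GCM_128 = 51


def pack_aes_gcm_128(key: bytes, iv: bytes, salt: bytes, rec_seq: int) -> bytes:
    """Pack version, cipher type, IV, key, salt and record sequence number."""
    if (len(key), len(iv), len(salt)) != (16, 8, 4):
        raise ValueError("expected a 16-byte key, 8-byte IV and 4-byte salt")
    return struct.pack(
        "=HH8s16s4s8s",
        TLS_1_2_VERSION,
        TLS_CIPHER_AES_GCM_128,
        iv,
        key,
        salt,
        rec_seq.to_bytes(8, "big"),  # record sequence number in network order
    )


def enable_ktls_tx(sock: socket.socket, crypto_info: bytes) -> None:
    """Attach the TLS upper-layer protocol and install the Tx state.

    Requires a connected TCP socket on a kernel with kTLS support.
    """
    sock.setsockopt(socket.IPPROTO_TCP, TCP_ULP, b"tls")
    sock.setsockopt(SOL_TLS, TLS_TX, crypto_info)


# Placeholder all-zero secrets, for layout illustration only.
blob = pack_aes_gcm_128(key=bytes(16), iv=bytes(8), salt=bytes(4), rec_seq=1)
print(len(blob), "bytes of Tx crypto state")
```

Installing the Rx state would follow the same pattern with `TLS_RX` and the receive-side secrets.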

Tx flow:

Packets belonging to device-offloaded sockets arrive at the kernel, which does not encrypt them.

The kernel performs record framing and marks each packet with a connection identifier.

The kernel sends the packets to the device driver for offloading.

The device checks that the sequence number matches the state in the TIS, then performs encryption and authentication.
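The record framing step of the Tx flow can be pictured with a simplified sketch. A TLS record starts with a 5-byte header (content type, protocol version, length), and a record carries at most 2^14 bytes of plaintext. The hypothetical `frame_records` helper below frames plaintext only; it leaves out the per-record nonce and authentication tag that an encrypted record carries, so it is not what the kernel emits byte for byte.

```python
import struct
from typing import Iterator

CONTENT_TYPE_APPLICATION_DATA = 23
TLS_1_2_WIRE_VERSION = 0x0303
MAX_PLAINTEXT = 2 ** 14  # maximum plaintext bytes per TLS record


def frame_records(payload: bytes, max_fragment: int = MAX_PLAINTEXT) -> Iterator[bytes]:
    """Split a payload into fragments and prepend a TLS record header to each."""
    for offset in range(0, len(payload), max_fragment):
        fragment = payload[offset:offset + max_fragment]
        header = struct.pack(
            "!BHH",  # type (1 byte), version (2 bytes), length (2 bytes), big-endian
            CONTENT_TYPE_APPLICATION_DATA,
            TLS_1_2_WIRE_VERSION,
            len(fragment),
        )
        yield header + fragment


records = list(frame_records(b"a" * 20000))
for record in records:
    print(record[:5].hex(), len(record) - 5, "payload bytes")
```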

Rx flow:

When the connection is created, a HW steering rule is added to steer packets to their respective TIR.

The device receives the packet, then validates it and checks that its TCP sequence number matches the state in the TIR.

The device performs decryption and authentication, and indicates the result in the CQE (completion queue entry).

The kernel sees that the packet is already decrypted, so it does not decrypt the packet itself and passes it on to user space.

When the sequence number does not match expectations, or if any other error occurs, the hardware hands control back to software, which handles the problem.
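The sequence-number check and the fallback to software can be modeled with a toy example. `OffloadContext` below is a hypothetical stand-in for the per-direction state held in a TIS or TIR. It only tracks the expected TCP sequence number, and it does not model resync, so it should be read as an illustration of the decision, not as driver or firmware behavior.

```python
from dataclasses import dataclass


@dataclass
class OffloadContext:
    """Toy per-direction offload state: only the expected TCP sequence number."""
    expected_tcp_seq: int

    def process(self, tcp_seq: int, length: int) -> str:
        """Return which side handles the packet's crypto in this toy model."""
        if tcp_seq != self.expected_tcp_seq:
            # Mismatch: the state is left untouched and software takes over.
            return "software"
        # In-order packet: advance the expected sequence number (32-bit wrap).
        self.expected_tcp_seq = (self.expected_tcp_seq + length) % 2 ** 32
        return "hardware"


ctx = OffloadContext(expected_tcp_seq=1000)
results = [
    ctx.process(1000, 500),  # in order
    ctx.process(2000, 100),  # unexpected sequence number
    ctx.process(1500, 100),  # the packet that was actually expected next
]
print(results)
```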

See more about kTLS modes, resync, and error handling in the Linux Kernel documentation.