cuPQC-Hash Usage#
The first step in using the cuPQC-Hash library is defining the hash operation you wish to perform. We specify operations by combining cuPQC-Hash operators to form a cuPQC-Hash descriptor. A well-defined cuPQC-Hash descriptor must include the algorithm, security category, and either thread or warp. Note that for the Poseidon2-BabyBear and Merkle Tree algorithms, the security category is not required.
As an example, we construct a SHA3-256 hashing object using the warp configuration, meaning that each hash is computed using a full warp.
// Include the cuPQC-Hash library header
#include <hash.hpp>
using namespace cupqc;
// cuPQC-Hash descriptor
// Define SHA3-256 with warp-based execution
using SHA3_256_WARP = decltype(SHA3_256() + Warp());
Executing Operations#
These cuPQC-Hash descriptor types are instantiated into objects inside user kernels.
The cuPQC-Hash descriptor object provides four relevant methods: reset(), update(...), finalize(), and digest(...).
The reset() method resets the hash object to its initial state. The update(...) method takes a pointer to a buffer of input data and its length, and hashes the data.
The finalize() method finalizes the hash computation. In the example below it is called without arguments, after the last update(...) and before digest(...).
The digest(...) method takes a pointer to an output buffer and its length, and writes the hash of the data into that buffer.
For more information on these functions, see the device functions API reference.
To demonstrate cuPQC-Hash execution, we continue with the previous example and write a kernel for a batched single-hash operation, in which each thread block computes a hash using one warp:
//cuPQC-Hash header
#include <hash.hpp>
//We use the cuPQC SDK namespace.
using namespace cupqc;
//cuPQC-Hash descriptor
using SHA3_256_WARP = decltype(SHA3_256() + Warp());
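__global__ void hash_sha3_kernel(uint8_t* digest, const uint8_t* msg, size_t inbuf_len)
{
    SHA3_256_WARP hash {};
    hash.reset();                                     // start from the initial state
    hash.update(msg, inbuf_len);                      // hash the input message
    hash.finalize();                                  // finalize the hash object
    hash.digest(digest, SHA3_256_WARP::digest_size);  // write the digest to the output buffer
}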
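The kernel above reads a single message, so every block in a larger grid would hash the same input. One possible way to process a batch is to let each thread block pick its own message and digest slot using blockIdx.x. The following is only a sketch under assumptions that do not come from the original example: all messages have the same length msg_len, they are packed back to back in msgs, and digests has room for one digest per block. The kernel name hash_sha3_batched_kernel and its parameter names are hypothetical; the hash object is used only through the calls shown in the example above.

```cuda
#include <cstddef>
#include <cstdint>
#include <hash.hpp>

using namespace cupqc;

// Same descriptor as in the example above.
using SHA3_256_WARP = decltype(SHA3_256() + Warp());

// Hypothetical batched kernel: block b hashes message b.
// Assumes equal-length messages stored contiguously in `msgs`.
__global__ void hash_sha3_batched_kernel(uint8_t* digests, const uint8_t* msgs, size_t msg_len)
{
    const size_t block = static_cast<size_t>(blockIdx.x);

    // Locate this block's input message and output digest slot.
    const uint8_t* msg    = msgs + block * msg_len;
    uint8_t*       digest = digests + block * SHA3_256_WARP::digest_size;

    SHA3_256_WARP hash {};
    hash.reset();
    hash.update(msg, msg_len);
    hash.finalize();
    hash.digest(digest, SHA3_256_WARP::digest_size);
}
```

Under these assumptions, a launch of the form hash_sha3_batched_kernel<<<batch_size, 32>>>(d_digests, d_msgs, msg_len) would use one warp-sized block per message, where batch_size, d_digests, and d_msgs are placeholder names for the number of messages and the device buffers.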
Launching Kernels#
Launching a kernel with the cuPQC-Hash library is straightforward. This example takes a message buffer and outputs a digest buffer. Since we’re using warp-based execution, we launch with 32 threads per block (the warp size).
void hash_sha3(std::vector<uint8_t>& digest, std::vector<uint8_t>& msg)
{
    // Allocate device buffers for the message and the digest.
    uint8_t* d_msg;
    uint8_t* d_digest;
    cudaMalloc(reinterpret_cast<void**>(&d_msg), msg.size());
    cudaMalloc(reinterpret_cast<void**>(&d_digest), digest.size());
    // Copy the message to the device.
    cudaMemcpy(d_msg, msg.data(), msg.size(), cudaMemcpyHostToDevice);
    // One block of 32 threads: a single warp computes the hash.
    hash_sha3_kernel<<<1, 32>>>(d_digest, d_msg, msg.size());
    // Copy the digest back to the host and release the device buffers.
    cudaMemcpy(digest.data(), d_digest, digest.size(), cudaMemcpyDeviceToHost);
    cudaFree(d_msg);
    cudaFree(d_digest);
}
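To inspect the result on the host, the digest bytes can be rendered as a hexadecimal string. The helper below is a minimal sketch in plain host-side C++; to_hex is a hypothetical name and is not part of the cuPQC SDK.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of cuPQC): render bytes as lowercase hex.
std::string to_hex(const std::vector<uint8_t>& bytes)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(bytes.size() * 2);
    for (uint8_t b : bytes) {
        out.push_back(digits[b >> 4]);    // high nibble
        out.push_back(digits[b & 0x0F]);  // low nibble
    }
    return out;
}
```

A SHA3-256 digest is 32 bytes, so after hash_sha3(digest, msg) has filled a 32-byte digest vector, to_hex(digest) yields a 64-character string that can be compared against a reference implementation. For the three-byte ASCII message "abc", the published SHA3-256 test vector is 3a985da74fe225b2045c172d6bd390bd855f086e3e9d525b46bfe24511431532.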
Compilation#
To compile a program that includes hash.hpp and uses the cuPQC-Hash API, users need to pass the location of the cuPQC SDK headers and link with the provided static library. Details on how to use cuPQC-Hash in your project can be found in the Getting Started guide. The cuPQC-Hash library is built with link-time optimization (LTO), so the appropriate flags must be used when compiling.
nvcc -std=c++17 -dlto -arch=sm_80 \
-I<path_to_cupqc_sdk_include> -I<path_to_commondx> \
-L<path_to_static_lib> -lcupqc-hash \
cupqc_hash_intro_example.cu -o cupqc_hash_intro_example
For detailed installation and compilation instructions, see the Getting Started guide. For information on supported features, see SDK Features.