Deviceless Ahead-of-time Compilation
Ahead-of-time Compilation
Since cuDNN 9.8, an execution plan can be created and finalized without a device by supplying a device property descriptor. This lets users pay the plan build time ahead of execution.
Typical workflow:
Create a device property descriptor from a device and serialize it. This requires the device.
Deserialize the device property descriptor, create an execution plan from it and the computation graph, then serialize the plan. This doesn’t require the device.
Deserialize the execution plan and execute it on a device with the same properties. This requires the device.
Refer to the corresponding C++ sample in samples/cpp/misc/deviceless_aot_compilation.cpp.
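The three stages above can be pictured with a small self-contained toy model. It is only a sketch: ToyDeviceProps, ToyPlan, build_plan, can_execute, and run_toy_workflow are hypothetical stand-ins invented for illustration, not part of the cuDNN frontend API, and the property fields and values are arbitrary. It shows only the shape of the flow: properties are captured once on a device, a plan is built from the serialized bytes with no device present, and execution checks that the device matches the properties the plan was built for.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append a 32-bit integer to a byte buffer in little-endian order.
inline void put_i32(Bytes& out, std::int32_t v) {
    const std::uint32_t u = static_cast<std::uint32_t>(v);
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<std::uint8_t>((u >> (8 * i)) & 0xFF));
}

// Read a little-endian 32-bit integer starting at `offset`.
inline std::int32_t get_i32(const Bytes& in, std::size_t offset) {
    std::uint32_t u = 0;
    for (int i = 0; i < 4; ++i) u |= static_cast<std::uint32_t>(in[offset + i]) << (8 * i);
    return static_cast<std::int32_t>(u);
}

// Hypothetical stand-in for a device property descriptor (NOT the cuDNN type).
struct ToyDeviceProps {
    std::int32_t sm_major;
    std::int32_t sm_minor;
    std::int32_t sm_count;

    bool operator==(const ToyDeviceProps& o) const {
        return sm_major == o.sm_major && sm_minor == o.sm_minor && sm_count == o.sm_count;
    }

    Bytes serialize() const {
        Bytes out;
        put_i32(out, sm_major);
        put_i32(out, sm_minor);
        put_i32(out, sm_count);
        return out;
    }

    static ToyDeviceProps deserialize(const Bytes& buf) {
        return ToyDeviceProps{get_i32(buf, 0), get_i32(buf, 4), get_i32(buf, 8)};
    }
};

// Hypothetical stand-in for an execution plan: remembers its target properties.
struct ToyPlan {
    ToyDeviceProps target;
    std::string graph_name;

    // Layout: 12 bytes of properties followed by the graph name.
    Bytes serialize() const {
        Bytes out = target.serialize();
        out.insert(out.end(), graph_name.begin(), graph_name.end());
        return out;
    }

    static ToyPlan deserialize(const Bytes& buf) {
        return ToyPlan{ToyDeviceProps::deserialize(buf), std::string(buf.begin() + 12, buf.end())};
    }
};

// Stage 2 helper: building a plan needs only the properties, not a device.
inline ToyPlan build_plan(const ToyDeviceProps& props, const std::string& graph_name) {
    return ToyPlan{props, graph_name};
}

// Stage 3 helper: a plan is usable only on a device with the same properties.
inline bool can_execute(const ToyPlan& plan, const ToyDeviceProps& device) {
    return plan.target == device;
}

inline bool run_toy_workflow() {
    // Stage 1 (device required): capture properties and serialize them.
    const ToyDeviceProps queried{9, 0, 132};  // arbitrary illustrative values
    const Bytes prop_bytes = queried.serialize();

    // Stage 2 (no device): restore properties, build and serialize the plan.
    const ToyDeviceProps restored = ToyDeviceProps::deserialize(prop_bytes);
    assert(restored == queried);
    const Bytes plan_bytes = build_plan(restored, "matmul").serialize();

    // Stage 3 (device required): restore the plan and run it on a matching device.
    const ToyPlan plan = ToyPlan::deserialize(plan_bytes);
    return can_execute(plan, queried);
}
```

In this model, run_toy_workflow() returns true because the plan is used with the same properties it was built for, while can_execute returns false for a device whose properties differ. For the real API calls, the sample referenced above remains the authoritative source.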
cuDNN Device Properties
A cuDNN device property descriptor describes the properties of a GPU device. It is serializable and can be used to query cuDNN heuristics or create an execution plan directly, without the device being available.
The API to create a device property descriptor is:
auto device_prop = std::make_shared<cudnn_frontend::DeviceProperties>();
Ways to initialize the device properties:
set_handle(cudnnHandle_t handle); // initialize from a cuDNN handle
set_device_id(int32_t device_id); // initialize from a specific device
deserialize(const std::vector<uint8_t>& serialized_buf); // deserialize from JSON
The API to set a device property descriptor is:
graph.set_device_properties(device_prop);