Debug macro to check for CUDA errors.

In a non-release build, this macro will synchronize the specified stream before error checking. In both release and non-release builds, this macro checks for any pending CUDA errors from previous calls. If an error is reported, an exception is thrown detailing the CUDA error that occurred.

The intent of this macro is to provide a mechanism for synchronous and deterministic execution for debugging asynchronous CUDA execution. It should be used after any asynchronous CUDA call, e.g., cudaMemcpyAsync, or an asynchronous kernel launch.

© Copyright 2023, NVIDIA. Last updated on Aug 23, 2023.