This code snippet demonstrates how the DNN module is typically used with DNN tensors. Note that error handling is left out for clarity.
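Since the snippet leaves out error handling, real code would normally check the status returned by each call. The following is a minimal sketch of one such pattern, not SDK code: `Status`, `checkStatus`, and `fakeCall` are hypothetical stand-ins defined locally so that the example compiles without the DriveWorks headers.

```cpp
#include <stdexcept>
#include <string>

// Local stand-in for the SDK status type (assumption: not the real dwStatus).
enum class Status { Success, Failure };

// Hypothetical helper: throws if a call did not succeed, naming the failed call.
inline void checkStatus(Status status, const std::string& what) {
    if (status != Status::Success) {
        throw std::runtime_error("DNN call failed: " + what);
    }
}

// Hypothetical stand-in for an SDK call that reports its result through the return value.
inline Status fakeCall(bool succeed) {
    return succeed ? Status::Success : Status::Failure;
}
```

With the real SDK, the same idea would wrap each call and compare the returned dwStatus against DW_SUCCESS.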
Initialize network from file.
If the model was generated for DLA by passing the --useDLA option to the tensorrt_optimization tool, the processor type should be either DW_PROCESSOR_TYPE_DLA_0 or DW_PROCESSOR_TYPE_DLA_1, depending on which DLA engine the inference should run on. Otherwise, the processor type should always be DW_PROCESSOR_TYPE_GPU.
contextHandle is assumed to be a previously initialized dwContextHandle_t.
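The processor type rule above can be sketched as a small selection helper. This is an illustration only: `ProcessorType` is a local stand-in for the three DW_PROCESSOR_TYPE_* values named above, and `selectProcessorType` with its parameters is a hypothetical name, not part of the SDK.

```cpp
#include <cstdint>
#include <stdexcept>

// Local stand-in for the processor types named in the text (assumption: not the SDK enum).
enum class ProcessorType { GPU, DLA_0, DLA_1 };

// Hypothetical helper.
// builtWithUseDLA: whether the model was generated with the --useDLA option.
// dlaEngine: which DLA engine (0 or 1) should run the inference.
inline ProcessorType selectProcessorType(bool builtWithUseDLA, std::uint32_t dlaEngine) {
    if (!builtWithUseDLA) {
        // Models that were not generated for DLA always run on the GPU.
        return ProcessorType::GPU;
    }
    switch (dlaEngine) {
    case 0U:
        return ProcessorType::DLA_0;
    case 1U:
        return ProcessorType::DLA_1;
    default:
        throw std::invalid_argument("dlaEngine must be 0 or 1");
    }
}
```

Note that the engine argument is ignored for models that were not built with --useDLA, which mirrors the "always GPU" rule.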
Check that the loaded network has the expected number of inputs and outputs.
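uint32_t numInputs = 0;
uint32_t numOutputs = 0;
dwDNN_getInputBlobCount(&numInputs, dnn); // dnn is assumed to be the dwDNNHandle_t initialized above
dwDNN_getOutputBlobCount(&numOutputs, dnn);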
if (numInputs != 1) {
    std::cerr << "Expected a DNN with one input blob." << std::endl;
    return -1;
}
if (numOutputs != 2) {
    std::cerr << "Expected a DNN with two output blobs." << std::endl;
    return -1;
}
Ask the DNN about the order of the input and output blobs. The network is assumed to contain the input blob "data_in" and output blobs "data_out1" and "data_out2".
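uint32_t inputIndex = 0;
uint32_t output1Index = 0;
uint32_t output2Index = 0;
dwDNN_getInputIndex(&inputIndex, "data_in", dnn); // dnn is assumed to be the initialized dwDNNHandle_t
dwDNN_getOutputIndex(&output1Index, "data_out1", dnn);
dwDNN_getOutputIndex(&output2Index, "data_out2", dnn);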
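Looking blobs up by name, rather than assuming a fixed order, keeps the code independent of how the network was exported. The idea can be sketched without the SDK as follows; `findBlobIndex` is a hypothetical helper that searches a plain list of blob names and is not a DriveWorks function.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: writes the position of `name` within `blobs` to *index and returns true.
// Returns false and leaves *index untouched if no blob has that name.
inline bool findBlobIndex(std::uint32_t* index, const std::string& name,
                          const std::vector<std::string>& blobs) {
    for (std::size_t i = 0; i < blobs.size(); ++i) {
        if (blobs[i] == name) {
            *index = static_cast<std::uint32_t>(i);
            return true;
        }
    }
    return false;
}
```

A caller would treat a false return as a mismatch between the expected and the loaded network and stop before allocating any tensors.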
Initialize tensors.
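// dataConditioner and inputProps (the input tensor properties) are assumed to be declared beforehand.
dwDataConditioner_initializeFromTensorProperties(&dataConditioner, &inputProps, 1U,
                                                 &metadata.dataConditionerParams, cudaStream,
                                                 contextHandle);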
dwDNNTensorStreamerHandle_t streamer1;
dwDNNTensorStreamerHandle_t streamer2;
dwDNNTensorStreamer_initialize(&streamer1, &outputPropsHost1, outputPropsHost1.tensorType, m_sdk);
dwDNNTensorStreamer_initialize(&streamer2, &outputPropsHost2, outputPropsHost2.tensorType, m_sdk);
Convert DNN input from image to tensor, then perform DNN inference and stream results back. All operations are performed asynchronously with the host code.
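dwRect roi{0U, 0U, imageWidth, imageHeight};
// inputTensor and imageHandle are assumed to be the input tensor and a dwImageHandle_t created beforehand.
dwDataConditioner_prepareData(inputTensor, &imageHandle, 1, &roi,
                              cudaAddressModeClamp, dataConditioner);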
dwDNNTensorStreamer_producerSend(outputTensor1, streamer1);
dwDNNTensorStreamer_consumerReceive(&outputTensorHost1, streamer1);
dwDNNTensorStreamer_producerSend(outputTensor2, streamer2);
dwDNNTensorStreamer_consumerReceive(&outputTensorHost2, streamer2);
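void* data1 = nullptr;
void* data2 = nullptr;
// Lock the host tensors to obtain pointers to their data.
dwDNNTensor_lock(&data1, outputTensorHost1);
dwDNNTensor_lock(&data2, outputTensorHost2);
doit(data1, data2);
dwDNNTensor_unlock(outputTensorHost1);
dwDNNTensor_unlock(outputTensorHost2);
dwDNNTensorStreamer_consumerReturn(&outputTensorHost1, streamer1);
dwDNNTensorStreamer_producerReturn(nullptr, 1000, streamer1);
dwDNNTensorStreamer_consumerReturn(&outputTensorHost2, streamer2);
dwDNNTensorStreamer_producerReturn(nullptr, 1000, streamer2);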
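The order of the four streamer calls above (send, receive, consumer return, producer return) can be modeled as a small state machine. The sketch below is only a mental model: `MockStreamer` is a hypothetical class that mirrors the call order shown in the snippet and does not reproduce the behavior of the SDK streamer, such as timeouts or the actual data transfer.

```cpp
#include <stdexcept>

// Hypothetical model of one streamer round trip (assumption: not SDK code).
class MockStreamer {
public:
    enum class State { Idle, Sent, Received, Returned };

    void producerSend()    { expect(State::Idle);     state_ = State::Sent; }
    void consumerReceive() { expect(State::Sent);     state_ = State::Received; }
    void consumerReturn()  { expect(State::Received); state_ = State::Returned; }
    void producerReturn()  { expect(State::Returned); state_ = State::Idle; }

    State state() const { return state_; }

private:
    // Rejects any call that does not match the expected position in the round trip.
    void expect(State required) const {
        if (state_ != required) {
            throw std::logic_error("streamer call out of order");
        }
    }

    State state_ = State::Idle;
};
```

After a complete round trip the model is back in the Idle state, which corresponds to the point where the producer tensor can be reused for the next frame.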
Finally, free previously allocated memory.
dwDNNTensorStreamer_release(streamer1);
dwDNNTensorStreamer_release(streamer2);
For more information see: