VPI - Vision Programming Interface

3.1 Release

Image Histogram


An image histogram is the representation of the tonal distribution in a digital image. It counts the number of pixels for each tonal value. For more information, see [1].

Example parameters (input and output images not shown):
start = 0
end = 256
numBins = 256


This algorithm partitions the distribution into a number of bins and counts how many pixel values fall into each bin.

A pixel with intensity 'i' increments histogram bin \(b_i\), where

\[ b_i = \frac{(i - \text{start}) * \text{numBins}}{\text{end}-\text{start}} \]
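As an illustration of the formula above, the following minimal Python sketch computes the bin index for one pixel and accumulates a histogram over a flat list of pixel values. The names `bin_index` and `histogram_even` are hypothetical helpers, not part of VPI. The sketch assumes that the result of the formula is truncated to an integer and that values outside [start, end) are skipped; consult the reference documentation for the actual out-of-range behavior.

```python
def bin_index(pixel, start, end, num_bins):
    """Bin index for one pixel value, truncated to an integer (assumption)."""
    return int((pixel - start) * num_bins / (end - start))


def histogram_even(pixels, start, end, num_bins):
    """Count pixels per evenly sized bin.

    Assumption: values outside [start, end) are skipped.
    """
    counts = [0] * num_bins
    for p in pixels:
        if start <= p < end:
            counts[bin_index(p, start, end, num_bins)] += 1
    return counts


if __name__ == "__main__":
    pixels = [0, 0, 10, 128, 255, 255, 255]
    hist = histogram_even(pixels, start=0, end=256, num_bins=256)
    print(hist[0], hist[10], hist[128], hist[255])  # 2 1 1 3
```

With start = 0, end = 256 and numBins = 256, each 8-bit intensity maps to its own bin, so the histogram is a plain per-value count.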

The histogram results from the CPU and CUDA backends do not match in some corner cases because these backends perform floating-point calculations differently. Floating-point calculations on CPU and CUDA are guaranteed to be the same only to some level of precision. These minor differences may cause bin indexes to be calculated differently. For example, with start = 0, end = 245, and numBins = 50, a pixel value of 147 yields bin index 30 on the CPU backend, while the CUDA backend computes 29.999998, which falls into bin 29.
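The effect can be reproduced without VPI. The sketch below is only an illustration of how operation order and precision can move a value across a bin boundary; it does not claim to mirror what either backend actually does. It assumes one evaluation in double precision (multiply, then divide) and one in single precision that multiplies by a rounded reciprocal of (end - start). The helper names are hypothetical, and single precision is emulated with the standard `struct` module.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def bin_double(pixel, start, end, num_bins):
    """Double-precision evaluation: multiply, then divide."""
    return (pixel - start) * num_bins / (end - start)


def bin_single_reciprocal(pixel, start, end, num_bins):
    """Single-precision evaluation that multiplies by a rounded reciprocal."""
    reciprocal = f32(1.0 / f32(end - start))
    numerator = f32(f32(pixel - start) * f32(num_bins))
    return f32(numerator * reciprocal)


if __name__ == "__main__":
    a = bin_double(147, 0, 245, 50)
    b = bin_single_reciprocal(147, 0, 245, 50)
    print(f"{a:.6f} -> bin {int(a)}")  # 30.000000 -> bin 30
    print(f"{b:.6f} -> bin {int(b)}")  # 29.999998 -> bin 29
```

Only pixel values that land exactly on a bin boundary are sensitive to this kind of rounding, which is why the mismatch shows up in corner cases only.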

C API functions

For the list of limitations, constraints, and backends that implement the algorithm, consult the reference documentation of the following functions:

Function Description
vpiCreateHistogramEven Creates payload for Image Histogram Even algorithm.
vpiSubmitHistogram Computes the image histogram.


  1. Import VPI module
    import vpi
  2. Use the CUDA backend to calculate the histogram of the input VPI image. The output is a VPI array with numBins elements.
    with vpi.Backend.CUDA:
        output = input.histogram(numBins, range=(start, end))
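When reading the output array, it can help to map a bin index back to the intensity interval it covers. The following is a minimal sketch derived from the bin formula, assuming evenly sized, half-open bins; `bin_range` is a hypothetical helper, not a VPI function.

```python
def bin_range(index, start, end, num_bins):
    """Half-open intensity interval [low, high) covered by one bin."""
    low = start + index * (end - start) / num_bins
    high = start + (index + 1) * (end - start) / num_bins
    return low, high


if __name__ == "__main__":
    # Bin 30 of a 50-bin histogram over [0, 245)
    print(bin_range(30, 0, 245, 50))  # (147.0, 151.9)
```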
  1. Initialization phase
    1. Include the header that defines the image histogram function.
      #include <vpi/algo/Histogram.h>
    2. Define the input image.
      VPIImage input = /*...*/;
    3. Create the output array.
      VPIArray output;
      vpiArrayCreate(numBins, VPI_ARRAY_TYPE_U32, 0, &output);
    4. Since this algorithm needs temporary memory buffers, create the payload for it on the CUDA backend.
      VPIPayload payload;
      vpiCreateHistogramEven(VPI_BACKEND_CUDA, VPI_IMAGE_FORMAT_U8, start, end, numBins, &payload);
    5. Create the stream where the algorithm is to be submitted for execution.
      VPIStream stream;
      vpiStreamCreate(0, &stream);
  2. Processing phase
    1. Submit the algorithm to the stream using the CUDA backend, along with all parameters.
      vpiSubmitHistogram(stream, VPI_BACKEND_CUDA, payload, input, output, 0);
    2. Optionally, wait until the processing is done.
      vpiStreamSync(stream);
  3. Cleanup phase
    1. Free resources held by the stream, the payload, the input image, and the output array.
      vpiStreamDestroy(stream);
      vpiPayloadDestroy(payload);
      vpiImageDestroy(input);
      vpiArrayDestroy(output);

For more information, see Image Histogram in the "C API Reference" section of VPI - Vision Programming Interface.


For information on how to use the performance table below, see Algorithm Performance Tables.
Before comparing measurements, consult Comparing Algorithm Elapsed Times.
For further information on how performance was benchmarked, see Performance Benchmark.



  1. https://en.wikipedia.org/wiki/Image_histogram