BoxFilter#
Overview#
The BoxFilter operator (also known as a box linear filter or box blur) is a low-pass filter that replaces the value of each pixel with the average value of the pixels in a surrounding neighborhood defined by a kernel. The current implementation supports 3x3, 5x5 and 7x7 kernels.
Box Filter Parameters: kernel size = 5×5
Algorithm Description#
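The box filter is implemented as a convolution of the input image with a kernel in which every weight is equal:
\[ K = \frac{1}{n^2} \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1 \end{bmatrix}_{n \times n} \]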
where \(n\) is the width and height of the kernel (3, 5, or 7).
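As an illustration of the averaging described above, here is a minimal floating-point sketch in plain Python. It is not the PVA implementation: the function names `box_kernel` and `box_filter` are hypothetical, the image is a small list of lists, and edge pixels are replicated for simplicity.

```python
def box_kernel(n):
    """Return an n x n kernel whose weights are all 1 / n^2."""
    if n not in (3, 5, 7):
        raise ValueError("kernel size must be 3, 5, or 7")
    weight = 1.0 / (n * n)
    return [[weight] * n for _ in range(n)]


def box_filter(image, n):
    """Convolve a 2D list with the box kernel, replicating edge pixels."""
    height, width = len(image), len(image[0])
    kernel = box_kernel(n)
    radius = n // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky in range(n):
                for kx in range(n):
                    # Clamp coordinates so taps outside the image reuse the nearest edge pixel.
                    sy = min(max(y + ky - radius, 0), height - 1)
                    sx = min(max(x + kx - radius, 0), width - 1)
                    acc += kernel[ky][kx] * image[sy][sx]
            output[y][x] = acc
    return output


# A single bright pixel is spread evenly over its 3x3 neighborhood.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 90, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
blurred = box_filter(image, 3)
```

Because the kernel is symmetric, convolution and correlation give the same result here, so the sketch does not flip the kernel.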
Implementation Details#
Parameters#
KernelSize (int) : the size of the kernel used for filtering. The current implementation only supports 3x3, 5x5 and 7x7 kernels.
BorderMode (NVCVBorderType) : the border mode used for padding the source image; either constant or replicate.
BorderValue (int) : the value used for constant border padding.
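To make the two border modes concrete, the sketch below pads a tiny image by the kernel radius. The helper `pad_image` and the string labels `"constant"` and `"replicate"` are hypothetical stand-ins for illustration only; they are not the `NVCVBorderType` enumerators or any PVA API.

```python
def pad_image(image, radius, border_mode, border_value=0):
    """Pad a 2D list by `radius` pixels on every side.

    border_mode is a hypothetical string label:
      "constant"  -> fill with border_value
      "replicate" -> repeat the nearest edge pixel
    """
    height, width = len(image), len(image[0])
    padded = []
    for y in range(-radius, height + radius):
        row = []
        for x in range(-radius, width + radius):
            if 0 <= y < height and 0 <= x < width:
                row.append(image[y][x])
            elif border_mode == "constant":
                row.append(border_value)
            elif border_mode == "replicate":
                # Clamp the coordinate to the nearest valid pixel.
                cy = min(max(y, 0), height - 1)
                cx = min(max(x, 0), width - 1)
                row.append(image[cy][cx])
            else:
                raise ValueError("unsupported border mode: " + str(border_mode))
        padded.append(row)
    return padded


source = [[1, 2], [3, 4]]
constant_padded = pad_image(source, 1, "constant", border_value=0)
replicate_padded = pad_image(source, 1, "replicate")
```

A 3x3 kernel needs a radius of 1, a 5x5 kernel a radius of 2, and a 7x7 kernel a radius of 3.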
Kernel Quantization#
The box filter kernel is originally generated in floating point format. The kernel coefficients are then quantized to integers using the following formula:
\[ w_{fixed} = \left\lfloor w_{float} \cdot 2^{qbits} \right\rfloor \]
where \(w_{fixed}\) is the quantized kernel coefficient, \(w_{float}\) is the original floating point kernel coefficient, and \(qbits\) is the number of bits used for quantization. \(qbits\) is set to 8 for uint8/int8 types and 16 for uint16/int16 types.
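The quantization step can be sketched as follows, assuming a floor-based conversion in which every floating point coefficient equals \(1/n^2\). The function name `quantize_box_kernel` is hypothetical and the kernel is stored as a flat list of \(n^2\) integers.

```python
import math


def quantize_box_kernel(n, qbits):
    """Quantize each 1 / n^2 coefficient to an integer with `qbits` fractional bits."""
    w_float = 1.0 / (n * n)
    # Floor keeps every coefficient at or below its exact fixed-point value.
    w_fixed = math.floor(w_float * (1 << qbits))
    return [w_fixed] * (n * n)


kernel_3x3_q8 = quantize_box_kernel(3, 8)
kernel_7x7_q16 = quantize_box_kernel(7, 16)
```

For a 3x3 kernel with 8 quantization bits, each coefficient becomes 28, so the coefficients sum to 252 rather than 256.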
To ensure that the sum of the quantized kernel coefficients equals \(2^{qbits}\), quantization uses the floor operation and the resulting difference is distributed across kernel elements. The adjustment process can be described as follows:
Let \(\Delta = 2^{qbits} - \sum_{i=0}^{n^2-1} w_{fixed}\) be the difference between the target sum and the current sum. Since the floor operation never increases a value and reduces each coefficient by less than 1, we have \(0 \leq \Delta < n^2\).
The adjustment is distributed across kernel elements using a step-based algorithm that increments elements at indices \(0, \text{step}, 2\text{step}, \ldots\) until exactly \(\Delta\) adjustments are made, where:
The algorithm ensures exactly \(\Delta\) elements are incremented by using a counter that stops when the target number of adjustments is reached, distributing the adjustments as evenly as possible across the kernel.
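The adjustment can be sketched as below. The step formula is not reproduced in this section, so the sketch assumes \(\text{step} = \max(1, \lfloor n^2 / \Delta \rfloor)\); that choice is an assumption made for illustration and may differ from the actual implementation. The function name `quantize_and_adjust` is also hypothetical.

```python
import math


def quantize_and_adjust(n, qbits):
    """Floor-quantize a box kernel, then add 1 to delta elements so the sum is 2^qbits."""
    size = n * n
    target = 1 << qbits
    w_fixed = [math.floor((1.0 / size) * target)] * size
    delta = target - sum(w_fixed)
    if delta > 0:
        # ASSUMPTION: the step formula is not given in the text; this is one plausible choice.
        step = max(1, size // delta)
        adjusted = 0
        index = 0
        # The counter stops the loop after exactly delta increments.
        while adjusted < delta and index < size:
            w_fixed[index] += 1
            adjusted += 1
            index += step
    return w_fixed, delta


kernel, delta = quantize_and_adjust(3, 8)
```

With this assumed step, the 3x3 kernel at 8 bits has \(\Delta = 4\) and a step of 2, so the elements at indices 0, 2, 4 and 6 are incremented from 28 to 29.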
Dataflow Configuration#
Refer to GaussianFilter operator documentation for more details. BoxFilter operator uses the same dataflow configuration as GaussianFilter operator.
Buffer Allocation#
Refer to GaussianFilter operator documentation for more details. BoxFilter operator uses the same buffer allocation as GaussianFilter operator.
Kernel Implementation#
Refer to GaussianFilter operator documentation for more details. BoxFilter operator uses the same kernel implementation as GaussianFilter operator.
Performance#
Execution Time is the average time required to execute the operator on a single VPU core.
Note that each PVA contains two VPU cores, which can operate in parallel to process two streams simultaneously, or reduce execution time by approximately half by splitting the workload between the two cores.
Total Power represents the average total power consumed by the module when the operator is executed concurrently on both VPU cores.
Idle power is approximately 7W when the PVA is not processing data.
For detailed information on interpreting the performance table below and understanding the benchmarking setup, see Performance Benchmark.
| ImageSize | DataType | KernelSize | Execution Time | Submit Latency | Total Power |
|---|---|---|---|---|---|
| 1920x1080 | U8 | 3x3 | 0.158ms | 0.023ms | 16.252W |
| 1920x1080 | U8 | 5x5 | 0.161ms | 0.023ms | 17.295W |
| 1920x1080 | U8 | 7x7 | 0.166ms | 0.024ms | 17.777W |
| 1920x1080 | S8 | 3x3 | 0.158ms | 0.024ms | 16.632W |
| 1920x1080 | S8 | 5x5 | 0.161ms | 0.025ms | 17.295W |
| 1920x1080 | S8 | 7x7 | 0.166ms | 0.025ms | 18.056W |
| 1920x1080 | U16 | 3x3 | 0.301ms | 0.023ms | 16.913W |
| 1920x1080 | U16 | 5x5 | 0.356ms | 0.023ms | 17.858W |
| 1920x1080 | U16 | 7x7 | 0.457ms | 0.023ms | 16.592W |
| 1920x1080 | S16 | 3x3 | 0.301ms | 0.023ms | 17.295W |
| 1920x1080 | S16 | 5x5 | 0.356ms | 0.025ms | 17.856W |
| 1920x1080 | S16 | 7x7 | 0.458ms | 0.024ms | 16.972W |
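As a rough way to read the execution times, the sketch below converts a per-core execution time into an upper bound on frames per second. It is a back-of-the-envelope estimate only: the helper `frames_per_second` is hypothetical, it ignores submit latency and any other overhead, and the two-core figure assumes ideal scaling when each VPU core processes its own stream.

```python
def frames_per_second(execution_time_ms, cores=1):
    """Estimate an upper bound on throughput from a per-core execution time."""
    if execution_time_ms <= 0:
        raise ValueError("execution time must be positive")
    # One core finishes 1000 / execution_time_ms frames per second;
    # independent streams on additional cores are assumed to scale linearly.
    return cores * 1000.0 / execution_time_ms


# 1920x1080 U8 image with a 3x3 kernel, using the 0.158 ms value from the table.
single_core_fps = frames_per_second(0.158)
dual_core_fps = frames_per_second(0.158, cores=2)
```

Real pipelines will see lower rates once data movement and scheduling are included.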
Compatibility#
Requires PVA SDK 2.6.0 or later.