Reflections on Non Maximum Suppression (NMS)

Before NMS and after NMS
SSD and YOLO object detection networks (from 12).
select only rectangles above a confidence threshold
sort the thresholded rectangles in descending order of score
create an empty set of kept rectangles
loop over the sorted thresholded rectangles:
    loop over the set of kept rectangles:
        compute IOU between the rectangles
        if IOU is above the IOU threshold, break the loop
    if all IOUs are below the IOU threshold, add the rectangle to kept
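The threshold, sort and keep steps above can be sketched in plain Python. This is a minimal illustration, not the OpenCV code: the names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box layout and the parameter names are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_threshold, iou_threshold):
    """Return the indices of the kept boxes, highest score first."""
    # Keep only candidates above the confidence threshold, in descending score order.
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in order:
        # all() stops at the first kept box whose overlap exceeds the threshold.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = greedy_nms(boxes, scores, score_threshold=0.3, iou_threshold=0.5)
```

In this example the second box overlaps the first with an IOU of about 0.68 and is dropped, and the fourth box never passes the confidence threshold, so only the first and third boxes survive.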
create a priority queue of rectangles based on their scores
create an empty set of selected rectangles
loop over the priority queue:
    loop over the selected set:
        compute IOU between the rectangles
        if IOU is above the threshold, break the loop
    if the loop did not break, add the priority queue rectangle to the selected set
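The priority queue variant above might look like the following in Python, using the standard `heapq` module. The function name and the optional `max_output` cap are assumptions for this sketch; the cap is only there to show one reason a queue can be attractive, namely that the loop can stop early without ever fully sorting the candidates.

```python
import heapq


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_priority_queue(boxes, scores, iou_threshold, max_output=None):
    """Return the indices of the selected boxes, highest score first."""
    # heapq is a min-heap, so scores are negated to pop the highest score first.
    heap = [(-s, i) for i, s in enumerate(scores)]
    heapq.heapify(heap)
    selected = []
    while heap and (max_output is None or len(selected) < max_output):
        _, i = heapq.heappop(heap)
        for j in selected:
            if iou(boxes[i], boxes[j]) > iou_threshold:
                break  # suppressed by an already selected box
        else:
            # The inner loop did not break: no selected box overlaps too much.
            selected.append(i)
    return selected
```

Python's `for ... else` maps directly onto the "if the loop did not break" step of the pseudocode.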
radix sort rectangles in descending score order (DeviceRadixSort)
flip boxes to get x1 < x2 and y1 < y2 if necessary
for each box, compute a bitmask of the other boxes with IOU > threshold (NMSKernel)
build a global bitmask for selected boxes (NMSReduce)
(each thread handles a number of boxes)
    set all bits of the global bitmask to 1 (0xFFFFFFFF per word, i.e. all boxes are selected)
    loop over all boxes:
        if the bit corresponding to the box is still 1:
            AND this thread's part of the global mask with the inverse of the box's bitmask
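The two-phase bitmask idea above can be imitated on the CPU to see why it gives the same answer as the greedy loop. This is a sketch under assumptions, not the TensorFlow kernel: a single Python integer stands in for the array of 32-bit mask words, the name `nms_bitmask` is made up for the example, and each box's mask only records lower-scored boxes.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_bitmask(sorted_boxes, iou_threshold):
    """Boxes must already be sorted by descending score. Returns kept indices."""
    n = len(sorted_boxes)
    # Phase 1 (independent per box, so it parallelises well): bit j of masks[i]
    # is set when the lower-scored box j overlaps box i above the threshold.
    masks = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if iou(sorted_boxes[i], sorted_boxes[j]) > iou_threshold:
                masks[i] |= 1 << j
    # Phase 2 (sequential over boxes): start with every box selected, then let
    # each box that is still selected clear the bits of the boxes it suppresses.
    selected = (1 << n) - 1
    for i in range(n):
        if (selected >> i) & 1:
            selected &= ~masks[i]
    return [i for i in range(n) if (selected >> i) & 1]
```

The expensive IOU computations all sit in phase 1, where no box depends on another; only the cheap bitwise reduction in phase 2 has to walk the boxes in score order.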
NMS execution time vs number of input boxes.
NMS execution time vs number of distinct (non-overlapping) boxes, with input boxes fixed at 54000.
radix sort rectangles in descending score order (CUB DeviceRadixSort)
run the kernel as follows:
    for each rectangle:
        calculate IOU with each rectangle lower in score
        if IOU is above the threshold, mark the lower-scored rectangle by setting its index to -1
extract the non -1 indices (CUB DeviceSelect::If)
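A sequential Python sketch of the mark-and-compact steps above follows. It is only an illustration of the data flow, not the CUDA kernel. The steps do not say whether a rectangle that has already been marked should still suppress others; this sketch assumes it should not, which keeps the result identical to the greedy loop. The function name is made up for the example, and a list comprehension stands in for the CUB `DeviceSelect::If` compaction.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mark_and_compact_nms(sorted_boxes, iou_threshold):
    """Boxes must already be sorted by descending score. Returns kept indices."""
    n = len(sorted_boxes)
    index = list(range(n))
    for i in range(n):
        if index[i] == -1:
            continue  # assumption: a suppressed rectangle suppresses nothing
        for j in range(i + 1, n):
            # Mark every lower-scored rectangle that overlaps rectangle i too much.
            if index[j] != -1 and iou(sorted_boxes[i], sorted_boxes[j]) > iou_threshold:
                index[j] = -1
    # Compaction: keep only the indices that were never marked.
    return [k for k in index if k != -1]
```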
  1. OpenCV NMS, https://github.com/opencv/opencv/blob/master/modules/dnn/src/nms.inl.hpp
  2. tf.image.non_max_suppression, https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/image/non_max_suppression
  3. Tensorflow 1.15 NMS (CPU), https://github.com/tensorflow/tensorflow/blob/r1.15/tensorflow/core/kernels/non_max_suppression_op.cc
  4. Improving Object Detection With One Line of Code, https://arxiv.org/pdf/1704.04503.pdf
  5. Tensorflow 1.15 NMS (GPU), https://github.com/tensorflow/tensorflow/blob/r1.15/tensorflow/core/kernels/non_max_suppression_op.cu.cc
  6. Learning Non-Maximum Suppression, http://openaccess.thecvf.com/content_cvpr_2017/papers/Hosang_Learning_Non-Maximum_Suppression_CVPR_2017_paper.pdf
  7. MaxpoolNMS: Getting Rid of NMS Bottlenecks in Two-Stage Object Detectors, http://openaccess.thecvf.com/content_CVPR_2019/papers/Cai_MaxpoolNMS_Getting_Rid_of_NMS_Bottlenecks_in_Two-Stage_Object_Detectors_CVPR_2019_paper.pdf
  8. non_max_suppression GPU version is 3x slower than CPU version in TF 1.15, https://github.com/tensorflow/tensorflow/issues/33708
  9. Work-Efficient Parallel Non-Maximum Suppression for Embedded GPU Architectures, http://rapid-project.eu/_docs/icassp2016.pdf
  10. An efficient end-to-end object detection pipeline on GPU using CUDA, https://pure.tue.nl/ws/portalfiles/portal/130181034/Wang_XiaoweiMaster_Thesis_3_.pdf
  11. Code to experiment with NMS ops (or other ops) in Tensorflow 1.x, https://github.com/whatdhack/tf-nms
  12. SSD: Single Shot MultiBox Detector, https://arxiv.org/pdf/1512.02325.pdf
  13. Daedalus: Breaking Non-Maximum Suppression in Object Detection via Adversarial Examples, https://arxiv.org/abs/1902.02067
