
Digital Image Processing

In [1]:
%run load_lib_ch6.py

Chapter 6: Keypoint Detection

Code used in this chapter can be found here

Keypoints, also called interest points, are point locations within an image that define a feature. These features are invariant to geometric transformations such as rotation, translation, or scaling, and also invariant to photometric transformations such as brightness adjustment. This means that the keypoints in two different digital images containing the same object can be matched together, which allows various computer vision problems to be solved, including panorama stitching, 3D reconstruction, and object recognition.

Therefore the features of a good keypoint detector are:

  • Repeatability - The same keypoints should be detected on an object across different images of it.
  • Uniqueness - The detected keypoints need to be 'interesting': features that are common throughout an image are harder to describe and to match to their counterparts in other images.
  • Invariance - Geometric and photometric transformations should not affect the produced keypoints.

Some of the most distinctive features of an object are its corners. Corners are points with large intensity variation in every direction, and they remain corners under most transformations, which makes them interesting points. Many algorithms implement corner detection.

However, a corner that is scaled up may appear as a straight line, so the correct scale must be found. There are many ways to find the appropriate scale for the tested image location; the most common are the Laplacian of Gaussian (LoG) and its approximations, as in the sketch below.
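As a concrete illustration, here is a minimal sketch of scale selection using the scale-normalized LoG, assuming NumPy and SciPy are available. The function name, the sigma range, and the single-point probe are illustrative choices, not code from this chapter's library.

import numpy as np
from scipy import ndimage

def best_log_scale(gray, y, x, sigmas=(1, 2, 4, 8, 16)):
    """Return the sigma with the strongest scale-normalized LoG response at (y, x)."""
    responses = []
    for s in sigmas:
        # Multiplying by sigma**2 normalizes the LoG so responses are
        # comparable across scales.
        log = (s ** 2) * ndimage.gaussian_laplace(gray.astype(float), sigma=s)
        responses.append(abs(log[y, x]))
    return sigmas[int(np.argmax(responses))]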

Common Keypoint Detectors

The keypoint detectors described in this chapter will be demonstrated on three test images:

  • a normal image,
  • a rotated image,
  • an enlarged brightened image (although displayed with the same size).
In [2]:
img_norm, img_rot, img_bright = keypoint_test_images(img)
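The helper keypoint_test_images is defined in load_lib_ch6.py and is not shown here. A plausible minimal stand-in using OpenCV might look like the following; the rotation angle, scale factor, and brightness offset are assumptions.

import cv2

def keypoint_test_images_sketch(img):
    """Hypothetical stand-in for the chapter's helper: returns a normal,
    a rotated, and an enlarged brightened copy of img."""
    h, w = img.shape[:2]
    # Rotate around the image centre (the angle is an arbitrary choice here)
    M = cv2.getRotationMatrix2D((w / 2, h / 2), 45, 1.0)
    img_rot = cv2.warpAffine(img, M, (w, h))
    # Double the size, then raise the brightness
    img_big = cv2.resize(img, (2 * w, 2 * h), interpolation=cv2.INTER_LINEAR)
    img_bright = cv2.convertScaleAbs(img_big, alpha=1.0, beta=40)
    return img, img_rot, img_bright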
Harris Corner Detection

The Harris detector finds corners by computing, at each pixel, the eigenvalues of the local structure (second-moment) matrix of the image and then applying a response function to classify the point as an edge, a corner, or a flat area.

The keypoints on the third image appear smaller because that image is twice as big. The test below shows that the Harris detector is quite fast but does not perform well on the transformed images. Moreover, most of the keypoints cluster around the most detailed objects, the feather and the eyes, while the rest of the image is ignored.
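The harris_example helper is not shown; a minimal equivalent built on OpenCV's cv2.cornerHarris might look like this. The block size, aperture, k, and threshold are common defaults, not necessarily the chapter's values.

import cv2
import numpy as np

def harris_keypoints(img, block_size=2, ksize=3, k=0.04, thresh=0.01):
    """Mark strong Harris corner responses in red."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    response = cv2.cornerHarris(np.float32(gray), block_size, ksize, k)
    out = img.copy()
    out[response > thresh * response.max()] = (0, 0, 255)  # red in BGR
    return out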

In [3]:
img_harris = harris_example([img_norm, img_rot, img_bright])
Finding keypoints took 0.0020 seconds
Finding keypoints took 0.0010 seconds
Finding keypoints took 0.0070 seconds
In [4]:
compare_result_3(img_harris[0],img_harris[1],img_harris[2])
Shi-Tomasi Corner Detection

The Shi-Tomasi method is an improvement on the Harris detector. The resulting points are more spread out and therefore match more accurately than in the Harris example. Noticeably more points in the rotated image match their counterparts in the other two images, although it can still be observed that many points do not match. The method is also not scale invariant and does not perform as well on the third image.
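The shi_tomasi_example helper is not shown; OpenCV exposes the Shi-Tomasi detector as cv2.goodFeaturesToTrack, and a minimal sketch might look like this. The quality level and minimum distance are illustrative.

import cv2
import numpy as np

def shi_tomasi_keypoints(img, n=100):
    """Draw up to n Shi-Tomasi corners as filled green circles."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=n,
                                      qualityLevel=0.01, minDistance=10)
    out = img.copy()
    if corners is not None:
        for x, y in corners.reshape(-1, 2):
            cv2.circle(out, (int(x), int(y)), 4, (0, 255, 0), -1)
    return out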

In [5]:
img_shi = shi_tomasi_example([img_norm, img_rot, img_bright], n=100)
Finding keypoints took 0.0030 seconds
Finding keypoints took 0.0010 seconds
Finding keypoints took 0.0080 seconds
In [6]:
compare_result_3(img_shi[0],img_shi[1],img_shi[2])
SIFT

Scaled-up corners no longer appear as corners to a detector with a fixed-size window element. To tackle this problem, a Laplacian of Gaussians (LoG) can be used to calculate a scaling factor for the feature detector, but LoG computation is costly. This is why the Scale-Invariant Feature Transform (SIFT) uses the Difference of Gaussians (DoG), an approximation of the LoG. The DoG is also useful for eliminating insignificant points.

For each location $\mathbf{x}$, features are detected at the local extrema of $D(\mathbf{x}) = D + \frac{\partial D^T}{\partial \mathbf{x}}\mathbf{x} + \frac{1}{2}\mathbf{x}^T \frac{\partial^2 D}{\partial \mathbf{x}^2}\mathbf{x}$, where $D$ is the DoG. The resulting features are scale invariant, but many of them lie on edges. To keep only the corners and eliminate the edge responses, the Harris corner measure described earlier is applied, making the detection repeatable and scale invariant. Finally, the dominant orientation is determined by computing the local maximum of the gradient-direction histogram, which makes the detector rotation invariant.
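The sift_example helper is not shown; in OpenCV 4.4+ SIFT is available as cv2.SIFT_create (earlier versions needed opencv-contrib's cv2.xfeatures2d.SIFT_create). A minimal sketch:

import cv2

def sift_keypoints(img):
    """Detect SIFT keypoints and draw them with their size and orientation."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    sift = cv2.SIFT_create()  # cv2.xfeatures2d.SIFT_create() on older builds
    kps = sift.detect(gray, None)
    return cv2.drawKeypoints(img, kps, None,
                             flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)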

In [7]:
img_sift = sift_example([img_norm, img_rot, img_bright])
Finding keypoints took 0.0441 seconds
Finding keypoints took 0.0356 seconds
Finding keypoints took 0.0882 seconds
In [8]:
compare_result_3(img_sift[0],img_sift[1],img_sift[2])
SURF

As can be observed, SIFT is significantly slower than the previous methods, which can make it unsuitable for applications that require fast computation. This is why Speeded-Up Robust Features (SURF) was published by Herbert Bay et al. It is an improved, faster version of SIFT: instead of approximating the LoG with a DoG, it approximates it with a box filter. An advantage of this approach is that multiple scales can be computed in parallel.
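The surf_example helper is not shown; SURF lives in the non-free part of opencv-contrib, so the sketch below assumes a build with the xfeatures2d module enabled. The Hessian threshold is illustrative.

import cv2

def surf_keypoints(img, hessian_threshold=400):
    """Detect SURF keypoints; requires opencv-contrib built with
    the non-free modules enabled."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian_threshold)
    kps = surf.detect(gray, None)
    return cv2.drawKeypoints(img, kps, None, color=(0, 255, 0))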

The timings show SURF performing noticeably better than SIFT on the rotated and enlarged images, needing roughly 40-60% less time there; the first SURF measurement appears inflated by one-off initialization.

In [9]:
img_surf = surf_example([img_norm, img_rot, img_bright])
Finding keypoints took 0.1996 seconds
Finding keypoints took 0.0130 seconds
Finding keypoints took 0.0521 seconds
In [10]:
compare_result_3(img_surf[0],img_surf[1],img_surf[2])
FAST

Although significantly faster, SURF is still not fast enough for some real-time applications on slow computing machines. Features from Accelerated Segment Test (FAST), by Edward Rosten and Tom Drummond, is an even faster feature detector.
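The fast_example helper is not shown; OpenCV's FAST detector would be used roughly as follows. The threshold value is an assumption.

import cv2

def fast_keypoints(img, threshold=25):
    """Detect FAST keypoints with non-maximum suppression (on by default)."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    fast = cv2.FastFeatureDetector_create(threshold=threshold)
    kps = fast.detect(gray, None)
    return cv2.drawKeypoints(img, kps, None, color=(255, 0, 0))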

The results of the test show that the FAST detector generates significantly more keypoints than any of the previous methods and is as fast as the simplest Harris detector. However, this improvement comes at the price of keypoint quality and of the spacing between the points.

In [11]:
img_fast = fast_example([img_norm, img_rot, img_bright])
Finding keypoints took 0.0010 seconds
Finding keypoints took 0.0010 seconds
Finding keypoints took 0.0010 seconds
In [12]:
compare_result_3(img_fast[0],img_fast[1],img_fast[2])

In conclusion, there is no single 'best' detector: a FAST detector may be better suited for a mobile application, while SURF could be better suited for image stitching. It is important to note that some of these detectors (SURF and SIFT) are patented and may not be used in commercial products without proper licensing.
