Digital Image Processing

In [1]:
% run

Chapter 7: Feature Extraction and Image Matching

Feature Extractors

The previous chapter described various keypoint detectors. Some of them (SIFT, SURF) also compute a descriptor for each point, but others (e.g. FAST) only provide the locations of the points and do not describe them. For this reason, separate feature extractors have been developed that compute a descriptor at a given point. Some common feature extractors are described here. To learn more about them, check the scientific papers listed in the additional resources.


  • BRIEF

The most popular descriptors, SIFT and SURF, take a lot of memory per feature, which makes them unsuitable for many applications, such as those on embedded systems. BRIEF takes advantage of the fact that the full dimensionality of those descriptors is often unnecessary and computes a binary string descriptor directly. It takes a smoothed crop of the input image at the given location and selects $n_d$ location pairs. The intensities at each location pair are compared and the result of each comparison is stored as one bit of the binary string. The drawback of BRIEF is that it is not robust to large in-plane rotations.
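The pairwise intensity test can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the reference implementation: a 3x3 box filter stands in for the smoothing step, the location pairs are drawn uniformly at random, and the names `box_blur`, `make_test_pairs` and `brief_descriptor` are hypothetical helpers introduced here.

```python
import random


def box_blur(patch):
    """Smooth a 2-D list of intensities with a 3x3 box filter (edges clamped).

    Assumption: a box filter is used here only as a simple stand-in
    for the smoothing applied before the intensity tests.
    """
    h, w = len(patch), len(patch[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += patch[yy][xx]
            out[y][x] = total / 9.0
    return out


def make_test_pairs(size, n_d, seed=0):
    """Draw n_d random pairs of (row, col) locations inside a size x size patch."""
    rng = random.Random(seed)

    def point():
        return (rng.randrange(size), rng.randrange(size))

    return [(point(), point()) for _ in range(n_d)]


def brief_descriptor(patch, pairs):
    """Return one bit per pair: 1 if the first location is darker than the second."""
    smoothed = box_blur(patch)
    return [
        1 if smoothed[p[0]][p[1]] < smoothed[q[0]][q[1]] else 0
        for p, q in pairs
    ]


# Synthetic 9x9 patch, used only to exercise the functions.
patch = [[(7 * x + 13 * y) % 256 for x in range(9)] for y in range(9)]
pairs = make_test_pairs(size=9, n_d=32)
descriptor = brief_descriptor(patch, pairs)
print(len(descriptor), "bits:", "".join(str(b) for b in descriptor))
```

The same set of pairs must be reused for every keypoint, otherwise the bits of two descriptors are not comparable.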

  • ORB

ORB combines the FAST keypoint detector and the BRIEF descriptor, with some modifications. The orientation of a keypoint is given by the direction from its centre to the intensity-weighted centroid of the surrounding pixels. The test locations used by BRIEF are rotated according to this orientation to produce a steered, rotation-invariant version of the descriptor. However, the algorithm does little to achieve scale invariance beyond detecting keypoints on an image pyramid.
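The orientation step can be illustrated with image moments. The sketch below assumes a small square patch centred on the keypoint (a simplification chosen here for brevity) and measures the angle in image coordinates, with y pointing down. The helper names `intensity_centroid_angle` and `rotate_offsets` are hypothetical.

```python
import math


def intensity_centroid_angle(patch):
    """Angle (radians) from the patch centre to its intensity-weighted centroid."""
    h, w = len(patch), len(patch[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    m10 = 0.0  # first-order moment along x
    m01 = 0.0  # first-order moment along y
    for y in range(h):
        for x in range(w):
            m10 += (x - cx) * patch[y][x]
            m01 += (y - cy) * patch[y][x]
    return math.atan2(m01, m10)


def rotate_offsets(offsets, theta):
    """Rotate (dx, dy) test offsets about the keypoint by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * dx - s * dy, s * dx + c * dy) for dx, dy in offsets]


# A patch that is bright only along its bottom row: the centroid lies
# straight below the centre, i.e. along the +y image axis.
bottom_bright = [[0] * 5 for _ in range(4)] + [[255] * 5]
theta = intensity_centroid_angle(bottom_bright)
steered = rotate_offsets([(1.0, 0.0), (0.0, 2.0)], theta)
print(round(math.degrees(theta), 1), steered)
```

Rotating the test offsets by the keypoint's own angle means that the same physical pixels are compared no matter how the image is rotated, which is where the rotation invariance comes from.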


  • BRISK

BRISK is another approach to describing points located by FAST. It computes FAST scores across a scale space made up of $n$ octave images with scales $2^i$ and intra-octave images with scales $1.5 \times 2^i$. Non-maximal suppression is applied on each octave except the intra-octave $i=-1$. The score at each point is compared with the maximum values in the scale images above and below; a local maximum is chosen and its scale is taken as the scale of the feature. Pixel pairs from the sampling pattern are split into two categories: long-distance and short-distance. Long-distance pairs are used to determine the orientation via local gradients. Short-distance pairs are then rotated by that orientation and compared by pixel intensity to generate the binary descriptor. The sampling pattern is spread evenly around concentric circles, with Gaussian smoothing that varies with the separation of the sample points.
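To make the layer layout concrete, the following sketch lists the scales of the octave and intra-octave images and picks the layers whose score exceeds both neighbours. It is a simplified illustration: the scores are made-up numbers, any refinement of the scale between layers is left out, and `brisk_layer_scales` and `scale_maxima` are hypothetical names.

```python
def brisk_layer_scales(n_octaves):
    """List (name, scale) for octaves c_i = 2^i and intra-octaves d_i = 1.5 * 2^i."""
    layers = []
    for i in range(n_octaves):
        layers.append((f"c{i}", 2.0 ** i))
        layers.append((f"d{i}", 1.5 * 2.0 ** i))
    # Sort by scale so that neighbouring entries are neighbouring layers.
    return sorted(layers, key=lambda layer: layer[1])


def scale_maxima(scores):
    """Indices of layers whose score beats the layers directly above and below."""
    return [
        k
        for k in range(1, len(scores) - 1)
        if scores[k] > scores[k - 1] and scores[k] > scores[k + 1]
    ]


layers = brisk_layer_scales(n_octaves=2)
# Hypothetical FAST scores of one image location, one value per layer.
scores = [10, 30, 20, 5]
for k in scale_maxima(scores):
    name, scale = layers[k]
    print(f"feature scale taken from layer {name} (scale {scale})")
```

The octave and intra-octave layers interleave when sorted by scale, so each layer's neighbours above and below are simply the adjacent entries of the list.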


Feature Matching

Feature matching is the process of assigning corresponding features between two or more images of the same object or scene. For most matching algorithms the process is the same: each feature of the first input set is compared to every feature of the other image(s) and a distance metric is calculated. The feature with the smallest distance is returned as the match. A threshold on that distance is then used to determine whether the 'match' is a true positive.
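The procedure described above amounts to a brute-force nearest-neighbour search. The sketch below assumes binary descriptors stored as lists of bits and uses the Hamming distance as the metric, which is a natural choice for BRIEF-style descriptors. The function names and the `max_distance` parameter are illustrative, not taken from any library.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def match_features(set_a, set_b, max_distance):
    """For each descriptor in set_a, find its nearest neighbour in set_b.

    Returns (index_a, index_b, distance) tuples, keeping only matches
    whose distance does not exceed max_distance.
    """
    matches = []
    for i, desc_a in enumerate(set_a):
        best_j, best_d = None, None
        for j, desc_b in enumerate(set_b):
            d = hamming(desc_a, desc_b)
            if best_d is None or d < best_d:
                best_j, best_d = j, d
        if best_d is not None and best_d <= max_distance:
            matches.append((i, best_j, best_d))
    return matches


# Toy 4-bit descriptors, used only to exercise the functions.
set_a = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 1, 1, 1]]
set_b = [[1, 1, 1, 0], [0, 0, 0, 1]]
print(match_features(set_a, set_b, max_distance=1))
```

In this toy data the third descriptor of `set_a` is at distance 2 from both candidates, so the threshold rejects it. The cost grows with the product of the two set sizes, which is why large-scale matching usually relies on approximate search structures instead.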