Assignment 4: Feature points, matching, homography Machine perception Solved

40.00 $

Category: Tags: , , , , , , ,
Click Category Button to View Your Next Assignment | Homework

You'll get a download link with a: zip solution files instantly, after Payment

Securely Powered by: Secure Checkout

Description

5/5 - (1 vote)

 

Create a folder assignment4 that you will use during this assignment. Unpack the content of the assignment4.zip that you can download from the course webpage to the folder. Save the solutions for the assignments as Python scripts to assignment4 folder. In order to complete the exercise you have to present these files to the teaching assistant. Some assignments contain questions that require sketching, writing or manual calculation. Write these answers down and bring them to the presentation as well. The tasks that are marked with (cid:70) are optional. Without completing them you can get at most 75 points for the exercise. The maximum total amount of points for each assignment is 100. Each optional task has the amount of additional points written next to it.

Introduction

This assignment will deal with automatic searching for correspondence points between two images. Correspondences are a key step when dealing with the task of aligning two or more images, for example building a panorama from multiple images. Methods that find correspondences between images usually start by finding feature points in them. Feature points denote regions in the image that have a high chance of re-detection in an image where the same scene is captured from a slightly different angle or viewing conditions.

Exercise 1: Feature points detectors

In this exercise you will implement two frequently used feature point detectors: the Hessian algorithm [1](p. 44) and the Harris algorithm.

(a) The Hessian detector is based on second derivatives matrix H(x, y) (also named Hessian matrix, hence the name of the algorithm) at point (x, y) in the image:

H(x, y) =

(cid:20) Ixx(x, y; σ) Ixy(x, y; σ) Ixy(x, y; σ) Iyy(x, y; σ)

(cid:21)

,

(1)

where σ explicitly states that the second derivative is computed on a smoothed image. Hessian detector selects point (x, y) as a feature point if the determinant of the Hessian matrix exceeds a given threshold value t. If we want to create a variant of the Hessian detector that is scale independent (independent of the Gaussian filter that is used to smooth the image) we have to include a normalization factor of σ4 and thus we get a feature detection rule:

det(H(x, y)) = σ4(Ixx(x, y; σ)Iyy(x, y; σ) − Ixy(x, y; σ)2) > t.

(2)

1

Implement a function hessian_points, that computes a Hessian determinant using the equation (2) for each pixel of the input image. As this computation can be very slow if done pixel by pixel, you have to implement it using vector operations (without explicit for loops). Test the function using image from test_points.jpg as your input (do not forget to convert it to grayscale) and visualize the result. Extend the function hessian_points by implementing a non-maximum suppres- sion post-processing step that only retains responses that are higher than all the neighborhood responses and whose value are higher than a given threshold value thresh. The function should return the list of points that satisfy these properties. Create a function that you will use plot the detected points (use plt.plot()) over the input image. Load the image test_points.jpg, process it and visualize the result. Set the input parameters of the Hessian detector to thresh = 100 and σ = 1, and then change their values to get a feeling for the effect of the parameters.

Question: What kind of structures in the image are detected by the algorithm? How does the parameter σ affect the result?

(b) Implement the Harris feature point detector. This detector is based on the auto- correlation matrix C that measures the level of self-similarity for a pixel neigh- borhood for small deformations. At the lectures you have been told that the Har- ris detector chooses a point (x, y) for a feature point if both eigenvalues of the auto-correlation matrix for that point are large. This means that the neighbor- hood of (x, y) contains two well defined rectangular structures – i.e. a corner. Auto- correlation matrix can be computed using the first partial derivatives at (x, y) that are subsequently smoothed using a Gaussian filter:

(cid:20) G(x, y; ˜σ) ∗ I 2

C(x, y; σ, ˜σ) = σ2

G(x, y; ˜σ) ∗ IxIy(x, y; σ) G(x, y; ˜σ) ∗ I 2 where ∗ stands for convolution. Computing eigenvalues λ1 and λ2 of matrix C(x, y; σ, ˜σ) is expensive, therefore we will use the following relations1

x(x, y; σ) G(x, y; ˜σ) ∗ IxIy(x, y; σ) y (x, y; σ)

(3)

(cid:21)

,

det(C) = λ1λ2 trace(C) = λ1 + λ2

(4) (5)

1In the following text we will omit the parameters of an auto-correlation matrix C(x, y; σ, ˜σ) for clarity

and write C instead.

2

to compute the ratio r = λ1/λ2. If we assume that

trace2(C) det C

=

(λ1 + λ2)2 λ1λ2

=

(rλ2 + λ2)2 rλ2λ2

=

(r + 1)2 r

,

we can express the feature point condition for (x, y) as:

det(C) − αtrace2C > t.

(6)

(7)

In practice we use ˜σ = 1.6σ, α = 0.06 for parameter values. Implement the condition (7) without using explicit computation of matrix C and without for loops, as you did for the core part of the Hessian detector.

Implement the feature point detector as function harris_points that computes values of equation (7) for all pixels, performs the non-maximum suppression post- processing step as well as thresholding using threshold thresh. The function should return the list of points that satisfy these properties.

Load the image test_points.jpg and compute the Harris feature points. Compare the result with the feature points detected by the Hessian detector. Experiment with different parameter values. Do the feature points of both detectors appear on the same structures in the image?

Exercise 2: Matching local regions

On of the uses of feature points is searching for similar structures in different images. To do this we will need descriptors of the regions around these points. In this assignment you will implement some simple descriptors as well as their matching.

(a) Use the function simple_descriptors to calculate descriptors for a list of feature points. Then, write a function find_correspondences that, given two lists of de- scriptors, calculates the similarities between all descriptors in both lists. Use the Hellinger distance. Finally, for each descriptor from the first list, find the most similar descriptor from the second list. A good idea might be to use the same list (perhaps shuffled) for both inputs to sanity check your code.

Note: Use your own functions gauss and gaussdx from previous assignments in order for the descriptor function to work correctly.

3

(b) Write a script that loads images graf/graf1_small.jpg and graf/graf2_small.jpg, runs the function find_correspondences and visualizes the result. Use the function display_matches from the supplementary material for visualization. Experiment with different parameters for descriptor calculation and report on the changes that occur.

(c) Implement a simple feature point matching algorithm. Write a function find_matches that is given two images as an input and returns a list of matched feature points from image 1 to image 2. The function should return a list of index pairs, where the first element is the index for feature points from the first image and the second element is the index for feature points from the second image.

Follow the algorithm below:

• Execute a feature point detector to get stable points for both images (you can

experiment with both presented detectors),

• Compute simple descriptors for all detected feature points

• Find best matches between descriptors in left and right images using the Hellinger distance, i.e. compute the best matches from the left to right image and then the other way around. In a post-processing step only select symmetric matches. A symmetric match is a match where a feature point P i in the left 1 image is matched to point P j in the right image and at the same time point 2 P j in left image. This way we get 2 a set of point pairs where each point from the left image is matched to exactly one point in the right image as well as the other way around.

in the right image is matched to the point P i 1

Use the function display_matches from the supplementary material to display all the symmetric matches. Write a script that loads images graf/graf1_small.jpg and graf/graf2_small.jpg, runs the function find_matches and visualizes the result.

Question: What do you notice when visualizing the correspondences? How robust are the matches?

4

(d) (cid:70) (5 points) Incorrect matches can occur when matching descriptors. Suggest and implement a simple method for eliminating at least some of these incorrect matches. You can use the ideas that were presented at the lectures or test your own ideas. Either way, you need to explain the idea behind the method as well as well as demonstrate that the number of incorrect matches is lowered when using the pro- posed method.

(e) (cid:70) (25 points) Implement a local feature detector of your choice (e.g. SIFT, SURF, BRIEF, HOG). Test it on other assignments and report on the changes (increased robustness etc.).

(f) (cid:70) (10 points) Record a video with your phone or webcam. Use OpenCV to detect keypoints of your choice, display them using cv2.drawKeypoints and save the video with the displayed keypoints. The video must demonstrate the keypoint detector’s robustness to rotation and scale change. Make the video at least 15s long.

Exercise 3: Homography estimation

In this assignment we are dealing with planar images, we can try and estimate a homog- raphy matrix H that maps one image to another using planar correspondences. In this exercise you will implement an algorithm that computes such a transformation using min- imization of the mean square error. For additional information about the method, consult the lecture notes as well as the course literature [1] (in the literature the term direct linear transform (DLT) is frequently used to describe the idea).

We will start with a short overview of the minimization of mean square error on a simpler case of similarity transform estimation. The similarity transform is a linear transform that accounts for translation, rotation, and scale. The transformation of a point x can be written as x(cid:48) = f (x, p), where p = [p1, p2, p3, p4] is a vector of four parameters that define the transform such that (cid:20) x(cid:48) y(cid:48)

(cid:20) xp1 − yp2 + p3 xp2 + yp1 + p4

x(cid:48) =

=

(cid:21)

(cid:21)

.

Question: Looking at the equation above, which parameters account for translation

and which for rotation and scale?

Question: Write down a sketch of an algorithm to determine similarity transform n)]. For more details

from a set of point correspondences P = [(x1, x(cid:48) consult the lecture notes.

2), . . . (xn, x(cid:48)

1), (x2, x(cid:48)

5

You will now implement a similar, but more complex algorithm for homography esti- mation. For a reference point xr in the first image we compute a corresponding point xt in the second image as:

h11 h12 h13 h21 h22 h23 1 h31 h32

xr yr 1

Hxr = xt 

 =

 ; where

xt yt zt

1 z(cid:48) t

x(cid:48) t y(cid:48) t z(cid:48) t

 =

 

  ,

x(cid:48) t z(cid:48) t y(cid:48) t z(cid:48) t 1

(8)

where points xr and xt are written in homogeneous coordinates 2. Using the equations (8) we get a system of linear equations with eight unknowns h11, h12, h13, h21, h22, h23, h31, h32:

h11xr + h12yr + h13 h31xr + h32yr + 1 h21xr + h22yr + h23 h31xr + h32yr + 1

= xt

= yt,

that can be transformed to

h11xr + h12yr + h13 − xth31xr − xth32yr − xt = 0 h21xr + h22yr + h23 − yth31xr − yth32yr − yt = 0.

(9)

(10)

(11) (12)

If we want to estimate the eight parameters that determine a homography, we need at least four pairs of matched feature points. As some matches can be imprecise, we can increase n). the accuracy of our estimate by using a larger number of matches (x1, x(cid:48) This way we get an overdetermined system of equations:

1), . . . , (xn, x(cid:48)

         

0

0

0

0

0 −xt1xr1 −xt1yr1 −xt1 xr1 yr1 1 0 xr1 yr1 1 −yt1xr1 −yt1yr1 −yt1 0 0 0 −xt2xr2 −xt2yr2 −xt2 xr2 yr2 1 0 xr2 yr2 1 −yt2xr2 −yt2yr2 −yt2 0 0 … … … … … xrn yrn 1 0 0

… 0 −xtnxrn −xtnyrn −xtn 0 xrn yrn 1 −ytnxrn −ytnyrn −ytn

… 0

… 0

         

            

Ah = 0 

            

=

           

h11 h12 h13 h21 h22 h23 h31 h32 h33

(13)

, (14)

           

0 0 0 0 0 … 0 0

that can be (similarly to estimation of similarity transform) solved as a minimization of mean square error. If the matrix A is a square matrix then we get an exact solution of the system. In case of an over-determined system (e.g., n > 4) the matrix A is not square. This problem is usually [1] solved using a matrix pseudo-inverse AT A, that is square and can be therefore be split to eigenvectors and eigenvalues. A solution of such a system is the unit eigenvector of AT A that corresponds to the lowest eigenvalue (using function eig). The same solution can be obtained more efficiently using the singular-value

2Homogeneous coordinates for a point in 2D space are obtained by adding a third coordinate and

setting it to 1

6

decomposition (SVD) as an eigenvector that corresponds to the lowest eigenvalue of A (using function svd).

svd= UDVT = U

A

 

λ11 … 0

 

 

. . . 0 … . . . · · · λ99

v11 … v91

. . . v19 … . . . · · · v99

 T

 

(15)

A vector h that contains the parameters of the homography matrix is obtained from the last column of matrix V and by normalizing the vector with the value of v99 to set the last element in h to 1:

h =

[v19, . . . , v99]T v99

.

(16)

(a) Write function estimate_homography, that approximates a homography between two images using a given set of matched feature points following the algorithm below.

• Construct a matrix A using the equation (13). • Perform a matrix decomposition using the SVD algorithm: [U, S, V] = svd(A). • Compute vector h using equation (16). • Reorder the elements of h to a 3×3 matrix H (e.g. using the function reshape).

(b) Load the two New York cityscape images (newyork/image1.jpg and newyork/im- age2.jpg), and four hand-annotated correspondence pairs in the file newyork.txt3. Use the function display_matches to display the pairs of points. Using these pairs estimate the homography matrix H from the first image to the second image and use function cv2.warpPerspective to transform the first image to the plane of the second image using the given homography, then display the result. Also test your al- gorithm with images graf/graf1.jpg, graf/graf2.jpg and points from graf.txt. Note: You can check the correctness of your implementation by comparing with the file newyork/H.txt

(c) When you get the correct result for the given points, repeat the procedure by gener- ating a set of matching points automatically using the find_matches function that you have implemented in the previous assignment. Select the first 10 best matches and use them to compute a homography. Visualize the result by transforming the first image to the space of the second image as well as displaying the matches be- tween the images. How well have you managed to estimate the homography? Does the estimate improve if you use more/less matching pairs? What happens if you include imperfect matches in the set? Set the parameters (feature point detector, parameters, number of selected points, etc.) to get the best performance of the method.

Note: This does not need to produce perfect results every time because DLT is not robust to outliers.

3Use read_data function to read the points. Reshape the data into four columns. The first two columns contain x and y coordinates of the points in the first image and the second two columns contain the corresponding points for the second image.

7

(d) (cid:70) (15 points) Implement and test a more robust homography estimation algorithm using iterative reweighed least squares approach that you have heard about at lec- tures. Test the robustness of the algorithm on the image pairs from the previous task. Experiment with different numbers of correspondences used as an input to the estimation.

(e) (cid:70) (10 points) Write your own function for mapping points using the homography matrix. It should work like OpenCV’s warpPerspective(). It should accept an image and a homography matrix and return the input image as remapped by the homography matrix. The size of the output should match the size of the input image.

Hint: You will need to use homogeneous coordinates.

References

[1] D. A. Forsyth and J. Ponce. Computer Vision: A Modern Approach. Prentice Hall,

2002.

8

 

  • assignment_4-gxxl7p.zip