CS6643 Exam 2 Solved

  1. In the basic stereo imaging setup below, the origin of the world coordinate system W is located at the lens center of the left camera. The distance between the lens centers of the two cameras is 12 cm. The two cameras have a focal length of 50 mm and the sensor chips (real image planes) of the cameras have a physical size of 1.2 cm × 2 cm. The output of the cameras is a pair of digital stereo images, each of size 512 × 512 pixels. The tip of vertical pole #1 appears in the left image at pixel location (i, j) = (185, 125) and appears in the right image at location (i, j) = (185, 115). The tip of vertical pole #2 appears in the left image at pixel location (i, j) = (185, 179) and appears in the right image at location (i, j) = (185, 169). Compute the horizontal distance between the tips of the two poles in the world coordinate system (horizontal distance = distance in the x direction). Show all work to get full credit. (The integer image plane uses the i-j coordinate system with i going from top to bottom and j going from left to right.)
  2. We would like to use a minimum-distance classifier formulated using linear discriminant functions D_i(X) to classify input X into one of three classes. The prototype vectors for the three classes are given below. Find the equation of the decision boundary between classes 1 and 3, simplify it into an algebraic equation (not a matrix equation), and then plot the decision boundary as a graph.
  3. Given an input grayscale image, we would like to use the Harris Corner Detector to detect interest points in the image. Write the pseudocode to compute the Local Structure Matrix A of the image at every pixel location. Do not write more than 10 lines in your pseudocode.
  4. We would like to use the signed representation of the Histogram of Oriented Gradients (HOG) descriptor to detect humans in images. In the signed representation, the histogram has 18 bins.
  • What is the dimension of the descriptor if we assume the following parameter settings:

detection window size = 296 x 168 pixels (rows x columns), cell size = 8 x 8 pixels, block size = 3 x 3 cells, and block overlap = 8 pixels.

  • The bin centers for the 18 histogram bins and the gradient magnitudes and gradient angles of an 8 x 8 cell are given below. Compute the histogram of the cell (before block normalization).

Bin # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
Bin centers (in degrees) 0 20 40 60 80 100 120 140 160 180 200 220 240 260 280 300 320 340

 

0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 220 0 0 0 0 0
0 0 0 0 180 0 0 0
0 120 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

Gradient Magnitudes

200 45 23 98 130 260 255 250
125 295 85 90 130 265 249 240
123 35 85 95 125 260 250 240
100 90 45 90 120 265 240 230
95 99 105 106 355 120 100 110
90 205 110 120 120 130 125 120
85 90 100 110 110 120 120 110
80 80 100 110 100 100 100 110

Gradient Angles
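
As a rough check on both parts of question 4, the sketch below computes the descriptor length and the histogram of the cell. It rests on two assumptions that the question does not state: a block overlap of 8 pixels is read as a block stride of 24 − 8 = 16 pixels (2 cells), and each pixel's magnitude is split linearly between the two nearest bin centers, wrapping from 340° back to 0°. The function and variable names are illustrative only.

```python
BIN_WIDTH = 20   # degrees between adjacent bin centers (0, 20, ..., 340)
NUM_BINS = 18

MAGNITUDES = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 220, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 180, 0, 0, 0],
    [0, 120, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

ANGLES = [
    [200, 45, 23, 98, 130, 260, 255, 250],
    [125, 295, 85, 90, 130, 265, 249, 240],
    [123, 35, 85, 95, 125, 260, 250, 240],
    [100, 90, 45, 90, 120, 265, 240, 230],
    [95, 99, 105, 106, 355, 120, 100, 110],
    [90, 205, 110, 120, 120, 130, 125, 120],
    [85, 90, 100, 110, 110, 120, 120, 110],
    [80, 80, 100, 110, 100, 100, 100, 110],
]


def hog_descriptor_length(win_rows, win_cols, cell, block_cells, overlap_px, bins):
    # Assumption: "overlap" is the width shared by adjacent blocks, in pixels,
    # so the block stride is the block size minus the overlap.
    block_px = block_cells * cell
    stride = block_px - overlap_px
    blocks_down = (win_rows - block_px) // stride + 1
    blocks_across = (win_cols - block_px) // stride + 1
    return blocks_down * blocks_across * block_cells * block_cells * bins


def cell_histogram(magnitudes, angles):
    # Assumption: each vote is split linearly between the two nearest bin centers.
    hist = [0.0] * NUM_BINS
    for mag_row, ang_row in zip(magnitudes, angles):
        for mag, ang in zip(mag_row, ang_row):
            if mag == 0:
                continue
            lower = int(ang // BIN_WIDTH) % NUM_BINS
            upper = (lower + 1) % NUM_BINS      # bin 18 wraps around to bin 1
            frac = (ang % BIN_WIDTH) / BIN_WIDTH
            hist[lower] += mag * (1 - frac)
            hist[upper] += mag * frac
    return hist


length = hog_descriptor_length(296, 168, 8, 3, 8, NUM_BINS)
histogram = cell_histogram(MAGNITUDES, ANGLES)
print(length)
for bin_number, value in enumerate(histogram, start=1):
    if value:
        print(bin_number, value)
```

Under these assumptions only three pixels contribute (magnitudes 220, 180 and 120 at angles 45°, 355° and 205°), so the histogram has six non-zero bins whose votes sum to 520. A different reading of "block overlap" or a nearest-bin voting rule would give different numbers.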

  5. Suppose we have already computed the normalized co-occurrence matrix P[i, j] of an input image using displacement vector d = (dx, dy). Can we obtain the normalized co-occurrence matrix P′[i, j] for displacement vector d′ = (−dx, −dy) without referring to the original input image? If so, how do we do that? Do not write more than six sentences. (Hint: displacement vector d′ has the same magnitude as d but points in the opposite direction.)
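
One way to explore this question empirically is to compute both matrices on a small test image and compare them. The sketch below uses a hypothetical 4 × 4 image with four gray levels and assumes dx is a column offset and dy a row offset; neither the image nor that convention comes from the exam. Every pixel pair counted as (i, j) under d is counted as (j, i) under −d, so the comparison is against the transpose.

```python
def cooccurrence(image, dx, dy, levels):
    """Normalized co-occurrence matrix for displacement (dx, dy).

    Assumption: dx is a column offset and dy is a row offset.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


# Hypothetical test image with gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

p = cooccurrence(image, 1, 1, 4)
p_neg = cooccurrence(image, -1, -1, 4)
print(p_neg == transpose(p))
```

The same comparison can be repeated with other displacement vectors or images to check that the relationship does not depend on this particular example.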

  6. Consider the camera coordinate system C and the world coordinate system W as shown in the figure below. The origin of the camera coordinate system is located at M(x, y, z) = M(6, 2, 0) with respect to the world coordinate system. The x axis of the camera coordinate system is parallel to the y axis of the world coordinate system, the y axis of the camera coordinate system is parallel to but points in the opposite direction of the x axis of the world coordinate system, and the z axis of the camera coordinate system is parallel to the z axis of the world coordinate system. The camera has a focal length of 45 mm and the real image plane (x′, y′) of the camera is of size 1 cm × 1 cm. The real image plane is digitized into a digital image of size 1024 × 1024 pixels. Derive the 3 × 4 camera transform that transforms points in the world coordinate system to the pixel coordinate system of the camera.

Note: Assume that the real image plane has its origin at the lower left corner, with the x′ axis pointing to the right and the y′ axis pointing upward. The digital image plane has its origin (0,0) at the upper left corner, with the i axis pointing downward and the j axis pointing to the right. The range for both i and j is [0, 1023].
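
The world-to-camera (extrinsic) part of this transform follows directly from the axis alignments stated in the question, and can be sketched as below. The sketch stops at camera coordinates: the projection and pixel-scaling steps depend on sign conventions in the figure, which is not reproduced here, and the units of the world coordinates are not stated. The helper names are illustrative.

```python
# Rows of R are the camera axes expressed in world coordinates.
R = [
    [0, 1, 0],    # camera x axis is parallel to world y
    [-1, 0, 0],   # camera y axis points along negative world x
    [0, 0, 1],    # camera z axis is parallel to world z
]
CAMERA_ORIGIN = [6, 2, 0]   # camera origin in world coordinates


def world_to_camera_matrix(rotation, origin):
    """Build the 4x4 homogeneous matrix [R | -R*origin; 0 0 0 1]."""
    matrix = []
    for row in rotation:
        translation = -sum(r * o for r, o in zip(row, origin))
        matrix.append(list(row) + [translation])
    matrix.append([0, 0, 0, 1])
    return matrix


def to_camera(matrix, world_point):
    """Apply the homogeneous transform to a 3D world point."""
    homogeneous = list(world_point) + [1]
    result = [sum(m * p for m, p in zip(row, homogeneous)) for row in matrix]
    return result[:3]


world_to_cam = world_to_camera_matrix(R, CAMERA_ORIGIN)
print(world_to_cam)
print(to_camera(world_to_cam, [6, 2, 0]))   # the camera origin itself
print(to_camera(world_to_cam, [7, 5, 4]))   # an arbitrary test point
```

The camera origin maps to (0, 0, 0), and the test point (7, 5, 4) maps to (3, −1, 4): three units along the camera x axis (world y), minus one along the camera y axis (world x), and four along z. The full 3 × 4 transform is then obtained by multiplying this matrix by the perspective and pixel-conversion matrices for whichever image-plane convention the course uses.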

  7. In the LeNet-5 convolutional neural network below, (a) what is the total number of links between the input layer and the C1 layer? (b) How many different parameters need to be trained for the links between the input layer and the C1 layer?
  8. A deep neural network has been designed to classify the input into one of five classes. The final output layer of the network is a Softmax layer. Suppose the input to the Softmax layer is [0 7 5 0 1]^T. What are the final outputs of the neural network?

Hint: the formula for the Softmax function is: softmax(z)_i = exp(z_i) / Σ_j exp(z_j), where the sum runs over all five inputs.
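
The Softmax computation for this input can be sketched in a few lines of Python. The function below uses the standard definition (each exponential divided by the sum of all exponentials); subtracting the maximum first is a common numerical-stability step that does not change the result and is not part of the exam question.

```python
import math


def softmax(z):
    # Subtracting the maximum leaves the result unchanged and avoids overflow.
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


outputs = softmax([0, 7, 5, 0, 1])
print([round(p, 4) for p in outputs])
```

Rounded to four decimal places, the outputs are approximately 0.0008, 0.8775, 0.1188, 0.0008 and 0.0022, so the second class receives most of the probability mass.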

  9. In the Eigenface method for face recognition, we compute the distance between an input face and its reconstruction as d_0 = dist(I_R, I). The distance between an input face image and its reconstruction should be small. Explain why the distance will be large for a non-face input image. Do not write more than six sentences.
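
The reconstruction distance can be illustrated with a toy example. The sketch below uses hypothetical 4-pixel "images", a hypothetical mean face, and two orthonormal vectors standing in for eigenfaces; none of these values come from the exam. The reconstruction is the projection of the mean-subtracted input onto the span of the eigenfaces, so d_0 measures how much of the input lies outside that span.

```python
import math


def reconstruct(image, mean, eigenfaces):
    """Project (image - mean) onto orthonormal eigenfaces and map back."""
    centered = [p - m for p, m in zip(image, mean)]
    weights = [sum(c * u for c, u in zip(centered, face)) for face in eigenfaces]
    recon = list(mean)
    for weight, face in zip(weights, eigenfaces):
        recon = [r + weight * u for r, u in zip(recon, face)]
    return recon


def reconstruction_distance(image, mean, eigenfaces):
    """Euclidean distance between an image and its reconstruction."""
    recon = reconstruct(image, mean, eigenfaces)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(recon, image)))


# Hypothetical toy data: the "face space" is spanned by the first two axes.
mean_face = [1, 1, 1, 1]
eigenfaces = [[1, 0, 0, 0], [0, 1, 0, 0]]

face_like = [3, -2, 1, 1]   # differs from the mean only inside the face space
non_face = [1, 1, 4, 5]     # differs from the mean only outside the face space

print(reconstruction_distance(face_like, mean_face, eigenfaces))
print(reconstruction_distance(non_face, mean_face, eigenfaces))
```

In this toy setup the face-like input is reconstructed exactly (distance 0.0), while the other input collapses to the mean face and ends up at distance 5.0. Real eigenfaces are computed from training faces by PCA, but the projection step works the same way.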