COMP3711 Homework 2 Solved

35.00 $

Category: Tags: ,

Description

Rate this product

Problem 1:

Let A be an array of n elements. A heavy element of A is any element that appears more than 3/20 times, i.e., is more than 15% of the items. For example, if n = 14 then a heavy element is one that appears at least 3=􏰖3 ·14􏰗times.

20

As examples, consider the arrays A, B and C below:
i 1 2 3 4 5 6 7 8 9 10 11 12 13 14

A[i]2635638827 4 9 9 4 i 1 2 3 4 5 6 7 8 9 10 11 12 13 14

B[i]3335638367 4 9 9 4 i 1 2 3 4 5 6 7 8 9 10 11 12 13 14

C[i]3336638366 6 9 9 9
A contains no heavy items. B contains the unique heavy item 3. C contains

the three heavy items 3, 6 and 9.
Design an O(n log n) time divide-and-conquer algorithm for finding all

the heavy items (if any) in an array.

Your solution should be split into the following three parts. Write the answer to each part separately.

  1. (a)  (i) Write documented pseudocode for a procedure Heavy(i,j) that returns the set of heavy items in subarray A[i . . . j].

    (ii) Below the pseudocode, provide a description in words (as opposed to code) of what your algorithm is doing.

  2. (b)  Explain (prove) why your algorithm is correct.
  3. (c)  Let T(n) be the worst case number of total operations of all types performed by your algorithm. Derive a recurrence relation for T(n) (explain how you derived the recurrence relation).

    Show that T (n) = O(n log n).
    Note: If the recurrence relation is in the form

    ∀n>1, T(n)≤T􏰏􏰒n􏰓􏰐+T􏰏􏰔n􏰕􏰐+c1n and T(1)=c2 (1) 22

    for some c1, c2 ≥ 0 you may immediately conclude that T (n) = O(n log n) without providing any further explanation other than saying you are calling the Master Theorem (and explaining which case of the master theorem it is and why).

    Otherwise, you must prove from first principles that whatever recurrence relation you derived implies T (n) = O(n log n) for all values of n.

2

See below for requirements, hints and general comments. Requirements, Hints and Comments:

  • Hint: The solution to Tutorial problem DC12 will help you get started.
  • Hint: Let S = Heavy(i, j). Prove that subarray A[i . . . j] cannot con-

    tain more than 6 heavy items. You will need this fact in your analysis

  • Sorting is not permitted.
    The only comparisons allowed to items in the array are equalities. Inequalities are not permitted. More specifically:

    • –  You MAY write “If A[i] = A[j]” or “If A[i] = x”.
    • –  You MAY NOT write “If A[i] < A[j]” or “If A[i] < x”.
      Note that this immediately implies that you can NOT sort the array (since sorting requires inequality comparisons).
    • –  In particular you MAY NOT solve the problem by sorting the array and then counting all of the items with the same value.
  • Part (a) requires two different types of explanations as to what your algorithm is doing. The first is the documentation IN the pseudocode. The second is the explanation AFTER the pseudo-code. BOTH must be provided.
  • Part (b) is a proof of correctness. Write it as you would a mathematical proof. Explicitly state any facts upon which the proof depends. In- complete proofs will have points deducted. See the Practice Homework on the Tutorial page for pointers as to how this should be structured.
  • Answer Parts (a) and (b) separately. Do not worry if your explanation in Part (a) and your proof of correctness in Part (b) overlap somewhat.
  • We are specifically asking for a divide-and-conquer algorithm here, e.g., a generalization of DC12. We are aware that faster non divide- and-conquer algorithms do exist and you might be able to find one by Googling. Such algorithms will not be accepted as answers. The purpose of this exercise is to use the divide-and-conquer approach to solve the problem.
  • The call Heavy(1, n) should run in O(n log n) time and return a set. You can do this by writing Return(S), where S is a set.
  • Set Notation: Your pseudocode will need to use a set data structure with associated set operations. You should use standard mathematical notation for writing set operations (do not make up your own set notation or use a notation from a programming language you know).

    See the next page for more information on how to write set operations in pseudocode along with associated running times.

3

Set notation in pseudocode: In what follows, S, S1, S2 are all sets. Recall that sets do not contain repeated items.

So,ifS1 ={a,b},S2 ={b,c}andS=S1∪S2,thenS={a,b,c}. |S|denotesthesizeofS,e.g,|S1|=|S2|=2,|S1 ∪S2|=3,|S1 ∩S2|=1.

If S = ∅ (the empty set), then |S| = 0.
“∪” denotes the union of sets. ‘∩” denotes the intersection of sets.
S1 \S2 (equivalently, S1 −S2) is the set S1 with all items in S1 ∩S2 removed.

  • Create sets using the notation “Create Set S”. This creates empty set S in O(1) time.
  • Evaluating |S| requires O(1) time.
  • Setting S1 = S2 will require O(|S2| + 1) time.
  • YoucanaddelementxtosetSbywritingS=S∪{x}.

    •EvaluatingS=S1∪S2,S=S1∩S2 andS=S1\S2 allrequire O((|S1| + 1)(|S2| + 1)) time.

    • Evaluating the statement “x ∈ S” requires O(|S|+1) time and returns TRUE if x is in S and FALSE otherwise.
    • You may write pseudocode in the form ∀x ∈ S do

      W ork(x)
      where Work(x) is some code dependent upon x.

      If S = {x1 …,xk}, the time required to run this code will be O(1) plus thetotaloftheworkrequiredtorunallofWork(x1), Work(x2),…,Work(xk).

      Example 1: The following code finds the unique items in array A: Create Set S

      For i = 1 to n do S = S ∪ {A[i]}

      It uses O(n · m) time, where m is the number of unique items in A[1 . . . n].

      Example 2: The following code constructs S1 ∩ S2 : Create Set S

      ∀x∈S1 do Ifx∈S2 then

      S = S ∪ {x}
      It runs in O((|S1| + 1)(|S2| + 1)) time.

4

Problem 2 [28 pts] Indicator Random Variables

A[1 . . . n] is an array. In addition to A you are given 2 other structures that are built from A: a binary tree T and a m×m 2D matrix M. In each of these structure we also define N(v), the set of neighbors of item v. Each of these structures and their neighbor definitions are defined below. See the diagram on the next page for examples.

  1. (1)  Array A: This is just the standard array. Neighbors are the items to the left and right of i.

    {i−1,i+1} if1<i<n 

    N(i)= {2} ifi=1 {n − 1} if i = n

  2. (2)  Binary Tree T :
    Inthiscaseweassumethatn=2k −1forsomeintegerk≥0.
    The correspondence between A and T is given below (and is also in the Heapsort slides page 9 onwards. Please see those for more expla- nation).
    Each node in the tree corresponds to some element of array A. Node i will have value T [i] = A[i].

    The root of the tree is node 1, and has no parent.
    The parent of node i ̸= 1 is node ⌊i/2⌋. The left and right children of node i are nodes 2i and 2i + 1, if they exist

    {2, 3} if v = 1 

    NT(i)= {2i,2i+1,⌊i/2⌋} if1<i<(n+1)/2 {⌊i/2⌋} if (n + 1)/2 ≤ i ≤ n

(3) m×m 2D matrix M.

In this case we assume that n = m2.
M(i, j) = A[(i − 1)m + j]. An element’s neighbors are its (at most 4) vertical and horizontal neighbors. That is,

NM((i,j))={(i+1,j),(i−1,j),(i,j+1),(i,j−1)}􏰘{(x,y) : 1≤x≤m,1≤y≤m}.

Now solve the six subproblems below. See the end of this problem for an explanation of what “Derive the expected number” means and how your solutions should be structured.

5

Local minima are shaded gray; saddle points, light blue. 1 2

T

1 2 3 4 5 6 7 8 9101112131415

2 13 12 8 3 11 9 1 7 10 6 4 5 15 14

i

A[i]

A has 5 local minima and 6 saddle points; T has 7 local minima and 2 saddle points.

3 12

213
48 53 611 79

81 97101011612413514151514

ij 1 2 3 4 M

16

6

14

7

11

10

13

4

1

8

15

9

3

2

12

5

1 i 162 A[i] 53 4

A has 6 local minima and 4 saddle points; M has 5 local minima and 3 saddle points.

For all of the six problems assume that A is a uniform random permu- tation of < 1,2,…,n > . This means that every one of the n! possible permutations is equally likely to occur.

The corresponding T and M are built from A as previously described. The solutions to all six of the problems must use the indicator

random variable technique to derive the result. (a) Counting local minima.

(i) A[i] is a local minimum in A if it is smaller than all of its neighbors, i.e, A[i] < minv∈N(i) A[v].
Derive the expected number of local minima in A.
Hint: Let Xi be the indicator random variable for the event that A[i] is a local minimum. What is E(Xi) (this might depend upon i)? How does knowing the E(Xi) let you answer the problem?

(ii) T [i] is a local minimum in T if it is smaller than all of its neighbors, i.e, T [i] < minv∈NT (i) T [v].
Derive the expected number of local minima in T.

(iii) M(i,j) is a local minimum in M if it is smaller than all of its neighbors, i.e, M(i, j) < min(x,y)∈NM (i,j) M(x, y).
Derive the expected number of local minima in M.

1 2 3 4 5 6 7 8 9101112131415

1661471110134 1 8159 3 212

6

(b) Counting Saddle points.

(i) A[i]isasaddlepointinAifi̸=1,i̸=n,andA[i]islargerthanone of its neighbors and smaller than the other one of its neighbors. Derive the expected number of saddle points in A.

(ii) T[i]isasaddlepointinT if1<i<(n+1)/2andT[i]issmaller than its parent but larger than both of its children.
Derive the expected number of saddle points in T.

(iii) M(i,j) is a saddle point in M if M(i,j) is smaller than all of its neighbors in the same row and larger than all of its neighbors in the same column.
Derive the expected number of saddle points in M.

Structure of the Solution: The solution to each of the six subprob- lems must be written separately, with space between the solution to each subproblem

For each subproblem, you should derive the expected number of local min- ima or saddle points. as follows, using a three part solution:

(A) First clearly define the indicator random variable(s) that you will be using to solve this problem.

(B) Next, derive the expectation of each such indicator random variable. (C) Finally, solve the full subproblem, by deriving a closed formula in n (no

summations “􏰑” allowed) for the requested value.

7

Problem 3 [24 pts] (Using Selection)

Note: Recall the Selection problem of calculating Select(A, p, r, i) that we learned in class. We actually learned a randomized linear, i.e., O(r − p + 1), time algorithm for solving Selection. We also stated that a deterministic linear time algorithm for solving this problem exists. For the sake of this problem, you may assume that you have been given this deterministic linear time algorithm and can use it as a subroutine. (calling it exactly as Select(A, p, r, i)). You may not use any other algorithm as a subroutine without explicitly providing the code and proving correctness from scratch.

The Manhattan Distance between two 2-dimensional points p = (x, y) and p′ =(x′,y′)is

d1(p, p′) = |x − x′| + |y − y′|.
In what follows you are given real numbers x1,…,xn and y1,…,yn. They

are NOT necessarily sorted.

Define the n two-dimensional points pi = (xi,yi). The Manhattan Ware- house problem is to find a point p = (x,y) that minimizes the average distance to all the pi. That is, to find p such that

nn
∀p′ = (x′, y′) ∈ R2, 􏰇 d1(p, pi) ≤ 􏰇 d1(p′, pi).

i=1 i=1 (Dividing both sides of the equation by n1 gives the average.)

The goal of this problem is to design a O(n) worst case time algorithm for solving the Manhattan Warehouse problem. We split this into two parts.

(A) Solve the one-dimensional problem.
Given (unsorted) x1, . . . xn, define the function

n
f (x) = 􏰇 |x − xi |.

i=1 Call x ̄ a center if it minimizes f(x), i.e.,

∀x ∈ R, f(x ̄) ≤ f(x).

(a) Prove that there exist z1, z2 such that z1 ≤ z2 and
(i) f(x) is monotonically decreasing for x ∈ [−∞,z1].

(ii) f(z1) = f(x) = f(z2) for all x ∈ [z1,z2].
(iii) f(x) is monotonically increasing for x ∈ [z2,∞].

Note that it is possible that z1 = z2. 8

(b) GiveanO(n)timealgorithmforfindingacenterofnone-dimensional points x1,…,xn.
First give documented pseudocode for your algorithm.
In addition, below your algorithm, explain in words/symbols what your algorithm is doing

(c) Prove correctness of your algorithm. Your proof must be mathe- matically formal and understandable.

(d) Explain why your algorithm runs in O(n) time. (B) Solve the Manhattan Warehouse Problem.

(a) Give an O(n) time algorithm for solving the Manhattan Ware- house Problem given n points p1, p2, . . . , pn where pi = (xi, yi). First give documented pseudocode for your algorithm.
In addition, below your algorithm, explain in words/symbols what your algorithm is doing

(b) Prove correctness of your algorithm. Your proof must be mathe- matically formal and understandable.

(c) Explain why your algorithm runs in O(n) time. Notes:

  • For simplicity, you may assume that none of the xi or yi repeat, i.e., there is no pair i,j such that xi =xj or yi =yj.
  • Part A(a) is only meant as a hint to get you started. Note that if you know z1,z2, then (you must prove this) any x ∈ [z1,z2] will be a solution to the one-dimensional center problem.

    A more detailed understanding of the behavior of f(x), that lets you find z1,z2, therefore permits solving the remainder of A. You may, if you like, fully analyze the behavior of f(x) in A(a), write this as a mathematical lemma and then quote that lemma in A(c).

  • We strongly recommend that before trying to solve part A you choose 5 or 6 points xi and then graph the resultant function f(x). Seeing the diagram should help you solve the problem.
  • The algorithm for part (B) can use the result of part (A) as a subroutine.
  • The main work in solving this problem is in understanding how to solve it using selection. Once you understand that, the actual pseudocode might be quite short.

9

P4: [18 pts] Recurrence Relations

Give asymptotic upper bounds for T (n) satisfying the following recurrences. Make your bounds as tight as possible. For example, if T(n) = Θ(n2) then T (n) = O(n2) is a tight upper bound but T (n) = O(n2 log n) is not. Your upper bound should be written in the form T(n) = O(nα(logn)β) where α, β are appropriate constants.

A correct answer will gain full credits. It is not necessary to show your work BUT, if your answer was wrong, showing your work steps may gain you partial credits.

If showing your work, you may quote theorems shown in class. For (a), (b) and (c) you may assume that n is a power of 2; for (d) you may assume that n is a power of 4;
for (e) that n is a power of 7; for (f) that n is a power of 3

(a) T(1)=1;T(n)=4T(n/2)+n2√nforn>1. (b) T(1)=1;T(n)=9T(n/2)+n3log2nforn>1.

(c) T(1)=1;T(n)=4T(n/2)+􏰑ni=1iforn>1. (d) T(1)=1;T(n)=3T(n/4)+1forn>1.

(e) T(1)=1;T(n)=2T(n/7)+2log7n forn>1. (f) T(1)=1;T(n)=9T(n/3)+log3(n!)forn>1.

10

  • HW2-yuvbh4.zip