Description
Problem 1: Pattern-matching: The brute-force
Problem 1.1: The brute-force pattern-matching algorithm
Describe a text D and a pattern P such that the brute-force pattern-matching algorithm runs in Ω(dp) time.The lengths of D and P are d and p, respectively.
Problem 1.2: Python’s str class and pattern-matching
In this part, you are asked to modify three pattern matching programs given to you (See ap- pendix). Run your modified programs for varying-length patterns and show your results.
The count method in Python’s str class takes a text D and a pattern P and returns the maximum number of non-overlapping occurrences of a P within D. As an example ‘cdcdcd- cdc’.count(‘cdc’) returns 2.
- Modify the brute-force pattern-matching to return non-overlapping occurrences of a P within D.
- Similar to the previous question (Problem 1.2.1), do the same on the Boyer-Moore pro- gram.
- Similar to problem 1.2.1, modify the KMP program.
Problem 2: Experimental Analysis of Pattern-Matching Algorithms
Perform an experimental analysis of pattern matching algorithms in terms of:
- Number of character comparison: Perform an experimental analysis of the efficiency of the brute-force, the KMP and Boyer-Moore pattern matching algorithms for varying-length patterns.
- Relative speed comparison: Perform an experimental comparison of the brute-force, KMP, and Boyer-Moore pattern-matching algorithms. Run each algorithm against large text doc- uments using varying-length patterns and report the relative running times.
Assignment No 5 Page 2
Problem 3: Matrix-chain Multiplication
The matrix-chain multiplication problem: Given a chain of < D1, D2, . . . , Dn > of n matrices fully parenthesize the product < D1 · D2 · · · Dn > in a way so that the number of scalar multiplications isminimized. EachDi hasapi−1 ×pi dimensionandi=1,2,…,n.
1. The Brute-Force: [10 pt.]: Implement a Python program to solve the matrix-chain multipli- cation problem by the brute force algorithm.
2. Bottom-up Dynamic Programming [20 pt.]: Implement a Python program to solve the matrix-chain multiplication problem using bottom-up dynamic programming approach.
3. Dynamic Programming with Memoization [Extra Credit, 10 pt.]: Implement a Python pro- gram to solve the matrix-chain multiplication problem using dynamic programming with memoization.
Problem 4: Longest Common Sub-sequence (LCS) Problem [20 pt.]
Implement a Python program to solve LCS problem using dynamic programming. Run your program to find the best sequence alignment between DNA strings. Show your results.
Longest Common Sub-sequence (LCS) problem: Given two character strings over some alphabet, find a longest string that is a sub-sequence of given two strings.
Data source: https://www.ncbi.nlm.nih.gov/genbank/
Directions
Please follow the syllabus guidelines in turning in your homework. While testing your programs, run them with a variety of inputs and discuss your findings. This homework is due Sunday, Nov 14, 2021 10:00pm. OBSERVE THE TIME. Absolutely no homework will be accepted after that time. All the work should be your own.
Assignment No 5 Page 3
1 2 3 4 5 6 7 8 9
10 11 12
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Appendix
Python program for the Brute-Force pattern-matching algorithm
# Brute force
def find_brute(T, P):
n, m = len(T), len(P)
# every starting position
for
i in range(n-m+1):
k=0
# conduct O(k) comparisons while k < m and T[i+k] == P[k]:
k += 1 ifk==m:
return i return -1
Python program for the Boyer-Moore pattern-matching algorithm
# Boyer-Moore
def find_boyer_moore(T, P): n, m = len(T), len(P) ifm==0:
return 0 last = {}
for k in range(m): last[P[k]] = k
i = m-1
k = m-1 whilei<n:
# If match , decrease i,k
if T[i] == P[k]: if k == 0:
return i else:
# Not
else: j i k return -1
i -= 1
k -= 1
match , reset the positions
= last.get(T[i], -1) += m – min(k, j+1)
= m-1
Assignment No 5
Page 4
Python program for the Knuth-Morris-Pratt pattern-matching algorithm
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19 20
# KMP failure function
def
compute_kmp_fail(P): m = len(P) fail=[0]*m
j=1
k=0 whilej<m:
if P[j] == P[k]: fail[j] = k+1
- j  += 1
- k  += 1
elif k > 0:
k = fail[k-1] else:
j +=1 return fail
# KMP
def find_kmp(T, P):
n, m = len(T), len(P) ifm==0:
return 0
fail = compute_kmp_fail(P) # print(fail)
j=0
k=0
whilej<n:
if T[j] == P[k]:
j
k else: j return -1
+=
1 1 0:
k elif k >
if k
== m-1: return j-m+1
+=
= fail[k-1] += 1
Assignment No 5
Page 5