Part 1: Words Rhyme Sorting and Suffix Sharing (50 Points)
In this part, you are going to implement the QuickSort algorithm and use it to sort words based on rhyme (i.e., the correspondence of sound between the endings of words). In addition, you will then group the words that share suffixes (e.g., the two words “Sorting ” and “Sharing ” share the three-character suffix: “-ing”). Given a list of words, you are required to provide the following two operations:
- Rhyme Operation : Sort all words in a rhyming order (25 points):
To accomplish this task you need to apply the following steps:
- Read the input list of words into an array of strings.
- Reverse the letters in each word (e.g., “confound” becomes “dnuofnoc”). Capitalization does not matter.
- Sort the resulting array of words using your own implementation of QuickSort.
- Reverse the letters in each word back to their original state.
- Print the words to the output file.
Example 1: Rhyming Order:
Given the list of words: “compound”, “crabbit”, “confound”, “flashpit”, “astound”, and “trampit” The correct rhyming order of these words is: confound – compound – astound – crabbit – flashpit – trampit
- Suffix Sharing Operation (25 points):
For this operation, you need to Identify the suffixes which have at least k words ending with it (i.e., where k is an input argument). You will group words based on suffix sharing. You need to make use of use of the rhyme order computed previously in this operation. This will enable you to implement this operation efficiently without the need to do a brute force check.
Example 2: Suffix Sharing (k=2)
Given: “rabbit”, “crabbit”, “flashpit”, “trampit” and k=2, will yield the following suffixes: t – crabbit, flashpit, rabbit, trampit it – crabbit, flashpit, rabbit, trampit bit – crabbit, rabbit pit – flashpit, trampit bbit - crabbit, rabbit abbit - crabbit, rabbit
rabbit - crabbit, rabbit
Note that if k=3 for the same list of words, output would be:
t – crabbit, flashpit, rabbit, trampit
it – crabbit, flashpit, rabbit, trampit
For both operations, you will need to apply the sorting on the reversed version of the words (i.e., for the word: “astound” -> “dnuotsa”). You may implement this reverse function or use the string library for C++ or Java. However, you need to provide a full implementation of the QuickSort algorithm, which should work in-place.
In your implementation of the QuickSort, you do not necessarily need to implement string comparing functions, You can use strcmp functions in C++. Also, in Java, just make sure you implements the Comparator interface, so that you can use the (<,>) operators against strings. For the Suffix share implementation, you may want to use a hashmap to store the word lists associated with each suffix. Feel free to use the data structure that comes with the language library (C++ or Java). The test cases will measure the correctness and performance of your implementation using different input sequences.
Sample Input/output format:
- The first integer is a 1 or a 2 indicating this is the input file for part 1 or 2, respectively;
- If it is part 1, second line integer implies required operation 1: rhyme or 2: suffix
- If it is rhyme operation: Third line integer implies that there are n lines following (i.e., n words).
- If it is suffix operation: Third line integer implies k and then the fourth line integer implies that there are n lines following (i.e., n words). ● After that, each line will have one word.
Separate test cases will be given for the two operations and you will be graded independently. For example, if you only complete the Rhyme order, you will get 25pts.
|Sample Input for operation 1: RhymeOrder||Sample Output|
crabbit confound flashpit astound trampit
|confound compound astound crabbit flashpit
|Sample Input for operation 2: SuffixShare||Sample Output|
3 4 rabbit crabbit flashpit trampit
|t -> [crabbit, flashpit, rabbit, trampit] it -> [crabbit, flashpit, rabbit, trampit]|
Part 2: Binary Search Tree (BST) (50 Points)
A Binary Search Tree (BST) is a tree data structure in which each node has at most two children, which are referred to as the left child and the right child and the topmost node in the tree is called the root. It additionally satisfies the binary search property, which states that the key in each node must be greater than or equal to any key stored in the left sub-tree, and less than to any key stored in the right subtree.
For this project, you can assume keys and values are the same (so maintaining the key is enough). The keys will be integers and unique. You are required to complete the following tasks.
Basic Operation: (10 pts)
- Insert a key into BST: insertKey(int key)- A new key is always inserted at a leaf. We start searching for a key from the root until we hit a leaf node. Once a leaf node is found, the new node is added as a child of the leaf node.
- Delete a key: deleteKey (int key)- Remove the key node from the tree. If the key is not in the tree, do nothing.
- Search a key: searchKey(int key) – Search for the key in the BST; return true if found, return false otherwise. This function will be used by you elsewhere.
- Range Sum: rangeSum(int left, int right)- Return the summation of all
the keys in the BST falls in the range [left, right] inclusive. For example, in the above case, the rangeSum(11,17) will return 26 [12 14]. Print “none” to file if no elements found.
- Height of a node: int height(int key) – find and print the height of a node in BST. If a node with the key is not found, print “none” to file. For example, the 14 node in the tree above has height 3.
- Postorder Traversal Print : – postorder() – get the Postorder traversal of the BST. For the above figure, Postorder (Left, Right, Root) should print “4 12 8 22 20”.
- Level Order Traversal Print : – levelorder() – get the Breadth First or Level order traversal of the BST. For the above figure, Breadth First or Level Order Traversal should print “20 8 22 4 12”. One way to implement level order is to use a queue. For all of the above traversal, print “none” in file if no elements present in tree.
- Finding the lowest common ancestor (LCA) between two nodes in BST:
int LCA(int key1, int key2)
Let T be a rooted tree. The lowest common ancestor between two nodes n1 and n2 is defined as the lowest node in T that has both n1 and n2 as descendants (where we allow a node to be a descendant of itself).
The LCA of n1 and n2 in T is the shared ancestor of n1 and n2 that is located farthest from the root.
For example in the above graph,
LCA of 10 and 14 is 12
LCA of 14 and 8 is 8
LCA of 10 and 22 is 20
If any key is invalid or not present, print “none” in file.
- Find Floor and Ceil of a key in BST : Floor(int key) Floor of key means, given a key you need to find the node equal or the largest key smaller than given key. For example, in the above example,
Floor(21) prints “20”
Floor(4) prints “4”
Floor(3) prints “none”
ceil(int key) Ceil of key means, given a key you need to find the node equal or the smallest key larger than given key. For example, in the above example,
Ceil(21) prints “22”
Ceil(4) prints “4”
Ceil(22) prints “none”
- Determining the distance between pairs of nodes in a tree: dist(int key1, int key2)
Given two keys of BST, the distance between two nodes is the minimum number of edges to be traversed to reach one node from other. For example in the above example,
dist(8, 22) is 2 dist(10,22) is 4
If any key is not valid, print “none” in file.
Red Black Tree Insertion (15pts):
In this part, modify your code to handle Red Black Tree insertion portion only. There will be no deletion, therefore your tree will remain balanced. The insert method used may be slightly different for the C++ and the Java sections. Please follow the instructions accordingly.
- Insert to maintain balance in tree : insertRB(int key): Follow the insertion procedure described in Data Structures and Algorithms in C++, 2nd
Edition, M. T. Goodrich, R. Tamassia, D. M. Mount, February
Sample example from book,
|insertRB 4 insertRB 7 insertRB 12 insertRB 15 insertRB 3 insertRB 5 insertRB 14 insertRB 18 insertRB 16 insertRB 17|
- Insert to maintain balance in tree : insertRB(int key): Follow the insertion procedure described in section 3.3 of Algorithms, 4th Edition by Robert Sedgewick and
Sample example from book,
- Black Height of a Red-Black Tree : Black height is number of black nodes on a path from a node to a leaf (excluding given node). Leaf nodes are also counted black nodes.
In the above scenario, getBlackHeight(14)would print “2” to file. Print “none” if it’s an invalid key.
Sample Input/output format:
- The first integer 1 or 2 indicates this is the input file for part 1 or 2.
- The second integer n implies that there are n lines below.
- You can assume all numbers will be of Integer datatype.
- The command and corresponding input/output can be understandable from the sample test cases given bellow.
- For part2, each line of output should end with \r\n and there will be an additional line at the end. Match your output exactly with provided test cases.
Test cases will be given for each of the 12 functionalities listed above. Your score will be based on how many complete units you are passing. For example, if you only complete basic operations, you will get 10pts. If you do basic and traversals you would get total of 25 pts.
|Sample Input||Sample Output|
|2 17 insert 20 insert 22 insert 8 insert 12 insert 4 insert 10 insert 14 height 10 height 22 height 20 height 12 delete 10 delete 15 search 14 search 15 range 1 22 range 11 17
0 2 deleted deletion failed found not found
|2 9 postorder levelorder insert 20 insert 22 insert 8 insert 12 insert 4 postorder levelorder
4 12 8 22 20
20 8 22 4 12
|2 19 insert 20 insert 22 insert 8 insert 12 insert 4 insert 10 insert 14 lca 4 10 lca 20 14 lca 11 20 ceil 21 ceil 22 ceil 23 floor 5 floor 4 floor 3 dist 10 22 dist 4 14 dist 1 0||8 20 none 22 22 none 4 4 none 4 3 none
|Section||RB Tree Sample Input||RB Tree Sample Output|
|C++||2 13 insertRB 4 insertRB 7 insertRB 12 insertRB 15||2
14 7 16 4 12 15 18 3
|insertRB 3 insertRB 5 insertRB 14 insertRB 18 insertRB 16 insertRB 17 Bheight 14 Bheight 16 levelorder|
|Java||2 13 insertRB 1 insertRB 2 insertRB 3 insertRB 4 insertRB 5 insertRB 6 insertRB 7 insertRB 8 insertRB 9 insertRB 10 Bheight 1 Bheight 4 levelorder
4 2 8 1 3 6 10 5 7 9
Part 3: Programming Environment and Grading
Assignments will be tested in a Linux environment. You will be able to work on the assignments using the Linux workstations in HAAS and LAWSON (use your username and password).
- Compiler we use: javac (v10.0.2)
- File to submit: Rhymer.java, BinarySearchTree.java, Node.java, Main.java
Your project must compile using the javac compiler (v10.0.2) on data.cs.purdue.edu. Main.java is provided to you so that you can generate output file on your own to see if your code works correctly, you may change it as your wish.
Compiling process: following commands assume you are at project root and you have NOT changed project file structure! ● To compile Main:
- To run Main: java -cp src/ Main InputFilePath OutputFilePath
Note: You should replace InputFilePath OutputFilePath with actual file path in the above command.
- Compiling your program with JUnit test files using javac v10.0.2.
- Running JUnit tests. It read input files and manipulates data structures implemented by you and see if it works as expected. Any forms of difference between your output and expected output will cause points deduction.
- Inspecting your source code.
The compilation will be done using g++ and makefiles. You must submit all the source code as well as the Makefile that compiles your provided source code into an executable named “program”.
Your project must compile using the standard g++ compiler (v 4.9.2) on data.cs.purdue.edu. For convenience, you are provided with a template Makefile and C++ source file. You are allowed to modify such file at your convenience as long as it follows the I/O specification. Note some latest features from C++14 are not available in g++ 4.9.
The grading process consists of:
- Compiling and building your program using your supplied makefile.
- The name of the produced executable program must be “program” (must be lowercase)
- Running your program automatically with several test input files we have pre-made according to the strict input file format of the project and verifying correct output files thus follow the above instruction for output precisely – do not “embellish” the output with additional characters or formatting – if your program produces different output such as extra prompt and space, points will be deducted.
- Inspecting your source code.
The input to the programming projects will be via the command line:
program input-test1.txt output-test1.txt
The file output-test1.txt will be tested for proper output.