COMP352 Assignment4-CVR Solved

35.00 $

Category:

Description

5/5 - (1 vote)

Assume a hash table utilizes an array of 13 elements and that collisions are handled by separate chaining. Considering the hash function is defined as: h(k)=k mod 13.

  1. Draw the contents of the table after inserting elements with the following keys: 32, 147, 265, 195, 207, 180, 21, 16, 189, 202, 91, 94, 162, 75, 37, 77, 81, 48.
  2. What is the maximum number of collisions caused by the above insertions?

Question 2

To reduce the maximum number of collisions in the hash table described in Question 1 above, someone proposed the use of a larger array of 15 elements (that is roughly 15% bigger) and of course modifying the hash function to:  h(k)=k mod 15. The idea is to reduce the load factor and hence the number of collisions.

Does this proposal hold any validity to it? If yes, indicate why such modifications would actually reduce the number of collisions. If no, indicate clearly the reasons you believe/think that such proposal is senseless.

Question 3

Assume an open addressing hash table implementation, where the size of the array is N = 19, and that double hashing is performed for collision handling. The second hash function is defined as:                                                        d(k) = q – k mod q,

where k is the key being inserted in the table and the prime number q is = 7. Use simple modular operation (k mod N) for the first hash function.

  1. Show the content of the table after performing the following operations, in order: put(25), put(12), put(42), put(31), put(35), put(39), remove(31), put(48), remove(25), put(18), put(29), put(29), put(35).
  2. What is the size of the longest cluster caused by the above insertions? iii) What is the number of occurred collisions as a result of the above operations? iv)            What is the current value of the table’s load factor?

 

Question 4 Given the following tree, which is assumed to be an AVL tree:

  1. i) Are there any errors with the tree as shown? If so, indicate what the error(s) are, correct these error(s) [you should attempt applying the smallest possible number of changes to correct the tree], show the corrected AVL tree, then proceed to the following questions (Questions ii to iv) and start with the tree that you have just corrected. If no errors are there in the above tree, indicate why the tree is correctly an AVL tree, then proceed to the following questions (Questions ii to iv) and continue working on the tree as shown above. ii) Show the AVL tree after put(74) operation is performed. Give the complexity of this operation in terms of Big-O notation.
  • Show the AVL tree after remove(62) is performed. Give the complexity of this operation in terms of Big-O notation.
  1. Show the AVL tree after remove(93) is performed. Show the progress of your work step-by-step. Give the complexity of this operation in terms of Big-O notation.

Question 5

Show the steps that a radix sort takes when sorting the following array of integer keys:

832   91   411   172   243   573   326   292   682   489   96

Part 2: Programming Question 

CARFAX, is a commercial service that supplies vehicle history reports to individuals and businesses on used cars and light trucks for the American and Canadian consumers. It keeps records of all Vehicle Identification Number (VIN). Where each VIN is unique and consists of alphanumeric characters (e.g. 4DUSR6IQ8LN209176). The composition of Vehicle Identification Number is unique for each vehicle with maximum length is restricted to 17 alphanumeric characters.  CARFAX has some lists that are local for cities and areas, where n counts a few hundred properties. Others are at the provincial/ state level, that is n counts tens of thousands or more, or even at country levels, that is n counts millions or more. Furthermore, it is important to have access to the vehicle history. Such a historical record for a Vehicle Identification Number should be kept in reverse chronological order (i.e.  from most recent)

CARFAX, needs your help to design a Customized “Vehicle history Report listing” data structure called CVR. Keys of CVR entries are vehicle identification numbers, that are strings composed of 10-17 alphanumeric characters, and one can retrieve the key of a CVR or access a single element by its key. Furthermore, similar to sequences, given a vehicle identification number in a CVR one can access its predecessor or successor (if it exists).

CVR adapts to its usage and keeps the balance between memory and runtime requirements. For instance, if a CVR contains only a small number of entries (e.g., few hundreds), it might use less memory overhead but slower (sorting) algorithms. On the other hand, if the number of contained entries is large (greater than 1000 or even in the range of tens of thousands of elements or more), it might have a higher memory requirement but faster (sorting) algorithms. CVR might be almost constant in size or might grow and/or shrink dynamically. Ideally, operations applicable to a single CVR entry should be between O (1) and O (log n) but never worse than O(n). Operations applicable to a complete CVR should not exceed O(n2).

You are asked to design and implement CVR, a customized data structure which automatically adapts to the dynamic content that it operates on. In other words, it accepts the size (total number of Vehicle Identification Number) as a parameter and uses an appropriate (set of) data structure(s) from the one(s) studied in class in order to perform the operations below efficiently[1].

The CVR must implement the following methods:

  • setThreshold(Threshold): where 100 ≤ Threshold ≤ ~900,000 is an integer number that defines when a listing should be implemented with a data structure such as a Tree,

Hash Table, AVL tree, binary tree, if its size is greater than or equal to value of Threshold. Otherwise it is implemented as a Sequence.

  • setKeyLength(Length): where 10 ≤ Length ≤ 17 is an integer number that defines the fixed string length of keys.
  • generate(n): randomly generates a sequence containing n new non-existing keys of alphanumeric characters.
  • allKeys(): return all keys as a sorted sequence (lexicographic order)
  • add(key,value2): add an entry for the given key and value
  • remove(key): remove the entry for the given key
  • getValues(key): return the values of the given key nextKey(key): return the key for the successor of key.
  • prevKey(key): return the key for the predecessor of key
  • prevAccids(key): returns a sequence (sorted in reverse chronological order) of accidents(previously) registered with the given key (dates).
  1. Write the pseudo code for each of the methods above.
  2. Write the Java code that implements the methods above.
  3. Discuss how both the time and space complexity change for each of the methods above if the underlying structure of your CVR is an array or a linked list?

You have to submit the following deliverables:

  1. A detailed report about your design decisions and specification of your CVR ADT including a rationale and comments about assumptions and semantics.
  2. Well-formatted and documented Java source code and the corresponding class files with the implemented algorithms.
  3. Demonstrate the functionality of your CVR by documenting at least 10 different but representative data sets. These examples should demonstrate all cases of your CVR ADT functionality (e., all operations of your ADT for different sizes).
  • Create one zip file, containing the necessary files (.java, .doc, .html, etc.). Please name your file following this convention: If the work is done by 1 student: Your file should be called a#_studentID, where # is the number of the assignment studentID is your student ID number.

If the work is done by 2 students: The zip file should be called a#_studentID1_studentID2, where # is the number of the assignment, and studentID1 and studentID2 are the ID numbers of each student.

  • If working in a group, only one of the team members can submit the programming part. Do not upload 2 copies.

Very Important: Again, the assignment must be submitted in the right folder of the assignments. Depending on your section, you will either upload to EAS or to Moodle (your instructor will indicate which one to use). Assignments uploaded to an incorrect folder will not be marked and result in a zero mark. No resubmissions will be allowed.

ð Additionally, for the programming part of the assignment, an online demo maybe required (please refer to the courser outline for full details). In such case, the markers will inform you about the demo times and how it will take place online. If an online demo is required, please notice that failing to demo your assignment will result in zero mark regardless of your submission. If working in a team, both members of the team must be present during the online demo time. More details on these online demos will be provided to you around the due date of the assignment.  

[1] The lower the memory and runtime requirements of the ADT and its operations, the better will be your grades. 2 Value here could be any feature of the vehicle’s report.

  • Assignment-4-CVR-a82ngc.zip