CIND110 Assignment3-Association rules Solved

  1. Association rules:

One of the major techniques in data mining involves the discovery of association rules. These rules correlate the presence of a set of items with another range of values for another set of variables. The database in this context is regarded as a collection of transactions, each involving a set of items, as shown below.

Trans ID      Items Purchased

  • Meat, Potato, Onion
  • Meat, Noodle
  • Noodle, Spinach
  • Meat, Potato, Onion
  • Onion, Potato, Noodle
  • Eggs, Spinach
  • Eggs, Noodle
  • Meat, Potato, Salt, Onion
  • Salt, Spinach
  • Meat, Potato

  • Apply the Apriori algorithm to this dataset.

Note that the set of items is {Meat, Potato, Onion, Noodle, Spinach, Eggs, Salt}. You may use 0.3 for the minimum support value.
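A minimal sketch of how the Apriori level-wise search could be run on the ten transactions listed above, assuming that support is the fraction of transactions containing the itemset and that an itemset is frequent when its support is at least 0.3. The names `TRANSACTIONS`, `MIN_SUPPORT`, `support` and `apriori` are illustrative and do not come from the assignment.

```python
from fractions import Fraction
from itertools import combinations

# The ten transactions from the table, in the order listed.
TRANSACTIONS = [
    {"Meat", "Potato", "Onion"},
    {"Meat", "Noodle"},
    {"Noodle", "Spinach"},
    {"Meat", "Potato", "Onion"},
    {"Onion", "Potato", "Noodle"},
    {"Eggs", "Spinach"},
    {"Eggs", "Noodle"},
    {"Meat", "Potato", "Salt", "Onion"},
    {"Salt", "Spinach"},
    {"Meat", "Potato"},
]

# Exact fraction avoids floating-point surprises at the 0.3 threshold.
MIN_SUPPORT = Fraction(3, 10)


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def apriori(transactions, min_support):
    """Return {frequent itemset: support}, built level by level."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_support]
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset, transactions)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        previous = set(level)
        level = [c for c in candidates
                 if all(frozenset(s) in previous for s in combinations(c, k - 1))
                 and support(c, transactions) >= min_support]
    return frequent


if __name__ == "__main__":
    for itemset, sup in sorted(apriori(TRANSACTIONS, MIN_SUPPORT).items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), float(sup))
```

Using `Fraction` rather than floats is a design choice of this sketch, so that an itemset appearing in exactly 3 of 10 transactions is compared against the threshold exactly.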

  • Show the rules that have a confidence of 0.8 or greater for an itemset containing three items.
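As a rough illustration of the confidence check, the sketch below enumerates every rule X → Y that can be formed from one three-item itemset and keeps those with confidence of at least 0.8, where confidence is taken as count(X ∪ Y) / count(X). It assumes the itemset {Meat, Potato, Onion}, which is the three-item itemset that reaches the 0.3 minimum support when the ten transactions are counted by hand. The helper names `count` and `rules` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# The ten transactions from the table, in the order listed.
TRANSACTIONS = [
    {"Meat", "Potato", "Onion"},
    {"Meat", "Noodle"},
    {"Noodle", "Spinach"},
    {"Meat", "Potato", "Onion"},
    {"Onion", "Potato", "Noodle"},
    {"Eggs", "Spinach"},
    {"Eggs", "Noodle"},
    {"Meat", "Potato", "Salt", "Onion"},
    {"Salt", "Spinach"},
    {"Meat", "Potato"},
]


def count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


def rules(itemset, min_confidence):
    """All rules lhs -> rhs from `itemset` with confidence >= min_confidence."""
    itemset = frozenset(itemset)
    kept = []
    for size in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), size):
            lhs = frozenset(lhs)
            confidence = Fraction(count(itemset), count(lhs))
            if confidence >= min_confidence:
                kept.append((sorted(lhs), sorted(itemset - lhs), confidence))
    return kept


if __name__ == "__main__":
    # Assumed itemset and the 0.8 threshold from the question.
    for lhs, rhs, conf in rules({"Meat", "Potato", "Onion"}, Fraction(4, 5)):
        print(lhs, "->", rhs, "confidence", float(conf))
```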
  2. Classification:

Classification is the process of learning a model that describes different classes of data; the classes must be predetermined. Consider the following set of data records:

ID   Age    City  Gender  Education    Profile
101  20-30  NY    F       College      Employed
102  31-40  NY    F       College      Employed
103  51-60  NY    F       College      Unemployed
104  20-30  LA    M       High School  Unemployed
105  41-50  NY    F       College      Employed
106  41-50  NY    F       Graduate     Employed
107  20-30  LA    M       College      Employed
108  20-30  NY    F       High School  Unemployed
109  20-30  NY    F       College      Employed
110  51-60  SF    M       College      Unemployed

 

 

Assuming that the class attribute is Profile, apply a classification algorithm to this dataset.
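The question does not name a particular algorithm. One possible choice is a decision tree built in the ID3 style, and the sketch below shows only its first step under that assumption: computing the information gain of each attribute with respect to Profile in order to pick the root split. The names `RECORDS`, `ATTRIBUTES`, `entropy` and `information_gain` are illustrative.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Age", "City", "Gender", "Education"]

# Records 101-110 from the table; the last field is the class (Profile).
RECORDS = [
    ("20-30", "NY", "F", "College", "Employed"),
    ("31-40", "NY", "F", "College", "Employed"),
    ("51-60", "NY", "F", "College", "Unemployed"),
    ("20-30", "LA", "M", "High School", "Unemployed"),
    ("41-50", "NY", "F", "College", "Employed"),
    ("41-50", "NY", "F", "Graduate", "Employed"),
    ("20-30", "LA", "M", "College", "Employed"),
    ("20-30", "NY", "F", "High School", "Unemployed"),
    ("20-30", "NY", "F", "College", "Employed"),
    ("51-60", "SF", "M", "College", "Unemployed"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(records, attr_index):
    """Entropy of the class minus the weighted entropy after splitting."""
    labels = [r[-1] for r in records]
    groups = {}
    for r in records:
        groups.setdefault(r[attr_index], []).append(r[-1])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


gains = {name: information_gain(RECORDS, i) for i, name in enumerate(ATTRIBUTES)}

if __name__ == "__main__":
    for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {gain:.4f}")
```

On this table the calculation puts Age first (gain of roughly 0.49), followed by Education. A full ID3 tree would repeat the same computation inside each branch that still mixes both classes.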

  3. Clustering: Consider the following set of two-dimensional records:

 

RID  Age  Years of Service
101  30   5
102  50   25
103  50   15
104  25   5
105  30   10
106  55   25

 

  • Marks:

Use the K-means algorithm to cluster this dataset. You can use a value of 2 for K and assume that the records with RIDs 103 and 104 are used as the initial cluster centroids.
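The procedure can be sketched as follows. This is an illustration under two assumptions: the six records carry RIDs 101 to 106 in the order listed (so RID 103 is (50, 15) and RID 104 is (25, 5)), and distance is Euclidean. The names `RECORDS` and `k_means` are hypothetical.

```python
from math import dist

# (Age, Years of Service); the RID keys 101-106 are an assumption.
RECORDS = {
    101: (30, 5),
    102: (50, 25),
    103: (50, 15),
    104: (25, 5),
    105: (30, 10),
    106: (55, 25),
}


def k_means(points, initial_centroids, max_iter=100):
    """Plain K-means: assign to nearest centroid, recompute means, repeat."""
    centroids = list(initial_centroids)
    assignment = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each record.
        new_assignment = {
            rid: min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for rid, p in points.items()
        }
        if new_assignment == assignment:
            break  # no record changed cluster, so the result is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [points[rid] for rid, a in assignment.items() if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, centroids


if __name__ == "__main__":
    clusters, centers = k_means(RECORDS, [RECORDS[103], RECORDS[104]])
    print(clusters)
    print(centers)
```

Under these assumptions the assignments stop changing after the first update, with one cluster holding the older, longer-serving records and the other the younger ones.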

  • Marks:

What is the difference between describing discovered knowledge using clustering and describing it using classification?
