Homework #1, CS 260: Machine Learning Algorithms

1 Sequence of Coin Flips

Suppose you have a biased coin with probability of heads equal to p. Imagine that you flip this coin until observing the first heads. Let X denote the number of flips needed to observe the first heads.

  1. Describe P[X = k] as a function of p.
  2. For any $x_0 \ge 1$, show that $P[X \ge x_0] = (1 - p)^{x_0 - 1}$.
  3. Suppose we have a prior belief that p is uniformly distributed on the interval [0,1]. Given that the first coin flip is heads, compute the probability that p > 1/2, i.e., compute P[p > 1/2 | X = 1]. Does our belief about the probability of the event {p > 1/2} increase or decrease after observing heads on the first flip? Hint: Use Bayes' rule.
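
The identities in this section can be sanity-checked numerically before attempting a proof. The sketch below is only an illustration in Python, not part of the assignment: it assumes the standard pmf P[X = k] = (1 - p)^(k - 1) p for the first-heads count, and the function names and the values p = 0.3, x0 = 4 are chosen here for the example. It compares a truncated sum of the pmf and a simulation against (1 - p)^(x0 - 1), and approximates the posterior probability in part 3 with a midpoint rule, using the likelihood P[X = 1 | p] = p under the uniform prior.

```python
import random

def geometric_pmf(k, p):
    """P[X = k]: the first k - 1 flips are tails and flip k is heads."""
    return (1 - p) ** (k - 1) * p

def tail_probability(x0, p, terms=10_000):
    """Approximate P[X >= x0] by summing the pmf over k = x0, ..., x0 + terms - 1."""
    return sum(geometric_pmf(k, p) for k in range(x0, x0 + terms))

def simulate_first_heads(p, rng):
    """Flip a coin with heads probability p until the first heads; return the flip count."""
    flips = 1
    while rng.random() >= p:  # rng.random() < p counts as heads
        flips += 1
    return flips

def posterior_p_above_half(n=100_000):
    """Midpoint-rule estimate of P[p > 1/2 | X = 1] under a uniform prior on [0, 1].

    The posterior density is proportional to prior * likelihood = 1 * p.
    """
    mids = [(i + 0.5) / n for i in range(n)]
    return sum(m for m in mids if m > 0.5) / sum(mids)

p, x0 = 0.3, 4  # illustrative values, not from the assignment
rng = random.Random(0)
samples = [simulate_first_heads(p, rng) for _ in range(100_000)]
empirical = sum(s >= x0 for s in samples) / len(samples)

print("summed pmf  :", tail_probability(x0, p))
print("closed form :", (1 - p) ** (x0 - 1))
print("simulation  :", empirical)
print("posterior   :", posterior_p_above_half())
```

Agreement between the three tail estimates supports the formula in part 2 but does not replace the derivation the problem asks for.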

2 Convex Functions and Information Theory

  1. Show that the function f(x) = |x| + exp(x) is convex.
  2. Suppose the random variable X is distributed according to a k-class multinomial distribution with class probabilities $p_1, p_2, \ldots, p_k$ such that $\sum_{i=1}^{k} p_i = 1$. Find the values of $p_i$, $i = 1, \ldots, k$, such that the entropy of X is maximized.
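
For the entropy question, a quick numerical comparison can suggest what the maximizer looks like. The following Python sketch is an illustration only: the `entropy` helper, the choice k = 4, and the three example distributions are assumptions made for this example. It evaluates the Shannon entropy in nats, with the convention 0 log 0 = 0, and prints log k for reference.

```python
import math

def entropy(probs):
    """Shannon entropy in nats, using the convention 0 * log 0 = 0."""
    return -sum(q * math.log(q) for q in probs if q > 0)

k = 4  # illustrative number of classes
candidates = {
    "uniform": [1 / k] * k,
    "skewed": [0.7, 0.1, 0.1, 0.1],
    "degenerate": [1.0, 0.0, 0.0, 0.0],
}

for name, dist in candidates.items():
    assert abs(sum(dist) - 1.0) < 1e-12  # each candidate is a valid distribution
    print(f"{name:>10}: H = {entropy(dist):.4f}")
print(f"{'log k':>10}: {math.log(k):.4f}")
```

Comparing a few candidates is evidence, not a proof; the problem still calls for an argument (for example, with a Lagrange multiplier for the sum-to-one constraint).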

3 Linear Algebra

  1. The covariance matrix of a random vector X is defined as $\Sigma = E[(X - EX)(X - EX)^T]$, where EX is the expectation of X. Is $\Sigma$ positive-semidefinite?
  2. Let A and B be two symmetric matrices in $\mathbb{R}^{D \times D}$. Suppose A and B have the exact same set of eigenvectors $u_1, u_2, \ldots, u_D$, with corresponding eigenvalues $\alpha_1, \alpha_2, \ldots, \alpha_D$ for A and $\beta_1, \beta_2, \ldots, \beta_D$ for B. Please write down the eigenvectors and their corresponding eigenvalues for the following matrices:
    • $C = A + B$
    • $D = A - B$
    • $E = AB$
    • $F = A^{-1}B$ (assume A is invertible)
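
A small numerical experiment can be used to check a conjectured answer to the second problem. The Python sketch below is illustrative only: it writes the eigenvalues of A as `alpha` and those of B as `beta` (names and values assumed here), builds both 2 × 2 matrices from one shared orthonormal eigenbasis via the spectral form sum_i lambda_i u_i u_i^T, and then tests whether each u_i is still an eigenvector of the four combined matrices. All helper names are hypothetical.

```python
import math

def matmul(X, Y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def combine(X, Y, sign):
    """Entrywise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def from_spectrum(eigvals, eigvecs):
    """Build sum_i lambda_i * u_i u_i^T from orthonormal vectors u_i."""
    d = len(eigvecs[0])
    return [[sum(lam * u[i] * u[j] for lam, u in zip(eigvals, eigvecs))
             for j in range(d)] for i in range(d)]

def eigenvalue_for(M, u, tol=1e-9):
    """Return lambda with M u = lambda u, or None if u is not an eigenvector of M."""
    Mu = [sum(m * x for m, x in zip(row, u)) for row in M]
    lam = sum(a * b for a, b in zip(Mu, u)) / sum(x * x for x in u)
    ok = all(abs(a - lam * x) <= tol for a, x in zip(Mu, u))
    return lam if ok else None

s = 1 / math.sqrt(2)
U = [(s, s), (s, -s)]                   # shared orthonormal eigenvectors u_1, u_2
alpha, beta = [2.0, 5.0], [3.0, -1.0]   # illustrative eigenvalues of A and of B
A = from_spectrum(alpha, U)
B = from_spectrum(beta, U)
A_inv = from_spectrum([1 / a for a in alpha], U)  # valid because every alpha_i != 0

results = {
    "A + B": combine(A, B, +1),
    "A - B": combine(A, B, -1),
    "AB": matmul(A, B),
    "A^-1 B": matmul(A_inv, B),
}
for name, M in results.items():
    print(name, [eigenvalue_for(M, u) for u in U])
```

Comparing the printed values with `alpha` and `beta` shows how the eigenvalues combine in each case; the general statement still has to be derived from A u_i = alpha_i u_i and B u_i = beta_i u_i.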

4 KNN Classification in MATLAB/Octave

In this problem, you will implement a KNN classifier and deploy it on a real-world dataset. Below, we describe the steps that you need to take to accomplish this programming assignment.

You will work with a preprocessed version of the Car Evaluation Dataset from UCI’s machine learning data repository. The training, validation, and test sets are provided along with the assignment as cars_train.data, cars_valid.data, and cars_test.data. For a description of the dataset and to determine which field corresponds to the label, please refer to http://archive.ics.uci.edu/ml/datasets/Car+Evaluation.

  1. The first step in every data analysis experiment is to inspect the data and make sure it is properly formatted. You will find that the features in the provided dataset are categorical. However, KNN requires the features to be real-valued numbers. To convert a categorical feature with K categories to real-valued numbers, you can create K new binary features. The ith binary feature indicates whether the original feature belongs to the ith category or not. This strategy is called ‘one-hot encoding.’
  2. Please fill in the function knn_classify in knn m. The inputs of this function are the training data, the new data (either validation or test data), and k. The function must output the accuracy on both the training data and the new data (either validation or test).
  3. Consider $k = 1, 3, 5, \ldots, 23$. For each k, report the training and validation accuracy. Identify the k with the highest validation accuracy, and report the test accuracy with this choice of k. Note: if multiple values of k result in the highest validation accuracy, then report test accuracies for all such values of k.
  4. Apply kNN on the mat dataset which is a binary classification dataset with only two features. You need to run kNN with k = 1, 5, 15, 20 and examine the decision boundary. A simple way to visualize the decision boundary is to draw 10000 data points on a uniform $100 \times 100$ grid in the square $(x, y) \in [0,1] \times [0,1]$ and classify them using the kNN classifier. Then, plot the data points with different markers corresponding to different classes. Repeat this process for all k and discuss the smoothness of the decision boundaries as k increases.
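
The steps above can be sketched end to end on toy data. The assignment itself is to be written in MATLAB/Octave; the Python below is only a language-neutral illustration of the logic, under stated assumptions: squared Euclidean distance, majority vote with ties going to the label of the nearest neighbor, and hypothetical function names, category values, and data points that are not taken from the provided files.

```python
from collections import Counter

def one_hot_encode(rows, categories):
    """Map each categorical row to a 0/1 vector.

    categories[j] lists the values feature j can take; feature j contributes
    len(categories[j]) binary entries, exactly one of which is 1.
    """
    encoded = []
    for row in rows:
        vec = []
        for value, cats in zip(row, categories):
            vec.extend(1.0 if value == c else 0.0 for c in cats)
        encoded.append(vec)
    return encoded

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to query.

    Ties in the vote go to the label seen first among the nearest neighbors.
    """
    nearest = sorted(range(len(train_x)),
                     key=lambda i: squared_distance(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def knn_accuracy(train_x, train_y, data_x, data_y, k):
    """Fraction of (data_x, data_y) pairs that kNN labels correctly."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(data_x, data_y))
    return hits / len(data_y)

def grid_points(n=100):
    """n * n points on a uniform grid covering [0, 1] x [0, 1]."""
    return [(i / (n - 1), j / (n - 1)) for i in range(n) for j in range(n)]

# Hypothetical categorical toy data, not the Car Evaluation files.
categories = [["low", "high"], ["2", "4"]]
rows = [("low", "2"), ("low", "4"), ("high", "2"), ("high", "4")]
labels = ["acc", "acc", "unacc", "unacc"]
train_x = one_hot_encode(rows, categories)
for k in (1, 3):
    print("k =", k, "training accuracy =", knn_accuracy(train_x, labels, train_x, labels, k))

# Decision-boundary grid for a hypothetical two-feature binary dataset.
points = [(0.2, 0.2), (0.3, 0.1), (0.8, 0.9), (0.7, 0.8)]
classes = [0, 0, 1, 1]
grid_labels = [knn_predict(points, classes, g, 1) for g in grid_points()]
print(sum(grid_labels), "of", len(grid_labels), "grid points assigned to class 1")
```

Plotting the grid points with one marker per predicted class gives the decision-boundary picture the last step describes. Sorting all training points for every query is simple but slow on larger data; a vectorized distance computation would be the natural choice in MATLAB/Octave.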

 
