# Homework #4 CS 260: Machine Learning Algorithms Solved


# 1 Linear Regression with Heterogeneous Noise

In the standard linear regression model, the observed response variable $y$ is the prediction perturbed by noise, namely

$$y = x^\top \beta + \epsilon,$$

where $\epsilon$ is a Gaussian random variable with mean 0 and variance $\sigma^2$. Notably, we assume that the noise terms for all observations in the training data are independent and identically distributed. In other words, for the $n$-th observation $x_n$, the observed response is

$$y_n = x_n^\top \beta + \epsilon_n,$$

where $\epsilon_n \sim N(0, \sigma^2)$.

This assumption does not hold in some cases. For example, when predicting the sale prices of houses, the noise variance tends to be larger for larger houses (e.g., houses with larger $x_n$, where $x_n$ is the square footage), as the sale prices of larger houses seem to be more variable. In this case, we can model the data in the following way:

$$y_n = x_n^\top \beta + \epsilon_n,$$

where the $\epsilon_n$ are independently distributed but do not have to be identically distributed. In particular, each one could have a different variance, namely $\epsilon_n \sim N(0, \sigma_n^2)$.

• Suppose our training dataset contains $N$ such observations $\{(x_n, y_n), n = 1, 2, \dots, N\}$. Write down the log-likelihood function of the data. This should be a function of the data as well as $\beta$ and all $\sigma_n$.
• Derive the maximum likelihood estimate of $\beta$, and express it in terms of the data as well as all the $\sigma_n$.

You should assume that all $\sigma_n$ are known to you; you do not need to estimate them from the data.
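As a numerical companion to the two tasks above, here is a minimal sketch, assuming the heteroscedastic Gaussian model with known $\sigma_n$. Maximizing the Gaussian log-likelihood over $\beta$ is then a weighted least-squares problem with weights $1/\sigma_n^2$. The function names (`log_likelihood`, `weighted_least_squares`) and the synthetic data are illustrative assumptions, not part of the assignment.

```python
import numpy as np


def log_likelihood(X, y, sigma, beta):
    """Gaussian log-likelihood with a known per-observation std sigma_n."""
    sigma = np.asarray(sigma, dtype=float)
    resid = y - X @ beta
    return float(np.sum(-0.5 * np.log(2.0 * np.pi * sigma**2)
                        - resid**2 / (2.0 * sigma**2)))


def weighted_least_squares(X, y, sigma):
    """Maximize the log-likelihood above over beta (weights 1 / sigma_n^2)."""
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2
    Xw = X * w[:, None]  # row n of X scaled by 1 / sigma_n^2
    # Weighted normal equations: (X^T W X) beta = X^T W y, with W = diag(w).
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)


# Synthetic data (hypothetical): the noise std grows with the feature value.
rng = np.random.default_rng(0)
N = 200
sqft = rng.uniform(0.5, 3.0, N)
X = np.column_stack([np.ones(N), sqft])  # intercept column plus one feature
sigma = 0.2 * sqft                       # assumed known, as in the problem
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + sigma * rng.normal(size=N)

beta_hat = weighted_least_squares(X, y, sigma)
print(beta_hat)
```

Solving the weighted normal equations with `np.linalg.solve` assumes $X^\top W X$ is invertible; for ill-conditioned designs, a least-squares solver on the rescaled rows $x_n / \sigma_n$ and $y_n / \sigma_n$ would be the more cautious choice.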

# 2 Linear Regression with Smooth Coefficients

Consider a dataset with $n$ data points $(x_i, y_i)$, $x_i \in \mathbb{R}^{p \times 1}$, drawn from the following linear model:

$$y = x^\top \beta + \epsilon,$$

where $\epsilon$ is Gaussian noise. Suppose the features $x_{i1}, \dots, x_{ip}$ for all $i = 1, \dots, n$ have a natural ordering. Several applications have this ordering property; for example, in the study of the impact of proteins on certain types of cancer, the proteins are ordered sequentially on a line. Intuitively, we can encode the natural ordering information by requiring that the squared difference $(\beta_i - \beta_{i+1})^2$ not be large for $i = 1, \dots, p - 1$.


• (a) State the condition as a regularizer. Write the new optimization problem for finding $\beta$ by combining both this regularization and L2. (10 points)
• (b) Find the optimal $\beta$ by solving the problem in part (a). (5 points)
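The sketch below is one way to check such a solution numerically. It assumes the objective $\|y - X\beta\|^2 + \lambda_1 \sum_{i=1}^{p-1} (\beta_i - \beta_{i+1})^2 + \lambda_2 \|\beta\|^2$, that is, it reads "L2" as a ridge penalty; that reading, the names `smooth_ridge`, `lam_smooth`, `lam_ridge`, and the synthetic data are all assumptions. Writing the smoothness term as $\|D\beta\|^2$ with a first-difference matrix $D$ makes the problem quadratic in $\beta$, so it reduces to one linear solve.

```python
import numpy as np


def smooth_ridge(X, y, lam_smooth, lam_ridge):
    """Least squares with a first-difference penalty plus a ridge penalty."""
    p = X.shape[1]
    # D is (p-1) x p; row i is e_{i+1} - e_i, so (D @ beta)[i] = beta[i+1] - beta[i].
    D = np.diff(np.eye(p), axis=0)
    A = X.T @ X + lam_smooth * (D.T @ D) + lam_ridge * np.eye(p)
    # Setting the gradient of the assumed objective to zero gives A beta = X^T y.
    return np.linalg.solve(A, X.T @ y)


# Synthetic data (hypothetical) whose true coefficients vary smoothly.
rng = np.random.default_rng(1)
n, p = 100, 8
X = rng.normal(size=(n, p))
beta_true = np.linspace(0.0, 1.0, p)
y = X @ beta_true + 0.5 * rng.normal(size=n)

beta_rough = smooth_ridge(X, y, lam_smooth=0.0, lam_ridge=1.0)
beta_smooth = smooth_ridge(X, y, lam_smooth=1e4, lam_ridge=1.0)
print(np.sum(np.diff(beta_rough) ** 2), np.sum(np.diff(beta_smooth) ** 2))
```

Increasing `lam_smooth` shrinks the sum of squared neighboring differences, which is the behavior the ordering condition asks for.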

# 3 Linearly Constrained Linear Regression

Consider a dataset with $n$ data points $(x_i, y_i)$, $x_i \in \mathbb{R}^{p \times 1}$, drawn from the following linear model:

$$y = x^\top \beta + \epsilon,$$

where $\epsilon$ is Gaussian noise. Suppose we have additional information about $\beta$ that requires $A\beta = b$, where $A \in \mathbb{R}^{q \times p}$ and $b \in \mathbb{R}^{q \times 1}$. Suppose the constraint $A\beta = b$ has a non-empty set of solutions, so that the optimization problem is feasible. Find the maximum likelihood estimate of $\beta$ under this constraint.
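For i.i.d. Gaussian noise, maximizing the likelihood is equivalent to minimizing $\|y - X\beta\|^2$, so the problem is equality-constrained least squares. The sketch below solves the resulting KKT linear system numerically. It assumes $X^\top X$ is invertible and $A$ has full row rank, which the problem statement does not guarantee; the function name `constrained_least_squares` and the example constraint (coefficients sum to 1) are hypothetical.

```python
import numpy as np


def constrained_least_squares(X, y, A, b):
    """Minimize ||y - X beta||^2 subject to A beta = b via the KKT system."""
    p, q = X.shape[1], A.shape[0]
    # Stationarity: 2 X^T X beta + A^T nu = 2 X^T y; feasibility: A beta = b.
    kkt = np.block([[2.0 * X.T @ X, A.T],
                    [A, np.zeros((q, q))]])
    rhs = np.concatenate([2.0 * X.T @ y, b])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:p]  # the remaining q entries are the Lagrange multipliers


# Synthetic data (hypothetical) with the constraint sum(beta) = 1.
rng = np.random.default_rng(2)
n, p = 60, 4
X = rng.normal(size=(n, p))
y = X @ np.array([0.4, 0.3, 0.2, 0.1]) + 0.1 * rng.normal(size=n)
A = np.ones((1, p))
b = np.array([1.0])

beta_c = constrained_least_squares(X, y, A, b)
print(beta_c, A @ beta_c)
```

Solving the block system directly avoids forming $(X^\top X)^{-1}$ explicitly; the same estimate can be written as the unconstrained least-squares solution plus a correction that restores $A\beta = b$.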

# 4 Online Learning

The perceptron algorithm often makes harsh updates, as it is strongly biased towards the current mistakenly labeled sample. Suppose that at the $i$-th step the classifier is $w_i$, and we want to make a more conservative update, based on the observation $(x_i, y_i)$, to a new classifier $w_{i+1}$. Derive a new update method for the perceptron that makes the smallest change from the previous model, that is, it minimizes $\|w_{i+1} - w_i\|_2$ while ensuring that $w_{i+1}$ classifies the current sample correctly. You need to provide the closed-form analytical equation for the update rule.
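The sketch below illustrates one such minimal-change update. It assumes labels $y_i \in \{-1, +1\}$ and enforces correctness with a margin, $y_i \, w_{i+1}^\top x_i \ge m$, because the strict condition $y_i \, w_{i+1}^\top x_i > 0$ has no attained minimizer. The margin parameter (default 1) and the name `conservative_update` are assumptions; with $m = 1$ this matches what is usually called the passive-aggressive update.

```python
import numpy as np


def conservative_update(w, x, y, margin=1.0):
    """Closest w_new to w (Euclidean norm) with y * (w_new @ x) >= margin."""
    score = y * (w @ x)
    if score >= margin:
        return w.copy()  # constraint already satisfied: no change needed
    # Move along y * x just far enough to reach the margin (x must be nonzero).
    tau = (margin - score) / (x @ x)
    return w + tau * y * x


# Hypothetical example: the current classifier mislabels (x, y).
w = np.array([1.0, -1.0])
x = np.array([2.0, 1.0])
y = -1
w_new = conservative_update(w, x, y)
print(w_new, y * (w_new @ x))
```

The step is the orthogonal projection of $w_i$ onto the half-space $\{w : y_i \, w^\top x_i \ge m\}$, which is why it moves only along $x_i$ and stops exactly on the boundary.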
