STAT542 Assignment 1 Solved

This assignment is based on the simulation study described in Section 2.3.1 (the so-called Scenario 2) of “The Elements of Statistical Learning” (ESL).

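Scenario 2: the two-dimensional data $X \in \mathbb{R}^2$ in each class are generated from a mixture of 10 different bivariate Gaussian distributions with uncorrelated components and different means, i.e.,

$$X \mid Y = k, Z = l \sim N(m_{kl}, s^2 I_2),$$

where $k = 0, 1$, $l = 1, \dots, 10$, $P(Y = k) = 1/2$, and $P(Z = l) = 1/10$. In other words, given $Y = k$, $X$ follows a mixture distribution with density function

$$\frac{1}{10} \sum_{l=1}^{10} \frac{1}{2\pi s^2} \exp\left( -\frac{\lVert x - m_{kl} \rVert^2}{2 s^2} \right).$$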

You can choose your own values for $s$ and the twenty two-dimensional vectors $m_{kl}$, or you can generate them from some distribution.
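The data generating process can be sketched as follows, in Python with the standard library only and purely for illustration (the submission itself must be in R). The choices here are assumptions, not requirements of the assignment: the centers are drawn around (1, 0) for class 0 and (0, 1) for class 1 with unit variance, and the variance is set to 1/5, following what ESL describes for this scenario. The function names `make_centers` and `sample` are hypothetical.

```python
import math
import random


def make_centers(rng, n_centers=10, spread=1.0):
    """Draw the centers for both classes.

    Assumed choice (following ESL): class 0 centers are scattered around
    (1, 0) and class 1 centers around (0, 1), each with standard deviation
    `spread` in both coordinates.
    """
    class_means = {0: (1.0, 0.0), 1: (0.0, 1.0)}
    return {
        k: [(rng.gauss(mx, spread), rng.gauss(my, spread)) for _ in range(n_centers)]
        for k, (mx, my) in class_means.items()
    }


def sample(rng, centers, n, s):
    """Draw n labelled points.

    Y is uniform on {0, 1}, Z picks one of the class's centers uniformly,
    and X is Gaussian around that center with covariance s^2 * I.
    """
    xs, ys = [], []
    for _ in range(n):
        k = rng.randrange(2)
        mx, my = rng.choice(centers[k])
        xs.append((rng.gauss(mx, s), rng.gauss(my, s)))
        ys.append(k)
    return xs, ys


rng = random.Random(7127)   # seed taken from the example University ID
s = math.sqrt(1 / 5)        # assumed value: variance 1/5
centers = make_centers(rng)
train_x, train_y = sample(rng, centers, 200, s)
test_x, test_y = sample(rng, centers, 10000, s)
print(len(train_x), len(test_x))
```

Passing a single `random.Random` instance through every function keeps all draws tied to one seed, which makes a run easy to reproduce.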

Repeat the following simulation 20 times. In each simulation, following the data generating process,

  1. generate a training sample of size 200 and a test sample of size 10,000, and
  2. calculate the training and test errors (the averaged 0/1 error; see footnote 1) for the following four procedures:
  • linear regression with cut-off value 0.5,
  • quadratic regression with cut-off value 0.5,
  • kNN classification with k chosen by 10-fold cross-validation, and
  • the Bayes rule (assume you know the values of the $m_{kl}$'s and $s$).
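Of the four procedures, the Bayes rule is the only one that uses the true centers and s rather than fitted quantities. Because both classes have prior probability 1/2, it reduces to comparing the two class-conditional mixture densities. A minimal sketch in Python (illustrative only, since the submission must be in R) with hypothetical toy centers, two per class instead of ten:

```python
import math


def mixture_density(x, centers_k, s):
    """Density of X given Y = k: equal-weight mixture of Gaussians with
    covariance s^2 * I centred at each point in centers_k."""
    total = 0.0
    for mx, my in centers_k:
        sq_dist = (x[0] - mx) ** 2 + (x[1] - my) ** 2
        total += math.exp(-sq_dist / (2 * s * s))
    return total / (len(centers_k) * 2 * math.pi * s * s)


def bayes_predict(x, centers, s):
    """With equal priors, predict the class whose mixture density is larger."""
    d0 = mixture_density(x, centers[0], s)
    d1 = mixture_density(x, centers[1], s)
    return 1 if d1 > d0 else 0


# Hypothetical toy setup: two centers per class, far apart.
centers = {0: [(0.0, 0.0), (1.0, 0.0)], 1: [(3.0, 3.0), (4.0, 3.0)]}
s = 0.5

print(bayes_predict((0.2, 0.1), centers, s))  # 0: the point sits near the class-0 centers
print(bayes_predict((3.5, 3.0), centers, s))  # 1: the point sits between the class-1 centers
```

The normalizing constant is the same for both classes, so it could be dropped from the comparison; it is kept here so that `mixture_density` is a proper density.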

Summarize your results on training errors and test errors graphically, e.g., using a boxplot or stripchart. Also report the mean and standard error of the selected k values.
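Choosing k by 10-fold cross-validation and then summarizing the selected values over the 20 simulations can be sketched as follows, in standard-library Python for illustration only (in R, the class package supplies the kNN classifier). Several choices are assumptions: the candidate grid of odd k values, breaking ties toward the largest k, the ESL-style centers and variance, and reading "standard error" as the standard deviation of the selected k values divided by the square root of the number of simulations. All function names are hypothetical.

```python
import math
import random
import statistics


def simulate(rng, n, centers, s):
    """Draw n labelled points ((x1, x2), y) from the two-class mixture."""
    data = []
    for _ in range(n):
        k = rng.randrange(2)
        mx, my = rng.choice(centers[k])
        data.append(((rng.gauss(mx, s), rng.gauss(my, s)), k))
    return data


def cv_errors(data, candidate_ks, n_folds, rng):
    """Cross-validated 0/1 error of kNN for every candidate k."""
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    mistakes = {k: 0 for k in candidate_ks}
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in idx if i not in held_out]
        for i in fold:
            x, y = data[i]
            # Training labels ordered from nearest to farthest neighbour.
            labels = [ty for _, ty in sorted((math.dist(x, tx), ty) for tx, ty in train)]
            for k in candidate_ks:
                votes_for_1 = sum(labels[:k])
                pred = 1 if 2 * votes_for_1 > k else 0
                mistakes[k] += int(pred != y)
    return {k: mistakes[k] / len(data) for k in candidate_ks}


def choose_k(data, candidate_ks, n_folds, rng):
    """Pick the k with the smallest CV error; ties go to the largest k (assumption)."""
    errs = cv_errors(data, candidate_ks, n_folds, rng)
    best = min(errs.values())
    return max(k for k in candidate_ks if errs[k] == best)


rng = random.Random(7127)
# Assumed ESL-style centers: class 0 around (1, 0), class 1 around (0, 1).
centers = {
    k: [(rng.gauss(m[0], 1.0), rng.gauss(m[1], 1.0)) for _ in range(10)]
    for k, m in {0: (1.0, 0.0), 1: (0.0, 1.0)}.items()
}
candidate_ks = range(1, 40, 2)  # odd values avoid tied votes between two classes
selected = [
    choose_k(simulate(rng, 200, centers, math.sqrt(1 / 5)), candidate_ks, 10, rng)
    for _ in range(20)
]
mean_k = statistics.mean(selected)
se_k = statistics.stdev(selected) / math.sqrt(len(selected))
print(f"mean k = {mean_k:.2f}, standard error = {se_k:.2f}")
```

Sorting the neighbours once per held-out point and reusing the ordering for every candidate k avoids recomputing distances for each k.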

R packages you are allowed to use are class (for kNN) and ggplot2 (for graphs).

Footnote 1: For each observation in a sample, the incurred error is 1 if it is misclassified, and 0 otherwise.
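The averaged 0/1 error, together with the two regression procedures that classify by thresholding the fitted value at 0.5, can be sketched as follows. This is standard-library Python for illustration only (the submission must be in R), and it makes assumptions: "quadratic regression" is taken to mean least squares on the two coordinates, their squares, and their cross term, and the tiny data set is a hypothetical toy example rather than a draw from the mixture.

```python
def features(x, quadratic=False):
    """Design row for one point: intercept, x1, x2, and (if quadratic)
    x1^2, x2^2 and the cross term x1 * x2."""
    x1, x2 = x
    row = [1.0, x1, x2]
    if quadratic:
        row += [x1 * x1, x2 * x2, x1 * x2]
    return row


def solve(a, b):
    """Solve the linear system a * beta = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - tail) / m[r][r]
    return beta


def fit_least_squares(xs, ys, quadratic=False):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    rows = [features(x, quadratic) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)


def predict(beta, x, quadratic=False):
    """Classify as 1 when the fitted value exceeds the cut-off 0.5."""
    fitted = sum(b * f for b, f in zip(beta, features(x, quadratic)))
    return 1 if fitted > 0.5 else 0


def zero_one_error(preds, ys):
    """Average of the per-observation errors (1 for a mistake, 0 otherwise)."""
    return sum(int(p != y) for p, y in zip(preds, ys)) / len(ys)


# Hypothetical toy data: class 0 near the origin, class 1 further out.
xs = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (3, 2), (2, 3), (3, 3)]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

lin_beta = fit_least_squares(xs, ys)
lin_err = zero_one_error([predict(lin_beta, x) for x in xs], ys)
quad_beta = fit_least_squares(xs, ys, quadratic=True)
quad_err = zero_one_error([predict(quad_beta, x, quadratic=True) for x in xs], ys)
print("linear training error:", lin_err)
print("quadratic training error:", quad_err)
```

The test error is computed the same way, by applying `predict` with the coefficients fitted on the training sample to the test points.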

What do you need to submit?

A PDF file and an R Markdown file that produces the PDF file.

  • Name your files starting with

Assignment 1 xxxx netID, where “xxxx” is the last 4 digits of your University ID.

For example, the submission for Max Y. Chen with UID 672757127 and netID mychen12 would be named as

Assignment 1 7127 mychen12 MaxChen.Rmd/.pdf

You can add whatever characters after your netID.

  • Set the seed at the beginning of your code to the last 4 digits of your University ID, so that we get the same results when we run your code.
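The effect of fixing the seed can be illustrated with a short sketch, written in Python for illustration only; in an R Markdown file the corresponding call is `set.seed()` at the top of the code. The seed 7127 comes from the example University ID 672757127, and `draw_three` is a hypothetical stand-in for a full simulation run.

```python
import random

SEED = 7127  # last 4 digits of the example University ID 672757127


def draw_three(seed):
    """Hypothetical stand-in for a simulation: three Gaussian draws."""
    rng = random.Random(seed)  # all randomness flows from this one generator
    return [rng.gauss(0.0, 1.0) for _ in range(3)]


first_run = draw_three(SEED)
second_run = draw_three(SEED)
print(first_run == second_run)  # True: the same seed reproduces the same draws
```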