Description
- Use the gala dataset we discussed in class.
- (a) Fit a regression model with “Endemics” as the response and “Area”, “Elevation”, “Nearest”, “Scruz”, and “Adjacent” as predictors. Give a short summary of what you find. Please also provide a boxplot of the residuals.
- (b) Which observation has the largest absolute residual? Please give the case number.
- (c) Compute the mean and median of the residuals. Explain what the difference between the mean and the median indicates. Should you worry about this difference here?
- (d) Compute the correlation of the absolute values of the residuals and the fitted values. Plot the absolute values of the residuals against the fitted values. Please comment on what you have learned from this correlation and the plot.
Hints: Useful R functions for the homework: library(), data(), lm(), summary(), residuals(), fitted(), which.max(), abs(), mean(), median() and cor().
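The functions in the hint can be chained roughly as follows. This is only a sketch of the workflow, not a model answer. It assumes that gala is the Galapagos dataset shipped with the faraway package (the assignment only says it was discussed in class); the object names fit and res are illustrative.

```r
# Assumption: gala comes from the 'faraway' package.
library(faraway)
data(gala)

# (a) Fit the model, inspect the summary, and draw a boxplot of the residuals
fit <- lm(Endemics ~ Area + Elevation + Nearest + Scruz + Adjacent, data = gala)
summary(fit)
res <- residuals(fit)
boxplot(res, main = "Residuals", ylab = "Residual")

# (b) Case with the largest absolute residual (returns its name and position)
which.max(abs(res))

# (c) Mean and median of the residuals
mean(res)
median(res)

# (d) Correlation and plot of |residual| against the fitted values
cor(abs(res), fitted(fit))
plot(fitted(fit), abs(res), xlab = "Fitted values", ylab = "|Residual|")
```

The code only produces the numbers and plots; the interpretation asked for in each part still has to be written up separately.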
- Let r be the sample correlation of X and Y; both are continuous variables. Also let sd(X) and sd(Y) be their sample standard deviations. Now let $\hat{\beta}_x$ be the LS estimated slope when regressing Y on X. Please give an equation that relates r to $\hat{\beta}_x$.
Hints: Your equation should contain sd(X) and sd(Y).
- For the problem above, suppose we now regress X on Y and let the estimated LS slope be $\hat{\beta}_y$. Would $\hat{\beta}_y = 1/\hat{\beta}_x$?
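A candidate answer to the two slope questions can be checked numerically before writing it up. The sketch below uses simulated data (not course data) and hypothetical names (x, y, beta_x, beta_y); it prints the ingredients side by side so that a proposed equation can be tested against them. A numerical check is not a derivation.

```r
set.seed(1)                     # arbitrary seed, for reproducibility only
x <- rnorm(50)                  # simulated data, not from the course
y <- 2 + 0.7 * x + rnorm(50)

r      <- cor(x, y)                        # sample correlation
beta_x <- unname(coef(lm(y ~ x))["x"])     # LS slope from regressing Y on X
beta_y <- unname(coef(lm(x ~ y))["y"])     # LS slope from regressing X on Y

# Print everything needed to test a candidate relation
c(r = r, sd_x = sd(x), sd_y = sd(y),
  beta_x = beta_x, beta_y = beta_y, inv_beta_x = 1 / beta_x)
```

Rerunning with a different seed or noise level shows whether a proposed relation holds in general or only by coincidence.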
- On page 35 of class notes #3 (STATS 500 Note-3), you were asked to perform a t-test of $H_0: \beta_{ddpi} = 0.5$ vs. $H_A: \beta_{ddpi} \neq 0.5$. Please report your test outcome here. Please also explain why the p-value for your t-test should be the same as the p-value obtained by the F-test on page 35 (p-value = 0.6475).
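The mechanics of the two tests can be sketched on simulated data. The model and data from the class notes are not reproduced here, so everything below (x1, x2, y, the hypothesized value c0) is hypothetical. The t-test is built by hand from the coefficient table, and the F-test compares the full model with a reduced model that fixes the coefficient at c0 through offset().

```r
set.seed(2)                     # arbitrary seed, for reproducibility only
n  <- 40
x1 <- rnorm(n)                  # simulated predictors, not the course data
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 - 0.3 * x2 + rnorm(n)
c0 <- 0.5                       # hypothesized value of the x1 coefficient

full <- lm(y ~ x1 + x2)

# t-test of H0: beta_x1 = c0, using the estimate and its standard error
est   <- coef(summary(full))["x1", "Estimate"]
se    <- coef(summary(full))["x1", "Std. Error"]
tstat <- (est - c0) / se
p_t   <- 2 * pt(-abs(tstat), df = df.residual(full))

# F-test: the reduced model forces the x1 coefficient to equal c0
reduced <- lm(y ~ x2 + offset(c0 * x1))
p_f     <- anova(reduced, full)[2, "Pr(>F)"]

c(t = tstat, p_t = p_t, p_F = p_f)
```

Comparing tstat^2 with the F statistic in the anova() table is a useful starting point for the explanation the question asks for.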