CNS Project 3 Solved


The writing of the project report, either in LaTeX or R Markdown, should detail all the theoretical aspects of the methods involved in this project and provide all the necessary graphical displays essential for the understanding of the particular cases. From the three groups/topics below you should choose only two.

The Exponential Family

1. Let X ∼ Weibull(α, β), with β known and α unknown, which has p.d.f.

    $$f(x;\alpha,\beta) = \frac{\beta}{\alpha}\left(\frac{x}{\alpha}\right)^{\beta-1} e^{-(x/\alpha)^{\beta}}, \qquad \alpha,\beta > 0,\ x \in \mathbb{R}^{+}.$$

  (a) Show that this distribution belongs to the exponential family.
  (b) Clearly identify the canonical link and the sufficient statistic. Do you already have the canonical form? If not, write it down.
  (c) Use the canonical form to
    i. compute $E(X^{\beta})$ and $V(X^{\beta})$
    ii. write the score function $S_n(\alpha)$ and see if it is possible to analytically derive the maximum likelihood estimator of α, $\alpha_{\mathrm{MLE}}$
    iii. compute the Fisher information $I_n(\alpha)$
    iv. report the asymptotic variance of the maximum likelihood estimator $\alpha_{\mathrm{MLE}}$

2. Say some distribution depending on unknown parameters (α, β) ∈ R+ × R+ has p.d.f. such that

    $$f(x;\alpha,\beta) = \exp\Big\{\sum_{i=1}^{2} \eta_i(\alpha,\beta)\,T_i(x) - A(\alpha,\beta) + c(x)\Big\}$$

with

• $\eta(\alpha,\beta) = (\eta_1(\alpha,\beta), \eta_2(\alpha,\beta)) = (\alpha, -\beta)$
• $(T_1(x), T_2(x)) = (\log x, x)$
• $A(\alpha,\beta) = -\alpha\log\beta + \log\Gamma(\alpha)$
• $c(x) = -\log x$

Use the canonical form of the p.d.f. to compute E(X), V(X) and $I_n(\alpha,\beta)$. Report the asymptotic variance of the maximum likelihood estimator $(\alpha_{\mathrm{MLE}}, \beta_{\mathrm{MLE}})$.
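Both questions rest on the standard moment identities for a density written in canonical form. The block below is a brief reminder, not a full solution. The notation $\tilde{A}(\eta)$ (the log-partition function A re-expressed in terms of the canonical parameter η) and the Jacobian $J$ are introduced here for illustration and do not appear in the assignment.

```latex
% Canonical form: f(x;\eta) = \exp\{\eta^{\top} T(x) - \tilde{A}(\eta) + c(x)\}
\begin{align*}
E[T(X)] &= \nabla_{\eta}\tilde{A}(\eta), &
\operatorname{Cov}[T(X)] &= \nabla^{2}_{\eta}\tilde{A}(\eta), \\
I_n(\eta) &= n\,\nabla^{2}_{\eta}\tilde{A}(\eta), &
I_n(\theta) &= J^{\top} I_n(\eta)\, J, \quad J = \frac{\partial \eta}{\partial \theta}.
\end{align*}
% Sanity check for question 2, where \eta = (\alpha, -\beta) and
% \tilde{A}(\eta) = -\eta_1 \log(-\eta_2) + \log\Gamma(\eta_1):
\[
E(X) = E[T_2(X)] = \frac{\partial \tilde{A}}{\partial \eta_2}
     = -\frac{\eta_1}{\eta_2} = \frac{\alpha}{\beta}.
\]
```

Under the usual regularity conditions, the asymptotic variance of the maximum likelihood estimator is the inverse of the Fisher information in the chosen parametrization.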


Deadline: 12/01/22

Generalized Linear Models

The class of generalized linear models also comprises the Poisson distribution, with the log link function being the mathematically convenient option as it allows the linear predictor to span the entire real line. The Poisson regression method is often employed for the statistical analysis of data that involve counts of events occurring within a certain amount of time. Such is the case, for example, of epidemiological studies that require the calculation of rates, typically rates of death or incidence rates of a chronic or acute disease. Here, the parameter of interest is usually the expected counts per unit of observed time, i.e., the rate at which events occur.
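The model described above can be written compactly as follows. This is a sketch under the assumption of a single predictor $x_i$. The exposure time $t_i$ and the rate $\lambda_i$ are symbols introduced here for illustration.

```latex
\begin{align*}
Y_i &\sim \mathrm{Poisson}(\mu_i), \qquad i = 1,\dots,n, \\
\log \mu_i &= \beta_0 + \beta_1 x_i
  && \text{(log link: the linear predictor ranges over } \mathbb{R}\text{)}, \\
\mu_i &= t_i \lambda_i \;\Longrightarrow\;
\log \mu_i = \log t_i + \beta_0 + \beta_1 x_i
  && \text{(rate model with offset } \log t_i\text{)}.
\end{align*}
```

When every observation covers the same amount of time, as with yearly counts, the offset is constant and is absorbed into the intercept.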

  1. The following data refer to the number of new AIDS cases (y) each year (x) in Belgium, from 1981 to 1993 (Venables & Ripley, Modern Applied Statistics with S):

    x (year):   1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993
    y (cases):    12   14   33   50   67   74  123  141  165  204  253  246  240
    

    The question here is whether these data provide evidence that the increase in the rate of new case generation is slowing down.

    You may use here the R built-in glm() function.

    (a) Fit a Poisson regression model to the data (Y, x), i.e., to Y ∼ x. Report the residual deviance and AIC of the fitted model, plot its residual plots and comment on the adequacy of the fit.
    (b) Fit a Poisson regression model again for the relationship Y ∼ x + x². Report the residual deviance and AIC of the fitted model, plot its residual plots and comment on the adequacy of the fit compared with the previous model.
    (c) One now wishes to compare the previous two models by means of an analysis of variance table. Describe model selection via the ANOVA table. Use the R built-in anova() function to compare both models. Which model better fits the data?
    (d) Provide the model summary, confidence intervals for the fixed parameters and a model interpretation for the selected model. Plot the data. Predict 100 values from the fitted model (use the R built-in predict() function). Plot the data versus the fitted line. Add confidence bands for the fitted line at ±2se. Make an attempt at answering the main question.
  2. Fully describe and implement iteratively reweighted least squares (IRWLS) for Poisson regression such that your implementation returns the same summary output table as the R glm() function. Use your implementation to fit the Poisson model to the previous dataset considering Y ∼ x.
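The IRWLS iteration can be sketched as follows. The assignment asks for an R implementation. This standard-library Python version only illustrates the algorithm for the two-parameter model Y ∼ x. The function names `wls_2x2` and `poisson_irls`, the recoding x = year − 1980 (which only shifts the intercept) and the starting value μ = y + 0.1 are choices made for this sketch.

```python
import math

# Belgian AIDS counts from the text; x = year - 1980 is a convenience
# recoding chosen for this sketch, not something the text prescribes.
y = [12, 14, 33, 50, 67, 74, 123, 141, 165, 204, 253, 246, 240]
x = list(range(1, 14))

def wls_2x2(x, z, w):
    """Weighted least squares of z on (1, x): returns b0, b1 and (X'WX)^-1."""
    s_w = sum(w)
    s_wx = sum(wi * xi for wi, xi in zip(w, x))
    s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    s_wz = sum(wi * zi for wi, zi in zip(w, z))
    s_wxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
    det = s_w * s_wxx - s_wx ** 2
    b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
    b1 = (s_w * s_wxz - s_wx * s_wz) / det
    cov = [[s_wxx / det, -s_wx / det], [-s_wx / det, s_w / det]]
    return b0, b1, cov

def poisson_irls(x, y, tol=1e-10, max_iter=50):
    """Fit log E[Y] = b0 + b1 * x by iteratively reweighted least squares."""
    eta = [math.log(yi + 0.1) for yi in y]  # assumed start: mu = y + 0.1
    b0 = b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(e) for e in eta]
        # Log link: working response z = eta + (y - mu) / mu, weights w = mu
        z = [e + (yi - mi) / mi for e, yi, mi in zip(eta, y, mu)]
        new_b0, new_b1, cov = wls_2x2(x, z, mu)
        change = abs(new_b0 - b0) + abs(new_b1 - b1)
        b0, b1 = new_b0, new_b1
        eta = [b0 + b1 * xi for xi in x]
        if change < tol:
            break
    mu = [math.exp(e) for e in eta]
    # Residual deviance and AIC for the Poisson family
    deviance = 2 * sum((yi * math.log(yi / mi) if yi > 0 else 0.0) - (yi - mi)
                       for yi, mi in zip(y, mu))
    loglik = sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
                 for yi, mi in zip(y, mu))
    se = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
    return {"coef": [b0, b1], "se": se, "mu": mu,
            "deviance": deviance, "aic": -2 * loglik + 2 * 2}

fit = poisson_irls(x, y)
for name, est, se in zip(["(Intercept)", "x"], fit["coef"], fit["se"]):
    zval = est / se
    pval = math.erfc(abs(zval) / math.sqrt(2))  # two-sided normal p-value
    print(f"{name:12s} {est:10.5f} {se:10.5f} {zval:8.3f} {pval:10.3g}")
print(f"Residual deviance: {fit['deviance']:.3f}  AIC: {fit['aic']:.2f}")
```

Because the log link is canonical for the Poisson family, each IRWLS step coincides with a Newton-Raphson step, so the iteration typically converges in a handful of passes. The standard errors come from the inverse of X'WX evaluated at the final weights.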


Bayesian Inference and Computation

The janka data frame, from the R package SemiPar, has 36 observations on Australian timber samples, which refer to measurements of the density (predictor variable) and hardness (response variable) of the timber.

  1. Describe in detail, and fit from scratch, a Bayesian linear model to the janka data using the Gibbs sampler. Report point estimates, credible intervals and anything else you consider important for the analysis of these data.
  2. Go to

    https://cran.r-project.org/web/packages/bayestestR/vignettes/bayestestR.html

    and install the R package bayestestR. Use stan_glm() (from the rstanarm package) to fit the Bayesian linear model to the janka data and compare these results with the results from item 1. You may instead use the rjags library, or any other library you find more convenient.

  3. Compare the Bayesian results (point estimates and credible intervals) with the results from the classical analysis performed in week 5, slides 12-14, part II.
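A Gibbs sampler for a simple linear model can be sketched as below. The assignment asks for R and the janka data. This standard-library Python version runs on simulated stand-in data (not the janka measurements) and only illustrates the sampling scheme. It assumes a flat prior on the regression coefficients and p(σ²) ∝ 1/σ². Centering the predictor makes the two coefficients conditionally independent given σ². The names `gibbs_linear` and `summarize` are hypothetical.

```python
import random

random.seed(1)

# Simulated stand-in data (NOT the janka measurements): y = 1 + 2x + noise
n = 36
x = [random.uniform(0, 10) for _ in range(n)]
y = [1.0 + 2.0 * xi + random.gauss(0, 0.5) for xi in x]

def gibbs_linear(x, y, n_iter=5000, burn_in=1000):
    """Gibbs sampler for y = b0 + b1 * x + e, e ~ N(0, sigma2).

    Assumed priors: flat on (b0, b1), p(sigma2) proportional to 1 / sigma2.
    """
    n = len(x)
    xbar = sum(x) / n
    xc = [xi - xbar for xi in x]            # centered predictor
    ybar = sum(y) / n
    sxx = sum(v * v for v in xc)
    b1_hat = sum(v * yi for v, yi in zip(xc, y)) / sxx
    sigma2 = 1.0                            # arbitrary starting value
    draws = []
    for it in range(n_iter):
        # Coefficients | sigma2: independent normals around the OLS fit
        c0 = random.gauss(ybar, (sigma2 / n) ** 0.5)
        b1 = random.gauss(b1_hat, (sigma2 / sxx) ** 0.5)
        # sigma2 | coefficients: inverse gamma(n / 2, SSR / 2)
        ssr = sum((yi - c0 - b1 * v) ** 2 for yi, v in zip(y, xc))
        sigma2 = 1.0 / random.gammavariate(n / 2, 2.0 / ssr)
        if it >= burn_in:
            # Convert the centered intercept back to the original scale
            draws.append((c0 - b1 * xbar, b1, sigma2))
    return draws

def summarize(values, level=0.95):
    """Posterior mean and equal-tailed credible interval from draws."""
    s = sorted(values)
    lo = s[int((1 - level) / 2 * len(s))]
    hi = s[int((1 + level) / 2 * len(s)) - 1]
    return sum(s) / len(s), lo, hi

draws = gibbs_linear(x, y)
summary = {name: summarize([d[k] for d in draws])
           for k, name in enumerate(["intercept", "slope", "sigma2"])}
for name, (mean, lo, hi) in summary.items():
    print(f"{name:10s} mean={mean:8.3f}  95% CrI=({lo:.3f}, {hi:.3f})")
```

With informative priors the full conditionals keep the same normal and inverse-gamma form but with updated parameters. Trace plots and autocorrelation checks on the retained draws are a natural addition to a report.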
