Description
Implement the EM algorithm for a p-dimensional Gaussian mixture model with G components:
G
X
pk · N(x;µk,Σ). k=1
Store the parameters as a list in R with three components
- prob: a G-dimensional probability vector;
- mean: a p-by-G matrix with the k-th column being µk, the p-dimensional mean for the k-th Gaussian component;
- Sigma: a p-by-p covariance matrix shared by all G
Your code should have the following structure.
Estep <- function(data, G, para){
# Your Code # return the n-by-G probability matrix } Mstep <- function(data, G, para, post.prob){ # Your Code # Return the updated parameters } myEM <- function(data, T, G, para){ for(t in 1:T){ post.prob <- Estep(data, G, para) para <- Mstep(data, G, para, post.prob) } return(para) } |
You should test your code on the faithful data from the R package mclust with G = 2. The estimated parameters from your algorithm and the one from mclust after T = 10 iterations should be the same.
library(mclust)
n <- nrow(faithful) Z <- matrix(0, n, 2) Z[sample(1:n, 120), 1] <- 1 Z[, 2] <- 1 – Z[, 1] ini0 <- mstep(modelName=”EEE”, faithful, Z)$parameters # Output from my EM alg para0 <- list(prob = ini0$pro, mean = ini0$mean, Sigma = ini0$variance$Sigma) myEM(T=10, para=para0) # Output from mclust Rout <- em(modelName = “EEE”, data = faithful, control = emControl(eps=0, tol=0, itmax = 10), parameters = ini0)$parameters list(Rout$pro, Rout$mean, Rout$variance$Sigma) |