# E510 Assignment 3 Solved

35.00 \$


You'll get a download link with the `.zip` solution files instantly after payment.

## Description


Problem 1

You are given two timeseries, x1 and x2 (data.mat or data.csv), each containing 310 points in time. For each timeseries, you’d like to investigate how the characteristic temporal patterns change in time, so you decide to use singular spectrum analysis (applied to each timeseries separately) with a total lag of L = 50 days.
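As a rough illustration of the lagged matrix that SSA starts from, the sketch below builds it with NumPy. The helper name `lagged_matrix`, the synthetic series `x_demo`, and the row/column orientation (one row per window start, one column per lag) are assumptions made for this example; the real series would be loaded from data.mat or data.csv, and the course may use the transposed layout.

```python
import numpy as np


def lagged_matrix(x, L):
    """Return the (N - L + 1) x L matrix whose row i is x[i : i + L].

    Hypothetical helper: the orientation (rows = window start, columns = lag)
    is an assumption; some texts use the transpose.
    """
    x = np.asarray(x, dtype=float)
    N = x.size
    if not 1 <= L <= N:
        raise ValueError("L must satisfy 1 <= L <= len(x)")
    K = N - L + 1
    # Index grid: entry (i, j) selects x[i + j].
    idx = np.arange(K)[:, None] + np.arange(L)[None, :]
    return x[idx]


# Synthetic stand-in for x1 (310 points); not the assignment data.
t = np.arange(310)
x_demo = np.sin(2 * np.pi * t / 25.0)

Y = lagged_matrix(x_demo, L=50)
print(Y.shape)
```

With N = 310 and L = 50 this layout has N - L + 1 = 261 rows, and each row is the previous one shifted by one time step.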

• Plot the timeseries and create the lagged matrix for each timeseries. Show (in symbolic matrix form) what your lagged matrix looks like for x1. [1 point for the plot and 1 point for the correct lagged matrix for x1]
• Perform SSA. Plot the eigenvectors and PCs of the most important modes (decide yourself how many modes are important) for x1 and x2. Hint: in SSA we are interested in pairs of modes. [1 point for each plot: eigenvectors and PCs for x1; eigenvectors and PCs for x2 -> total 4 points]
• How much variance is carried by the dominant signals (signals of different frequencies) in x1 and how much in x2? Note that in SSA, a signal of given frequency is usually captured by two modes. [1 point for the correct
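The steps in the list above can be sketched as follows: build the lagged matrix, decompose it, and add the variance fractions of each pair of modes. This is a minimal sketch under stated assumptions, not the assignment's solution: the function name `ssa` and the synthetic series are hypothetical, the series mean is removed once (the lag columns are not centred individually), and the decomposition uses an SVD of the lagged matrix rather than an explicit covariance matrix.

```python
import numpy as np


def ssa(x, L):
    """SSA via SVD of the lagged matrix (hypothetical helper).

    Returns eigenvectors (L x L, one column per mode), PCs (K x L) and the
    fraction of variance carried by each mode.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # assumption: remove the series mean only
    K = x.size - L + 1
    Y = x[np.arange(K)[:, None] + np.arange(L)[None, :]]  # lagged matrix (K, L)
    # Y = U @ diag(s) @ Vt; the rows of Vt are eigenvectors of Y.T @ Y.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    eigvecs = Vt.T
    pcs = U * s  # equals Y @ eigvecs
    var_frac = s**2 / np.sum(s**2)
    return eigvecs, pcs, var_frac


# Synthetic stand-in: two oscillations of different frequency plus weak noise.
rng = np.random.default_rng(0)
t = np.arange(310)
x_demo = (np.sin(2 * np.pi * t / 25.0)
          + 0.5 * np.sin(2 * np.pi * t / 8.0)
          + 0.1 * rng.standard_normal(t.size))

eigvecs, pcs, var_frac = ssa(x_demo, L=50)

# An oscillation is usually split over two modes in quadrature, so the
# variance of a signal is taken as the sum over its pair.
pair1 = var_frac[0] + var_frac[1]
pair2 = var_frac[2] + var_frac[3]
print(f"pair 1: {pair1:.3f}, pair 2: {pair2:.3f}")
```

In this synthetic case the first pair corresponds to the stronger, slower oscillation and the second pair to the weaker, faster one. For real data, which modes form a pair should be checked from the eigenvector plots (similar shape, shifted by a quarter period) rather than assumed from their order.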

Problem 2

You are given a dataset (data_problem2.mat or data_problem2.csv) that contains one year of normalized daily streamflow from 194 rivers in Alberta, Canada (i.e. there are 194 stations, each with 365 days of normalized streamflow). The location of each station is given by a latitude/longitude coordinate pair in stationLon.mat and stationLat.mat (or stationLon.csv and stationLat.csv). ABlon.csv and ABlat.csv give the coordinates of the Alberta border for plotting (e.g. plt.plot(lon,lat) or figure; plot(lon, lat) will plot the border).

Following the guidelines below, perform two types of clustering to investigate how to cluster these stations across the region on the basis of similarity in their streamflow regimes.

Note: apply PCA to the data first (m=365, n=194) and then perform clustering (hierarchical clustering and SOM) on the first few modes only. Most likely the first 3 modes will be enough to keep. In the final plots, make sure that you reconstruct the data (streamflow) from the clustered PC modes, as was done in the tutorial example on the SST dataset.
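The workflow in the note (PCA, clustering in the space of the first few modes, then reconstruction of streamflow from the clustered modes) might look roughly like the sketch below. It covers the hierarchical step only, not the SOM. Everything specific here is an assumption: the data are synthetic, stations are treated as observations and days as variables, Ward linkage is one possible choice, and `n_clusters = 2` is a placeholder rather than a recommended value.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic stand-in: 194 stations x 365 days, two seasonal regimes plus noise.
rng = np.random.default_rng(1)
days = np.arange(365)
early = np.exp(-0.5 * ((days - 130) / 20.0) ** 2)
late = np.exp(-0.5 * ((days - 200) / 30.0) ** 2)
regime = np.repeat([0, 1], [100, 94])
flow = (np.where(regime[:, None] == 0, early, late)
        + 0.05 * rng.standard_normal((194, 365)))

# PCA via SVD of the anomalies (station-mean hydrograph removed per day).
mean_flow = flow.mean(axis=0)
anom = flow - mean_flow
U, s, Vt = np.linalg.svd(anom, full_matrices=False)
n_modes = 3                      # the note suggests about 3 modes
eigvecs = Vt[:n_modes].T         # (365, 3) temporal patterns
pcs = anom @ eigvecs             # (194, 3) one score per station and mode

# Hierarchical clustering of stations in PC space.
Z = linkage(pcs, method="ward")
n_clusters = 2                   # hypothetical; choose from the dendrogram
labels = fcluster(Z, t=n_clusters, criterion="maxclust")

# Reconstruct a streamflow pattern per cluster from its mean PC scores.
cluster_flow = {}
for k in np.unique(labels):
    centre = pcs[labels == k].mean(axis=0)
    cluster_flow[int(k)] = mean_flow + eigvecs @ centre

for k, pattern in cluster_flow.items():
    print(k, pattern.shape, int(np.sum(labels == k)))
```

Each entry of `cluster_flow` is a 365-day hydrograph that can be plotted next to a map of the stations coloured by `labels`.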