Econ613 Introduction to R Solved

1 Introduction to R

The goal of this assignment is to introduce you to R. It is not graded, but essential for the rest of the class. Solutions will be posted in a week.

1.1 Introduction

Using this sample code,

install.packages("BB")
library(BB)
source("A1.R")
?"for"
??rpareto
dir()
1+1
2/2
save.image("misc.RDATA")
1:10
30%%4
setwd("/Users/ms486/Dropbox/Papers/Progress")
getwd()
ls()
2/0
log(-1)
sum(1:10)

Exercise 1 Introduction

1. Create a directory for this class and store your script "a0.R" in it.
2. Install the packages Hmisc, gdata, boot, xtable, MASS, moments, snow, mvtnorm.
3. Set your working directory.
4. List the content of your directory and the content of your environment.
5. Check whether 678 is a multiple of 9.
6. Save your environment.
7. Find help on the functions mean and cut2.
8. Find an operation that returns NaN (Not a Number).
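
Several of the items above reduce to one-line checks. The following is a minimal sketch using base R only; the object names is_multiple and nan_value are illustrative and not part of the assignment.

```r
# Item 5: a number is a multiple of 9 when the remainder of the division is 0
is_multiple <- 678 %% 9 == 0   # FALSE, since 678 = 9 * 75 + 3
print(is_multiple)

# Item 8: 0/0 is undefined, so R returns NaN
nan_value <- 0 / 0
print(is.nan(nan_value))

# Item 4: list the files in the working directory and the objects in memory
dir()
ls()

# Item 7 (interactive): ?mean opens the help page for mean;
# ??cut2 searches installed packages (cut2 is provided by Hmisc)
```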

1.2 Objects

Vectors, Matrix, Arrays

vec0  = NULL
vec1  = c(1,2,3,4)
vec2  = 1:4
vec3  = seq(1,4,1)
vec4  = rep(0,4)
sum(vec1)
str(vec1)
prod(vec1)
mat1  = mat.or.vec(2,2)
mat2  = matrix(0,ncol=2,nrow=2,byrow=T)
mat3  = cbind(c(0,0),c(0,0))
mat4  = rbind(c(1,1),c(0,0))
mat5  = matrix(1:20,nrow=5,ncol=4)
mat5[1:2,3:4]
mat5[1,]
arr1  = array(0,c(2,2))
dim(mat4)
dim(vec2)
length(vec2)

length(mat1)
class(mat4)

Exercise 2 Object Manipulation

1. Print Titanic, and write the code to answer these questions (one function (sum), one operation):

(a) Total population
(b) Total adults
(c) Total crew
(d) 3rd class children
(e) 2nd class adult female
(f) 1st class children male
(g) Female crew survivors
(h) 1st class adult male survivors

2. Using the function prop.table, find:

(a) The proportion of survivors among first class, male, adult
(b) The proportion of survivors among first class, female, adult
(c) The proportion of survivors among first class, male, children
(d) The proportion of survivors among third class, female, adult
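
Titanic is a four-dimensional table (Class, Sex, Age, Survived), so each question is a sum over a slice selected by dimension names. A sketch of the pattern for a few of the items, with illustrative object names:

```r
# Inspect the dimension names before indexing
dimnames(Titanic)

total_pop    <- sum(Titanic)                   # 1(a)
total_adults <- sum(Titanic[, , "Adult", ])    # 1(b)
total_crew   <- sum(Titanic["Crew", , , ])     # 1(c)

# 1(g): a single cell needs no sum
female_crew_surv <- Titanic["Crew", "Female", "Adult", "Yes"]

# 2(a): proportions over the Survived dimension for one group
p_first_male_adult <- prop.table(Titanic["1st", "Male", "Adult", ])
p_first_male_adult["Yes"]
```

The remaining items follow by changing the names used in the index.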

Exercise 3 Vectors – Introduction

1. Use three different ways to create the vectors

(a) a = 1, 2, …, 50

(b) b = 50, 49, …, 1 Hint: rev

2. Create the vectors

(a) a = 10, 19, 7, 10, 19, 7, …, 10, 19, 7 with 15 occurrences of 10, 19, 7

(b) b = 1, 2, 5, 6, …, 1, 2, 5, 6 with 8 occurrences of 1, 2, 5, 6 Hint: rep

3. Create a vector of the values of log(x)sin(x) at x = 3.1, 3.2, …, 6
4. Using the function sample, draw 90 values between (0,100) and calculate the mean. Redo the same operation allowing for replacement.
5. Calculate

(a) $\sum_{a=1}^{20} \sum_{b=1}^{15} \frac{\exp(\sqrt{a}) \log(a^5)}{5 + \cos(a)\sin(b)}$

(b) $\sum_{a=1}^{20} \sum_{b=1}^{a} \frac{\exp(\sqrt{a}) \log(a^5)}{5 + \exp(ab)\cos(a)\sin(b)}$

6. Create a vector of the values of exp(x)cos(x) at x = 3, 3.1, …, 6.

Exercise 4 Vectors – Advanced

1. Create two vectors xVec and yVec by sampling 1000 values between 0 and 999.
2. Suppose xVec = (x_1, …, x_n) and yVec = (y_1, …, y_n).

(a) Create the vector (y_2 − x_1, …, y_n − x_{n−1}) denoted by zVec.
(b) Create the vector (sin(y_1)/cos(x_2), sin(y_2)/cos(x_3), …, sin(y_{n−1})/cos(x_n)) denoted by wVec.
(c) Create a vector subX which consists of the values of xVec which are ≥ 200.
(d) What are the index positions in yVec of the values which are ≥ 600?
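
The shifted vectors in (a) and (b) come from dropping the first or the last element with negative indexes. A sketch, assuming the sampling is done with replacement (the exercise does not say) and with an arbitrary seed:

```r
set.seed(613)  # arbitrary seed, only for reproducibility
xVec <- sample(0:999, 1000, replace = TRUE)
yVec <- sample(0:999, 1000, replace = TRUE)
n <- length(xVec)

# (a) y_2 - x_1, ..., y_n - x_{n-1}: drop the first y and the last x
zVec <- yVec[-1] - xVec[-n]

# (b) sin(y_1)/cos(x_2), ..., sin(y_{n-1})/cos(x_n): drop the last y and the first x
wVec <- sin(yVec[-n]) / cos(xVec[-1])

# (c) logical indexing keeps the values that satisfy the condition
subX <- xVec[xVec >= 200]

# (d) which() returns positions rather than values
idx <- which(yVec >= 600)
```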

Exercise 5 Matrix

1. Create the matrix

$$A = \begin{pmatrix} 1 & 1 & 3 \\ 5 & 2 & 6 \\ -2 & -1 & -3 \end{pmatrix}$$

(a) Check that $A^3 = 0$ (matrix 0).
(b) Bind a fourth column as the sum of the first and third column.
(c) Replace the third row by the sum of the first and second row.
(d) Calculate the average by row and column.

2. Consider this system of linear equations:

2x + y + 3z = 10 (1)
x + y + z = 6 (2)
x + 3y + 2z = 13 (3)

3. Solve this equation.
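
A sketch of both parts in base R. It assumes steps (b) and (c) are applied in sequence to the same object; the names A4, M, rhs and sol are illustrative.

```r
# Fill by row so the code reads like the matrix on paper
A <- matrix(c( 1,  1,  3,
               5,  2,  6,
              -2, -1, -3), nrow = 3, byrow = TRUE)

# (a) %*% is matrix multiplication; A^3 with ^ would be element-wise
A3 <- A %*% A %*% A
all(A3 == 0)

# (b) and (c)
A4 <- cbind(A, A[, 1] + A[, 3])
A4[3, ] <- A4[1, ] + A4[2, ]

# (d)
rowMeans(A4)
colMeans(A4)

# Linear system (1)-(3) written as M %*% c(x, y, z) = rhs
M   <- matrix(c(2, 1, 3,
                1, 1, 1,
                1, 3, 2), nrow = 3, byrow = TRUE)
rhs <- c(10, 6, 13)
sol <- solve(M, rhs)   # x = 2, y = 3, z = 1
sol
```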

Exercise 6 Functions

1. Write a function fun1 which takes two arguments (a, n), where a is a scalar and n is a positive integer, and returns

$$a + \frac{a^2}{2} + \frac{a^3}{3} + \dots + \frac{a^n}{n}$$

2. Consider the function

$$f(x) = \begin{cases} x^2 + 2x + |x| & \text{if } x < 0; \\ x^2 + 3 + \log(1+x) & \text{if } 0 \le x < 2; \\ x^2 + 4x - 14 & \text{if } x \ge 2. \end{cases} \tag{4}$$

Evaluate the function at -3, 0 and 3.
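
Both functions can be sketched in a few lines. This version of f handles one scalar at a time, so sapply is used to evaluate it at several points; that is one possible design, not the only one.

```r
# a + a^2/2 + ... + a^n/n, vectorised over the exponent k
fun1 <- function(a, n) {
  k <- 1:n
  sum(a^k / k)
}

# Piecewise function (4); expects a single number
f <- function(x) {
  if (x < 0) {
    x^2 + 2 * x + abs(x)
  } else if (x < 2) {
    x^2 + 3 + log(1 + x)
  } else {
    x^2 + 4 * x - 14
  }
}

sapply(c(-3, 0, 3), f)   # 6 3 7
```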

Exercise 7 Indexes

1. Sample 36 values between 1 and 20 and name it v1.
2. Use two different ways to create the subvector of elements that are not in the first position of the vector. Hint: which and subset cannot be used. Check x[a] and x[-a].
3. Create a logical element (TRUE or FALSE), v2, which is true if v1 > 5. Can you convert this logical element into a dummy 1 (TRUE) and 0 (FALSE)?
4. Create a matrix m1 [6 × 6] which is filled by row using the vector v1.
5. Create the following object
      x = c(rnorm(10),NA,paste("d",1:16),NA,log(rnorm(10)))
6. Test for the position of missing values, and non-finite values. Return a subvector free of missing and non-finite values.
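
A sketch of the indexing patterns involved. Drawing 36 values from only 20 candidates requires replace = TRUE. The last part uses a small hypothetical numeric vector y instead of the object x from item 5, which R coerces to character because of the paste() elements.

```r
set.seed(7)  # arbitrary seed
v1 <- sample(1:20, 36, replace = TRUE)

# Item 2: negative index versus explicit positions
sub_a <- v1[-1]
sub_b <- v1[2:length(v1)]

# Item 3: logical vector and its 0/1 version
v2    <- v1 > 5
dummy <- as.numeric(v2)

# Item 4: byrow = TRUE fills row after row
m1 <- matrix(v1, nrow = 6, ncol = 6, byrow = TRUE)

# Item 6 on a hypothetical vector
y <- c(1.5, NA, Inf, -2, NaN, -Inf)
which(is.na(y))        # positions of NA and NaN
which(!is.finite(y))   # positions of NA, NaN, Inf and -Inf
y_clean <- y[is.finite(y)]
```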

Exercise 8 Data Manipulation

1. Load the library AER and the dataset (data("GSOEP9402")), to be named dat.
2. What type of object is it? Find the number of rows and columns. Can you provide the names of the variables?
3. Evaluate and plot the average annual income by year.
4. Create an array that illustrates simultaneously the income differences (mean) by gender, school and memployment.

Exercise 9 First regression

1. Load the dataset (data("CASchools")), to be named dat1.
2. Using the function lm, run a regression of read on the following variables: district, school, county, grades, students, teachers, calworks, lunch, computer, expenditure, income and english. Store this regression as reg1.
3. Can you run a similar regression by specifying
      formula = y ~ x
      lm(formula)

Create reg2, which uses only the first 200 observations.

Exercise 10 Advanced indexing

1. Create a vector lu of 200 draws from a Pareto distribution (1,1). How many values are higher than 10? Replace these values by draws from a logistic distribution (6.5,0.5).
2. Create a vector de of 200 draws from a normal distribution (1,2). Set de = log(de), and count the number of missing values or negative values. Replace these values by draws from a normal distribution (0,1) truncated at 0. Hint: truncnorm
3. Create two vectors, orig and dest, as 200 draws from a uniform distribution [0,1].
4. Create two matrices, hist and dist, as 200*200 draws from a uniform distribution [0,1].
5. Consider this function

$$q_{jl}(w) = \frac{r + de_j}{r + de_l}\, w + lu_j \log(w) - lu_l \left(1 + \log(w)\right) + \frac{r + de_j}{r + de_l} \sum_{k \neq j} su_{jk} - \sum_{k \neq l} su_{lk} + \frac{r + de_j}{r + de_l} \sum_{k \neq j} se_{jk} - \sum_{k \neq l} se_{lk} \tag{5}$$

where

$$su_{j,l} = \frac{\log(orig_j + dest_l + dist_{j,l})}{1 + \log(orig_j + dest_l + dist_{j,l})} \tag{6}$$

$$se_{j,l} = \frac{\exp(orig_j + dest_l + hist_{j,l})}{1 + \exp(orig_j + dest_l + hist_{j,l})} \tag{7}$$

6. Create the matrices su and se.
7. Set r = 0.05. Create a function to evaluate qjl(.). Evaluate qjl(9245) for all pairs (j,l).
8. Create gridw, which consists of a sequence from 9100 to 55240 of length 50.
9. Using the function sapply, evaluate qjl. Store the output into an array of dimension (50 × 200 × 200). How long does it take to evaluate qjl() for each value of w?

List

li      = list()
li[[1]] = mat1
li[[2]] = Titanic
li1     = list(x=mat1,y=Titanic)
li1$x
li1$y

Dataframe

data=data.frame(x=rnorm(100),y=runif(100))
data
View(data)
edit(data)
data[,1]
data[1,]
data$x
names(data)
attach(data)
x
detach(data)
y

Tests and Conversion

is.na()
is.list()
is.factor()
is.matrix()
is.vector()
is.array()
is.finite()
a==b

a>=b
a<=b

as.list()
as.factor()

1.3 Basic functions

Exercise 11 Tests and indexing

1. Test if c(1, 2, 3) is an array? a vector? a matrix?
2. x0 = rnorm(1000); Using the function table(), count the number of occurrences of x0 > 0, x0 > 1, x0 > 2, x0 > 0.5, x0 < 1 and x0 > −1
3. x1 = cut2(runif(100,0,1),g=10)
      levels(x1)=paste("q",1:10,sep="")
4. Test whether or not x1 is a factor.
5. Verify that "q1" has 10 occurrences.
6. Convert x1 into a numeric variable. What happens to the levels?
7. rand = rnorm(1000)
8. Using the function which(), find the indexes of positive values.
9. Create the object w of positive values of x using:

(a) Which
(b) Subset
(c) By indexing directly the values that respect a condition
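
A sketch of the tests and of the three equivalent ways to extract positive values. It reads the x in item 9 as the vector rand from item 7, which is an assumption, and it leaves out the cut2 items because they need the Hmisc package.

```r
# Item 1: a plain vector has no dim attribute, so it is neither a matrix nor an array
v <- c(1, 2, 3)
is.vector(v)
is.matrix(v)
is.array(v)

# Item 2: table() counts TRUE and FALSE for a logical condition
set.seed(11)  # arbitrary seed
x0 <- rnorm(1000)
table(x0 > 0)

# Items 7 to 9
rand     <- rnorm(1000)
idx      <- which(rand > 0)          # positions of positive values
w_which  <- rand[which(rand > 0)]
w_subset <- subset(rand, rand > 0)
w_direct <- rand[rand > 0]
identical(w_which, w_direct)
```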


Table 1: Basic Functions

Function                        Description
abs(x)                          absolute value
sqrt(x)                         square root
ceiling(x)                      ceiling(3.475) is 4
floor(x)                        floor(3.475) is 3
trunc(x)                        trunc(5.99) is 5
round(x, digits=n)              round(3.475, digits=2) is 3.48
signif(x, digits=n)             signif(3.475, digits=2) is 3.5
log(x)                          logarithm
exp(x)                          e^x
substr(x, start=n1, stop=n2)    Extract or replace substrings in a character vector.
                                x = "abcdef", substr(x, 2, 4) is "bcd"
grep(pattern, x)                Search for pattern in x.
sub(pattern, replacement, x)    Find pattern in x and replace with replacement text.
strsplit(x, split)              Split the elements of character vector x at split.
strsplit("abc", "")             returns 3 element vector "a","b","c"
paste(…, sep="")                Concatenate strings
toupper(x)                      Uppercase
tolower(x)                      Lowercase

1.4 Language

if (condition) statement
for (i in range) statement
while (condition) statement
fun = function(input) {calculation; return(output)}
fun = function(input) {calculation; output}

Exercise 12 Programming

Write a program that asks the user to type an integer N and computes u(N), defined by:
u(0)=1
u(1)=1
u(n+1)=u(n)+u(n-1)

1. Evaluate $1^2 + 2^2 + 3^2 + \dots + 400^2$.

Table 2: Apply functions

Function    Usage
apply       Apply Functions Over Array Margins
by          Apply a Function to a Data Frame Split by Factors
eapply      Apply a Function Over Values in an Environment
lapply      Apply a Function over a List or Vector
mapply      Apply a Function to Multiple List or Vector Arguments
rapply      Recursively Apply a Function to a List
tapply      Apply a Function Over a Ragged Array

2. Evaluate 1×2 + 2×3 + 3×4 + … + 249×250
3. Create a function "crra" with two arguments (c, θ) that returns $\frac{c^{1-\theta}}{1-\theta}$. Add an if condition such that the utility is given by the log when θ ∈ [0.97, 1.03] ≈ 1
4. Create a function "fact" that returns the factorial of a number
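
The pieces of Exercise 12 can be sketched as follows. The recurrence is computed with a loop rather than recursion, and the interactive prompt is left as a comment so the block also runs as a script; apart from crra and fact, the names are illustrative.

```r
# u(0) = u(1) = 1 and u(n+1) = u(n) + u(n-1), computed iteratively
u <- function(N) {
  if (N < 2) return(1)
  prev <- 1
  curr <- 1
  for (i in 2:N) {
    nxt  <- curr + prev
    prev <- curr
    curr <- nxt
  }
  curr
}
# Interactive use: N <- as.integer(readline("Type an integer N: ")); u(N)

# Vectorised sums: no loop is needed
sum_squares  <- sum((1:400)^2)          # 1^2 + ... + 400^2
sum_products <- sum((1:249) * (2:250))  # 1*2 + ... + 249*250

# CRRA utility, switching to log utility when theta is close to 1
crra <- function(c, theta) {
  if (theta >= 0.97 && theta <= 1.03) {
    log(c)
  } else {
    c^(1 - theta) / (1 - theta)
  }
}

# Factorial as a product; fact(0) is 1 by convention
fact <- function(n) {
  if (n <= 1) 1 else prod(1:n)
}
```

The if condition in crra avoids dividing by a value near zero when θ is close to 1.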

Exercise 13 Apply Functions

1. Using this object,
      m = matrix(c(rnorm(20,0,10), rnorm(20,-1,10)), nrow = 20, ncol = 2)
   calculate the mean, median, min, max and standard deviation by row and column.

2. Using the dataset iris in the package "datasets", calculate the average Sepal.Length by Species. Evaluate the sum of the log of Sepal.Width by Species.

3. y1 = NULL; for (i in 1:100) y1[i]=exp(i)
   y2 = exp(1:100)
   y3 = sapply(1:100,exp)

   (a) Check the outcome of these three operations.

   (b) Using proc.time() or system.time(), compare the execution time of these three equivalent commands.
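
A sketch of how apply and tapply cover the first two items, followed by the comparison of the three ways to compute exp(1:100). The helper name col_stats and the seed are illustrative.

```r
set.seed(13)  # arbitrary seed
m <- matrix(c(rnorm(20, 0, 10), rnorm(20, -1, 10)), nrow = 20, ncol = 2)

# Margin 1 is rows, margin 2 is columns
row_means <- apply(m, 1, mean)
col_stats <- apply(m, 2, function(v) {
  c(mean = mean(v), median = median(v), min = min(v), max = max(v), sd = sd(v))
})

# tapply splits the first argument by the groups in the second
mean_by_species   <- tapply(iris$Sepal.Length, iris$Species, mean)
sumlog_by_species <- tapply(iris$Sepal.Width, iris$Species, function(v) sum(log(v)))

# Loop, vectorised call and sapply give the same result
y1 <- NULL
for (i in 1:100) y1[i] <- exp(i)
y2 <- exp(1:100)
y3 <- sapply(1:100, exp)
all.equal(y1, y2)
all.equal(y2, y3)

# Timing one of the three alternatives; repeat for the others
system.time(for (k in 1:1000) exp(1:100))
```

On vectors this short a single call is too fast to time reliably, which is why the timing line repeats the call 1000 times.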


Table 3: Statistical distributions

name        description
dname( )    density or probability function
pname( )    cumulative distribution function
qname( )    quantile function
rname( )    random deviates

Table 4: Statistical Functions

Function                        Description
mean(x, trim=0, na.rm=FALSE)    mean of object x
sd(x), var(x)                   standard deviation, variance of object x
median(x)                       median
quantile(x, probs)              quantiles, where x is the numeric vector and probs is a numeric vector with probabilities
range(x)                        range
sum(x)                          sum
diff(x, lag=1)                  lagged differences, with lag indicating which lag to use
min(x)                          minimum
max(x)                          maximum

Table 5: Statistical distributions

Distribution         R name
Beta                 beta
Binomial             binom
Cauchy               cauchy
Chisquare            chisq
Exponential          exp
F                    f
Gamma                gamma
Geometric            geom
Hypergeometric       hyper
Logistic             logis
Lognormal            lnorm
Negative Binomial    nbinom
Normal               norm
Poisson              pois
Student t            t
Tukey                tukey
Uniform              unif
Weibull              weibull
Wilcoxon             wilcox

1.5 Statistics

Exercise 14 Simulating and Computing

1. Simulate a vector x of 10,000 draws from a normal distribution. Use the function summary to provide basic characteristics of x.
2. Create a function dsummary that returns the minimum, the 1st decile, the 1st quartile, the median, the mean, the standard deviation, the 3rd quartile, the 9th decile, and the maximum.
3. Suppose X ∼ N(2, 0.25). Evaluate f(0.5), F(2.5), $F^{-1}(0.95)$.
4. Repeat if X has a t-distribution with 5 degrees of freedom.
5. Suppose X ∼ P(3, 1), where P is the Pareto distribution. Evaluate f(0.5), F(2.5), $F^{-1}(0.95)$.
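
A sketch of the d/p/q/r naming convention applied to items 1 to 4. It assumes that 0.25 in N(2, 0.25) is the variance, so the standard deviation passed to R is 0.5. The Pareto case is omitted because base R has no Pareto functions.

```r
set.seed(14)  # arbitrary seed
x <- rnorm(10000)
summary(x)

# Nine summary statistics in the order requested by item 2
dsummary <- function(v) {
  c(min    = min(v),
    d1     = unname(quantile(v, 0.10)),
    q1     = unname(quantile(v, 0.25)),
    median = median(v),
    mean   = mean(v),
    sd     = sd(v),
    q3     = unname(quantile(v, 0.75)),
    d9     = unname(quantile(v, 0.90)),
    max    = max(v))
}
dsummary(x)

# Normal with mean 2 and sd 0.5 (assumption: 0.25 is the variance)
f_norm    <- dnorm(0.5, mean = 2, sd = 0.5)   # density f(0.5)
F_norm    <- pnorm(2.5, mean = 2, sd = 0.5)   # distribution function F(2.5)
Finv_norm <- qnorm(0.95, mean = 2, sd = 0.5)  # quantile F^{-1}(0.95)

# Student t with 5 degrees of freedom
f_t    <- dt(0.5, df = 5)
F_t    <- pt(2.5, df = 5)
Finv_t <- qt(0.95, df = 5)
```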

Exercise 15 Moments

Consider a vector V = rnorm(100, −2, 5).

1. Evaluate n as the length of V.
2. Compute the mean $m = \frac{1}{n} \sum_{i=1}^{n} V_i$
3. Compute the variance $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (V_i - m)^2$
4. Compute the skewness $\gamma_1 = \frac{\frac{1}{n} \sum_{i=1}^{n} (V_i - m)^3}{s^3}$
5. Compute the kurtosis $k_1 = \frac{\frac{1}{n} \sum_{i=1}^{n} (V_i - m)^4}{s^4} - 3$
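
The formulas above translate directly into vectorised R. In this sketch the names s, skew and kurt are illustrative, and the first two moments can be checked against the built-in mean and var.

```r
set.seed(15)  # arbitrary seed
V <- rnorm(100, -2, 5)

n  <- length(V)
m  <- sum(V) / n                  # mean
s2 <- sum((V - m)^2) / (n - 1)    # variance with the n - 1 divisor
s  <- sqrt(s2)

skew <- (sum((V - m)^3) / n) / s^3       # skewness
kurt <- (sum((V - m)^4) / n) / s^4 - 3   # excess kurtosis

# Checks against the built-in functions
all.equal(m, mean(V))
all.equal(s2, var(V))
```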

Table 6: Matrix operation

Function (Operator)    Description
A * B                  Element-wise multiplication
A %*% B                Matrix multiplication
t(A)                   Transpose
diag(a)                Create a diagonal matrix with the elements of a
diag(A)                Return the diagonal of A
solve(A)               Inverse of A

Exercise 16 OLS

1. Create a matrix X of dimension (1000,10). Fill it with draws from a beta distribution with shape1 parameter 2 and shape2 parameter 1. Make sure that there are no negative values.
2. Create a scalar denoted by σ² and set it to 0.5. Generate a vector β of size 10. Fill it with draws from a Gamma distribution with parameters 2 and 1.
3. Create a vector ε of 1000 draws from a normal distribution.
4. Create $Y = X\beta + \sqrt{\sigma^2}\,\varepsilon$
5. Recover $\hat{\beta} = (X'X)^{-1}(X'Y)$
6. Evaluate $\hat{\varepsilon} = \hat{y} - y$. Plot the histogram (filled in grey) and the kernel density of the distribution of the error term.
7. Estimate $\sigma^2 = \frac{\hat{\varepsilon}'\hat{\varepsilon}}{n - p - 1}$, and $V(\hat{\beta}) = \sigma^2 (X'X)^{-1}$
8. Create param that binds (β, V(β)). Using the command lm, check these estimates.
9. Construct a confidence interval for β.
10. Redo the exercise by setting σ² = 0.01. How do your confidence intervals for β change?
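
The estimation steps of Exercise 16 can be sketched with the operators from Table 6. The sketch assumes "parameters 2 and 1" for the Gamma means shape 2 and rate 1, fits the model without an intercept because X has no constant column, and keeps the n − p − 1 divisor stated in item 7. Names such as beta_hat and V_beta are illustrative.

```r
set.seed(16)  # arbitrary seed
n <- 1000
p <- 10

X      <- matrix(rbeta(n * p, shape1 = 2, shape2 = 1), nrow = n, ncol = p)
sigma2 <- 0.5
beta   <- rgamma(p, shape = 2, rate = 1)
eps    <- rnorm(n)
Y      <- as.vector(X %*% beta + sqrt(sigma2) * eps)

# OLS estimator (X'X)^{-1} X'Y, solved as a linear system
XtX      <- t(X) %*% X
beta_hat <- solve(XtX, t(X) %*% Y)

# Residuals with the sign convention of item 6 (fitted minus observed)
res <- as.vector(X %*% beta_hat) - Y

# Variance estimates from item 7
s2     <- sum(res^2) / (n - p - 1)
V_beta <- s2 * solve(XtX)
se     <- sqrt(diag(V_beta))

# Approximate 95% confidence intervals
ci <- cbind(lower = as.vector(beta_hat) - 1.96 * se,
            upper = as.vector(beta_hat) + 1.96 * se)

# Cross-check the coefficients with lm, without an intercept
fit <- lm(Y ~ X - 1)
max(abs(as.numeric(coef(fit)) - as.vector(beta_hat)))

# Item 6 plots (interactive):
# hist(res, col = "grey", freq = FALSE); lines(density(res))
```

Standard errors reported by summary(fit) use the divisor n − p, so they differ slightly from se here.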
